A Framework in CRM Customer Lifecycle: Identify Downward Trend and Potential Issues Detection
Customer retention is one of the primary goals in the area of customer relationship management. A large body of work exists in which machine learning models or business rules are established to predict churn. However, targeting users at an early stage, when they start to show a downward trend, is a better strategy. In downward trend prediction, the reasons why customers show a downward trend are of great interest to the industry, as they help the business understand the pain points that customers suffer and take early action to prevent them from churning. A commonly used method is to collect feedback from customers, either by actively reaching out to them or by passively hearing from them. However, it is believed that a large number of customers have unpleasant experiences and never speak out. In the literature, there is limited research that provides a comprehensive and scientific approach to identifying these “silent sufferers”. In this study, we propose a novel two-part framework: developing the downward trend prediction process and establishing the methodology to identify the reasons why customers are in the downward trend. In the first part, we focus on predicting the downward trend, which is an earlier stage of the customer lifecycle than churn. In the second part, we propose an approach to determining the cause of the downward trend based on a causal inference method and semi-supervised learning. The proposed approach is capable of identifying potential silent sufferers. We take bad shopping experiences as inputs to develop the framework and validate it via a real-world marketing A/B test. The test readout demonstrates the effectiveness of the framework, which drove an 88.5% incremental lift in purchase volume.
💡 Research Summary
The paper addresses a gap in customer relationship management (CRM) research by shifting focus from traditional churn prediction to the earlier “downward trend” stage, where customers are still active but exhibit declining gross merchandise volume (GMV), bought item count (BI), or purchase days (PD). The authors propose a two‑part framework: (1) a predictive component that identifies customers likely to enter a downward trend, and (2) a diagnostic component that uncovers the underlying causes, specifically “bad customer experience” (BCE), and discovers silent sufferers who do not report their problems.
Downward Trend Definition
For each metric, a “norm box” is constructed using the customer’s past 12‑month mean (μ) and standard deviation (σ). A downward flag is set when the metric in the target month falls below μ − α·σ, where α is a tunable sensitivity parameter. Separate α values are used for frequent buyers (FB) and infrequent buyers (IB) to control the event rate.
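The flag rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `downward_flag`, the use of the sample standard deviation, and the α values in `ALPHA` are all assumptions made for the example.

```python
from statistics import mean, stdev

# Hypothetical sensitivity values per buyer segment; the summary does not
# state the alpha values actually used.
ALPHA = {"FB": 1.0, "IB": 0.5}

def downward_flag(history, target, alpha):
    """Return True if `target` falls below mu - alpha * sigma of `history`.

    `history` holds one metric (GMV, BI, or PD) for the past 12 months and
    `target` is the same metric in the target month.
    """
    if len(history) < 2:
        raise ValueError("need at least two months of history")
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation (an assumption)
    return target < mu - alpha * sigma

# Made-up monthly GMV for one frequent buyer.
gmv_history = [100, 120, 90, 110, 105, 95, 115, 100, 108, 102, 98, 112]
flag = downward_flag(gmv_history, target=60, alpha=ALPHA["FB"])
```

Raising α lowers the threshold and so flags fewer customers, which is how a per-segment α can be used to control the event rate.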
Predictive Modeling
Gradient Boosting Machine (GBM) classifiers are trained for GMV, BI, and PD separately. From an initial pool of ~200 candidate features (transaction details, BCE history, engagement, site behavior, demographics, etc.), variable‑importance analysis narrows each model to 13 key features such as recent‑month ratios, purchase‑day counts, and gap‑between‑purchases statistics. Models are limited to depth 5 and 150 trees to avoid over‑fitting; validation AUCs exceed 0.90 for all three models, with maximum F1 scores around 0.47. The authors recommend using decile or percentile buckets of the prediction scores rather than a fixed cutoff, and suggest an ensemble of the three bucket outputs for a comprehensive downward‑trend score.
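The bucketing step can be illustrated without the GBM models themselves. The sketch below assumes the three score lists are already available and combines their deciles by a plain average; the summary does not say how the three bucket outputs are ensembled, so `ensemble_decile` and the averaging rule are assumptions, as are all function names.

```python
def decile_buckets(scores):
    """Map each score to a decile 1..10 by rank (10 = highest scores).

    Ties are broken by input order because Python's sort is stable.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    buckets = [0] * n
    for rank, idx in enumerate(order):
        buckets[idx] = rank * 10 // n + 1
    return buckets

def ensemble_decile(gmv_scores, bi_scores, pd_scores):
    """Average the per-metric deciles into one downward-trend score.

    Averaging is an illustrative choice, not the authors' stated rule.
    """
    per_metric = [decile_buckets(s) for s in (gmv_scores, bi_scores, pd_scores)]
    return [sum(triple) / 3 for triple in zip(*per_metric)]

# Made-up model scores for ten customers.
gmv = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 0.0]
combined = ensemble_decile(gmv, gmv, gmv)
```

Rank-based buckets stay comparable across the three models even when their raw score distributions differ, which is one practical reason to prefer them over a single fixed cutoff.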
Causal Inference and Golden Set Construction
To isolate customers whose downward trend is truly caused by BCE, the authors apply causal inference. They start with a small seed set of customers in the high deciles (7‑10) who reported a BCE on their last purchase day, assuming these are genuine sufferers; this seed set is termed the “golden set.” Linear regressions between cumulative BCE count (X) and downward‑trend decile (Y) are fitted in both directions, separately for the FB and IB groups. The results are asymmetric: X → Y is statistically significant (p < 0.001) while Y → X is not, which the authors take as evidence that BCE precedes the decline.
Semi‑Supervised Learning for Silent Sufferers
Recognizing that many customers experience BCE without reporting it, the authors employ an Expectation‑Maximization (EM) based semi‑supervised algorithm. The golden set is labeled as positive (1), all other customers initially as negative (0). In each iteration, a random subset of the unlabeled data is treated as “spies,” a binary classifier is trained on the mixed set, and the classifier’s scores are used to re‑label high‑probability samples as positive. The process repeats until the label‑change rate falls below a predefined threshold or a maximum number of iterations is reached. This yields a set of “silent sufferers” who are likely impacted by BCE but have not voiced complaints.
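A rough sketch of such an iterative relabeling loop follows. It makes several assumptions that may differ from the paper: spies are drawn from the golden set (the usual convention in spy-based positive-unlabeled learning), the binary classifier is replaced by a simple nearest-centroid scorer, and a row is relabeled positive when it scores at least as high as the lowest-scoring spy. All names and default parameters are hypothetical.

```python
import random

def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def score(x, pos_c, neg_c):
    """Higher when x is closer to the positive centroid than the negative one."""
    return sq_dist(x, neg_c) - sq_dist(x, pos_c)

def expand_golden_set(golden, unlabeled, spy_frac=0.2, tol=0.01,
                      max_iter=10, seed=0):
    """Return 0/1 labels for `unlabeled`; 1 marks a likely silent sufferer."""
    if len(golden) < 2 or not unlabeled:
        raise ValueError("need at least two golden rows and one unlabeled row")
    rng = random.Random(seed)
    n_spies = min(max(1, int(spy_frac * len(golden))), len(golden) - 1)
    labels = [0] * len(unlabeled)
    for _ in range(max_iter):
        spy_idx = set(rng.sample(range(len(golden)), n_spies))
        spies = [g for i, g in enumerate(golden) if i in spy_idx]
        kept = [g for i, g in enumerate(golden) if i not in spy_idx]
        # Positives: remaining golden rows plus rows relabeled so far.
        pos = kept + [u for u, y in zip(unlabeled, labels) if y == 1]
        # Negatives: rows still labeled 0 plus the hidden spies.
        neg = spies + [u for u, y in zip(unlabeled, labels) if y == 0]
        pos_c, neg_c = centroid(pos), centroid(neg)
        # Spies are known positives, so their lowest score sets the cutoff.
        threshold = min(score(s, pos_c, neg_c) for s in spies)
        new_labels = [1 if score(u, pos_c, neg_c) >= threshold else 0
                      for u in unlabeled]
        changed = sum(a != b for a, b in zip(labels, new_labels)) / len(labels)
        labels = new_labels
        if changed < tol:  # label-change rate below the stopping threshold
            break
    return labels

# Made-up two-feature rows, for illustration only.
golden_rows = [[5.0, 5.0], [5.2, 4.8], [4.8, 5.1], [5.1, 5.3], [4.9, 4.7]]
other_rows = [[6.0, 6.0], [6.5, 6.2], [0.0, 0.0], [0.2, -0.1], [-0.1, 0.3]]
silent = expand_golden_set(golden_rows, other_rows)
```

With only a handful of spies, the minimum-score cutoff is sensitive to which rows happen to be chosen; a real implementation would likely use a trained probabilistic classifier and a quantile of the spy scores instead.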
Real‑World Validation
The framework is implemented on eBay’s e‑commerce data, using BCE as the primary lever. An A/B test is conducted: the treatment group receives targeted marketing actions (personalized messages, coupons, recommendations) based on the combined downward‑trend score and identified BCE risk, while the control group receives standard communications. The treatment yields an 88.5% incremental lift in purchase volume, demonstrating that early detection of downward trends coupled with cause‑specific interventions can substantially improve revenue and retention.
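For readers unfamiliar with the metric, incremental lift is commonly computed as the relative difference between treatment and control. The sketch below assumes that definition and equal-sized groups; the function name and the purchase volumes are made up for illustration and are not the study's data.

```python
def incremental_lift(treatment_volume, control_volume):
    """Relative lift of treatment over control, e.g. 0.25 means +25%."""
    if control_volume <= 0:
        raise ValueError("control volume must be positive")
    return (treatment_volume - control_volume) / control_volume

# Hypothetical purchase volumes for equal-sized groups.
lift = incremental_lift(treatment_volume=1250, control_volume=1000)
```

If the groups differ in size, the volumes would first need to be normalized per customer before applying the same formula.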
Contributions
- Formal, quantitative definition of downward trend and high‑performing GBM predictors (AUC > 0.90).
- Application of causal inference to isolate a reliable “golden set” of customers whose decline is driven by a specific negative experience.
- Development of a semi‑supervised EM algorithm to expand the golden set and uncover silent sufferers despite limited labeled data.
- Empirical evidence of business impact through a large‑scale A/B experiment.
Limitations and Future Work
The study focuses on a single lever (BCE); extending the methodology to multiple concurrent levers (price, delivery, UI) would increase robustness. Linking downward‑trend predictions to long‑term customer lifetime value (CLV) could provide deeper strategic insights. Finally, testing the framework in other domains (telecom, finance) would assess its generalizability.
Overall, the paper presents a practical, data‑driven approach that moves CRM from reactive churn mitigation to proactive early‑stage intervention, offering both methodological novelty and measurable commercial benefit.