Comparative models in customer base analysis: parametric model and observation-driven model
Abstract
This study conducts a dynamic rolling comparison between the Pareto/NBD model (a parametric model) and machine learning algorithms (observation-driven models) in customer base analysis, a comparison that the literature has not yet investigated comprehensively. The aim is to identify the comparative edge of each approach in customer base analysis and to determine when each paradigm should be implemented. The study uses Pareto/NBD (Abe) as the representative of Buy-Till-You-Die (BTYD) models to compete against machine learning algorithms and reports the following results. (1) The parametric model wins in transaction frequency prediction but loses in inactivity prediction. (2) The BTYD model outperforms machine learning in inactivity prediction when the customer base is active, performs better in an inactive customer base when competing with Poisson regression, and wins in a short-term active customer base when competing with a neural network algorithm in transaction frequency prediction. (3) The parametric model benefits more from a short calibration length and a long holdout/target period, settings that exhibit uncertainty. (4) Including covariate effects helps Pareto/NBD (Abe) achieve better predictive results. These findings help define the comparative edge and implementation timing of the two approaches and are useful for modeling and business decision making.
Keywords: BTYD, parametric model, Pareto/NBD model, observation-driven model, machine learning, customer base analysis, non-contractual setting
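The calibration and holdout terminology used in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation. It assumes a per-customer list of purchase dates, summarizes the calibration window as the frequency/recency/observation-length triple (x, t_x, T) that BTYD models commonly take as input, and derives two holdout targets. The function names `split_customer` and `rolling_windows`, the definition of "inactive" as zero holdout purchases, and all window lengths are hypothetical choices made for illustration.

```python
from datetime import date, timedelta


def split_customer(purchases, calib_end, holdout_end):
    """Summarize one customer's purchase dates around a calibration cutoff.

    Assumption: a customer counts as 'inactive' when no purchase falls in
    the holdout window; the study may define inactivity differently.
    """
    calib = sorted(d for d in purchases if d <= calib_end)
    holdout = [d for d in purchases if calib_end < d <= holdout_end]
    if not calib:
        return None  # customer not yet acquired by the calibration cutoff
    first = calib[0]
    return {
        "x": len(calib) - 1,               # repeat purchases in calibration
        "t_x": (calib[-1] - first).days,   # days from first to last purchase
        "T": (calib_end - first).days,     # days from first purchase to cutoff
        "holdout_count": len(holdout),     # transaction frequency target
        "inactive": len(holdout) == 0,     # inactivity target
    }


def rolling_windows(start, n_windows, calib_days, holdout_days, step_days):
    """Yield (calib_end, holdout_end) pairs, extending the cutoff each step."""
    for k in range(n_windows):
        calib_end = start + timedelta(days=calib_days + k * step_days)
        yield calib_end, calib_end + timedelta(days=holdout_days)


if __name__ == "__main__":
    purchases = [date(2020, 1, 1), date(2020, 1, 15), date(2020, 3, 1)]
    for calib_end, holdout_end in rolling_windows(date(2020, 1, 1), 2, 31, 60, 30):
        print(calib_end, holdout_end, split_customer(purchases, calib_end, holdout_end))
```

Under these assumptions, varying `calib_days` and `holdout_days` changes the calibration length and the holdout/target period referred to in result (3), and the same summaries could feed either a BTYD model or a machine learning model.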
This work is licensed under a Creative Commons Attribution 4.0 International License.