Application of machine learning algorithms to predict hotel occupancy

    Konstantins Kozlovskis; Yuanyuan Liu; Natalja Lace; Yun Meng

Abstract

The development and availability of information technology, together with the possibility of deeply integrating internal IT systems with external ones, give a powerful opportunity to analyze data online using external data providers. In recent years, machine learning algorithms have come to play a significant role in predicting different processes. This research aims to apply several machine learning algorithms to predict high-frequency (daily) hotel occupancy at a Chinese hotel. Five machine learning models (bagged CART, bagged MARS, XGBoost, random forest, SVM) were optimized and applied to predict occupancy. The models were compared with one another using several accuracy measures and against an autoregressive distributed lag (ARDL) model chosen as the benchmark. The bagged CART model showed the most relevant results (R² > 0.50) in all periods, but it could not beat the traditional ARDL model. Thus, despite the original use of machine learning algorithms in solving regression tasks, the models used in this research were not more effective than the benchmark model. In addition, variable importance was used to test the hypothesis that the Baidu search index and its components can be used in machine learning models to predict hotel occupancy.
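As a rough illustration of comparing forecasts against a benchmark with accuracy measures, the sketch below computes R², RMSE and MAE for two sets of predictions. Only R² is named in the abstract; RMSE and MAE are assumed here as typical companions, and the occupancy values and the names `candidate_model` and `benchmark_model` are hypothetical, not data or models from the paper.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily occupancy rates (made-up numbers, not from the paper).
actual = [0.62, 0.70, 0.75, 0.68, 0.80, 0.85, 0.78]

# Hypothetical forecasts from a candidate model and a benchmark model.
forecasts = {
    "candidate_model": [0.60, 0.72, 0.70, 0.70, 0.76, 0.80, 0.80],
    "benchmark_model": [0.63, 0.69, 0.74, 0.69, 0.79, 0.84, 0.79],
}

# Score every forecast with the same set of measures.
scores = {
    name: {
        "r2": r_squared(actual, predicted),
        "rmse": rmse(actual, predicted),
        "mae": mae(actual, predicted),
    }
    for name, predicted in forecasts.items()
}

# Pick the forecast with the lowest RMSE (one possible selection rule).
best = min(scores, key=lambda name: scores[name]["rmse"])

for name, measures in scores.items():
    print(name, {k: round(v, 4) for k, v in measures.items()})
print("lowest RMSE:", best)
```

In this toy data the benchmark forecast has the lower error on every measure. That mirrors the kind of outcome the abstract reports, but it says nothing about the paper's actual numbers.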

Keywords: bagged CART, bagged MARS, XGBoost, random forest, SVM, hotel occupancy

How to Cite
Kozlovskis, K., Liu, Y., Lace, N., & Meng, Y. (2023). Application of machine learning algorithms to predict hotel occupancy. Journal of Business Economics and Management, 24(3), 594–613. https://doi.org/10.3846/jbem.2023.19775
Published in Issue
Sep 28, 2023
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
