Customer Churn Prediction Using Machine Learning
DOI: https://doi.org/10.64751/
Abstract
Customer churn, the voluntary discontinuation of services by a customer, represents one of the most consequential challenges in the banking and financial services industry. Predicting churn prior to its occurrence enables organizations to implement targeted retention strategies, reducing acquisition overhead and stabilizing revenue streams. This paper introduces an end-to-end machine learning framework for customer churn prediction centered on the CatBoost gradient boosting classifier, which natively handles categorical features and delivers competitive predictive accuracy with reduced preprocessing burden. The system ingests ten customer attributes—credit score, geography, gender, age, tenure, account balance, number of products, credit card ownership, activity status, and estimated salary—and augments these through feature engineering, producing derived interaction variables such as age-balance product, tenure-product ratio, and a composite churn risk index. Model training is conducted on a publicly available bank customer dataset, and performance is evaluated through accuracy, precision, recall, F1-score, and the ROC-AUC metric. A Flask web application provides a real-time prediction interface, and results are persisted in Firebase Realtime Database for longitudinal monitoring. Experimental outcomes demonstrate that CatBoost achieves superior balanced accuracy and F1-score relative to logistic regression, random forest, and standard gradient boosting baselines, affirming its suitability for operationalized churn management in data-constrained enterprise environments.
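As an illustration of the feature engineering step, the following minimal sketch derives the three named variables from the ten input attributes. It is not the framework's actual implementation. The field names, the direction of the tenure-product ratio (tenure divided by number of products), and in particular the formula for the composite churn risk index are assumptions made here for illustration; the abstract does not define them.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """The ten input attributes listed in the abstract (field names are assumed)."""
    credit_score: int
    geography: str
    gender: str
    age: int
    tenure: int
    balance: float
    num_products: int
    has_credit_card: bool
    is_active: bool
    estimated_salary: float


def engineer_features(c: Customer) -> dict:
    """Return the derived interaction variables for one customer."""
    # Age-balance product: plain multiplicative interaction.
    age_balance_product = c.age * c.balance

    # Tenure-product ratio: ASSUMED to be tenure / number of products;
    # guarded against a zero product count.
    tenure_product_ratio = c.tenure / c.num_products if c.num_products > 0 else 0.0

    # Composite churn risk index: HYPOTHETICAL formula, not taken from the paper.
    # Here it is the share of three simple warning flags that are raised.
    flags = [
        not c.is_active,        # inactive member
        c.balance == 0,         # empty account
        c.num_products >= 3,    # unusually many products
    ]
    churn_risk_index = sum(flags) / len(flags)

    return {
        "age_balance_product": age_balance_product,
        "tenure_product_ratio": tenure_product_ratio,
        "churn_risk_index": churn_risk_index,
    }


# Usage with a made-up customer record (values are illustrative only).
example = Customer(
    credit_score=600, geography="France", gender="Female", age=40, tenure=4,
    balance=50000.0, num_products=2, has_credit_card=True, is_active=False,
    estimated_salary=80000.0,
)
features = engineer_features(example)
print(features)
```

The derived values would be appended to the original attributes before training. Because CatBoost handles categorical features natively, geography and gender could be passed as raw strings rather than one-hot encoded.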







