October 17, 2018
Customer churn, which occurs when clients decide to cancel or not renew their subscription, can be a nightmare for most businesses. Winning a new client can cost 5 to 25 times more than retaining an existing one, so it's important to identify which customers are at risk of churning and take the right actions to keep them on side.
Mind Foundry is an automated data science platform that aims to allow anyone, with or without a background in data science, to easily build and deploy quality-controlled Machine Learning pipelines. Mind Foundry empowers business analysts and data scientists by allowing them to easily insert their domain expertise into the model-building process and extract actionable insights.
In this tutorial, we'll show how we can build a classification pipeline in minutes using Mind Foundry, with the goal of predicting Telco customer churn using data from Kaggle.
Customer churn is a costly issue for Telcos, but a predictive model can empower them to take proactive steps.
We will follow the standard data science process:
- Data preparation
- Pipeline construction and tuning
- Interpretation and deployment
First, we are going to load the data into Mind Foundry, which in this case is a simple CSV file with 19 columns and 6,666 rows:
Each row represents a customer and each column an attribute, including the number of voice mails, total minutes (day/night) and total calls (day/night).
Mind Foundry automatically scans the data, detects the type of each column and provides data preparation advice highlighted by the light bulbs. This is where the business analyst or data scientist can introduce their domain knowledge by acting on the relevant advice with the appropriate answers.
In this case, we know that the missing values in the “number voice mail messages” column should actually be filled in with 0, which can be done with a single click:
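Outside the platform, the same fix can be sketched in a few lines of pandas. This is only an illustration: the sample rows and the column names `customer_id` and `number_vmail_messages` are assumptions, not the dataset's exact headers.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumed for illustration.
customers = pd.DataFrame(
    {
        "customer_id": [1, 2, 3],
        "number_vmail_messages": [25.0, None, 3.0],
    }
)

# Domain knowledge: a missing count means the customer has no voice mail
# messages, so 0 is the appropriate fill value (rather than a mean or median).
customers["number_vmail_messages"] = (
    customers["number_vmail_messages"].fillna(0).astype(int)
)

print(customers)
```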
After following the advice, we will join additional information about each customer to the data, using the Customer ID column as the key.
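For readers who want to reproduce this join by hand, a minimal pandas sketch might look like the following. The two tables, their column names and their values are made up for illustration; only the idea of joining on a customer identifier comes from the tutorial.

```python
import pandas as pd

# Hypothetical usage table (one row per customer).
usage = pd.DataFrame(
    {"customer_id": [1, 2, 3], "total_day_minutes": [120.5, 80.0, 210.3]}
)

# Hypothetical extra customer information; customer 3 has no entry.
extra = pd.DataFrame({"customer_id": [1, 2], "gender": ["F", "M"]})

# A left join keeps every customer from the usage table, even when no extra
# information exists; validate= raises if an ID is duplicated on either side.
enriched = usage.merge(extra, on="customer_id", how="left", validate="one_to_one")

print(enriched)
```

A left join is used here so that no customer is silently dropped; unmatched rows simply get missing values in the new columns.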
Mind Foundry also generates histograms which are useful for eye-balling the data and identifying high-level relationships.
We can also see how the customers who churned distribute across the other columns.
Finally, we will remove gender from our churn prediction model.
Processing the data
Mind Foundry allows you to quickly set up your classification process by selecting the target column you wish to predict. Mind Foundry will then hold out 10% of the data for final validation purposes and perform 10-fold cross-validation on the remaining 90%. This helps to reduce the risk of over-fitting the model to the training data.
Mind Foundry will then launch and search the solution space of possible pipelines (feature engineering and machine learning models) and their associated hyper-parameters using its proprietary Bayesian Optimizer. Mind Foundry also keeps an audit trail of all the pipelines it has evaluated, which you can query if required.
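The 10% hold-out plus 10-fold cross-validation scheme can be sketched with scikit-learn as follows. This is a rough approximation, not Mind Foundry's implementation: the data is synthetic, the logistic regression is a placeholder model, and stratifying the splits by the target is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Synthetic stand-in for the churn table (assumption: binary target,
# with roughly 15% of customers in the positive "churned" class).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.85], random_state=0
)

# Hold out 10% for final validation; it is not used again until the very end.
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0
)

# 10-fold cross-validation on the remaining 90%.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X_train, y_train, cv=cv)

print(f"mean cross-validated accuracy: {scores.mean():.3f}")
```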
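Since the Bayesian Optimizer is proprietary, the sketch below swaps in scikit-learn's plain random search to show the general shape of such a search: a pipeline whose model step and hyper-parameters are both part of the search space, with every evaluated candidate recorded. The data, the candidate models and the parameter ranges are all assumptions chosen for illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared churn data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

pipeline = Pipeline(
    [("scale", StandardScaler()), ("model", LogisticRegression(max_iter=1000))]
)

# Each dictionary describes one family of pipelines and its hyper-parameters.
search_space = [
    {
        "model": [LogisticRegression(max_iter=1000)],
        "model__C": loguniform(1e-3, 1e2),
    },
    {
        "model": [RandomForestClassifier(random_state=0)],
        "model__n_estimators": [25, 50, 100],
        "model__max_depth": [3, 5, None],
    },
]

# Random search, NOT Bayesian optimisation: candidates are sampled independently.
search = RandomizedSearchCV(
    pipeline, search_space, n_iter=8, cv=5, scoring="roc_auc", random_state=0
)
search.fit(X, y)

# cv_results_ plays the role of an audit trail of every pipeline evaluated.
audit_trail = sorted(
    zip(search.cv_results_["mean_test_score"], search.cv_results_["params"]),
    key=lambda pair: pair[0],
    reverse=True,
)
for score, params in audit_trail:
    print(f"{score:.3f}  {type(params['model']).__name__}")
```

A Bayesian optimizer differs in that it uses the scores of earlier candidates to decide which candidate to try next, which typically needs fewer evaluations than random sampling.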
Deploying the solution
Once Mind Foundry has found the best pipeline, it will run final quality checks on the 10% of the data that was held out right from the start and never used during any of the model training. The performance metrics on this 10% hold-out are presented at the end, and Mind Foundry provides full transparency of the final pipeline it has chosen (feature engineering, models and hyper-parameter values).
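As a rough illustration of what a final check on the hold-out involves, the sketch below fits a model on the 90% and scores it once on the untouched 10%. The synthetic data, the random forest and the choice of metrics are assumptions; the tutorial does not state which metrics Mind Foundry reports.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn table (assumption: ~15% churners).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.85], random_state=0
)
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0
)

# The final model sees only the 90%; the hold-out is scored exactly once.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
predicted = model.predict(X_holdout)

tn, fp, fn, tp = confusion_matrix(y_holdout, predicted, labels=[0, 1]).ravel()
precision = precision_score(y_holdout, predicted, zero_division=0)
recall = recall_score(y_holdout, predicted, zero_division=0)

print(f"true negatives={tn} false positives={fp} false negatives={fn} true positives={tp}")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

For churn, recall on the churned class is often the number to watch, because a missed churner usually costs more than an unnecessary retention offer.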
In our case, the model health is good. However, if it were bad, Mind Foundry would tell the user and provide suggestions on how to improve it, for example by sourcing more data.
The relative influence of each feature on the predictions can be explored in more detail, which allows us to generate "rules" for interpreting the model.
The model can then be integrated into your website, internal dashboard, products or business processes via an automatically generated RESTful API. Feature relevance is provided by LIME (Local Interpretable Model-agnostic Explanations), which allows us to explain individual predictions and therefore tailor appropriate offers to the customers who are likely to churn.
In this example, the chances of churn increase significantly when a customer has an international plan, incurs a day charge of more than 40 and makes an increased number of customer service calls.
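Rules like these can be turned into a simple check that a retention team could run alongside the model. The sketch below is a hand-written illustration, not output from Mind Foundry or LIME: the field names are assumptions, and `service_call_threshold` is a hypothetical value, since the text only says "an increased number" of calls.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    international_plan: bool
    total_day_charge: float
    customer_service_calls: int


def churn_risk_flags(customer, day_charge_threshold=40.0, service_call_threshold=3):
    """Return the names of the risk factors this customer exhibits."""
    flags = []
    if customer.international_plan:
        flags.append("international plan")
    if customer.total_day_charge > day_charge_threshold:
        flags.append("high day charge")
    # Hypothetical cut-off: the source gives no exact number of calls.
    if customer.customer_service_calls > service_call_threshold:
        flags.append("many customer service calls")
    return flags


at_risk = Customer(international_plan=True, total_day_charge=45.2, customer_service_calls=5)
low_risk = Customer(international_plan=False, total_day_charge=20.0, customer_service_calls=1)

print(churn_risk_flags(at_risk))
print(churn_risk_flags(low_risk))
```

Returning the list of triggered factors, rather than a single yes/no, keeps the reason for each flag visible, which is what makes it possible to tailor the offer to the customer.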
In conclusion, it's entirely possible to address the problem of customer churn with machine learning models, and discover actionable insights that really impact on the bottom line of your business.