Categorizing Models for Predictive Analytics
Models are necessary to perform predictive analytics. A model is a mathematical representation of a segment of the world that you are interested in. A model can mimic behavioral aspects of your customers. It can represent the different customer segments. A well-made, well-tuned model can forecast (that is, predict with high accuracy) the next outcome of a given event.
You have various ways to categorize the models used for predictive analytics. In general, you can sort them by:
- The business problems they solve and the primary business functions they serve (such as sales, advertising, human resources, or risk management).
- The mathematical implementation used in the model (such as statistics, data mining, and machine learning).
Every model will have some combination of these aspects; more often than not, one or the other will dominate. The intended function of the model can take one of various directions — predictive, classification, clustering, decision-oriented, or associative.
Predictive models
Predictive models analyze data and predict the next outcome. This is the big contribution of predictive analytics, as distinct from business intelligence: business intelligence monitors what’s going on in an organization now, whereas predictive models analyze historical data to estimate the likelihood of future outcomes.
A typical question for such a model: Given certain conditions (the recent number and frequency of customer complaints, an approaching service-renewal date, and the availability of cheaper options from the competition), how likely is this customer to churn?
The output of a predictive model can also be a binary (yes/no or 0/1) answer: whether a transaction is fraudulent, for example. A predictive model can generate multiple results, sometimes combining a yes/no result with a probability that a certain event will happen. A customer’s creditworthiness, for example, could be rated as yes or no, with an assigned probability that describes how likely that customer is to pay off a loan on time.
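To make the idea of a combined yes/no answer and probability concrete, here is a minimal sketch in Python. It assumes a logistic scoring function, which is only one of many possible techniques; the feature names, weights, bias, and threshold are all hypothetical values chosen for illustration, not learned from real data.

```python
import math

# Hypothetical weights, for illustration only. A real model would learn
# these from historical customer data.
WEIGHTS = {
    "complaints_last_90_days": 0.8,
    "days_until_renewal": -0.03,
    "cheaper_competitor_offer": 1.2,
}
BIAS = -2.0


def churn_probability(customer):
    """Return an estimated probability (between 0 and 1) of churn."""
    score = BIAS + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))  # logistic function


def predict_churn(customer, threshold=0.5):
    """Return a (yes/no, probability) pair for one customer."""
    probability = churn_probability(customer)
    return probability >= threshold, probability


# Hypothetical customer: three recent complaints, renewal in ten days,
# and a cheaper competing offer available.
customer = {
    "complaints_last_90_days": 3,
    "days_until_renewal": 10,
    "cheaper_competitor_offer": 1,
}
will_churn, probability = predict_churn(customer)
print(f"churn: {'yes' if will_churn else 'no'} (probability {probability:.2f})")
```

The threshold is a business choice rather than a property of the model: lowering it flags more customers as likely to churn, at the cost of more false alarms.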
Clustering and classification models
When a model uses clustering and classification, it identifies different groupings within existing data. You can then build a predictive model on top of the output of your clustering model, using the clusters to classify new data points.
If, for example, you run a clustering algorithm on your customer data and thereby separate the customers into well-defined groups, you can then use classification to learn about a new customer and clearly identify that customer’s group. Then you can tailor your response (for example, a targeted marketing campaign) and your handling of the new customer.
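The cluster-then-classify workflow described above can be sketched as follows. This is a minimal illustration that assumes k-means as the clustering algorithm and nearest-centroid assignment as the classification step; the text does not prescribe either. The customer data, the two features, and the starting centroids are hypothetical.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and updating centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(values) / len(g) for values in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


# Hypothetical customers: (visits per month, average spend per visit)
customers = [(1, 20), (2, 25), (1, 30), (9, 200), (10, 220), (8, 210)]

# Step 1: cluster the existing customers into two segments.
centroids = k_means(customers, centroids=[customers[0], customers[3]])

# Step 2: classify a new customer by the nearest segment centroid.
new_customer = (9, 190)
segment = nearest(new_customer, centroids)
print(f"new customer belongs to segment {segment}")
```

In practice the features would usually be scaled first, because an unscaled feature with a large range (here, spend) dominates the distance calculation.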
Classification uses a combination of characteristics and features to indicate whether an item of data belongs to a particular class.
Many applications or business problems can be formulated as classification problems. At the most basic level, you can classify outcomes as desired or undesired; an insurance claim, for example, can be classified as legitimate or fraudulent.
Decision models
Given a complex scenario, what is the best decision to make, and if you were to take that action, what would the outcome be? Decision-oriented models (simply called decision models) address such questions by building strategic plans that identify the best course of action, given certain events. Decision models can serve as risk-mitigation strategies, helping to identify your best response to unlikely events.
Decision models probe various scenarios and select the best course of action. To make an informed decision, you need a deep understanding of the complex relationships in the data and the context you’re operating in. A decision model serves as a tool to help you develop that understanding.
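One simple way to probe scenarios and pick a course of action is to compare the expected payoff of each action across all scenarios. The sketch below assumes that approach and adds a worst-case comparison as a basic risk-mitigation view. The scenarios, probabilities, actions, and payoffs are all hypothetical.

```python
# Hypothetical scenarios with assumed probabilities (they sum to 1).
SCENARIOS = {"demand_rises": 0.5, "demand_flat": 0.3, "demand_falls": 0.2}

# Hypothetical payoff of each action under each scenario.
PAYOFFS = {
    "expand": {"demand_rises": 100, "demand_flat": 10, "demand_falls": -80},
    "hold": {"demand_rises": 40, "demand_flat": 20, "demand_falls": -10},
    "cut_back": {"demand_rises": -20, "demand_flat": 5, "demand_falls": 30},
}


def expected_payoff(action):
    """Probability-weighted average payoff of an action over all scenarios."""
    return sum(p * PAYOFFS[action][s] for s, p in SCENARIOS.items())


def worst_case(action):
    """Lowest payoff the action can produce in any scenario."""
    return min(PAYOFFS[action].values())


def best_by_expected_payoff():
    """Action with the highest expected payoff."""
    return max(PAYOFFS, key=expected_payoff)


def best_by_worst_case():
    """Action whose worst outcome is least bad (a cautious choice)."""
    return max(PAYOFFS, key=worst_case)


for action in PAYOFFS:
    print(f"{action}: expected {expected_payoff(action):.1f}, worst case {worst_case(action)}")
print("best on average:", best_by_expected_payoff())
print("most cautious:", best_by_worst_case())
```

With these assumed numbers the two criteria disagree, which shows why a decision model needs the business context as well as the data: the preferred action depends on how much risk you are willing to accept.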
Associative models
Associative models (called association models) are built on the underlying associations and relationships present in the data. If (for example) a customer is subscribed to a particular service, that customer may be likely to order another specific service. If a customer is looking to buy Product A (a sports car), and that product is associated with Product B (say, sunglasses branded by the carmaker), that customer is more likely to buy Product B.
Some of these associations can easily be identified; others may not be so obvious. Stumbling over an interesting association, previously unknown, can lead to dramatic benefits.
Another way of finding an association is to determine whether a given event increases the probability that another event will take place. If, for example, a company that leads a certain industrial sector has just reported stellar earnings, what is the probability that a basket of stocks in that same sector will go up or down in value?
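Whether one event raises the probability of another can be checked with the standard association measures of support, confidence, and lift. The sketch below is a minimal illustration on hypothetical purchase baskets; the item names and data are made up. A lift above 1 suggests that buying the first item makes the second more likely than it is overall.

```python
# Hypothetical purchase baskets, one set of items per customer.
BASKETS = [
    {"sports_car", "sunglasses"},
    {"sports_car", "sunglasses"},
    {"sports_car"},
    {"sedan"},
    {"sedan", "sunglasses"},
    {"sedan"},
    {"sedan"},
    {"sports_car", "sunglasses"},
]


def support(items):
    """Fraction of baskets that contain every item in items."""
    items = set(items)
    return sum(items <= basket for basket in BASKETS) / len(BASKETS)


def confidence(a, b):
    """Estimated probability of b given a: support(a and b) / support(a)."""
    return support({a, b}) / support({a})


def lift(a, b):
    """How much a raises the probability of b relative to b's base rate."""
    return confidence(a, b) / support({b})


print(f"P(sunglasses) = {support({'sunglasses'}):.2f}")
print(f"P(sunglasses | sports_car) = {confidence('sports_car', 'sunglasses'):.2f}")
print(f"lift = {lift('sports_car', 'sunglasses'):.2f}")
```

The same calculation applies to events other than purchases, as long as each record lists which events occurred together.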