How to Introduce Predictive Analytics Data Classifications to Your Business - dummies

By Anasse Bari, Mohamed Chaouchi, Tommy Jung

If your business has yet to use the data classification techniques of predictive analytics, it may be time to introduce them as a way to make better management or operating decisions. This process starts with an investigative step: identifying a problem area in the business where ample data is available but isn’t currently being used to drive business decisions.

One way to identify such a problem area is to hold a meeting with your analysts, managers, and other decision-makers to ask them what risky or difficult decisions they repeatedly make — and what kind of data they need to support their decisions. If you have data that reflects the results of past decisions, be prepared to draw on it. This process of identifying the problem is called the discovery phase.

After the discovery phase, you’ll want to follow up with individual questionnaires addressed to the business stakeholders. Consider asking the following types of questions:

  • What do you want to know from the data?

  • What action will you take when you get your answer?

  • How will you measure the results from the actions taken?

If the predictive analytical model produces meaningful insights, then someone must act on them. Naturally, you’ll want to see whether the results of that action add business value to the organization, so you’ll have to find a method of measuring that value, whether in terms of savings in operational costs, increased sales, or better customer retention.
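As a rough illustration of measuring that value, here is a minimal sketch in Python. The function names and every figure are hypothetical and chosen only for the example. It assumes you can compare a metric before and after the action and that you know what the action cost.

```python
def net_value(baseline, observed, cost_of_action):
    """Improvement over the baseline, minus what the action cost."""
    return (observed - baseline) - cost_of_action


def retention_rate(customers_at_start, customers_retained):
    """Share of the starting customers who are still customers."""
    return customers_retained / customers_at_start


# Hypothetical figures: quarterly sales before and after a campaign.
campaign_value = net_value(baseline=120_000, observed=135_000, cost_of_action=5_000)

# Hypothetical figures: retention before and after the same campaign.
retention_before = retention_rate(1_000, 800)
retention_after = retention_rate(1_000, 850)
retention_gain = retention_after - retention_before

print(f"Net value of the campaign: {campaign_value}")
print(f"Retention gain: {retention_gain:.1%}")
```

A simple before-and-after comparison like this can't separate the effect of the action from other changes in the business, so treat it as a starting point rather than proof.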

As you conduct these interviews, seek to understand why certain tasks are done and how they fit into the business process. Asking why things are the way they are may lead you to unexpected insights. There’s no point in gathering and analyzing data just for the sake of creating more data; you want to use that data to answer specific business needs.

For the data scientist or modeler, this exercise defines what kinds of data must be classified and analyzed — a step essential to developing a data classification model. A basic distinction to begin with is whether the data you’ll use to train the model is internal or external:

  • Internal data is specific to your company, usually draws from your company’s data sources, and can include many data types — such as structured, semi-structured, or unstructured.

  • External data comes from outside the company, often as data bought from other companies.

Regardless of whether the data you use for your model is internal or external, you’ll want to evaluate it first. Several questions are likely to crop up in that evaluation:

  • How critical and sensitive is the data in question? If it’s too sensitive, it may not serve your purposes.

  • How accurate is the data in question? If its accuracy is questionable, then its utility is limited.

  • How do company policy and applicable laws allow the data to be used and processed? You may want to clear the use of the data with your legal department for any legal issues that could arise. (See the accompanying sidebar for a famous recent example.)
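Part of the accuracy question can be checked mechanically. The sketch below is one possible first pass, not a complete data-quality audit: it reports how often each field is missing or empty in a set of records. The function name, field names, and sample records are all made up for illustration.

```python
def missing_rate(records, fields):
    """Return, for each field, the fraction of records where it is missing or empty."""
    total = len(records)
    rates = {}
    for field in fields:
        missing = sum(1 for record in records if record.get(field) in (None, ""))
        rates[field] = missing / total if total else 0.0
    return rates


# Hypothetical customer records with a few gaps.
sample = [
    {"customer_id": 1, "age": 34, "zip": "10001"},
    {"customer_id": 2, "age": None, "zip": "10002"},
    {"customer_id": 3, "age": 51, "zip": ""},
    {"customer_id": 4, "age": 28, "zip": "10004"},
]

rates = missing_rate(sample, ["customer_id", "age", "zip"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%} missing")
```

A field with a high missing rate may still be usable, but it is worth knowing about before you train a model on it.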

When you’ve identified data that is appropriate to use in the building of your model, the next step is to classify it — to create and apply useful labels to your data elements. For instance, if you’re working on data about customers’ buying behavior, the labels could define data categories according to how some groups of customers buy, along these lines:

  • Seasonal customers could be those who shop regularly or semi-regularly.

  • Discount-oriented customers could be those who tend to shop only when major discounts are offered.

  • Faithful customers are those who have bought many of your products over time.
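The labels above could be applied to existing customers with simple rules before any model is trained. The following sketch shows one way that might look. The thresholds, field names, and order in which the rules are checked are assumptions made for the example, not definitions from the text, and a real project would tune them with the business stakeholders.

```python
def label_customer(orders, discount_cutoff=0.8, faithful_products=10, seasonal_months=3):
    """Assign one label to a customer from a list of order dicts.

    Each order is assumed to have 'product', 'month' (e.g. '2023-03'),
    and 'discounted' (bool). All thresholds are illustrative.
    """
    if not orders:
        return "unlabeled"
    discount_share = sum(1 for o in orders if o["discounted"]) / len(orders)
    distinct_products = len({o["product"] for o in orders})
    active_months = len({o["month"] for o in orders})

    if discount_share >= discount_cutoff:
        return "discount-oriented"  # buys mostly when discounts are offered
    if distinct_products >= faithful_products:
        return "faithful"           # has bought many different products
    if active_months >= seasonal_months:
        return "seasonal"           # shops regularly or semi-regularly
    return "unlabeled"


# Hypothetical order histories.
bargain_hunter = [{"product": f"p{i}", "month": "2023-11", "discounted": True} for i in range(5)]
loyal = [{"product": f"p{i}", "month": f"2023-{1 + i % 12:02d}", "discounted": False} for i in range(12)]
occasional = [{"product": "p1", "month": m, "discounted": False} for m in ("2023-03", "2023-06", "2023-09")]

labels = [label_customer(c) for c in (bargain_hunter, loyal, occasional)]
print(labels)
```

Because the rules are checked in order, a customer who meets more than one rule gets the first matching label; a different ordering would be an equally reasonable choice.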

Predicting the category that a new customer will fit can be of great value to the marketing team. The idea is to spend time and money efficiently on identifying which customers to advertise to, determining which products to recommend to them, and choosing the best time to do so.
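To make the prediction step concrete, here is a minimal sketch that uses a k-nearest-neighbors vote, which is only one of many classification techniques and not necessarily the one you would choose in practice. The two features (share of orders bought on discount, and number of distinct products scaled to the range 0 to 1), the training points, and the function name are all hypothetical.

```python
import math
from collections import Counter


def predict_category(new_customer, labeled_customers, k=3):
    """Predict a label by majority vote among the k closest labeled customers."""
    nearest = sorted(
        labeled_customers,
        key=lambda item: math.dist(item[0], new_customer),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (discount share, scaled product variety) -> label.
training = [
    ((0.90, 0.10), "discount-oriented"),
    ((0.80, 0.20), "discount-oriented"),
    ((0.85, 0.15), "discount-oriented"),
    ((0.10, 0.90), "faithful"),
    ((0.20, 0.80), "faithful"),
    ((0.15, 0.95), "faithful"),
    ((0.30, 0.30), "seasonal"),
    ((0.25, 0.40), "seasonal"),
    ((0.35, 0.35), "seasonal"),
]

prediction = predict_category((0.82, 0.18), training)
print(prediction)
```

With real data you would typically hold back some labeled customers to check how often the predictions are right before acting on them.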

A lot of time and money can be wasted if you target the wrong customers, and poorly aimed marketing may even make them less likely to buy than if you hadn’t marketed to them in the first place. Using predictive analytics for targeted marketing should aim not only at more successful campaigns, but also at the avoidance of pitfalls and unintended consequences.