Basics of Major Technological Trends in Predictive Analytics

By Anasse Bari, Mohamed Chaouchi, Tommy Jung

Traditional analytical techniques can only provide insights on the basis of historical data. Your data, both past and incoming, can provide you with a reliable predictor that helps you make better decisions to achieve your business goals. The tool for accomplishing that goal is predictive analytics.

How to explore predictive analytics as a service

As the use of predictive analytics has become more common and widespread, an emerging trend is (understandably) toward greater ease of use. Arguably the easiest way to use predictive analytics is as software — whether as a standalone product or as a cloud-based service provided by a company whose business is providing predictive analytics solutions for other companies.

If your company’s business is to offer predictive analytics, you can provide that capability in two major ways:

  • As a standalone software application with an easy-to-use graphical user interface: The customer buys the predictive analytics product and uses it to build customized predictive models.

  • As a cloud-based set of software tools that help the user choose a predictive model: The customer applies the tools to meet the requirements and specifications of the project at hand and to suit the type of data that the model will be applied to. The tools can offer predictions quickly, without involving the customer in the workings of the algorithms in use or the data management involved.

A simple example can be as straightforward as these three steps:

  1. The customer uploads data to your servers, or chooses data that already resides in the cloud.

  2. The customer applies one or more of the available predictive models to that data.

  3. The customer reviews visualized insights and predictions from the results of the analysis or service.
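
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular vendor's service works: the PredictionService class, its method names, the two toy "models," and the sales figures are all hypothetical, and a plain dictionary stands in for cloud storage.

```python
from statistics import mean


class PredictionService:
    """Hypothetical in-memory stand-in for a cloud prediction service."""

    def __init__(self):
        self.datasets = {}
        # Toy "models": each maps a list of past values to a single forecast.
        self.models = {
            "mean": lambda values: mean(values),
            "last_value": lambda values: values[-1],
        }

    def upload(self, name, values):
        """Step 1: store the customer's data under a name."""
        self.datasets[name] = list(values)

    def predict(self, dataset_name, model_name):
        """Step 2: apply one of the available models to a stored dataset."""
        return self.models[model_name](self.datasets[dataset_name])

    def report(self, dataset_name):
        """Step 3: collect the prediction of every model for review."""
        return {name: self.predict(dataset_name, name) for name in self.models}


service = PredictionService()
service.upload("monthly_sales", [100, 120, 110, 130])  # made-up numbers
summary = service.report("monthly_sales")
print(summary)
```

The customer never touches the model internals here: it names a dataset and reads the report, which is the ease of use the service approach aims for.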

How to aggregate distributed data for analysis

A growing trend is to apply predictive analytics to data gathered from diverse sources. Deploying a typical predictive analytics solution in a distributed environment requires collecting data (sometimes big data) from different sources, an approach that must rely on data management capabilities. Data needs to be collected, pre-processed, and managed before it can be considered usable for generating actionable predictions.

The architects of predictive analytics solutions must always face the problem of how to collect and process data from different data sources. Consider, for example, a company that wants to predict the success of a business decision that affects one of its products by evaluating the following options:

  • To put company resources into increasing the sales volume

  • To terminate manufacture of the product

  • To change the current sales strategy for the product

The predictive analytics architect must engineer a model that helps the company make this decision, using data about the product from different departments:

  • Technical data: The engineering department has data about the product’s specifications, its lifecycle, and the resources and time needed to produce it.

  • Sales data: The sales department has information about the product’s sales volume, the number of sales per region, and profits generated by those sales.

  • Customer data from surveys, reviews, and posts: The company may have no dedicated department that analyzes how customers feel about the product. Tools exist, however, that can automatically analyze data posted online and extract the attitudes of authors, speakers, or customers toward a topic, a phenomenon, or (in this case) a product.

For instance, if a user posts a review about Product X that says, “I really like Product X and I’m happy with the price,” a sentiment extractor automatically labels this comment as positive.

Such tools can classify responses as “happy,” “sad,” “angry,” and so on, basing the classification on the words that an author uses in text posted online. In the case of Product X, the predictive analytics solution would need to aggregate customer reviews from external sources.
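
A very rough sketch of the word-based labeling described above follows. It assumes a tiny hand-made word list and a simple count of positive versus negative words; real sentiment tools are far more sophisticated, and the names POSITIVE_WORDS, NEGATIVE_WORDS, and label_sentiment are illustrative only.

```python
import re

# Tiny illustrative word lists (assumptions, not a real sentiment lexicon).
POSITIVE_WORDS = {"like", "love", "happy", "great", "good"}
NEGATIVE_WORDS = {"hate", "sad", "angry", "bad", "poor"}


def label_sentiment(text):
    """Label text by counting positive words minus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "I really like Product X and I'm happy with the price"
label = label_sentiment(review)
print(label)
```

A word count like this ignores negation ("not happy") and sarcasm, which is one reason production tools rely on trained models rather than fixed word lists.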

The example aggregates data from multiple sources, both internal (the engineering and sales divisions) and external (customer reviews gleaned from social networks). It is also an instance of using big data in predictive analytics.
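
As a hedged illustration of that aggregation, the sketch below merges three small in-memory sources into one record per product. The field names (unit_cost, units_sold, and so on), the aggregate function, and every value are made up for the example; in practice each source would be a separate database or feed that needs its own cleaning first.

```python
# Made-up sample data standing in for three separate sources.
engineering = {"X": {"unit_cost": 12.0, "build_hours": 3.5}}      # internal
sales = {"X": {"units_sold": 4000, "profit": 18000.0}}            # internal
reviews = [("X", "positive"), ("X", "negative"), ("X", "positive")]  # external


def aggregate(product_id, engineering, sales, reviews):
    """Combine technical, sales, and review data into one flat record."""
    labels = [label for pid, label in reviews if pid == product_id]
    record = {"product_id": product_id}
    record.update(engineering.get(product_id, {}))
    record.update(sales.get(product_id, {}))
    record["review_count"] = len(labels)
    # Share of positive reviews; None when the product has no reviews.
    record["positive_share"] = (
        labels.count("positive") / len(labels) if labels else None
    )
    return record


record = aggregate("X", engineering, sales, reviews)
print(record)
```

A flat record like this is the kind of input a predictive model could then consume, with one row per product and one column per attribute.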

Basics of real-time data-driven analytics

Delivering insights as new events occur in real time is a challenging task because so much is happening so fast. Modern high-speed processing has shifted the quest for business insight away from traditional data warehousing and toward real-time processing.

But the volume of data is also high — a tremendous amount of varied data, from multiple sources, generated constantly and at different speeds. Companies are eager for scalable predictive analytics solutions that can derive real-time insights from a flood of data that seems to carry “the world and all it contains.”

The demand is intensifying for analyzing data in real time and generating predictions quickly. Consider the real-life example of encountering an online ad placement that corresponds to a purchase you were already about to make. Companies are interested in predictive analytics solutions that can provide such capabilities as the following:

  • Predict — in real time — the specific ad that a site visitor would most likely click (an approach called real-time ad placement).

  • Predict accurately which customers are about to quit a service or product, in order to target those customers with a retention campaign (customer retention and churn modeling).

  • Identify voters who can be influenced through a specific communication strategy such as a home visit, TV ad, phone call, or e-mail. (You can imagine the impact on political campaigning.)

In addition to encouraging buying and voting along desired lines, real-time predictive analytics can serve as a critical tool for the automatic detection of cyber-attacks.
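
To make the real-time ad placement idea concrete, here is a minimal sketch that updates a click-rate estimate for each ad as events arrive and picks the ad with the highest estimate. It is an assumption-laden toy: the AdSelector class, the ad names, and the event stream are hypothetical, and a smoothed running click rate stands in for the much richer models that production systems use.

```python
from collections import defaultdict


class AdSelector:
    """Hypothetical selector that learns click rates one event at a time."""

    def __init__(self, ads):
        self.ads = list(ads)
        self.shows = defaultdict(int)
        self.clicks = defaultdict(int)

    def record(self, ad, clicked):
        """Update counts as soon as an impression event arrives."""
        self.shows[ad] += 1
        if clicked:
            self.clicks[ad] += 1

    def estimated_ctr(self, ad):
        """Smoothed click rate, so unseen ads start at 0.5 instead of 0/0."""
        return (self.clicks[ad] + 1) / (self.shows[ad] + 2)

    def best_ad(self):
        """Return the ad with the highest estimated click rate right now."""
        return max(self.ads, key=self.estimated_ctr)


selector = AdSelector(["ad_a", "ad_b"])
# Made-up event stream of (ad shown, whether the visitor clicked).
events = [("ad_a", False), ("ad_b", True), ("ad_a", False),
          ("ad_b", True), ("ad_a", True)]
for ad, clicked in events:
    selector.record(ad, clicked)
choice = selector.best_ad()
print(choice)
```

Because each event only increments two counters, the estimate is always current and no batch reprocessing of historical data is needed before the next prediction.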