Predictive Analytics For Dummies
A predictive analytics project combines execution of details with big-picture thinking. These handy tips and checklists will help keep your project on the rails and out of the woods.

Building a predictive analytics model

A successful predictive analytics project is executed step by step. As you immerse yourself in the details of the project, watch for these major milestones:

  1. Defining Business Objectives

    The project starts with a well-defined business objective. The model is supposed to address a business question. Clearly stating that objective lets you define the scope of your project and gives you a concrete test for measuring its success.

  2. Preparing Data

    You’ll use historical data to train your model. The data is usually scattered across multiple sources and may require cleansing and preparation. Data may contain duplicate records and outliers; depending on the analysis and the business objective, you decide whether to keep or remove them. The data could also have missing values, may need to undergo some transformation, and may be used to generate derived attributes that have more predictive power for your objective. Overall, the quality of the data largely determines the quality of the model.

  3. Sampling Your Data

    You’ll need to split your data into two sets: a training dataset and a test dataset. You build the model using the training dataset and use the test dataset to verify the accuracy of the model’s output. Doing so is crucial. Otherwise you have no way to detect overfitting: training the model on a limited dataset to the point that it picks up all the characteristics (both the signal and the noise) that are true only for that particular dataset. A model that’s overfitted to a specific dataset will perform poorly when you run it on other datasets. A test dataset that is held out from training gives you a valid way to measure your model’s performance.

  4. Building the Model

    Sometimes the data or the business objectives lend themselves to a specific algorithm or model. Other times the best approach is not so clear-cut. As you explore the data, run as many algorithms as you can and compare their outputs. Base your choice of the final model on the overall results. Sometimes you’re better off running several models side by side on the data and either choosing a final model by comparing their outputs or combining them into an ensemble.

  5. Deploying the Model

    After building the model, you have to deploy it in order to reap its benefits. That process may require coordination with other departments. Aim to build a deployable model. Also be sure you know how to present your results to the business stakeholders in an understandable and convincing way so that they adopt your model. After the model is deployed, you’ll need to monitor its performance and continue improving it. Most models lose accuracy over time. Keep your model up to date by refreshing it with newly available data.
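
The milestones above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production workflow: the data is made up, the two candidate models (a mean baseline and a least-squares line) are stand-ins for whatever algorithms you would actually compare, and every function name and threshold here is hypothetical.

```python
import random
from statistics import mean

def prepare(rows):
    """Step 2: drop exact duplicate records and rows with a missing target."""
    unique = list(dict.fromkeys(rows))            # keeps first occurrence, in order
    return [(x, y) for x, y in unique if y is not None]

def split(rows, test_fraction=0.25, seed=0):
    """Step 3: shuffle, then hold out a test set the model never trains on."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

def fit_mean(train):
    """Candidate A: always predict the average target seen in training."""
    avg = mean(y for _, y in train)
    return lambda x: avg

def fit_line(train):
    """Candidate B: least-squares line y = a + b * x."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in train) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def mae(model, rows):
    """Mean absolute error of a model on (x, y) rows."""
    return mean(abs(model(x) - y) for x, y in rows)

def needs_refresh(model, recent_rows, baseline_error, tolerance=2.0):
    """Step 5: flag the model when its error on new data exceeds
    tolerance times the error measured on the test set (assumed rule)."""
    return mae(model, recent_rows) > tolerance * baseline_error

# Made-up historical data: roughly linear, plus two duplicates and a missing value.
raw = [(x, 10.0 + 3.0 * x + (0.5 if x % 2 == 0 else -0.5)) for x in range(1, 21)]
raw += [raw[0], raw[5], (7, None)]

clean = prepare(raw)
train, test = split(clean)

# Step 4: fit several candidates on the training set, compare them on the test set.
candidates = {"mean baseline": fit_mean(train), "linear fit": fit_line(train)}
test_error = {name: mae(model, test) for name, model in candidates.items()}
best = min(test_error, key=test_error.get)
print("chosen model:", best)

# Made-up newer data in which the underlying relationship has shifted.
recent = [(x, 10.0 + 4.0 * x) for x in range(1, 21)]
print("refresh needed:", needs_refresh(candidates[best], recent, test_error[best]))
```

On this made-up data the linear fit beats the baseline on the held-out rows, and the shifted recent data pushes its error far enough above the test error to trigger a refresh. The point of the sketch is the order of operations: the test rows are set aside before any model is fitted, and every comparison uses only those held-out rows.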

Data sources for predictive analytics projects

Data for a predictive analytics project can come from many different sources. Some of the most common sources are within your own organization; other common sources include data purchased from outside vendors.

Internal data sources include

  • Transactional data, such as customer purchases

  • Customer profiles, such as user-entered information from registration forms

  • Campaign histories, including whether customers responded to advertisements

  • Clickstream data, including the patterns of customers’ web clicks

  • Customer interactions, such as those from e-mails, chats, surveys, and customer-service calls

  • Machine-generated data, such as that from telematics, sensors, and smart meters

External data sources include

  • Social media such as Facebook, Twitter, and LinkedIn

  • Subscription services such as Bloomberg, Thomson Reuters, Esri, and Westlaw

By combining data from several disparate data sources in your predictive models, you may get a better overall view of your customer, and thus a more accurate model.
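
As a rough illustration of combining sources, the sketch below merges three small, made-up internal extracts (transactions, profiles, and campaign responses) into one record per customer. The field names, the `customer_id` key, and the `build_customer_view` helper are all assumptions for the example; real sources rarely share a clean key and usually need matching and cleansing first.

```python
from collections import defaultdict

# Hypothetical extracts from three internal sources, keyed by customer_id.
transactions = [
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 1, "amount": 15.5},
    {"customer_id": 2, "amount": 99.0},
]
profiles = {1: {"region": "west"}, 2: {"region": "east"}, 3: {"region": "east"}}
campaign_responses = {1: True, 3: False}   # customers absent here were never contacted

def build_customer_view(transactions, profiles, campaign_responses):
    """Return one combined feature record per customer in `profiles`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for t in transactions:                 # aggregate purchases per customer
        totals[t["customer_id"]] += t["amount"]
        counts[t["customer_id"]] += 1

    view = {}
    for cid, profile in profiles.items():
        view[cid] = {
            "region": profile["region"],
            "total_spend": totals.get(cid, 0.0),        # derived attribute
            "purchase_count": counts.get(cid, 0),       # derived attribute
            "responded": campaign_responses.get(cid),   # None = no campaign history
        }
    return view

view = build_customer_view(transactions, profiles, campaign_responses)
for cid, record in sorted(view.items()):
    print(cid, record)
```

Customers with no purchases or no campaign history still appear in the combined view, with zero totals or `None`, so the gaps stay visible instead of silently dropping those customers from the model's training data.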

Ensuring success when using predictive analytics

Think of predictive analytics as a bright bulb powered by your data. The light (insight) from predictive analytics can empower your strategy, streamline your operations, and improve your bottom line. The following four recommendations can help you ensure success for your predictive analytics initiatives.

Foster a culture of change

Predictive analytics should be adopted across the organization as a whole. The organization should embrace change. Business stakeholders should be ready to incorporate recommendations and adopt findings derived from predictive analytics projects. The outcomes of a predictive analytics project are valuable only if the business leaders are willing to act on them.

Create a data-science team

Hire a data-science team whose sole job is to establish and support your predictive analytics solutions. This team of talented professionals (business analysts, data scientists, and information technologists) is better equipped to work on the project full-time. Including a range of professional backgrounds can bring valuable insights to the team from other domains. Selecting team members from different departments in your organization can help ensure widespread buy-in.

Use visualization tools effectively

Visualization is a powerful way to convey complex ideas efficiently. Using visualization effectively can help you initially explore and understand the data you’re working with. Visual aids such as charts can also help you evaluate the model’s output or compare the performance of predictive models.

Use predictive analytics tools

Powerful predictive analytics tools are available as software packages in the marketplace. They’re designed to make the whole process a lot easier. Without such tools, building a model from scratch quickly becomes time-intensive. A good predictive analytics tool lets you run multiple scenarios and compare the results almost immediately, all with a few clicks. A tool can automate many of the time-consuming steps required to build and evaluate one or more models.

About This Article

This article is from the book Predictive Analytics For Dummies.

About the book authors:

Anasse Bari, Ph.D., is a data science expert and a university professor who has many years of predictive modeling and data analytics experience.

Mohamed Chaouchi is a veteran software engineer who has conducted extensive research using data mining methods.

Tommy Jung is a software engineer with expertise in enterprise web applications and analytics.
