Published:
October 31, 2016

Predictive Analytics For Dummies

Overview

Use Big Data and technology to uncover real-world insights

You don't need a time machine to predict the future. All it takes is a little knowledge and know-how, and Predictive Analytics For Dummies gets you there fast. With the help of this friendly guide, you'll discover the core of predictive analytics and get started putting it to use with readily available tools to collect and analyze data. In no time, you'll learn how to incorporate algorithms through data models, identify similarities and relationships in your data, and predict the future through data classification. Along the way, you'll develop a roadmap by preparing your data, creating goals, processing your data, and building a predictive model that will get you stakeholder buy-in.

Big Data has taken the marketplace by storm, and companies are seeking qualified talent to quickly fill positions analyzing the massive amounts of data being collected each day. If you want to get in on the action and either learn or deepen your understanding of how to use predictive analytics to find real relationships between what you know and what you want to know, everything you need is a page away!

  • Offers common use cases to help you get started
  • Covers details on modeling, k-means clustering, and more
  • Includes information on structuring your data
  • Provides tips on outlining business goals and approaches

The future starts today with the help of Predictive Analytics For Dummies.


About The Author

Anasse Bari, Ph.D., is a data science expert and a university professor who has many years of predictive modeling and data analytics experience.

Mohamed Chaouchi is a veteran software engineer who has conducted extensive research using data mining methods.

Tommy Jung is a software engineer with expertise in enterprise web applications and analytics.

Sample Chapters


CHEAT SHEET

A predictive analytics project combines execution of details with big-picture thinking. These handy tips and checklists will help keep your project on the rails and out of the woods. Building a predictive analytics model: A successful predictive analytics project is executed step by step. As you immerse yourself in the details of the project, watch for these major milestones. Defining Business Objectives: The project starts with a well-defined business objective.


Articles from the book

Principal component analysis (PCA) is a valuable technique that is widely used in predictive analytics and data science. It studies a dataset to learn the most relevant variables responsible for the highest variation in that dataset. PCA is mostly used as a data reduction technique. While building predictive models, you may need to reduce the number of features describing your dataset.
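As a rough illustration of PCA as a data reduction step, here is a minimal sketch using NumPy. The helper name `pca_reduce` and the toy dataset are assumptions made for this example and do not come from the book; in practice a library implementation would normally be used.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project features onto their top principal components.

    Returns the reduced data and the share of total variance
    explained by each kept component.
    """
    # Center each column so the covariance reflects variation around the mean.
    centered = features - features.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)

    # eigh returns eigenvalues in ascending order, so sort them descending.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    reduced = centered @ eigenvectors[:, :n_components]
    explained_ratio = eigenvalues[:n_components] / eigenvalues.sum()
    return reduced, explained_ratio


# Toy dataset (hypothetical): the second feature is exactly twice the first,
# and the third feature is a small alternating signal.
x = np.arange(10.0)
features = np.column_stack([x, 2 * x, np.tile([0.0, 1.0], 5)])

reduced, explained_ratio = pca_reduce(features, n_components=1)
```

Because two of the three features here carry the same information, a single component should retain nearly all of the variation, which is the situation in which reducing the number of features costs little.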
Big data has the potential to inspire businesses to make better decisions through predictive analytics. It's important to be aware of the tools that can quickly help you create good visualizations. You want to always keep your audience engaged and interested. Here are some popular visualization tools for large-scale enterprise analytics.
A successful predictive analytics project is executed step by step. As you immerse yourself in the details of the project, watch for these major milestones. Defining Business Objectives: The project starts with a well-defined business objective. The model is supposed to address a business question. Clearly stating that objective will allow you to define the scope of your project, and will provide you with the exact test to measure its success.
Models are necessary to perform predictive analytics. A model is nothing but a mathematical representation of a segment of the world people are interested in. A model can mimic behavioral aspects of our customers. It can represent the different customer segments. A well-made, well-tuned model can forecast — predict with high accuracy — the next outcome of a given event.
Various statistical, data-mining, and machine-learning algorithms are available for use in your predictive analytics model. You're in a better position to select an algorithm after you've defined the objectives of your model and selected the data you'll work on. Some of these algorithms were developed to solve specific business problems, enhance existing algorithms, or provide new capabilities — which may make some of them more appropriate for your purposes than others.
You'll need to make sure that the data is clean of extraneous stuff before you can use it in your predictive analysis model. This includes finding and correcting any records that contain erroneous values, and attempting to fill in any missing values. You'll also need to decide whether to include duplicate records (two customer accounts, for example).
Predictive analytics makes heavy use of three related disciplines: data mining, statistics, and machine learning. All four disciplines intersect to such a large degree that their names are often used interchangeably. Just to keep the record straight, there are some distinctions: predictive analytics combines many of the techniques, tools, and algorithms that data mining, statistics, and machine learning have in common.
Data for a predictive analytics project can come from many different sources. Some of the most common sources are within your own organization; other common sources include data purchased from outside vendors.

Internal data sources include

  • Transactional data, such as customer purchases
  • Customer profiles, such as user-entered information from registration forms
  • Campaign histories, including whether customers responded to advertisements
  • Clickstream data, including the patterns of customers’ web clicks
  • Customer interactions, such as those from e-mails, chats, surveys, and customer-service calls
  • Machine-generated data, such as that from telematics, sensors, and smart meters

External data sources include

  • Social media such as Facebook, Twitter, and LinkedIn
  • Subscription services such as Bloomberg, Thomson Reuters, Esri, and Westlaw

By combining data from several disparate data sources in your predictive models, you may get a better overall view of your customer, and thus a more accurate model.
When you are dealing with predictive analytics, make sure you understand the demands associated with big data. Be sure to make a clear distinction between business intelligence and data mining. Here are the basics of the distinction: Business intelligence (BI) is about building a model that answers specific business questions.
Think of predictive analytics as a bright bulb powered by your data. The light (insight) from predictive analytics can empower your strategy, streamline your operations, and improve your bottom line. The following four recommendations can help you ensure success for your predictive analytics initiatives. Foster a culture of change: Predictive analytics should be adopted across the organization as a whole.
In perspective, the goal of designing an architecture for data analytics comes down to building a framework for capturing, sorting, and analyzing big data for the purpose of discovering actionable results. There is no one correct way to design the architectural environment for big data analytics.
To assemble your predictive analytics team, you'll need to recruit business analysts, data scientists, and information technologists. Regardless of their particular areas of expertise, your team members should be curious, engaged, motivated, and excited to dig as deep as necessary to make the project — and the business — succeed.
The random forest model is an ensemble model that can be used in predictive analytics; it takes an ensemble (selection) of decision trees to create its model. The idea is to train many weak learners (decision trees), each on a random subset of the training data, and have them vote on the final prediction. The random forest model can be used for either classification or regression.
One clustering algorithm offered in scikit-learn that can be used in predictive analytics is the mean shift algorithm. This algorithm, like DBSCAN, doesn't require you to specify the number of clusters, or any other parameters, when you create the model. The primary tuning parameter for this algorithm is called the bandwidth parameter.
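To make the role of the bandwidth parameter concrete, here is a minimal sketch using scikit-learn's `MeanShift`. The synthetic two-blob dataset and the bandwidth value of 2.0 are assumptions chosen for illustration, not values from the book.

```python
import numpy as np
from sklearn.cluster import MeanShift

# Synthetic data (hypothetical): two well-separated groups of 50 points each.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=(10.0, 10.0), scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

# No cluster count is given; the bandwidth sets how wide a neighborhood
# each point considers when it shifts toward a region of higher density.
model = MeanShift(bandwidth=2.0)
labels = model.fit_predict(points)

# The number of clusters is discovered from the data.
n_clusters = len(model.cluster_centers_)
```

A bandwidth that is too small tends to split one group into several clusters, while one that is too large tends to merge distinct groups. If `bandwidth` is left unset, scikit-learn estimates a value from the data.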
There are two main challenges of big data as it applies to predictive analytics: velocity and volume. These are (respectively) the rate at which data is being generated, received, and analyzed, and the growing mass of data. Data velocity: Velocity is the speed of an object moving in a specific direction. Data velocity refers to a challenge of big data: the rate at which data is being generated, captured, or delivered.
Predictive analytics can help websites with personalization. You may have noticed that websites remember what you did or which pages you looked at on their website last week or last month. Such websites are tracking your behavior, from clicks on certain parts of the page to the order of the pages you viewed for a session, to offer you the most relevant advertisements, products, or news articles.
You can leverage singular value decomposition for predictive analytics. Singular value decomposition (SVD) represents a dataset by eliminating the less important parts and generating an accurate approximation of the original dataset. In this regard, SVD and PCA are methods of data reduction. SVD will take a matrix as an input and decompose it into a product of three simpler matrices.
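The decomposition into three matrices, and the approximation obtained by dropping the less important parts, can be sketched with NumPy's `np.linalg.svd`. The helper name `low_rank_approximation` and the small example matrix are assumptions made for this illustration.

```python
import numpy as np


def low_rank_approximation(matrix, rank):
    """Approximate a matrix using only its largest `rank` singular values."""
    # matrix == u @ diag(singular_values) @ vt, with singular values
    # returned in descending order of importance.
    u, singular_values, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] @ np.diag(singular_values[:rank]) @ vt[:rank, :]


# Example matrix (hypothetical): rows 1, 2, and 4 are multiples of the same
# pattern, so the matrix carries only two independent pieces of information.
data = np.array([
    [2.0, 0.0, 1.0],
    [4.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
    [6.0, 0.0, 3.0],
])

approx_rank_2 = low_rank_approximation(data, rank=2)
approx_rank_1 = low_rank_approximation(data, rank=1)

# Size of what is lost when only the most important part is kept.
rank_1_error = np.linalg.norm(data - approx_rank_1)
```

In this example two singular values are enough to rebuild the matrix, while keeping only one loses the row that does not follow the dominant pattern.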
A visualization can represent a simulation (a pictorial representation of a what-if scenario) in predictive analytics. You can follow up a visualization of a prediction with a simulation that overlaps and supports the prediction. For example, what happens if the company stops manufacturing Product D? What happens if a natural disaster strikes the home office?
As much as you may not like it, your predictive analytics job is not over when your model goes live. Successful deployment of the model in production is no time to relax. You'll need to closely monitor its accuracy and performance over time. A model tends to degrade over time (some faster than others); and a new infusion of energy is required from time to time to keep that model up and running.
When you've defined the objectives of the model, the next step in predictive analytics is to identify and prepare the data you'll use to build your model. The following information touches upon the most important activities. The general sequence of steps looks like this: Identify your data sources. Data could be in different formats or reside in various locations.
Prognostics is an engineering field that aims at predicting the future state of a system. Prognostics improves the process of scheduling maintenance, ordering parts, and using resources. Prof. David Nagel, a renowned expert in nuclear energy, educator, and researcher, derived an interesting correlation between the field of predictive analytics and the older field of prognostics.
Raw data is a potential resource for predictive analytics, but it can't be usefully analyzed until it's been given a consistent structure. Data residing in multiple systems has to be collected and transformed to get it ready for analysis. The collected data should reside in a separate system so it won't interfere with the live production system.
Open Data could become a very useful tool for predictive analytics. Bob Lytle, the CEO of rel8ed.to, and most recently known as the former CIO of TransUnion Canada, is leading efforts on the use of public information as an alternative and strategic data source for predictive modeling in the financial services and insurance sectors.
In order to ensure a successful deployment of the predictive model you're building, you'll need to think about deployment very early on. The business stakeholders should have a say in what the final model looks like. Thus, at the beginning of the project, be sure your team discusses the required accuracy of the intended model and how best to interpret its results.
Data mining is a necessary part of predictive analytics. In data mining, data classification is the process of labeling a data item as belonging to a class or category. A data item is also referred to (in the data-mining vocabulary) as data object, observation, or instance. Data clustering is different from data classification: Data clustering is used to describe data by extracting meaningful groupings or categories from a body of data that contains similar elements.
Predictive analytics begins with good data. More data doesn't necessarily mean better data. A successful predictive analytics project requires, first and foremost, relevant and accurate data. Keeping it simple isn't stupid: If you're trying to address a complex business decision, you may have to develop equally complex models.
Often, you need to be able to show the results of your predictive analytics to those who matter. Here are some ways to use visualization techniques to report the results of your models to the stakeholders. Visualizing hidden groupings in your data: Data clustering is the process of discovering hidden groups of related items within your data.
