Coding All-in-One For Dummies
Book image
Explore Book Buy On Amazon
Both averaging and voting systems can also work fine when you use a mix of different machine learning algorithms. This is the averaging approach, and it’s widely used when you can’t reduce the estimate variance.

As you try to learn from data, you have to try different solutions, thus modeling your data using different machine learning solutions. It’s good practice to check whether you can put some of them successfully into ensembles using prediction averages or by counting the predicted classes. The principle is the same as in bagging noncorrelated predictions, when models mixed together can produce less variance-affected predictions. To achieve effective averaging, you have to

  1. Divide your data into training and test sets.
  2. Use the training data with different machine learning algorithms.
  3. Record predictions from each algorithm and evaluate the viability of the result using the test set.
  4. Correlate all the predictions available with each other.
  5. Pick the predictions that least correlate and average their result. Or, if you’re classifying, pick a group of least correlated predictions and, for each example, pick as a new class prediction the class that the majority of them predicted.
  6. Test the newly averaged or voted-by-majority prediction against the test data. If successful, you create your final model by averaging the results of the models part of the successful ensemble.

To understand which models correlate the least, take the predictions one by one, correlate each one against the others, and average the correlations to obtain an averaged correlation. Use the averaged correlation to rank the selected predictions that are most suitable for averaging.

About This Article

This article is from the book:

About the book author:

This All-in-One includes work by expert coders and coding educators, including Chris Minnick and Eva Holland coauthors of Coding with JavaScript For Dummies; Nikhil Abraham, author of Coding For Dummies and Getting a Coding Job For Dummies; John Paul Mueller and Luca Massaron, coauthors of Python for Data Science For Dummies and Machine Learning For Dummies; and Barry Burd, author of Flutter For Dummies.

This article can be found in the category: