How to Make New Analytical Predictions with R Regression

By Anasse Bari, Mohamed Chaouchi, Tommy Jung

To make analytical predictions with new data, you simply call the predict function with the model and a list of the seven attribute values. The following code does that job:

> newPrediction <- predict(model,
    list(cylinders=factor(4), displacement=370,
         horsepower=150, weight=3904, acceleration=12,
         modelYear=factor(70), origin=factor(1)),
    interval="prediction", level=.95)

The following code prints the new prediction, followed by its output:

> newPrediction
       fit     lwr      upr
1 14.90128 8.12795 21.67462

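What you have here is your first real prediction from the regression model: fit is the predicted value, and lwr and upr are the lower and upper bounds of the 95 percent prediction interval. Because the prediction comes from unseen data and you don’t know the actual outcome, you can’t compare it against anything to find out whether it was correct.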
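The same call pattern can be tried end to end on data that ships with R. The following is only a minimal sketch: it assumes the built-in mtcars dataset as a stand-in for the article's data, and the names cars, demoModel, newCar, and demoPrediction are hypothetical.

```r
# Stand-in data: the built-in mtcars dataset (an assumption, not the article's data).
cars <- mtcars[, c("mpg", "cyl", "disp", "hp", "wt")]
cars$cyl <- factor(cars$cyl)   # treat the cylinder count as a categorical variable

# Fit a linear regression of mpg on the other four columns.
demoModel <- lm(mpg ~ cyl + disp + hp + wt, data = cars)

# One new, unseen observation. The factor value must be a level the model was trained on.
newCar <- data.frame(cyl  = factor(4, levels = levels(cars$cyl)),
                     disp = 120, hp = 95, wt = 2.5)

# Point prediction plus the bounds of a 95 percent prediction interval.
demoPrediction <- predict(demoModel, newdata = newCar,
                          interval = "prediction", level = 0.95)
print(demoPrediction)
```

The result is a one-row matrix with the columns fit, lwr, and upr, the same layout as the output shown above.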

If you’ve evaluated the model with the testing dataset and you’re happy with its accuracy, you can be confident that you’ve built a good predictive model. To measure how effective the model really is, however, you’ll have to wait for business results.
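A test-set evaluation of this kind might look like the following sketch. It assumes the built-in mtcars dataset and a 70/30 random split; the split ratio, the seed, and the names trainSet, testSet, evalModel, rmse, and coverage are illustrative choices, not taken from the article.

```r
# Stand-in data and a reproducible 70/30 train/test split (both are assumptions).
set.seed(42)
cars     <- mtcars[, c("mpg", "disp", "hp", "wt")]
trainIdx <- sample(nrow(cars), size = round(0.7 * nrow(cars)))
trainSet <- cars[trainIdx, ]
testSet  <- cars[-trainIdx, ]

# Fit on the training rows only.
evalModel <- lm(mpg ~ ., data = trainSet)

# Predict the held-out rows, with 95 percent prediction intervals.
testPred <- predict(evalModel, newdata = testSet,
                    interval = "prediction", level = 0.95)

# Root mean squared error of the point predictions on the test set.
rmse <- sqrt(mean((testSet$mpg - testPred[, "fit"])^2))

# Share of test rows whose actual value falls inside its prediction interval.
coverage <- mean(testSet$mpg >= testPred[, "lwr"] &
                 testSet$mpg <= testPred[, "upr"])

print(c(rmse = rmse, coverage = coverage))
```

A low error and a coverage close to the chosen level suggest that the model generalizes to the held-out rows, which is the kind of evidence the testing step is meant to provide.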

There may be optimizations you can make to build a better and more efficient predictive model. By experimenting, you may find the best combination of predictors to create a faster and more accurate model.
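One way to automate part of that experimentation is backward stepwise selection with the step function from R's stats package, which drops predictors as long as doing so lowers the AIC. This is a swapped-in technique shown only as a sketch, not necessarily the authors' method; it assumes the built-in mtcars dataset, and fullModel and smallModel are hypothetical names.

```r
# Start from a model that uses several candidate predictors (stand-in data: mtcars).
fullModel <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec, data = mtcars)

# Backward elimination by AIC; trace = 0 suppresses the step-by-step log.
smallModel <- step(fullModel, direction = "backward", trace = 0)

# Inspect which predictors survived and compare the two models.
print(formula(smallModel))
print(c(fullAIC = AIC(fullModel), smallAIC = AIC(smallModel)))
```

A lower AIC does not guarantee better predictions on new data, so any candidate model chosen this way should still be checked against the testing dataset.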

One way to construct a subset of the features is to compute the correlation between the variables and remove one variable from each highly correlated pair. By removing redundant variables that add little or no information to the fit, you can increase the speed of the model. This is especially true when you’re dealing with many observations (rows of data), where processing power or speed might be an issue.
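The correlation-based filtering described above can be sketched in base R as follows. The example assumes the built-in mtcars dataset and a cutoff of 0.85; the cutoff and the names predictors, corMatrix, toDrop, and reduced are illustrative assumptions, and the greedy rule used here is only one of several reasonable ways to pick which variable of a pair to drop.

```r
# Candidate numeric predictors (stand-in data: mtcars).
predictors <- mtcars[, c("cyl", "disp", "hp", "drat", "wt", "qsec")]

# Absolute pairwise correlations; keep only the upper triangle so each pair appears once.
corMatrix <- abs(cor(predictors))
corMatrix[lower.tri(corMatrix, diag = TRUE)] <- 0

cutoff <- 0.85              # hypothetical threshold for "highly correlated"
toDrop <- character(0)

# Greedy pass: drop a variable if it is highly correlated with an earlier variable
# that is still being kept.
for (name in colnames(corMatrix)) {
  kept <- setdiff(rownames(corMatrix), toDrop)
  if (any(corMatrix[kept, name] > cutoff)) {
    toDrop <- c(toDrop, name)
  }
}

# The reduced set of predictors to use when refitting the model.
reduced <- predictors[, setdiff(colnames(predictors), toDrop), drop = FALSE]
print(toDrop)
print(colnames(reduced))
```

After this pass, no two remaining predictors have an absolute correlation above the cutoff, so the refitted model carries less redundant information.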

For a big dataset, each additional attribute in a row of data slows down the processing, so you should try to eliminate as much redundant information as possible.