R Projects For Dummies
Try out this R project to see how one variable might affect an outcome. It’s conceivable that weather conditions could influence flight delays. How do you incorporate weather information into the assessment of delay?

The nycflights13 data frame called weather provides hour-by-hour weather data for each of the three origin airports. Here’s a glimpse of what it contains:

> glimpse(weather,60)
Observations: 26,130
Variables: 15
$ origin      "EWR", "EWR", "EWR", "EWR", "EWR", "...
$ year        2013, 2013, 2013, 2013, 2013, 2013, ...
$ month       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ day         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ hour        0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 1...
$ temp        37.04, 37.04, 37.94, 37.94, 37.94, 3...
$ dewp        21.92, 21.92, 21.92, 23.00, 24.08, 2...
$ humid       53.97, 53.97, 52.09, 54.51, 57.04, 5...
$ wind_dir    230, 230, 230, 230, 240, 270, 250, 2...
$ wind_speed  10.35702, 13.80936, 12.65858, 13.809...
$ wind_gust   11.918651, 15.891535, 14.567241, 15....
$ precip      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ pressure    1013.9, 1013.0, 1012.6, 1012.7, 1012...
$ visib       10, 10, 10, 10, 10, 10, 10, 10, 10, ...
$ time_hour   2012-12-31 19:00:00, 2012-12-31 20:...
So the variables it has in common with flites_day are the first five (origin, year, month, day, and hour) and the last one (time_hour). To join the two data frames on those six variables, use this code:
flites_day_weather <- flites_day %>%
  inner_join(weather, by = c("origin","year","month","day","hour","time_hour"))
Now you can use flites_day_weather to start answering questions about departure delay and the weather.
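
Before you dig in, it can help to see what an inner join keeps and what it drops. Here's a minimal sketch on made-up data. It uses base R's merge(), which by default performs the same kind of inner join as inner_join(), so it runs without loading any packages. The data frames flights_toy and weather_toy, and all the values in them, are invented for illustration and are not part of nycflights13.

```r
# Toy stand-ins for the flight and weather tables (all values are invented)
flights_toy <- data.frame(
  origin    = c("EWR", "EWR", "JFK", "LGA"),
  hour      = c(6, 7, 6, 9),
  dep_delay = c(5, 20, -3, 12)
)
weather_toy <- data.frame(
  origin = c("EWR", "EWR", "JFK"),
  hour   = c(6, 7, 6),
  precip = c(0, 0.1, 0)
)

# Inner join on the shared key columns: only flights that have
# a matching weather row are kept
joined_toy <- merge(flights_toy, weather_toy, by = c("origin", "hour"))

# Number of flights dropped because no weather row matched their keys
n_dropped <- nrow(flights_toy) - nrow(joined_toy)

print(joined_toy)
print(n_dropped)
```

With the real data, comparing nrow(flites_day) with nrow(flites_day_weather) gives the same kind of count, which tells you how many flights had no matching weather record.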

What questions will you ask? How will you answer them? What plots will you draw? What regression lines will you create? Will scale() help?
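
As one possible starting point for these questions, the sketch below fits a regression line and shows what scale() does to the variables. It's only a sketch under stated assumptions: the data frame toy is simulated, and the relationship between precip and dep_delay in it is made up. With the real data, you would use columns of flites_day_weather instead.

```r
# Simulated data: delay loosely increases with precipitation (invented values)
set.seed(1)
toy <- data.frame(precip = runif(50, 0, 1))
toy$dep_delay <- 10 + 30 * toy$precip + rnorm(50, sd = 5)

# Regression line: departure delay as a function of precipitation
fit <- lm(dep_delay ~ precip, data = toy)
print(coef(fit))

# scale() converts a variable to z-scores (mean 0, standard deviation 1)
toy$precip_z <- as.numeric(scale(toy$precip))
toy$delay_z  <- as.numeric(scale(toy$dep_delay))

# With both variables standardized, the regression slope
# equals the correlation between the two original variables
fit_z <- lm(delay_z ~ precip_z, data = toy)
print(coef(fit_z)["precip_z"])
print(cor(toy$precip, toy$dep_delay))
```

Standardizing puts variables measured in different units (such as temp and wind_speed) on a common footing, which is one way scale() might help when you compare several weather variables as predictors of delay.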

And, when you’re all done, take a look at arrival delay (arr_delay).

About This Article

This article is from the book: R Projects For Dummies

About the book author:

Joseph Schmuller, PhD, is a veteran of more than 25 years in Information Technology. He is the author of several books, including Statistical Analysis with R For Dummies and four editions of Statistical Analysis with Excel For Dummies. In addition, he has written numerous articles and created online coursework for Lynda.com.
