R Projects For Dummies

Overview

Make the most of R’s extensive toolset

R Projects For Dummies offers a unique learn-by-doing approach. You will increase the depth and breadth of your R skillset by completing a wide variety of projects. By using R’s graphics, interactive, and machine learning tools, you’ll learn to apply R’s extensive capabilities in an array of scenarios. The depth of the project experience is unmatched by any other content online or in print. And you just might increase your statistics knowledge along the way, too!

R is a free tool, and it’s the basis of a huge amount of work in data science. It’s taking the place of costly statistical software that sometimes takes a long time to learn. One reason is that you can use just a few R commands to create sophisticated analyses. Another is that easy-to-learn R graphics enable you to make the results of those analyses available to a wide audience.

This book will help you sharpen your skills by applying them in the context of projects with R, including dashboards, image processing, data reduction, mapping, and more.

  • Appropriate for R users at all levels
  • Helps R programmers plan and complete their own projects
  • Focuses on R functions and packages
  • Shows how to carry out complex analyses by just entering a few commands

If you’re brand new to R or just want to brush up on your skills, R Projects For Dummies will help you complete your projects with ease.

R Projects For Dummies Cheat Sheet

To complete any project using R, you work with functions that live in packages designed for specific areas. This cheat sheet provides some information about these functions.

Articles From The Book

R Articles

R Project: Combining an Image with an Animated Image

If you’ve been working with images, animated images, and combined stationary images in R, it may be time to take the next step. This project walks you through the next step: Combine an image with an animated image. This image shows the end product — the plot of the iris data set with comedy icons Laurel and Hardy positioned in front of the plot legend. When you open this combined image in the Viewer, you see Stan and Ollie dancing their little derbies off. (The derbies don’t actually come off in the animation, but you get the drift.)

Getting Stan and Ollie

Check out the Laurel and Hardy GIF. Right-click the image and select Save Image As from the pop-up menu that appears. Save it as animated-dancing-image-0243 in your Documents folder. Then read it into R:
l_and_h <- image_read("animated-dancing-image-0243.gif")
Applying the length() function to l_and_h
> length(l_and_h)
[1] 10
indicates that this GIF consists of ten frames.

To add a coolness factor, make the background of the GIF transparent before image_read() works with it. This free online image editor does the job quite nicely.

Combining the boys with the background

If you use the image combination technique, the code looks like this:
image_composite(image=background, composite_image=l_and_h, offset = "+510+200")
The picture it produces looks like the image above but with one problem: The boys aren’t dancing. Why is that? The reason is that image_composite() combined the background with just the first frame of l_and_h, not with all ten. It’s exactly the same as if you had run
image_composite(image=background, composite_image=l_and_h[1], 
                offset = "+510+200")
The length() function verifies this:
> length(image_composite(image=background, composite_image=l_and_h, 
         offset = "+510+200"))
[1] 1
If all ten frames were involved, the length() function would have returned 10. To get this done properly, you have to use a magick function called image_apply().

Explaining image_apply()

To understand how this important function works, consider an analogous function called lapply(). If you want to apply a function (like mean()) to the variables of a data frame, like iris, one way to do that is with a for loop: Start with the first column and calculate its mean, go to the next column and calculate its mean, and so on until you calculate all the column means. It’s more compact, and often more efficient, to use lapply() to apply mean() to all the variables:
> lapply(iris, mean)
$Sepal.Length
[1] 5.843333

$Sepal.Width
[1] 3.057333

$Petal.Length
[1] 3.758

$Petal.Width
[1] 1.199333

$Species
[1] NA
A warning message comes with that last one (Species is a factor rather than a numeric variable, so mean() returns NA), but that’s okay. Another way to write lapply(iris, mean) is
lapply(iris, function(x){mean(x)})
This second way comes in handy when the function becomes more complicated. If, for some reason, you want to square the value of each score in the data set, multiply the result by three, and then calculate the mean of each column, here’s how to code it:
lapply(iris, function(x){mean(3*(x^2))})
In a similar way, image_apply() applies a function to every frame in an animated GIF. In this project, the function that gets applied to every frame is image_composite():
function(frame){image_composite(image=background, composite_image=frame, offset = "+510+200")}
So, within image_apply(), that’s
frames <- image_apply(image=l_and_h, function(frame) {
  image_composite(image=background, composite_image=frame, offset = "+510+200")
})
After you run that code, length(frames) verifies the ten frames:
> length(frames)
[1] 10
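
To see the lapply() analogy on its own, here is a minimal sketch in base R that compares the for loop with lapply(). It assumes you restrict iris to its numeric columns (which sidesteps the Species warning); the names iris_num, loop_means, apply_means, and scaled_means are made up for this illustration.

```r
# Keep only the numeric columns so mean() never sees the Species factor
iris_num <- iris[sapply(iris, is.numeric)]

# for-loop version: visit each column in turn and store its mean
loop_means <- vector("list", length(iris_num))
names(loop_means) <- names(iris_num)
for (col in names(iris_num)) {
  loop_means[[col]] <- mean(iris_num[[col]])
}

# lapply() version: one call applies mean() to every column
apply_means <- lapply(iris_num, mean)

# Anonymous-function form, for when the per-column work is more involved
scaled_means <- lapply(iris_num, function(x) mean(3 * x^2))

# Compare the two sets of column means
all.equal(loop_means, apply_means)
```

The anonymous-function form is the one that carries over to image_apply(): the function receives one element at a time (a column here, a frame there) and returns the processed result.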

Getting back to the animation

The image_animate() function puts it all in motion at ten frames per second:
animation <- image_animate(frames, fps = 10)
To put the show on the screen, it’s
print(animation)
All together now:
l_and_h <- image_read("animated-dancing-image-0243.gif")
background <- image_background(iris_plot, "white")

frames <- image_apply(image=l_and_h, function(frame) {
  image_composite(image=background, composite_image=frame, offset = "+510+200")
})

animation <- image_animate(frames, fps = 10)
print(animation)
And that’s the code for the image above. One more thing: The image_write() function saves the animation as a handy little reusable GIF:
image_write(animation, "LHirises.gif")
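
If you don’t have the GIF and the iris plot handy, you can try the same per-frame technique on synthetic images. The following is a minimal sketch, assuming the magick package is installed. Three solid-color squares stand in for the GIF frames and a blank white canvas stands in for the plot; the demo_ names, sizes, and offset are hypothetical and chosen only for illustration.

```r
library(magick)

# Three solid-color squares stand in for the frames of an animated GIF
demo_frames <- c(
  image_blank(40, 40, color = "red"),
  image_blank(40, 40, color = "green"),
  image_blank(40, 40, color = "blue")
)

# A plain white canvas stands in for the iris plot
demo_background <- image_blank(200, 120, color = "white")

# Composite each frame onto the canvas, one frame at a time
demo_combined <- image_apply(demo_frames, function(frame) {
  image_composite(image = demo_background, composite_image = frame,
                  offset = "+150+70")
})

# One composited image per input frame
length(demo_combined)

# Animate the composited frames at ten frames per second
demo_animation <- image_animate(demo_combined, fps = 10)
```

Printing demo_animation in RStudio should show the square changing color in the lower-right corner of the canvas, which is the same mechanism that keeps Stan and Ollie dancing.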

11 Useful Resources for R Programmers

Here, you learn about books and websites that help you learn more about R programming. Without further ado. . .

Interacting with users

If you want to delve deeper into R applications that interact with users, start with this tutorial by shiny guiding force Garrett Grolemund. For a helpful book on the subject, consider Chris Beeley’s Web Application Development with R Using Shiny, 2nd Edition (Packt Publishing, 2016).

Machine learning

For the lowdown on all things Rattle, go directly to the source: Rattle creator Graham Williams has written Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery (Springer, 2011). Check out the companion website.

The University of California-Irvine Machine Learning Repository plays a huge role in the R programming world. Here’s how its creator prefers that you cite the material: Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science. Thank you, UCI Anteaters!

If machine learning interests you, take a comprehensive look at the field (under its other name, “statistical learning”): Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani’s An Introduction to Statistical Learning with Applications in R (Springer, 2017).

An Introduction to Neural Networks, by Ben Krose and Patrick van der Smagt, is a little dated, but you can get it for the low, low price of nothing.

After you download a large PDF, it’s a good idea to upload it into an ebook app, like Google Play Books. That turns the PDF into an ebook and makes it easier to navigate on a tablet.

Databases

The R-bloggers website has a nice article on working with databases. Of course, R-bloggers has terrific articles on a lot of R-related topics! You can learn quite a bit about RFM (Recency, Frequency, Monetary) analysis and customer segmentation at www.putler.com/rfm-analysis.

Maps and images

The area of maps is a fascinating one. You might be interested in something at a higher level. If so, read Introduction to visualising spatial data in R by Robin Lovelace, James Cheshire, Rachel Oldroyd (and others). David Kahle and Hadley Wickham’s ggmap: Spatial Visualization with ggplot2 is also at a higher level. Fascinated by magick? The best place to go is the primary source. Check it out.

R Project: Delay and Weather

Try out this R project to see how one variable might affect an outcome. It’s conceivable that weather conditions could influence flight delays. How do you incorporate weather information into the assessment of delay? One nycflights13 data frame called weather provides the weather data for every day and hour at each of the three origin airports. Here’s a glimpse of exactly what it has:

> glimpse(weather,60)
Observations: 26,130
Variables: 15
$ origin      "EWR", "EWR", "EWR", "EWR", "EWR", "...
$ year        2013, 2013, 2013, 2013, 2013, 2013, ...
$ month       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ day         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ hour        0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 1...
$ temp        37.04, 37.04, 37.94, 37.94, 37.94, 3...
$ dewp        21.92, 21.92, 21.92, 23.00, 24.08, 2...
$ humid       53.97, 53.97, 52.09, 54.51, 57.04, 5...
$ wind_dir    230, 230, 230, 230, 240, 270, 250, 2...
$ wind_speed  10.35702, 13.80936, 12.65858, 13.809...
$ wind_gust   11.918651, 15.891535, 14.567241, 15....
$ precip      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ pressure    1013.9, 1013.0, 1012.6, 1012.7, 1012...
$ visib       10, 10, 10, 10, 10, 10, 10, 10, 10, ...
$ time_hour   2012-12-31 19:00:00, 2012-12-31 20:...
So the variables it has in common with flites_name_day are the first five and the last one. To join the two data frames, use this code:
flites_day_weather <- flites_day %>%
  inner_join(weather, by = c("origin","year","month","day","hour","time_hour"))
Now you can use flites_day_weather to start answering questions about departure delay and the weather. What questions will you ask? How will you answer them? What plots will you draw? What regression lines will you create? Will scale() help? And, when you’re all done, take a look at arrival delay (arr_delay).
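
To see what a multi-key inner join keeps and drops, here is a minimal sketch on toy data, assuming the dplyr package is installed. The data frames toy_flights and toy_weather and all their values are invented for illustration; they only mimic the shape of the flight and weather data.

```r
library(dplyr)

# Toy stand-in for the flight data (hypothetical values)
toy_flights <- data.frame(
  origin    = c("EWR", "JFK", "LGA"),
  year      = 2013, month = 1, day = 1,
  hour      = c(5, 6, 7),
  dep_delay = c(2, -3, 15)
)

# Toy stand-in for the weather data; note there is no LGA row
toy_weather <- data.frame(
  origin = c("EWR", "JFK"),
  year   = 2013, month = 1, day = 1,
  hour   = c(5, 6),
  precip = c(0, 0.1)
)

# Keep only the rows whose key columns match in both data frames
toy_joined <- toy_flights %>%
  inner_join(toy_weather, by = c("origin", "year", "month", "day", "hour"))

toy_joined
```

Because inner_join() keeps only rows with a match on every key, the LGA flight drops out here. It’s worth comparing row counts before and after the real join to see how many flights have no matching weather record.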