A Summer of Learning, R Part 2

[Image: 2015MLB-color]

Last week I talked about setting up vectors and querying them using R code. Today I want to go into a bit more detail about using R to produce visualizations. The question I wanted to answer was which of the two Major League Baseball leagues had more wins in the 2015 season.

To begin, I got the regular season standings data from MLB.com. Since I don’t yet know how to scrape data from the website into R, I entered it manually in RStudio by creating a vector of win totals for each league. I named one vector “a” for the American League and the other vector “n” for the National League. Then I made a vector “z” to combine the data from both leagues into a single vector.
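As a rough sketch of that step, the vectors can be built with `c()`. The win totals below are short placeholder values for illustration only, not the actual 2015 standings; only the names `a`, `n`, and `z` come from the post.

```r
# Placeholder win totals for illustration only; NOT the actual 2015 standings.
a <- c(95, 88, 81, 76, 68)   # stand-in for American League wins
n <- c(100, 92, 84, 71, 63)  # stand-in for National League wins

# c() concatenates the two vectors into one flat numeric vector.
z <- c(a, n)

length(z)  # number of values across both vectors
sum(a)     # total wins in vector a
sum(n)     # total wins in vector n
```

Comparing `sum(a)` with `sum(n)` is one simple way to see which league has more wins overall once the real values are entered.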

 

Next, I assigned the name ‘dataList’ to the ‘z’ vector. Since I wanted the box plots to be side-by-side and in different colors, blue for the American League and lavender for the National League, I used the following code. Note that you assign names to the x and y axes as well as the main title in the same call. I think it’s pretty cool that with one line of code you can do so many things in the boxplot.

[Image: R2]
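The original code lives in the image above, so here is only a minimal sketch of what a call like that might look like, not necessarily the exact line I used. It assumes `dataList` is a named list holding the two league vectors, since `boxplot()` draws one box per list element. The win totals are placeholders, and the title and axis label text are made up for illustration.

```r
# Placeholder win totals for illustration only; NOT the actual 2015 standings.
a <- c(95, 88, 81, 76, 68)   # stand-in for American League wins
n <- c(100, 92, 84, 71, 63)  # stand-in for National League wins

# Assumption: a named list, so boxplot() draws one box per league
# and uses the list names as the labels under each box.
dataList <- list("American League" = a, "National League" = n)

boxplot(dataList,
        col  = c("blue", "lavender"),   # one fill color per box
        main = "2015 MLB Regular Season Wins by League",  # hypothetical title
        xlab = "League",                # hypothetical x-axis label
        ylab = "Wins")                  # hypothetical y-axis label
```

Passing `col` a vector of two colors is what gives each league its own fill, and `main`, `xlab`, and `ylab` set the title and axis labels in the same call.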

For more references on basic boxplots, here are a few good resources: QuickR and Cookbook for R.
