
Introduction to R: Basic Graphing

If you haven’t already done so, read in the demographics information from the last section and create the height column.

To read in the data, you use the read.csv() function. If you have the file on your local disk, you can give the path name in the parentheses. Or, as in this case, you can give a URL and read it straight from the website. The second command creates the height column, in inches, from the feet and inches columns:

> student <- read.csv("http://evc-cit.info/psych018/r_intro/demographics.csv")
> student$height <- student$feet * 12 + student$inches
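The height arithmetic can be checked by hand on a tiny data frame. This is only a sketch: the name toy and its three rows are made up for illustration and are not taken from demographics.csv.

```r
# Made-up rows for illustration only; the real file has its own values.
toy <- data.frame(feet   = c(5, 5, 6),
                  inches = c(2, 10, 1))

# Same arithmetic as above: 12 inches per foot, plus the leftover inches.
toy$height <- toy$feet * 12 + toy$inches

toy$height   # 62 70 73
```

On the real data, head(student) and str(student) are a quick way to confirm that the new column is there.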

To get a histogram of the heights, just do this:

> hist(student$height)

You get an image like this. Note that the default title and x-axis label, which R builds from the expression student$height, leave something to be desired.

Histogram of student heights

To fix this, you need to put some extra parameters into the call to hist: main sets the title and xlab sets the x-axis label. Don’t type the plus sign; R prints it as a continuation prompt because the first line isn’t a complete R command.

> hist(student$height, main="Distribution of Heights",
+ xlab="Height in Inches")

Now the labels are much nicer:

Histogram of student heights with given labels

Notice that the x-axis divisions have been chosen by R, and they aren’t exactly wonderful. You can specify the lower and upper limits for the x-axis with the xlim parameter. It requires two numbers, so you need to use c( ) to put those numbers together. In this example, the plot will give the x-axis a range from 50 to 80 inches.

> hist(student$height, main="Distribution of Heights",
+ xlab="Height in Inches", xlim=c(50,80))

Histogram of student heights from 50 to 80 inches
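Note that xlim only changes the range drawn on the x-axis; it does not change where the bars begin and end. If the divisions themselves are the problem, hist also accepts a breaks argument. The following is a minimal sketch, assuming you want a bar every 5 inches; the heights vector is invented for illustration and is not the class data.

```r
# Made-up heights in inches, for illustration only.
heights <- c(60, 61, 63, 64, 64, 66, 67, 68, 70, 72, 74)

# One bin edge every 5 inches from 50 to 80.
# The breaks must cover every value in the data.
bins <- seq(50, 80, by = 5)

# plot = FALSE returns the binning without drawing anything,
# which is handy for checking the divisions first.
h <- hist(heights, breaks = bins, plot = FALSE)
h$breaks   # 50 55 60 65 70 75 80
h$counts   # number of values falling in each bin

# The same breaks in an actual plot:
hist(heights, breaks = bins, main = "Distribution of Heights",
     xlab = "Height in Inches")
```

With the real data, you would pass student$height in place of heights.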

Frequency vs. Density

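This shows us the frequency (the number of students) in each height range. Sometimes, though, you want to see each range’s share of the total instead. To do this, add freq=F to the command. In R, F is an abbreviation for FALSE and T is short for TRUE; you may type either one, although the full words are safer because F and T are ordinary variables that can be reassigned. By typing freq=F, you are telling R that you do not want frequencies, but the density instead: the y-axis is rescaled so that the areas of the bars add up to 1. Try this: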

> hist(student$height, main="Distribution of Heights",
+ xlab="Height in Inches", xlim=c(50,80),
+ freq=F)

Note the y-axis, which now shows density rather than counts. This plot was done on a different system than the others, so the lettering looks a bit different.

Histogram of student heights from 50 to 80 inches showing density on y-axis
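Density values are not percentages: a bar's height times its width gives the proportion of values in that bin, so the areas add up to 1. A short sketch of that check follows, using an invented heights vector rather than the class data.

```r
# Made-up heights in inches, for illustration only.
heights <- c(60, 61, 63, 64, 64, 66, 67, 68, 70, 72, 74)

# Compute the histogram without drawing it.
h <- hist(heights, plot = FALSE)

# Width of each bin, taken from the bin edges R chose.
widths <- diff(h$breaks)

# Area of each bar = density * width = proportion of values in the bin.
areas <- h$density * widths
sum(areas)   # 1, up to floating-point rounding

# The same proportions computed directly from the counts.
h$counts / sum(h$counts)
```

Multiplying these proportions by 100 gives the percentages, if that is what you want to report.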

Bar Plots

Let’s create bar plots for the mean height and weight for males and females. First, separate out the males and females into two new data frames:

> males <- student[student$gender=="M",]
> females <- student[student$gender=="F",]
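The row selection works by logical indexing: the comparison produces one TRUE or FALSE per row, and the data frame keeps the rows marked TRUE. Leaving the part after the comma empty keeps all columns. Here is a minimal sketch on a made-up data frame; toy and is_male are hypothetical names and the values are invented.

```r
# Made-up rows for illustration only.
toy <- data.frame(gender = c("F", "M", "F", "M"),
                  height = c(62, 68, 64, 70))

# One logical value per row.
is_male <- toy$gender == "M"
is_male            # FALSE TRUE FALSE TRUE

# Keep the TRUE rows; the empty slot after the comma means "all columns".
toy_males <- toy[is_male, ]
nrow(toy_males)    # 2
```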

Now create a data frame with the numbers you want. This time, rather than reading in a data frame from an external file, you are creating it one column at a time. The first column is gender, the second is height, and the third is weight.

> hwmean <- data.frame( gender=c("F", "M"),
+ height=c(mean(females$height), mean(males$height)),
+ weight=c(mean(females$weight), mean(males$weight)))
> hwmean
  gender   height weight
1      F 62.76000 126.42
2      M 68.46154 161.00
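As an aside, base R's aggregate function can compute the same kind of table of group means in one step, without splitting the data first. This is an alternative sketch, not the approach this page uses; the toy data frame and its values are made up for illustration.

```r
# Made-up rows for illustration only.
toy <- data.frame(gender = c("F", "F", "M", "M"),
                  height = c(62, 64, 68, 70),
                  weight = c(120, 130, 160, 170))

# Mean of height and weight within each gender.
hw <- aggregate(cbind(height, weight) ~ gender, data = toy, FUN = mean)
hw
#   gender height weight
# 1      F     63    125
# 2      M     69    165
```

Building the data frame column by column, as above, makes each step visible; aggregate is shorter once there are more groups or more columns.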

You can then do a barplot of the heights, with axes labeled properly:

> barplot(hwmean$height, xlab="Gender", ylab="Mean Height in Inches", names.arg=hwmean$gender)

Bar plot of mean height (gender on X axis, height in inches on Y axis)
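The bar plot of mean weight follows the same pattern. In the sketch below, the numbers are copied from the hwmean printout above so that the block runs on its own. The y-axis label assumes the weights are in pounds, which this page does not state.

```r
# Values copied from the hwmean printout above.
hwmean <- data.frame(gender = c("F", "M"),
                     height = c(62.76, 68.46154),
                     weight = c(126.42, 161.00))

# Same call as for height, with the weight column and a new y-axis label.
# "Pounds" is an assumption about the units.
# barplot() also returns the x positions of the bar centers.
mids <- barplot(hwmean$weight, xlab = "Gender",
                ylab = "Mean Weight in Pounds",
                names.arg = as.character(hwmean$gender))
```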

...more to come