
Introduction to R: Vectors; Saving Your Work

Vectors

In the previous part of the tutorial, you created an object that held just one number. In dealing with statistics, though, you will most often have a series of related numbers; for example, the scores people get on a psychological test, or the ages of the participants. We call such a series of values a vector, and here’s how you create one. Let’s start by creating a vector of five days’ worth of high temperatures in Washington DC during December 2008, measured in degrees Fahrenheit. We will call this vector highs. It is always a good idea to use descriptive names for your objects when possible. Type this:

> highs <- c(33,49,66,56,44)
> highs
[1] 33 49 66 56 44

The c stands for concatenate, which is a fancy word for “combine.” The c function combines all the numbers you give it into a single vector, which is then assigned (<-) to the object named highs. You will find that reading from right to left sometimes makes R easier to understand!

Note that the result starts with [1]; this means that R is showing you the data starting with element 1 of the result. When a vector is too long to fit on one line, each line of output starts with the position of its first element in brackets.
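As a small sketch of both points, c can also join existing vectors, and a long vector shows a bracketed position at the start of every output line. The object name twice and the sequence 101:140 are illustrative only; they are not part of the tutorial's data.

```r
highs <- c(33, 49, 66, 56, 44)

# c() also accepts vectors, joining them end to end
twice <- c(highs, highs)
length(twice)      # 10 elements
twice[1]           # element 1, the value 33

# A vector too long for one line wraps; each output line begins with
# the position of its first element (where it wraps depends on the
# width of your console)
print(101:140)
```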

Similarly, let’s enter the low temperature data for those five days:

> lows <- c(22, 33, 48, 39, 32)
> lows
[1] 22 33 48 39 32

We can do basic statistics on a vector:

> mean(highs)
[1] 49.6
> var(highs) # variance
[1] 154.3
> max(highs) # maximum value
[1] 66
> median(lows)
[1] 33
> sd(lows) # standard deviation
[1] 9.576012
> min(lows) # minimum value
[1] 22
> max(highs) - min(highs) # gives range of highs
[1] 33
> summary(highs) # get lots of info!
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   33.0    44.0    49.0    49.6    56.0    66.0
> # Qu. stands for "quartile"
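To see how some of these numbers relate to one another, here is a short sketch that recomputes the variance by hand and checks it against var. The name manual_var is made up for this illustration.

```r
highs <- c(33, 49, 66, 56, 44)
lows  <- c(22, 33, 48, 39, 32)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
n <- length(highs)
manual_var <- sum((highs - mean(highs))^2) / (n - 1)
manual_var            # 154.3, the same value var(highs) reports

# The standard deviation is the square root of the variance
sqrt(var(lows))       # same value as sd(lows)

# range() returns the minimum and maximum together; diff() subtracts them
range(highs)          # 33 66
diff(range(highs))    # 33
```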

Making Errors on Purpose

Before going further, let’s talk about mistakes. When you do your own work with R, rather than just copying what is written here, you are likely to make mistakes, and R’s error messages can sometimes be cryptic. Rather than having you freak out when you see them, you’re going to make some errors on purpose now so that you can get used to the kind of error messages that R produces. You’ll notice that an error message doesn’t always tell you exactly what’s wrong.

You type            The error
x <- (3,4,5)        Missing the c before the opening parenthesis
x < - c(3,4,5)      Put a blank between the < and -
x <- c(3 4 5)       Missing the commas between the items
x <- c 3, 4, 5      No parentheses at all
x <- c(3, 4, 5      No closing parenthesis

The last one is interesting: you don’t get an error message. Instead, R prompts you with a plus sign (+), which means “please type more.” You can type the closing parenthesis, and all will be well with the world.
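If you would rather look at these messages without stopping your session, one possible approach is to ask R to parse the text of a command without running it. The helper name syntax_message is hypothetical; parse, tryCatch, and conditionMessage are standard R functions. Note that the version with the stray blank is valid R: it compares x with -c(3,4,5), so at the console you see an error only if x does not exist yet.

```r
# Hypothetical helper: parse a line of code without running it and
# return the parser's error message, or "parsed OK" if there is none.
syntax_message <- function(code) {
  tryCatch({
    parse(text = code)
    "parsed OK"
  }, error = function(e) conditionMessage(e))
}

syntax_message("x <- (3,4,5)")     # mentions an unexpected ','
syntax_message("x <- c(3 4 5)")    # mentions an unexpected numeric constant
syntax_message("x <- c(3, 4, 5")   # mentions an unexpected end of input
syntax_message("x < - c(3,4,5)")   # "parsed OK": valid R, but not an assignment

# With x already defined, the stray blank turns the assignment
# into a "less than" comparison against negative numbers:
x <- c(3, 4, 5)
x < - c(3, 4, 5)                   # FALSE FALSE FALSE; x is unchanged
```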

Arithmetic with Vectors

If you want to find the difference between the low and high temperatures for each day, you just type this:

> highs - lows
[1] 11 16 18 17 12

Here is how you would add one to all the low temperatures:

> lows # show current values
[1] 22 33 48 39 32
> lows + 1
[1] 23 34 49 40 33
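Two things are worth seeing side by side: arithmetic between two vectors pairs up the elements one by one, and an expression such as lows + 1 only displays a result without changing lows. The names midpoints and lowsPlusOne are invented for this sketch.

```r
highs <- c(33, 49, 66, 56, 44)
lows  <- c(22, 33, 48, 39, 32)

# Element by element: first high with first low, second with second, ...
midpoints <- (highs + lows) / 2
midpoints                 # 27.5 41.0 57.0 47.5 38.0

# lows + 1 only shows a result; lows itself is unchanged
lows + 1
lows                      # still 22 33 48 39 32

# To keep the result, assign it to an object
lowsPlusOne <- lows + 1
lowsPlusOne               # 23 34 49 40 33
```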

And we could create a vector of the high and low temperatures in Celsius by doing this:

> cHighs <- (highs - 32) / 9 * 5
> cLows <- (lows - 32) / 9 * 5
> cHighs
[1]  0.5555556  9.4444444 18.8888889 13.3333333  6.6666667
> cLows
[1] -5.5555556  0.5555556  8.8888889  3.8888889  0.0000000
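A quick way to convince yourself the formula is right is to try it on temperatures whose Celsius values you already know, and then convert the results back. This is only a sanity check; the object name known is illustrative.

```r
# Freezing and boiling points of water in degrees Fahrenheit
known <- c(32, 212)
(known - 32) / 9 * 5            # 0 100

highs <- c(33, 49, 66, 56, 44)
cHighs <- (highs - 32) / 9 * 5

# Converting back to Fahrenheit should reproduce the original highs
cHighs * 9 / 5 + 32
```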

It would be nice to be able to round off those numbers. Is there such a function? Let’s see if there is help available for anything by that name:

> help("round")

Sure enough, there is; a window opens up with the help file. If you read through it, you will see that the function rounds a number to the specified number of decimal places. So let’s round the highs to zero decimals (the default) and the lows to two decimals. Ordinarily you would do the same rounding on both, but I want to show both forms here.

> cHighs <- round(cHighs)
> cLows <- round(cLows, 2)
> cHighs
[1]  1  9 19 13  7
> cLows
[1] -5.56  0.56  8.89  3.89  0.00

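In this example, we “reused” an object. Here’s how to read cHighs <- round(cHighs) from right to left: take the values currently stored in cHighs, round them, and assign the result back to the object named cHighs, replacing the old values.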
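Taken one step at a time, the reassignment could be sketched like this. The comments describe the order in which R does the work; nothing here goes beyond the commands already used.

```r
highs <- c(33, 49, 66, 56, 44)
cHighs <- (highs - 32) / 9 * 5

# Step 1 (right side): R looks up the values currently stored in cHighs
cHighs                      # the unrounded values

# Step 2: round() produces a new vector of rounded values
round(cHighs)               # 1 9 19 13 7; cHighs itself is still unrounded

# Step 3 (left side): <- stores that result under the name cHighs,
# replacing the old, unrounded values
cHighs <- round(cHighs)
cHighs                      # 1 9 19 13 7
```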

Vectors of Strings

It is possible to create a vector whose elements are all strings of letters rather than numbers. You create and display these vectors just as you do numeric vectors, although arithmetic on them does not make sense:

> color <- c("Red", "Green", "Blue", "Yellow")
> color
[1] "Red"    "Green"  "Blue"   "Yellow"
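Here are a few standard functions that work on a vector of strings, as a brief sketch. The last line is left as a comment because running it stops with an error.

```r
color <- c("Red", "Green", "Blue", "Yellow")

length(color)     # 4 elements
nchar(color)      # number of characters in each string: 3 5 4 6
toupper(color)    # "RED" "GREEN" "BLUE" "YELLOW"
sort(color)       # alphabetical order: "Blue" "Green" "Red" "Yellow"

# Arithmetic is not defined for strings. Uncommenting the next line
# gives the error "non-numeric argument to binary operator".
# color + 1
```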

You Do It!

Now you figure out the following in R. Use parentheses where you see them in the text. See the solution.

Saving Your Work

Before you leave this session, you will want to save your work. You can’t save it to hard disk in our computer labs, because anything saved to the hard disk gets wiped out when the machine is rebooted. Saving data to a flash drive is your best alternative. In this example, I have a flash drive in the machine, and it shows up as drive F:. I have created a directory named psych018 on that drive. I tell R to use that directory by typing this:

> setwd("F:\\psych018")

setwd means “set working directory,” that is, the directory where your work is going to go. You must put the name of the folder in quotation marks, and you have to put two backslashes \\ instead of one (that’s because backslash is used for something special in R). R doesn’t give you any output if you did everything correctly; with R, “no news is good news.”
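If you want to confirm that the change took effect, getwd reports the current working directory. In this sketch, tempdir() stands in for the flash-drive folder so that the lines run on any machine; the names old_dir and work_dir are illustrative.

```r
old_dir <- getwd()        # remember where we started

# tempdir() is a stand-in for "F:\\psych018"; on Windows you can also
# write such a path with forward slashes, as in "F:/psych018"
work_dir <- tempdir()
setwd(work_dir)
getwd()                   # now reports the new working directory

setwd(old_dir)            # go back to the original directory
```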

Now type q() to quit R. You will once again get the dialog box asking if you want to save your workspace image. Click Yes. If you look in the folder you used for setwd, you will see two files: .RData and .Rhistory; .RData contains all the objects you have defined, and .Rhistory contains the commands you have typed in during your session.
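You can also save objects yourself without quitting. The sketch below uses the standard save, rm, and load functions; the file name temperatures.RData and the use of tempdir() as a stand-in for your flash-drive folder are assumptions made for illustration.

```r
highs <- c(33, 49, 66, 56, 44)
lows  <- c(22, 33, 48, 39, 32)

# Hypothetical file location; tempdir() stands in for the flash-drive folder
data_file <- file.path(tempdir(), "temperatures.RData")

save(highs, lows, file = data_file)  # write just these two objects
rm(highs, lows)                      # remove them from the workspace
load(data_file)                      # bring them back from the file
highs                                # 33 49 66 56 44
```

Calling save.image() with no arguments writes every object in the workspace to .RData in the working directory, which is the same file you get by answering Yes when you quit.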