An Introduction to R

POL 51
October 13, 2008
Malcolm Easton
Office hours: Thursday 3:30-5:30 in Rm 245
Email: [email protected]
1
Section Outline
• What is R?
• Presentation of some R basics
• Set of exercises done in class if time permits
2
What can R do?
• R can act as an overgrown calculator
• R can run some very high-end statistical
models
3
Loading R
• Go to the class website:
http://psfaculty.ucdavis.edu/bsjjones/pol51fall2008.html
• Follow instructions to load R
4
Why R is so user-friendly
• R is an object-oriented environment!
• Everything is an object. You write code to
create objects or variables and then define
how these objects relate to each other.
• xbar<-mean(x)
• xbar=mean(x)
• Both lines do the same thing: you create a
variable name (xbar) and then you describe its
“value”, in this case the mean of whatever x is.
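A minimal sketch of the assignment above. The slide never defines x, so the values used here are made up for illustration.

```r
# Hypothetical data: x is not defined on the slide, so these values are invented.
x <- c(2, 4, 6, 8)

# Both forms assign the mean of x to an object named xbar.
xbar <- mean(x)
xbar = mean(x)

# Typing an object's name prints its value.
xbar

# class() reports what kind of object something is.
class(x)
class(mean)
```

The arrow form (<-) is the more common convention in R code, but either works for a simple assignment like this one.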
5
Inputting data manually
• It is easy!
• weight<-c(60, 72, 57, 90, 95, 72)
• c(…) is used to define a vector of numbers.
• You can generate a matrix (two-dimensional
array of numbers—rows and columns) by
binding two vectors together.
• height<-c(6, 5, 7, 5, 7, 5)
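As a quick sketch, here are a few things you can do with the two vectors once they exist. The inspection calls are illustrative additions, not part of the slide.

```r
# The two vectors from the slide.
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(6, 5, 7, 5, 7, 5)

# length() counts the elements in a vector.
length(weight)

# Square brackets pick out elements; R counts from 1.
weight[1]

# Arithmetic is applied to every element at once.
weight / 2
```

Both vectors have the same length, which is what lets them be bound together into a matrix.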
6
cbind and rbind
• You can now “glue” both of these vectors
together either as two columns with the
variable name on top or two rows with the
variable names on the left hand side.
• xmat1<-cbind(weight, height)
• xmat2<-rbind(weight, height)
7
This is what it looks like
• xmat1=cbind(weight, height)
• xmat1
     weight height
[1,]     60      6
[2,]     72      5
[3,]     57      7
[4,]     90      5
[5,]     95      7
[6,]     72      5
• xmat2=rbind(weight, height)
• xmat2
       [,1] [,2] [,3] [,4] [,5] [,6]
weight   60   72   57   90   95   72
height    6    5    7    5    7    5
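One way to check the shape of the two matrices and pull values back out of them. This is only a sketch; dim() and the bracket indexing are additions that the slides do not cover.

```r
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(6, 5, 7, 5, 7, 5)

xmat1 <- cbind(weight, height)   # vectors become columns
xmat2 <- rbind(weight, height)   # vectors become rows

# dim() returns the number of rows, then the number of columns.
dim(xmat1)
dim(xmat2)

# Index a matrix as [row, column]; leaving one side blank keeps all of it.
xmat1[, "weight"]   # the whole weight column
xmat2["height", ]   # the whole height row
xmat1[3, 2]         # row 3, column 2
```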
8
How to ask for help
• What if you want to use data that are provided on
the website? Specifically, the Congressional
Control Pricing data in Excel format.
• You have to “read” the data into R.
• If you are a beginner, just ask R how to read
in data.
• Ex: ?read.table
• If you cannot be specific (why did he write
.table?) you can use the built-in search by typing
help.search("table")
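A short sketch of ways to look things up from the R prompt. apropos() and args() are additions not mentioned on the slide. The help calls are wrapped in interactive() only so the block also runs cleanly as a script.

```r
# apropos() lists objects whose names match a pattern; here, everything
# whose name starts with "read."
apropos("^read\\.")

# args() prints a function's arguments without opening the help page.
args(read.table)

# Help pages open in a viewer, so these calls are only run interactively.
if (interactive()) {
  help("read.table")      # same as ?read.table
  help.search("table")    # same as ??table
}
```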
9
Ok, now this is how you read a .csv file
into R
• General code: data.set<-read.table("C:/file name.ext", header=TRUE)
• For a comma-separated file, use read.csv as
below (or add sep="," to read.table).
• Note another oddity about R: in file paths you
must use forward slashes (or doubled
backslashes), not single backslashes.
• congress<-read.csv("C:/Documents and Settings/Malcolm Easton/Desktop/Pol 51/congressprice.csv", header=TRUE)
• Also notice that header=TRUE is just telling R
that the first line is a header containing the
names of variables in the file.
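The slides do not show the contents of congressprice.csv, so this sketch writes a small stand-in file first and then reads it back. The year column and all of the values are invented; only the rhdsprice name comes from the slides.

```r
# Hypothetical stand-in for congressprice.csv. The rows below are invented
# for illustration; only the column name rhdsprice appears in the slides.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("year,rhdsprice",
             "2001,2",
             "2002,4",
             "2003,4",
             "2004,6",
             "2005,9"),
           csv_path)

# header = TRUE treats the first line as variable names.
congress <- read.csv(csv_path, header = TRUE)

# Check what was read in.
names(congress)
str(congress)

# read.table() can read the same file if the separator is spelled out.
congress2 <- read.table(csv_path, header = TRUE, sep = ",")
```

Using tempfile() keeps the example independent of any particular folder on your machine. With a real file you would pass its path directly.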
10
Now you can play with summary
statistics!
• summary(congress$rhdsprice)
• Or if writing all of that is a pain, you can
choose to assign that column to an object.
• price1<-congress$rhdsprice
• summary(price1)
• Or easier still, just “attach” your data set,
which tells R to look for objects among the
variables in a given data frame.
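A sketch of the $ approach, using a small made-up data frame in place of the real congress data (the values are hypothetical).

```r
# Hypothetical stand-in for the congress data frame; values are invented.
congress <- data.frame(rhdsprice = c(2, 4, 4, 6, 9))

# The $ operator pulls one column out of a data frame.
summary(congress$rhdsprice)

# Storing the column in its own object saves typing later.
price1 <- congress$rhdsprice
summary(price1)
```

Note that price1 is a copy: later changes to congress will not show up in price1.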
11
Attaching a data frame
• If you type attach(congress), you can now
summarize your data by typing
summary(rhdsprice)
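A sketch of attaching and detaching, again with a made-up data frame. The detach() and with() calls are additions not shown on the slide; with() is one alternative that avoids leaving a data frame attached.

```r
# Hypothetical stand-in for the congress data frame; values are invented.
congress <- data.frame(rhdsprice = c(2, 4, 4, 6, 9))

# After attach(), column names can be used directly.
attach(congress)
summary(rhdsprice)

# detach() undoes the attach when you are finished.
detach(congress)

# with() evaluates one expression inside the data frame, no attach needed.
with(congress, summary(rhdsprice))
```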
12
More fun with summary stats
• The summary() function gives you some basic
summary stats, but you can get more specific
if you like.
• Mean: mean(rhdsprice)
• Median: median(rhdsprice)
• Maximum: max(rhdsprice)
• Minimum: min(rhdsprice)
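A sketch of these functions on a small made-up vector (the real rhdsprice values are not shown in the slides).

```r
# Hypothetical values standing in for the rhdsprice column.
rhdsprice <- c(2, 4, 4, 6, 9)

mean(rhdsprice)     # arithmetic average
median(rhdsprice)   # middle value once sorted
max(rhdsprice)      # largest value
min(rhdsprice)      # smallest value

# summary() reports these together, along with the quartiles.
summary(rhdsprice)
```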
13
Is that all R can do?
• No, that is just the tip of the iceberg.
• You can code functions into R or use a large
number of pre-coded functions.
• You can use R to calculate the variance and
standard deviation of a variable, and it offers a
slew of graphical options as well.
• Built-in code: var(rhdsprice)
• Manually coding: n=length(rhdsprice)
• Manual.var=(1/(n-1))*sum((rhdsprice-mean(rhdsprice))^2)
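A sketch that checks the manual formula against the built-in functions, using made-up values for rhdsprice. The standard deviation lines are an addition: it is the square root of the variance.

```r
# Hypothetical values standing in for the rhdsprice column.
rhdsprice <- c(2, 4, 4, 6, 9)

# Sample variance by hand: sum of squared deviations over n - 1.
n <- length(rhdsprice)
manual.var <- (1 / (n - 1)) * sum((rhdsprice - mean(rhdsprice))^2)

# Standard deviation is the square root of the variance.
manual.sd <- sqrt(manual.var)

# Compare with the built-in functions; all.equal() allows for rounding error.
all.equal(manual.var, var(rhdsprice))
all.equal(manual.sd, sd(rhdsprice))
```

Dividing by n - 1 rather than n gives the sample variance, which is what var() computes.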
14