Transcript pres03a2

Chapter 2
Analysis using R
A Few Tips for R
• Commands included here CANNOT ALWAYS be
copied and pasted directly without alteration.
– One major reason is the type of quotation marks…
– R only accepts straight quotation marks (" " or ' '), not the
curly “Word”-type quotation marks (“ ”)
• Also, R is very case sensitive
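A small sketch of both tips, using made-up names (room_width, unit_a, unit_b) that are not part of the original slides. Straight quotes parse, curly quotes do not, and names that differ only in case are different objects:

```r
# Straight double or single quotes both delimit strings in R.
unit_a <- "feet"
unit_b <- 'feet'
same_string <- identical(unit_a, unit_b)

# Curly quotes (written here as Unicode escapes) fail to parse.
curly_fails <- tryCatch({
  parse(text = "x <- \u201cfeet\u201d")
  FALSE
}, error = function(e) TRUE)

# R is case sensitive: room_width and Room_Width are different names.
room_width <- 10
lower_found <- exists("room_width")
upper_found <- exists("Room_Width")

print(c(same_string, curly_fails, lower_found, upper_found))
```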
Example 2.3.1
• Introduction to the data (pg 21)
– “Shortly after…each of a group of 44 students
[in meters]…another group of 69
students…width [in feet]”
– Will there be a significant difference?
Initiate the data
• To initiate the data that we will use,
load it from the HSAUR package:
• R> data("roomwidth", package = "HSAUR")
Conversion
• Since some of the units are in meters, and some
are in feet, we need to convert
• R> convert <- ifelse(roomwidth$unit == "feet", 1, 3.28)
– convert is the variable that we are creating now.
– In it, we will store one of two possible values for each row through the use
of an ifelse statement.
• How to read: "If the column 'unit' found in data 'roomwidth'
(which was previously loaded) reads as "feet", then convert
= 1. However [else], if 'unit' does NOT equal "feet", then
convert = 3.28."
• 3.28 = conversion factor, meters → feet.
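The ifelse step above can be tried on a tiny data frame. The values below are made up for illustration and are not the roomwidth data; the names toy and width_feet are assumptions for this sketch:

```r
# Made-up illustrative values, not the roomwidth data.
toy <- data.frame(
  unit  = c("feet", "metres", "feet", "metres"),
  width = c(30, 10, 45, 12)
)

# One entry per row: 1 where the unit is already feet, 3.28 otherwise.
convert <- ifelse(toy$unit == "feet", 1, 3.28)

# Multiplying row by row puts every width on the same scale (feet).
toy$width_feet <- toy$width * convert
print(toy)
```

Because ifelse works element by element, convert has the same length as the unit column, so the multiplication lines up row by row.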
Summary Statistics
• Let’s say that we want to bring up general
statistics about the data…
– Min, quartiles, max, mean, median, standard
deviation…
• Use summary or sd command.
• tapply(roomwidth$width * convert, roomwidth$unit, summary)
– Reads: "Take the width column, found in data
roomwidth, and multiply it by convert (a vector holding
1 or 3.28 for each row, depending on the value of unit).
Then split the result into groups by the column 'unit' found in data
roomwidth, and display the summary statistics for each group."
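To see how tapply splits values into groups before applying a function, here is a minimal sketch on made-up numbers (not the roomwidth data; toy, by_unit and group_means are names chosen for this example):

```r
# Made-up illustrative values, not the roomwidth data.
toy <- data.frame(
  unit  = c("feet", "feet", "feet", "metres", "metres", "metres"),
  width = c(30, 40, 50, 10, 12, 14)
)
convert <- ifelse(toy$unit == "feet", 1, 3.28)

# tapply(values, grouping, function): one result per group.
by_unit <- tapply(toy$width * convert, toy$unit, summary)
print(by_unit)

# Any function of a numeric vector can take the place of summary.
group_means <- tapply(toy$width * convert, toy$unit, mean)
print(group_means)
```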
Output (R)
SD Statistic
• Since the summary function does not
include the standard deviation in the
statistics it displays, a separate command
must be used.
• The command reads the same as before,
except the “sd” function is applied.
• tapply(roomwidth$width * convert, roomwidth$unit, sd)
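A short sketch, on made-up numbers rather than the roomwidth data, showing that summary reports no standard deviation and that sd can be passed to tapply in its place (the names x, grp, stat_names and group_sd are assumptions for this example):

```r
# Made-up illustrative values, not the roomwidth data.
x   <- c(30, 40, 50, 10, 12, 14)
grp <- c("feet", "feet", "feet", "metres", "metres", "metres")

# The statistics summary() reports for a numeric vector.
stat_names <- names(summary(x))
print(stat_names)

# Same tapply pattern as before, with sd as the function.
group_sd <- tapply(x, grp, sd)
print(group_sd)
```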
Graphical Analysis (Input)
> data("roomwidth", package = "HSAUR")
> convert <- ifelse(roomwidth$unit == "feet", 1, 3.28)
> tapply(roomwidth$width * convert, roomwidth$unit, summary)
$feet
Min. 1st Qu. Median Mean 3rd Qu. Max.
24.0 36.0 42.0 43.7 48.0 94.0
$metres
Min. 1st Qu. Median Mean 3rd Qu. Max.
26.24 36.08 49.20 52.55 55.76 131.20
> tapply(roomwidth$width * convert, roomwidth$unit, sd)
feet metres
12.49742 23.43444
> layout(matrix(c(1,2,1,3), nrow=2, ncol=2, byrow=FALSE))
> boxplot(I(width * convert) ~ unit, data = roomwidth, ylab = "Estimated width (feet)", varwidth = TRUE, names =
c("Estimates in feet", "Estimates in metres (converted to feet)"))
> feet <- roomwidth$unit == "feet"
> qqnorm(roomwidth$width[feet], ylab = "Estimated width (feet)")
> qqline(roomwidth$width[feet])
> qqnorm(roomwidth$width[!feet], ylab = "Estimated width (metres)")
> qqline(roomwidth$width[!feet])
Output
Graphical Analysis - Layout
• Input sets up the layout of the box plot and
graphs
• > layout(matrix(c(1,2,1,3), nrow=2, ncol=2, byrow=FALSE))
• Input creates top boxplot on previous
screen
– boxplot(I(width * convert) ~ unit, data = roomwidth, ylab = "Estimated
width (feet)", varwidth = TRUE, names = c("Estimates in feet",
"Estimates in metres (converted to feet)"))
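• Labels and options
– ylab: sets the y-axis label
– varwidth: when TRUE, box widths are drawn proportional to the
square root of the number of observations in each group
• Histograms
– > hist()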
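To see why the layout call places the boxplot across the top with the two other plots beneath it, it can help to print the matrix that is passed to layout. This is only a sketch; panel_map is a name chosen for this example:

```r
# byrow = FALSE fills the matrix column by column:
# column 1 gets (1, 2), column 2 gets (1, 3).
panel_map <- matrix(c(1, 2, 1, 3), nrow = 2, ncol = 2, byrow = FALSE)
print(panel_map)
```

Each cell holds the number of the plot drawn there. Plot 1 occupies both cells of the top row, so the first plot drawn (the boxplot) spans the full width, while plots 2 and 3 share the bottom row.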
Two Sample t-test using R
• We are performing a two sample t-test
assuming equal variances
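A minimal sketch of a two-sample t-test with equal variances, run on made-up numbers rather than the roomwidth data (feet_est, metres_est and tt are names chosen for this example). With the HSAUR data loaded, the analogous call would presumably use the formula form shown in the comment; that is an assumption, since the slide's own command is not in this transcript:

```r
# Made-up illustrative values, not the roomwidth data.
feet_est   <- c(30, 40, 50, 45, 35)
metres_est <- c(36, 52, 60, 49, 58)  # imagined as already converted to feet

# var.equal = TRUE requests the pooled-variance (classical) t-test;
# the default, var.equal = FALSE, gives the Welch test instead.
tt <- t.test(feet_est, metres_est, var.equal = TRUE)
print(tt)

# Assumed equivalent on the real data:
# t.test(I(width * convert) ~ unit, data = roomwidth, var.equal = TRUE)
```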
Wilcoxon Mann-Whitney Rank Sum Test
Using R
Notice the additional argument (conf.int = TRUE) at the end of the function
call to display the confidence interval
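A minimal sketch of the rank sum test on made-up numbers rather than the roomwidth data (feet_est, metres_est, wt_plain and wt_ci are names chosen for this example). It compares the result with and without the confidence interval argument:

```r
# Made-up illustrative values with no ties, not the roomwidth data.
feet_est   <- c(30, 40, 50, 45, 35)
metres_est <- c(36, 52, 60, 49, 58)  # imagined as already converted to feet

# Without conf.int, no interval is computed.
wt_plain <- wilcox.test(feet_est, metres_est)

# conf.int = TRUE adds an interval for the location shift.
wt_ci <- wilcox.test(feet_est, metres_est, conf.int = TRUE)
print(wt_ci)
```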
Example 2.3.2
One Sample t-test
• Example of a paired t-test, run as a one-sample t-test on the
differences between two mooring methods in the wave energy experiment
(p 22)
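A paired t-test and a one-sample t-test on the pairwise differences give the same result. The sketch below checks this on made-up numbers, not the wave energy data; method1, method2, diffs, one_sample and paired are names chosen for this example:

```r
# Made-up illustrative values, not the wave energy data.
method1 <- c(2.2, 1.0, 3.1, 0.8, 1.9)
method2 <- c(1.9, 1.1, 2.6, 0.9, 1.5)

# One-sample t-test on the differences (null hypothesis: mean difference = 0).
diffs <- method1 - method2
one_sample <- t.test(diffs)

# The same test requested directly as a paired t-test.
paired <- t.test(method1, method2, paired = TRUE)

print(one_sample)
print(paired)
```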
Example 2.3.3
Correlation Test Using R
• Choose two variables that you want to test for
correlation
• e.g. mortality and hardness in the water data
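A minimal sketch of a correlation test using the formula interface, on made-up numbers rather than the water data. The data frame toy and the result ct are names chosen for this example; the column names simply mirror the ones mentioned above:

```r
# Made-up illustrative values, not the water data.
toy <- data.frame(
  hardness  = c(10, 30, 50, 70, 90, 110),
  mortality = c(1800, 1650, 1600, 1500, 1350, 1300)
)

# Both variables go on the right-hand side of the formula.
# The default method is Pearson's product-moment correlation.
ct <- cor.test(~ mortality + hardness, data = toy)
print(ct)
```

The estimate stored in the result equals cor(toy$mortality, toy$hardness); the test adds a p-value and a confidence interval for it.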