Transcript Class-06

R- LOGICAL EXPRESSIONS
Typical logical expressions used in
programs with if statements:
if (EXPRESSION) {
COMMANDS EXECUTED WHEN TRUE
} else {
COMMANDS EXECUTED WHEN FALSE
}
x<-3
y<-4.0
x==4.0 returns FALSE (x is 3)
x==y returns FALSE
y==y returns TRUE
y==4 returns TRUE here, but an exact == test may fail for computed decimal numbers
Better:
abs(y-4)<0.00001
When the numbers are sufficiently close, this returns TRUE
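A minimal sketch of why the tolerance test is safer than ==, using the classic case 0.1 + 0.2 versus 0.3. The object names a, b, and tol are chosen here for illustration only; all.equal() is base R's built-in tolerance comparison.

```r
# Exact comparison can fail for computed decimal numbers
a <- 0.1 + 0.2
b <- 0.3
exact <- (a == b)                    # FALSE: binary rounding error
tol <- 0.00001                       # same tolerance as in the example above
close_enough <- abs(a - b) < tol     # TRUE
# all.equal() tests with a built-in tolerance; wrap it in isTRUE() for use in if()
builtin <- isTRUE(all.equal(a, b))   # TRUE
print(c(exact = exact, close_enough = close_enough, builtin = builtin))
```

The right tolerance depends on the size of the numbers being compared, so 0.00001 is a choice, not a rule.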
Note:
Practice the logical operations:
Be careful with vectors! Inform yourself about more
operators and how they behave with vector objects! The
proper use requires experience!
x<0
x>0
x<(-1) # negative numbers must be put in (): x<-1 would assign 1 to x
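A short sketch of how these comparisons behave with vectors, and of the assignment pitfall with negative numbers. The vectors x and z and the object names below are hypothetical examples, not taken from the course script.

```r
x <- c(-2, 0, 3)         # hypothetical example vector
neg <- x < 0             # elementwise result: TRUE FALSE FALSE
below <- x < (-1)        # comparison with a negative number: TRUE FALSE FALSE

# Without parentheses and without spaces, z<-1 is parsed as the assignment z <- 1
z <- c(-2, 0, 3)
z<-1                     # z is now the single number 1, no comparison happened

# if() needs exactly one TRUE or FALSE, so reduce a logical vector first
if (any(x < 0)) {
  msg <- "at least one element is negative"
} else {
  msg <- "no element is negative"
}
all_pos <- all(x > 0)    # FALSE: not every element is positive
print(msg)
```

any() and all() are the usual way to turn an elementwise result into the single value that if() expects.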
Logical combinations:
Logical AND: (x<y & y>3)
Logical OR: (x<y | y>3)
Logical NOT: !(x<y)
Strings can also be compared for equality
month<-"Dec"
!(month=="Jan") returns TRUE
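The combinations and the string comparison can be tried out as follows. This is only a sketch: the values of x and y follow the earlier example, while the object names and the winter-month list are assumptions made for illustration.

```r
x <- 3
y <- 4.0
both    <- (x < y & y > 3)   # TRUE: both conditions hold
either  <- (x < y | y > 5)   # TRUE: the first condition holds
negated <- !(x < y)          # FALSE

# && and || are meant for single values, as in if();
# they stop evaluating as soon as the result is decided
if (x < y && y > 3) {
  result <- "both true"
} else {
  result <- "not both true"
}

month <- "Dec"
not_jan <- !(month == "Jan")                     # TRUE, same as month != "Jan"
# %in% tests membership in a set of strings
is_winter <- month %in% c("Dec", "Jan", "Feb")   # TRUE
print(c(both = both, either = either, negated = negated,
        not_jan = not_jan, is_winter = is_winter))
```

For vectors, & and | work elementwise, whereas && and || belong in if() conditions that test a single value.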
R-PROGRAM REVISITED

albany_climatology_snow.R
Changes: filename and the variable name (better called ‘object name’)
Changes: all data and variables are derived from object ‘snow’: vector ‘time’
Changes: plot() function call: update the y-axis label
Changes: ‘res’ must be assigned the monthly mean snow data! It is used below for plotting in the function lines()
Changes: use the year information from object snow
Changes: mhelp is a new object that stores the months data, but only for the selected years within our climatological period
Changes: buffer stores the monthly mean snow data of the climatological period
Changes: snowclim, a new object to calculate the seasonal cycle (climatological mean).
Changes: plot() function call with adjusted y-axis label string
Changes: lines() plots values of snowclim
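The script itself is not reproduced in this transcript, so the following is only a sketch of how the seasonal cycle could be computed. The object names snow, mhelp, buffer, and snowclim are borrowed from the annotations above; the data frame layout, its column names, the helper iselect, and the synthetic snowfall values are all assumptions.

```r
# Hypothetical stand-in for the object 'snow' (the real file layout is not shown here)
set.seed(1)
snow <- data.frame(
  year  = rep(1971:2015, each = 12),
  month = rep(1:12, times = 45)
)
# Synthetic monthly snow values: zero from May to September, random otherwise
snow$snow <- ifelse(snow$month %in% 5:9, 0, round(runif(nrow(snow), 0, 50), 1))

# Select the climatological period 1981-2010
iselect <- snow$year >= 1981 & snow$year <= 2010
mhelp   <- snow$month[iselect]   # months, selected years only
buffer  <- snow$snow[iselect]    # monthly snow data, selected years only

# Seasonal cycle: mean over all Januaries, all Februaries, and so on
snowclim <- rep(NA, 12)
for (m in 1:12) {
  snowclim[m] <- mean(buffer[mhelp == m], na.rm = TRUE)
}
print(round(snowclim, 1))
```

With this layout, plot(1:12, snowclim, type = "n") followed by lines(1:12, snowclim) would draw the 12-month cycle. The same means could also be obtained with tapply(buffer, mhelp, mean).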
Outlier, erroneous data, or record snow?
30-yr climatology 1981-2010
No snow from May to September
(but in previous years May had snow!)
SAMPLE SIZE AND ACCURACY OF THE STATISTICAL ESTIMATES
Statistical estimates are attempts to quantify the underlying true properties of the population from which the sample was drawn
• Random samples are an incomplete description of the full population
• The mean of a sample is an estimate of the true mean
• The variance is also only an estimate of the true variance of the population
• (Any other estimates, of course, too)

SAMPLE SIZE: THE LAW OF LARGE NUMBERS

We have seen that the sample mean is the
arithmetic mean of the observations
Albany Airport monthly mean temperature anomalies from the 30-yr climatology:
Sample size: 360
Mean: 0 °C (degree Celsius)
Standard deviation: 1.744 °C
‘Online algorithm’: each new incoming data value is taken as an anomaly with respect to (w.r.t.) the previously estimated mean, and the mean is updated with that anomaly.
With larger sample size, each individual sample becomes less influential for updating the mean, and the estimate converges to the true mean (for independent, identically distributed (IID) samples).
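A minimal sketch of such an online update, assuming the standard running-mean rule mean_n = mean_(n-1) + (x_n - mean_(n-1)) / n. It does not reproduce the course script: the data are synthetic (IID normal values with the sample size and standard deviation quoted above), and the object names are hypothetical.

```r
set.seed(42)
# Synthetic stand-in for the 360 anomalies (assumption: IID normal, mean 0, sd 1.744)
x <- rnorm(360, mean = 0, sd = 1.744)

n <- length(x)
running_mean <- numeric(n)
running_mean[1] <- x[1]
for (i in 2:n) {
  anomaly <- x[i] - running_mean[i - 1]                  # new value w.r.t. previous mean
  running_mean[i] <- running_mean[i - 1] + anomaly / i   # weight 1/i shrinks as i grows
}
# The final online estimate equals the ordinary arithmetic mean
print(c(online = running_mean[n], batch = mean(x)))
```

The factor 1/i is what makes each additional sample less influential: at i = 360 a new value shifts the mean by only 1/360 of its anomaly.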
Created with the R program scripts/online_average.R
(needs data/USW00014735_tavg_mon_mean_ano.csv)
Histograms give an overview of the sample distribution: mean and standard deviation describe only part of the character of a sample distribution
(we will learn about the skewness and tails of distributions in this course)
Estimate an unknown mean of the random process (a “population mean”):
• We only have a sample with a limited number of observations
• The sample is drawn randomly from the population
• The larger the sample size, the better the estimate
• That is, if we repeated an experiment several times, each time with a new sample of size n, then the variance among the estimated means decreases as the sample size n increases.
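The repeated-experiment idea can be simulated directly. The sketch below assumes normally distributed IID samples with the standard deviation 1.744 quoted earlier; the sample sizes, the number of repetitions, and the object names are hypothetical choices. For IID samples the variance of the sample mean is sigma^2 / n, which is printed for comparison.

```r
set.seed(123)
sigma <- 1.744                  # standard deviation quoted earlier; normality is an assumption
sizes <- c(10, 40, 160, 640)    # hypothetical sample sizes n
n_rep <- 2000                   # number of repeated experiments per sample size

# For each n: draw n_rep samples, take each sample's mean, then the variance of those means
var_of_means <- sapply(sizes, function(n) {
  means <- replicate(n_rep, mean(rnorm(n, mean = 0, sd = sigma)))
  var(means)
})
print(data.frame(n = sizes, simulated = var_of_means, theory = sigma^2 / sizes))
```

Quadrupling n cuts the variance of the estimated mean to roughly a quarter, so the spread (standard deviation) of the estimates only halves.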