Class23_Intro_to_R-1x

Introduction to R
Hydroinformatics – Fall 2016
Programming is Good for You
“It teaches you how to think.” – Steve Jobs
Integrating Different Languages
• Python, because the language of the Earth Engine API is Python
• R for statistical analysis and visuals for publication
We promised…
• Python
• Data Management Life Cycle
• Data Analysis and Implementation
• MySQL and Databases
• R
Why use R?
• R is the most-used data science language after SQL (O’Reilly survey, January 2014)
• R is growing faster than any other data science language (KDNuggets survey, August 2013)
• Documentation and peer support
• Open Source
You may need to switch back and forth between (or integrate!)
coding languages to accomplish a specific task. In some cases, R
may be the right/best/most useful/most convenient tool for the
job!
Quiz
Open the “MyFirstRProject” that you created in the preclass exercise.
Open the script that you saved and run it.
Report the greatest value of “z” on Canvas.
Learning Objectives
• Understand the R environment
• Describe different data types in R and how to work with them
• Use R to load and analyze data
R Basics
• Mathematical Expressions
• Variables
• Comments
• Vectors
• R Data Frames
Simple Mathematical Expressions in R
> 1 + 1          # Simple arithmetic
[1] 2
> 2 + 3 * 4      # Operator precedence
[1] 14
> exp(1)         # Basic mathematical functions are available
[1] 2.718282
> sqrt(10)
[1] 3.162278
Variables in R
> a = 1          # Variables are assigned values
> b = 30         # using the "<-" or "=" operator
> c <- 3.5
> a * b * c
[1] 105
> A * b * c      # Important: variable names are case sensitive
Error: object 'A' not found
Vectors in R
• Created with:
  • c() to concatenate elements or sub-vectors
  • rep() to repeat elements or patterns
  • seq() to generate sequences (the shorthand m:n gives the consecutive integers from m to n)
Try:
• Creating a vector of every 20th element between 0 and 500. Use
help(seq) to get the syntax
• Finding the square root (sqrt()) of each element of the vector using a
single command.
• Most mathematical functions and operations can be applied to
vectors – which can eliminate the need for looping!
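One possible way to work through the exercise above is sketched here. The variable names (steps, roots, combined, pattern) are illustrative choices, not part of the original slides.

```r
# Every 20th value from 0 to 500; the "by" argument of seq() sets the step
steps <- seq(0, 500, by = 20)

# sqrt() is vectorized: one call is applied to every element, no loop needed
roots <- sqrt(steps)

# c() concatenates elements or sub-vectors; rep() repeats a pattern
combined <- c(1, 2, c(3, 4))
pattern  <- rep(c("a", "b"), times = 3)

print(length(steps))
print(head(roots, 3))
print(pattern)
```

Because arithmetic is also vectorized, an expression such as steps / 2 works the same way as sqrt(steps).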
Element Indexing
Create a vector called artists, which contains the names of Dr. Horsburgh’s musical idols.
> artists <- c("Justin Bieber", "Beyonce", "Taylor Swift")   # Create a vector called artists
Dr. Horsburgh is fickle, so his all-time favorite changes from day-to-day. Today, it happens to be the second person in the list.
> artists[2] # Select the second element
[1] "Beyonce"
Note that in Python, the first element is indexed as 0, but in R, the first element is indexed as 1.
But Dr. Ames feels this doesn’t show enough respect, so he renames the second artist with a more appropriate title:
> artists[2] <- "Queen Bey"   # Set the value of an element in a vector
> artists
[1] "Justin Bieber" "Queen Bey" "Taylor Swift"
Note that your elements can be numerical or strings, but a single vector can’t hold both: if you mix them, R converts every element to a string.
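A short sketch of the indexing and type rules above. The names first, picked, rest, and mixed are hypothetical, introduced only for illustration.

```r
artists <- c("Justin Bieber", "Beyonce", "Taylor Swift")

first  <- artists[1]         # R indexing starts at 1, not 0
picked <- artists[c(1, 3)]   # index with a vector to select several elements
rest   <- artists[-2]        # a negative index drops that element

# Mixing numbers and strings does not raise an error;
# R coerces every element to character
mixed <- c(1, "two", 3)
print(class(mixed))
print(mixed)
```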
Data Frames
• A group of related vectors
• The equivalent of a table in R
• Create from scratch using data.frame()
> newDataFrame <- data.frame(Time=c(0,30,60),
                             Discharge=c(65,72,103))
> newDataFrame
  Time Discharge
1    0        65
2   30        72
3   60       103
Data Frames Indexing
• Multiple ways to index columns of data
• The following all select the same column (the first returns a one-column data frame; the other two return a vector):
newDataFrame["columnName"]
newDataFrame[,n] – where n is the column index
newDataFrame$columnName
Try:
Averaging the flow measurements from newDataFrame using
mean()
mean(newDataFrame$Discharge)
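The three indexing forms can be compared side by side using the same data frame as above. This is a minimal sketch; the variable names as_frame, by_index, by_name, and mean_flow are illustrative.

```r
newDataFrame <- data.frame(Time = c(0, 30, 60),
                           Discharge = c(65, 72, 103))

as_frame <- newDataFrame["Discharge"]   # one-column data frame
by_index <- newDataFrame[, 2]           # numeric vector (second column)
by_name  <- newDataFrame$Discharge      # numeric vector

# mean() works on the vector forms
mean_flow <- mean(newDataFrame$Discharge)
print(mean_flow)   # 80
```

Calling mean() on the one-column data frame form would not give the average, which is one reason the $ form is convenient here.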
Data Frames
• Read into R from a text file:
newData <- read.csv("table.csv", header=TRUE)
• If first line of the file has a name for each column, then
header=TRUE
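To try read.csv() without an existing file, the sketch below first writes a small table to a temporary path. The file contents and the name csv_path are assumptions made for illustration; in practice you would pass the path to your own file.

```r
# Hypothetical input: write a three-row table to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Time,Discharge", "0,65", "30,72", "60,103"), csv_path)

# header = TRUE tells R that the first line holds the column names
newData <- read.csv(csv_path, header = TRUE)

str(newData)            # column types and a preview of the values
print(names(newData))
```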
Visualization
• plot()
• barplot()
• hist()
• Easy to customize labels, markers, line types, etc.
• Use help(…) to determine syntax and arguments
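As a rough illustration of customizing a plot, the sketch below draws the discharge data from the earlier data frame example and saves it to a PDF. The axis labels, title, and output path are illustrative assumptions; the slides do not specify units.

```r
newDataFrame <- data.frame(Time = c(0, 30, 60),
                           Discharge = c(65, 72, 103))

# Send the plot to a temporary PDF file instead of the screen
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

plot(newDataFrame$Time, newDataFrame$Discharge,
     type = "b",             # both points and connecting lines
     pch  = 16,              # marker: filled circle
     lty  = 2,               # line type: dashed
     xlab = "Time", ylab = "Discharge",
     main = "Discharge over time")

invisible(dev.off())         # close the device so the file is written
```

In an interactive session (for example in RStudio), omitting the pdf() and dev.off() calls shows the plot in the plotting pane instead.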
Group Coding Challenge
• In the same groups as your semester project teams:
https://goo.gl/1420T2
Next time…
• Practice, review, and be comfortable with R and the
environment
• Get data using packages and get into visualization