Using R for Statistical Instruction

Getting Started
By Buddy Bilbrey, Lander University
Advantages of R
• Open source (free)
• A 2016 KDnuggets survey reported that R is the top software among
professionals
• Industry professionals use it because of its price and capabilities
• Extremely powerful (many different contributors)
• Stable on virtually any platform
• Performs more than statistics (data mining and other analytic
functions)
• Gives students a skill that is extremely desirable
Disadvantages of R
• STEEP learning curve
• Uses typed commands instead of GUI
• Newer versions sometimes force upgrading the software
• No warranty or defined help available (must use the web)
• Solutions are difficult to find due to limited online examples (this gets
better every day)
Finding and Installing the Software
• https://www.r-project.org/
• Must choose a mirror site for downloading
• Must add each package individually
• Throwback to 1990s command-line software
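Adding packages one at a time is done from the R console with `install.packages()`. A minimal sketch, assuming a helper named `ensure_package` (hypothetical, not part of R) and the CRAN cloud mirror as the default repository; any mirror from the list should work equally well:

```r
# Hypothetical helper: install a package only if it is not already available.
# The default mirror URL is an assumption; substitute any CRAN mirror.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)  # downloads from the chosen mirror
  }
  # TRUE if the package can now be loaded, FALSE otherwise
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call does not trigger a download.
ok <- ensure_package("stats")
print(ok)
```

Calling `ensure_package("Rcmdr")` in the same way would download R-Commander on first use and do nothing on later calls.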
Downloading R
https://cran.r-project.org/mirrors.html
• All mirror sites have identical copies
• This prevents one server from becoming overloaded
Mirror Sites List by Country
From inside the R software, select a mirror site for each new package installed.
[Screenshots: list of mirror sites by country; list of packages available for installing]
Some Available GUIs
• R-Commander (most common)
• Sciviews-K
• RKWard
• PMG
• Red-R
• R Analytic Flow
R-Commander (GUI)
• Must be installed separately as a package
• Very common commands included
• Does graphing including 3D graphics
• Incomplete
• Some commands need to be run in R on the command line
• Other packages may need to be installed (qcc (quality control), DOE, etc.)
• Unavailable functions are greyed out automatically
• NOTE: It is difficult to go from a GUI to typing in commands on the
command line
Opening R-Commander (case sensitive)
Type the following AFTER the Rcmdr package is installed in R
> library(Rcmdr)
R-Commander Examples
Importing Data into Rcmdr
• Can import from the clipboard (copy/paste)
• Can import from files
• .csv files work best for importing an entire file
• Excel files are sometimes difficult to import
Importing into Rcmdr with Clipboard or File
[Screenshot: name the dataset for later referencing; choose the clipboard or a .csv file as the source]
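The same .csv import can be done without the GUI. A minimal sketch, assuming made-up data written to a temporary file so the example is self-contained; the column names `Qty` and `Price` and the values are illustrative only:

```r
# Made-up example data, written to a temporary .csv file for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Qty,Price", "1,48.5", "2,45.0", "3,44.5"), csv_path)

# header = TRUE treats the first row as column names (the read.csv default).
priceData <- read.csv(csv_path, header = TRUE)

str(priceData)    # column names and types
nrow(priceData)   # number of data rows
```

Naming the result (`priceData` here) plays the same role as naming the dataset in the Rcmdr import dialog: later commands refer to the data by that name.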
Reference Books for use with R instruction
• R for Business Analytics by A. Ohri (recommended for beginners)
• Business Analytics for Managers by Wolfgang Jank
• An Introduction to Statistical Learning (ISLR)
• Free .pdf book (advanced analytics)
• www.StatLearning.com, which redirects to http://www-bcf.usc.edu/~gareth/ISL/
[Screenshot: ISLR online book page with the eBook .pdf download]
ISLR Basic Commands 2.3.1
• Type in a single column of data
> x = c(1,6,2)    # stores the three values in variable x
> y = c(1,4,3)    # stores the three values in variable y
> x               # print values stored in x
[1] 1 6 2
> length(x)       # number of values in x
[1] 3
ISLR Basic Commands 2.3.1 (cont.)
> x = matrix(data=c(1,2,3,4), nrow=2, ncol=2)    # create a table of data
> x
     [,1] [,2]
[1,]    1    3
[2,]    2    4
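Note that the values 1, 2, 3, 4 filled the matrix column by column. A short sketch of the difference between the default fill order and the `byrow = TRUE` argument of `matrix()`; the variable names `by_col` and `by_row` are illustrative only:

```r
# Default: values fill the matrix column by column.
by_col <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)

# byrow = TRUE fills row by row instead.
by_row <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)

print(by_col)
print(by_row)
dim(by_col)   # number of rows and columns
```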
ISLR Basic Commands 2.3.1 (cont.)
> sqrt(x)    # take the square root of each value
     [,1] [,2]
[1,] 1.00 1.73
[2,] 1.41 2.00
> x^2        # square each value
     [,1] [,2]
[1,]    1    9
[2,]    4   16
Other Common Commands
• Create a linear regression model* and store it in the variable priceModel
> priceData <- read.csv(file.choose())              # input the data with a file dialog
> priceModel <- lm(Price~Qty, data = priceData)     # build the regression model
> summary(priceModel)                               # print the results
* Data came from “Business Analytics for Managers” by Wolfgang Jank.
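The regression commands above need a data file chosen through a dialog. A self-contained sketch of the same workflow, assuming made-up `Qty` and `Price` values (this is not the data set from the book) so it can be run without any file:

```r
# Made-up data for illustration only (not the data set from the book).
priceData <- data.frame(Qty = 1:10)
priceData$Price <- 50 - 2 * priceData$Qty + rep(c(0.5, -0.5), times = 5)

# Fit Price as a linear function of Qty.
priceModel <- lm(Price ~ Qty, data = priceData)

coef(priceModel)                                     # intercept and slope
predict(priceModel, newdata = data.frame(Qty = 12))  # predicted Price at Qty = 12
summary(priceModel)$r.squared                        # proportion of variance explained
```

Individual pieces of the fitted model, such as the coefficients or R-squared, can be extracted this way instead of reading them off the full `summary()` printout.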
Experiences with Teaching Statistics in R
• Bite-sized examples work well (one-sample t-test, two-sample t-test,
regression, etc.)
• Students actually retain the commands well with practice
• Difficult to go backwards from a GUI to the command line
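As an illustration of a bite-sized example, the two t-tests mentioned above can each be run in a single line with `t.test()`. A minimal sketch, assuming made-up exam scores for two hypothetical class sections:

```r
# Made-up exam scores for two hypothetical class sections.
sectionA <- c(72, 85, 78, 90, 66, 81, 77, 88)
sectionB <- c(70, 65, 80, 74, 69, 72, 68, 75)

# One-sample t-test: is the mean of section A different from 75?
oneSample <- t.test(sectionA, mu = 75)

# Two-sample t-test (Welch by default, which does not assume equal variances).
twoSample <- t.test(sectionA, sectionB)

oneSample$p.value   # p-value of the one-sample test
twoSample$p.value   # p-value of the two-sample test
```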