An Introduction to

Dec 28, 2015

Page 1: An Introduction to. Web resources R home page:   R Archive:

An Introduction to


Web resources

• R home page: http://www.r-project.org/

• R Archive: http://cran.r-project.org/

• R FAQ (frequently asked questions about R): http://cran.r-project.org/doc/FAQ/R-FAQ.html

• R manuals: http://cran.r-project.org/manuals.html


The R environment

• R command window (console) or Graphical User Interface (GUI)

– Used for entering commands, data manipulations, analyses, graphing

– Output: results of analyses, queries, etc. are written here

– Toggle through previous commands by using the up and down arrow keys


The R environment

• The R workspace

– Current working environment

– Composed primarily of variables, datasets, and functions


The R environment

• R scripts

– A text file containing commands that you would enter on the command line of R

– To place a comment in an R script, use a hash mark (#); R ignores everything after it on that line


Some tips for getting started in R

1. Create a new folder on your hard drive for your current R session

2. Open R and set the working directory to that folder

3. Save the workspace with a descriptive name and date

4. Open a new script and save the script with a descriptive name and date
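The first three steps can also be run from the console. The sketch below assumes a hypothetical folder name (r_short_course) created under R's temporary directory so that it runs anywhere; in practice you would point it at a folder of your own. Step 4 is normally done from the script editor's menu.

```r
# Step 1: create a folder for this session (hypothetical name and location)
session.dir <- file.path(tempdir(), "r_short_course")
dir.create(session.dir, showWarnings = FALSE)

# Step 2: set the working directory to that folder (setwd returns the old one)
old.dir <- setwd(session.dir)
getwd()

# Step 3: save the workspace with a descriptive name and today's date
workspace.file <- paste0("short_course_", Sys.Date(), ".RData")
save.image(workspace.file)

# Return to the previous working directory when finished
setwd(old.dir)
```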


Executing simple commands

• The assignment operator <-

• x <- 5 assigns the value of 5 to the variable x

• y <- 2*x assigns the value of 2 times x (10 in this case) to the variable y

• r <- 4

• area.circle <- pi*r^2

• NOTE: R is case-sensitive (y ≠ Y)
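Typed at the console, the commands above behave as follows (a small sketch using only the values from this slide):

```r
x <- 5                    # assign 5 to x
y <- 2 * x                # y is 10
r <- 4
area.circle <- pi * r^2   # area of a circle with radius 4

y             # 10
area.circle   # about 50.27

# R is case-sensitive: the object y is not the same name as Y
exists("y")
```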


R object types

• Vector

• Matrix

• Array

• Data frame

• Function

• List


Vectors and arrays

• Vector: a one-dimensional array; all elements of a vector must be of the same type (numeric, character, etc.)

• Matrix: a two-dimensional array with rows and columns

• Array: like a matrix, but with an arbitrary number of dimensions


Entering data into R

• Vectors:

• The "c" command

– combine or concatenate data

• Data can be character or numeric

v1 <- c(12, 5, 6, 8, 24)

v2 <- c("Yellow perch", "Largemouth bass", "Rainbow trout", "Lake whitefish")


Entering data into R

• Vectors:

• Sequences of numbers

c()

seq()

years<-c(1990:2007)

x<-seq(0, 200, length=100)

x<-seq(0,100,10)
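A short sketch of what these vector commands produce. The names x1 and x2 are introduced here only so that the two sequences do not overwrite each other.

```r
v1 <- c(12, 5, 6, 8, 24)                       # numeric vector
v2 <- c("Yellow perch", "Largemouth bass",
        "Rainbow trout", "Lake whitefish")     # character vector

years <- 1990:2007                    # consecutive integers 1990, ..., 2007
x1 <- seq(0, 100, by = 10)            # 0, 10, 20, ..., 100
x2 <- seq(0, 200, length.out = 100)   # 100 evenly spaced values from 0 to 200

length(years)   # 18
length(x1)      # 11
class(v1)       # "numeric"
class(v2)       # "character"
```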


Entering data into R

• Arrays:

array()

matrix()

m1<-array(1:20, dim=c(4,5))

m2<-matrix(1:20, ncol=5, nrow=4)
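Both calls fill the values column by column, so they give the same 4 x 5 result. The sketch below checks this and adds two variations (m3 and a3 are names made up for this example).

```r
m1 <- array(1:20, dim = c(4, 5))
m2 <- matrix(1:20, ncol = 5, nrow = 4)

all(m1 == m2)   # TRUE: both are filled column by column
m2[2, 3]        # 10 (row 2 of the third column 9, 10, 11, 12)

# byrow = TRUE fills row by row instead
m3 <- matrix(1:20, ncol = 5, nrow = 4, byrow = TRUE)
m3[2, 3]        # 8 (third value of the second row 6, 7, 8, 9, 10)

# An array can have more than two dimensions
a3 <- array(1:24, dim = c(2, 3, 4))
dim(a3)         # 2 3 4
```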


Entering data into R

• Arrays:

• Combine vectors as columns or rows

cbind()

rbind()

Matrix1 <- cbind(v1, v2)

Matrix2 <- rbind(v1, v2)
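A note of caution, shown as a sketch with made-up vectors a, b, and species: cbind() and rbind() work most cleanly on vectors of equal length, and because a matrix holds a single type, mixing numbers with text turns every entry into text.

```r
a <- c(12, 5, 6, 8)   # hypothetical numeric vectors of equal length
b <- c(3, 7, 1, 9)

by.col <- cbind(a, b)   # vectors become columns: 4 rows, 2 columns
by.row <- rbind(a, b)   # vectors become rows:    2 rows, 4 columns
dim(by.col)
dim(by.row)

# Mixing numeric and character data gives a character matrix
species <- c("Yellow perch", "Largemouth bass", "Rainbow trout", "Lake whitefish")
mixed <- cbind(a, species)
typeof(mixed)   # "character"
```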


Data frames

• A data frame is a list of variables of the same length with unique row names

• A collection of variables which share many of the properties of matrices and of lists

• Used as the fundamental data structure by most of R's modeling software


Data frames

• Convert vectors or matrices into a data frame

data.frame()

df1<-data.frame(v1, v2)

df2<-data.frame(Matrix1)
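A minimal sketch of both conversions, using made-up vectors (catch, species) of equal length, since data.frame() needs columns with the same number of rows. Unlike a matrix, each column of a data frame keeps its own type.

```r
catch   <- c(12, 5, 6, 8)   # hypothetical data
species <- c("Yellow perch", "Largemouth bass", "Rainbow trout", "Lake whitefish")

# From vectors: one column per vector
df1 <- data.frame(catch, species, stringsAsFactors = FALSE)
str(df1)

# From a matrix: column names carry over
m <- cbind(a = 1:3, b = 4:6)
df2 <- data.frame(m)
names(df2)   # "a" "b"
```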


Data frames

• Editing data frames in spreadsheet-like view

edit()

df2<-edit(df1)


• Let’s go to R and enter some vectors, arrays, and data frames

Script: QFC R short course R object types_1_vectors and arrays.R


Placing variables in the R search path

• When variables in a data frame are used in R, the data frame name followed by a $ sign and then the variable name is required

query1<-df1$v3 > 20


Placing variables in the R search path

• Alternatively, the attach() function can be used

attach(df1)

query1 <- v3 > 20

detach(df1)
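The two approaches give the same result. A sketch with a made-up df1 (columns v3 and v4 are invented values):

```r
df1 <- data.frame(v3 = c(15, 22, 31, 8), v4 = c(40, 25, 50, 12))   # hypothetical data

# With the $ sign
query1 <- df1$v3 > 20

# With attach(): v3 can be used by name until detach() is called
attach(df1)
query1.attached <- v3 > 20
detach(df1)

identical(query1, query1.attached)   # TRUE
```

Calling detach() as soon as you are done helps avoid confusion between columns and other objects that share a name.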


Accessing data from an array, vector, or data frame

• Subscripts are used to extract data from objects in R

• Subscripts appear in square brackets and reference rows and columns, respectively


Subscripts

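df1

     C1   C2     C3   C4   C5
R1   25   Mon    56   45   Cat
R2    2   Tues   84    2   Dog
R3   24   Wed     7   15   Dog
R4   15   Thurs  56  236   Cat
R5   26   Fri    89    6   Cat
R6   25   Sat    23   58   Dog
R7    2   Sun    11    8   Dog

df1[3,5] : row 3, column 5

df1[,3] : all rows of column 3

df1[5,] : all columns of row 5

df1[2:5,] : rows 2 through 5, all columns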
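The table on this slide can be rebuilt and indexed as follows. This is a sketch: the data frame is called df1 here, and stringsAsFactors = FALSE is set so the text columns stay as plain character strings in any R version.

```r
df1 <- data.frame(
  C1 = c(25, 2, 24, 15, 26, 25, 2),
  C2 = c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"),
  C3 = c(56, 84, 7, 56, 89, 23, 11),
  C4 = c(45, 2, 15, 236, 6, 58, 8),
  C5 = c("Cat", "Dog", "Dog", "Cat", "Cat", "Dog", "Dog"),
  row.names = paste0("R", 1:7),
  stringsAsFactors = FALSE
)

df1[3, 5]    # "Dog": row 3, column 5
df1[, 3]     # 56 84 7 56 89 23 11: all of column 3
df1[5, ]     # all of row 5
df1[2:5, ]   # rows 2 through 5
```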


Queries in R: Common logical operators

Symbol   Meaning
>        Greater than
<        Less than
==       Equals
!x       Logical negation (not x)
x & y    Logical and (x and y)
x | y    Logical or (x or y)


Queries in R

• The use of logical tests

query1<-df1$v3 > 20

df1[query1,]

query2<-df1$v3 > 20 & df1$v4 < 30   # & = and

query2<-df1$v3 > 20 | df1$v4 < 30   # | = or
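A logical test returns one TRUE or FALSE per row, and that vector can then be used as a row subscript. A sketch with a made-up df1 (the names q.and and q.or are introduced here so both results can be compared):

```r
df1 <- data.frame(v3 = c(15, 22, 31, 8), v4 = c(40, 25, 50, 12))   # hypothetical data

query1 <- df1$v3 > 20
query1            # FALSE TRUE TRUE FALSE
df1[query1, ]     # rows 2 and 3
df1[!query1, ]    # negation: rows 1 and 4

q.and <- df1$v3 > 20 & df1$v4 < 30   # both conditions must hold
q.or  <- df1$v3 > 20 | df1$v4 < 30   # at least one condition must hold
q.and             # FALSE TRUE FALSE FALSE
q.or              # FALSE TRUE TRUE TRUE
```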


Queries in R

Script: QFC R short course R object types_2_query arrays data frames.R


Exercise 1


Importing data from Excel (or other database management programs)

• Export as text file (.txt)

• Tips

– Avoid spaces in variable and character names; use a period instead (e.g., fish.weight, not fish weight, and Round.lake, not Round lake)

– Replace missing data with "NA"

– See Excel example (MI STORET data RAW.xls)


Importing data from Excel

• read.table()

data.frame.name <- read.table("file path", na.strings="NA", header=TRUE)

df1<-read.table("C:\\R\\Example\\datafile1.txt", na.strings="NA", header=TRUE)

Note the use of \\ instead of \ in the path name: a single backslash is an escape character inside R strings. Forward slashes (/) also work.


Importing data from Excel

• If your working directory is set, R will automatically look for the data text file there.

• So, the read.table syntax can be simplified by excluding the file path name:

• read.table("data.txt", na.strings="NA", header=T)
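To try read.table() without an exported Excel file, the sketch below first writes a tiny text file to R's temporary directory and then reads it back. The lake names and weights are made up.

```r
# Write a small space-delimited text file (hypothetical contents)
data.file <- file.path(tempdir(), "data.txt")
writeLines(c("lake fish.weight",
             "Round.lake 12.5",
             "Long.lake NA",
             "Mud.lake 8.1"), data.file)

# Read it: the first line holds variable names, "NA" marks missing data
df1 <- read.table(data.file, na.strings = "NA", header = TRUE)
df1

is.na(df1$fish.weight)                # FALSE TRUE FALSE
mean(df1$fish.weight, na.rm = TRUE)   # 10.3
```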


Exporting data from R

write.table()

write.table(df, file = "Path Name\\file_name.csv", sep = ",", col.names = NA)
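A round-trip sketch, assuming a small made-up data frame and a file written to R's temporary directory. With col.names = NA the header gets a blank first entry for the row-names column, so the file can be read back with row.names = 1.

```r
df <- data.frame(site = c(1, 2, 3), no.fish = c(12, 55, 8))   # hypothetical data

out.file <- file.path(tempdir(), "file_name.csv")
write.table(df, file = out.file, sep = ",", col.names = NA)

readLines(out.file)   # inspect the raw lines of the file

# Read the file back, using the first column as row names
back <- read.csv(out.file, row.names = 1)
back
```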


Introduction to R functions

• R has many built-in functions, and many more can be downloaded from CRAN (the Comprehensive R Archive Network)

• User-defined functions can also be created


The R base package


Introduction to R functions

• Common functions

names(): obtain variable names of a df

summary(): summary of all variables in a df

mean(): Mean

var(): Variance

sd(): standard deviation

Script: QFC R short course R functions_1.R
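A quick sketch of these functions on a small made-up data frame (the column names count and depth are invented):

```r
df1 <- data.frame(count = c(12, 5, 6, 8, 24),
                  depth = c(2.5, 3.0, 1.5, 4.0, 3.5))   # hypothetical data

names(df1)        # "count" "depth"
summary(df1)      # minimum, quartiles, mean, and maximum of each variable

mean(df1$count)   # 11
var(df1$count)    # 60
sd(df1$count)     # square root of the variance, about 7.75
```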


Introduction to R functions, cont

head(): print first few rows of data frame

sapply(): column-wise summaries; tapply(): summaries by group

levels(): obtain levels of a factor (categorical variable)

by(): produce summaries by group


Introduction to R functions, cont

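tapply(variable, list(group1, group2), mean)

Applies a function to each cell of a ragged array (each combination of group levels)

sapply(variable, FUN=)

Applies a function to each element of a list or vector

by(data, INDICES, FUN)

Applies a function to a data frame split by the grouping variables in INDICES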
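A sketch of these summary functions on a made-up data frame named fish (its columns and values are invented for illustration):

```r
fish <- data.frame(
  stream   = c("A", "A", "B", "B", "A", "B"),
  position = c("Above", "Below", "Above", "Below", "Above", "Below"),
  no.fish  = c(10, 60, 30, 80, 20, 40),
  stringsAsFactors = TRUE   # store the text columns as factors
)

head(fish, 3)             # first three rows
levels(fish$position)     # "Above" "Below"
sapply(fish, class)       # class of every column

# Mean number of fish per stream: A = 30, B = 50
stream.means <- tapply(fish$no.fish, fish$stream, mean)

# Mean per stream and position (a 2 x 2 table)
two.way <- tapply(fish$no.fish, list(fish$stream, fish$position), mean)
two.way

# Full summary of no.fish for each position
by(fish$no.fish, fish$position, summary)
```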


Introduction to R loops

Basic syntax:

for (i in 1:n){

some code

}

*Excel example

Script: QFC R short course R functions_2.R
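A minimal loop following this syntax. The example (squaring the numbers 1 to n) is made up for illustration.

```r
n <- 5
squares <- numeric(n)   # create the result vector before the loop

for (i in 1:n) {
  squares[i] <- i^2     # runs once for each value of i
}

squares   # 1 4 9 16 25
```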


User-defined functions

function.name <- function(arguments){

commands that use the arguments }

Script: QFC R short course R user defined functions_3.R
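As a sketch, the circle-area calculation from the earlier slide can be wrapped in a function. The name circle.area is made up for this example.

```r
# r is the argument; the value of the last expression is returned
circle.area <- function(r) {
  pi * r^2
}

circle.area(4)            # about 50.27
circle.area(c(1, 2, 3))   # also works on a whole vector of radii
```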


Exercise 2 (Part 1)


R functions part 2: subset data

• subset() function

sub<- subset(data frame, criteria)

sub1<-subset(fish, no.fish > 50)

sub2<-subset(fish, no.fish>50 & position=="Below")


R functions part 2: subset data

• Select specific columns

sub3<-subset(fish, select=c(stream, site, no.fish))

Script: QFC R short course R subset_4.R
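The subset() calls from these slides, run on a made-up fish data frame (the values are invented; the column names follow the slides):

```r
fish <- data.frame(
  stream   = c("A", "A", "B", "B", "C", "C"),
  site     = 1:6,
  position = c("Above", "Below", "Above", "Below", "Above", "Above"),
  no.fish  = c(10, 60, 30, 80, 20, 55),
  stringsAsFactors = FALSE
)

sub1 <- subset(fish, no.fish > 50)                         # rows 2, 4, and 6
sub2 <- subset(fish, no.fish > 50 & position == "Below")   # rows 2 and 4
sub3 <- subset(fish, select = c(stream, site, no.fish))    # three columns, all rows

nrow(sub1)    # 3
nrow(sub2)    # 2
names(sub3)   # "stream" "site" "no.fish"
```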


Exercise 2 (Part 2)


Introduction to basic graphing

http://addictedtor.free.fr/graphiques/


Graphing basics

Plotting commands

1. High-level functions: Create a new plot on the graphics device

2. Low-level functions: Add more information to an already existing plot, such as extra points, lines, and labels

3. Interactive graphing functions: Allow you to interactively add information to a graph


Common high-level functions

• plot(): A generic function that produces a type of plot that depends on the type of the first argument

• hist(): Creates a histogram of frequencies

• barplot(): Creates a bar plot of values

• boxplot(): Creates a boxplot

• pairs(): Creates a scatter plot matrix


Common high-level functions

plot()

plot(x)

plot(x,y) : scatter plot

plot(y~x) : scatter plot

plot(group, x) : box plot (when group is a factor)


Common high-level functions

hist(x)

boxplot(x~group)

pairs(z)

pairs(df1[,3:7])
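A sketch that runs each high-level call on made-up data (x, y, and group are invented values). The pdf(NULL) line sends the plots to a null device so the example runs without opening a window or writing a file; leave it and dev.off() out when working interactively.

```r
pdf(NULL)   # null graphics device; omit when plotting interactively

x <- c(110, 125, 131, 142, 150, 158, 163, 171)      # hypothetical lengths
y <- c(14, 20, 24, 31, 36, 43, 47, 55)              # hypothetical weights
group <- factor(rep(c("DWOR", "MCCA"), each = 4))   # a grouping factor

plot(x)              # values of x against their index
plot(x, y)           # scatter plot
plot(y ~ x)          # the same scatter plot in formula form
plot(group, x)       # box plot, because group is a factor

h <- hist(x)         # histogram; the bin counts are returned in h$counts
boxplot(x ~ group)   # box plot of x for each group
pairs(data.frame(x, y))   # scatter plot matrix

invisible(dev.off())
```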


Common high-level functions

Script:

QFC R course Graphing Basics_1.R


Exercise 3


END OF DAY 1


Lower-level graphing functions

[Figure: four example plots. (1) "Length (mm) histogram": density of Chinook salmon lengths. (2) "Boxplot of length": Length (mm) for hatcheries DWOR, MCCA, and RAPH. (3) "Chinook triglyceride levels for three hatcheries": Triglycerides (mg/dL). (4) "Scatter plot of length-weight": Weight (g) against Length (mm), with a legend for DWOR, MCCA, and RAPH and the text label "Big Fish".]


Lower-level graphing functions

• Axis scales and labels

xlim=c(0,50)

ylim=c(0,100)

xlab="text"

ylab="text"

main="text"

cex= <1 will make font smaller than default, >1 will increase font size


Lower-level graphing functions

pch = symbol types

col = color types

[Figure: "Symbol shapes and colors", showing plotting symbols 1 through 25]


Lower-level graphing functions

• Adding lines, text, and points

abline()

abline(a,b) a= intercept, b = slope

abline(h=mean(x, na.rm=T))

text()

text(x,y, "text", options)

points()

points(x,y, options)
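A sketch that combines the axis arguments with the low-level functions, using made-up values. A high-level call creates the plot first, and each low-level call then adds to it. As before, pdf(NULL) only keeps the example from opening a window.

```r
pdf(NULL)   # null graphics device; omit when plotting interactively

x <- c(3, 8, NA, 12, 17, 21)   # hypothetical values with one missing observation

# High-level call: creates the plot and sets axis limits, labels, and title
plot(x, xlim = c(0, 7), ylim = c(0, 25),
     xlab = "Observation", ylab = "Value", main = "Low-level additions")

x.mean <- mean(x, na.rm = TRUE)   # 12.2
abline(h = x.mean, lty = 2)       # horizontal dashed line at the mean
abline(a = 0, b = 4)              # line with intercept 0 and slope 4
text(0.2, x.mean + 1, "mean of x", pos = 4, cex = 0.8)   # cex < 1 shrinks the text
points(c(2, 5), c(20, 5), pch = 16, col = "red")         # two extra filled points

invisible(dev.off())
```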


Lower-level graphing functions

Scripts:

QFC R course Graphing Basics_2.R

QFC R course Graphing Basics_3.R


Exercise 4


Introduction to statistical analyses

• R provides many functions for statistical analyses

– Descriptive

– Univariate

– Multivariate

– Mixed models

– Spatial

– Bayesian


Introduction to statistical analyses

• Descriptive statistics

Correlations: cor()

cor(df1[,2:6])

t-tests: t.test()

t.test(y~group)

Script:

QFC R short course correlations and t-test.R
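A sketch of both functions on a made-up data frame (group labels and measurements are invented):

```r
df1 <- data.frame(
  group  = rep(c("DWOR", "MCCA"), each = 5),
  length = c(120, 131, 142, 150, 163, 112, 125, 133, 139, 151),
  weight = c(18, 23, 30, 35, 46, 15, 21, 26, 29, 38)
)   # hypothetical data

# Correlation matrix of the numeric columns (columns 2 and 3 here)
cor.mat <- cor(df1[, 2:3])
cor.mat

# Two-sample t-test comparing mean length between the two groups
tt <- t.test(length ~ group, data = df1)
tt$estimate   # the two group means
tt$p.value
```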


Basic model structure in R

response variable ~ predictor variable(s)


Symbols in model statements are used differently compared to arithmetic expressions

Symbol Meaning

+ Indicates inclusion of a predictor variable, not addition

- Indicates the deletion of a predictor variable, not subtraction

* Indicates inclusion of a predictor variable and an interaction, not multiplication

/ Indicates nesting of predictor variables, not division

| Indicates conditioning

: Indicates an interaction (e.g., A:B is a two-way interaction between A and B)
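One way to see how R reads these symbols is to ask terms() which model terms a formula expands to. The variables y, A, and B below are placeholders; no data are needed for this.

```r
# A*B is shorthand for both main effects plus their interaction
star  <- attr(terms(y ~ A * B), "term.labels")         # "A" "B" "A:B"
plus  <- attr(terms(y ~ A + B + A:B), "term.labels")   # "A" "B" "A:B"

# The minus sign removes a term, here the interaction
minus <- attr(terms(y ~ A * B - A:B), "term.labels")   # "A" "B"

identical(star, plus)   # TRUE
minus
```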


Specifying models in R

• Linear regression example:

$$y_i = \beta_0 + \beta_1 x_i + e_i, \qquad i = 1, 2, \ldots, n$$


Linear regression

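$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= \beta_0 \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}
+ \beta_1 \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
+ \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
= \underbrace{\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}}_{\text{Design matrix } X}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+ \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
$$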


Linear regression

We can solve for $\beta_0$ and $\beta_1$ by

$$\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{bmatrix} = (X^T X)^{-1} X^T Y$$

Script:

QFC R short course simple linear regression example 1.R
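The matrix solution can be checked against lm() with a few lines of R. This is a sketch on made-up x and y values:

```r
x <- c(1, 2, 3, 4, 5)              # hypothetical predictor
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)   # hypothetical response

X <- cbind(1, x)   # design matrix: a column of ones and the x values

# (X'X)^-1 X'Y: t() transposes, %*% multiplies matrices, solve() inverts
beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y
beta.hat

# lm() gives the same intercept and slope
fit <- lm(y ~ x)
coef(fit)
```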


Simple linear regression lm()

Examples

lm(y~x), with intercept and where x and y are continuous

lm(y~1+x), with intercept

lm(y~0+x), regression through the origin (no intercept)

lm(y~A), where A is a categorical variable

lm(y~x + A)

lm(y~A*B) = lm(y~A+B+A:B)


Simple linear regression lm()

Example:

Model1<-lm(length~weight, data=reg)

Model1<-lm(log(length)~log(weight), data=reg)


Linear regression

• Model diagnostics

– summary()

– residuals()

– fitted()

– plot()

Script: QFC R short course simple linear regression example_2.R
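A sketch of fitting a model and running the diagnostic functions. The data frame reg below is made up, with column names taken from the lm() example on the earlier slide.

```r
reg <- data.frame(
  weight = c(18, 23, 30, 35, 46, 52, 58, 64),
  length = c(120, 131, 142, 150, 163, 170, 174, 181)
)   # hypothetical data

Model1 <- lm(length ~ weight, data = reg)

summary(Model1)             # coefficients, standard errors, R-squared
res <- residuals(Model1)    # observed minus fitted values
fit.vals <- fitted(Model1)  # predicted length for each observation

# Standard diagnostic plots, drawn on a null device so no window opens
pdf(NULL)
par(mfrow = c(2, 2))
plot(Model1)
invisible(dev.off())
```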


Linear regression

• Subset data for regression

• Model1<-lm(length ~ wgt, subset=hatchery=="DWOR", data=reg1)

• Running models through a loop

Script: QFC R short course simple linear regression example_3.R
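One possible way to run the same regression for every hatchery in a loop, sketched on a made-up reg1 data frame. The results table and its column names are assumptions for this example, not taken from the course script.

```r
reg1 <- data.frame(
  hatchery = rep(c("DWOR", "MCCA", "RAPH"), each = 4),
  wgt      = c(18, 23, 30, 35, 15, 21, 26, 29, 20, 27, 33, 41),
  length   = c(120, 131, 142, 150, 112, 125, 133, 139, 124, 137, 146, 158),
  stringsAsFactors = FALSE
)   # hypothetical data

# A single hatchery, using the subset argument of lm()
Model1 <- lm(length ~ wgt, subset = hatchery == "DWOR", data = reg1)

# The same model for each hatchery, storing intercept and slope
hatcheries <- unique(reg1$hatchery)
results <- data.frame(hatchery = hatcheries,
                      intercept = NA_real_, slope = NA_real_)

for (i in seq_along(hatcheries)) {
  one.hatchery <- reg1[reg1$hatchery == hatcheries[i], ]   # rows for this hatchery
  fit <- lm(length ~ wgt, data = one.hatchery)
  results$intercept[i] <- coef(fit)[1]
  results$slope[i]     <- coef(fit)[2]
}

results
```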


Analysis of variance

• aov()

• Categorical explanatory variables

• Compare the mean values of multiple groups

anova<-aov(y~groups)

Script:

QFC short course ANOVA 1.R
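A sketch of a one-way analysis of variance on made-up values for three groups. The fitted object is named fit here rather than anova, so it is not confused with R's anova() function.

```r
y <- c(20, 22, 25, 30, 33, 31, 41, 39, 44)                  # hypothetical response
groups <- factor(rep(c("DWOR", "MCCA", "RAPH"), each = 3))  # three groups of three

fit <- aov(y ~ groups)
aov.table <- summary(fit)   # ANOVA table: Df, sums of squares, F value, p-value
aov.table

tapply(y, groups, mean)     # the group means being compared
```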


Exercise 5


Nonlinear regression

• Estimating parameters is trickier than for linear regression models

• Iterative search procedure required

• Must provide starting values of parameters

• Convergence issues


Nonlinear regression

Differences in R between linear and nonlinear regression

1. For nonlinear regression models, the user must specify the exact equation as part of the model statement

2. The user must specify initial guesses for the values of the parameters being estimated


Nonlinear regression

• Von Bertalanffy growth model

$$L_t = L_\infty \left[ 1 - e^{-k(t - t_0)} \right] + \text{error}$$

$L_t$ is length at age $t$

$L_\infty$ is the asymptotic average maximum length

$k$ is the growth rate coefficient that determines how quickly the maximum size is attained

$t_0$ is the hypothetical age at which the species has zero length


Nonlinear regression

[Figure: plot of Length (mm) against Age (yrs), annotated with the model $L_t = L_\infty \left[ 1 - e^{-k(t - t_0)} \right] + \text{error}$]


Nonlinear regression

nls()

Least-squares estimates of the parameters of a nonlinear model


Nonlinear regression

vonB1<- nls(length~Linf*(1-exp(-k*(age-t0))), data=length.age, start=list(Linf=1000, k=0.05, t0=-2))

$$L_t = L_\infty \left[ 1 - e^{-k(t - t_0)} \right] + \text{error}$$


Nonlinear regression

Graphing fitted lines:

1. Generate a sequence of numbers that cover the range of the x-axis

2. Generate predicted values for the sequence of x-values

3. Plot original data

4. Overlay predicted values
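The four steps can be sketched as follows. The length-at-age data are made up: they are generated from a von Bertalanffy curve with a small deterministic wobble added, and the starting values are chosen close to the values used to generate them. With real data, starting values usually need some trial and error.

```r
# Hypothetical length-at-age data (not from the course data set)
age <- 1:15
length.age <- data.frame(
  age    = age,
  length = 900 * (1 - exp(-0.2 * (age + 1))) + 10 * sin(1.7 * age)
)

# Fit the von Bertalanffy model; starting values are rough guesses
vonB1 <- nls(length ~ Linf * (1 - exp(-k * (age - t0))),
             data = length.age,
             start = list(Linf = 950, k = 0.18, t0 = -1.2))
coef(vonB1)

# Step 1: a sequence covering the range of the x-axis
age.seq <- seq(min(length.age$age), max(length.age$age), length.out = 100)

# Step 2: predicted lengths for that sequence
pred <- predict(vonB1, newdata = data.frame(age = age.seq))

# Steps 3 and 4: plot the original data, then overlay the fitted line
pdf(NULL)   # null graphics device; omit when plotting interactively
plot(length ~ age, data = length.age, xlab = "Age (yrs)", ylab = "Length (mm)")
lines(age.seq, pred)
invisible(dev.off())
```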


Nonlinear regression

Script:

QFC R course Von Bertalanffy Nonlinear regression.R

Exercise 6