Stat 437 Lecture Notes 1
Xiongzhi Chen
Washington State University

Contents

Set up RStudio
  Install R and Rstudio
  Rstudio: a snapshot
  Rstudio
Objects in R: I
  Scalars in R
  Vectors in R: I
  Vectors in R: II
  The seq command
  Matrices in R: I
  Matrices in R: II
  Matrices in R: III
  Matrices in R: IV
  Data frames in R: I
  Data frames in R: II
  Data frames in R: III
  Data frames in R: IV
Objects in R: II
  Character vectors in R
  Strings in R
  Factors in R: I
  Factors in R: II
  Logic operators in R: I
  Logic operators in R: II
  Logic operators in R: III
  Logic operators in R: IV
  Lists in R: I
  Lists in R: II
  Lists in R: III
  Set operations in R: I
  Set operations in R: II
  “Coerce” in R
  length and dim
R markdown
  Install R markdown
  Create a R markdown file
  Structure of a markdown file
  A sample markdown file
  Basic syntax: I
  Basic syntax: II
Install R and Rstudio
• Rstudio free version at: https://www.rstudio.com/products/rstudio/download/
• R at: https://www.r-project.org/
• install an R package by install.packages("package_name")
• install R packages “tidyverse”, “ggplot2”, “markdown”, “igraph”, “plotly”, “ggmap”
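As a rough sketch of the installation step, the snippet below checks which of the listed packages are not yet installed before calling install.packages. The helper name missing_packages is hypothetical (it is not part of R or of the course material), and the actual install call is left commented out because it needs an internet connection.

```r
# Packages named in the list above.
wanted <- c("tidyverse", "ggplot2", "markdown", "igraph", "plotly", "ggmap")

# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

to_install <- missing_packages(wanted)
print(to_install)

# Uncomment to install whatever is missing (needs an internet connection):
# if (length(to_install) > 0) install.packages(to_install)
```

Installing only what is missing avoids re-downloading packages each time the script is run.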
Rstudio: a snapshot
Rstudio
• Upper Left panel: R scripts, R markdown file, R project file, View data, etc
• Lower Left panel: R console, R markdown log, etc
• Upper Right panel: R workspace, History, etc
• Lower Right panel: Files in working directory, Plots, Help, etc
> matrix(1:6,nrow=2,ncol=3) # a 2-by-3 matrix
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> x = c(1,3,5) # a 3-component vector
> y = c(2,4,6) # a 3-component vector
> # stack x and y as 2 rows to obtain a 2-by-3 matrix
> rbind(x,y)
  [,1] [,2] [,3]
x    1    3    5
y    2    4    6
> # stack x and y as 2 columns to obtain a 3-by-2 matrix
> cbind(x,y)
     x y
[1,] 1 2
[2,] 3 4
[3,] 5 6
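The output above shows that matrix() fills entries column by column by default. A minimal sketch of the alternative, using the byrow argument of matrix() (the variable names m_col and m_row are chosen here for illustration):

```r
# Default: entries are filled column by column.
m_col <- matrix(1:6, nrow = 2, ncol = 3)

# byrow = TRUE: entries are filled row by row instead.
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

print(m_col)
print(m_row)

# Stacking the rows 1:3 and 4:6 with rbind gives the same entries as the byrow fill.
print(all(m_row == rbind(1:3, 4:6)))
```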
Matrices in R: II
> x=matrix(1:6,nrow=2,ncol=3) # a 2-by-3 matrix
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> x[,1] # 1st column of x
[1] 1 2
> x[2,] # 2nd row of x
[1] 2 4 6
> x[1,2] # (1,2)-entry of x
[1] 3
> t(x) # transpose of x
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
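Note that extracting a single row or column returns a plain vector, not a matrix. The sketch below contrasts this with the drop = FALSE argument of the subsetting operator, which keeps the matrix shape (variable names are illustrative):

```r
x <- matrix(1:6, nrow = 2, ncol = 3)

col1_vec <- x[, 1]                # plain vector: the dim attribute is dropped
col1_mat <- x[, 1, drop = FALSE]  # still a 2-by-1 matrix

print(dim(col1_vec))  # NULL
print(dim(col1_mat))  # 2 1
print(dim(t(x)))      # 3 2
```

Keeping the matrix shape matters when the result is passed on to code that expects two dimensions.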
Matrices in R: III
> x=matrix(1:6,nrow=2,ncol=3) # a 2-by-3 matrix
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> y = rbind(c(0,1,0),c(1,1,1))
> y
> x <- data.frame("SN" = 1:2, "Age" = c(21,15),
+ "Name" = c("John","Dora"))
> x
  SN Age Name
1  1  21 John
2  2  15 Dora
> x$SN # access SN
[1] 1 2
> x[,1] # access SN
[1] 1 2
> class(x$SN) # check object type for SN
[1] "integer"
> class(x$Name) # check object type for Name
[1] "factor"
> # note: R >= 4.0.0 returns "character" here unless stringsAsFactors = TRUE is set
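Whether the Name column is stored as a factor depends on the stringsAsFactors argument of data.frame, whose default changed from TRUE to FALSE in R 4.0.0. A small sketch that sets the argument explicitly so the result does not depend on the R version (x_chr and x_fac are illustrative names):

```r
# Strings kept as character vectors.
x_chr <- data.frame(SN = 1:2, Age = c(21, 15), Name = c("John", "Dora"),
                    stringsAsFactors = FALSE)

# Strings converted to factors.
x_fac <- data.frame(SN = 1:2, Age = c(21, 15), Name = c("John", "Dora"),
                    stringsAsFactors = TRUE)

print(class(x_chr$Name))   # "character"
print(class(x_fac$Name))   # "factor"
print(levels(x_fac$Name))  # "Dora" "John"
```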
Data frames in R: II
> x <- data.frame("SN" = 1:2, "Age" = c(21,15),
+ "Name" = c("John","Dora"))
> x
  SN Age Name
1  1  21 John
2  2  15 Dora
> x$SN[2] # access the 2nd entry of SN
[1] 2
> x[1,2] # access the 1st entry of Age
[1] 21
Caution: do not transpose a data.frame when it contains different types of objects
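The reason for the caution can be seen in a short sketch: t() converts the data frame to a matrix first, and a matrix can hold only one type, so mixed columns are all coerced to character.

```r
x <- data.frame(SN = 1:2, Age = c(21, 15), Name = c("John", "Dora"))

tx <- t(x)  # the data frame is converted to a matrix, then transposed

print(tx)
print(is.matrix(tx))   # TRUE: no longer a data frame
print(typeof(tx))      # "character": the numbers were turned into strings
print(is.numeric(tx))  # FALSE
```

After the transpose, arithmetic on the SN or Age entries no longer works without converting them back.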
Data frames in R: III
Import (malaria related death) data as data.frame:
> Y = read.csv("dataMalyria.csv",header = TRUE,sep=",",
+ colClasses=c("country"=NA,"percent"="numeric",
+ "labels"=NA))
> head(Y)
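To try the same read.csv call without the course file, the sketch below writes a tiny CSV to a temporary file first. The rows are made up purely for illustration and are not the malaria data; only the column names and the colClasses argument follow the call above. In colClasses, NA lets R pick the column type automatically.

```r
# Made-up rows for illustration only; not the course data set.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("country,percent,labels",
             "Alpha,12.5,high",
             "Beta,3.0,low"), csv_path)

# NA in colClasses means "detect the type automatically" for that column.
Y <- read.csv(csv_path, header = TRUE, sep = ",",
              colClasses = c("country" = NA, "percent" = "numeric",
                             "labels" = NA))

str(Y)      # show the type of each column
head(Y)

unlink(csv_path)  # remove the temporary file
```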
> w = c("a","b","c") # a vector of 3 character components
> w[2] # access the 2nd component
[1] "b"
> # 1st 10 upper case letters in the alphabet
> LETTERS[seq( from = 1, to = 10 )]
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
> # 1st 10 lower case letters in the alphabet
> letters[seq( from = 1, to = 10 )]
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
> Q = c("Go","WSU","Cougs","!")
> Q
[1] "Go" "WSU" "Cougs" "!"
> # concatenate two character vectors
> c(w,Q)
[1] "a" "b" "c" "Go" "WSU" "Cougs" "!"
Strings in R
> w = "Go cougs!"
> w
[1] "Go cougs!"
> v = "Data analytics"
> v
[1] "Data analytics"
> # concatenate two strings
> paste(w,v,sep = " ")
[1] "Go cougs! Data analytics"
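A few related string helpers from base R, as a brief sketch: paste0 joins without a separator, the collapse argument of paste joins the components of one character vector into a single string, and nchar counts characters.

```r
w <- "Go cougs!"
v <- "Data analytics"

joined_space <- paste(w, v)   # default separator is a single space
joined_tight <- paste0(w, v)  # no separator at all

Q <- c("Go", "WSU", "Cougs", "!")
one_string <- paste(Q, collapse = " ")  # collapse a vector into one string

print(joined_space)  # "Go cougs! Data analytics"
print(joined_tight)  # "Go cougs!Data analytics"
print(one_string)    # "Go WSU Cougs !"
print(nchar(w))      # 9
```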
Factors in R: I
> grades = c("A","F","D","C","B") # character vector
> grades
[1] "A" "F" "D" "C" "B"
> class(grades)
[1] "character"
> gradesF = factor(grades) # gradesF is now a factor
> gradesF
[1] A F D C B
Levels: A B C D F
> class(gradesF)
[1] "factor"
> # levels of the factor "gradesF"
> levels(gradesF)
[1] "A" "B" "C" "D" "F"
> # levels are ordered alphabetically
Factors in R: II
> x = c(1,3,2) # numeric vector
> b = factor(x) # change x into a factor
> b
[1] 1 3 2
Levels: 1 2 3
> levels(b) # levels are ordered from smallest to largest
[1] "1" "2" "3"
> # relabel levels of b
> d = factor(x,labels = c("3Level","1Level","2Level"))
> d
[1] 3Level 2Level 1Level
Levels: 3Level 1Level 2Level
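One consequence of levels being stored as ordered labels: calling as.numeric on a factor returns the integer level codes, not the original numbers. A short sketch, using values 10, 30, 20 (chosen here so that codes and values differ):

```r
x <- c(10, 30, 20)
b <- factor(x)  # levels are "10" "20" "30"

codes <- as.numeric(b)                   # level codes, not the original values
values <- as.numeric(as.character(b))    # recover the original numbers
values_alt <- as.numeric(levels(b))[b]   # same result, indexing the levels by code

print(codes)       # 1 3 2
print(values)      # 10 30 20
print(values_alt)  # 10 30 20
```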
Logic operators in R: I
> x = 0 # assign 0 to x
> x >0
[1] FALSE
> x == 0
[1] TRUE
> !x # return TRUE
[1] TRUE
> y = 1
> y >= 1
[1] TRUE
> !y # return FALSE
[1] FALSE
> x & y # "and"; return FALSE
[1] FALSE
> x | y # "or"; return TRUE
[1] TRUE
Logic operators in R: II
> x = 1
> y = -1
> x >0 & y > 0 # "and"
[1] FALSE
> x > 0 | y > 0 # "or"
[1] TRUE
> x >0 & !(y>0)
[1] TRUE
Logic operators in R: III
> x = c(1,2,3) # a 3-component vector
> x >0 # returns a 3-component logic vector
[1] TRUE TRUE TRUE
> x > 2 # returns a 3-component logic vector
[1] FALSE FALSE TRUE
> # return indices of entries of x that are greater than 2
> which(x>2)
[1] 3
> # take the subvector of x whose entries are not smaller than 2
> x[x >=2]
[1] 2 3
Logic operators in R: IV
> x = c(1,2,3) # a 3-component vector
> y = c(-1,4,-1) # a 3-component vector
> # compare x and y entrywise; return a 3-component vector
> x > y
[1] TRUE FALSE TRUE
> x == y
[1] FALSE FALSE FALSE
> x >= y
[1] TRUE FALSE TRUE
> any(x>y)
[1] TRUE
> all(x>y)
[1] FALSE
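Entrywise comparisons can be combined with & and |, counted with sum, and used for indexing. A small sketch on the same x and y (the result names are illustrative):

```r
x <- c(1, 2, 3)
y <- c(-1, 4, -1)

both_pos <- x > 0 & y > 0    # entrywise "and"
either_big <- x > 2 | y > 2  # entrywise "or"

print(both_pos)      # FALSE TRUE FALSE
print(either_big)    # FALSE TRUE TRUE
print(sum(x > y))    # 2: TRUE counts as 1, FALSE as 0
print(which(x > y))  # 1 3
print(x[x > y])      # 1 3
```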
Lists in R: I
> x = vector("list",3) # a list with 3 components
> # assign a vector to its 1st component
> x[[1]] = c(1,2,3)
> # assign a string to its 2nd component
> x[[2]] = "Second part of x"
> # assign a matrix to its 3rd component
> x[[3]] = matrix(1:6,nrow=3)
> x
[[1]]
[1] 1 2 3

[[2]]
[1] "Second part of x"

[[3]]
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
Lists in R: II
> x = vector("list",3) # a list with 3 components
> x[[1]] = c(1,2,3)
> x[[2]] = "Second part of x"
> x[[3]] = matrix(1:6,nrow=3)
> x[[2]] # show 2nd component of x
[1] "Second part of x"
Lists in R: III
> a = c(1,2,3)
> b = "Second part of x"
> c = matrix(1:6,nrow=3)
> y = list("vector" = a, "string" = b, "matrix" = c)
> y
$vector
[1] 1 2 3

$string
[1] "Second part of x"

$matrix
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
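For a named list, single brackets, double brackets, and $ behave differently. The sketch below rebuilds the same list (with the components written inline, to avoid naming a variable c) and compares the three access styles:

```r
y <- list(vector = c(1, 2, 3),
          string = "Second part of x",
          matrix = matrix(1:6, nrow = 3))

sub_list <- y["vector"]    # single brackets: a list of length 1
element <- y[["vector"]]   # double brackets: the component itself

print(class(sub_list))                       # "list"
print(class(element))                        # "numeric"
print(identical(y$vector, y[["vector"]]))    # TRUE: $ is shorthand for [[ ]] by name
print(names(y))                              # "vector" "string" "matrix"
print(y[["matrix"]][2, 1])                   # 2: entry (2,1) of the matrix component
```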
Set operations in R: I
> x = c(1,2,3) # a 3-component vector
> 1 %in% x # check membership
> x = c(1,2,3) # a 3-component vector
> y = c(-1,4,-1) # a 3-component vector
> union(x, y)
[1] 1 2 3 -1 4
> intersect(x, y)
numeric(0)
> setdiff(x, y)
[1] 1 2 3
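Two details worth seeing in a sketch: the set functions drop duplicated entries (y contains -1 twice), setdiff is not symmetric, and %in% works entrywise on a vector of candidates.

```r
x <- c(1, 2, 3)
y <- c(-1, 4, -1)

y_as_set <- union(y, y)         # duplicates removed
y_minus_x <- setdiff(y, x)      # entries of y that are not in x
membership <- c(1, 5) %in% x    # one TRUE/FALSE per candidate

print(y_as_set)     # -1 4
print(y_minus_x)    # -1 4
print(membership)   # TRUE FALSE
```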
“Coerce” in R
• as.numeric coerces an object to be numeric
• as.factor coerces an object to be a factor
• as.matrix . . .
• as.logical . . .
• as.data.frame . . .
• and so on . . .
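A brief sketch of what these coercion functions return on small inputs. Values that cannot be converted become NA; as.numeric also issues a warning in that case, which is suppressed here only to keep the output clean.

```r
num <- as.numeric("3.5")                         # string to number
lgl <- as.logical(c("TRUE", "false", "yes"))     # "yes" is not recognized
lgl_num <- as.logical(0:2)                       # 0 is FALSE, nonzero is TRUE
fac <- as.factor(c("b", "a", "b"))               # levels sorted alphabetically
mat <- as.matrix(data.frame(u = 1:2, v = 3:4))   # data frame to matrix
bad <- suppressWarnings(as.numeric("abc"))       # not a number

print(num)          # 3.5
print(lgl)          # TRUE FALSE NA
print(lgl_num)      # FALSE TRUE TRUE
print(levels(fac))  # "a" "b"
print(dim(mat))     # 2 2
print(bad)          # NA
```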
length and dim
• length returns the number of components of a vector
> a = 1:10
> length(a)
[1] 10
• dim returns the dimension of a matrix or data frame
> x=dim(matrix(1:6,nrow=3,ncol=2))
> x
[1] 3 2
> x[1]
[1] 3
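The two functions can give surprising answers when applied to the "other" kind of object. A short sketch: length of a matrix counts all entries, length of a data frame counts its columns, and dim of a plain vector is NULL.

```r
m <- matrix(1:6, nrow = 3, ncol = 2)
df <- data.frame(SN = 1:2, Age = c(21, 15), Name = c("John", "Dora"))

print(length(m))   # 6: total number of entries
print(length(df))  # 3: number of columns
print(dim(df))     # 2 3: rows, then columns
print(dim(1:10))   # NULL: a plain vector has no dim attribute
print(nrow(m))     # 3
print(ncol(m))     # 2
```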
Use dplyr and piping:
> library(dplyr)
> library(ggplot2) # provides the diamonds data set
> dB = diamonds %>%
+ filter(color %in% c("E","J","G")) %>%
+ filter(cut %in% c("Ideal","Premium"))
> head(dB)
# A tibble: 6 x 10