Introduction to R - Nitesh Chhabria


The R environment

R is an integrated suite of software facilities for data manipulation, calculation and graphical display.

It's also open source. Cheers!


How to get and use R?

http://r-project.org/ is the place where you can get all the core packages.

How to start using it?

- Terminal:

> R  # type this command

- GUI:

Find the installed R application and double-click it.


Some basic commands

source("file.r")    Executes the commands stored in file.r

sink("record.lis")  Sends all subsequent output to the file record.lis

ls()                Lists the names of the objects currently stored in R

rm(ob)              Removes the object ob from memory
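
A minimal sketch of how these four commands fit together. The temporary file names and the object x are made up for illustration and are not part of the slides.

```r
# Write a one-line script to a temporary file, then execute it with source().
script <- tempfile(fileext = ".r")
writeLines("x <- 1:5", script)
source(script)                 # x now exists in the workspace

# Divert printed output to a file; sink() with no arguments restores the console.
record <- tempfile(fileext = ".lis")
sink(record)
print(sum(x))
sink()
recorded <- readLines(record)  # the text that print() wrote to the file

ls()    # lists the objects in the workspace
rm(x)   # removes x again
```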


Vectors

R works on data structures. The vector is the simplest of them.

> vec <- c(1, 2, 3, 4, 5)  # <- is the assignment operator; c() is the function used for creating vectors

> vec

[1] 1 2 3 4 5

> vec + 1  # can you guess what the output will be?

[1] 2 3 4 5 6
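
The same element-by-element behaviour applies to other operators. A small sketch, reusing vec from the slide; the names doubled, squared, shifted and total are made up for illustration.

```r
vec <- c(1, 2, 3, 4, 5)

doubled <- vec * 2                        # every element is multiplied by 2
squared <- vec^2                          # every element is squared
shifted <- vec + c(10, 20, 30, 40, 50)    # two vectors are added element by element
total   <- sum(vec)                       # collapses the vector to a single number

print(doubled)
print(shifted)
```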


Generating sequences

> vec1 <- 1:10  # generates a vector with the elements 1 to 10

> vec2 <- seq(-5, 5, by = 0.2)  # generates a vector from -5 to 5 in steps of 0.2

> vec3 <- rep(vec1, times = 10)  # joins 10 copies of vec1 into one vector

> temp <- vec > 3  # tests the condition for every element of vec and returns a logical vector

> vec4 <- c("hello", "there")  # creates a vector of strings
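
A short sketch of what the sequence commands produce and how a logical vector such as temp can be used to pick elements. The names n2, n3, picked and count are made up for illustration.

```r
vec  <- c(1, 2, 3, 4, 5)
vec2 <- seq(-5, 5, by = 0.2)
vec3 <- rep(1:10, times = 10)

n2 <- length(vec2)     # number of elements from -5 to 5 in steps of 0.2
n3 <- length(vec3)     # 10 copies of a 10-element vector

temp   <- vec > 3      # one TRUE or FALSE per element of vec
picked <- vec[temp]    # keeps only the elements where temp is TRUE
count  <- sum(temp)    # TRUE counts as 1, so this counts the matches
```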

Matrices

> X <- matrix(NA, nrow = 7, ncol = 3)  # creates a matrix filled with NA (not available) values

> X

     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA
[3,]   NA   NA   NA
[4,]   NA   NA   NA
[5,]   NA   NA   NA
[6,]   NA   NA   NA
[7,]   NA   NA   NA

Indexing starts from 1. The X[row, col] syntax is used to access the value of a cell.
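
A minimal sketch of the X[row, col] syntax on the matrix above. The values written into the matrix and the names cell, first_col and dims are made up for illustration.

```r
X <- matrix(NA, nrow = 7, ncol = 3)

X[2, 3] <- 42            # set the cell in row 2, column 3
X[1, ] <- c(1, 2, 3)     # leaving the column blank selects the whole first row

cell      <- X[2, 3]     # read a single cell back
first_col <- X[, 1]      # leaving the row blank selects the whole first column
dims      <- dim(X)      # number of rows and number of columns
```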

Lists

A list is used to bundle unrelated items together.

> result <- list(mu = 0.3, sigma = 0.45, x = 1:3)

> result$mu

[1] 0.3

> result$x

[1] 1 2 3

> result.new <- "hello there"  # stores the string in a new variable named result.new; the list result is unchanged
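
A small sketch of working with the components of a list such as result. The component name label and the variable n_items are made up for illustration.

```r
result <- list(mu = 0.3, sigma = 0.45, x = 1:3)

names(result)                    # the component names of the list
result[["sigma"]]                # the same component as result$sigma

result$label <- "hello there"    # assigning to a new name adds a component to the list
n_items <- length(result)        # number of components after the addition
```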

Regression

Linear regression is used to find the best-fit curve for the given values, so that the residual error (the sum of squared residuals) is at its minimum.

Steps needed to find the best-fit curve:

1. Collect data
2. Define the model
3. Apply regression
4. Use the generated values to predict
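
The four steps can be sketched in a few lines of R. The numbers in x and y are invented for illustration, and the names model, fit and pred are not from the slides.

```r
# 1. Collect data (made-up numbers, for illustration only)
x <- c(1, 2, 3, 4, 5)
y <- c(3.1, 4.9, 7.2, 8.8, 11.0)

# 2. Define the model: y depends linearly on x
model <- y ~ x

# 3. Apply regression
fit <- lm(model)

# 4. Use the fitted coefficients to predict y for a new value of x
pred <- predict(fit, newdata = data.frame(x = 6))
print(coef(fit))
print(pred)
```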


Linear model in R

Modelling is a technique for representing the data mathematically.

General form: response ~ op1 term1 op2 term2 op3 term3 ...

Models and syntax:

- Response variable: Y
- Independent variables: A, B
- Coefficients: β

Model                       Syntax

Y = β0 + β1A                Y ~ A

Y = β1A                     Y ~ -1 + A

Y = β0 + β1A + β2A^2        Y ~ A + I(A^2)

Y = β0 + β1A + β2B          Y ~ A + B

Y = β0 + β1AB               Y ~ A:B
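
One way to check which coefficients a formula asks for is to look at the column names of its model matrix. This is only a sketch: the data frame d and the helper terms_of are made up for illustration.

```r
# Made-up data, for illustration only.
d <- data.frame(Y = c(2, 3, 5, 4), A = c(1, 2, 3, 4), B = c(2, 0, 1, 3))

# Names of the terms that R would estimate a coefficient for.
terms_of <- function(formula) colnames(model.matrix(formula, data = d))

terms_of(Y ~ A)             # "(Intercept)" "A"
terms_of(Y ~ -1 + A)        # "A"
terms_of(Y ~ A + I(A^2))    # "(Intercept)" "A" "I(A^2)"
terms_of(Y ~ A + B)         # "(Intercept)" "A" "B"
terms_of(Y ~ A:B)           # "(Intercept)" "A:B"
```

The "(Intercept)" column corresponds to β0, which is why adding -1 to the formula removes it.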


Example

Data:

> conc

[1]  0 10 20 30 40 50

> signal

[1]  4 22 44 60 82 95

Expected model: signal = β0 + β1 × conc

Carrying out regression

> lm(signal ~ conc)

Call:
lm(formula = signal ~ conc)

Coefficients:
(Intercept)         conc
      4.667        1.860

> lm.r <- lm(signal ~ conc)
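
A self-contained sketch of the same fit, using the conc and signal values from the example slide, followed by a prediction. The value 25 and the name pred are made up for illustration.

```r
conc   <- c(0, 10, 20, 30, 40, 50)
signal <- c(4, 22, 44, 60, 82, 95)

lm.r <- lm(signal ~ conc)    # fit signal = β0 + β1 × conc

print(coef(lm.r))            # the estimates of β0 (intercept) and β1 (slope)
print(residuals(lm.r))       # observed signal minus fitted signal, per data point

# Predict the signal at a concentration that was not measured.
pred <- predict(lm.r, newdata = data.frame(conc = 25))
print(pred)
```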

> layout(matrix(1:4, 2, 2))  # arranges the plotting area as a 2 x 2 grid

> plot(lm.r)  # draws the four diagnostic plots for the fitted model
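
When R runs without a graphics window, the same plots can be written to a file instead. A sketch, assuming the data from the example slide; the temporary PDF file name out is made up for illustration.

```r
conc   <- c(0, 10, 20, 30, 40, 50)
signal <- c(4, 22, 44, 60, 82, 95)
lm.r   <- lm(signal ~ conc)

out <- tempfile(fileext = ".pdf")
pdf(out)                      # open a PDF graphics device
layout(matrix(1:4, 2, 2))     # 2 x 2 grid, so all four plots share one page
plot(lm.r)                    # diagnostic plots for the fitted model
invisible(dev.off())          # close the device so the file is written
```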

Uniform vs Normal Distribution

[Figure: two panels, labelled Normal and Uniform]

Uniform Distribution

> runif(5000)  # generates 5000 uniformly distributed random numbers between 0 and 1

> plot(runif(5000))  # plots the numbers against their index

> plot(density(runif(5000)))  # plots the estimated density of the numbers

Some statistics:

> summary(runif(5000))

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
0.0004056 0.2701000 0.5072000 0.5124000 0.7514000 0.9995000

[Figure: two plots. One is the density plot titled density.default(x = runif(5000)), with N = 5000, Bandwidth = 0.04717 and y-axis Density. The other shows runif(5000) plotted against Index.]
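
Because every call to runif() draws fresh numbers, the outputs above change from run to run. A sketch of making the draw repeatable and checking it; the seed value 1 and the name u are arbitrary choices for illustration.

```r
set.seed(1)          # fixes the random numbers so the run is repeatable
u <- runif(5000)     # 5000 draws between 0 and 1

print(range(u))      # smallest and largest value drawn
print(summary(u))    # quartiles and mean; the mean should be close to 0.5
```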

Normal Distribution

> rnorm(5000)  # generates 5000 normally distributed random numbers (mean 0, standard deviation 1)

> plot(rnorm(5000))  # plots the numbers against their index

> plot(density(rnorm(5000)))  # plots the estimated density of the numbers

Some statistics:

> summary(rnorm(5000))

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-4.549000 -0.674800  0.005506 -0.001849  0.666600  3.629000

[Figure: two plots. One is the density plot titled density.default(x = rnorm(5000)), with N = 5000, Bandwidth = 0.1643 and y-axis Density. The other shows rnorm(5000) plotted against Index.]
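
A matching sketch for the normal case. The seed value 1 and the names z and inside are arbitrary choices for illustration.

```r
set.seed(1)                   # fixes the random numbers so the run is repeatable
z <- rnorm(5000)              # 5000 draws with mean 0 and standard deviation 1

print(mean(z))                # should be close to 0
print(sd(z))                  # should be close to 1

inside <- mean(abs(z) < 1)    # share of draws within one standard deviation of 0
print(inside)                 # roughly two thirds for a normal distribution
```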

Autocorrelation function

> acf(rnorm(100))

[Figure: autocorrelation plot titled "Series rnorm(100)", showing ACF (y-axis, -0.2 to 1.0) against Lag (x-axis, 0 to 20)]
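
The numbers behind such a plot can be inspected without drawing it. A sketch; the seed and the names x, a, lag0, others and bound are made up for illustration, and bound is only a rough version of the band that the plot draws.

```r
set.seed(1)
x <- rnorm(100)

a <- acf(x, plot = FALSE)        # compute the autocorrelations without plotting

lag0   <- a$acf[1]               # correlation of the series with itself at lag 0
others <- a$acf[-1]              # correlations at lags 1, 2, ...

bound <- 2 / sqrt(length(x))     # approximate 95% band for independent data
print(mean(abs(others) < bound)) # share of lags that stay inside the band
```

For independent random numbers the value at lag 0 is always 1, and most of the remaining lags are expected to fall inside the band.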

References:

http://www.montefiore.ulg.ac.be/~kvansteen/GBIO0009-1/ac20092010/Class8/Using%20R%20for%20linear%20regression.pdf

An Introduction to R - W. N. Venables, D. M. Smith and the R Core Team

Thank You