Ten things I DON’T hate about you: some things I didn’t know when I started using R that I wish I had

Ty Stanford

May 22, 2012

Adelaide R Users Group

We all use R here, but...

I thought I might give a little self-affirmatory spruiking of R

For a statistician, R ticks almost all the boxes

• amazing data handling, easy to use

• allows, to some extent, low-level programming

  - as well as vectorised code

• extensive range of packages

• works in memory: the only downside, but there are packages for that (memory is becoming cheaper too!)

Wish I knew... #1

R-bloggers

• A ‘blog’ of R-articles widely sourced from the webernet
• Also a mailing list

www.r-bloggers.com

“R news and tutorials contributed by (X) R bloggers”

X = 365 as of 19th May 2012

Wish I knew... #2

We know cran.r-project.org where we download R...

But two of the pages contained within are one-stop-shops:

• Task views: cran.r-project.org/web/views
• R language definition: cran.r-project.org/doc/manuals/R-lang.html

Wish I knew... #3

Indexing syntax

> ### create a matrix and play with indexes
> A<-matrix(101:108,nrow=2)
> colnames(A)<-paste("C",1:4,sep="")
> rownames(A)<-paste("R",1:2,sep="")
> A
    C1  C2  C3  C4
R1 101 103 105 107
R2 102 104 106 108
> A[2,4]
[1] 108
> A[1:2,4]
 R1  R2
107 108
> A[3:5]
[1] 103 104 105
> A[-(3:5)]
[1] 101 102 106 107 108
> A[,"C3"]
 R1  R2
105 106
> ### which() is a great fn to get indexes
> which(A>105)
[1] 6 7 8
> A[A>105]
[1] 106 107 108
> A %in% c(106,101,103)
[1]  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE
> c(106,101,103) %in% A
[1] TRUE TRUE TRUE
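The single-number indexes above (A[3:5], which(A>105)) count down the columns, because R stores a matrix column by column. If row/column positions are more useful, which() can report them. A minimal sketch, rebuilding the same matrix; the names lin_idx, rc_idx and vals are only for illustration:

```r
# Rebuild the matrix from the slide
A <- matrix(101:108, nrow = 2)
colnames(A) <- paste("C", 1:4, sep = "")
rownames(A) <- paste("R", 1:2, sep = "")

# Linear (column-major) positions of the matching cells
lin_idx <- which(A > 105)

# The same cells reported as (row, col) pairs
rc_idx <- which(A > 105, arr.ind = TRUE)
print(rc_idx)

# A two-column index matrix can be used directly to extract those cells
vals <- A[rc_idx]
print(vals)
```

Indexing with a two-column matrix picks one cell per row of the index matrix, which is handy when the positions come from another calculation.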

Wish I knew... #4

Vectorise your code!

• Will run faster

• Makes code easier to read

• But what is vectorised code?

> n_a<-5
> (a<-seq(101,length=n_a))
[1] 101 102 103 104 105
> (a<-sample(a))
[1] 104 102 103 105 101
> diff(a)
[1] -2  1  2 -4
> ### how can we get diff(a)?
> ### #1 - for loop over elements
> (diffa1<-rep(0,n_a-1))
[1] 0 0 0 0
> for(i in 1:(n_a-1)) diffa1[i]<-a[i+1]-a[i]
> diffa1
[1] -2  1  2 -4
> ### #2 let’s vectorise
> (indxs1<-1:(n_a-1))
[1] 1 2 3 4
> (indxs2<-indxs1+1)
[1] 2 3 4 5
> (diffa2<-a[indxs2]-a[indxs1])
[1] -2  1  2 -4
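The same idea can be written with negative indexes, and a rough timing on a longer vector gives a feel for the speed claim. This is only a sketch: loop_diff, vec_diff and the vector big are names made up for the comparison, and the timings will vary by machine and R version.

```r
a <- c(104, 102, 103, 105, 101)   # the shuffled vector from the slide

# Drop the first element, drop the last element, subtract elementwise
diffa3 <- a[-1] - a[-length(a)]
print(diffa3)

# Hypothetical helpers to compare the two styles on a longer vector
loop_diff <- function(v) {
  out <- numeric(length(v) - 1)
  for (i in seq_along(out)) out[i] <- v[i + 1] - v[i]
  out
}
vec_diff <- function(v) v[-1] - v[-length(v)]

big <- runif(1e5)
print(system.time(d1 <- loop_diff(big)))
print(system.time(d2 <- vec_diff(big)))
print(all.equal(d1, d2))   # both approaches should agree
```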

Wish I knew... #5

The list() object

We’re all probably familiar with the data structures:

• c(), matrix, data.frame

A list() is a more generic structure in which you can bundle many types of objects together

• Handy for different length data structures

Some examples

> #empty initialised list
> a.list<-list()
> a.list[[2]]<-c("a","b")
> a.list
[[1]]
NULL

[[2]]
[1] "a" "b"

> #known length list
> b.list<-vector(mode="list",length=2)
> b.list
[[1]]
NULL

[[2]]
NULL

More examples

> #list with named elements
> x<-matrix(1:4,nrow=2)
> y<-c("a","b","c")
> c.list<-list(x=x,letters=y)
> c.list
$x
     [,1] [,2]
[1,]    1    3
[2,]    2    4

$letters
[1] "a" "b" "c"

> #note there are differences to extracting elements to matrices etc
> c.list$letters #extract element at pos 2 by using element name
[1] "a" "b" "c"
> c.list[2] #this returns equivalent to a 1 element list
$letters
[1] "a" "b" "c"

> c.list[[2]] #this returns the col vec at element 2
[1] "a" "b" "c"
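Lists pair naturally with the apply family when the elements have different lengths. A small sketch with made-up data (the list scores and its group names are hypothetical):

```r
# A list whose elements have different lengths (hypothetical data)
scores <- list(groupA = c(5, 7, 9),
               groupB = c(2, 4),
               groupC = c(10, 20, 30, 40))

# sapply() applies a function to each element and simplifies to a named vector
n_per_group    <- sapply(scores, length)
mean_per_group <- sapply(scores, mean)
print(n_per_group)
print(mean_per_group)

# [ ] keeps the list wrapper, [[ ]] returns the element itself
print(class(scores["groupB"]))
print(class(scores[["groupB"]]))
```

The last two lines show the same single-bracket versus double-bracket distinction as c.list[2] and c.list[[2]] above.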

Wish I knew... #6

system.time()

### how long does a function take?
system.time(y<-somefunc(x))

### OR ###

### how long does some system of statements take?
### start!
t0<-proc.time()[3]

### <<do stuff>>

### how long did it take?
time.taken<-proc.time()[3]-t0
cat("The process took",time.taken,"seconds \n")
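The third element of proc.time() is the elapsed (wall-clock) time, so it can also be picked out by name. One possible way to wrap the start/stop pattern is a small helper; time_it is a hypothetical name, not a base R function:

```r
# Hypothetical helper: evaluate an expression and return elapsed seconds
time_it <- function(expr) {
  t0 <- proc.time()[["elapsed"]]
  force(expr)                       # the expression is evaluated here (lazy evaluation)
  proc.time()[["elapsed"]] - t0
}

elapsed <- time_it(for (i in 1:1e5) sqrt(i))
cat("The process took", elapsed, "seconds \n")

# system.time() results can be indexed by name in the same way
elapsed2 <- system.time(for (i in 1:1e5) sqrt(i))[["elapsed"]]
```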

Wish I knew... #7

You need to re-install ALL of your packages if you upgrade to a newer version of R!

How do you remember all the packages you’ve installed?

You don’t.¹

¹ onertipaday.blogspot.com.au

Step 1, before you get rid of your old R version

setwd("<<where you wanna save>>")
tmp <- installed.packages()
installedpkgs <- as.vector(tmp[is.na(tmp[,"Priority"]), 1])
save(installedpkgs, file="installed_old.rda")

Step 2, install new R version and run...

setwd("<<where you saved the .rda file just now>>")
load("installed_old.rda")
tmp <- installed.packages()
installedpkgs.new <- as.vector(tmp[is.na(tmp[,"Priority"]), 1])
missing <- setdiff(installedpkgs, installedpkgs.new)
install.packages(missing)
update.packages()
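The is.na(tmp[,"Priority"]) filter keeps only the packages you added yourself, since base and recommended packages carry a non-missing Priority. The setdiff() step can be tried safely before touching a real library; in this sketch the two package vectors are made up to stand in for the old and new installations:

```r
# Hypothetical package lists standing in for the old and new installations
installedpkgs     <- c("ggplot2", "plyr", "reshape2", "xtable")
installedpkgs.new <- c("plyr", "xtable")

# Packages present in the old library but absent from the new one
missing_pkgs <- setdiff(installedpkgs, installedpkgs.new)
print(missing_pkgs)

# Only contact CRAN if something is actually missing (left commented out here)
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```

The sketch uses missing_pkgs rather than missing as the variable name, which avoids masking the base function missing().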

Wish I knew... #8

A LaTeX & R tip

Want to put syntax-highlighted code in a LaTeX document or Beamer slideshow?

pygments.org

An example

Say we have a file, someRcode.R, as seen below

and we want to incorporate it into a LaTeX document

### some R code
### include some maths: $\phi\left[\frac{\pi}{2}\right]=\Delta$

afunc<-function(x) return(x^2)
cat("We are outputting some text as it highlights nice! \n")
x<-seq(-5,5,length=100)
plot(x,afunc(x))

Install pygments...

Then in the command line shell

$ cd Dropbox/Rusers/pygments

## get preamble code, put this in Rstyle.tex
$ pygmentize -f tex -S autumn -a .syntax > Rstyle.tex

## now pygmentise "someRcode.R"
$ pygmentize -O mathescape=True,style=autumn \
    -P "verboptions=frame=lines,gobble=0,numbers=left,..." \
    -o someRcode.tex someRcode.R

Then your .tex document looks like:

\documentclass[11pt]{article}

%need these packages
\usepackage{fancyvrb}
\usepackage{color}

%input pygments style commands
%(not overly human readable)
\input{Rstyle.tex}

\begin{document}

%input the file created - pygments syntax highlighting
\input{someRcode.tex}

\end{document}

And the output...

someRcode.R
1 ### some R code
2 ### include some maths: φ[π/2] = ∆
3
4 afunc<-function(x) return(x^2)
5 cat("We are outputting some text as it highlights nice! \n")
6 x<-seq(-5,5,length=100)
7 plot(x,afunc(x))

Wish I knew... #9

The package compiler

Since R v2.13, the compiler package has been included with R

An example

require(compiler)

oursd<-function(x){
  nx<-length(x)
  sdout<-xbar<-0
  for(i in 1:nx) xbar<-xbar+x[i]
  xbar<-xbar/nx
  for(i in 1:nx) sdout<-sdout+(xbar-x[i])^2
  sdout<-sdout/(nx-1)
  return(sqrt(sdout))
}
compiledsd<-cmpfun(oursd)

Test it!

> set.seed(87455687)
> x<-runif(1e7) ### Ten million obs
> system.time(sd1<-oursd(x))
# user system elapsed
# 31.794 0.268 35.017
> system.time(sd2<-compiledsd(x))
# user system elapsed
# 7.083 0.062 7.787
> system.time(sd3<-sd(x))
# user system elapsed
# 0.091 0.000 0.095
> sd1
#[1] 0.2886422
> sd2
#[1] 0.2886422
> sd3
#[1] 0.2886422
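For comparison with tip #4, the two loops in oursd() can also be replaced by vectorised arithmetic. The sketch below uses a hypothetical function name (vecsd), an arbitrary seed and a smaller sample than the slides, so its timing is not comparable with the figures above. Note also that in R 3.4.0 and later the byte-code compiler runs just-in-time by default, so the gap between oursd() and compiledsd() may be much smaller than shown here.

```r
# Vectorised version of oursd(): no explicit loops (hypothetical name)
vecsd <- function(x) {
  nx   <- length(x)
  xbar <- sum(x) / nx
  sqrt(sum((x - xbar)^2) / (nx - 1))
}

set.seed(1)             # arbitrary seed, not the one used in the slides
x_test <- runif(1e5)    # smaller than the slides' ten million, to keep it quick

print(system.time(sd5 <- vecsd(x_test)))
print(all.equal(sd5, sd(x_test)))   # should agree with the built-in sd()
```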

Wish I knew... #10

You can call C for computationally intensive algorithms

Use the R function

.C("C func name",arg1,arg2,...)

This returns the list

[[1]]
arg1

[[2]]
arg2
...

Create a void C function

c_sd.c

#include <R.h>

void c_sd(double *varout, double *myvec, int *nv){
  double xbar=0;
  double tempval=0;
  int n=*nv;
  for(int i=0;i<n;i++)
    xbar+=myvec[i];
  xbar=xbar/n;
  for(int i=0;i<n;i++)
    tempval+=(myvec[i]-xbar)*(myvec[i]-xbar);

  *varout=tempval/(n-1);
}

Then compile it in the command window

$ R CMD SHLIB c_sd.c

This is how you call it in R

dyn.load("Dropbox/Rusers/Code/c_sd.so")
c_sd<-function(x){
  sdout<-as.double(0)
  x<-as.double(x)
  nx<-as.integer(length(x))
  sdout<-.C("c_sd",sdout,x,nx)[[1]]
  return(sqrt(sdout))
}
system.time(sd4<-c_sd(x))
# user system elapsed
# 0.190 0.126 0.334
sd4
#[1] 0.2886422
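If the arguments passed to .C() are named, the returned list carries the same names, which can read more clearly than [[1]]. A sketch of that variation, assuming the c_sd.so library built above has already been loaded with dyn.load(); the wrapper name c_sd_named and the test vector are hypothetical, and the call is guarded so the code does nothing harmful when the library is absent:

```r
# Hypothetical wrapper: assumes the compiled c_sd routine is already loaded
c_sd_named <- function(x) {
  res <- .C("c_sd",
            varout = double(1),               # placeholder the C code fills in
            myvec  = as.double(x),
            nv     = as.integer(length(x)))
  sqrt(res$varout)                            # pick the result out by name
}

x_small <- c(2, 4, 4, 4, 5, 5, 7, 9)
if (is.loaded("c_sd")) {
  print(c_sd_named(x_small))
} else {
  message("c_sd is not loaded; run dyn.load() on the compiled library first")
}
```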

Wish I knew... #11

The R function dir.create() is somewhat limited

Sometimes you need to create output dynamically

• and want to create folders accordingly

dir.create() won’t solve all your problems...

A function to help

createMultDir<-function(dirloc,os=.Platform$OS.type){
  if(os=="unix"){
    ourdirs<-strsplit(dirloc,"/")[[1]]
    if(ourdirs[1]=="") ourdirs<-ourdirs[2:length(ourdirs)] #starts with "/"
    nd<-length(ourdirs)
    movingdir<-"/"
    anyCreate<-FALSE
    for(i in 1:nd){
      movingdir<-paste(movingdir,ourdirs[i],sep="")
      if(!file.exists(movingdir)){
        dir.create(movingdir)
        anyCreate<-TRUE
      }
      movingdir<-paste(movingdir,"/",sep="")
    }
    if(anyCreate) return(paste("successfully created:",movingdir))
    else return(paste("The file path:",movingdir,"already exists"))
  }else{
    return("not a Unix-like OS (Mac/Linux) - not yet implemented")
  }
}

Use the function

> createMultDir("/Dropbox/Rusers/TestFolder")
[1] "successfully created: /Dropbox/Rusers/TestFolder/"

### try again
> createMultDir("/Dropbox/Rusers/TestFolder")
[1] "The file path: /Dropbox/Rusers/TestFolder/ already exists"
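Base R's dir.create() also accepts a recursive argument, which may cover the same case on any operating system. A minimal sketch that works under a temporary directory so nothing is left behind; the folder names simply echo the example above:

```r
# Work under a temporary location so the demo cleans up after itself
base   <- tempfile("createdir_demo_")
nested <- file.path(base, "Rusers", "TestFolder")

# recursive = TRUE creates any missing parent folders along the way
created_first  <- dir.create(nested, recursive = TRUE)

# A second call returns FALSE because the folder is already there
created_second <- dir.create(nested, recursive = TRUE, showWarnings = FALSE)

exists_now <- dir.exists(nested)
print(c(created_first, created_second, exists_now))

unlink(base, recursive = TRUE)   # remove the demo folders
```

dir.create() returns its TRUE/FALSE result invisibly, so the values are captured here to show whether anything was created, much like the messages returned by createMultDir().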

Wish I knew... #12

Nah, I’ll leave it there.

Thank you for your attention.
