Ten things I DON’T hate about you: some things I didn’t know when I started using R that I wish I had
Ty Stanford
May 22, 2012
ADELAIDE R
USERS GROUP
We all here use R but...
I thought I might give a little self-affirmatory spruiking of R
As a statistician, R ticks almost all the boxes
• amazing data handling, easy to use
• allows low-level programming to some extent
  • as well as vectorised code
• extensive range of packages
• works in memory: the only downside, but there are packages for that (memory is becoming cheaper too!)
Wish I knew... #1
R-bloggers
• A ‘blog’ of R-articles widely sourced from the webernet
• Also a mailing list
www.r-bloggers.com
“R news and tutorials contributed by (X) R bloggers”
X = 365 as of 19th May 2012
Wish I knew... #2
We know cran.r-project.org where we download R...
But two of the pages contained within are one-stop-shops:
• Task views: cran.r-project.org/web/views
• R language defn: cran.r-project.org/doc/manuals/R-lang.html
Wish I knew... #3
Indexing syntax
> ### create a matrix and play with indexes
> A<-matrix(101:108,nrow=2)
> colnames(A)<-paste("C",1:4,sep="")
> rownames(A)<-paste("R",1:2,sep="")
> A
    C1  C2  C3  C4
R1 101 103 105 107
R2 102 104 106 108
> A[2,4]
[1] 108
> A[1:2,4]
 R1  R2
107 108
> A[3:5]
[1] 103 104 105
> A[-(3:5)]
[1] 101 102 106 107 108
> A[,"C3"]
 R1  R2
105 106
> ### which() is a great fn to get indexes
> which(A>105)
[1] 6 7 8
> A[A>105]
[1] 106 107 108
> A %in% c(106,101,103)
[1]  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE
> c(106,101,103) %in% A
[1] TRUE TRUE TRUE
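A small extension of the which() example, as a sketch only. It rebuilds the same matrix A and shows that a single index counts down the columns, that which() can report (row, col) positions via arr.ind=TRUE, and that match() returns positions where %in% only returns TRUE/FALSE. The object names lin_idx, rc_idx, vals and pos are illustrative.

```r
### rebuild the matrix from the slide
A <- matrix(101:108, nrow = 2)
colnames(A) <- paste("C", 1:4, sep = "")
rownames(A) <- paste("R", 1:2, sep = "")

### a single index walks down the columns (column-major order)
lin_idx <- which(A > 105)                  # linear positions
rc_idx  <- which(A > 105, arr.ind = TRUE)  # same cells as (row, col) pairs

### a two-column matrix of (row, col) pairs can itself be used as an index
vals <- A[rc_idx]

### match() gives the position in A of each value that %in% reports as TRUE
pos <- match(c(106, 101, 103), A)
```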
Wish I knew... #4
Vectorise your code!
• Will run faster
• Makes code easier to read
• But what is vectorised code...
> n_a<-5
> (a<-seq(101,length=n_a))
[1] 101 102 103 104 105
> (a<-sample(a))
[1] 104 102 103 105 101
> diff(a)
[1] -2  1  2 -4
> ### how can we get diff(a)?
> ### #1 - for loop over elements
> (diffa1<-rep(0,n_a-1))
[1] 0 0 0 0
> for(i in 1:(n_a-1)) diffa1[i]<-a[i+1]-a[i]
> diffa1
[1] -2  1  2 -4
> ### #2 let’s vectorise
> (indxs1<-1:(n_a-1))
[1] 1 2 3 4
> (indxs2<-indxs1+1)
[1] 2 3 4 5
> (diffa2<-a[indxs2]-a[indxs1])
[1] -2  1  2 -4
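The same subtraction can be written without helper index vectors by using the negative indexing from tip #3. A minimal sketch, with the shuffled vector typed in directly rather than drawn with sample() so the result is reproducible; the name diffa3 is illustrative.

```r
a <- c(104, 102, 103, 105, 101)   # the shuffled vector shown on the slide

### drop the first element, drop the last element, subtract
diffa3 <- a[-1] - a[-length(a)]

### check that it agrees with the built-in diff()
stopifnot(all(diffa3 == diff(a)))
diffa3
```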
Wish I knew... #5
The list() object
We’re all probably familiar with the data structures:
• c(), matrix, data.frame
A list() is a more generic structure in which you can bundle many types of objects together
• Handy for different length data structures
Some examples
> #empty initialised list
> a.list<-list()
> a.list[[2]]<-c("a","b")
> a.list
[[1]]
NULL

[[2]]
[1] "a" "b"

> #known length list
> b.list<-vector(mode="list",length=2)
> b.list
[[1]]
NULL

[[2]]
NULL
More examples
> #list with named elements
> x<-matrix(1:4,nrow=2)
> y<-c("a","b","c")
> c.list<-list(x=x,letters=y)
> c.list
$x
     [,1] [,2]
[1,]    1    3
[2,]    2    4

$letters
[1] "a" "b" "c"

> #note there are differences to extracting elements to matrices etc
> c.list$letters #extract element at pos 2 by using element name
[1] "a" "b" "c"
> c.list[2] #this returns equivalent to a 1 element list
$letters
[1] "a" "b" "c"

> c.list[[2]] #this returns the col vec at element 2
[1] "a" "b" "c"
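Because a list can hold elements of different lengths, it pairs naturally with lapply() and sapply(), which apply a function to each element in turn. A small sketch with made-up data; the list scores and its element names are purely illustrative.

```r
### three groups with different numbers of observations
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

### sapply() simplifies the result to a named vector where it can
n_obs <- sapply(scores, length)
means <- sapply(scores, mean)

### lapply() always returns a list, one element per input element
doubled <- lapply(scores, function(v) v * 2)

### single brackets keep the list wrapper, double brackets drop it
class(scores["b"])     # "list"
class(scores[["b"]])   # "numeric"
```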
Wish I knew... #6
system.time()
### how long does a function take?
system.time(y<-somefunc(x))

### OR ###

### how long does some system of statements take?
### start!
t0<-proc.time()[3]

### <<do stuff>>

### how long did it take?
time.taken<-proc.time()[3]-t0
cat("The process took",time.taken,"seconds \n")
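A runnable sketch of both timing patterns. The function slow_sum() is a hypothetical stand-in for somefunc(), and the third element of proc.time() is picked out by its name, "elapsed", rather than by position. Timings differ between machines, so none are shown.

```r
### hypothetical stand-in for somefunc(): a deliberately loopy sum
slow_sum <- function(x){
  total <- 0
  for(v in x) total <- total + v
  total
}
x <- as.numeric(1:1e5)

### pattern 1: time a single call
tm <- system.time(y <- slow_sum(x))
tm["elapsed"]            # wall-clock seconds

### pattern 2: time a block of statements
t0 <- proc.time()["elapsed"]
y2 <- slow_sum(x)
y3 <- sum(x)
time.taken <- proc.time()["elapsed"] - t0
cat("The process took", time.taken, "seconds \n")
```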
Wish I knew... #7
You need to re-install ALL of your packages if you upgrade to a newer version of R!
How do you remember all the packages you’ve installed?
You don’t. [1]
[1] onertipaday.blogspot.com.au
Step 1, before you get rid of your old R version
setwd("<<where you wanna save>>")
tmp <- installed.packages()
installedpkgs <- as.vector(tmp[is.na(tmp[,"Priority"]), 1])
save(installedpkgs, file="installed_old.rda")
Step 2, install new R version and run...
setwd("<<where you saved the .rda file just now>>")
load("installed_old.rda")
tmp <- installed.packages()
installedpkgs.new <- as.vector(tmp[is.na(tmp[,"Priority"]), 1])
missing <- setdiff(installedpkgs, installedpkgs.new)
install.packages(missing)
update.packages()
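The heart of step 2 is setdiff(), which keeps the packages that were in the old list but are not in the new one. A sketch of just that logic with made-up package lists, so nothing is installed. The names old_pkgs and new_pkgs are placeholders for the real vectors, and saveRDS()/readRDS() are swapped in for save()/load() as another way to store a single object.

```r
### made-up lists standing in for the real installed.packages() output
old_pkgs <- c("ggplot2", "plyr", "xtable", "zoo")
new_pkgs <- c("plyr", "zoo")

### store and restore the old list (a temporary file, for illustration only)
f <- tempfile(fileext = ".rds")
saveRDS(old_pkgs, file = f)
restored <- readRDS(f)

### packages present before the upgrade but absent afterwards
missing_pkgs <- setdiff(restored, new_pkgs)
missing_pkgs

### on a real system one would then run: install.packages(missing_pkgs)
```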
Wish I knew... #8
A LaTeX & R tip
Want to put syntax highlighted code in a LaTeX document or Beamer slideshow?
pygments.org
An example
Say we have a file, someRcode.R, as seen below,
and we want to incorporate it into a LaTeX document
### some R code
### include some maths: $\phi\left[\frac{\pi}{2}\right]=\Delta$

afunc<-function(x) return(x^2)
cat("We are outputting some text as it highlights nice! \n")
x<-seq(-5,5,length=100)
plot(x,afunc(x))
Install pygments...
Then in the command line shell
$ cd Dropbox/Rusers/pygments
## get preamble code, put this in Rstyle.tex
$ pygmentize -f tex -S autumn -a .syntax > Rstyle.tex

## now pygmentise "someRcode.R"
$ pygmentize -O mathescape=True,style=autumn \
    -P "verboptions=frame=lines,gobble=0,numbers=left,..." \
    -o someRcode.tex someRcode.R
Then your .tex document looks like:
\documentclass[11pt]{article}

%need these packages
\usepackage{fancyvrb}
\usepackage{color}

%input pygments style commands
%(not overly human readable)
\input{Rstyle.tex}

\begin{document}

%input the file created - pygments syntax highlighting
\input{someRcode.tex}

\end{document}
And the output...
someRcode.R
1 ### some R code
2 ### include some maths: φ[π/2] = ∆
3
4 afunc<-function(x) return(x^2)
5 cat("We are outputting some text as it highlights nice! \n")
6 x<-seq(-5,5,length=100)
7 plot(x,afunc(x))
Wish I knew... #9
The package compiler
Since R v2.13 the compiler package has been included with R
An example
require(compiler)
oursd<-function(x){
  nx<-length(x)
  sdout<-xbar<-0
  for(i in 1:nx) xbar<-xbar+x[i]
  xbar<-xbar/nx
  for(i in 1:nx) sdout<-sdout+(xbar-x[i])^2
  sdout<-sdout/(nx-1)
  return(sqrt(sdout))
}
compiledsd<-cmpfun(oursd)
Test it!
> set.seed(87455687)
> x<-runif(1e7) ### Ten million obs
> system.time(sd1<-oursd(x))
#   user  system elapsed
# 31.794   0.268  35.017
> system.time(sd2<-compiledsd(x))
#   user  system elapsed
#  7.083   0.062   7.787
> system.time(sd3<-sd(x))
#   user  system elapsed
#  0.091   0.000   0.095
> sd1
#[1] 0.2886422
> sd2
#[1] 0.2886422
> sd3
#[1] 0.2886422
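A quick way to check that compiling changes the speed but not the answer. This sketch uses a much smaller vector than the slide so it runs quickly, and shows no timings because they depend on the machine. One caveat: from R 3.4.0 onward the byte-code compiler is applied automatically by default, so on a recent R the gap between oursd() and compiledsd() may be much smaller than the figures above.

```r
library(compiler)

### the loop-based standard deviation from the slide
oursd <- function(x){
  nx <- length(x)
  sdout <- xbar <- 0
  for(i in 1:nx) xbar <- xbar + x[i]
  xbar <- xbar/nx
  for(i in 1:nx) sdout <- sdout + (xbar - x[i])^2
  sdout <- sdout/(nx - 1)
  return(sqrt(sdout))
}
compiledsd <- cmpfun(oursd)

set.seed(1)
x <- runif(1e4)   # smaller than the slide's 1e7, to keep the run short

### all three should agree up to floating point rounding
res <- c(loop = oursd(x), compiled = compiledsd(x), builtin = sd(x))
all.equal(res[["loop"]], res[["builtin"]])
all.equal(res[["compiled"]], res[["builtin"]])
```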
Wish I knew... #10
You can call C for computationally intensive algorithms
Use the R function
.C("C func name",arg1,arg2,...)
This returns the list
[[1]]
arg1
[[2]]
arg2
...
Create a void C function
c_sd.c

#include <R.h>

void c_sd(double *varout, double *myvec, int *nv){
  double xbar=0;
  double tempval=0;
  int n=*nv;
  for(int i=0;i<n;i++)
    xbar+=myvec[i];
  xbar=xbar/n;
  for(int i=0;i<n;i++)
    tempval+=(myvec[i]-xbar)*(myvec[i]-xbar);
  *varout=tempval/(n-1);
}
Then compile it in the command window
$ R CMD SHLIB c_sd.c
This is how you call it in R
dyn.load("Dropbox/Rusers/Code/c_sd.so")
c_sd<-function(x){
  sdout<-as.double(0)
  x<-as.double(x)
  nx<-as.integer(length(x))
  sdout<-.C("c_sd",sdout,x,nx)[[1]]
  return(sqrt(sdout))
}
system.time(sd4<-c_sd(x))
#   user  system elapsed
#  0.190   0.126   0.334
sd4
#[1] 0.2886422
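Two small additions that may help with this pattern: the shared library extension differs between platforms (.so on Unix-alikes, .dll on Windows), and is.loaded() reports whether a C symbol is available before it is called. A sketch assuming the compiled file sits in the working directory; the helper name c_sd_safe() and the pure-R fallback are illustrative additions, not part of the original code.

```r
### build the library file name for the current platform
libfile <- file.path(getwd(), paste("c_sd", .Platform$dynlib.ext, sep = ""))

### load the library only if it has actually been compiled
if(file.exists(libfile)) dyn.load(libfile)

c_sd_safe <- function(x){
  x <- as.double(x)
  if(is.loaded("c_sd")){
    ### same call as on the slide, picking out the first returned argument
    v <- .C("c_sd", as.double(0), x, as.integer(length(x)))[[1]]
  }else{
    ### fallback (an assumption for illustration): same formula in vectorised R
    v <- sum((x - mean(x))^2)/(length(x) - 1)
  }
  sqrt(v)
}

x_small <- c(2, 4, 4, 4, 5, 5, 7, 9)
c_sd_safe(x_small)
```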
Wish I knew... #11
The R function dir.create() is somewhat limited
Sometimes you need to create output dynamically
• and want to create folders accordingly
dir.create() won’t solve all your problems...
A function to help
createMultDir<-function(dirloc,os=.Platform$OS.type){
  if(os=="unix"){
    ourdirs<-strsplit(dirloc,"/")[[1]]
    if(ourdirs[1]=="") ourdirs<-ourdirs[2:length(ourdirs)] #starts with "/"
    nd<-length(ourdirs)
    movingdir<-"/"
    anyCreate<-FALSE
    for(i in 1:nd){
      movingdir<-paste(movingdir,ourdirs[i],sep="")
      if(!file.exists(movingdir)){
        dir.create(movingdir)
        anyCreate<-TRUE
      }
      movingdir<-paste(movingdir,"/",sep="")
    }
    if(anyCreate) return(paste("sucessfully created:",movingdir))
    else return(paste("The file path:",movingdir,"already exists"))
  }else{
    return("not a mac - not yet implemented")
  }
}
Use the function
> createMultDir("/Dropbox/Rusers/TestFolder")
[1] "sucessfully created: /Dropbox/Rusers/TestFolder/"

### try again
> createMultDir("/Dropbox/Rusers/TestFolder")
[1] "The file path: /Dropbox/Rusers/TestFolder/ already exists"
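For comparison, dir.create() also accepts a recursive argument that creates any missing parent folders in one call, and it is not tied to one operating system. A sketch using the session's temporary directory so that nothing is written to a real project folder; the folder names are made up.

```r
### a nested path under the session's temporary directory
target <- file.path(tempdir(), "Rusers", "TestFolder", "run01")

### recursive=TRUE creates the intermediate folders as needed
if(!dir.exists(target)){
  created <- dir.create(target, recursive = TRUE)
}else{
  created <- FALSE
}

### try again: showWarnings=FALSE suppresses the "already exists" warning
again <- dir.create(target, recursive = TRUE, showWarnings = FALSE)
again   # FALSE, because the folder is already there
```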
Wish I knew... #12
Nah, I’ll leave it there.
Thank you for your attention.