Reproducible Research - R and LaTeX
Rebecka Jörnsten
Mathematical Statistics
Chalmers University of Technology
University of Gothenburg
October 18, 2011
Outline
1 Introduction
2 R + LaTeX = R-Sweave
    R-Sweave, the basics
    Grabbing numbers into the text
    Graphics and Tables
    cacheSweave
    Producing R code
3 Conclusions
Figure: "Research Diagram/Research Reality", Piled Higher and Deeper by Jorge Cham, www.phdcomics.com (originally published 1/7/2008)
http://www.phdcomics.com/comics/archive_print.php?comicid=961
What goes into writing a research paper?
1 Data handling: Excel, file-merging, cut-and-paste, ...
2 Data filtering: Excel, cut-and-paste, R, ...
3 Data analysis: R, SPSS, MATLAB, Excel, ...
4 Generating results: hard-copy, graphics, tables
5 Report writing: Word, LaTeX, manual input
Months later... reviews come back and we are asked to submit a major revision within X days...
Oh-oh!
What goes into writing a research report?
Some other scenarios:
You find a bug in the code and want to rerun parts of the analysis
Additional data suddenly becomes available, or only a subset of the data is to be used
A member of the lab has left for another job and you need to be able to continue/reproduce his/her work
A new member joins the lab - how can she/he most quickly start working on a lab project?
When a project involves many steps, it is often very difficult to reproduce results exactly: too many manual steps and human-error elements, and perhaps conflicting versions of code, lack of a complete analysis protocol, ...
R + LaTeX = R-Sweave
Sweave is part of R itself (the Sweave() function in the utils package) - available with the base installer
Allows you to run the analysis and generate LaTeX code at the same time
Tables and Figures are generated in the report directly - no manual input required
If you know some R and some LaTeX, there are templates for generating both dynamically updated slides and articles
Truly reproducible research!
R-Sweave
All editing is done in a .Rnw file, which includes both R and LaTeX code.
\documentclass[a4paper]{article}
\title{A Simple Example}
\author{Rebecka J\"ornsten}
\begin{document}
\maketitle
Here is a simple example:
<<>>=
mydata <- read.table('datafile.dat')
set.seed(5)  ## reproducibility
indextouse <- sample(seq(1, dim(mydata)[1]), 25)
mydata.sub <- mydata[indextouse, ]  ## 25 random genes
cormat <- cor(t(mydata.sub), use = "complete")  ## correlation matrix
diag(cormat) <- 0
print(apply(cormat, 1, max))  ## top pairwise correlations
@
\end{document}
R-Sweave
To generate your report, you issue this command at the R prompt
Sweave("Example1.Rnw")
which generates the file Example1.tex. You process this file using
pdflatex Example1.tex
I use MiKTeX and WinEdt to do all the editing and processing in one place...
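
The two-step workflow above can be tried end to end from a plain R session. The following is a minimal sketch, not the author's setup: it writes a tiny stand-in Example1.Rnw (its contents are invented for illustration) into a temporary directory, runs Sweave() on it, and checks that Example1.tex appears. The LaTeX step is left commented out because it assumes a working LaTeX installation.

```r
# Work in a temporary directory so nothing in your project is overwritten
workdir <- tempfile("sweave-demo")
dir.create(workdir)
oldwd <- setwd(workdir)

# A tiny stand-in for Example1.Rnw (hypothetical content, illustration only)
writeLines(c(
  "\\documentclass[a4paper]{article}",
  "\\begin{document}",
  "Here is a simple example:",
  "<<>>=",
  "x <- c(1, 2, 3)",
  "mean(x)",
  "@",
  "\\end{document}"
), "Example1.Rnw")

# Step 1: run the R chunks and write Example1.tex
Sweave("Example1.Rnw")
tex_created <- file.exists("Example1.tex")

# Step 2 (assumes LaTeX is installed): compile the .tex file from within R,
# as an alternative to calling pdflatex yourself
# tools::texi2pdf("Example1.tex")

setwd(oldwd)
```

Running both steps from R is one way to keep the whole build in a single script when you are not using an editor that does it for you.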
And the corresponding (cropped) output looks like...
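Code chunks and LaTeX chunks
Code chunks are delimited by
<<options>>=
and
@
Everything outside the code chunks is ordinary LaTeX.
Options include echo, results, fig and cache (the last one via the cacheSweave package).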
Code chunks
You can label your code chunks. This comes in handy when you're bug hunting.
Example:
<<firstchunk,echo=FALSE>>=
mean(nodata)
@
Here I try to calculate the mean of a nonexistent data set. The error looks like this:
Error: chunk 1 (label=firstchunk)
Error in mean(nodata) : object 'nodata' not found
Execution halted
So... we know to look in the chunk firstchunk for this error.
Options
echo: the default echo=TRUE means that the R commands will appear in the report
results: the default is results=verbatim, which means that the R output is included in the report as is. results=tex is useful for table summaries, as LaTeX table code is generated directly for the report
fig: if the code chunk generates a figure, fig=TRUE will save the figure output as .eps and .pdf (recent versions of R write only .pdf unless eps=TRUE is also set)
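
To see what echo does in practice, here is a small sketch under stated assumptions: the file name options.Rnw and the chunk labels shown and hidden are invented for the illustration. One chunk uses the defaults and one uses echo=FALSE, and the script then inspects the generated .tex file.

```r
workdir <- tempfile("options-demo")
dir.create(workdir)
oldwd <- setwd(workdir)

# Hypothetical two-chunk document: defaults versus echo=FALSE
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<shown>>=",
  "1 + 1",
  "@",
  "<<hidden,echo=FALSE>>=",
  "2 + 2",
  "@",
  "\\end{document}"
), "options.Rnw")

Sweave("options.Rnw")
tex <- readLines("options.tex")
setwd(oldwd)

# Default chunk: the command itself is copied into the report
command_shown <- any(grepl("1 + 1", tex, fixed = TRUE))
# echo=FALSE chunk: the command is suppressed ...
command_hidden <- !any(grepl("2 + 2", tex, fixed = TRUE))
# ... but its printed result is still included
output_kept <- any(grepl("[1] 4", tex, fixed = TRUE))
```

Adding results=hide to the second chunk would suppress the printed result as well, which is handy for set-up chunks.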
Simple numbers can be plugged into the text with the \Sexpr command.
Example:
<<echo=FALSE>>=
out <- lm(y ~ x + x2 + x3)
summary(out)
@
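
The slide that showed \Sexpr in use did not survive extraction, so the following is only a sketch of how it might look. The data are simulated (the variables y, x, x2 and x3 from the chunk above are not given in the slides), the file name sexpr.Rnw is invented, and the sentence in the text is an illustration rather than the author's wording.

```r
workdir <- tempfile("sexpr-demo")
dir.create(workdir)
oldwd <- setwd(workdir)

writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<echo=FALSE>>=",
  "set.seed(1)                        # simulated data, illustration only",
  "x <- rnorm(50); x2 <- rnorm(50); x3 <- rnorm(50)",
  "y <- 1 + 2 * x + rnorm(50)",
  "out <- lm(y ~ x + x2 + x3)",
  "@",
  "The fitted model uses \\Sexpr{length(coef(out))} coefficients.",
  "\\end{document}"
), "sexpr.Rnw")

Sweave("sexpr.Rnw")
tex <- readLines("sexpr.tex")
setwd(oldwd)

# The \Sexpr{...} call has been replaced by its value in the .tex file
sentence <- grep("fitted model", tex, value = TRUE)
print(sentence)
```

Any expression that returns a single value can go inside \Sexpr, for example a rounded coefficient; expressions containing curly braces are best computed in a chunk first and referenced by name.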
Including figures
> x <- rnorm(10)
> y <- (-2) * x + rnorm(10)
> plot(x, y)
Figure: Scatter plot (y against x)
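
The .Rnw source behind a figure like this one is not shown in the slides. A plausible sketch, assuming a chunk labelled scatter with fig=TRUE placed inside a LaTeX figure environment (the label and the file name figure.Rnw are made up here), is:

```r
workdir <- tempfile("figure-demo")
dir.create(workdir)
oldwd <- setwd(workdir)

writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "\\begin{figure}",
  "<<scatter,fig=TRUE,echo=FALSE>>=",
  "x <- rnorm(10)",
  "y <- (-2) * x + rnorm(10)",
  "plot(x, y)",
  "@",
  "\\caption{Scatter plot}",
  "\\end{figure}",
  "\\end{document}"
), "figure.Rnw")

Sweave("figure.Rnw")
tex <- readLines("figure.tex")

# Sweave names the figure file <source name>-<chunk label>
figure_made <- file.exists("figure-scatter.pdf")
# and inserts the matching \includegraphics line into the .tex file
include_line <- grep("\\includegraphics", tex, fixed = TRUE, value = TRUE)

setwd(oldwd)
```

Because the figure file is regenerated on every run, the plot in the report always matches the current data and code.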
Including tables
The R package xtable makes it easy to generate LaTeX code for tables, with the results updated automatically when you Sweave your files.
Here's an example with a regression summary:
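
The original example was an image and is not recoverable, so here is a hedged stand-in. It assumes the contributed xtable package is installed and uses simulated data; the caption text is invented. In an .Rnw file the same code would sit in a chunk headed by something like <<echo=FALSE,results=tex>>= so that the generated LaTeX goes straight into the report.

```r
library(xtable)  # contributed package: install.packages("xtable")

# Simulated data, for illustration only
set.seed(1)
x <- rnorm(30)
y <- -2 * x + rnorm(30)
out <- lm(y ~ x)

# xtable() has a method for regression summaries; print() emits LaTeX code
reg_table <- xtable(summary(out), caption = "Regression summary")
latex_lines <- capture.output(print(reg_table))
cat(latex_lines, sep = "\n")
```

The printed output is a complete table environment with a tabular of estimates, standard errors, t values and p-values.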
Including tables
Another example:
          Tails  Heads
Outcome      46     54
Table: Coin tosses
Using cacheSweave for time-consuming steps
If your project includes time-consuming calculations, you don't really want to have to repeat those just to update a later part of the document.
You can use the cacheSweave package to deal with these scenarios.
By including the option cache=TRUE in a chunk, you are telling R-Sweave to only run this chunk the first time. For subsequent sweaves of the document, the cached results are used.
Reality check: make sure that you run the whole report with no cached results before final submission!
cacheSweave
Here's an example:
<<cachetry,echo=FALSE,cache=TRUE>>=
library(cacheSweave)
xc <- rnorm(10)
@
<<cacheout>>=
print(mean(xc))
@
> print(mean(xc))
[1] 0.2274117
On subsequent runs of Sweave, mean of xc = 0.2274 will appear unaltered in the document, since the random number generation only took place in the first run.
cacheSweave
Caution:
Include only computations in cached chunks, no figures or output you want to see.
If you change something in a cached chunk, remember to run plain Sweave (not cacheSweave) once.
Stangle()
Another bonus: by running the command Stangle instead of Sweave you produce the stand-alone R code, neatly packaged in separate chunks.
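
As a rough illustration of what Stangle produces, the sketch below writes a small two-chunk .Rnw file (file name, chunk labels and contents are all invented), runs Stangle() on it, and prints the extracted script.

```r
workdir <- tempfile("stangle-demo")
dir.create(workdir)
oldwd <- setwd(workdir)

writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<simulate>>=",
  "x <- rnorm(10)",
  "@",
  "Some text between the chunks.",
  "<<summarise>>=",
  "mean(x)",
  "@",
  "\\end{document}"
), "Example1.Rnw")

# Extract only the R code; the result is written to Example1.R
Stangle("Example1.Rnw")
r_code <- readLines("Example1.R")
setwd(oldwd)

# The script contains the chunk code, separated by comment headers,
# and none of the LaTeX text
cat(r_code, sep = "\n")
```

The extracted file can be run with source() or shared with collaborators who only want the analysis code.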