In applied data analysis, it is important to publish not only the results but also the code that was used to create them.  In the 'reproducible research' approach, one creates a dynamic document that contains not only the text of the final manuscript but also the R code used to create all of the tables and figures in it. While this was proposed some time ago, until recently it was rather difficult to do in practice. However, the freely available LyX editor has now been extended to support documents containing a mixture of LaTeX and R code, so it is now much easier to create these dynamic documents.
Such documents are particularly valuable not only for improving communication with others about exactly what was done, but also for providing a detailed record, for later use, of exactly how the analyses were done. This integrated compendium of text and code is much easier to understand and revisit in the future.  As Gentleman (2005) wrote, "New researchers are able to quickly and relatively easily determine what the previous investigator had done. Extension, improvement, or simply use will be easier than if no protocol has been used."
Here are some relevant links:
Robert Gentleman's slides on reproducible research can be found at:
http://gentleman.fhcrc.org/Fld-talks/RGRepRes.pdf
He also wrote a nice paper about this:
Gentleman R. Reproducible research: a bioinformatics case study. Stat Appl Genet Mol Biol. 2005;4:Article2. Epub 2005 Jan 11. PubMed PMID: 16646837.
This approach is implemented using the Sweave command in R.  Instructions for using LyX together with R and Sweave can be found at:
http://wiki.lyx.org/LyX/LyxWithRThroughSweave
See also the links on:
http://gregor.gorjanc.googlepages.com/lyx-sweave
as well as the article:
"Using Sweave with LyX: How to lower the LaTeX/Sweave learning curve" by Gregor Gorjanc,
which is the first article in R News, Volume 8/1, May 2008:
http://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf
The Sweave manual and FAQ can be found here:
http://www.statistik.lmu.de/~leisch/Sweave/
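To make the idea of a dynamic document concrete, here is a minimal sketch of a Sweave source file, written out from the shell. The file name minimal.Rnw, the chunk labels, and the simulated data are all made up for illustration; a file exported by LyX will look different in its details.

```shell
#!/bin/sh
# Write a small, hypothetical Sweave file into a fresh scratch directory.
# The quoted 'EOF' keeps the shell from expanding backslashes or $ signs.
workdir=$(mktemp -d)

cat > "$workdir/minimal.Rnw" <<'EOF'
\documentclass{article}
\begin{document}

<<simulate, echo=TRUE>>=
set.seed(1)
x <- rnorm(100)
@

The mean of the 100 simulated values is \Sexpr{round(mean(x), 2)}.

<<histogram, echo=FALSE, fig=TRUE>>=
hist(x)
@

\end{document}
EOF

echo "Wrote $workdir/minimal.Rnw"

# To weave and typeset it by hand (requires R and LaTeX to be installed):
#   cd "$workdir" && R CMD Sweave minimal.Rnw && pdflatex minimal.tex
```

Everything between a `<<...>>=` line and the following `@` is R code that Sweave runs when the document is woven; the rest is ordinary LaTeX.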
NOTE:
While it is wonderful to be able to use Sweave/R within LyX, in my experience it can be difficult to debug one's R code while working inside LyX, because LyX does all of its R computations in a temporary directory and isn't yet very good at returning the R error messages to the LyX user.
So if a LyX document fails to typeset, the cause is not always obvious. Here is an example of how one might track down an error in the R code while working in LyX.
Suppose your chunk of R code in the LyX editor window contains an error like this:
<<2, echo=FALSE, fig=TRUE>>=
plot(x
@
Now when you try to typeset it, you will get a message that states:
"An error occured whilst running R CMD Sweave 'test.Rnw'"
To track this down, you can look at the intermediate temporary files that LyX generated.
To do this, open a Terminal window, and then type
cd /tmp
ls
You should see a temporary directory with a name like lyx_tmpdir4466f8u8JU
Move into that directory by typing
cd lyx_tmpdir4466f8u8JU
ls
Now you should see a temporary directory with a name like lyx_tmpbuf0
Move into that directory by typing
cd lyx_tmpbuf0
ls
Now you should see all the temporary files that LyX generated, including one with a name like 'test.Rnw' which contains the Sweave code that LyX uses to generate the document.  To see why the R command failed, type
R CMD Sweave test.Rnw
When I do this, I see:
Writing to file test.tex
Processing code chunks ...
 1 : echo term verbatim (label=myFirstChunkInLyX)
 2 : term verbatim eps pdf (label=2)
Error:  chunk 2 (label=2) 
Error in parse(text = chunk) : unexpected end of input in "plot(x"
Execution halted
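This output indicates exactly what the cause of the error is: in chunk 2, R reached the end of the input while still reading "plot(x", which means the closing parenthesis is missing.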
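The manual steps above could be wrapped in a couple of small shell functions. This is only a sketch: the function names find_lyx_rnw and rerun_sweave are hypothetical (they are not part of LyX or R), and the code assumes that LyX uses the lyx_tmpdir*/lyx_tmpbuf* layout under /tmp seen in this example, which may differ on other systems.

```shell
#!/bin/sh
# Hypothetical helper: list the Sweave (.Rnw) files that LyX has written
# to its temporary directories under a base directory.
# Assumes the lyx_tmpdir*/lyx_tmpbuf* layout shown in the steps above.
find_lyx_rnw() {
    base="${1:-/tmp}"   # directory to search; defaults to /tmp
    find "$base" -type f -path '*/lyx_tmpdir*/lyx_tmpbuf*/*.Rnw' 2>/dev/null
}

# Hypothetical helper: rerun Sweave on one such file from inside its own
# directory, so that R's error messages are printed in the terminal.
# Requires R to be on the PATH.
rerun_sweave() {
    ( cd "$(dirname "$1")" && R CMD Sweave "$(basename "$1")" )
}
```

One would first run find_lyx_rnw to see the candidate files, and then call rerun_sweave on the one of interest, for example rerun_sweave /tmp/lyx_tmpdir4466f8u8JU/lyx_tmpbuf0/test.Rnw.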
Dr. Weeks is a Professor of Human Genetics and Biostatistics at the University of Pittsburgh. His research focuses on statistical human genetics in the area of mapping susceptibility loci involved in complex human diseases. The content on this blog is for informational purposes only - use at your own risk!
 
 
