tag:blogger.com,1999:blog-15283518144346744682024-03-12T23:49:59.192-04:00Daniel E. Weeks' Blog<a href="http://watson.hgen.pitt.edu">Dr. Weeks</a> is a Professor of Human Genetics and Biostatistics at the University of Pittsburgh. His research focuses on statistical human genetics in the area of mapping susceptibility loci involved in complex human diseases. The content on this blog is for informational purposes only - use at your own risk!Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.comBlogger42125tag:blogger.com,1999:blog-1528351814434674468.post-18893385034151599382016-12-23T15:36:00.002-05:002016-12-23T15:42:54.391-05:00Creating Beamer slides with optionally included notes slides using R MarkdownUsing RStudio, it is very easy to create Beamer slides using R Markdown. RStudio also supports LaTeX, so this enables even more flexibility.<br />
<br />
Using the template shared here:<br />
<br />
<a href="https://github.com/DanielEWeeks/Beamer-Rmd-Notes-Slides">https://github.com/DanielEWeeks/Beamer-Rmd-Notes-Slides</a><br />
<br />
one can also insert 'note slides' that can easily be all included or all excluded from the generated set of slides, just by toggling one word in the LaTeX header.<br />
<br />
This allows one to use a more concise set of slides, without the notes, in class while providing the students a more comprehensive set that contains all the notes.<br />
<br />
It could also be used to hide 'answer' slides from the students' slide set: when posing questions in class, I prefer that the students don't have the 'answer' slides right in front of them. After class, I make the slide set that contains the 'answer' slides available.<br />
<br />
I previously shared instructions on a different approach for doing this when generating Beamer slides using LaTeX or LyX - see:<br />
<br />
<a href="http://deweeks.blogspot.com/2014/08/hiding-answer-slides-in-student.html">http://deweeks.blogspot.com/2014/08/hiding-answer-slides-in-student.html</a><br />
<br />
<br />
<br />
<br />Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-73582442200394509072016-11-11T10:49:00.001-05:002016-11-11T10:52:53.938-05:00Optical character recognitionHere is a script that uses the 'tesseract' optical character recognition software to extract recognizable text from a PDF file:<br />
<br />
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
#!/bin/bash</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
# Purpose: </div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
# To carry out OCR on a PDF file</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
if [[ $# -ne 1 ]]; then</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
echo "This script expects one argument."</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
echo " This argument is the name of the pdf file" </div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
echo " including the .pdf extension"</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
echo "Usage: $0 file.pdf "</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
else</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
filename=$(basename "$1")</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
filename="${filename%.*}"</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
echo "Converting $1 to a tiff file named $filename.tiff"</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
echo "... "</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
convert -density 300 "$1" -depth 8 -strip -background white -alpha off "$filename.tiff"</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
echo "Carrying out OCR on $filename.tiff to create $filename.txt"</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
tesseract "$filename.tiff" "$filename"</div>
<div style="font-family: Menlo; font-size: 11px; line-height: normal;">
echo "The recognizable text in $1 has been output to the $filename.txt file."</div>
<span style="font-family: menlo; font-size: 11px;">fi</span><br />
<div>
<br /></div>
<div>
<span style="font-family: inherit;">For information on how to install the needed software, see <a href="https://diging.atlassian.net/wiki/display/DCH/Tutorial%3A+Text+Extraction+and+OCR+with+Tesseract+and+ImageMagick" target="_blank">this web page</a>.</span></div>
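The two-step filename handling in the script (basename, then ${filename%.*}) is what decides the names of the .tiff and .txt outputs. Here is a minimal sketch of just that step, wrapped in a helper I am calling stem_of (a hypothetical name, not part of the script above), so it can be tried without ImageMagick or tesseract installed:

```shell
#!/bin/bash
# stem_of is a hypothetical helper for illustration only.
# It mirrors the script's two steps: strip the directory, then strip the last extension.
stem_of() {
    local name
    name=$(basename "$1")        # "/tmp/scans/My Paper.v2.pdf" -> "My Paper.v2.pdf"
    printf '%s\n' "${name%.*}"   # remove the shortest trailing ".something"
}

stem_of "/tmp/scans/My Paper.v2.pdf"   # prints: My Paper.v2
```

Because the directory is dropped and only the last extension is removed, the outputs land in the current working directory, and a name containing spaces only survives if the variables are quoted wherever they are used.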
Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-26436865623846998652016-10-28T14:50:00.001-04:002016-10-28T14:52:59.634-04:00Please assign informative names to downloaded PDFs!Bravo for the journals that assign human-readable informative names to the PDF versions of articles downloaded from their web sites. For example, these naming schemes are very nice:<br />
<br />
<span style="font-family: "georgia" , "times new roman" , serif;">Hum. Mol. Genet.-2015-Simpkin-3752-63.pdf</span><br />
<span style="font-family: "georgia" , "times new roman" , serif;">PNAS-2005-Storey-12837-42.pdf</span><br />
<span style="font-family: "georgia" , "times new roman" , serif;">Int. J. Epidemiol.-2015-Sharp-1288-304.pdf</span><br />
<br />
<br />
Boo for the journals that assign non-informative names to their PDF files. For example, these naming schemes are not informative to me (even though the DOI is part of the name of the first two):<br />
<br />
<span style="font-family: "georgia" , "times new roman" , serif;">art%3A10.1186%2Fgb-2013-14-5-r42.pdf</span><br />
<span style="font-family: "georgia" , "times new roman" , serif;">art%3A10.1007%2Fs11357-016-9927-9.pdf</span><br />
<span style="font-family: "georgia" , "times new roman" , serif;">ijerph-12-14461.pdf</span>Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-49016427530266502262016-03-22T18:27:00.002-04:002016-03-22T18:28:49.946-04:00Simple parallelization using 'sem' of the GNU parallel package<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
If you have a multiprocessor computer, you can easily use the ‘sem’ part of the GNU Parallel system to run processes in parallel. </div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
It was simple to use and worked as intended, and it was much easier to install and get working than the other approach I was contemplating: installing grid engine software.</div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
See:</div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
https://www.gnu.org/software/parallel/sem.html</div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
Below is my parallelized script, which ran a whole set of “HaploPS” commands in parallel using 12 of my processors. I have bolded the two 'sem' commands that made this script execute in parallel.</div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px; line-height: normal;">
<br /></div>
<div style="line-height: normal;">
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">$ more run_haploPS_parallel.sh </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">#!/bin/bash</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">for (( chr=1;chr<=22;chr=chr+1)) {</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> cd $chr</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> for (( i=95 ; i>=5 ; i=i-5)) {</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> freq=$(echo "scale=2;$i/100"|bc)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> date</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> echo "Starting HaploPS run on chromosome " $chr " with freq = " $freq</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> <b>sem -j12</b> HaploPS -geno selscan.hap -legend selscan.map -freq 0$freq -out ../haploPS/haploPS_0${freq}_chr${chr}.txt </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">         echo "Run submitted."</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> date</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">}</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> cd ..</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">}</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><b>sem --wait</b></span></div>
</div>
<div style="line-height: normal;">
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">##############It will automatically run haploPS at 5 to 95 percent frequencies.</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">#################################################################</span></div>
<div style="font-family: Helvetica; font-size: 12px;">
<br /></div>
</div>
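A side note on the '0$freq' in the script: bc with scale=2 prints values below one without a leading zero (.95, .90, ...), which is why a literal 0 is prefixed. Below is a sketch of an alternative that builds the same labels with printf and no call to bc. The name freq_label is one I made up for illustration, and the helper assumes whole-number percentages from 1 to 99:

```shell
#!/bin/bash
# freq_label is a hypothetical helper: turn a whole-number percentage (1-99)
# into the "0.NN" label used for -freq and for the output file names.
freq_label() {
    printf '0.%02d\n' "$1"
}

# Same descending grid as the inner loop above: 95, 90, ..., 5
for (( i=95; i>=5; i=i-5 )); do
    freq_label "$i"
done
```

That is 19 frequencies per chromosome, so with 22 chromosomes the script queues 418 HaploPS jobs, of which 'sem -j12' lets at most 12 run at a time.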
Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-33620124534459004032015-08-18T09:17:00.003-04:002015-08-18T09:17:55.449-04:00The difference between mathematics and statistics"M<span style="background-color: white; font-family: Arial, Helvetica, sans-serif; font-size: 13px;">athematics is about whether the conclusions follow from the assumptions. By contrast, statistics is about whether the assumptions have anything to do with the real world." </span><br />
<span style="background-color: white; font-family: Arial, Helvetica, sans-serif; font-size: 13px;"><br /></span>
<span style="background-color: white; font-family: Arial, Helvetica, sans-serif; font-size: 13px;">Jay Kadane, </span><span style="background-color: white; font-family: Arial, Helvetica, sans-serif; font-size: 13px;">Leonard J. Savage Professor of Statistics, Emeritus, </span><span style="background-color: white; font-family: Arial, Helvetica, sans-serif; font-size: 13px;">Carnegie Mellon University</span>Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-78969236137397875392015-02-23T10:55:00.001-05:002015-02-23T10:57:22.895-05:00rsyncAs part of transitioning from using an old desktop to a new desktop (while both are still active), I found the 'rsync' command to be very useful. Here are the options I ended up using to accomplish my goal of copying over (old) files from the old machine to the new without wiping out any new files on the new machine:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">rsync -avzuhP --log-file=/destination/dir/rsync_log.txt -e ssh remoteuser@remotehost:/source/dir /destination/dir/ </span><br />
<br />
Here's what the options do:<br />
<span style="font-family: "Courier New",Courier,monospace;">-a archive</span><br />
<span style="font-family: "Courier New",Courier,monospace;">-v verbose </span><br />
<span style="font-family: "Courier New",Courier,monospace;">-z compress</span><br />
<span style="font-family: "Courier New",Courier,monospace;">-u update</span> (skip any file that is newer at the destination than at the source)<br />
<span style="font-family: "Courier New",Courier,monospace;">-h human-readable</span><br />
<span style="font-family: "Courier New",Courier,monospace;">-P show progress and keep partially transferred files</span><br />
<br />
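To see what '-u' does before pointing rsync at real data, here is a small local sketch. It uses two throwaway directories instead of a remote host, and demo_update_flag is a hypothetical name of mine. Only the -a and -u options from the command above are exercised:

```shell
#!/bin/bash
# Hypothetical local demonstration of -u; nothing here touches a remote machine.
demo_update_flag() {
    local src dst
    src=$(mktemp -d)
    dst=$(mktemp -d)

    echo "old copy"            > "$src/notes.txt"
    echo "only on old machine" > "$src/extra.txt"
    echo "newer copy"          > "$dst/notes.txt"

    # Make the source file clearly older than the destination file.
    touch -t 200101010000 "$src/notes.txt"

    # Trailing slash on the source: copy the directory's contents, not the directory itself.
    rsync -au "$src/" "$dst/"

    cat "$dst/notes.txt"   # the newer destination file is left alone
    cat "$dst/extra.txt"   # a file missing at the destination is copied over

    rm -rf "$src" "$dst"
}
```

Calling demo_update_flag should print "newer copy" followed by "only on old machine". Two related points: the command in this post has no trailing slash on /source/dir, so rsync creates a 'dir' folder inside /destination/dir/, and adding -n (--dry-run) lists what would be transferred without copying anything.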
Note that these options are turned on in '-a' archive mode:<br />
<span style="font-family: "Courier New",Courier,monospace;"> -r, --recursive recurse into directories<br /> -l, --links copy symlinks as symlinks<br /> -p, --perms preserve permissions<br /> -t, --times preserve times<br /> -g, --group preserve group<br /> -o, --owner preserve owner (super-user only)<br /> -D same as --devices --specials</span>Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-46253132086034507532014-08-28T13:40:00.005-04:002014-09-16T16:53:55.927-04:00Hiding 'answer' slides in student handouts using BeamerAfter adding 'Question' slides, followed by 'Answer' slides, to my Beamer presentation, I wondered if there was an easy way to remove the 'Answer' slides from the handout version of the slides that I will give to the students.<br />
<br />
It turns out that there is, as described in this <a href="http://tex.stackexchange.com/questions/35027/how-to-exclude-certain-slides-from-handout" target="_blank">link</a>: just add <span style="font-family: "Courier New",Courier,monospace;"><handout:0></span> right after the <span style="font-family: "Courier New",Courier,monospace;">\begin{frame} </span><br />
<br />
When the 'handout' option is added to the document class, all slides marked with <span style="font-family: "Courier New",Courier,monospace;"><handout:0></span> are automatically excluded.<br />
<br />
This is wonderful!<br />
<br />Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-9825305020610299392014-06-16T17:21:00.002-04:002014-06-16T17:21:56.159-04:00The wonderful 'endfloat' LaTeX packageWith LaTeX (or LyX), one can work on a manuscript while placing the tables and figures in their natural places, and then, with the addition of a single line to the preamble:<div>
<br /><div>
<pre class="lang-tex prettyprint prettyprinted" style="background-color: #eeeeee; border: 0px; color: #393318; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; font-size: 12px; line-height: 18px; margin-bottom: 10px; max-height: 600px; overflow: auto; padding: 5px; vertical-align: baseline; width: auto; word-wrap: normal;"><code style="border: 0px; color: #222222; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; margin: 0px; padding: 0px; vertical-align: baseline; white-space: inherit;"><span class="kwd" style="border: 0px; color: #8a4a0b; margin: 0px; padding: 0px; vertical-align: baseline;">\usepackage</span><span class="pun" style="border: 0px; color: #145680; margin: 0px; padding: 0px; vertical-align: baseline;">[</span><span class="pln" style="border: 0px; color: black; margin: 0px; padding: 0px; vertical-align: baseline;">nolists,tablesfirst</span><span class="pun" style="border: 0px; color: #145680; margin: 0px; padding: 0px; vertical-align: baseline;">]{</span><span class="pln" style="border: 0px; color: black; margin: 0px; padding: 0px; vertical-align: baseline;">endfloat</span><span class="pun" style="border: 0px; color: #145680; margin: 0px; padding: 0px; vertical-align: baseline;">}</span></code></pre>
</div>
<div>
<br /></div>
<div>
the tables and figures are magically moved to the end, as is required at submission by many journals. </div>
<div>
<br /></div>
<div>
The <a href="http://www.ctan.org/pkg/endfloat" target="_blank">'endfloat' package</a> even inserts markers like "[Table 1 about here]" in the main text.</div>
<div>
<br /></div>
<div>
Now I just need to persuade my colleagues to start writing their manuscripts in LaTeX...</div>
</div>
Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-36949725655282626842014-04-10T13:31:00.002-04:002014-04-10T13:31:38.911-04:00<a href="https://www.google.com/culturalinstitute/browse/Ching%20Chun%20Li" target="_blank">Here</a> is an amazing set of photographs of C.C. Li, who was the founder of the Division of Human Genetics within the Department of Biostatistics at the University of Pittsburgh. Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-79831685452876470152014-02-06T16:40:00.003-05:002014-02-06T16:40:37.659-05:00Markov Chain HumorPossible mascots for a program that uses Markov Chains in its computations:<br />
<br />
<b>Markov Man</b><br />
<i>"Look, it's a bird... no, it's a plane,... no, I forgot what it was."</i><br />
<br />
<br />
<b>Hidden Markov Man</b><br />
<i>the truly Forgetful Man</i>Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-84760393075830845162013-10-29T17:38:00.001-04:002013-10-29T17:45:54.015-04:00Computers are useful?!"A computer makes it possible to do, in half an hour, tasks which were completely unnecessary to do before" ~ Dave Barry.Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-86304907405043829572013-08-14T16:53:00.001-04:002013-08-14T16:53:56.093-04:00Zero p-values in RFor the top hits from an R program, I was getting all zero p-values, which are not that useful for ranking or plotting.<br />
<br />
Turns out that the R program was calculating the p-value in this manner:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">1 - pchisq(71.12830,1)</span><br />
<br />
But, according to the discussion <a href="http://stackoverflow.com/questions/6970705/why-cant-i-get-a-p-value-smaller-than-2-2e-16" rel="nofollow" target="_blank">here</a>, in R<br />
<br />
".Machine$double.eps is the smallest number such that 1+x can be distinguished from 1"<br />
<br />
On my machine, we have that:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">> .Machine$double.eps</span><br />
<span style="font-family: Courier New, Courier, monospace;">[1] 2.220446e-16</span><br />
<br />
So that is why this truncates to zero:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">> 1 - pchisq(71.12830,1)</span><br />
<span style="font-family: Courier New, Courier, monospace;">[1] 0</span><br />
<br />
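To spell out the arithmetic (assuming standard IEEE 754 double precision, which is what the value of .Machine$double.eps above indicates, and assuming pchisq returns the correctly rounded lower tail): write p for the true upper-tail probability, which here is about 3.3e-17. Doubles just below 1 are spaced eps/2 apart, so 1 - p has to be rounded to the nearer of 1 - eps/2 and 1:

```latex
\[
\varepsilon = 2^{-52} \approx 2.22 \times 10^{-16}, \qquad
p \approx 3.3 \times 10^{-17} < \frac{\varepsilon}{4} \approx 5.55 \times 10^{-17}
\;\Longrightarrow\; \mathrm{fl}(1 - p) = 1
\;\Longrightarrow\; 1 - \mathrm{fl}(1 - p) = 0 .
\]
```

So any p-value below roughly 5.6e-17 comes out as exactly zero when computed as one minus the lower tail, and somewhat larger ones come out as multiples of eps/2 rather than their true values.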
However, we can get a non-zero p-value if instead we compute it this way:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">> pchisq(71.12830,1,lower.tail=FALSE)</span><br />
<span style="font-family: Courier New, Courier, monospace;">[1] 3.347341e-17</span>Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-39415248467899605552012-10-23T10:48:00.003-04:002012-10-23T10:48:38.938-04:00knitrI recently became aware of the knitr package <a href="http://yihui.name/knitr/" target="_blank">http://yihui.name/knitr/</a>, which is much more capable and easier to use than Sweave when creating dynamic 'reproducible research' reports that interweave LaTeX and R. Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-62247679819524111652012-10-22T15:27:00.000-04:002012-10-22T15:27:41.851-04:00Quertle http://www.quertle.info/I learned today about an amazing new program for searching the PubMed literature database plus full text documents - this is called 'Quertle' <a href="http://www.quertle.info/" target="_blank">http://www.quertle.info/</a> <br />
<br />
I tried it out and quickly found an important and relevant paper that we hadn't found in an earlier more extensive conventional literature search.Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-24770630556485899912011-10-18T14:03:00.004-04:002011-10-18T14:03:49.085-04:00Your genetic profileCraig Venter said, in response to a question about the 'risks' of knowing one's genetic makeup:<br />
<br />
"Understanding the genetic code is understanding probabilities. There is very little in your code that is yes/no."Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-80548006539761180652011-10-17T13:16:00.002-04:002011-10-17T13:16:54.873-04:00Hazards of Having Good Students in your Introductory Statistics Class<span style="font-size: small;"><span style="font-family: inherit;">This paper <a href="http://www.blogger.com/goog_1094374833">"</a></span></span><span style="font-family: inherit; font-size: small;"><a href="http://www-personal.umich.edu/%7Eteraghu/publications.html">Hazards of Having Good Students in your Introductory Statistics Class"</a> by Trivellore Raghunathan</span><span style="font-size: small;"><span style="font-family: inherit;"> about the meaning of confidence intervals and p-values is well-written and entertaining.</span></span>Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-8423330091364212152011-10-17T12:55:00.002-04:002011-10-17T12:55:22.975-04:00The meaning of p-valuesRegarding p-values, Paul Meehl said something along the lines of "a p-value answers a question we shouldn't ask about a topic we aren't interested in".Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-30543550589990130542011-10-17T12:53:00.001-04:002011-10-17T12:53:03.428-04:00THINK"I've often said that once scientists get a p-value less than 0.05 or a
confidence interval that implies statistical significance, they turn
their brains off." Steve Simon, quoted from <a href="http://www.pmean.com/11/NegativeInterval.html">here</a>.Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-71054468461594902262011-09-12T14:04:00.002-04:002011-09-12T14:04:49.358-04:00Significance<div class="separator" style="clear: both; text-align: center;">
This <a href="http://xkcd.com/882/">cartoon</a> from XKCD is wonderful:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://imgs.xkcd.com/comics/significant.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://imgs.xkcd.com/comics/significant.png" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-56118187294807364352011-09-12T13:56:00.001-04:002011-09-12T14:00:09.406-04:00"Any claim coming from an observational study is most likely to be wrong"This article from the journal "Significance"<br /><br /><a href="http://www.significancemagazine.org/details/magazine/1324539/Deming-data-and-observational-studies.html">Deming, data and observational studies</a><br />Authors: S. Stanley Young, Alan Karr<br />Published: Aug 25, 2011 - From issue: Volume 8 Issue 3 (September 2011)<br />DOI: 10.1111/j.1740-9713.2011.00506.x <br /><br />
suggests a process control-based solution for improving the success rate of observational studies.<br />
<br />
To better appreciate the problem of multiple testing, they suggest visiting Jerry Dallal's <a href="http://www.jerrydallal.com/LHSP/multtest.htm">simulation web page</a>, which illustrates what happens when one carries out "100 Independent 0.05 Level Tests For An Effect Where None Is Present".<br />Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-86714319421034884312011-09-08T16:14:00.001-04:002011-09-08T16:14:18.161-04:00Forensic bioinformaticsI like this phrase, "Forensic bioinformatics", that Keith Baggerly and Kevin Coombes have defined:<br />
<br />
<i>"Data processing, however, is often not described well enough to allow for exact reproduction of the results, leading to exercises in “forensic bioinformatics” where aspects of raw data and reported results are used to infer what methods must have been employed." </i><br />
<br />
in this paper <a href="http://projecteuclid.org/euclid.aoas/1267453942">here</a>. Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com1tag:blogger.com,1999:blog-1528351814434674468.post-55276609299060515662011-09-08T15:57:00.002-04:002011-09-08T16:17:11.766-04:00Computing for Statistical Genetics: nice R slidesThomas Lumley and Ken Rice have made a very nice set of "Computing for Statistical Genetics" slides on R that they prepared for the European Institute in Statistical Genetics. This set of slides is available <a href="http://faculty.washington.edu/kenrice/sisg/">here</a>. Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-35934087639653906602011-08-18T17:16:00.004-04:002011-08-18T17:22:10.812-04:00Elegant plotting of GWAs resultsI really like the way they plotted their GWAs results in <a href="http://www.nature.com/nature/journal/v476/n7359/full/nature10251.html">this paper</a>:
<br />
<br />International Multiple Sclerosis Genetics Consortium; Wellcome Trust Case Control Consortium. Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature. 2011 Aug 10;476(7359):214-9. doi: 10.1038/nature10251. PubMed PMID: 21833088.
<br />
<br />
<br />Try out their elegant interactive version of their GWAs Figure at:
<br />
<br /><a href="http://wattle.well.ox.ac.uk/wtccc2/external/ms/">http://wattle.well.ox.ac.uk/wtccc2/external/ms/</a>
<br />Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-85855297765572096182010-06-22T14:09:00.002-04:002010-06-22T14:11:27.592-04:00I think the circular inferno plot for comparing genome-wide association scan results could be useful.<br /><br />This is described in this paper:<br /><br />Tembe WD, Pearson JV, Homer N, Lowey J, Suh E, Craig DW. Statistical comparison framework and visualization scheme for ranking-based algorithms in high-throughput genome-wide studies. J Comput Biol. 2009 Apr;16(4):565-77. PubMed PMID: 19361328.<br /><br />as well as in this <a href="http://asusrl.eas.asu.edu/share/xyz/Research/bio/TGen/plotting%20for%20integrated%20genomic%20data_poster.pdf">poster</a> (pdf file).<br /><br />Perhaps such plots could be drawn using the 'circos' software:<br /><br />See: <a href="http://mkweb.bcgsc.ca/circos/">http://mkweb.bcgsc.ca/circos/</a><br /><br />Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA. Circos: an information aesthetic for comparative genomics. Genome Res. 2009 Sep;19(9):1639-45. Epub 2009 Jun 18. PubMed PMID: 19541911; PubMed Central PMCID: PMC2752132.Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0tag:blogger.com,1999:blog-1528351814434674468.post-70871642697837598832009-12-09T17:29:00.002-05:002009-12-09T17:33:10.335-05:00Open access Autism 10K genome-scan data set in NIH GEOThe Affymetrix 10K genome-scan data set from the Autism Genome Project is available in the NIH GEO repository (open access):<br /><br /><a href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE6754">http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE6754</a><br /><br />"This is a large linkage study undertaken by The Autism Genome Project (AGP) Consortium to search for candidate genes underlying the etiology of autism. 
1168 Muliplex families (= 2 affected individuals) consisting of 7600 individuals were genotyped using Affymetrix 10K whole genome mapping arrays. Copy number analysis was performed using DNA Chip (dChip) Analyzer."Daniel E. Weekshttp://www.blogger.com/profile/00430940551049146859noreply@blogger.com0