ADAM: Fast, Scalable Genome Analysis
Frank Austin Nothaft
AMPLab, University of California, Berkeley, @fnothaft
with: Matt Massie, André Schumacher, Timothy Danford, Chris Hartl, Jey Kottalam, Arun Ahuja, Neal Sidhwaney, Michael Linderman, Jeff Hammerbacher, Anthony Joseph, and Dave Patterson
https://github.com/bigdatagenomics
http://www.bdgenomics.org
• Analysis time is often a matter of life and death
Whole Genome Data Sizes

          Input                    Pipeline Stage    Output
SNAP      1GB Fasta, 150GB Fastq   Alignment         250GB BAM
ADAM      250GB BAM                Pre-processing    200GB ADAM
Avocado   200GB ADAM               Variant Calling   10MB ADAM
Variants found at about 1 in 1,000 loci
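
As a rough worked example of the figures above, the following Python sketch turns the "1 in 1,000 loci" rate into an expected variant count and compares input and output sizes per stage. The genome length of about 3 billion loci is an assumption that does not appear on the slide, and the names `expected_variants` and `size_ratios` are illustrative only.

```python
# Approximate human genome length in loci (assumption, not from the slides).
ASSUMED_GENOME_LOCI = 3_000_000_000


def expected_variants(loci, one_in=1_000):
    """Variants expected if about 1 in `one_in` loci differs from the reference."""
    return loci // one_in


# Stage sizes as listed on the slide, in GB (10MB is written as 0.01GB).
STAGES = [
    ("SNAP / Alignment", 1 + 150, 250),
    ("ADAM / Pre-processing", 250, 200),
    ("Avocado / Variant Calling", 200, 0.01),
]


def size_ratios(stages):
    """Map each stage name to its output size divided by its input size."""
    return {name: out_gb / in_gb for name, in_gb, out_gb in stages}


if __name__ == "__main__":
    print(expected_variants(ASSUMED_GENOME_LOCI))  # 3000000
    for name, ratio in size_ratios(STAGES).items():
        print(f"{name}: output is {ratio:.5f}x the input size")
```

Under that assumption, a whole genome would carry on the order of a few million variants, which is in line with the final output being far smaller than the reads that produced it.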
Shredded Book Analogy
Dickens accidentally shreds the first printing of A Tale of Two Cities
Text printed on 5 long spools
• How can he reconstruct the text?
– 5 copies x 138,656 words / 5 words per fragment = 138k fragments
– The short fragments from every copy are mixed together
– Some fragments are identical
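
The fragment arithmetic and the shredding itself can be sketched in a few lines of Python. This is only an illustration of the analogy: the function names, the choice to start each copy's cuts at a different word offset, and the fixed shuffle seed are all assumptions made here, not details from the slides.

```python
import random
from collections import Counter


def expected_fragments(copies, words_per_copy, words_per_fragment):
    """Fragment count from the slide: copies x words / words per fragment."""
    return copies * words_per_copy // words_per_fragment


def shred(text, copies=5, words_per_fragment=5, seed=0):
    """Cut several copies of `text` into short word fragments and mix them."""
    rng = random.Random(seed)
    words = text.split()
    fragments = []
    for copy_index in range(copies):
        # Assumed for illustration: each copy is cut starting at a different offset.
        offset = copy_index % words_per_fragment
        if offset:
            fragments.append(tuple(words[:offset]))
        for start in range(offset, len(words), words_per_fragment):
            fragments.append(tuple(words[start:start + words_per_fragment]))
    rng.shuffle(fragments)  # fragments from every copy are mixed together
    return fragments


OPENING = ("It was the best of times, it was the worst of times, "
           "it was the age of wisdom, it was the age of foolishness,")

if __name__ == "__main__":
    print(expected_fragments(5, 138_656, 5))  # 138656, i.e. roughly 138k
    mixed = shred(OPENING)
    repeated = [f for f, n in Counter(mixed).items() if n > 1]
    print(len(mixed), "fragments;", len(repeated), "distinct fragments appear more than once")
```

Even on this one sentence, repeated phrases such as "it was the age of" yield identical fragments from different positions, which is what makes putting the text back together ambiguous.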
[Figure: the opening line as printed on each of the 5 spools, with every copy cut into short fragments at different positions]
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, …
Slide credit to Michael Schatz http://schatzlab.cshl.edu/
Acknowledgements
• UC Berkeley: Matt Massie, André Schumacher, Jey Kottalam, Christos Kozanitis
• Mt. Sinai: Arun Ahuja, Neal Sidhwaney, Michael Linderman, Jeff Hammerbacher
• GenomeBridge: Timothy Danford, Carl Yeksigian
• The Broad Institute: Chris Hartl
• Cloudera: Uri Laserson
• Microsoft Research: Jeremy Elson, Ravi Pandya
• And other open source contributors!
Acknowledgements
This research is supported in part by NSF CISE Expeditions Award CCF-1139158, LBNL Award 7076018, and DARPA XData Award FA8750-12-2-0331, and gifts from Amazon Web Services, Google, SAP, The Thomas and Stacey Siebel Foundation, Apple, Inc., C3Energy, Cisco,