Towards Personal Genomics Tools for Navigating the Genome of an Individual Saul A. Kravitz J. Craig Venter Institute Rockville, MD Bio-IT World 2008

Dec 23, 2015

Page 1: Towards Personal Genomics Tools for Navigating the Genome of an Individual Saul A. Kravitz J. Craig Venter Institute Rockville, MD Bio-IT World 2008.

Towards Personal Genomics
Tools for Navigating the Genome of an Individual

Saul A. Kravitz
J. Craig Venter Institute

Rockville, MD

Bio-IT World 2008

Page 2

Personal Genomics: The future is now

Page 3

Outline

• HuRef Project: Genome of an Individual
• HuRef Research Highlights
• The HuRef Browser – http://huref.jcvi.org
• Towards Personal Genomics Browsers
• Conclusions and Credits

Page 4

Genome of a Single Individual: Goals

• Provide a diploid genome that could serve as a reference for future individualized genomics

• Characterize the individual’s genetic variation
– HuRef vs NCBI
– HuRef haplotypes

• Understand the individual’s risk profile based on their genomic data

Page 5

How does HuRef Differ?

• NCBI Genome
– Multiple individuals
– Collapsed Haploid Sequence of a Diploid Genome
– No haplotype phasing or inference possible

• HuRef
– Single individual
– Can reconstruct haplotypes of diploid genome

• Haplotype Blocks
– Segment of DNA inherited from one parent

Page 6

The HuRef Genome

PLoS Biology 2007 5:e254
September 4, 2007

Page 7

The HuRef Genome

• DNA from a single individual

• De Novo Assembly
– 7.5x Coverage Sanger Reads

• Diploid Reconstruction
– Half of genome is in haplotype blocks of >200kb

• HuRef Data Released
– NCBI: Genome Project 19621
– JCVI: http://huref.jcvi.org
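The statement that half of the genome lies in haplotype blocks longer than 200 kb reads like an N50-style summary. A minimal sketch of how such a figure could be computed, assuming a plain list of block lengths: the helper name `n50` is hypothetical and the lengths below are invented for illustration, not HuRef data.

```python
def n50(block_lengths, total_length=None):
    """Return the length L such that blocks of length >= L together cover
    at least half of total_length (default: the sum of all block lengths)."""
    if not block_lengths:
        raise ValueError("need at least one block")
    total = total_length if total_length is not None else sum(block_lengths)
    covered = 0
    # Walk blocks from longest to shortest until half the total is covered.
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if 2 * covered >= total:
            return length
    return None  # the blocks cover less than half of total_length


# Illustrative block lengths in kb (made up for this example).
toy_blocks_kb = [400, 300, 200, 100, 50, 50]
print(n50(toy_blocks_kb))
```

Passing `total_length` explicitly matters when the blocks do not tile the whole genome, since "half of the genome" and "half of the phased sequence" are then different thresholds.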

Page 8

Variants: NCBI-36 vs HuRef

• NCBI-36 vs HuRef yields Homozygous Variants

Figure: NCBI vs HuRef sequence comparisons illustrating four variant types: SNP (variant: G/A), MNP (variant: TA/AT), Insertion, and Deletion.
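The four variant types named on this slide can be told apart from the two allele strings alone. A small sketch under that assumption: `classify_variant` is a hypothetical helper, an empty string stands for absent sequence, and the insertion and deletion sequences in the usage loop are invented because the slide does not give them.

```python
def classify_variant(ref_allele, alt_allele):
    """Classify a variant from its two allele strings ('' = absent sequence)."""
    if ref_allele == alt_allele:
        return "no variant"
    if ref_allele == "":
        return "insertion"  # sequence present only in the second genome
    if alt_allele == "":
        return "deletion"   # sequence present only in the first genome
    if len(ref_allele) == len(alt_allele):
        # Same length: one substituted base is a SNP, several are an MNP.
        return "SNP" if len(ref_allele) == 1 else "MNP"
    return "complex"  # unequal, non-empty alleles; not a case shown on the slide


# G/A and TA/AT come from the slide; the indel sequences are made up.
for ref, alt in [("G", "A"), ("TA", "AT"), ("", "ACG"), ("ACG", "")]:
    print(f"{ref or '-'}/{alt or '-'}: {classify_variant(ref, alt)}")
```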

Page 9

Computing Allelic Contributions

• Consensus generation conflates alleles

Reads

ACCTTTGTAATTCCC
ACCTTTGTAATTCCC
ACCTTTGTAATTCCC
ACCTTTACAATTCCC
ACCTTTACAATTCCC
ACCTTTACAATTCCC

Haploid Consensus

ACCTTTGCAATTCCC

Page 10

Computing Allelic Contributions

• Consensus generation conflates alleles
• Consensus generation modified to separate alleles
• Bioinformatics. 2008 Apr 15;24(8):1035-40

Page 11

Computing Allelic Contributions

• Modified consensus generation separates alleles
• Compare HuRef alleles to identify SNP, MNP, Indel Variants

Reads

ACCTTTGTAATTCCC
ACCTTTGTAATTCCC
ACCTTTGTAATTCCC
ACCTTTACAATTCCC
ACCTTTACAATTCCC
ACCTTTACAATTCCC

True Diploid Alleles

ACCTTTGTAATTCCC
ACCTTTACAATTCCC

Haploid Consensus

ACCTTTGCAATTCCC

MNP Variant

AC / GT
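The slide's six reads make the conflation easy to check by hand. The toy sketch below is not the published consensus algorithm; it assumes error-free, gap-free, pre-aligned reads of equal length, and the function names are hypothetical. It groups reads by the bases they carry at the disagreeing columns and then tests whether the haploid consensus matches any group.

```python
from collections import Counter

# Reads and haploid consensus exactly as shown on the slide.
reads = ["ACCTTTGTAATTCCC"] * 3 + ["ACCTTTACAATTCCC"] * 3
haploid_consensus = "ACCTTTGCAATTCCC"


def variant_columns(aligned_reads):
    """0-based indices of alignment columns where the reads disagree."""
    return [i for i, column in enumerate(zip(*aligned_reads))
            if len(set(column)) > 1]


def separate_alleles(aligned_reads):
    """Group reads by the bases they carry at the variant columns."""
    columns = variant_columns(aligned_reads)
    groups = Counter("".join(read[i] for i in columns)
                     for read in aligned_reads)
    return columns, groups


columns, groups = separate_alleles(reads)
print(columns)       # positions of the disagreeing columns
print(dict(groups))  # allele bases at those columns -> supporting read count

# A per-column consensus can pick one base from each allele, producing a
# combination that no single read supports.
consensus_bases = "".join(haploid_consensus[i] for i in columns)
print(consensus_bases, consensus_bases in groups)
```

Keeping the linked columns together per read is what preserves the two alleles; voting on each column independently is what loses them.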

Page 12

HuRef Variations

• 4.1 Million Variations (12.3 Mbp)
– 1.2 Million Novel

• Many non-synonymous changes
– ~700 indels and ~10,000 total SNPs

• Indels and non-SNP Sequence Variation
– 22% of all variant events, 74% of all variant bases

• 0.5-1.0% difference between haploid genomes
– 5-10x higher than previous estimates
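A rough back-of-the-envelope reading of these figures, assuming the 22% and 74% shares apply to the 4.1 million events and 12.3 Mbp totals quoted on the same slide. The inputs are the slide's rounded values, so the results are approximate.

```python
total_events = 4.1e6        # variations reported on the slide
total_bases = 12.3e6        # variant bases reported on the slide (12.3 Mbp)
non_snp_event_share = 0.22  # indels and other non-SNP variation, by event
non_snp_base_share = 0.74   # indels and other non-SNP variation, by base

non_snp_events = total_events * non_snp_event_share
non_snp_bases = total_bases * non_snp_base_share

print(f"non-SNP events: ~{non_snp_events / 1e6:.1f} million")
print(f"non-SNP bases:  ~{non_snp_bases / 1e6:.1f} Mbp")
print(f"mean bases per non-SNP event: ~{non_snp_bases / non_snp_events:.0f}")
```

Under these assumptions, fewer than a quarter of the events account for roughly three quarters of the variant bases, which is why the slide singles out non-SNP variation.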

Page 13

HuRef Browser

• Why do this?
– Research tool focused on variation
– Verify assembly and variants
– Show ALL the evidence
– High Performance

• Features
– Use HuRef or NCBI as reference
– Genome vs Genome Comparison
– Drill down from chromosome to reads and alignments
– Overlay of Ensembl and NCBI Annotation
– Links from HuRef features in NCBI (e.g., dbSNP)
– Export of data for further analysis

Page 14

http://huref.jcvi.org

Search by Feature ID or coordinates

Navigate by Chromosome Band

Page 15

Zinc Finger Protein
Chr19:57564487-57581356

Assembly Structure
Variations
Transcript
Gene
Haplotype Blocks
NCBI-36
HuRef
Assembly-Assembly Mapping

Page 16

chr19:57578700-57581000

Protein Truncated by 476 bp Insertion

Homozygous SNP
Heterozygous SNP
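The drill-down from the gene view (Chr19:57564487-57581356) to this close-up (chr19:57578700-57581000) can be expressed as a containment check on coordinate strings. The sketch below is an illustration only: `Region` and `parse_region` are hypothetical names, the browser's own parsing is not described in the slides, and treating coordinates as 1-based inclusive is an assumption.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumes 1-based inclusive coordinates (not stated on the slides).
        return self.end - self.start + 1

    def contains(self, other: "Region") -> bool:
        return (self.chrom == other.chrom
                and self.start <= other.start
                and other.end <= self.end)


_REGION_RE = re.compile(r"^(chr\w+):(\d+)-(\d+)$", re.IGNORECASE)


def parse_region(text: str) -> Region:
    """Parse a string such as 'chr19:57578700-57581000' into a Region."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is after end in {text!r}")
    # Normalise case so 'Chr19' and 'chr19' compare equal.
    return Region(chrom.lower(), start, end)


gene_view = parse_region("Chr19:57564487-57581356")
close_up = parse_region("chr19:57578700-57581000")
print(gene_view.contains(close_up), gene_view.length, close_up.length)
```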

Page 17

Assembly Structure

Page 18

Drill Down to Multi-sequence Alignment

Validation of Phased A/C Heterozygous SNPs in HuRef

Page 19

14kbp Inversion Spanning TNFRSF14
chr1:2469149-2496613

Page 20

Browser for Multiple Genomes

• Expand on existing features
– Variants and haplotype blocks in individuals
– Structural variation among individuals
– Genetic traits of variants related to diseases

• Required Features
– Which genome/haplotype is the reference?
– Correlation with phenotypic, medical, and population data
– Correlation within families

Page 21

Future Challenges

• Data volumes
– Read data included from new technologies
– Multiplication of genomes

• Enormous number of potential comparisons
– Populations, individuals, variants

• Dynamic generation of views in web time
• Use cases are evolving

Page 22

Conclusion

• A high performance visualization tool for an individual genome
– Validation of variants
– Comparison with NCBI-36

• Planned extensions for multi-genome era

• Website: http://huref.jcvi.org
• Contact: [email protected]

Page 23

Acknowledgements

HuRef Browser: Nelson Axelrod, Yuan Lin, and Jonathan Crabtree

Scientific Leadership: Sam Levy, Craig Venter, Robert Strausberg, Marvin Frazier

Sequence Data Generation and Indel Validation: Yu-Hui Rogers, John Gill, Jon Borman, JTC Production, Tina McIntosh, Karen Beeson, Dana Busam, Alexia Tsiamouri, Celera Genomics

Data Analysis: Sam Levy, Granger Sutton, Pauline Ng, Aaron Halpern, Brian Walenz, Nelson Axelrod, Yuan Lin, Jiaqi Huang, Ewen Kirkness, Gennady Denisov, Tim Stockwell, Vikas Bansal, Vineet Bafna, Karin Remington, and Josep Abril

CNV, Genotyping, FISH mapping: Steve Scherer, Lars Feuk, Andy Wing Chun Pang, Jeff MacDonald

Funding: J. Craig Venter Foundation

DNA: J. Craig Venter