Introduction to Modern Omics
Agus Salim, Ph.D
Big Data Analytics: Application to Modern Genetics
IPB Convention Center, 14-15 Dec 2015
Background
In 2001, the Human Genome Project (HGP) and Celera came up with drafts of the human genome (complete map in 2004).
What is a (Human) Genome?
What is DNA?
DNA is the basic building block of genes.
It is composed of 4 letters (A, C, G, T), wrapped in a double-helix structure.
An analogy: DNA bases (letters) make up genes (sentences), and a collection of genes makes up a chromosome (chapter).
Humans have 23 pairs of chromosomes (22 pairs of autosomes, 1 pair of sex chromosomes).
Central Dogma of Genetics
The draft genome opened unprecedented opportunities to understand the mechanisms behind human traits (phenotypes) and diseases, and (hopefully) to discover cures for those diseases.
After all, DNA is the basic code book of genetics.
Central Dogma of Genetics: DNA ⇒ RNA ⇒ Protein ⇒ Phenotype (Trait)
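The first two arrows of the central dogma (DNA to RNA, RNA to protein) can be illustrated with a minimal Python sketch. The codon table below is deliberately partial: it lists only the codons used in the made-up example sequence, whereas the full genetic code has 64 codons. The function names and the example sequence are assumptions for illustration only.

```python
# Partial codon table (mRNA codon -> one-letter amino acid); "*" marks a stop codon.
# Only the codons used in the example are listed; the full genetic code has 64 entries.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "GCC": "A",  # alanine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        # Raises KeyError for codons missing from the partial table above.
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTGGAAATAA"       # hypothetical coding-strand sequence
mrna = transcribe(dna)        # DNA => RNA
protein = translate(mrna)     # RNA => protein
print(mrna, protein)
```

The last arrow (protein to phenotype) has no such simple mapping, which is part of what omics studies try to unravel.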
Major types of ’Omics studies
Genomics: Comprehensive study of DNA at organism scale
Transcriptomics: Comprehensive study of the full set of RNA (transcripts)
Proteomics: Comprehensive study of the full set of proteins
Metabolomics: Comprehensive study of the full set of small molecules (metabolites)
Methylomics: Comprehensive study of DNA methylation
Common Themes in ’Omics Data
Driven by the latest technology
High-throughput: large volumes of data produced in a relatively short time
Low signal-to-noise ratio
Contain a lot of artifacts (bias)
Use of known databases to enhance findings
Technologies Used in Omics study: Microarray
An array can contain up to 20,000 spots. Each spot has been implanted with a ’target’ DNA sequence (1 spot ∼ 1 gene).
DNA from one or two samples is hybridized to the ’target’, with each sample dyed a different color.
Technologies in Omics study: Microarray
The amount of hybridization from each sample depends on the amount of sequence in that sample that matches the target sequence.
Spot color: more in sample A (red), more in sample B (green), equal amounts (yellow).
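The red/green/yellow reading of a two-color spot is commonly summarized as a log2 ratio of the two channel intensities. The sketch below is only an illustration: the intensity values are made up, and the cut-off of 1.0 (a two-fold difference) is an assumed threshold, not one given in these slides.

```python
import math


def log_ratio(red: float, green: float) -> float:
    """log2 of red over green intensity; positive means more signal from sample A (red)."""
    return math.log2(red / green)


def spot_color(red: float, green: float, threshold: float = 1.0) -> str:
    """Classify a spot as red, green or yellow; the threshold is an assumed cut-off."""
    m = log_ratio(red, green)
    if m > threshold:
        return "red"      # more hybridization from sample A
    if m < -threshold:
        return "green"    # more hybridization from sample B
    return "yellow"       # roughly equal amounts


# Hypothetical (red, green) intensities for three spots.
spots = {
    "gene_1": (4000.0, 500.0),
    "gene_2": (300.0, 2400.0),
    "gene_3": (1000.0, 1100.0),
}
calls = {gene: spot_color(red, green) for gene, (red, green) in spots.items()}
print(calls)
```

Working on the log scale makes a two-fold increase and a two-fold decrease symmetric around zero.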
Technologies in Omics study: Next-gen sequencing
Microarray technology only allows studying transcripts at the gene level.
Next-gen sequencing (NGS) allows us to study transcripts at sub-gene (e.g., exon) levels.
Technologies in Omics study: Next-gen sequencing
DNA is denatured and chopped randomly into short fragments.
The sequencing machine then scans (’reads’) the sequence of letters from each end of a fragment.
Unlike microarray data, initial NGS reads are not assigned to any transcript (gene); alignment is needed to map the reads back to the genome.
A further problem is that more reads map back to longer genes purely by chance, so raw read counts are not directly comparable between genes of different lengths.
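One widely used way to correct for the gene-length effect is RPKM (reads per kilobase of transcript per million mapped reads). The slides do not name a specific normalization, so the sketch below is an assumption about how the adjustment could be done; the gene names and counts are invented for illustration.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene length per million mapped reads."""
    length_kb = gene_length_bp / 1_000
    reads_millions = total_mapped_reads / 1_000_000
    return read_count / (length_kb * reads_millions)


# Hypothetical library with 10 million mapped reads in total.
total_reads = 10_000_000

# The long gene collects five times as many reads simply because it is five times longer.
short_gene = rpkm(read_count=200, gene_length_bp=1_000, total_mapped_reads=total_reads)
long_gene = rpkm(read_count=1_000, gene_length_bp=5_000, total_mapped_reads=total_reads)

print(short_gene, long_gene)
```

After normalization the two hypothetical genes get the same value, which reflects equal expression per unit length rather than equal raw counts.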
Technologies in Omics study: LC (GC) - MS
Chromatography (liquid, LC, or gas, GC) is popular for (physically) separating molecules; mass spectrometry (MS) is then needed to measure the molecular mass, reported as a mass-to-charge ratio.
LC (GC) - MS is popular for proteomics and metabolomics studies.
The data is usually in the form of spectra
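As a rough illustration of what such data look like, a single spectrum can be stored as a list of (mass-to-charge, intensity) pairs, and candidate peaks can be found as local maxima above a noise threshold. This is a toy sketch with made-up values and an assumed threshold, not the peak-detection method of any particular software.

```python
def pick_peaks(spectrum, min_intensity):
    """Return (m/z, intensity) points that are local maxima at or above min_intensity."""
    peaks = []
    for i in range(1, len(spectrum) - 1):
        mz, intensity = spectrum[i]
        higher_than_left = intensity > spectrum[i - 1][1]
        higher_than_right = intensity > spectrum[i + 1][1]
        if intensity >= min_intensity and higher_than_left and higher_than_right:
            peaks.append((mz, intensity))
    return peaks


# Hypothetical spectrum: (m/z, intensity) pairs sorted by m/z.
spectrum = [
    (100.0, 5.0),
    (100.5, 40.0),
    (101.0, 8.0),
    (101.5, 3.0),
    (102.0, 90.0),
    (102.5, 12.0),
    (103.0, 4.0),
]
peaks = pick_peaks(spectrum, min_intensity=20.0)
print(peaks)
```

Real spectra are far noisier, which ties back to the low signal-to-noise ratio noted earlier as a common theme in omics data.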
Databases for Omics studies
Reflect knowledge accumulation over time
Refinement of gene boundaries; novel exons; how proteins interact with each other; how DNA and proteins interact.
Some examples:
RefSeq (http://www.ncbi.nlm.nih.gov/refseq/): contains (experimentally validated) reference sequences, transcripts and proteins
BioGRID (http://thebiogrid.org/): database of interactions (mostly protein-protein and DNA-protein)
KEGG (http://www.genome.jp/kegg/): pathway and interaction databases
GEO (http://www.ncbi.nlm.nih.gov/geo/): publicly available high-throughput datasets
Some Useful References
Ridley M. (1999). Genome: The Autobiography of a Species in 23 Chapters.
Ridley M. (2003). Nature via Nurture: Genes, Experience, & What Makes Us Human.
Ryan F. (2015). The Mysterious World of the Human Genome.