Outline to SNP bioinformatics lecture • Brief introduction • SNPs in cell biology • SNP discovery • SNP assessment • SNP databases • SNPs in genome browsers

Dec 18, 2015

Outline to SNP bioinformatics lecture

• Brief introduction

• SNPs in cell biology

• SNP discovery

• SNP assessment

• SNP databases

• SNPs in genome browsers

Single Nucleotide Polymorphisms

• Must be present in at least 1% of the population

• Most (90%) of the sequence variation between two genomes

• Two humans differ by about 0.1%

• About 1 SNP per 300 bp in the human genome

– Lower in coding regions

• 10 million in the human genome
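
The figures above are consistent with each other, as a back-of-the-envelope check suggests. The sketch below assumes a haploid genome size of roughly 3 billion bp, a number that is not stated on the slide; the function names are illustrative only.

```python
# Back-of-the-envelope check of the numbers on this slide.
# ASSUMPTION: haploid human genome of about 3 billion bp (not given on the slide).
GENOME_SIZE_BP = 3_000_000_000


def expected_snp_count(genome_size_bp: int, bp_per_snp: int) -> int:
    """Number of SNP sites if one occurs every `bp_per_snp` bases on average."""
    return genome_size_bp // bp_per_snp


def pairwise_differences(genome_size_bp: int, fraction_different: float) -> int:
    """Number of positions at which two genomes differ, given the fraction that differs."""
    return round(genome_size_bp * fraction_different)


snps = expected_snp_count(GENOME_SIZE_BP, 300)        # 1 SNP per 300 bp
diffs = pairwise_differences(GENOME_SIZE_BP, 0.001)   # two humans differ by 0.1%
print(f"{snps:,} SNP sites; {diffs:,} differences between two genomes")
```

Under that assumption, one SNP per 300 bp gives about 10 million SNP sites, and a 0.1% difference gives about 3 million differing positions between two individuals.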

Categories of SNPs

• Missense/Non-synonymous

– Changes an amino acid

– About half of the SNPs in coding sequence

– Can alter the function and/or structure of the protein

– Cause of most monogenic diseases, e.g. hemochromatosis (HFE), cystic fibrosis (CFTR), hemophilia (F8)

• Nonsense

– Introduces a stop codon

– Same consequences as non-synonymous

Categories of SNPs

• Synonymous

– Does not alter the amino acid sequence

– May alter splicing

• Non-coding

– Can be located in promoter or regulatory regions

– Can impact the expression of the gene

• All SNPs can be used as markers
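
The coding categories above can be told apart by translating the reference and variant codons. The following is a minimal sketch using the standard genetic code; the function name and the extra "stop-lost" label are illustrative choices that do not come from the slides.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-codon change as synonymous, missense, or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid (or stop codon to stop codon)
    if alt_aa == "*":
        return "nonsense"     # introduces a stop codon
    if ref_aa == "*":
        return "stop-lost"    # hypothetical extra label, not a category on the slide
    return "missense"         # changes the amino acid


print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
print(classify_coding_snp("CTT", "CTC"))  # Leu -> Leu
print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```

A synonymous call here only says that the protein sequence is unchanged; as noted above, such a SNP may still affect splicing.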

Uses for the cell biologist

• Association studies

– Use SNPs as markers to find regions associated with phenotype

• Causative SNPs

– Altered protein

– Altered expression

• Regions of altered conservation between strains/species/individuals

• Evolutionary analyses

• Etc…
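
As an illustration of the association-study idea, one common way to test whether a marker SNP is associated with a phenotype is to compare allele counts in cases and controls. The sketch below computes a Pearson chi-square statistic for a 2x2 allele-count table; it is one possible test, not a method named on the slides, and the function and argument names are assumptions.

```python
def allelic_chi_square(case_a: int, case_b: int, control_a: int, control_b: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table.

    Rows are cases and controls; columns are counts of allele A and allele B.
    """
    n = case_a + case_b + control_a + control_b
    row_cases = case_a + case_b
    row_controls = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    denominator = row_cases * row_controls * col_a * col_b
    if denominator == 0:
        raise ValueError("each row and each column needs at least one observation")
    return n * (case_a * control_b - case_b * control_a) ** 2 / denominator


# Identical allele frequencies in cases and controls: no evidence of association.
print(allelic_chi_square(10, 10, 10, 10))
# Allele A seen only in cases, allele B only in controls: strong association.
print(allelic_chi_square(20, 0, 0, 20))
```

With one degree of freedom, a statistic above about 3.84 corresponds to p < 0.05 for a single test; a genome-wide scan would need a much stricter threshold to account for the number of SNPs tested.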

SNP discovery

• SNPs are usually discovered from sequencing data

• Discovery is based on separating sequencing errors from 'real' differences and assessing the frequency in the sequenced population

• Separation of paralogous sequences

• Validation, genotyping
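
The step of separating sequencing errors from real differences can be illustrated with a simple binomial model: if every read has a fixed chance of showing a wrong base, how likely is it that errors alone produce the observed number of variant reads? This is a deliberately simplified sketch and not the algorithm of any particular tool; the error rate and the significance cutoff are assumed values.

```python
from math import comb


def prob_errors_only(depth: int, alt_reads: int, error_rate: float) -> float:
    """P(X >= alt_reads) for X ~ Binomial(depth, error_rate).

    This is the chance of seeing at least `alt_reads` variant reads
    if sequencing errors are the only source of mismatches.
    """
    return sum(
        comb(depth, k) * error_rate**k * (1 - error_rate) ** (depth - k)
        for k in range(alt_reads, depth + 1)
    )


def looks_like_real_variant(depth: int, alt_reads: int,
                            error_rate: float = 0.01,   # ASSUMED per-base error rate
                            alpha: float = 1e-3) -> bool:  # ASSUMED cutoff
    """Flag a site as a candidate SNP when errors alone are an unlikely explanation."""
    return prob_errors_only(depth, alt_reads, error_rate) < alpha


print(looks_like_real_variant(depth=20, alt_reads=8))  # many variant reads
print(looks_like_real_variant(depth=20, alt_reads=1))  # a single variant read
```

A real pipeline would also use per-base quality values, check for paralogous sequences, and confirm candidates by genotyping, as the bullets above indicate.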

SNP discovery resources

• PolyBayes

– SNP discovery in redundant sequences

• PolyPhred

– SNP discovery based on phred/phrap/consed

• NovoSNP

– Graphical identification of SNPs

Example: PolyPhred

• Detects heterozygotes from chromatograms

• Runs together with phred/phrap/consed

• Command line

SNP assessment

• Assess SNPs for functional effects

– Non-synonymous SNPs

• Conservation across species

• Amino acid properties

• Protein structure

• Transmembrane regions, signal peptides etc.

SNP assessment resources

• SIFT

• PolyPhen

• Pmut

• SNPs3D

• PANTHER PSEC

• TopoSNP

• MAPP

• Etc

Example: SIFT

• Sorting Intolerant From Tolerant

• Builds an alignment of similar sequences

• Calculates a score based on the amino acids in the alignment

• Takes the environment into account

• Takes the properties of the amino acids into account

• Does not use structure
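
The core idea of scoring a substitution from an alignment can be sketched as follows: count which amino acids occur in the aligned column and scale each count by the most common one, so that residues never seen at a conserved position score low. This is a strongly simplified illustration, not SIFT's actual scoring; it ignores the sequence environment and amino acid properties mentioned above, and all names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_scores(alignment: list[str], position: int,
                  pseudocount: float = 0.0) -> dict[str, float]:
    """Score every amino acid at one alignment column, scaled so the most
    frequent residue scores 1.0. Gaps and unknown characters are skipped."""
    column = [seq[position] for seq in alignment if seq[position] in AMINO_ACIDS]
    if not column:
        raise ValueError("no amino acids observed at this position")
    counts = Counter(column)
    weights = {aa: counts[aa] + pseudocount for aa in AMINO_ACIDS}
    top = max(weights.values())
    return {aa: weight / top for aa, weight in weights.items()}


# Toy alignment of four homologous sequences (made-up data).
alignment = ["MKV", "MKV", "MRV", "MKI"]
scores = column_scores(alignment, position=1)
print(scores["K"], scores["R"], scores["D"])
```

In this toy column, K is the consensus, R is observed once, and D is never observed, so a substitution to D would be the least tolerated under this simplified scheme.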

SNP databases

• Maps of SNPs in human, mouse, etc

• Haplotype maps

• Functional SNPs

• Disease databases

SNP databases

• dbSNP

• F-SNP

• HGVbase

• PolyDoms

• OMIM

• Etc…

Example: dbSNP

• 50 million submissions

• 18 million clusters

• 7 million in genes

• 44 organisms

• 91 million SNPs submitted

dbSNP

• Search for SNPs, location, etc

• Information submitted on method, flanking sequence, alleles, population, sample size, validation etc

• Information computed on SNPs at same location including functional analysis, population diversity etc
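
Besides the web interface, dbSNP can be searched programmatically. One route, which the slides do not mention and is shown here only as an assumption about how a reader might do it, is NCBI's E-utilities ESearch service with `db=snp`. The sketch below only builds the request URL and makes no network call; the helper name and the example search term are illustrative.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def dbsnp_search_url(term: str, max_records: int = 20) -> str:
    """Build an ESearch URL that queries the dbSNP database ("snp")."""
    params = {
        "db": "snp",            # Entrez database name for dbSNP
        "term": term,           # free-text search term
        "retmax": max_records,  # maximum number of record IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = dbsnp_search_url("CFTR", max_records=5)
print(url)
```

Fetching the URL returns a list of record identifiers; the details for each record would have to be retrieved in a separate request.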

SNPs in genome browsers

• Ensembl

• UCSC

Example: UCSC

HapMap

• Aim: a haplotype map of the human genome describing common patterns of sequence variation

• A haplotype map is based on the observation that alleles of SNPs lying close together tend to be inherited together

• HapMap will identify which SNPs are informative in mapping, reducing the number of SNPs to genotype by an order of magnitude

• Populations from Asia, Europe and Africa

• 2nd generation map with over 3.1 million SNPs
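
The reason a haplotype map reduces genotyping effort is that SNPs inherited together carry redundant information. A common measure of this redundancy is the squared correlation r² between the alleles at two SNPs. The sketch below computes it from phased haplotypes; it assumes both SNPs are biallelic, uses made-up data, and is not taken from the HapMap project's own software.

```python
def r_squared(haplotypes: list[tuple[str, str]]) -> float:
    """Linkage disequilibrium r^2 between two biallelic SNPs.

    `haplotypes` holds one (allele at SNP 1, allele at SNP 2) pair
    per phased chromosome.
    """
    n = len(haplotypes)
    a, b = haplotypes[0]  # pick one allele per SNP as the reference allele
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h[0] == a and h[1] == b for h in haplotypes) / n
    d = p_ab - p_a * p_b  # deviation from independence
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denominator


# Alleles always travel together: genotyping one SNP predicts the other.
linked = [("A", "G"), ("A", "G"), ("C", "T"), ("C", "T")]
# All four combinations equally common: the SNPs are independent.
unlinked = [("A", "G"), ("A", "T"), ("C", "G"), ("C", "T")]
print(r_squared(linked), r_squared(unlinked))
```

When r² between two SNPs is close to 1, only one of them needs to be genotyped, which is the basis for choosing a smaller set of informative SNPs.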

Ng PC, Henikoff S. Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet. 2006;7:61-80.

Bhatti P, Church DM, Rutter JL, Struewing JP, Sigurdson AJ. Candidate single nucleotide polymorphism selection using publicly available tools: a guide for epidemiologists. Am J Epidemiol. 2006;164(8):794-804.

Clifford RJ, Edmonson MN, Nguyen C, Scherpbier T, Hu Y, Buetow KH. Bioinformatics tools for single nucleotide polymorphism discovery and analysis. Ann N Y Acad Sci. 2004;1020:101-9.

The International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449:851-861.