Dr. Metsada Pasmanik-Chor, Bioinformatics Unit, Tel Aviv University


subPSEC (substitution position-specific evolutionary conservation) estimates the likelihood that a substitution has a functional effect. Values range from 0 to -10, with -10 most likely to be deleterious. -3 is the previously identified cutoff point for functional significance.

Substitution: D538G
subPSEC: -3.96843
Pdeleterious: 0.72481 (anything above 0.5 is considered deleterious)
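The link between the subPSEC score and Pdeleterious for D538G can be sketched in a few lines. This assumes the logistic form published for PANTHER cSNP scoring, Pdeleterious = 1 / (1 + exp(subPSEC + 3)), which gives exactly 0.5 at the -3 cutoff. The function name is illustrative and is not part of any PANTHER interface.

```python
import math


def p_deleterious(sub_psec: float, cutoff: float = -3.0) -> float:
    """Convert a subPSEC score into a probability of a deleterious effect.

    Assumes the logistic form 1 / (1 + exp(subPSEC - cutoff)), so a score
    equal to the cutoff (-3) maps to exactly 0.5.
    """
    return 1.0 / (1.0 + math.exp(sub_psec - cutoff))


# ESR1 D538G from the slide: subPSEC = -3.96843
print(f"Pdeleterious = {p_deleterious(-3.96843):.5f}")
```

Under this assumption the D538G score lands above the 0.5 threshold and agrees with the slide's 0.72481 to the reported precision.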

EVOLUTIONARY ANALYSIS OF CODING SNPS

http://www.pantherdb.org/tools/csnpScoreForm.jsp

ESR1_HUMAN: D538G


http://mutationassessor.org/

ESR1_HUMAN: D538G


• 11 possible candidate SNPs were selected for their potential relevance to breast cancer.

• rs2747648, which resides in a predicted binding site for 3 miRNAs in the estrogen receptor-α (ESR1) gene, was associated with a 27% reduction in breast cancer risk in premenopausal women.

• When the C allele is present, miR-453 binds with greater affinity to ESR1, thus leading to decreased levels of ERα protein. Postmenopausal women already have reduced levels of endogenous estrogen, perhaps explaining why this SNP is relevant only in premenopausal women.

• Would carriers of the ancestral T allele respond better to endocrine therapy, given that they will naturally express increased levels of the receptor?

References:

Tchatchou, S. et al. A variant affecting a putative miRNA target site in estrogen receptor (ESR) 1 is associated with breast cancer risk in premenopausal women. Carcinogenesis 30, 59–64 (2009).

Adams, B. D., Furneaux, H. & White, B. A. The micro-ribonucleic acid (miRNA) miR-206 targets the human estrogen receptor-α (ERα) and represses ERα messenger RNA and protein expression in breast cancer cell lines. Mol. Endocrinol. 21, 1132–1147 (2007).


SNPs in miRNA Binding Sites

http://www.genemania.org/




Before you design your own primers – Don't reinvent the wheel!

Essential Bioinformatics Resources for Designing PCR Primers for Various Applications: http://www.humgen.nl/primer_design.html




1. Use NCBI Gene or UCSC genome browser to find gene variants:

• Transcript variants
• Alternative isoforms
• Exon-intron boundaries
• Pseudogenes

2. Gene conservation considerations

3. SNPs: There are approximately 56 million SNPs in the human genome; 16 million are in gene introns and exons, and most are silent mutations. Are we aiming at these locations?

jPCR: http://primerdigital.com/tools/soft.html

Basic considerations before designing primers


Primer length determines the specificity and affects annealing to the template:
Short primer => low specificity, non-specific amplification
Long primer => decreased binding efficiency at normal annealing temperature (due to high probability of forming secondary structures such as hairpins).

Primer design and primer characteristics

• Primer length: 18-24 bp, complete sequence identity to template
• G/C content: 40-60%
• Avoid mismatches at the 3' end
• The presence of G or C bases within the last five bases from the 3' end of primers (GC clamp) helps promote specific binding at the 3' end. Avoid 3 or more G or C at the 3' end because of the high primer-dimer probability
• Avoid a 3' end T
• Always have a reference gene (GAPDH, actin, RPLP0 (large ribosomal protein)) run together with your query genes
• Optimal amplicon size: 100-1000 bp
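The sequence-based rules in this list can be turned into a quick screening function. The sketch below is only an illustration: the function name and result keys are made up here, and reading "3 or more G or C at the 3' end" as "the last three bases are all G/C" is an assumption about how the rule is meant.

```python
def check_primer(primer: str) -> dict:
    """Check a primer (written 5'->3') against the rules of thumb in the list."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer sequence is empty")
    gc_percent = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    gc_in_last_five = sum(base in "GC" for base in seq[-5:])
    return {
        "length_ok": 18 <= len(seq) <= 24,
        "gc_content_ok": 40.0 <= gc_percent <= 60.0,
        # GC clamp: at least one G or C within the last five 3' bases
        "gc_clamp": gc_in_last_five >= 1,
        # assumed reading of the rule: last three bases must not all be G/C
        "no_gc_run_at_3prime": not all(base in "GC" for base in seq[-3:]),
        "no_3prime_T": not seq.endswith("T"),
    }


# First 20 bases of the back-translated ESR1 sequence used elsewhere in the slides
for rule, passed in check_primer("ATGACCATGACCCTGCACAC").items():
    print(rule, passed)
```

A primer that fails one of these checks is not necessarily unusable; the checks only flag candidates worth a second look in a dedicated design tool.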

http://www.sciencedirect.com/science/article/pii/S0888754311001066


Primer design: Melting temperature (Tm)

• Tm is the temperature at which 50% of the DNA duplex dissociates to become single stranded
• Determined by primer length, base composition and concentration
• Affected by the salt concentration of the PCR reaction mix
• Optimal melting temperature: 52°C - 60°C
• Tm above 65°C may cause secondary annealing; a higher Tm (75°C - 80°C) is recommended for amplifying high GC content targets

Primer pair Tm mismatch

• Significant primer pair Tm mismatch can lead to poor amplification (desirable Tm difference < 5°C between primer pairs)
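A rough way to compare the Tm of two primers is sketched below. The slides do not say which Tm formula to use, so this assumes the simple Wallace rule, Tm = 2 x (A+T) + 4 x (G+C), which ignores primer and salt concentration and is only a crude estimate for short oligos. Design tools generally rely on more detailed thermodynamic models, so treat the numbers as indicative.

```python
def wallace_tm(primer: str) -> float:
    """Crude Tm estimate in degrees C using the Wallace rule (an assumption here)."""
    seq = primer.upper()
    at_count = sum(base in "AT" for base in seq)
    gc_count = sum(base in "GC" for base in seq)
    return 2.0 * at_count + 4.0 * gc_count


def tm_pair_ok(forward: str, reverse: str, max_diff: float = 5.0) -> bool:
    """True when the two primers differ in estimated Tm by less than max_diff."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) < max_diff


forward = "ATGACCATGACCCTGCACAC"   # illustrative primer sequences
reverse = "CACGGTGGCGGGGAAGCCCT"
print(wallace_tm(forward), wallace_tm(reverse), tm_pair_ok(forward, reverse))
```

The 5°C default mirrors the desirable Tm difference quoted on the slide.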


Primer design: Annealing temperature

Ta (Annealing temperature) vs. Tm

• Ta is determined by the Tm of both primers and amplicons:
  optimal Ta = 0.3 x Tm(primer) + 0.7 x Tm(product) - 25
• General rule: Ta is 5°C lower than Tm
• Higher Ta enhances specific amplification but may lower yields
• Crucial in detecting polymorphisms
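The two ways of choosing Ta given on this slide can be compared numerically. The sketch below uses the coefficients exactly as written on the slide; the constant is exposed as a parameter because published versions of this formula may use a different value. Function names and the example temperatures are illustrative only.

```python
def optimal_ta(tm_primer: float, tm_product: float, offset: float = 25.0) -> float:
    """Optimal Ta = 0.3 x Tm(primer) + 0.7 x Tm(product) - offset (slide values)."""
    return 0.3 * tm_primer + 0.7 * tm_product - offset


def rule_of_thumb_ta(tm_primer: float, delta: float = 5.0) -> float:
    """General rule from the slide: Ta is about 5 degrees C below the primer Tm."""
    return tm_primer - delta


# Hypothetical inputs: primer Tm of 60 C, product Tm of 85 C
print(optimal_ta(60.0, 85.0))      # formula-based estimate
print(rule_of_thumb_ta(60.0))      # rule-of-thumb estimate
```

When the two estimates differ noticeably, a temperature gradient PCR is the usual way to settle on a working Ta.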


Primer design: Specificity and cross homology

• Specificity: Determined primarily by primer length and sequence
• Cross homology: May become a problem when the PCR template is DNA with highly repetitive sequences
• Avoid non-specific amplification: BLAST PCR primers against the NCBI non-redundant sequence database


Primer design: Avoid secondary structures

• Hairpins: formed via intra-molecular interactions; they negatively affect primer-template binding, leading to poor or no amplification
• Self-Dimer (homodimer): formed by inter-molecular interactions between two copies of the same primer
• Cross-Dimer (heterodimer): formed by inter-molecular interactions between the sense and antisense primers
• Avoid Template Secondary Structure
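Self-dimers and cross-dimers both come down to one primer containing a stretch that is reverse-complementary to a stretch of the other (or of itself). A minimal sketch of that idea is shown below: it reports the longest perfectly base-paired run between two primers. It is a simplification that ignores mismatches, thermodynamics and hairpin loops, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(primer_a: str, primer_b: str) -> int:
    """Length of the longest stretch of primer_a that can pair perfectly,
    in antiparallel orientation, with a stretch of primer_b."""
    a = primer_a.upper()
    b_rc = reverse_complement(primer_b)
    best = 0
    # Longest common substring of a and revcomp(b), by dynamic programming
    previous = [0] * (len(b_rc) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b_rc) + 1)
        for j in range(1, len(b_rc) + 1):
            if a[i - 1] == b_rc[j - 1]:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


# Self-dimer check: pass the same primer twice
print(longest_complementary_run("GAATTC", "GAATTC"))
```

Passing the same primer twice screens for self-dimers; passing the sense and antisense primers screens for cross-dimers. What run length counts as a problem is a judgment call that this sketch does not make.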


Web Site: http://bioinfo.ut.ee/primer3-0.4.0/primer3/input.htm


Web Site: http://primer3plus.com/cgi-bin/dev/primer3plus.cgi




Web Site: http://genepipe.ngc.sinica.edu.tw/primerz/beginDesign.do

SNP primers:


Design specific primers for each transcript:


http://www4a.biotec.or.th/rexprimer2/Genotyping

SNPs

Copy number variation and InDels


http://www4a.biotec.or.th/rexprimer2/OligoChecking


Primer Design Tools for Degenerate PCR: CODEHOP

Web Site: http://blocks.fhcrc.org/codehop.html

More Info: http://www.hsls.pitt.edu/guides/genetics/obrc/dna/pcr_oligos/URL1118954832/info

Name: CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design
Type: Web-based software
Key Functions: Design degenerate PCR primers based on multiple protein sequence alignments
Publication Info: Nucleic Acids Research 2003
Times Cited: 37
Pros: Widely cited with many successful applications; settings for genetic code and codon usage
Cons: Requires a local multiple alignment as input, which must be in Blocks Database format
Note: In OBRC
YiBu's Rating: 4 out of 5



Cross hybridization and specificity of primers


http://www.ncbi.nlm.nih.gov/tools/primer-blast/


Resources for PCR Primer Specificity Analysis: NCBI BLAST


Primer specificity and Mapping: The UCSC In-Silico PCR

http://genome.csdb.cn/cgi-bin/hgPcr



PCR reaction setup calculators

http://primerdigital.com/tools/ReactionMixture.html



http://www.ncbi.nlm.nih.gov/probe

ESR1 human

Public PCR Primers/Oligo Probes Repository: The NCBI Probe Database


Resources for real time PCR: RTPrimerDB

Web Site: http://www.rtprimerdb.org/

Shows pre-calculated primers on all gene transcripts !


Web Site: http://pga.mgh.harvard.edu/primerbank/index.html

More Info: http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=14654707




http://primerdepot.nci.nih.gov/



http://eu.idtdna.com/pages/scitools



Dilution Calculator: Takes an oligo stock solution of higher concentration and determines how much volume to dilute down to the final (desired) lower concentration. Input of the volumes of the stock solution (Start Volume) and the diluted solution (End Volume) is not required, but recommended.
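The arithmetic behind such a dilution is the standard relation C1 x V1 = C2 x V2. The sketch below shows it under that assumption; it is not the IDT calculator's own code, and the function name and example values are illustrative.

```python
def stock_volume_needed(stock_conc: float, final_conc: float, final_volume: float) -> float:
    """Volume of stock to use, from C1 x V1 = C2 x V2.

    Concentrations must share one unit (e.g. uM); the result is in the
    same unit as final_volume (e.g. uL).
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("cannot dilute to a concentration above the stock")
    return final_conc * final_volume / stock_conc


# Hypothetical example: 100 uM primer stock diluted to 10 uM in 50 uL
stock_ul = stock_volume_needed(100.0, 10.0, 50.0)
print(f"stock: {stock_ul} uL, diluent: {50.0 - stock_ul} uL")
```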

http://eu.idtdna.com/calc/dilution/



http://www.frontiersin.org/Journal/10.3389/fendo.2011.00008/full

http://gtbinf.wordpress.com/2012/11/29/exome-sequence-analysis-group-1/

Exome Analysis

Identify genetic disease causes: sequence the coding regions (1-2% of the human genome, ~30 Mb) of patients and healthy individuals to find the genomic cause of disease.



http://www.ebi.ac.uk/Tools/st/emboss_backtranseq/

>A8KAF4_HUMAN A8KAF4 Estrogen receptor OS=Homo sapiens PE=2 SV=1
ATGACCATGACCCTGCACACCAAGGCCAGCGGCATGGCCCTGCTGCACCAGATCCAGGGC
AACGAGCTGGAGCCCCTGAACAGGCCCCAGCTGAAGATCCCCCTGGAGAGGCCCCTGGGC
GAGGTGTACCTGGACAGCAGCAAGCCCGCCGTGTACAACTACCCCGAGGGCGCCGCCTAC
GAGTTCAACGCCGCCGCCGCCGCCAACGCCCAGGTGTACGGCCAGACCGGCCTGCCCTAC
GGCCCCGGCAGCGAGGCCGCCGCCTTCGGCAGCAACGGCCTGGGCGGCTTCCCCCCCCTG
AACAGCGTGAGCCCCAGCCCCCTGATGCTGCTGCACCCCCCCCCCCAGCTGAGCCCCTTC
CTGCAGCCCCACGGCCAGCAGGTGCCCTACTACCTGGAGAACGAGCCCAGCGGCTACACC
GTGAGGGAGGCCGGCCCCCCCGCCTTCTACAGGCCCAACAGCGACAACAGGAGGCAGGGC
GGCAGGGAGAGGCTGGCCAGCACCAACGACAAGGGCAGCATGGCCATGGAGAGCGCCAAG
GAGACCAGGTACTGCGCCGTGTGCAACGACTACGCCAGCGGCTACCACTACGGCGTGTGG
AGCTGCGAGGGCTGCAAGGCCTTCTTCAAGAGGAGCATCCAGGGCCACAACGACTACATG
TGCCCCGCCACCAACCAGTGCACCATCGACAAGAACAGGAGGAAGAGCTGCCAGGCCTGC
AGGCTGAGGAAGTGCTACGAGGTGGGCATGATGAAGGGCATCAGGAAGGACAGGAGGGGC
GGCAGGATGCTGAAGCACAAGAGGCAGAGGGACGACGGCGAGGGCAGGGGCGAGGTGGGC
AGCGCCGGCGACATGAGGGCCGCCAACCTGTGGCCCAGCCCCCTGATGATCAAGAGGAGC
AAGAAGAACAGCCTGGCCCTGAGCCTGACCGCCGACCAGATGGTGAGCGCCCTGCTGGAC
GCCGAGCCCCCCATCCTGTACCCCGAGTACGACCCCACCAGGCCCTTCAGCGAGGCCAGC
ATGATGGGCCTGCTGACCAACCTGGCCGACAGGGAGCTGGTGCACATGATCAACTGGGCC
AAGAGGGTGCCCGGCTTCGTGGACCTGACCCTGCACGACCAGGTGCACCTGCTGGAGTGC
GCCTGGCTGGAGATCCTGATGATCGGCCTGGTGTGGAGGAGCATGGAGCACCCCGGCAAG
CTGCTGTTCGCCCCCAACCTGCTGCTGGACAGGAACCAGGGCAAGTGCGTGGAGGGCATG
GTGGAGATCTTCGACATGCTGCTGGCCACCAGCAGCAGGTTCAGGATGATGAACCTGCAG
GGCGAGGAGTTCGTGTGCCTGAAGAGCATCATCCTGCTGAACAGCGGCGTGTACACCTTC
CTGAGCAGCACCCTGAAGAGCCTGGAGGAGAAGGACCACATCCACAGGGTGCTGGACAAG
ATCACCGACACCCTGATCCACCTGATGGCCAAGGCCGGCCTGACCCTGCAGCAGCAGCAC
CAGAGGCTGGCCCAGCTGCTGCTGATCCTGAGCCACATCAGGCACATGAGCAACAAGGGC
ATGGAGCACCTGTACAGCATGAAGTGCAAGAACGTGGTGCCCCTGTACGACCTGCTGCTG
GAGATGCTGGACGCCCACAGGCTGCACGCCCCCACCAGCAGGGGCGGCGCCAGCGTGGAG
GAGACCGACCAGAGCCACCTGGCCACCGCCGGCAGCACCAGCAGCCACAGCCTGCAGAAG
TACTACATCACCGGCGAGGCCGAGGGCTTCCCCGCCACCGTG

http://www.ebi.ac.uk/Tools/st/emboss_transeq/

6 frames translation
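What a six-frame translation produces can be sketched in a few lines: three reading frames on the given strand and three on its reverse complement. The code below is an independent illustration using the standard genetic code, not the EMBOSS transeq implementation; function names are made up here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq: str) -> list:
    """Frames +1, +2, +3 on the given strand, then -1, -2, -3 on the reverse complement."""
    forward = seq.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:]) for strand in (forward, reverse) for offset in range(3)]


# First 12 bases of the back-translated estrogen receptor sequence
for frame in six_frames("ATGACCATGACC"):
    print(frame)
```

Stop codons are shown as `*`, so a frame interrupted by many stops is unlikely to be the coding frame.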


http://www.bioinformatics.org/sms2/index.html

Format Conversion tools:

• Reverse and/or Complement of DNA sequences (http://www.bioinformatics.org/sms2/rev_comp.html)
• Split FASTA: divides FASTA sequence records into smaller FASTA sequences of the size you specify (http://www.bioinformatics.org/sms2/split_fasta.html)

Sequence Analysis:

• DNA Pattern Find: accepts one or more sequences along with a search pattern and returns the number and positions of sites that match the pattern (http://www.bioinformatics.org/sms2/dna_pattern.html)
• PCR Primer Stats: accepts a list of PCR primer sequences and returns a report describing the properties of each primer, including melting temperature, percent GC content, and PCR suitability (http://www.bioinformatics.org/sms2/pcr_primer_stats.html)
• PCR Products: accepts one or more DNA sequence templates and two primer sequences. The program searches for perfectly matching primer annealing sites that can generate a PCR product. Any resulting products are sorted by size, and they are given a title specifying their length, their position in the original sequence, and the primers that produced them (http://www.bioinformatics.org/sms2/pcr_products.html)
• Reverse Translate (http://www.bioinformatics.org/sms2/rev_trans.html)
• Translate (http://www.bioinformatics.org/sms2/translate.html)
• Primer Map: accepts a DNA sequence and returns a textual map showing the annealing positions of PCR primers (http://www.bioinformatics.org/sms2/primer_map.html)
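The idea behind the PCR Products tool (perfectly matching annealing sites, products sorted by size) can be illustrated with a short sketch. This is not the Sequence Manipulation Suite code: it only looks for the forward primer on the given strand and the reverse primer on the opposite strand, and the function names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_all(text: str, pattern: str) -> list:
    """Start positions of every (possibly overlapping) exact match."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def pcr_products(template: str, forward: str, reverse: str) -> list:
    """Return (start, end, sequence) for each perfect-match product, shortest first."""
    template = template.upper()
    fwd_site = forward.upper()
    rev_site = reverse_complement(reverse)  # how the reverse primer appears on the top strand
    products = []
    for start in find_all(template, fwd_site):
        for site in find_all(template, rev_site):
            end = site + len(rev_site)
            if site >= start:  # reverse site must not lie upstream of the forward site
                products.append((start, end, template[start:end]))
    return sorted(products, key=lambda item: len(item[2]))


# Toy template and primers, made up for illustration
template = "TTGATTACAGGGGCCCCATCGATCGTT"
for start, end, seq in pcr_products(template, "GATTACA", "CGATCGAT"):
    print(start, end, len(seq), seq)
```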

Resources for PCR Primer Mapping/Amplicon Size



http://www.cmbi.ru.nl/cdd/biovenn/

x total: 127, x only: 62, x-y total overlap
y total: 628, y only: 566, x-z total overlap
z total: 0, z only: 0, y-z total overlap

Comparing gene-lists

Venny

http://bioinfogp.cnb.csic.es/tools/venny/
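The counts such Venn tools report (total, unique and overlapping entries per list) follow from plain set operations. The sketch below is a generic illustration, not the method used by BioVenn or Venny; the function name, result keys and example gene symbols are chosen here for illustration.

```python
def compare_gene_lists(list_x, list_y) -> dict:
    """Summarise two gene lists: totals, list-specific genes and the overlap.

    Symbols are upper-cased and de-duplicated first, so repeated or
    differently-cased entries are counted once.
    """
    set_x = {gene.strip().upper() for gene in list_x if gene.strip()}
    set_y = {gene.strip().upper() for gene in list_y if gene.strip()}
    return {
        "x_total": len(set_x),
        "y_total": len(set_y),
        "x_only": sorted(set_x - set_y),
        "y_only": sorted(set_y - set_x),
        "overlap": sorted(set_x & set_y),
    }


summary = compare_gene_lists(["ESR1", "GAPDH", "esr1"], ["GAPDH", "ACTB"])
for key, value in summary.items():
    print(key, value)
```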


Microarray Experiments

Probes for genes are located on the chip. Hybridization of mRNA to the probes on the chip is performed and results are recorded.

Various platforms !

Next generation sequencing bypasses the rate-limiting step of conventional DNA sequencing (separating randomly terminated DNA polymers by gel electrophoresis) by physically arraying DNA molecules on solid surfaces and determining the DNA sequence in situ, without the need for gel separation.

Anchor DNA single molecule to solid surface

Amplify template by in situ PCR

Add 4 color-labeled reversible terminators, polymerase, universal primer

Reverse the termination and repeat 1…100 times; the number of cycles determines the length of the sequence.

Remove unincorporated nucleotides

Detect with laser

http://molonc.bccrc.ca/?page_id=191

Next Generation Sequencing

In both technologies, the great advantage is achieved by novel biotechnologies for producing high-throughput data! However, both have pros and cons…

Microarray and Next Generation Sequencing Technologies


Arrays

Pros:
• relatively cheap
• mature biotechnology and analysis tools (since the late 90's)
• fixed probes, no heterogeneity of coverage
• highly reproducible

Cons:
• detection of only known transcripts
• limited to sequenced organisms, no de-novo
• higher background
• low expressed genes are less accurately detected

NGS

Pros:
• very sensitive if sufficient sequence depth
• direct read-out of all transcripts
• paired-end reads, better accuracy
• de-novo sequencing, new genomes
• highly reproducible
• new and exciting

Cons:
• still expensive
• technical bias in mRNA library preparation and in transcripts of different length
• immature bioinformatics tools
• de-novo analysis is tricky, ambiguity in mapping reads to the genome
• very high coverage is needed for low expressed genes
• variable sequence coverage for different genomic regions

In both, consistent biological interpretation!

Marioni J C et al. Genome Res. 2008;18:1509-1517. Copyright © 2008, Cold Spring Harbor Laboratory Press

http://cage.unl.edu/RNASEQ_Transcriptomics.pdf

Consistent Biological Interpretation ?


NGS is becoming the technology of choice for a wide range of applications, but the transition away from microarrays will still take a long time.

Different applications have different requirements, so researchers need to carefully weigh their options when choosing a platform.


http://www.genengnews.com/gen-articles/next-generation-sequencing-vs-microarrays/4689/


TAU Bioinformatics unit: who are we and what do we do ?

http://www.tau.ac.il/lifesci/bioinformatics.html

Tel: 03-6406992

