Next Generation (Sequencing) Tools for Advanced Molecular Breeding

Peter Winter, Kamila Bokszczanin, Himabindu Kudapa, Alejandro Rodriguez Meisel, Nicolas Krezdorn, Ruth Jüngling, Rajeev Varshney, Guenter Kahl, Björn Rotter
GenXPro GmbH, Frankfurt am Main
www.genxpro.de

TranSNiPtomics: Genome-wide transcription profiles provided by NGS-based Massive Analysis of cDNA Ends (MACE) simultaneously identify allele-specific differential expression of root-trait-related drought-response genes in drought-tolerant and susceptible chickpea varieties

Next Generation (Sequencing) Tools for Advanced Molecular …ksiconnect.icrisat.org/wp-content/uploads/2013/10/... · 2013. 10. 21. · GenXPro: Our Service Portfolio Illumina PacBioHiseq2000

Jan 23, 2021

Page 2

GenXPro GmbH: Genome & Transcriptome Analysis Services

Transcriptome:
- Massive Analysis of cDNA Ends (MACE)
- Bacterial SuperSAGE
- Normalization of cDNA libraries (qualitative information)
- RNA-seq
- Small RNAs / microRNA in tissues, body fluids, exosomes
- Other non-coding RNAs, Degradome
- qPCR service

Genome:
- Whole-genome Sequencing
- Digital karyotyping (ST-DK), RC-seq, CNVs
- Methylation-specific DK (ST-MSDK), Meth-seq
- All Exome sequencing, Target Enrichment

Metagenome:
- COXI, 16S rRNA, others...

Bioinformatics:
- NGS Data Handling, Assembly, Quantification, BLAST
- Expression Data Interpretation, Gene Ontology

Page 3

GenXPro: Our Service Portfolio

Nucleotide-based information

                      Illumina HiSeq 2000    PacBio
Sequence length       2 x 150 bp             ~ 500-15,000 bp (!)
Throughput/subunit    30-60 Giga Bases       250 Mega Bases

Full service: Transcriptomics, Genomics, Genotyping, Epigenomics, Bioinformatics

• Patented technique for reduced representation analyses
• Method to eliminate PCR-copies from dataset
• No prior knowledge about NGS required, no hardware, no software, just samples…

Page 4

Hardware/Computers:

• 164 CPUs, 704 Gigabyte RAM

Assembly:

• different assembly programs available

Annotation:

• Novoalign, BLAST, SOAP, BLAT, Annovar

Enrichment Analysis:

• Gene Ontology, KEGG, BioCarta, GSEA etc.

How to handle Gigabytes of Data?

Bioinformatics NGS data management at GenXPro

Page 5

From poster of Manish Roorkiwal, ICRISAT, 2/7/2013

Drought is the major constraint to chickpea production:

A case for TranSNiPtomics

Application of NGS for Chickpea Breeding

TranSNiPtomics

Page 6

“TranSNiPtomics”: simultaneous analysis of gene expression AND polymorphism = allele-specific gene expression measurement

Requirements:
• Sufficient coverage to distinguish between sequencing errors and SNPs
• Accurate measurement of transcription levels

Advantages:
• Markers are located within genes and are therefore very likely connected to a specific trait
• Markers can be chosen from differentially expressed genes to increase the chance of involvement in the trait
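Why coverage matters for telling a SNP from a sequencing error can be illustrated with a simple binomial model. This is only a sketch and not the method used on the slides: the function name and the 1% per-base error rate are assumptions chosen for illustration, and every non-reference read is treated as a possible error.

```python
from math import comb

def prob_at_least_k_errors(depth: int, k: int, error_rate: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, error_rate): the chance that
    sequencing errors alone yield k or more variant reads at one position."""
    return sum(
        comb(depth, i) * error_rate**i * (1 - error_rate) ** (depth - i)
        for i in range(k, depth + 1)
    )

# Hypothetical numbers; the 1% error rate is an assumption, not from the slides.
low_cov = prob_at_least_k_errors(depth=5, k=2, error_rate=0.01)
high_cov = prob_at_least_k_errors(depth=100, k=40, error_rate=0.01)
print(f"depth 5, 2 variant reads:    p = {low_cov:.2e}")
print(f"depth 100, 40 variant reads: p = {high_cov:.2e}")
```

With the same 40% variant fraction, the low-coverage observation is far easier to explain by errors alone than the high-coverage one.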

Page 7

Transcriptomes: some frequent, many rare transcripts

[Figures: "Frequencies of transcript species" and "Total transcript distribution"]

• Less than 0.2 % of genes contribute more than 40% of all transcripts
• > 50% of transcripts are present in less than 10 copies

Differential gene expression results in large differences of transcript representation in all transcriptomes

Page 8

Measuring the Transcriptome: RNA-Seq

[Figure: cDNA of transcript A and cDNA of transcript B with RNA-Seq reads]

o Many reads per transcript
o Reads per transcript vary, depending on transcript length
o Quantification often difficult in non-model organisms
o Very deep sequencing required for short and low-abundant transcripts (e.g. transcription factors, receptors)

Page 9

Measuring the Transcriptome: MACE

Our solution = Massive Analysis of cDNA Ends (MACE): only the cDNA 3‘-ends (or 5‘-ends) are sequenced

• Reduced complexity, fewer variants, but:
• concentration on the most polymorphic region in a gene
• highly specific for good annotation!
• easy to quantify!
• high coverage for SNP detection!
• low costs!
• hundreds of genotypes can be analysed, e.g. mapping populations, at reasonable costs

Page 10

Massive Analysis of cDNA Ends (MACE): How it works

[Figure: cDNAs with poly(A)/poly(T) 3' ends; streptavidin beads]

Page 11

Massive Analysis of cDNA Ends (MACE): How it works

Fragmentation, washing

[Figure: fragments of 100-300 bp carrying the poly(A)/poly(T) 3' end; streptavidin beads]

Page 12

Massive Analysis of cDNA Ends (MACE): How it works

2nd generation sequencing of 50-100 bp

Page 13

Massive Analysis of cDNA Ends (MACE): How it works

Assembly & Counting

Page 14

Massive Analysis of cDNA Ends (MACE): How it works

Assembly & Counting (50-400 bp), then Counting, BLAST

[Figure: example tag counts per transcript: 4, 1, 1]

Only one fragment per transcript!
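Because each transcript contributes a single tag, the counting step can be sketched as a plain tally followed by scaling to tags per million (TPM, the unit used later in the deck). The function name and the toy tag list below are illustrative assumptions; the counts 4, 1, 1 are the example values from the slide.

```python
from collections import Counter

def tags_per_million(tag_assignments):
    """Count tags per transcript and scale the counts to tags per million."""
    counts = Counter(tag_assignments)
    total = sum(counts.values())
    return {name: n * 1_000_000 / total for name, n in counts.items()}

# Toy library: transcript labels are hypothetical.
tags = ["gene1"] * 4 + ["gene2"] + ["unknown"]
tpm = tags_per_million(tags)
for name, value in tpm.items():
    print(f"{name}: {value:.0f} TPM")
```

No length normalization is applied here, which is only reasonable under the one-tag-per-transcript assumption stated on the slide.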

Page 15

Bioinformatics: automated workflow for model and non-model organisms

[Workflow figure, labels: Tags; annotation / mapping (Gene 1, Gene 2, unknown); Assembly; BLASTX (Protein DBs); quantification (example counts 1, 1, 4); Enrichment analysis; WEB tool „MACE2GO“; data browser]

Page 16

RNA-Seq vs. MACE

[Figure: cDNA of transcript A and cDNA of transcript B, shown with RNA-Seq reads and with MACE reads]

RNA-Seq: many reads per transcript, and the number of reads per transcript varies!
MACE: one read = one transcript

For similar resolution, RNA-Seq requires about 20-30 times more sequencing*

*Asmann et al. 2009

Page 17

TranSNiPtomics - why MACE?

[Figure: two cDNAs with A/T and C/G variant positions and poly(A) tails, covered by RNA-Seq reads and by MACE reads]

RNA-Seq = high complexity. Reads distributed all over the transcript: SNPs with enough coverage: 0
MACE = reduced complexity. Concentration on the polymorphic 3‘ end: SNPs with enough coverage: 2

High coverage is needed to distinguish between SNP and error
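The coverage argument can be made concrete with back-of-the-envelope arithmetic: mean per-base coverage is roughly reads times read length divided by the length of the region the reads fall into. The sketch below assumes, purely for illustration, the same 200 reads for one gene in both methods, a 2,000 bp transcript, 100 bp reads and a 300 bp 3' window; edge effects are ignored.

```python
def mean_coverage(n_reads: int, read_len: int, target_len: int) -> float:
    """Approximate mean per-base coverage when reads fall uniformly on a target."""
    return n_reads * read_len / target_len

# All numbers are hypothetical and only illustrate the ratio.
rna_seq = mean_coverage(n_reads=200, read_len=100, target_len=2000)  # whole transcript
mace = mean_coverage(n_reads=200, read_len=100, target_len=300)      # 3' end window
print(f"RNA-Seq: ~{rna_seq:.0f}x, MACE: ~{mace:.0f}x")
```

Concentrating the same reads on a shorter window raises the depth at each SNP position in that window, at the price of seeing no SNPs outside it.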

Page 18

Coverage for SNP detection

MACE, 20 million reads
Wheat, nucleosome/chromatin assembly factor C; 160 TPM

Sufficient coverage for SNP detection!

Page 19

Coverage for SNP detection

RNA-Seq, 20 million reads, same position
Wheat, nucleosome/chromatin assembly factor C

Coverage too low!

Page 20

IGSTC-Project: IND 09/515

Biotechnological approaches to improve chickpea

crop productivity for farming community and industry

Dr Rajeev Varshney

ICRISAT, Patancheru,

India

Prof. Dr. Günter Kahl

Molecular BioSciences,

Frankfurt University

Germany

Dr. Manash Chatterjee

BenchBio Private Ltd

India

Dr. Peter Winter,

GenXPro GmbH,

Germany

Funding

Drought Research

Page 21

Aim:
• Understand the impact and mode of action of a major QTL for drought tolerance present in chickpea variety ICC4958 on drought tolerance in different genetic backgrounds
• Identify the genes underlying the QTL
• Produce qRT-PCR markers for transfer of the genes
• Produce transgenics containing the gene(s)

Page 22

Chickpea Drought Research: Genetic base of experiment

Modified from the poster of Manish Roorkiwal, ICRISAT, 2/7/2013

The genomic region around SSR marker TAA170 on chickpea linkage group 4 contains a major QTL for drought tolerance

[Figure: linkage map LG04, ICC 4958 x ICC 1882]

Page 23

Chickpea Drought Research: Genotypes used

[Crossing scheme, labels:]
• ICC4958 (tolerance donor) x JG11 (Indian elite line), marker-assisted backcrossing: JG11Plus (high yielding under drought)
• ICC4958 (tolerance donor) x ICC1882 (drought-susceptible): F2, selfing, 129 Recombinant Inbred Lines (tolerant / susceptible)

Page 24

TranSNiPtomics: MACE libraries & annotation

MACE libraries (12 chickpea libraries; -WW: well watered, -04: 4 days drought):
ICC4958-WW      ICC4958-04
ICC1882-WW      ICC1882-04
JG11-WW         JG11-04
JG11plus-WW     JG11plus-04
RILsS-WW        RILsS-04
RILsR-WW        RILsR-04

Reference Transcriptome Databases:
• Chickpea Transcriptome Assembly (GenXPro, L. Belarmino)
• C. reticulatum (PI489777) 27.06.2012; CTDB
• C. arietinum (ICC4958) transcriptome; CTDB; Hybrid assembly
• C. arietinum (ICC4958) transcriptome; CTDB; Short read assembly
• Unigenes (NCBI) (assembly of ESTs available at NCBI, CTDB)
• Refseq plant RNA, downloaded from NCBI database
• Medicago_sativa_NCBI_Entrez_EST_19082011
• Trinity.fasta
• all_TIGR_DFCI_PLANT; Dana-Farber Cancer Institute

Page 25

TranSNiPtomics: General observation

Transcript variants (TVs) up-regulated (log2 >= 2, p-value < 1e-3) and down-regulated (log2 <= -2, p-value < 1e-3) under stress in the different genotypes

[Bar chart: # of TVs (axis 0 to 140) per genotype (RILsR, JG11plus, JG11, RILsS, ICC1882, ICC4958); series: TVs up-regulated (>= 2) under drought, TVs down-regulated (<= -2) under drought]

Under drought, ICC4958 regulates many genes up, ICC1882 regulates many genes down
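The thresholds quoted on this slide (log2 fold change of at least 2 in either direction, p-value below 1e-3) amount to a simple filter. A minimal sketch follows; the record type, field names and example values are hypothetical, and how the p-values were computed is not stated on the slides.

```python
from typing import NamedTuple

class TranscriptVariant(NamedTuple):
    name: str
    log2_fold_change: float  # drought vs. well watered
    p_value: float

def classify(tv: TranscriptVariant, min_log2: float = 2.0, alpha: float = 1e-3) -> str:
    """Label a transcript variant using the thresholds quoted on the slide."""
    if tv.p_value < alpha and tv.log2_fold_change >= min_log2:
        return "up"
    if tv.p_value < alpha and tv.log2_fold_change <= -min_log2:
        return "down"
    return "unchanged"

# Hypothetical example records.
examples = [
    TranscriptVariant("tv_a", 3.1, 1e-6),
    TranscriptVariant("tv_b", -2.0, 5e-4),
    TranscriptVariant("tv_c", 4.0, 0.02),
]
labels = {tv.name: classify(tv) for tv in examples}
print(labels)
```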

Page 26

Gene Expression Profiles of the different Genotypes stressed and well watered

[Figure: Clustered heat-map of gene expression in response to drought stress in roots of susceptible and tolerant chickpea varieties. Blue = well watered, Red = drought stressed]

Priming! Already under well-watered conditions, gene expression in drought-tolerant ICC4958 clusters with that of the other varieties under drought stress

Page 27

MACE profiles mirror breeding history: Most enriched GO terms

Figure labels: NIL; RIL bulks; Tolerant parent; Susceptible parent; High performing; Tolerant; Susceptible; Recurrent Parent; Donor Parent

The table has six columns, listed here in the order of the library labels, with GO terms ranked 1 to 12.

Column 1 (JG11WW, JG1104):
1 regulation of metabolic process
2 negative regulation of catalytic activity
3 negative regulation of molecular function
4 regulation of catalytic activity
5 regulation of molecular function
6 enzyme inhibitor activity
7 regulation of biological process
8 regulation of primary metabolic process
9 regulation of cellular metabolic process
10 biological regulation
11 sequence-specific DNA binding transcription factor activity
12 nucleic acid binding transcription factor activity

Column 2 (JG11plusWW, JG11plus04):
1 respiratory electron transport chain
2 NADH dehydrogenase activity
3 oxidoreductase activity, acting on NADH or NADPH, quinone or similar compound as acceptor
4 NADH dehydrogenase (quinone) activity
5 NADH dehydrogenase (ubiquinone) activity
6 cellular respiration
7 oxidoreductase activity, acting on NADH or NADPH
8 electron transport chain
9 energy derivation by oxidation of organic compounds
10 ATP synthesis coupled electron transport
11 oxidative phosphorylation
12 oxidation-reduction process

Column 3 (ICC4958WW, ICC495804):
1 respiratory electron transport chain
2 NADH dehydrogenase activity
3 NADH dehydrogenase (ubiquinone) activity
4 NADH dehydrogenase (quinone) activity
5 oxidative phosphorylation
6 ATP synthesis coupled electron transport
7 electron transport chain
8 cellular respiration
9 oxidoreductase activity, acting on NADH or NADPH
10 oxidoreductase activity, acting on NADH or NADPH, quinone or similar compound as acceptor
11 energy derivation by oxidation of organic compounds
12 protein oligomerization

Column 4 (RILsRWW, RILsR04):
1 oxidoreductase activity, acting on NADH or NADPH
2 energy derivation by oxidation of organic compounds
3 cellular respiration
4 NADH dehydrogenase activity
5 NADH dehydrogenase (quinone) activity
6 NADH dehydrogenase (ubiquinone) activity
7 respiratory electron transport chain
8 oxidoreductase activity, acting on NADH or NADPH, quinone or similar compound as acceptor
9 electron transport chain
10 generation of precursor metabolites and energy
11 oxidative phosphorylation
12 ATP synthesis coupled electron transport

Column 5 (RILsSWW, RILsS04):
1 NADH dehydrogenase (quinone) activity
2 NADH dehydrogenase activity
3 NADH dehydrogenase (ubiquinone) activity
4 oxidoreductase activity, acting on NADH or NADPH, quinone or similar compound as acceptor
5 oxidoreductase activity, acting on NADH or NADPH
6 oxidative phosphorylation
7 ATP synthesis coupled electron transport
8 respiratory electron transport chain
9 electron transport chain
10 protein oligomerization
11 organelle membrane
12 mitochondrial membrane

Column 6 (ICC1882WW, ICC188204):
1 oxidation-reduction process
2 oxidoreductase activity
3 monooxygenase activity
4 response to biotic stimulus
5 iron ion binding
6 lipid localization
7 lipid transport
8 heme binding
9 endopeptidase regulator activity
10 endopeptidase inhibitor activity
11 peptidase inhibitor activity
12 peptidase regulator activity

Page 28

MACE profiles mirror breeding history: Most enriched GO terms

GO Nr. | JG11plus WW, JG11plus-04 | ICC4958 WW, ICC4958-04
1 | respiratory electron transport chain | respiratory electron transport chain
2 | NADH dehydrogenase activity | NADH dehydrogenase activity
3 | oxidoreductase activity, acting on NADH or NADPH, quinone or similar compound as acceptor | NADH dehydrogenase (ubiquinone) activity
4 | NADH dehydrogenase (quinone) activity | NADH dehydrogenase (quinone) activity
5 | NADH dehydrogenase (ubiquinone) activity | oxidative phosphorylation
6 | cellular respiration | ATP synthesis coupled electron transport
7 | oxidoreductase activity, acting on NADH or NADPH | electron transport chain
8 | electron transport chain | cellular respiration
9 | energy derivation by oxidation of organic compounds | oxidoreductase activity, acting on NADH or NADPH
10 | ATP synthesis coupled electron transport | oxidoreductase activity, acting on NADH or NADPH, quinone or similar compound as acceptor

QTL on LG4

Conclusion: The drought-tolerance QTL from ICC4958 is responsible for mitochondrial drought responses of JG11Plus

Page 29

Experimental setup

RNA from roots of:
• ICC4958 (tolerant)
• ICC1882 (susceptible)
• Recombinant Inbred Lines (RILs) from ICC4958 x ICC1882: tolerant bulk (9 best), susceptible bulk (9 worst)

Page 30

TranSNiPtomics: MACE libraries & annotation

Number of sequenced and annotated MACE tags from roots of ICC4958, ICC1882 and the bulks

Library               Total no. of annotated tags   No. of tags not annotated   Annotated in %
ICC4958_WW            9,867,301                     3,734,104                   62.16
ICC4958_04            12,373,412                    3,226,189                   73.93
ICC1882_WW            9,422,171                     2,269,180                   75.92
ICC1882_04            7,364,703                     2,532,140                   65.62
Tolerant Bulk_WW      11,066,463                    4,620,331                   58.25
Tolerant Bulk_04      11,902,727                    5,160,686                   56.64
Susceptible Bulk_WW   38,714,730                    10,496,416                  72.89
Susceptible Bulk_04   27,310,911                    4,494,668                   83.54

Blue = well watered
Red = 4 days drought stressed
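As a quick consistency check on the table, the percentage column can be recomputed from the two count columns. The sketch below assumes that the first count column is read as the total number of tags per library; under that reading, (total - not annotated) / total reproduces every reported percentage to two decimals. The variable and function names are illustrative.

```python
# (first count column, tags not annotated, reported "Annotated in %") per library
libraries = {
    "ICC4958_WW": (9_867_301, 3_734_104, 62.16),
    "ICC4958_04": (12_373_412, 3_226_189, 73.93),
    "ICC1882_WW": (9_422_171, 2_269_180, 75.92),
    "ICC1882_04": (7_364_703, 2_532_140, 65.62),
    "Tolerant Bulk_WW": (11_066_463, 4_620_331, 58.25),
    "Tolerant Bulk_04": (11_902_727, 5_160_686, 56.64),
    "Susceptible Bulk_WW": (38_714_730, 10_496_416, 72.89),
    "Susceptible Bulk_04": (27_310_911, 4_494_668, 83.54),
}

def percent_annotated(total: int, not_annotated: int) -> float:
    """Share of tags with an annotation, assuming `total` counts all tags."""
    return 100 * (total - not_annotated) / total

for name, (total, not_annotated, reported) in libraries.items():
    computed = percent_annotated(total, not_annotated)
    print(f"{name:20s} computed {computed:6.2f}  reported {reported:6.2f}")
```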

Page 31

Results of Bulked Segregant Analysis: Differentially regulated genes

MACE libraries: 4 x Parents, 4 x Bulks

[Figure: numbers of up-regulated and down-regulated transcripts, drought vs. well watered (P value ~0), for susceptible and tolerant lines. Values shown: 202, 338, 707, 479, 277, 177, 447, 304]

Susceptible RILs regulate twice as many genes down as Resistant RILs

Page 32

Susceptible RILs reaction to water stress

Stressed

Well watered

Results of Bulked Segregant Analysis

Differentially regulated genes

Page 33

Tolerant RILs reaction to water stress: much less…

Stressed

Well watered

Results of Bulked Segregant Analysis

Differentially regulated genes

Page 34

Tolerant RILs compared to Susceptible RILs under water stress

Only expressed in

tolerant Bulk

Susceptible

Tolerant

Results of Bulked Segregant Analysis

Differentially regulated genes

Page 35

Gene Expression Results: Tolerant bulk vs. susceptible bulk

[Bar chart (axis 0 to 70), series "up susceptible" and "up tolerant", for the GO terms:]
• respiratory electron transport chain
• response to stress
• ATP synthesis coupled electron transport
• oxidative phosphorylation
• response to osmotic stress
• NADH dehydrogenase activity
• response to salt stress
• cellular respiration
• electron transport chain
• oxidoreductase activity, acting on NADH or NADPH

GO terms most enriched under water deficit in tolerant varieties are strongly related to mitochondrial function

Reference: Owen K. Atkin and David Macherel (2009): The crucial role of plant mitochondria in orchestrating drought tolerance
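The slides do not say which statistic was used to call a GO term "enriched". One common choice, shown here only as an assumed illustration, is a one-sided hypergeometric test: given how many genes carry a term overall, how likely is it to see at least k of them among the selected (for example up-regulated) genes by chance? All numbers below are hypothetical.

```python
from math import comb

def enrichment_p(k: int, n_term: int, n_selected: int, n_total: int) -> float:
    """One-sided hypergeometric P(X >= k): probability of drawing at least k
    term-annotated genes when n_selected genes are drawn from n_total genes,
    of which n_term carry the GO term."""
    upper = min(n_term, n_selected)
    favourable = sum(
        comb(n_term, i) * comb(n_total - n_term, n_selected - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, n_selected)

# Hypothetical numbers: 1,000 genes, 40 with the term, 100 selected, 12 hits.
p = enrichment_p(k=12, n_term=40, n_selected=100, n_total=1000)
print(f"p = {p:.2e}")
```

In practice such p-values would also need a multiple-testing correction across GO terms, which this sketch leaves out.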

Page 36

MACE for Marker development:

Transcripts and alleles that are unique in the tolerant RILs are potentially powerful markers for drought resistance.

58 transcripts were exclusively found in the tolerant bulks (>50 copies) under stress and well-watered conditions

Among them:
• heat shock proteins
• LEA proteins
• many unknown or unknown in context of drought
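The selection rule described here (present with more than 50 copies in the tolerant bulk, absent from the susceptible bulk) can be sketched as a dictionary filter. The transcript names and counts below are made up, and the sketch handles a single condition, whereas the slide refers to both stressed and well-watered libraries.

```python
def exclusive_transcripts(tolerant_counts, susceptible_counts, min_copies=50):
    """Transcripts with more than min_copies in the tolerant bulk and none
    in the susceptible bulk, returned in alphabetical order."""
    return sorted(
        name
        for name, copies in tolerant_counts.items()
        if copies > min_copies and susceptible_counts.get(name, 0) == 0
    )

# Hypothetical copy numbers per transcript.
tolerant_bulk = {"hsp_like": 120, "lea_like": 75, "unknown_1": 51,
                 "housekeeping": 900, "rare": 12}
susceptible_bulk = {"housekeeping": 870, "rare": 0}

candidates = exclusive_transcripts(tolerant_bulk, susceptible_bulk)
print(candidates)
```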

Page 37

Tolerant

bulk

Susceptible

bulk

Recombinant

Inbred Lines

ICC4958

(tolerant)

x ICC1882

(susceptible)

Allele distribution: Whose alleles went where ?

„Allelome“

Page 38

Bioinformatics: Automated bioinformatics workflow for model and non-model organisms

TranSNiPtomics
1) annotation to reference or to de novo assembly of MACE tags of all libraries
2) SNP detection: position of the SNP and the sequence 100 bp before and after the SNP are provided for primer design

Excerpt of output table (columns: Transcript; position; ref_base; var_base; Library I ref base; Library I var base; 100 bp before SNP, SNP, 100 bp behind SNP):

Transcript: comp12617_c0_seq1 len=363 path=[341:0-52 394:53-92 434:93-362] : gi|224029627|gb|ACN33889.1| unknown [Zea mays] e-value: 7e-04 blast score: 42.0
position: 54; ref_base: A; var_base: G; Library I ref base: 33; Library I var base: 0
sequence: AGCCAGGCTCTCTGCTCGCTGGTATGCAGGCTTTCTAAAAAACAATGGCATCAA/G ATTCGGAGTGAGGTCGACCATACAAGAACCCATGCTCGACAGTGAGTTGTATAGTCAGATGCACTGAGTTGTGTGCCTTTATTTTAGGTATATGAGGGAA

Transcript: comp12617_c0_seq1 len=363 path=[341:0-52 394:53-92 434:93-362] : gi|224029627|gb|ACN33889.1| unknown [Zea mays] e-value: 7e-04 blast score: 42.0
position: 119; ref_base: C; var_base: T; Library I ref base: 125; Library I var base: 0
sequence: TGGTATGCAGGCTTTCTAAAAAACAATGGCATCAATATTCGGAGTGAGGTCGACCATACAAGAACCCATGCTCGACAGTGAGTTGTATAGTCAGATGCAC/T GAGTTGTGTGCCTTTATTTTAGGTATATGAGGGAAATGCTAAATTTATCCTGGGTAAAAGAAATCAATGTAACTAAATAAAAATTGTTTCGCCGTATGTA

Transcript: comp14620_c0_seq1 len=251 path=[229:0-168 398:169-250] : gi|326519328|dbj|BAJ96663.1| predicted protein [Hordeum vulgare subsp. vulgare] e-value: 7e-06 blast score: 46.2
position: 169; ref_base: T; var_base: C; Library I ref base: 194; Library I var base: 2
sequence: GACAAGGTAGGTACTCATAAAACAAACCATGGAGAGAGACCATGAACCAAATTGGACAAAACATACTTGCTTCCATATTAGAAAGCTTACATGGTATATT/C AAGTGGTGCTAAATAATCTTATAGAAGGGCAAAACAGTATACACGGTCTGCAAGAGAGTGGCCACAAGCAGGACGACGGCG

Transcript: comp2006449_c0_seq1
position: 60; ref_base: C; var_base: T; Library I ref base: 0; Library I var base: 11
sequence: GTATACTTTTATGTACAAGTAGTTGCTTAATTGTTATTATGTGTTCTCTTTTTAGTTATC/T TTCTTCATTATAATTTTTCCATGGAAATAATGTATGCTGGTAGAGTGGCAGTGGTAATCAATGTGTATATTGCAAGGTGCTAGAGTACACACTGCAGGCT

Transcript: comp20869_c0_seq1 len=243 path=[216:0-66 511:67-100 312:101-130 568:131-146 358:147-242] : gi|326518316|dbj|BAJ88187.1| predicted protein [Hordeum vulgare subsp. vulgare] e-value: 8e-08 blast score: 51.6 : sp|P12257|GUB2_HORVU Lichenase-2 (Fragment) OS=Hordeum vulgare PE=1 SV=1 e-value: 6e-04 blast score: 36.6
position: 132; ref_base: T; var_base: G; Library I ref base: 7; Library I var base: 159
sequence: CGATAAGGTCTACCCCATCACCTTCGGCAGGTGAACTTGATTCAGTTCATCATGCGTCCTCCATGCATCCATGTACGTACGCGGCCATGCATGGTCATAT/G CGTGTATATATACTGTATAAATATATTCATTGAGTGTGTGTTTGTGTGGCTGATTTGGAAAAAGCTCAAGATATATAAGAAAATGTTCATGAATTGGGGA

Transcript: comp20869_c0_seq1 len=243 path=[216:0-66 511:67-100 312:101-130 568:131-146 358:147-242] : gi|326518316|dbj|BAJ88187.1| predicted protein [Hordeum vulgare subsp. vulgare] e-value: 8e-08 blast score: 51.6 : sp|P12257|GUB2_HORVU Lichenase-2 (Fragment) OS=Hordeum vulgare PE=1 SV=1 e-value: 6e-04 blast score: 36.6
position: 147; ref_base: T; var_base: A; Library I ref base: 5; Library I var base: 222
sequence: CATCACCTTCGGCAGGTGAACTTGATTCAGTTCATCATGCGTCCTCCATGCATCCATGTACGTACGCGGCCATGCATGGTCATATACGTGTATATATACT/A TATAAATATATTCATTGAGTGTGTGTTTGTGTGGCTGATTTGGAAAAAGCTCAAGATATATAAGAAAATGTTCATGAATTGGGGAAAATATTTGC

Transcript: comp21319_c0_seq1 len=460 path=[438:0-277 716:278-296 735:297-335 774:336-344 783:345-459] : gi|357131581|ref|XP_003567415.1| PREDICTED: uncharacterized protein LOC100823458 [Brachypodium distachyon] e-value: 3e-13 blast score: 70.1
position: 297; ref_base: C; var_base: T; Library I ref base: 243; Library I var base: 0
sequence: ATGCTGCGTGGCAGTGGCACACGTCTCAGTGTACATAGAAGCTCGAGCTACCATAGGCTCGTGTCATCGATCCCGTCCGTCGGCCACCCGTTCGCTAGCC/A AAGCATTCTTTTTCTCTCTTATCTCTGCACTGTACTACGTACGCATCGCCATGAATGATAGCTCAGCTCAAGCTGCCGTCCTCTCAACTCAACTCAATGA
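For primer design, the sequence field of the output table has to be split into the left flank, the two alleles and the right flank. A minimal parsing sketch follows, assuming the field has the form "<left flank + reference base>/<variant base> <right flank>" as in the excerpt; the function name is hypothetical, and the example is the first record of the excerpt with its right flank shortened for brevity.

```python
import re

def split_snp_context(context: str):
    """Split 'LEFTFLANK+REF/VAR RIGHTFLANK' into (left, ref, var, right)."""
    match = re.fullmatch(r"([ACGT]*)([ACGT])/([ACGT])\s*([ACGT]*)", context.strip())
    if match is None:
        raise ValueError("unexpected SNP context format")
    left, ref, var, right = match.groups()
    return left, ref, var, right

# First record of the excerpt; right flank truncated here for brevity.
context = ("AGCCAGGCTCTCTGCTCGCTGGTATGCAGGCTTTCTAAAAAACAATGGCATCAA/G "
           "ATTCGGAGTGAGGTCGACC")
left, ref, var, right = split_snp_context(context)
snp_position = len(left) + 1  # 1-based position when the flank starts at base 1
print(ref, var, snp_position)
```

For this record the left flank starts at the first base of the transcript, so the computed position agrees with the position column (54). For SNPs further into a transcript the flank is capped at 100 bp and the position has to be taken from the table instead.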

Page 39

comp21149_c0_seq1 = not annotated

MACE-Allele SNP Example

Unknown transcript from chickpea

Page 40

MACE-Allele SNP Example

comp21149_c0_seq1 = not annotated

Unknown transcript from chickpea

Page 41

Exclusive alleles in parents and bulks

Summary: Alleles detected in parental lines and tolerant and susceptible bulks

Susceptible and tolerant bulks:
• Alleles found at least 10 x exclusively in one of the bulks: 1234
• Contrasting, exclusive alleles expressed at least 10 x in both bulks: 12

Parents (ICC4958 and ICC1882):
• Alleles found at least 10 x exclusively in one of the parents: 3896
• Contrasting, exclusive alleles, each allele at least 10 x in both parents: 128
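The two categories counted on this slide can be sketched as set operations on per-allele read counts: an allele is "exclusive" if it is seen at least 10 times in one sample and never in the other, and a site is "contrasting" if each sample has its own exclusive allele there. The data layout, function names and toy counts are assumptions for illustration only.

```python
def exclusive_alleles(counts_a, counts_b, min_reads=10):
    """Alleles seen >= min_reads times in one sample and not at all in the other.
    Keys are (transcript, position, base); values are read counts."""
    only_a = {a for a, n in counts_a.items()
              if n >= min_reads and counts_b.get(a, 0) == 0}
    only_b = {a for a, n in counts_b.items()
              if n >= min_reads and counts_a.get(a, 0) == 0}
    return only_a, only_b

def contrasting_sites(only_a, only_b):
    """Sites where both samples carry a different exclusive allele."""
    sites_a = {(t, p) for t, p, _ in only_a}
    sites_b = {(t, p) for t, p, _ in only_b}
    return sites_a & sites_b

# Hypothetical read counts per allele.
tolerant = {("tx1", 54, "A"): 33, ("tx2", 60, "T"): 11, ("tx3", 10, "G"): 4}
susceptible = {("tx1", 54, "G"): 25, ("tx3", 10, "G"): 12}

only_tol, only_sus = exclusive_alleles(tolerant, susceptible)
print(len(only_tol) + len(only_sus), contrasting_sites(only_tol, only_sus))
```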

Page 42

Whose genes went where? Allele Distribution

[Figure: Resistant Bulk, Susceptible Bulk; values shown: 85%, 84%]

In the transcription profiles of the resistant bulk, ICC4958 alleles are strongly over-represented

Page 43

Whose genes went where? QTL region

TranSNiPtomics: SNPs in the QTL on LG4

[Figure: genomic sequence with markers TAA170, STMS11, ICCM0249, GA24; genes with 5‘-UTR, coding region and 3‘-UTR; MACE contig with SNP; bar chart (axis 0 to 80) of alleles 322 G and 322 T]

Page 44

Whose genes went where? QTL region

TranSNiPtomics: SNPs in genes in the LG4 QTL region

Gene: Ninja-family protein mc410
Genomic Position: Ca4_13,726,396_13,728,300

[Bar chart (axis 0 to 80) for ICC4958 and ICC1882; polymorphic base and position in gene: G and T at 322, A and C at 1220, T and C at 1390]

Page 45

Whose genes went where? QTL region

TranSNiPtomics: SNPs in genes in the LG4 QTL region

Gene: TIME FOR COFFEE-like
Genomic position: Ca4_13,835,369_13,839,221

TIME FOR COFFEE encodes a nuclear regulator in the Arabidopsis thaliana circadian clock. (Plant Cell. 2007 May;19(5):1522-36. Epub 2007 May 11)

[Bar chart (axis 0 to 180) of alleles 1976 G and 1976 A in JG11, JG11plus, ICC4958, RILsR, RILsS, ICC1882; groups labelled Tolerant and Susceptible]

Page 46

Whose genes went where? QTL region

TranSNiPtomics: SNPs in genes in the LG4 QTL region

Summary of polymorphic, expressed genes

Gene                                     Genomic Position             # of SNPs
Cysteine synthase-like                   Ca4_13,678,141_13,678,617    4
FRIGIDA-like protein                     Ca4_13,684,591_13,686,920    6
FRIGIDA-like protein                     Ca4_13,688,248_13,689,022    3
Suppressor of gene silencing 3 homolog   Ca4_13,693,728_13,695,332    3
Uncharacterized LOC101489729             Ca4_13,699,049_13,700,445    2
Ninja-family protein mc410               Ca4_13,726,396_13,728,300    3
Uncharacterized LOC101494058             Ca4_13,768,316_13,768,909    1
Primary amine oxidase                    Ca4_13,788,285_13,789,046    2
Uncharacterized LOC101495327             Ca4_13,797,642_13,799,515    2
TIME FOR COFFEE-like                     Ca4_13,835,369_13,839,221    2
TIME FOR COFFEE-like                     Ca4_13,843,038_13,844,225    2

MACE detected 30 SNPs in 11 genes in an important 166,084 bp (about 166 kbp) long region of the chickpea genome
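The summary figures can be recomputed from the table itself. The short check below sums the SNP column and takes the span from the first start coordinate to the last end coordinate; the span comes out at 166,084 when the coordinates are read as base pairs. The tuple layout is an illustrative choice.

```python
# (gene, start, end, number of SNPs), coordinates on Ca4 as listed in the table
genes = [
    ("Cysteine synthase-like", 13_678_141, 13_678_617, 4),
    ("FRIGIDA-like protein", 13_684_591, 13_686_920, 6),
    ("FRIGIDA-like protein", 13_688_248, 13_689_022, 3),
    ("Suppressor of gene silencing 3 homolog", 13_693_728, 13_695_332, 3),
    ("Uncharacterized LOC101489729", 13_699_049, 13_700_445, 2),
    ("Ninja-family protein mc410", 13_726_396, 13_728_300, 3),
    ("Uncharacterized LOC101494058", 13_768_316, 13_768_909, 1),
    ("Primary amine oxidase", 13_788_285, 13_789_046, 2),
    ("Uncharacterized LOC101495327", 13_797_642, 13_799_515, 2),
    ("TIME FOR COFFEE-like", 13_835_369, 13_839_221, 2),
    ("TIME FOR COFFEE-like", 13_843_038, 13_844_225, 2),
]

total_snps = sum(snps for _, _, _, snps in genes)
span_bp = max(end for _, _, end, _ in genes) - min(start for _, start, _, _ in genes)
print(f"{total_snps} SNPs in {len(genes)} genes across {span_bp:,} bp")
```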

Page 47: Next Generation (Sequencing) Tools for Advanced Molecular …ksiconnect.icrisat.org/wp-content/uploads/2013/10/... · 2013. 10. 21. · GenXPro: Our Service Portfolio Illumina PacBioHiseq2000

                    ICC4958 parent   Tolerant RILs   ICC1882 parent   Susceptible RILs   Total
# Individuals       2                41              2                33                 78
# Reads (million)   17.8             362.8           7.0              274.4              662.1
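As a quick sanity check on sequencing depth, the read counts can be divided by the number of individuals in each group. This is only a sketch using the rounded values from the slide; the dictionary keys are illustrative labels. The rounded per-group read counts add up to 662.0 million, 0.1 below the reported total, which is presumably a rounding effect.

```python
# group -> (number of individuals, reads in millions), as given on the slide
groups = {
    "ICC4958 parent": (2, 17.8),
    "Tolerant RILs": (41, 362.8),
    "ICC1882 parent": (2, 7.0),
    "Susceptible RILs": (33, 274.4),
}

# Mean sequencing depth per individual, in million reads
mean_reads = {name: reads / n for name, (n, reads) in groups.items()}

total_individuals = sum(n for n, _ in groups.values())
total_reads = sum(reads for _, reads in groups.values())

for name, value in mean_reads.items():
    print(f"{name}: {value:.1f} million reads per individual")
print(f"Total: {total_individuals} individuals, {total_reads:.1f} million reads")
```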

Ongoing Experiment:

MACE analysis of the stress responses of 75 RILs from the cross ICC4958 x ICC1882

Sequencing performed so far


• GO enrichment analysis of drought-responsive transcripts reveals strong regulation of respiratory transport chain transcripts.

• Besides typical drought responders such as dehydrin-1, LEA proteins and heat shock proteins, many previously undescribed responsive transcripts were identified.

• New, highly reliable SNPs/alleles were found in the drought-tolerant RIL bulks.

• MACE = cost-efficient, simultaneous gene expression analysis and genotyping!

• Just started: analysis of 75 RILs with MACE

• MACE kit will be released this year.

Preliminary Results


Himabindu Kudapa

Rajeev Kumar Varshney

Nicolas Krezdorn

Björn Rotter

Kamila Bokszczanin

Anja Frank

Peter Winter

Jutta Kreutz

Nicolas Gonzales

Günter Kahl

The Indogerman Program (BMBF)

Thank you for your patience

and:


Thank you!


A) At ICRISAT, the major QTL for drought tolerance was transferred from ICC4958 (drought tolerant, donor parent, DP) via marker-assisted backcrossing (MABC) to JG11 (elite line, moderately drought tolerant, recurrent parent, RP) to give the near-isogenic line (NIL) JG11Plus (highly drought tolerant, high yielding under drought).

B) ICC4958 (drought tolerant, DP) was crossed to ICC1882 (drought susceptible). The F2 offspring were self-pollinated to give tolerant and susceptible RILs, RILsR and RILsS, respectively.

Breeding lines used for TranSNiPtomics


Aims:

Identification of markers and/or genes for drought resistance

Better understanding of drought resistance in roots

Approach

Massive Analysis of cDNA Ends (MACE), a reduced-complexity transcriptome sequencing method for simultaneous gene expression analysis and genotyping


Reference Transcriptome Databases

Redundancy

Reference Protein Repository Databases

NR

formatdb

UniprotKB/Swissprot

Mt3.5

MACE library prep

… clustering 94 bp sequences with 100% identity

Mean quality score for each bp

MACE tag frequencies with quality scores

polyA trimming

Transcript annotation

Transcriptomics of Thermotolerance
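The preprocessing labels on this slide (polyA trimming, clustering of sequences with 100% identity, tag frequencies with mean quality scores per base) can be illustrated with a minimal sketch. The slide does not specify the order of the steps, the trimming rule or the data structures; the function names, the `min_run` threshold and the toy reads below are assumptions made purely for illustration, and the demo uses 8 bp tags instead of 94 bp to stay readable.

```python
from collections import defaultdict

def trim_polya(seq, qual, min_run=8):
    """Cut a read at the first run of at least `min_run` A's (assumed rule)."""
    idx = seq.find("A" * min_run)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]

def cluster_identical(reads, length=94):
    """Group reads whose first `length` bases are 100% identical.

    Returns {sequence: (tag frequency, [mean quality score per base])}.
    Reads shorter than `length` are skipped.
    """
    quals_by_seq = defaultdict(list)
    for seq, qual in reads:
        if len(seq) < length:
            continue
        quals_by_seq[seq[:length]].append(qual[:length])
    clusters = {}
    for seq, qual_lists in quals_by_seq.items():
        count = len(qual_lists)
        # Column-wise mean over all reads in the cluster
        mean_quality = [sum(column) / count for column in zip(*qual_lists)]
        clusters[seq] = (count, mean_quality)
    return clusters

# Toy reads: (sequence, per-base quality scores); all values are made up
reads = [
    ("ACGTACGGAAAAAAAAAA", [30] * 18),
    ("ACGTACGGAAAAAAAAAA", [20] * 18),
    ("ACGTTCGGAAAAAAAAAA", [40] * 18),
]
trimmed = [trim_polya(seq, qual) for seq, qual in reads]
clusters = cluster_identical(trimmed, length=8)
for seq, (count, mean_quality) in clusters.items():
    print(seq, count, mean_quality[0])
```

Requiring exact identity makes clustering a simple dictionary lookup; the two identical toy reads collapse into one tag with frequency 2 and a mean quality of 25 per base.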


take ids

combine ids

combine BLASTX tables in the order: NR, Swissprot, Mt3.5
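One way to read "combine in the order NR, Swissprot, Mt3.5" is as a priority merge: a tag keeps the hit from the first database that annotates it, and later databases only fill the remaining gaps. That interpretation is an assumption, as are the function name, the table layout and the example IDs and descriptions in this sketch.

```python
def combine_annotations(tables):
    """Merge BLASTX hit tables given in priority order.

    `tables` is a list of (database name, {tag id: description}).
    For each tag id, the hit from the first database that contains it wins.
    """
    combined = {}
    for db_name, hits in tables:
        for tag_id, description in hits.items():
            # setdefault keeps an existing (higher-priority) annotation
            combined.setdefault(tag_id, (db_name, description))
    return combined

# Hypothetical hit tables, keyed by hypothetical tag ids
nr_hits = {"tag1": "dehydrin-1"}
swissprot_hits = {"tag1": "dehydrin", "tag2": "heat shock protein"}
mt35_hits = {"tag2": "HSP homolog", "tag3": "LEA protein"}

annotation = combine_annotations(
    [("NR", nr_hits), ("Swissprot", swissprot_hits), ("Mt3.5", mt35_hits)]
)
for tag_id in sorted(annotation):
    print(tag_id, *annotation[tag_id])
```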

Reference Protein Repository Databases

NR

formatdb

Swissprot

Mt3.5

BLASTX

Transcript annotation

MACE library prep

clustering 94 bp sequences with 100% identity

Mean quality score for each bp

MACE tag frequencies with quality scores

polyA trimming

Transcriptomics of Thermotolerance


MACE library prep

Library

clustering 94 bp sequences with 100% identity

Mean quality score for each bp

MACE tag frequencies with quality scores

polyA trimming

SOAP

Transcriptomics of Thermotolerance

Transcript annotation


GO analysis

Transcriptomics of Thermotolerance

Functional Analysis


Bioinformatics: automated workflow for model and non-model organisms

[Workflow diagram. Labels: Tags; annotation / mapping (Gene 1, Gene 2, unknown); Assembly; BLASTX (protein DBs); quantification; enrichment analysis; web tool "MACE2GO"]


Bioinformatics

Gene Ontology enrichment analysis
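The slides do not say which statistical test is used for the enrichment analysis. A common choice for GO enrichment is the one-sided hypergeometric test, sketched below under that assumption; the function name and all counts in the example are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) for a hypergeometric draw.

    N: annotated transcripts in the background
    K: background transcripts carrying the GO term
    n: differentially expressed transcripts
    k: differentially expressed transcripts carrying the GO term
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical counts: 20,000 background transcripts, 100 with the term,
# 500 drought-responsive transcripts, 12 of them with the term
p_value = hypergeom_enrichment_p(N=20_000, K=100, n=500, k=12)
print(f"p = {p_value:.3g}")
```

With these made-up numbers about 2.5 term-carrying transcripts would be expected by chance, so observing 12 gives a small p-value. In practice the p-values for all tested GO terms would also need a multiple-testing correction.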


TranSNiPtomics

Requirements:

• Sufficient coverage to distinguish between sequencing errors and SNPs

• Accurate measurement of transcription levels

“TranSNiPtomics”:

simultaneous analysis of gene expression AND polymorphisms;

advantages:

• Markers are located within genes and are therefore very likely connected to a specific trait

• Markers can be chosen from differentially expressed genes to increase the chance of involvement in the trait
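Why coverage matters for separating sequencing errors from SNPs can be shown with a simple binomial model. This is only an illustrative sketch, not the SNP-calling procedure used for MACE: it assumes independent errors at a fixed per-base rate of 1% and ignores which wrong base is produced; the function name and the example counts are hypothetical.

```python
from math import comb

def p_errors_at_least(coverage, alt_reads, error_rate=0.01):
    """Probability that at least `alt_reads` of `coverage` reads differ from
    the reference at one position through sequencing error alone."""
    return sum(
        comb(coverage, i) * error_rate**i * (1 - error_rate) ** (coverage - i)
        for i in range(alt_reads, coverage + 1)
    )

# Same alternative-allele fraction (25%) at low and at high coverage
p_low = p_errors_at_least(coverage=4, alt_reads=1)
p_high = p_errors_at_least(coverage=100, alt_reads=25)

print(f"coverage 4, 1 alternative read: p = {p_low:.3g}")
print(f"coverage 100, 25 alternative reads: p = {p_high:.3g}")
```

At 4x coverage a single deviating read is explained by error about 4% of the time, whereas 25 deviating reads out of 100 are practically impossible to explain by error, so the same allele fraction only becomes a reliable SNP call at higher coverage.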


Susceptible

Tolerant Tolerant

Experimental setup

Susceptible

4 Days Drought Well Watered

MACE libraries: 4 x Parents, 4 x Bulks

comparisons