Next Generation Sequencing & Transcriptome Analysis

Dec 03, 2014

How to use next generation sequencing in transcriptomics and how to analyse those data.
Page 1: Next Generation Sequencing & Transcriptome Analysis

NEXT GENERATION SEQUENCING

Page 2: Next Generation Sequencing & Transcriptome Analysis

NEXT GENERATION SEQUENCING

AND HOW TO USE THE DATA GENERATED

FOR TRANSCRIPTOMICS

Page 3: Next Generation Sequencing & Transcriptome Analysis

METHODS

Page 4: Next Generation Sequencing & Transcriptome Analysis

METHODS

454 SEQUENCING

SOLEXA / ILLUMINA

SOLID

Page 5: Next Generation Sequencing & Transcriptome Analysis

454 SEQUENCING

SEQUENCING BY SYNTHESIS

PYROSEQUENCING

> 400 BASEPAIRS IN A SINGLE READ

Page 9: Next Generation Sequencing & Transcriptome Analysis

454 SEQUENCING

REPEATS OF SINGLE NUCLEOTIDES ARE DETECTED BY SIGNAL STRENGTH

WORKS FOR UP TO 8 CONSECUTIVE BASES
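
The signal-strength idea above can be sketched in a few lines. This is a minimal illustration, not the vendor's base caller: the flow order `TACG`, the normalisation (1.0 is roughly one incorporated base), and the function name `flowgram_to_sequence` are assumptions made for the example.

```python
# Assumed nucleotide flow order; one nucleotide is flowed per step, cyclically.
FLOW_ORDER = "TACG"
# Homopolymer limit quoted on the slide.
MAX_HOMOPOLYMER = 8


def flowgram_to_sequence(signals, flow_order=FLOW_ORDER):
    """Turn per-flow light intensities into a base sequence.

    Each signal is assumed to be normalised so that a value near 1.0
    means one base was incorporated, near 2.0 two bases, and so on.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        # Round to the nearest whole number of incorporated bases.
        count = max(0, int(signal + 0.5))
        if count > MAX_HOMOPOLYMER:
            raise ValueError(
                f"flow {i}: homopolymer of {count} exceeds reliable range"
            )
        bases.append(nucleotide * count)
    return "".join(bases)


if __name__ == "__main__":
    # Flows: T, A, C, G, T, A, C, G
    print(flowgram_to_sequence([1.02, 0.05, 2.1, 0.0, 0.1, 2.95, 0.0, 1.0]))
    # prints TCCAAAG
```

Because the length of a run is inferred from intensity alone, small errors in the signal matter more for long runs, which is why the method is only trusted up to a limited run length.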

Page 10: Next Generation Sequencing & Transcriptome Analysis

SOLEXA / ILLUMINA

AGAIN: SEQUENCING BY SYNTHESIS

BUT WITH A DIFFERENT DETECTION APPROACH

UP TO 100 BASEPAIRS IN A SINGLE READ

Page 11: Next Generation Sequencing & Transcriptome Analysis

SOLEXA / ILLUMINA

[FIGURE, PAGES 11-14: STEPWISE ANIMATION WITH BASE LABELS (T, A, C, C, G, G, ...); IMAGES NOT PRESERVED]

Page 15: Next Generation Sequencing & Transcriptome Analysis

ADVANTAGES OF NGS

CAN RUN IN PARALLEL

PREPARATION CAN BE AUTOMATED

MUCH CHEAPER WHEN COMPARED TO TRADITIONAL SEQUENCING

Page 16: Next Generation Sequencing & Transcriptome Analysis

TRANSCRIPTOME ANALYSIS

ALLOWS DETECTING EXPRESSION CHANGES ACROSS:

DIFFERENT CELL TYPES

DIFFERENT ENVIRONMENTAL CONDITIONS

DISEASES

DIFFERENT DEVELOPMENTAL STAGES

Page 17: Next Generation Sequencing & Transcriptome Analysis

TRANSCRIPTOME ANALYSIS

CAN BE USED TO IDENTIFY NEW GENES

CAN BE APPLIED TO NON-MODEL ORGANISMS

Page 18: Next Generation Sequencing & Transcriptome Analysis

HOW TO ANALYSE TRANSCRIPTOMES

TRADITIONALLY: EXPRESSED SEQUENCE TAGS (ESTS)

USING NGS: RNA-SEQ

FIRST STEP: GET THE DATA

Page 19: Next Generation Sequencing & Transcriptome Analysis

ESTS

DONE USING SHOTGUN-SEQUENCING

SEQUENCES CDNA CLONES OF EXPRESSED MRNA

CHEAP TO PRODUCE

Page 21: Next Generation Sequencing & Transcriptome Analysis

RNA-SEQ

SAME PRINCIPLE:

GET AVAILABLE MRNA

THEN SEQUENCING IN PARALLEL VIA NGS

RNA-SEQ == EST + NGS

Page 22: Next Generation Sequencing & Transcriptome Analysis

HOW TO ANALYSE TRANSCRIPTOMES

ASSEMBLY OF READS

DETECTION OF SNPS

GENE ANNOTATION

DETECTION OF OPEN READING FRAMES

DETECTION OF HOMOLOGOUS GENES

Page 23: Next Generation Sequencing & Transcriptome Analysis

ASSEMBLY

AVAILABLE TOOLS:

CAP3

MIRA

...

Page 24: Next Generation Sequencing & Transcriptome Analysis

CAP3

SMITH-WATERMAN LOCAL ALIGNMENT TO CLIP LOW-QUALITY READ ENDS

GLOBAL ALIGNMENT TO FIND FALSE OVERLAPS
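
To make the clipping idea concrete, here is a minimal Smith-Waterman sketch. It is not CAP3's code: the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) and the function name are illustrative assumptions. It returns the best local score and the half-open span of the aligned region in each sequence; whatever lies outside those spans is what a clipping step would consider trimming.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment of a and b with a linear gap penalty.

    Returns (score, (a_start, a_end), (b_start, b_end)) with half-open spans.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, (i, best_pos[0]), (j, best_pos[1])


if __name__ == "__main__":
    # The reads share ACGTACGT; the flanking TTTT and GGGG do not align.
    print(smith_waterman("TTTTACGTACGT", "ACGTACGTGGGG"))
    # prints (16, (4, 12), (0, 8))
```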

Page 25: Next Generation Sequencing & Transcriptome Analysis

MIRA

COMBINES ASSEMBLY & SNP-DETECTION

USES:

TRACE FILES

TEMPLATE INSERT INFORMATION

REDUNDANCY

Page 26: Next Generation Sequencing & Transcriptome Analysis

MIRA

FAST READ COMPARISON TO DETECT POTENTIAL OVERLAPS

CONFIRMS OVERLAPS USING SMITH-WATERMAN AND CREATES ALIGNMENTS

ASSEMBLES READ-PAIRS BY FINDING THE BEST PATH

CHECKS ASSEMBLIES FOR ERRORS AND BEGINS AGAIN
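
The first step, a fast comparison that only nominates read pairs for the slower Smith-Waterman check, is commonly done by looking for shared k-mers. The sketch below illustrates that general idea and is not MIRA's implementation; `k`, `min_shared`, and the function name are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import combinations


def candidate_overlaps(reads, k=8, min_shared=2):
    """Return read pairs that share at least `min_shared` distinct k-mers.

    `reads` maps a read id to its sequence. The result maps each
    (id_a, id_b) pair to the number of distinct k-mers both reads contain.
    """
    # Index: k-mer -> set of reads containing it.
    index = defaultdict(set)
    for read_id, seq in reads.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].add(read_id)

    # Count shared k-mers for every pair that co-occurs in the index.
    shared = defaultdict(int)
    for read_ids in index.values():
        for pair in combinations(sorted(read_ids), 2):
            shared[pair] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


if __name__ == "__main__":
    reads = {
        "r1": "ACGTTGCATGCCGA",
        "r2": "GCATGCCGATTAGC",  # starts with the end of r1
        "r3": "TTTTTTTTTTTTTT",  # unrelated
    }
    print(candidate_overlaps(reads))
    # prints {('r1', 'r2'): 2}
```

Only the pairs returned here would need to be passed on to the alignment step that confirms or rejects the overlap.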

Page 27: Next Generation Sequencing & Transcriptome Analysis

MIRA: THE WORKFLOW

Page 28: Next Generation Sequencing & Transcriptome Analysis

MIRA

RESULTS:

CONSENSUS CONTIGS MADE OF READS THAT OVERLAP

SNPS THAT ARE CALLED DURING ASSEMBLY PROCESS
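
Both results can be pictured as a column-by-column pass over reads that have already been aligned to each other. The following is a simplified sketch under that assumption, not MIRA's algorithm: reads are given as equal-length strings padded with `-`, and `min_minor_count` is a hypothetical threshold for reporting a second allele.

```python
from collections import Counter


def consensus_and_snps(aligned_reads, min_minor_count=2):
    """Majority-vote consensus plus candidate SNP columns.

    `aligned_reads` are equal-length strings; '-' marks positions a read
    does not cover. A column is reported as a SNP candidate when the
    second most frequent base occurs at least `min_minor_count` times.
    """
    consensus = []
    snps = []
    for col in range(len(aligned_reads[0])):
        counts = Counter(r[col] for r in aligned_reads if r[col] != "-")
        if not counts:
            consensus.append("N")  # no read covers this column
            continue
        ranked = counts.most_common()
        consensus.append(ranked[0][0])
        if len(ranked) > 1 and ranked[1][1] >= min_minor_count:
            snps.append((col, ranked[0][0], ranked[1][0]))
    return "".join(consensus), snps


if __name__ == "__main__":
    reads = [
        "ACGTAC",
        "ACGTAC",
        "ACATAC",
        "ACATAC",
        "ACGTAC",
        "-CGTTC",  # single T at column 4 is treated as a sequencing error
    ]
    print(consensus_and_snps(reads))
    # prints ('ACGTAC', [(2, 'G', 'A')])
```

Requiring the minor base to be seen more than once is the redundancy argument in its simplest form: an isolated mismatch is more likely an error than a real variant.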

Page 29: Next Generation Sequencing & Transcriptome Analysis

SNP DETECTION

TOOLS:

MIRA

QUALITYSNP

AND SOME MORE

Page 30: Next Generation Sequencing & Transcriptome Analysis

QUALITYSNP

USES CAP3-FILES

INPUT: CLUSTERS OF POTENTIAL HAPLOTYPES

CALCULATES SIMILARITY BETWEEN SEQUENCES TO CONSTRUCT HAPLOTYPES AND REMOVES PARALOGS

Page 31: Next Generation Sequencing & Transcriptome Analysis

QUALITYSNP

REMOVES HAPLOTYPES THAT CONSIST OF ONLY ONE SEQUENCE

DETECTS SYNONYMOUS AND NON-SYNONYMOUS SNPS

PROVIDES A WEB-FRONTEND CALLED HAPLOSNPER
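
The distinction between synonymous and non-synonymous SNPs comes down to whether the changed codon still encodes the same amino acid. A minimal sketch, assuming the coding sequence is in frame from its first base and using the standard genetic code; `classify_snp` is a name made up for this example and does not reflect QualitySNP's internals.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, position, alt_base):
    """Classify a single-base change in an in-frame coding sequence.

    `position` is the 0-based index of the SNP within `cds`.
    """
    codon_start = position - position % 3
    offset = position % 3
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "non-synonymous"


if __name__ == "__main__":
    cds = "ATGGCTGAA"  # Met-Ala-Glu
    print(classify_snp(cds, 5, "C"))  # GCT -> GCC, still Ala: synonymous
    print(classify_snp(cds, 6, "A"))  # GAA -> AAA, Glu -> Lys: non-synonymous
```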

Page 32: Next Generation Sequencing & Transcriptome Analysis

HOMOLOGY DETECTION

ALLOWS FINDING GENES THAT SHARE A COMMON ANCESTOR

USUALLY ONE SEARCHES AGAINST A DATABASE

Page 33: Next Generation Sequencing & Transcriptome Analysis

HOMOLOGY DETECTION

DIFFERENT KINDS OF SEARCHES:

PROTEIN AGAINST PROTEIN

NUCLEOTIDE AGAINST NUCLEOTIDE

PROTEIN AGAINST NUCLEOTIDE

NUCLEOTIDE AGAINST PROTEIN
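
In the BLAST family, each of these four combinations has its own program. The lookup below is a small sketch that reads "X AGAINST Y" as "query type X, database type Y", which is an assumption about the slide's wording; the helper name is invented for the example.

```python
# (query type, database type) -> BLAST program name
BLAST_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "nucleotide"): "tblastn",  # database is translated
    ("nucleotide", "protein"): "blastx",   # query is translated
}


def choose_blast_program(query_type, database_type):
    """Pick the BLAST program for a query/database combination."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {query_type} vs {database_type}"
        ) from None


if __name__ == "__main__":
    # Typical transcriptome case: assembled contigs against a protein database.
    print(choose_blast_program("nucleotide", "protein"))  # prints blastx
```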

Page 34: Next Generation Sequencing & Transcriptome Analysis

HOMOLOGY DETECTION

TOOLS:

BLAST

FASTX / FASTY

HMMER

PATTERNHUNTER

Page 35: Next Generation Sequencing & Transcriptome Analysis

BLAST

AVAILABLE FOR ALL TYPES OF COMPARISONS

ONE OF THE OLDEST ALGORITHMS

WIDELY USED

SPEED OVER SENSITIVITY

Page 36: Next Generation Sequencing & Transcriptome Analysis

FASTX / FASTY

PART OF THE FASTA PACKAGE

COMPARE NUCLEOTIDE QUERIES AGAINST PROTEIN SEQUENCES

DETERMINE A HYPOTHESIZED CODING REGION (HCR)

FASTX IS FASTER, FASTY IS MORE ACCURATE

Page 37: Next Generation Sequencing & Transcriptome Analysis

HMMER

PROTEIN-QUERIES AGAINST PROTEIN-DATABASE

USES HIDDEN MARKOV MODELS

MAPS SMITH-WATERMAN PARAMETERS ONTO A PROBABILISTIC MODEL

IMPROVES ACCURACY

Page 38: Next Generation Sequencing & Transcriptome Analysis

PATTERNHUNTER

NUCLEOTIDE-QUERIES AGAINST OTHER NUCLEOTIDE-SEQUENCES

USES NON-CONSECUTIVE SEEDS FOR INCREASED SENSITIVITY

COMPARES HUMAN GENOME TO MOUSE GENOME IN 20 CPU-DAYS
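
A non-consecutive (spaced) seed only requires matches at selected positions of a window, so a hit survives mismatches at the remaining positions. The sketch below uses the weight-11, span-18 seed published for PatternHunter; everything else (function name, indexing strategy) is an illustrative assumption rather than the tool's actual code.

```python
from collections import defaultdict

# '1' = position must match, '0' = position is ignored.
SPACED_SEED = "111010010100110111"  # 11 required positions over 18 bases


def seed_hits(query, target, seed=SPACED_SEED):
    """Return (query_pos, target_pos) pairs whose windows agree at all '1' positions."""
    care = [i for i, flag in enumerate(seed) if flag == "1"]

    def keys(seq):
        # Yield each window position with the bases at the required positions.
        for pos in range(len(seq) - len(seed) + 1):
            yield pos, "".join(seq[pos + i] for i in care)

    index = defaultdict(list)
    for pos, key in keys(target):
        index[key].append(pos)

    return [(qpos, tpos) for qpos, key in keys(query) for tpos in index.get(key, [])]


if __name__ == "__main__":
    query = "ACGTACGTACGTACGTAC"
    # Same sequence with mismatches at positions 3, 8 and 14 (all '0' positions).
    target = "ACGAACGTCCGTACTTAC"
    print(seed_hits(query, target))                 # prints [(0, 0)]
    print(seed_hits(query, target, seed="1" * 11))  # prints []
```

In this example a contiguous seed of the same weight finds nothing because no 11 consecutive bases match, while the spaced seed still reports the hit.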

Page 39: Next Generation Sequencing & Transcriptome Analysis

ORF DETECTION

READING FRAMES CAN BE DETECTED IN EST-DATA

ALLOWS SCREENING FOR PREVIOUSLY UNKNOWN GENES

YIELDS A POTENTIAL PROTEIN SEQUENCE
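
As a baseline for what reading-frame detection means, the sketch below translates a sequence in all six frames and keeps the longest stretch that starts with a methionine. It is a naive illustration with assumed names, using the standard genetic code; it has no error model, and because ESTs are often partial it also accepts an ORF that runs to the end of the sequence without a stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return (protein, frame, strand) of the longest M-initiated ORF.

    Frames are 0, 1, 2 on the given strand ('+') and on its reverse
    complement ('-'). Codons with unknown bases translate to 'X'.
    """
    best = ("", 0, "+")
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            codons = [s[i:i + 3] for i in range(frame, len(s) - 2, 3)]
            protein = "".join(CODON_TABLE.get(c, "X") for c in codons)
            # Stops ('*') split the translation into candidate segments.
            for segment in protein.split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best[0]):
                    best = (segment[start:], frame, strand)
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGGCTGAATAAGG"))
    # prints ('MAE', 2, '+')
```

A single inserted or deleted base shifts the frame and breaks this approach, which is the weakness that frameshift-tolerant tools are designed to address.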

Page 40: Next Generation Sequencing & Transcriptome Analysis

ORF DETECTION

TOOLS:

ESTSCAN

ORFPREDICTOR

...

Page 41: Next Generation Sequencing & Transcriptome Analysis

ESTSCAN

USES HIDDEN MARKOV MODELS

ROBUST TO FRAMESHIFT ERRORS

SENSITIVE (5 % FALSE NEGATIVES, 18 % FALSE POSITIVES)

Page 42: Next Generation Sequencing & Transcriptome Analysis

ORFPREDICTOR

WEB-BASED

USES BLASTX AS GUIDELINE IF POSSIBLE

USES A DEFINED RULESET FOR DEFINING ORFS

Page 44: Next Generation Sequencing & Transcriptome Analysis

GENE ANNOTATION

BLAST2GO: ANNOTATION VIA GENE ONTOLOGY

FINDS HOMOLOGOUS GENES TO ANNOTATE THE FUNCTIONS OF A GENE OF INTEREST

Page 45: Next Generation Sequencing & Transcriptome Analysis

GENE ONTOLOGY

3 ONTOLOGIES:

MOLECULAR FUNCTION

CELLULAR COMPONENT

BIOLOGICAL PROCESS

Page 46: Next Generation Sequencing & Transcriptome Analysis

CONCLUSIONS

NGS PROVIDES A FAST AND CHEAP WAY TO GENERATE DATA

TONS OF TOOLS EXIST TO ANALYSE TRANSCRIPTOME DATA

ALL TOOLS HAVE THEIR OWN PROS & CONS

MOST OF THOSE TOOLS ARE UNSUITABLE FOR A "NORMAL USER"