
RNA sequencing module - Wiki.uio.no · oslo.genomics.no

Jul 19, 2018


oslo.genomics.no  

RNA sequencing module

Wednesday 15.10.14 – Day 1
09.00  Welcome
09.10  RNA sequencing introduction
10.00  RNA-seq data analysis – Introduction
10.45  RNA-seq – practical part 1
12.00  Lunch
12.45  RNA-seq – practical part 2

Thursday 16.10.14 – Day 2
09.00  Introduction to genome-guided transcriptome assembly
09.30  Transcriptome assembly – practical part 1
12.00  Lunch
12.45  Transcriptome assembly – practical part 2
14.30  Functional annotation
15.00  Alternative RNA-seq applications
15.30  Questions and discussion


Dr.  Susanne  Lorenz    

Genomics  Core  Facility  Helse  Sør-­‐Øst  

 Dept.  of  Tumor  Biology  

The  Norwegian  Radium  Hospital,  OUS        

RNA sequencing - Introduction


From a Gene to RNA

[Figure: Gene A (exons 1-4) in the genome → transcription → pre-mRNA → splicing, poly(A) tailing → mRNA; shown for blood and brain, yielding Transcript A1 and Transcript A2.]

→ Messenger RNA (mRNA) will be translated into protein (coding RNAs)
→ In human: 20,000-25,000 protein-coding genes


From a Gene to RNA

[Figure: Gene B (exons 1-2) in the genome → transcription → pre-non-coding RNA → splicing → non-coding RNA transcript; shown for blood and brain.]

→ Non-coding RNA (ncRNA) will not be translated into protein
→ Some types of ncRNAs have a poly-A tail, others do not
→ Three main categories: housekeeping RNAs, short ncRNAs (< 200 bp) and long ncRNAs (> 200 bp)


Non-coding RNAs

Category                 Name                             Length (bp)     Function
Housekeeping RNAs        Ribosomal RNA (rRNA)             120-5000        ribosome structure
                         Transfer RNA (tRNA)              73-94           protein translation
                         Small nuclear RNA (snRNA)        ~150            splicing
                         Small nucleolar RNA (snoRNA)     70-200          post-transcriptional modification
Short non-coding RNAs    microRNAs                        16-30 (21-24)   translational repression
(small RNAs)             PIWI-interacting RNAs            26-31           regulate transposon activity and chromatin state
                         Promoter-associated short RNAs   ~18             may regulate gene expression at chromatin level
Long non-coding RNAs     Long intergenic ncRNA            > 200           epigenetic, transcriptional and post-transcriptional regulation
                         Pseudogenes                      > 200           competitive endogenous RNA
                         Enhancer RNA                     50-2000         not known
                         Antisense RNA                    > 200           gene expression
                         Long intronic ncRNA              > 200           not known
                         Repeat-associated long RNA       > 200           not known

→ Ribosomal RNA represents a challenge for RNA sequencing as it constitutes up to 80 % of total RNA


RNA sequencing

What is RNA sequencing?

Massively parallel sequencing to characterize and quantify transcriptomes (all actively transcribed genes)

What does RNA sequencing offer?

•  Identification of all actively transcribed genes in a cell type/tissue
•  Differential gene expression
•  Identification of new transcripts
•  Detection of alternative splicing events
•  Detection of fusion transcripts
•  Strand-specific measurements
•  Mutation analysis – expression level of genomic mutations, RNA editing


RNA  sequencing  in  comparison  

“RNA-Seq: a revolutionary tool for transcriptomics”, Wang Z. et al., 2009, Nature Reviews Genetics


RNA  sequencing  protocols  

1. mRNA (protein-coding) stranded sequencing
   → only poly-A-tailed RNA
   → no rRNA contamination, but mRNAs of genes encoding ribosomal proteins are still sequenced

2. Total RNA stranded transcriptome (ribosomal RNA depletion)
   → total RNA isolation followed by rRNA depletion
   → generates information about all RNA molecules longer than 120 bp, except rRNAs

3. Capture systems for stranded RNA sequencing
   → hybridization-based
   → dependent on annotation
   → increased sequencing depth at coding regions
   → suitable for very low amounts of starting material (10 ng)

   


Illumina TruSeq strand-specific RNA protocols

[Figure: TruSeq workflows for mRNA sequencing and total RNA sequencing; labelled step: 1. Poly-A selection]


Illumina TruSeq strand-specific RNA protocols

[Figure: flow cell]


Strand-specific total RNA sequencing - advantages

•  More even coverage along the transcript
   → significantly less 3′ bias compared to poly-A selection
   → more accurate quantification of gene expression


Strand-specific total RNA sequencing - advantages

[Figure panels: fresh frozen high-quality sample (RNA RIN value 9.0); formalin-fixed paraffin-embedded sample (RNA RIN value 6.0)]

•  Robust and efficient method even for low-quality samples


Strand-specific total RNA sequencing - advantages

•  Improved discrimination of overlapping transcripts
   → more accurate quantification of gene expression
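The point about overlapping transcripts can be illustrated with a minimal Python sketch. The gene names, coordinates and the `assign` helper are hypothetical and only meant to show the idea; for simplicity the read's strand is taken to be the strand of the transcript it came from, which real stranded protocols only provide after accounting for library orientation.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # inclusive genomic start
    end: int     # inclusive genomic end
    strand: str  # "+" or "-"


def assign(pos: int, strand: str, genes: list[Gene], stranded: bool) -> list[str]:
    """Return the names of all genes that could explain a read at `pos`."""
    hits = [g for g in genes if g.start <= pos <= g.end]
    if stranded:
        # Strand information removes candidates on the opposite strand.
        hits = [g for g in hits if g.strand == strand]
    return [g.name for g in hits]


# Two hypothetical genes that overlap on opposite strands (positions 400-500).
genes = [Gene("geneX", 100, 500, "+"), Gene("geneY", 400, 800, "-")]

unstranded_hits = assign(450, "-", genes, stranded=False)  # ambiguous
stranded_hits = assign(450, "-", genes, stranded=True)     # unambiguous
```

Without strand information the read in the overlap is compatible with both genes; with it, only one candidate remains, which is why quantification becomes more accurate.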


RNA sequencing - Illumina

1. Hybridization and amplification on the flow cell
2. Sequencing
3. Image analysis and base calling
4. Millions of short sequences in FASTQ format

   @HWUSI-EAS100R:6:73:941:1973#0/1
   AGCGTAACCGGTAACGATAGCAGAT
   +HWUSI-EAS100R:6:73:941:1973#0/1
   bbbbbbbb%%%++)(%%%%)1**((((***+
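A FASTQ record consists of four lines: a header starting with `@`, the base sequence, a separator starting with `+`, and one quality character per base. The following is a minimal Python sketch of reading such records; the function names and the example read are hypothetical, and the default quality offset of 33 is an assumption (older Illumina pipelines used an offset of 64).

```python
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def parse_fastq(lines: list[str]) -> Iterator[FastqRecord]:
    """Yield records from FASTQ text in which every record spans four lines."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = (line.rstrip("\n") for line in lines[i:i + 4])
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")
        yield FastqRecord(header[1:], seq, qual)


def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Convert quality characters to Phred scores (offset 33 assumed)."""
    return [ord(c) - offset for c in quality]


# Hypothetical six-base read, not taken from real data.
example = ["@read1/1", "ACGTAC", "+", "IIII##"]
records = list(parse_fastq(example))
scores = phred_scores(records[0].quality)
```

In practice an established parser would normally be preferred; the sketch only makes the four-line layout and the per-base quality encoding explicit.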


RNA sequencing - Illumina

[Figure: cDNA fragment with Read1 and Read2]

Single-end sequencing (Read1 only)
Paired-end sequencing (Read1 and Read2)
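The difference between the two modes can be sketched in a few lines of Python. This is a simplified illustration under the assumption that Read1 is read from one end of the cDNA fragment and Read2 from the opposite end on the complementary strand; the fragment sequence and function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def simulate_reads(fragment: str, read_length: int, paired: bool) -> tuple[str, ...]:
    """Return (Read1,) for single-end or (Read1, Read2) for paired-end."""
    read1 = fragment[:read_length]
    if not paired:
        return (read1,)
    # Read2 starts at the other end of the fragment, on the opposite strand.
    read2 = reverse_complement(fragment)[:read_length]
    return (read1, read2)


fragment = "ACGTTGCAAGGCTTAA"  # hypothetical 16-bp cDNA fragment
single = simulate_reads(fragment, 5, paired=False)
paired = simulate_reads(fragment, 5, paired=True)
```

Paired-end reads therefore anchor both ends of the same fragment, which helps when placing reads across splice junctions or repetitive regions.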


Scientific RNA sequencing case 1

“Autism spectrum disorder (ASD) is a common, highly heritable neuro-developmental condition characterized by marked genetic heterogeneity.” RNA-seq is used to investigate gene expression in autistic brain compared to normal brain.


Transcriptomic analysis of autistic brain

Heatmap of the top 200 differentially expressed genes between autism and control cortex samples

→ Distinct clustering of the majority of autism cortex samples, in contrast to the genomic heterogeneity (shown in GWAS study)
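How a list of "top differentially expressed genes" might be obtained can be sketched as follows. The slide does not describe the statistics used in the study, so this Python example simply ranks genes by the absolute log2 fold change of group means; the pseudocount, the function names and the expression values are all assumptions made for illustration.

```python
import math
from statistics import mean


def log2_fold_change(case: list[float], control: list[float],
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of mean expression (case / control), with a pseudocount."""
    return math.log2((mean(case) + pseudocount) / (mean(control) + pseudocount))


def top_genes(expr: dict[str, tuple[list[float], list[float]]], n: int) -> list[str]:
    """Return the n genes with the largest absolute log2 fold change."""
    ranked = sorted(expr, key=lambda g: abs(log2_fold_change(*expr[g])), reverse=True)
    return ranked[:n]


# Hypothetical expression values: (case samples, control samples) per gene.
expr = {
    "geneA": ([7.0, 7.0], [1.0, 1.0]),    # up in cases
    "geneB": ([3.0, 3.0], [3.0, 3.0]),    # unchanged
    "geneC": ([0.0, 0.0], [15.0, 15.0]),  # down in cases
}
top2 = top_genes(expr, 2)
```

A real analysis would additionally test for statistical significance and correct for multiple testing before selecting genes for a heatmap.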


Transcriptomic analysis of autistic brain

A) Significant expression differences between frontal and temporal cortex in control samples (top) and autism samples (bottom).

B) Top 20 genes differentially expressed between frontal and temporal cortex in controls. None of the genes show significant expression differences between frontal and temporal cortex in autism.


Transcriptomic analysis of autistic brain

Results:
•  Distinct transcriptomic differences between autism and control cortex samples, even though the disorder is heterogeneous at the genomic level (GWAS)
•  Gene ontology analysis showed down-regulated genes related to synaptic function, whereas up-regulated genes were related to immune and inflammatory response
•  Consistent expression in frontal and temporal cortex, compared to differential expression in normal samples

→ Gained knowledge about the biology behind the disease that can improve the development of diagnosis and treatment strategies

 


Scientific RNA sequencing case 2

“To identify the precise genetic elements and study the exclusive nature of three immunohistochemically different breast cancer types, we employed massively parallel mRNA sequencing.”


Transcriptomic landscape of breast cancer through mRNA sequencing

PCA plots showing the clustering of the TNBC (magenta), Non-TNBC (red) and HER2-positive (green) breast cancer samples based on the transcriptomic expression profiles. Table showing the number of statistically significant differentially expressed transcripts.


Transcriptomic landscape of breast cancer through mRNA sequencing

The table presents the six most common highly abundant primary transcripts and all of the associated information. The bottom four lines of the table show the primary transcript expression profiles specific for the TNBC and Non-TNBC (APOE) and HER2-positive (FN1, PP1B and OAZ1) groups.


Transcriptomic landscape of breast cancer through mRNA sequencing

•  Comparative transcriptomic analyses elucidated differentially expressed transcripts between the three breast cancer groups, identifying several new modulators of breast cancer.
•  Identification of common transcriptional regulatory elements, such as highly abundant primary transcripts, including osteonectin, RACK1, calnexin, calreticulin, FTL, and B2M, and “genomic hotspots” enriched in primary transcripts between the three groups.
•  The study opens previously unexplored niches that could enable a better understanding of the disease and the development of potential intervention strategies.


Scientific RNA sequencing case 3

Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses
Moran N. Cabili, Cole Trapnell, […], and John L. Rinn (2011)

In this study a reference catalog of > 8000 human lincRNAs is defined and characterized by sequence, structural and transcriptional features across 24 tissues and cell types.


Integrative annotation of human large intergenic noncoding RNAs

Computational approach for comprehensive annotation of lincRNAs

[Figure: panels A and B]


Integrative annotation of human large intergenic noncoding RNAs

Expression level of lincRNAs and protein-coding genes across the tissues (color intensity represents fractional density across the row)


Integrative annotation of human large intergenic noncoding RNAs

(B) Expression abundance of the 1508 highest expressed lincRNAs compared to the 8906 highest expressed protein-coding genes
    → lincRNAs are expressed at lower levels
(C) Distribution of maximal tissue specificity scores calculated from data in A
    → lincRNAs show higher tissue specificity
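The slide only states that a maximal tissue specificity score was calculated per gene. The Python sketch below assumes one common entropy-based definition: one minus the Jensen-Shannon distance between a gene's normalised expression profile and an idealised profile expressed in a single tissue, maximised over tissues. The exact normalisation used in the study may differ, and the function names and example profiles are hypothetical.

```python
import math


def _entropy(p: list[float]) -> float:
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(p: list[float], q: list[float]) -> float:
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return max(0.0, _entropy(m) - (_entropy(p) + _entropy(q)) / 2)


def max_specificity(expression: list[float]) -> float:
    """Maximal specificity score over all tissues (assumed definition)."""
    total = sum(expression)
    if total == 0:
        return 0.0
    p = [x / total for x in expression]
    scores = []
    for t in range(len(p)):
        # Idealised profile: expressed only in tissue t.
        ideal = [1.0 if i == t else 0.0 for i in range(len(p))]
        scores.append(1 - math.sqrt(js_divergence(p, ideal)))
    return max(scores)


specific = max_specificity([5.0, 0.0, 0.0, 0.0])    # one tissue only
ubiquitous = max_specificity([2.0, 2.0, 2.0, 2.0])  # evenly expressed
```

Under this definition a gene expressed in a single tissue scores 1, while an evenly expressed gene scores much lower, matching the direction of the comparison on the slide.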


Non-coding RNAs in human diseases

lincRNA HOTAIR

HOTAIR binds to Polycomb proteins that remodel chromatin marks, which leads to epigenetic silencing of, e.g., HOXD and increases the invasiveness of cancer cells.


Non-coding RNAs in human diseases

lncRNA in Alzheimer's disease

BACE1-AS, an antisense lncRNA, regulates the expression of the sense BACE1 gene (labelled BACE1-S in the figure) through the stabilization of its mRNA. BACE1-AS is elevated in Alzheimer's disease, increasing the amount of BACE1 protein and, subsequently, the production of β-amyloid peptide.


Non-coding RNAs in human diseases

snoRNA in Prader-Willi syndrome

The loss of the snoRNA in PWS changes the alternative splicing of the serotonin receptor HTR2C precursor mRNA (pre-mRNA), resulting in a protein with reduced function.


RNA-seq data set for the practical part

Aim: Identification of dysregulated genes in osteosarcoma

•  Most common primary malignant tumour of bone
•  Occurs mainly in the long bones (arm and leg) of children and adolescents
•  High-grade tumours that are very aggressive
•  Complex genomic aberrations

→ The high number of genomic aberrations is likely to have an effect on gene expression