Genomic signatures for metagenomic data analysis

Apr 05, 2018

Fabio Gori
  • 8/2/2019 Genomic signatures for metagenomic data analysis

    1/62

    Metagenomics and Binning | Genomic Signatures for Binning | Experiments

    Genomic Signatures for Metagenomic Data Analysis: Exploiting the Reverse Complementarity of Tetranucleotides

    Fabio Gori (1), Dimitrios Mavroedis (1), Mike S. M. Jetten (2), Elena Marchiori (1)

    (1) Radboud University Nijmegen, Institute for Computing and Information Sciences, The Netherlands
    (2) Radboud University Nijmegen, Department of Microbiology, The Netherlands

    Hong Kong University, 12 September 2011

    gori@science.ru.nl

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    2/62

    Table of Contents

    Metagenomics and Binning
    Genomic Signatures for Binning
    Experiments

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    4/62

    What is Metagenomics?

    Metagenomics: the study of microbial communities by analysing their genetic material

    Why?
    - 99% of microbes cannot be studied in laboratories
    - Understand interactions between organisms

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    7/62

    How? DNA Sequencing Technology

    Environmental sample → DNAs → small-insert library cloning → sequences (TACCACAGATATCAG...)

    A metagenomic dataset is made up of these DNA sequences

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    8/62

    What kind of data? A meta... jigsaw puzzle

    - Fragments of DNAs
    - Pieces are similar
    - Original pictures are unknown
    - Missing pieces

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    9/62

    An interesting problem: Metagenomic Binning

    Clustering together sequences sampled from the same genome

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    13/62

    Metagenomic binning

    Clustering together sequences sampled from the same genome (unsupervised approach)

    Sequences over $\{A, C, G, T\}$ → signature map $\rho$ → vectors in $\mathbb{R}^n$ → clustering

    In this study: focus on the signature map $\rho$

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    17/62

    What the signature map $\rho$ should do

    $\rho$ maps sequences $s$, $z$, $r$ to points $\rho(s)$, $\rho(z)$, $\rho(r)$ in $\mathbb{R}^n$.

    $\rho$ needs to be a genomic signature [Karlin et al., Trends in Genetics, 1995]:
    - $\rho(s) \approx \rho(z)$ when $s$ and $z$ come from the same genome
    - $\rho(s) \neq \rho(r)$ when $s$ and $r$ come from different genomes

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    22/62

    Typical $\rho$'s used in binning

    $T(s)$ := frequencies of the $4^k$ sequences of length $k$ ($k$-mers).

    Usually $k = 4 \Rightarrow 4^k = 256$ features: $T(s) \in \mathbb{N}^{256}$

    [Mohammed et al., Bioinformatics, 2011], [Diaz et al., BMC Bioinformatics, 2009],
    [Chan et al., J. Biomed. Biotech., 2008], [Teeling et al., Environ. Microb., 2004]

    Example:
    $s$ = AGCATGCAGCATATGTGGAGCA
    $T(s) = (\#AAAA = 0, \dots, \#AGCA = 3, \dots, \#ATAT = 1, \dots, \#GCAT = 2, \dots)$
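The tetranucleotide count vector from the example can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `tetranucleotide_counts` and the lexicographic feature order are illustrative assumptions, not something the slides specify, and windows containing letters other than A, C, G, T are silently ignored.

```python
from collections import Counter
from itertools import product


def tetranucleotide_counts(seq, k=4):
    """Count every overlapping k-mer of seq; returns a dict over all 4**k k-mers."""
    seq = seq.upper()
    observed = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Assumed feature order: lexicographic over A, C, G, T, so vectors are comparable.
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {w: observed.get(w, 0) for w in kmers}


T = tetranucleotide_counts("AGCATGCAGCATATGTGGAGCA")
print(len(T))                                       # 256
print(T["AAAA"], T["AGCA"], T["ATAT"], T["GCAT"])   # 0 3 1 2
```

The printed counts match the values listed in the example for AAAA, AGCA, ATAT and GCAT.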

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    23/62

    Metacluster and signature Rank [B. Yang et al., ACM-BCB, 2010]

    Symmetrized Signature $S: \mathcal{S} \to \mathbb{N}^{136}$
        $S_i(s) = \#w_i + \#w_i^C, \quad i = 1, \dots, 136$
        ($w_i^C$ is the reverse complement of the tetranucleotide $w_i$)

    Symmetrized Rank Signature $\mathrm{Rank}: \mathcal{S} \to S_{136}$
        $\mathrm{Rank}(s)$ := ranking induced by sorting the elements of $S(s)$.
        For instance, if $S(s) = (7, 0, 3)$ then $\mathrm{Rank}(s) = (1, 3, 2)$.

    Spearman Footrule distance between $s$ and $z$
        = Manhattan distance between $\mathrm{Rank}(s)$ and $\mathrm{Rank}(z)$
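A rough Python sketch of the symmetrized signature, the induced ranking, and the footrule distance follows. Several choices here are assumptions made for illustration: the class representative is the lexicographically smaller of a word and its reverse complement, a palindromic word (equal to its own reverse complement) is counted once, and ties in the ranking are broken by feature order. The slide does not state how the original method handles these cases.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    return word.translate(COMPLEMENT)[::-1]


# One representative per {w, w^C} class; for k = 4 this gives 136 classes.
CLASSES = sorted({min(w, reverse_complement(w))
                  for w in ("".join(p) for p in product("ACGT", repeat=4))})


def symmetrized_signature(seq):
    """S_i(s) = #w_i + #w_i^C (assumption: a palindromic w_i is counted once)."""
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    return [counts[w] if w == reverse_complement(w)
            else counts[w] + counts[reverse_complement(w)]
            for w in CLASSES]


def rank(values):
    """Rank 1 goes to the largest entry, so (7, 0, 3) becomes (1, 3, 2)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def footrule(s, z):
    """Spearman footrule: Manhattan distance between the two rank vectors."""
    rank_s = rank(symmetrized_signature(s))
    rank_z = rank(symmetrized_signature(z))
    return sum(abs(a - b) for a, b in zip(rank_s, rank_z))


print(len(CLASSES))      # 136
print(rank([7, 0, 3]))   # [1, 3, 2]
```

Because the signature pools each word with its reverse complement, `footrule(s, reverse_complement(s))` is 0 for any sequence `s`.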

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    24/62

    What is missing?

    $\rho$'s used in binning:
    - Not designed as signatures for metagenomic data
    - No thorough comparative analysis

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    26/62

    In this study

    1. Introduce new genomic signatures for binning
    2. Test & compare performances of new and known signatures
    3. ... and signature combinations (extra)
    4. Relation between taxonomic divergence & signature dissimilarity (extra)

    TEST: is $\rho$ a signature on metagenomics data?
    $\rho(s) \approx \rho(z)$ for $s$, $z$ from the same genome; $\rho(s) \neq \rho(r)$ for $s$, $r$ from different genomes

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    29/62

    Special requirements for metagenomics

    Genomic signature needs to:
    - Work on 1,000 bp sequences (standard test: 10,000 bp)
    - Not rely on the source genome
    - Be strand independent

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    30/62

    Sequences can be sampled from both strands

    s = AGCATGCAGCATATGTGGAGCA
        TCGTACGTCGTATACACCTCGT = s^C

    We want: $\rho(s) = \rho(s^C)$
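The strand requirement can be checked numerically. The sketch below compares plain tetranucleotide counts with counts pooled over reverse-complement pairs. The pooling scheme and every function name are illustrative assumptions rather than the signatures defined in the talk.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    return complement(seq)[::-1]


def kmer_counts(seq, k=4):
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def strand_symmetric_counts(seq, k=4):
    """Pool each k-mer with its reverse complement under one canonical key."""
    pooled = Counter()
    for word, n in kmer_counts(seq, k).items():
        pooled[min(word, reverse_complement(word))] += n
    return pooled


s = "AGCATGCAGCATATGTGGAGCA"
s_c = reverse_complement(s)  # the opposite strand, read 5' to 3'

print(complement(s))                                                # TCGTACGTCGTATACACCTCGT
print(kmer_counts(s) == kmer_counts(s_c))                           # False
print(strand_symmetric_counts(s) == strand_symmetric_counts(s_c))   # True
```

Plain counts differ between the two strands, while the pooled counts coincide, which is the property $\rho(s) = \rho(s^C)$ asks for.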

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    33/62

    Tested signatures:
    - Signatures exploit frequencies of subsequences (length = 4)
    - 3 known signatures, S not used in metagenomics
    - 6 new strand independent signatures

    Data:
    - 1,284 prokaryotic genomes (NCBI)
    - Sequence length: 1,000 bp [B. Yang et al., ACM-BCB, 2010]
    - Max output of 454 GS FLX+ System

    Dissimilarity measured with signature distance (Manhattan):

    $d(\rho(s), \rho(z)) := \|\rho(s) - \rho(z)\|_1 = \sum_{i=1}^{n} |\rho_i(s) - \rho_i(z)|$

    [Mohammed et al., Bioinformatics, 2011], [Mrázek et al., Mol. Biol. Evol., 2009],
    [Bohlin et al., ScientificWorldJournal, 2011], [Karlin et al., Annu. Rev. Genet., 1998]
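For concreteness, the Manhattan signature distance amounts to a one-line sum. The helper name `manhattan` and the toy vectors below are made up for illustration and do not come from the study.

```python
def manhattan(u, v):
    """d(u, v) = sum_i |u_i - v_i| for two signature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("signature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


print(manhattan([7, 0, 3], [5, 2, 3]))  # 4
```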

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    34/62

    Performance evaluation

    $d(\rho(s_h), \rho(s_i))$: WITHIN-genome distance

    $\mathrm{Specificity}(t) = \dfrac{\#\{\text{within-distances} \le t\}}{\#\{\text{within-distances}\}}$

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    38/62

    How we compare: ROC curve

    For each distance threshold $t$:

    $\mathrm{Sensitivity}(t) = \dfrac{\#\{\text{between-distances} > t\}}{\#\{\text{between-distances}\}}$

    $\mathrm{Specificity}(t) = \dfrac{\#\{\text{within-distances} \le t\}}{\#\{\text{within-distances}\}}$
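The two ratios translate directly into code. In this small sketch the distance lists are toy values invented for illustration, not numbers from the experiments.

```python
def sensitivity(between, t):
    """Fraction of between-genome distances strictly greater than t."""
    return sum(d > t for d in between) / len(between)


def specificity(within, t):
    """Fraction of within-genome distances less than or equal to t."""
    return sum(d <= t for d in within) / len(within)


within = [0.10, 0.20, 0.25, 0.40]    # toy within-genome distances
between = [0.30, 0.50, 0.60, 0.80]   # toy between-genome distances

print(sensitivity(between, 0.35), specificity(within, 0.35))  # 0.75 0.75
```

Raising the threshold trades sensitivity for specificity, which is what the ROC curve traces out as $t$ varies.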

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    40/62

    How we compare: ROC curve

    $\rho_a$ always better than $\rho_b$ if and only if ROC of $\rho_a$ above ROC of $\rho_b$

    Alternative index: Area Under the Curve (AUC)
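One way to turn the ROC curve into a single AUC value is trapezoidal integration over the points obtained at each observed distance. The sketch below assumes that approach. The slides do not say how the AUC was actually computed, and the distance lists are toy values.

```python
def roc_points(within, between):
    """(1 - specificity, sensitivity) at every observed distance, plus both end points."""
    thresholds = [float("-inf")] + sorted(set(within) | set(between))
    points = []
    for t in thresholds:
        sens = sum(d > t for d in between) / len(between)
        spec = sum(d <= t for d in within) / len(within)
        points.append((1.0 - spec, sens))
    return sorted(points)


def auc(points):
    """Trapezoidal area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


within = [0.10, 0.20, 0.25, 0.40]    # toy within-genome distances
between = [0.30, 0.50, 0.60, 0.80]   # toy between-genome distances

print(auc(roc_points(within, between)))  # 0.9375
```

With these toy values, 15 of the 16 (within, between) pairs are ordered correctly, which agrees with the area of 0.9375. A signature whose within and between distances separate perfectly would reach 1.0.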

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    42/62

    Results

    Table: Comparison of genomic signatures

    Signature   AUC     Feat.
    S           0.912   136
    max         0.900   120
    T           0.884   256
    min         0.881   120
    I           0.851   16
    Rank        0.794   136
    Ratio 1     0.707   120
    Ratio 2     0.686   120
    JS          0.573   120

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    43/62

    Conclusion

    What we did
    - First comparative test of signatures for metagenomics
    - 1,284 prokaryotic genomes studied
    - New signatures designed for metagenomics

    Results
    - Support some known signatures (S better than T but not used)
    - New signatures: comparable results with fewer features

    Future work
    - Test on shorter sequences (150-500 bp) - preliminary results
    - Analyze performances at other taxonomic levels (family, genus, ...)

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    50/62

    Thank you!

    Questions?

    gori@science.ru.nl

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    52/62

    Within-genome distance values: derivation

    For each genome (1,284) → 10,000 sequences → compute $10{,}000^2$ distances (all pairs) → mean
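The within-genome derivation can be imitated at toy scale. The sketch below samples fragments uniformly at random from a single sequence and averages the Manhattan distance over all unordered pairs of fragments. The sampling scheme, the use of plain tetranucleotide counts, the pairing convention, and the random stand-in genome are all assumptions made for illustration. The talk itself uses 1,284 real genomes and 10,000 sequences per genome.

```python
import random
from collections import Counter
from itertools import combinations


def kmer_vector(seq, k=4):
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan(u, v):
    # Counter returns 0 for k-mers that are absent from one of the fragments.
    return sum(abs(u[w] - v[w]) for w in set(u) | set(v))


def mean_within_distance(genome, n_fragments, length, rng):
    """Sample fragments uniformly at random and average all pairwise distances."""
    starts = [rng.randrange(len(genome) - length + 1) for _ in range(n_fragments)]
    vectors = [kmer_vector(genome[a:a + length]) for a in starts]
    distances = [manhattan(u, v) for u, v in combinations(vectors, 2)]
    return sum(distances) / len(distances)


rng = random.Random(0)
toy_genome = "".join(rng.choice("ACGT") for _ in range(5000))  # random stand-in, not real data
print(mean_within_distance(toy_genome, n_fragments=20, length=1000, rng=rng))
```

The between-genome value would follow the same pattern, with each pair made of one fragment from each of two genomes.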

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    53/62

    Between-genome distance values: derivation

    For 8,000 genome pairs → 10,000 sequence pairs → compute distances and take the mean

    1,000 genome pairs for each level of taxonomic diversity, randomly selected

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    54/62

    Taxonomic diversity

    Two genomes $g_i$, $g_j$ have taxonomic diversity at rank $r$ iff the Lowest Common Ancestor of $g_i$ and $g_j$ is at rank $r$.

    [Figure: tree over genomes $g_1, \dots, g_{12}$ with an LCA label]
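The definition can be made concrete by comparing two lineages rank by rank. In this sketch the list of ranks, the dictionary layout, and the placeholder taxon names are all hypothetical. The slides do not list the ranks that were actually used.

```python
# Assumed rank order, from most general to most specific (hypothetical choice).
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def divergence_rank(lineage_a, lineage_b):
    """Return the rank of the lowest common ancestor of two lineages.

    Each lineage maps a rank name to a taxon name. The result is None when the
    two genomes share no taxon at any listed rank.
    """
    lca = None
    for r in RANKS:
        if lineage_a.get(r) is None or lineage_a.get(r) != lineage_b.get(r):
            break
        lca = r
    return lca


# Placeholder lineages, not real taxonomy.
g1 = {"phylum": "P1", "class": "C1", "order": "O1", "family": "F1", "genus": "G1", "species": "S1"}
g2 = {"phylum": "P1", "class": "C1", "order": "O1", "family": "F1", "genus": "G2", "species": "S2"}

print(divergence_rank(g1, g2))  # family
```

Here the two placeholder genomes agree down to the family and differ from the genus onward, so their lowest common ancestor sits at the family rank.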

  • 8/2/2019 Genomic signatures for metagenomic data analysis

    55/62

    Taxonomic divergence and signature distance

    For each signature:
    For each pair of ranks $(r_1, r_2)$:
    Check that:
    Distance distribution $r_1$