
2014 WINTER SCHOOL IN
MATHEMATICAL & COMPUTATIONAL BIOLOGY

7-11 July 2014

Auditorium
Queensland Bioscience Precinct
The University of Queensland
Brisbane, Australia

PROGRAM

Hosted by:
ARC Centre of Excellence in Bioinformatics
IMB


 

                       

2014 Winter School in Mathematical and Computational Biology
7-11 July 2014

http://bioinformatics.org.au/ws14

Queensland Bioscience Precinct (Building #80)
The University of Queensland
Brisbane, Australia


Monday 7 July 2014
NEXT GENERATION SEQUENCING & BIOINFORMATICS

8:15 a.m.  REGISTRATION OPENS

9:00 a.m.  Welcome and introduction
Dr Nicholas Hamilton
Institute for Molecular Bioscience, The University of Queensland

9:05 a.m.  Next-generation sequencing: an overview of technologies and applications
Dr Ken McGrath
Australian Genome Research Facility Ltd (AGRF), Brisbane Node (The University of Queensland)

9:45 a.m.  NGS mapping, errors and quality control
Dr Felicity Newell
The University of Queensland Diamantina Institute

10:30 a.m.  Morning Tea

11:00 a.m.  Defensive NGS informatics – what can go wrong and how do you know when to throw in the towel?
Mr John Pearson
QIMR Berghofer Medical Research Institute

11:45 a.m.  Structural variants detection using whole genome sequencing
Dr Ann-Marie Patch
Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland

12:30 p.m.  Lunch

1:30 p.m.  De novo genome assembly
Dr Torsten Seemann
Victorian Bioinformatics Consortium, Monash University

2:00 p.m.  Introduction to RNA-seq
Dr Nadia Davidson
Murdoch Childrens Research Institute, Royal Children’s Hospital

2:30 p.m.  RNA-seq differential expression
Dr Annette McGrath
CSIRO Canberra

3:00 p.m.  Afternoon Tea


3:30 p.m.  MicroRNAs – sequencing, analysis … and then what?
Dr Nicole Cloonan
Genomic Biology Laboratory, QIMR Berghofer Medical Research Institute

4:00 p.m.  NGS experimental design and statistical power
Dr Stephen Rudd
QFAB Bioinformatics

4:30 p.m.  Genomic infrastructure for NGS
A/Professor Mik Black
School of Medical Sciences, University of Otago

5:00 p.m.  What the Australian Bioinformatics Network can do for you
Dr David Lovell
CSIRO

5:30 p.m.  Welcoming BBQ

   


Tuesday 8 July 2014
NEXT GENERATION SEQUENCING & BIOINFORMATICS

9:00 a.m.  Great expectations or why sequencing platforms are not magic wands
Dr Lauren Bragg
CSIRO
Dr Michael Imelfort
School of Chemistry & Molecular Biosciences, The University of Queensland

10:00 a.m.  Phylogeny-based methods for analysing and comparing uncultured microbial communities
A/Professor Aaron Darling
University of Technology, Sydney

10:30 a.m.  Morning Tea

MODELLING FROM HIGH-THROUGHPUT BIO-DATA

11:00 a.m.  Exploring the structure of whole-genome conservation profiles using Bayesian segmentation
Dr Jonathan Keith
School of Mathematical Sciences, Monash University

11:45 a.m.  Machine learning in action
Ms Tatyana Goldberg
Technische Universität München, Germany

12:30 p.m.  Lunch

1:30 p.m.  Detection of recombination events in bacterial genomes
Dr Nouri Ben Zakour
School of Chemistry and Molecular Biosciences, The University of Queensland

2:15 p.m.  Epigenomics: The many garments of the genome sequence
Dr Fabian Buske
Garvan Institute of Medical Research

3:00 p.m.  Afternoon Tea

3:30 p.m.  Mixed linear model analyses of human complex traits using SNP data
Dr Jian Yang
Queensland Brain Institute, The University of Queensland

4:15 p.m.  Detection and replication of epistasis influencing transcription in humans
Dr Joseph Powell
Queensland Brain Institute, The University of Queensland

5:00 p.m.  An introduction to BRAEMBL services
Dr Webber Liao
BRAEMBL

     


Wednesday 9 July 2014
MODELLING FROM HIGH-THROUGHPUT BIO-DATA

9:00 a.m.  The future of DNA sequencing technology
Professor Graham Taylor
University of Melbourne

9:45 a.m.  Population-scale high-throughput sequencing data analysis
Dr Denis Bauer
Computational Informatics (CCI), CSIRO Sydney

10:30 a.m.  Morning Tea

11:00 a.m.  Translating exome and whole genome sequencing to the clinic
A/Professor Marcel Dinger
Garvan Institute of Medical Research

12:00 noon  Panel discussion
Moderated by Dr Nicole Cloonan
QIMR Berghofer Medical Research Institute

*** FREE WEDNESDAY AFTERNOON ***


Thursday 10 July 2014
BIG DATA, STATISTICS AND APPLICATIONS

8:45 a.m.  Taming the Big Data Dragon
Professor John Quackenbush
Dana-Farber Cancer Institute & Harvard School of Public Health, USA

10:00 a.m.  From Big Data to smart knowledge – integrating multimodal biological data and modelling metabolism
Professor Falk Schreiber
Monash University; University Halle-Wittenberg, Germany

10:40 a.m.  Morning Tea

11:10 a.m.  Visual analytics of Big Data
Professor Seok-Hee Hong
University of Sydney

11:50 a.m.  The life-sciences as a pathfinder in data-intensive research practice
Dr Andrew Treloar
Australian National Data Service (ANDS)

12:30 p.m.  Lunch

1:30 p.m.  Statistical experiment design principles for biological studies
Dr Alec Zwart
CSIRO

2:15 p.m.  Genome-wide association studies
Professor David Evans
The University of Queensland Diamantina Institute

3:00 p.m.  Afternoon Tea

3:30 p.m.  Mixture models for analysing transcriptome and ChIP-chip data
Dr Marie-Laure Martin Magniette
Mathématiques et Informatique Appliquées, Institut National de la Recherche Agronomique, France

4:15 p.m.  Multivariate models for dimension reduction and biomarker selection in omics data
Dr Kim-Anh Lê Cao
The University of Queensland Diamantina Institute

     


Friday 11 July 2014
MOLECULAR PHYLOGENETICS

9:00 a.m.  An introduction to phylogenetic inference
Dr Robert Lanfear
Macquarie University

10:15 a.m.  Morning Tea

10:45 a.m.  Loss of information at deeper divergences, and what we can do about it
Distinguished Professor David Penny
Institute of Fundamental Sciences, Massey University, NZ

*** IMB FRIDAY SEMINAR ***
12:00 noon  From mutation to macroevolution
Professor Lindell Bromham
Research School of Biology, Australian National University

1:00 p.m.  Lunch

2:00 p.m.  The application of high throughput DNA barcoding for landscape ecology and management
Professor Mike Wilkinson
School of Agriculture Food & Wine, The University of Adelaide

3:00 p.m.  Open forum/questions
Moderated by Professor Mark Ragan
Institute for Molecular Bioscience, The University of Queensland

4:00 p.m.  Student travel award presentation & close of Winter School

   

~*~*~*~*~  

BIOGRAPHY AND ABSTRACT

   

                   

Dr Ken McGrath
Australian Genome Research Facility Ltd (AGRF)
Brisbane Node (The University of Queensland)

Biography:
Ken McGrath is the manager of the Brisbane Lab of the Australian Genome Research Facility. He completed his undergraduate degree with honours in 2001 at QUT, working with the plant biotechnology group on developing transgenic bioreactors, and transitioned to UQ for his PhD work investigating the genetic regulation of plant defence responses to disease, in collaboration with CSIRO and the CRC for Tropical Plant Protection. Following this, his post-doctoral research with the Schmidt and Schenk labs at UQ involved examining the transcriptomes of mixed microbial communities in industrial and agricultural settings. In 2009, Ken joined the AGRF as sequencing supervisor, and currently helps manage submissions and workflows on a range of next-generation sequencing platforms.

Date: Monday 7 July 2014

Presentation title: Next-generation sequencing: an overview of technologies and applications

Abstract:
The “Next-Generation Sequencing” landscape is one of constant change, with new and emerging technologies constantly competing with established platforms. This abundance of competition is resulting in faster and cheaper methods to perform sequencing of DNA and RNA samples, but it also brings with it a confusing array of options, each with its own strengths and weaknesses. Ken will give an overview of the available sequencing technologies and run through some example projects that can be run on them, as well as describe the typical bioinformatics approaches for these projects, and also take a look at what’s “next” in Next-Gen.


   

                     

Dr Felicity Newell
The University of Queensland Diamantina Institute

Biography:
Felicity originally trained in the fields of molecular and cell biology. In her PhD and first post-doctoral position at the University of Queensland, she investigated the role of growth factors in the differentiation of human preadipocytes. After this, she developed an interest in software development and bioinformatics, obtaining a Master of Information Technology from the Queensland University of Technology. She worked for two years as a software developer building bioinformatics web applications at QFAB Bioinformatics, before moving to the Queensland Centre for Medical Genomics at UQ. At QCMG, she developed software for the analysis of cancer sequencing data, including a tool used for the detection of structural variants. She is currently carrying out postdoctoral research at the University of Queensland Diamantina Institute, using next generation sequencing data to understand human disease, and has a particular interest in structural variation.

Date: Monday 7 July 2014

Presentation title: NGS mapping, errors and quality control

Abstract:
An important step in next generation sequencing is the alignment (mapping) of the short reads that are generated to a reference genome. Tools designed for mapping are required to efficiently and accurately align each read, and more than 60 applications are currently available for this purpose. In this presentation I will describe some of the approaches to sequence alignment, highlighting popular tools such as BWA, Novoalign and Bowtie.

An important consideration for mapping and downstream sequence analysis is the ability to recognise and deal with common errors and biases that can occur during the process. I will discuss some of the common errors that occur in next generation sequencing and the approaches to quality control that should be applied in order to obtain high quality data.


   

                   

Mr John Pearson
QIMR Berghofer Medical Research Institute

Biography:
John Pearson has qualifications in biochemistry, physiology, computing science and technology management and has spent 20 years creating software for scientists. John was Computer Systems Manager for the Genetic Epidemiology Laboratory at the Queensland Institute of Medical Research (QIMR) prior to moving to the United States in 2000, where he was the lead programmer in the Bioinformatics and Scientific Programming Core (BSPC) at the National Human Genome Research Institute (NHGRI) within the National Institutes of Health (NIH) in Bethesda, Maryland. In 2003, John left the NIH to become a founding Faculty member at the Translational Genomics Research Institute (TGen), where he led the Bioinformatics Research Unit and also served as a Division Director with oversight of all bioinformatics activities at TGen. John has held software development grants from the American Cancer Society, the National Institutes of Health and Microsoft, and has been focusing on next-generation sequencing since the end of 2007. John returned from the US to take up a position in early 2010 as Senior Bioinformatics Manager for the Queensland Centre for Medical Genomics (QCMG).

Date: Monday 7 July 2014

Presentation title: Defensive NGS informatics - what can go wrong and how do you know when to throw in the towel?

Abstract:
Next-generation sequencing has radically changed medical research by allowing deep interrogation of the DNA and RNA of pathogenic organisms, families with inherited disorders and the de novo mutations responsible for tumourigenesis.

As with any new technology, a "gold rush" mentality can arise where being first to the answer can push rigour and methodological soundness into the background. In this seminar, I'll talk from QCMG experience about some of the ways sequencing can go wrong, how the problems became apparent, what we did about them, and tools we developed to try to catch the same problems in future.


   

           

Dr Ann-Marie Patch
Queensland Centre for Medical Genomics
Institute for Molecular Bioscience
The University of Queensland

Biography:
Ann-Marie is currently a Senior Bioinformatics Researcher within the multi-skilled group at the Queensland Centre for Medical Genomics led by Prof Sean Grimmond. Her current research focuses on the detection of small indels and somatic structural rearrangements in ovarian, pancreatic and other cancers, with a personal interest in the mechanisms of DNA repair. Her PhD, gained in 2006 from the University of Exeter, UK, combined bioinformatics and laboratory approaches to study the nature of tandem repetitive elements in the model genomes of fission and budding yeast. She then joined the Peninsula College of Medicine & Dentistry as an associate research fellow in Prof Andrew Hattersley’s group, employing next generation sequencing to identify monogenic causes of neonatal diabetes and to identify causal mutations across a broad spectrum of genetic disorders for the Royal Devon and Exeter Molecular Genetics Laboratory.

Date: Monday 7 July 2014

Presentation title: Structural variants detection using whole genome sequencing

Abstract:
As part of the International Cancer Genome Consortium, the Queensland Centre for Medical Genomics has established a world class laboratory and computational infrastructure balanced with high level expertise to enable the analysis of whole human genomes for the presence of DNA, RNA and epigenetic variants that are associated with the hallmarks of cancer. This talk will describe and discuss the principles and challenges of identifying structural variants (SVs) using whole genome sequencing.

I will present the basis of detecting SVs, a tool developed at QCMG, and examples of how SV analysis can identify mechanisms driving tumorigenesis.


   

                 

Dr Torsten Seemann
Victorian Bioinformatics Consortium
Monash University

Biography:
Dr Torsten Seemann is the Scientific Director of the Victorian Bioinformatics Consortium at Monash University, and a Senior Research Scientist at the Life Sciences Computation Centre in Melbourne. He originally trained as a computer scientist and did his PhD in image processing and data compression, but his first postdoc in 2002 saw him thrown into the middle of Australia's first large genome project, and he hasn't looked back since. He specialises in microbial comparative genomics, genome assembly, and genome annotation, and is a strong believer in writing high quality, useful software tools and contributing back to the bioinformatics community. You can learn more at his group website www.bioinformatics.net.au, his blog TheGenomeFactory.blogspot.com.au, and on Twitter @torstenseemann.

Date: Monday 7 July 2014

Presentation title: De novo genome assembly

Abstract:
De novo assembly is the process of reconstructing a genome's DNA sequence using only a set of much shorter error-prone sequences (reads) sampled from the genome. It is the "original" genomics-based bioinformatics problem, because it is all we can do when we don't have any related reference genome sequences, with the exemplar being the original human genome project. This presentation will discuss the principles of and approaches to de novo assembly of data, and practical issues like computational and memory requirements, limitations of de novo assembly, terminology, file formats and available software, and will include an example run-through of an assembly using the Velvet software if time permits.


   

                 

Dr Nadia Davidson
Murdoch Childrens Research Institute
Melbourne

Biography:
Dr Nadia Davidson is a bioinformatician working within the Oshlack group at the Murdoch Childrens Research Institute, Melbourne. She was trained in physics and software engineering and completed her PhD in Experimental Particle Physics at the University of Melbourne in 2011. Her research interests include methodology development and analysis of next-generation RNA sequencing data. She has been involved in a diverse set of projects, ranging from studying sex development in birds to identifying genomic rearrangements in cancer, all with the common theme of de novo transcriptome assembly.

Date: Monday 7 July 2014

Presentation title: Introduction to RNA-Seq

Abstract:
The central dogma of genetics is that the genome, composed of DNA, encodes many thousands of genes that can be transcribed into RNA. Following this, the RNA may be translated into amino acids, giving a functional protein. While the genome of an individual will be identical for each cell throughout their body, the number of transcribed copies of each gene, as RNA, will differ due to the different functional requirements of each tissue type. An important area of research within genetics is to study the genome in action, through RNA. For example, the quantities of each gene’s RNA can be compared between different tissue types, through development, in disease or in different environments; this is known as differential gene expression analysis.

RNA-Seq, or high throughput RNA sequencing, has accelerated research in this area. The technology works by reverse transcribing the RNA back into DNA, shearing it into smaller fragments, then reading each fragment's sequence in parallel to give millions of short “reads”, each approximately 50 to 200 bases in length. With these data comes a computational and statistical challenge, because the biology must be inferred from millions of short sequences. Along with technical biases, there is true biological variability between samples of the same type, which must be accounted for.

In this talk I will discuss the applications of RNA-Seq, its challenges and some of the bioinformatics strategies being employed to analyse these complex data. In particular, I will focus on the steps involved in differential gene expression analysis, both for model organisms, like human, and for more exotic organisms without a sequenced genome.


   

                 

Dr Annette McGrath
CSIRO Canberra

Biography:
Dr Annette McGrath is the Bioinformatics Core Leader at CSIRO, where her team works on enhancing bioinformatics capability and developing and supporting enterprise bioinformatics infrastructure for CSIRO’s bioinformaticians and bioscientists. They also collaborate with researchers on a number of genomics research projects. She has qualifications in biochemistry, molecular biology and statistics, and has worked in bioinformatics roles since 1998, in industry, the not-for-profit sector and now CSIRO.

Date: Monday 7 July 2014

Presentation title: RNA-seq differential expression

Abstract:
RNA-seq has become one of the most popular applications of NGS technology. It is used to give a snapshot of the RNA that is present, and in what relative quantity, in a particular biological material at a given point in time. The previous presentation covers a number of applications of RNA-seq, including applications in non-model organisms. RNA-seq can be used for many applications, including spliced gene discovery, differential expression, RNA editing and detection of variants, and this talk will focus on the tools and methods of data analysis for these applications.


   

                 

Dr Nicole Cloonan
QIMR Berghofer Medical Research Institute

Biography:
Nicole Cloonan is an ARC Future Fellow who has recently established the Genomic Biology Laboratory at the QIMR Berghofer Medical Research Institute. Her work is multi-disciplinary in nature, involving computational biology and bioinformatics, biochemistry, cell biology, and molecular biology, all of which she uses to understand the complexity, function, and systems biology of RNA.

Date: Monday 7 July 2014

Presentation title: MicroRNAs - sequencing, analysis ... and then what?

Abstract:
MicroRNAs (miRNAs) are an important class of non-coding regulatory RNAs, which interfere with the translation of protein-coding mRNA transcripts. By incorporation into the RNA induced silencing complex (RISC), miRNAs can inhibit translation, promote sequestration of mRNAs to P-bodies, and/or destabilise and degrade target mRNAs. The small size of mature miRNAs (typically only 20 to 24 nucleotides) makes them ideal for characterisation using short-tag RNA-sequencing (RNA-seq) technologies, as you can capture the entire molecule in a single read. Unlike hybridisation approaches such as microarray profiling or Northern blotting, massive-scale sequencing provides a way to discriminate discrete but closely related RNA molecules, and profile miRNAs without a priori knowledge of expression.

MicroRNAs perform their biological roles by binding to mRNAs through Watson-Crick base-pairing. The attractive simplicity of using nucleotide complementarity to identify mRNA targets has given rise to many bioinformatics tools. These are based (to differing extents) on complementarity to the seed, evolutionary conservation, and free energy of binding.

So with great technology and plenty of well researched and well respected bioinformatics tools, miRNAs should be easy, right? This talk will systematically crush this rosy view of miRNAs as a field of study, and lay before you the desolate wasteland to navigate on your path to publication. Those towards the end of their PhD study on miRNAs may wish to avoid this talk.


   

                   

Dr Stephen Rudd
Queensland Facility for Advanced Bioinformatics (QFAB)
The University of Queensland

Biography:
Stephen is Head of Computational Biology at QFAB Bioinformatics, a bioinformatics and biostatistics services organisation based here at IMB. Over the last 15 years Stephen has worked as a genome biologist and bioinformatician in academia and industry in five different countries. He is a classical geneticist by training with a PhD in molecular biology and an adjunct professorship in plant genome bioinformatics. In a service provision role Stephen has seen some of the worst experimental designs imaginable (pharmaceutical industry) and regularly provides "-omics disaster recovery" for when complex sequence-based studies don't seem like such a good idea after all. He really does not like Excel and knows that you will appreciate reactive approaches to data visualisation. For this reason Stephen continues to develop open-source software with the aim of enabling comparative genomics by bridging the divide between bench-biologists and big-data.

Date: Monday 7 July 2014

Presentation title: NGS experimental design and statistical power

Abstract:
Today's sequencing platforms make it rather too easy to inexpensively generate hundreds of gigabases of DNA sequence data. It is advisable to plan your research study carefully before you start collecting samples, pooling controls and sequentially sending off your lovingly extracted cDNA to the cheapest sequencing service provider around. In this talk we will explore the anatomy of a sequence-based genomics study. We will consider experimental design and the selection of appropriate controls to design a simple hypothesis-driven project. Statistical power calculations will be used to determine the appropriate number of samples, and we will consider how potential batch effects associated with library preparation and sequencing may confound downstream analyses. Experimental metadata will be discussed and I will reference anecdotal studies where additional metadata would have greatly simplified the data interpretation. Building on the old adage of "Junk in :: Junk out", we will take a paranoid look at NGS data quality control and I will provide pointers to a number of suitable workflows for understanding whether a provided RNA-Seq / ChIP-Seq / exome or WGS dataset is really fit-for-purpose.
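As a rough illustration of the sample-size question raised in this abstract, the sketch below applies the textbook normal approximation for a two-sided, two-group comparison of a single, roughly normally distributed measurement (for example, one gene's log expression): n per group = 2((z_{1-alpha/2} + z_{1-beta}) / d)^2. The function name, effect sizes and default values are illustrative assumptions and are not taken from the talk. Count-based RNA-Seq designs usually call for dedicated power tools and a significance level adjusted for multiple testing.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    effect_size is the standardised difference (difference in means divided
    by the common standard deviation). This is the normal approximation, so
    it slightly underestimates the exact t-test answer when n is small.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = standard_normal.inv_cdf(power)           # quantile for target power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical effect sizes, chosen only to show how quickly n grows
    # as the expected difference shrinks.
    for d in (0.5, 1.0, 2.0):
        print(f"effect size {d}: {samples_per_group(d)} samples per group")
```

Halving the expected effect size roughly quadruples the number of samples needed, which is why the expected effect is worth estimating before any samples are collected.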

A/Professor Mik Black
Department of Biochemistry
University of Otago
Dunedin, New Zealand

Biography:
Mik received a BSc (Hons) in statistics from the University of Canterbury, and an MSc (mathematical statistics) and PhD (statistics) from Purdue University. After completing his PhD in 2002, Mik returned to New Zealand to work as a lecturer in the Department of Statistics at the University of Auckland. An ongoing involvement in a number of Dunedin-based collaborative genomics projects resulted in a move to the University of Otago in 2006. Mik's research focuses on the development and application of statistical methods for the analysis of data from genomics experiments, with a particular emphasis on human disease. Mik is also heavily involved in two major initiatives designed to put in place sustainable national research infrastructure for NZ: NZGL (New Zealand Genomics Ltd) for genomics (where he was the interim Bioinformatics Team Leader during 2012-2013), and NeSI (New Zealand eScience Infrastructure) for computing/eResearch.

Date: Monday 7 July 2014

Presentation title: Genomic Infrastructure for NGS

Abstract:
In the current research environment, the ability to manage, analyse and interpret data produced by high-throughput sequencing platforms has become an essential skill for both wet- and dry-lab researchers. While a number of options exist for outsourcing these tasks, the reality is that researchers still need (and desire) a level of analytic skill that allows them to perform basic exploratory analysis of their data, without having to rely on external assistance.

In this talk, I will discuss some of the infrastructure initiatives that have been undertaken in New Zealand and Australia to provide both genomics and bioinformatics support for researchers, as well as highlighting some of the tools and skills that help to ensure the robustness and reproducibility of the analyses being carried out.

Dr Lauren Bragg
CSIRO

Biography:
Lauren is a research scientist in the CSIRO Digital Productivity and Services Flagship. Lauren completed her Bachelor of Science (Bioinformatics) at the University of Sydney in 2005, and subsequently worked for a year as a software developer for the Capital Markets CRC. In 2007, Lauren joined CSIRO’s Division of Mathematics, Informatics and Statistics at North Ryde, and worked on a variety of projects spanning microarray design, analysis and genomic tool development. Inspired by the global oceanic survey (GOS) study, Lauren moved to Queensland to begin a CSIRO-UQ PhD in the area of metagenomics, supervised by Professor Gene Tyson. Lauren's thesis focused on the development of statistical and computational methods for analysis of environmental sequencing, where she established a protocol for metagenome assembly, developed a novel tool for correcting errors in pyrosequenced amplicons ('Acacia'), and evaluated the quality of Ion Torrent PGM as a platform for environmental sequencing applications. Upon completing her thesis research in 2012, Lauren returned to CSIRO. Using metabolomic information as a proxy for metabolic capability and activity, she is developing expression models that will predict the consequences of controlled perturbations (such as dietary changes and probiotics) on the complex microbial communities present in the digestive tracts of animals and humans.

Dr Michael Imelfort
School of Chemistry & Molecular Biosciences
The University of Queensland

Biography:
Michael is a bioinformatician at the Australian Centre for Ecogenomics, The University of Queensland. During his PhD he worked almost exclusively with plant genomes, but now he focusses on the genomics of environmental microbial communities, particularly those communities which cannot be cultured. His current research involves finding ways to merge and analyse data produced using a variety of DNA sequence-based experimental frameworks, including 16S pyrotag community profiling and metagenomic and meta-transcriptomic sequencing. Recently he has been developing novel techniques that cluster metagenomic contigs into population-specific groups (differential coverage binning).

Date: Tuesday 8 July 2014

Presentation title: Great expectations or why sequencing platforms are not magic wands

Abstract:
Environmental microbial sequencing (e.g. amplicon sequencing, metagenomics, metatranscriptomics) provides a culture-independent means to investigate the composition, genomic potential and activity of microbial communities. These approaches have been rapidly and widely adopted, with the result that the corresponding data typically constitute a critical component of many studies. Unfortunately, highly complex microbiomes coupled with poor experimental design and unrealistic goals have all too often led to doomed studies, disappointment and tears. In this tag-team talk, we provide an overview of environmental microbial sequencing techniques, with a focus on appropriate experimental design and bioinformatic analyses. We aim to provide a broad overview of what can be achieved with a HiSeq and some derring-do, and what cannot. We will illustrate our main points with a few case studies.

A/Professor Aaron Darling
University of Technology Sydney

Biography:
A/Professor Aaron Darling is an internationally recognised expert in computational biology and bioinformatics. Darling’s career began 14 years ago in the team that sequenced the first few E. coli genomes and he went on to develop the widely used Mauve software for genome analysis and comparison. Darling has been awarded several competitive fellowships, research grants, and industry sponsored research contracts. He has published over 50 manuscripts in journals ranging from PLoS to PeerJ to Nature.

Date: Tuesday 8 July 2014

Presentation title: Phylogeny-based methods for analysing and comparing uncultured microbial communities

Abstract:
Sequencing of uncultured microbial communities via both shotgun metagenomic and 16S amplicon methods has provided great insight into the diversity of microbes and their roles in the environment and human health. The most commonly used methods for analysing such datasets are based on identification of Operational Taxonomic Units (OTUs): collections of sequences within a predefined percent nucleotide identity. These OTU-based approaches have some shortcomings, such as ambiguity in OTU definition and limited resolution. In this seminar I will review recent work on alternative approaches to quantifying and comparing microbial community diversity using Bayesian phylogenetic inference. This will include an introduction to basic phylogenetic models and the concepts of alpha and beta diversity in microbial communities.
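For readers new to the terms alpha and beta diversity, the sketch below computes one common measure of each from OTU count vectors: the Shannon index (diversity within a sample) and the Bray-Curtis dissimilarity (difference between two samples). These are simple count-based measures chosen for illustration; they are not the Bayesian phylogenetic methods the seminar reviews, and the function names and example counts are assumptions.

```python
import math


def shannon_index(counts):
    """Alpha diversity: Shannon index (natural log) of one sample's OTU counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    # OTUs with zero reads contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples.

    Both vectors must list counts for the same OTUs in the same order.
    Returns 0.0 for identical samples and 1.0 for samples sharing no OTUs.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same OTUs")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / total


if __name__ == "__main__":
    # Hypothetical read counts for four OTUs in two samples.
    gut = [40, 30, 20, 10]
    soil = [5, 5, 45, 45]
    print("Shannon (gut): ", round(shannon_index(gut), 3))
    print("Shannon (soil):", round(shannon_index(soil), 3))
    print("Bray-Curtis:   ", round(bray_curtis(gut, soil), 3))
```

Both measures treat every OTU as equally distinct from every other, which is one reason phylogeny-aware alternatives are attractive.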

Dr Jonathan Keith
Faculty of Science
Monash University

Biography:
Dr Jonathan Keith was awarded a PhD in mineral processing by the University of Queensland in 2000, and was a postdoctoral fellow there and at Queensland University of Technology before moving to Monash University. He has worked in Bayesian methodology and applications since 2000 and has developed a trans-dimensional generalisation of the Gibbs sampler and adaptive Markov chain Monte Carlo methods. His methods have been applied in comparative genomics to investigate the non-protein-coding fraction of eukaryotic genomes, and also in phylogenetics, in genetic linkage and association studies, and in modelling the spread of invasive pest species.

Date: Tuesday 8 July 2014

Presentation title: Exploring the structure of whole-genome conservation profiles using Bayesian segmentation

Abstract:
Conservation is a key indicator of function in genomes, and can potentially be used to discover novel functional non-protein-coding RNAs and regulatory sequences. However, recent investigations have demonstrated that a simple dichotomy between conserved and non-conserved sequence is too naïve a distinction to reflect the full complexity of the numerous types of structural and functional constraints acting on genomes. This presentation will discuss recent investigations into the detailed structure of whole-genome conservation profiles, using Bayesian segmentation techniques to identify multiple classes of conservation level. By integrating information about conservation with profiles of other properties indicative of function, including GC content and transition/transversion ratios, a much finer level of structure can be detected. The method has been applied to a range of species including Drosophila, zebrafish, malaria and bacterial genomes, and results from each of these will be presented. One key implication of these results is that the proportion of functionally constrained sequence in eukaryotic genomes may be very much larger than previously supposed. Another key implication is that genomic sequences may be subject to ephemeral functional constraints that act on too short a time scale to be detected in most comparative genomic studies. The functional content of various classes of conserved sequence will also be discussed.

Ms Tatyana Goldberg
Technical University of Munich
Germany

Biography:
Tatyana Goldberg is a PhD student in Bioinformatics at the Technical University of Munich, Germany. In her research Tatyana focuses on applying Machine Learning to answer various biological questions. In particular she is interested in the prediction of protein sub-cellular localisation and the understanding of micro-world warfare (prediction of bacterial pathogen effectors). Tatyana is leading students in several scientific projects, including those participating in “The CAFA Challenge” and “Google Summer of Code”.

Date: Tuesday 8 July 2014

Presentation title: Machine learning in action

Abstract:
Advances in high-throughput sequencing technologies have led to an enormous increase in the amount of data stored in public databases. The experimental annotation of this data, however, remains a challenging task, thus widening the sequence-to-annotation gap. Reliable computational methods for predicting protein function could counter this trend; they are becoming invaluable in the analysis and annotation of biological data. In this presentation I will give an introduction to machine learning and its applications in bioinformatics. Using the example of protein sub-cellular localisation prediction, I will discuss a typical workflow for applying machine learning methods and provide code samples.

Dr Nouri Ben Zakour
School of Chemistry and Molecular Biosciences
Australian Infectious Diseases Research Centre
The University of Queensland

Biography:
Dr Nouri Ben Zakour is a researcher in Microbial and Evolutionary Genomics with over 10 years of international experience in the field. After completing her PhD in Bioinformatics at the French National Institute for Agricultural Research, she held a post-doctoral position at the Roslin Institute, University of Edinburgh, to work on the genomic basis of host adaptation in staphylococcal species. In 2009, she joined the Australian Infectious Diseases Research Centre and School of Chemistry and Molecular Biosciences at the University of Queensland as a senior post-doctoral fellow. Working with Dr Scott Beatson and the Microbial Genomics Group, she has expanded her knowledge of the evolution of bacterial pathogens of medical and veterinary importance. Her interests range from population genetics and evolutionary genomics to functional genomics, to elucidate how pathogenic bacteria evolve to colonise new ecological niches and cause outbreaks.

Date: Tuesday 8 July 2014

Presentation title: Detection of recombination events in bacterial genomes

Abstract:
Bacteria have the extraordinary ability to evolve not only by accumulating point mutations, but also by acquiring foreign DNA through lateral gene transfer. They can also “reshuffle” alleles present in a bacterial population through a mechanism called homologous recombination, which allows them to exchange homologous DNA regions. Recombination can mediate large evolutionary jumps in bacterial genomes by rapidly spreading variants associated with increased virulence, antibiotic resistance or fitness. A corollary of this adaptive diversification is that laterally exchanged variations introduced by recombination conflict with the phylogenetic signal of vertically transmitted variations. Detecting recombination in bacterial genomes is not only essential to understand the patterns of bacterial evolution and adaptation, but can also be crucial when attempting to infer phylogenies.

A plethora of approaches has been developed in recent years to solve the computational challenges of detecting recombination events in bacterial genomes. I will review some of the current approaches used, with a particular emphasis on those adapted to large-scale population studies. I will also briefly illustrate, with some examples, how recent advances in the detection of recombination have helped shift some of the established dogmas of bacterial evolution.

Dr Fabian Buske
Garvan Institute of Medical Research

Biography:
Dr Fabian Buske specialises in Big Data analysis of sequence, epigenomic, transcriptomic and medical data. He did his PhD on nucleic acid triple helices at the Institute for Molecular Bioscience, The University of Queensland. In 2013, he joined Prof. Susan Clark's lab at the Garvan Institute of Medical Research in Sydney on the quest to advance cancer research. He accepted the challenge of integrating the wide array of epigenetic data sets, as well as extending our predominantly one-dimensional view of genomic data to the third dimension, in order to gain new insights into the cellular mechanisms that contribute to cancer.

Date: Tuesday 8 July 2014

Presentation title: Epigenomics: The many garments of the genome sequence

Abstract:
Epigenetic modifications are reversible modifications on the DNA that affect gene expression without changing the actual genome sequence. The spectrum of modifications ranges from DNA methylation, histone modification and nucleosome positioning to DNA packaging and chromatin organisation in three-dimensional space. This presentation will highlight different assays and bioinformatic approaches used to query epigenetic modifications genome-wide, as well as how these layers of information can be integrated into meaningful models.

Dr Jian Yang
Senior Research Fellow
Queensland Brain Institute
The University of Queensland

Biography:
Jian Yang is a Senior Research Fellow at Queensland Brain Institute, The University of Queensland. He received his PhD in 2008 from Zhejiang University, China, which was followed by postdoctoral research at the Queensland Institute of Medical Research. He joined The University of Queensland in 2012. His research interests are in developing novel methods and software tools to better understand the genetic architecture of complex diseases and traits using high-throughput genetic and genomic data. In 2012, he won the Centenary Institute Lawrence Creative Prize, which is awarded annually to only one young medical researcher in Australia. He was awarded an NHMRC RD Wright Career Development Fellowship in the same year, and was part of a team shortlisted for the Eureka Prize in Scientific Research. In 2013, he received a UQ Foundation Research Excellence award and was one of two recipients of the Sylvia and Charles Viertel Charitable Foundation’s Senior Medical Research Fellowship.

Date: Tuesday 8 July 2014

Presentation title: Mixed linear model analyses of human complex traits using SNP data

Abstract:
Most traits and common diseases in humans, such as height, cognitive ability, psychiatric disorders and obesity, are influenced by many genes and their interplay with environmental factors. These diseases/traits are called “complex” traits to differentiate them from “Mendelian” traits that are caused by single genes. Understanding the genetic architecture of human complex traits (e.g. how much of the difference between people’s susceptibilities to diseases is accounted for by differences in their DNA sequence, how many genes are involved in the etiology of diseases, where the genes are located and how large the effects of the genes are on disease risk) is essential to diagnosis, discovery of new drug targets and prevention. To date, thousands of gene loci, as represented by single nucleotide polymorphisms (SNPs), have been identified as associated with hundreds of human complex traits by the genome-wide association study (GWAS) technique. In this lecture, I will introduce the use of mixed linear models in the analysis of GWAS data: to estimate the proportion of variance for a trait that can be explained by all SNPs (the SNP heritability), to quantify the extent to which two traits (or diseases) share a common genetic basis (genetic correlation) using all SNPs, and to control for population structure in genome-wide association analyses of individual SNPs.
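As a brief orientation to the terms used in this abstract, one standard way of writing a mixed linear model for SNP heritability is sketched below. The notation is an assumption made for illustration and is not taken from the lecture: y is the vector of phenotypes, X and beta are the fixed-effect covariates and their coefficients, g is the vector of total genetic effects captured by all SNPs, A is a genomic relationship matrix estimated from the SNP genotypes, and I is the identity matrix.

```latex
\begin{align}
  \mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{g} + \boldsymbol{\varepsilon},
  \qquad
  \mathbf{g} \sim N\!\left(\mathbf{0},\, \mathbf{A}\sigma_g^2\right),
  \qquad
  \boldsymbol{\varepsilon} \sim N\!\left(\mathbf{0},\, \mathbf{I}\sigma_e^2\right) \\
  \operatorname{var}(\mathbf{y}) &= \mathbf{A}\sigma_g^2 + \mathbf{I}\sigma_e^2 \\
  h^2_{\mathrm{SNP}} &= \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}
\end{align}
```

Under this formulation the variance components are typically estimated by restricted maximum likelihood, and the SNP heritability is the share of phenotypic variance attributed to the genetic component.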

Dr Joseph Powell
Queensland Brain Institute
The University of Queensland

Biography:
Dr Joseph Powell is a team leader in the Centre for Neurogenetics and Statistical Genomics based at the Queensland Brain Institute. He received his PhD from the University of Edinburgh in 2010, followed by two years working as a post-doctoral researcher at QIMR Berghofer.

Joseph has worked on a range of research projects involving methods, theory and application around the nexus of quantitative, statistical and population genetics. This has provided a good foundation for his more recent work investigating the genetic architecture regulating gene expression and its role within a systems genetics framework.

Date: Tuesday 8 July 2014

Presentation title: Detection and replication of epistasis influencing transcription in humans

Abstract:
Epistasis is the phenomenon whereby one polymorphism’s effect on a trait depends on other polymorphisms present in the genome. The extent to which epistasis influences complex traits and contributes to their variation is a fundamental question in evolution and human genetics. Although epistasis is often demonstrated in artificial gene manipulation studies in model organisms, and some examples have been reported in other species, few examples exist for epistasis among natural polymorphisms in human traits. Its absence from empirical findings may simply be due to low incidence in the genetic control of complex traits, but an alternative view is that it has previously been too technically challenging to detect owing to statistical and computational issues. Here we show, using advanced computation and a gene expression study design, that many instances of epistasis are found between common single nucleotide polymorphisms (SNPs). In a cohort of 846 individuals with 7,339 gene expression levels measured in peripheral blood, we found 501 significant pairwise interactions between common SNPs influencing the expression of 238 genes (P < 2.91 × 10^-16). Replication of these interactions in two independent data sets showed both concordance of direction of epistatic effects (P = 5.56 × 10^-31) and enrichment of interaction P values, with 30 being significant at a conservative threshold of P < 9.98 × 10^-5. Forty-four of the genetic interactions are located within 5 megabases of regions of known physical chromosome interactions (P = 1.8 × 10^-10). Epistatic networks of three SNPs or more influence the expression levels of 129 genes, whereby one cis-acting SNP is modulated by several trans-acting SNPs. For example, MBNL1 is influenced by an additive effect at rs13069559, which itself is masked by trans-SNPs on 14 different chromosomes, with nearly identical genotype–phenotype maps for each cis–trans interaction. This study presents the first evidence, to our knowledge, for many instances of segregating common polymorphisms interacting to influence human traits.

Professor Graham Taylor
Department of Pathology
University of Melbourne

Biography:
Professor Graham Taylor is the Herman Professor of Genomic Medicine, Department of Pathology, University of Melbourne, and Director of the Australian Node of the Human Variome Project.

In 2006 he led the UK Department of Health funded project “New genetic diagnostic technologies for consanguineous families at risk of recessive genetic disease” and became Head of Genomic Services for Cancer Research UK, chairing the advisory committee for genome-wide association (GWA) studies and leading a review of CR-UK bioinformatics demand and capacity and an evaluation of Next Generation Sequencing (NGS) technology.

In 2009 he joined the Leeds Teaching Hospitals and Leeds University as Professorial Head of the Genomics Translation Unit. The Unit was instrumental in establishing the Leeds Genetics Service as the leading provider of genetic diagnosis using NGS within the NHS. His team developed the Grouped Read Typing method for diagnostic amplicon sequencing in fixed tissue, copy number variation analysis by NGS and streamlined conventional genetic testing by NGS. In 2012 he joined the University of Melbourne.

Date: Wednesday 9 July 2014

Presentation title: The future of DNA sequencing technology

Abstract:
I will review the recent history of “post-Sanger” sequencing technology, and then make some wild and unjustified extrapolations into the future based on too few data points. I will review some of the technologies on the horizon and ask how we can appraise them.

For example, if we define sequence read quality as a composite of read length and base-calling accuracy, recent trends have overwhelmingly been in the direction of quantity at the expense of quality. As a consequence, a great deal of informatics effort has been expended in managing rather poor quality data. Of course the human genome, along with many other genomes, is not particularly amenable to analysis, containing entities such as pseudogenes, non-coding regions (sometimes referred to as “junk”, sometimes claimed to be functionally important) and short repeats. So how does the collision of a relatively refractory analyte like the human genome and an imperfect sequencing method result in a “genomics revolution”? What have we gained and what are the current limitations that need to be addressed in future technologies?

I will look at two examples of the impact of current and pending sequencing technology: tumour analysis in fixed and fresh tissue and the identification of allele expansions.

Dr Denis Bauer
Computational Informatics (CCI)
CSIRO

Biography:
Dr Bauer is interested in high-performance computer systems for integrating large data-volumes to inform strategic interventions for human health. She has a PhD in Bioinformatics and post-docs in machine-learning and genetics, has published in Nature Genetics and Genome Research, was an invited speaker at Bio-IT World Asia 2013, and has attracted more than AU$360,000 in funding (NSW Cancer Institute, CSIRO).

Date: Wednesday 9 July 2014

Presentation title: Population-scale high-throughput sequencing data analysis

Abstract:
Unprecedented computational capabilities and high-throughput data collection methods promise a new era of personalised, evidence-based healthcare, utilising individual genomic profiles to tailor health management as demonstrated by recent successes in rare genetic disorders or stratified cancer treatments. However, processing genomic information at a scale relevant for the health-system remains challenging due to high demands on data reproducibility and data provenance. Furthermore, the necessary computational capacity requires a large investment in computer hardware and IT personnel, which is a barrier to entry for small laboratories and difficult to maintain at peak times for larger institutes. This hampers the creation of time-reliable production informatics environments for clinical genomics. Commercial cloud computing frameworks like Amazon Web Services (AWS) provide an economical alternative to in-house compute clusters as they allow outsourcing of computation to third-party providers, while retaining the software and compute flexibility.

To cater for this resource-hungry, fast-paced yet sensitive environment of personalised medicine, we developed NGSANE, a Linux-based, HPC-enabled framework that minimises overhead for set up and processing of new projects yet maintains full flexibility of custom scripting and data provenance when processing raw sequencing data either on a local cluster or Amazon’s Elastic Compute Cloud (EC2).


BIOGRAPHY  AND  ABSTRACT  


A/Professor Marcel Dinger
Head of Clinical Genomics & Genomic Informatics
Garvan Institute of Medical Research

Biography: A/Prof. Marcel Dinger is the Head of Clinical Genomics and Genome Informatics at the Garvan Institute of Medical Research. Prior to his position at the Garvan Institute, A/Prof. Dinger led Cancer Genomics and Transcriptomics at the University of Queensland Diamantina Institute. Marcel received his PhD from the University of Waikato in 2003. While undertaking his PhD, Marcel founded an informatics company that produced a series of highly successful products and services. In 2005, he resumed his academic career with a prestigious New Zealand Foundation for Research Science and Technology Postdoctoral Fellowship to join Professor Mattick’s group at the Institute for Molecular Bioscience at The University of Queensland to study the role of long noncoding RNAs in mammalian development and disease. In 2009, he was awarded an NHMRC Career Development Award and a Queensland Government Smart Futures Fellowship.

Date: Wednesday 9 July 2014

Presentation title: Translating exome and whole genome sequencing to the clinic

Abstract: Since the sequencing of the draft human genome in 2001, the number of diseases with a known genetic basis has increased more than 50-fold to over 3000. Despite this remarkable success, more than 2000 Mendelian disorders remain unsolved, and up to 70% of patients presenting at the clinic with genetic disorders remain undiagnosed. Clinical-grade genome sequencing holds the dual promise of improving diagnostic rates and empowering genetic research through the discovery of novel disease-associated variants.

The long-term research value of performing whole exome and genome sequencing in a diagnostic setting on thousands of individuals will offset its initially higher cost and complexity compared with a targeted gene-panel approach.

In late 2012, we established the Kinghorn Centre for Clinical Genomics (KCCG) with the aim of implementing genomic medicine in Sydney. At the heart of the KCCG are two Illumina HiSeq 2500 sequencers that are used for rapid-turnover exome sequencing and, more recently, one of the world’s first HiSeq X Ten sequencing suites, capable of sequencing more than 300 whole human genomes per week. Since we intend to provide NATA-certified, clinical-grade sequencing, much of our work over the past 12 months has been focused on the development of standardised procedures, from test procurement in the clinic through to wet-lab processes, bioinformatics and clinical reporting. The bioinformatics workflow includes phenotype capture, read alignment, mutation calling, variant annotation and filtering by inheritance pattern, rarity, predicted functional impact and known disease association.

To date, we have sequenced exomes from more than 100 patients with a range of conditions, largely reflecting the undiagnosed caseload at the Sydney Children’s Hospital. We will present some early success stories from sequencing these exomes and reflect on the possibilities presented by low-cost whole genome sequencing in the diagnosis of inherited disease.

Marcel E. Dinger¹, Mark J. Cowley¹, Kevin Ying¹, Jiang Tao¹, Liviu Constantinescu¹, Derrick Lin¹, Paula Morris¹, Kerith-Rae Dias¹, Warren Kaplan¹, Lisa Ewans², Tony Roscioli²

1. Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
2. Sydney Children’s Hospital and the School of Women’s and Children’s Health, UNSW, Randwick, NSW, Australia

     


Professor John Quackenbush
Dana-Farber Cancer Institute & Harvard School of Public Health
USA

Biography: John Quackenbush is a Professor of Computational Biology and Bioinformatics in the Department of Biostatistics, Harvard School of Public Health and at the Dana-Farber Cancer Institute. He received his PhD in theoretical physics from UCLA in 1990, working on string theory models. Following two years as a postdoctoral fellow in physics, Dr Quackenbush applied for and received a Special Emphasis Research Career Award from the National Center for Human Genome Research to work on the Human Genome Project. He spent two years at the Salk Institute and two years at Stanford University working at the interface of genomics and computational biology. In 1997 he joined the faculty of The Institute for Genomic Research (TIGR), where his focus began to shift to understanding what was encoded within the human genome. Since joining the faculties of the Dana-Farber Cancer Institute and the Harvard School of Public Health in 2005, his work has focused on the use of genomic data to reconstruct the networks of genes that drive the development of diseases such as cancer and emphysema.

Date: Thursday 10 July 2014

Presentation title: Taming the Big Data Dragon

Abstract: Nearly every major scientific revolution in history has been driven by one thing: data. Today, the availability of Big Data from a wide variety of sources is transforming health and biomedical research into an information science, where discovery is driven by our ability to effectively collect, manage, analyse, and interpret data.

New technologies are providing abundance levels of thousands of proteins, population levels of thousands of microbial species, expression measures for tens of thousands of genes, information on patterns of genetic variation at millions of locations across the genome, and quantitative imaging data, all on the same biological sample. These omic data can be linked to vast quantities of clinical metadata, allowing us to search for complex patterns that correlate with meaningful health and medical endpoints. Environmental sampling and satellite data can be cross-referenced with health claims information and Internet searches to provide insights into the impact of atmospheric pollution on human health. Anonymised data from cell-phone records and text messages can be tied to health outcomes data, helping us explore disease transmission networks. Realising the full potential of Big Data will require that we develop new analytical methods to address a number of fundamental issues and that we develop new ways of integrating, comparing, and synthesising information to leverage the volume, variety, and velocity of Big Data. Using concrete examples from our work, I will highlight the challenges and opportunities that present themselves in today’s data-rich environment.


Professor Falk Schreiber
Monash University; and University Halle-Wittenberg, Germany

Biography: Falk Schreiber was awarded a PhD and a habilitation in Computer Science from the University of Passau (Germany). In 2001-2002 he worked as a Research Fellow and Lecturer at the University of Sydney. Since 2003, he has been head of a bioinformatics research group at the Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben, Germany. In 2007 he was appointed professor of Bioinformatics at the Martin Luther University Halle-Wittenberg (Germany) and additionally Bioinformatics coordinator at the IPK Gatersleben. He is currently taking up a position as professor in the Faculty of IT at Monash University.

Dr Schreiber has been researching topics in bioinformatics and computational systems biology for more than 15 years. His main interests are visual computing and visual analytics of biological data, analysis of the structure and dynamics of biological networks, integrative analysis of omics data, graphical standards for systems biology, and modelling and analysis of metabolism.

Date: Thursday 10 July 2014

Presentation title: From Big Data to smart knowledge - integrating multimodal biological data and modelling metabolism

Abstract: Modern data acquisition methods in the life sciences allow the procurement of different types of data in increasing quantity, facilitating a comprehensive view of biological systems. As data are usually gathered and interpreted by separate domain scientists, it is hard to grasp multi-domain properties and structures. Consequently there is a need for the integration, analysis, modelling, simulation, and visualisation of life science data from different sources and of different types.

This talk focuses on two aspects. Firstly, methods for the integration and visualisation of multimodal biological data are presented. These are based on two graphs representing the meta-relations between biological data and the measurement combinations, respectively. Both graphs are linked and serve as different views of the integrated data with navigation and exploration possibilities. Data can be combined and visualised in many ways, resulting in views of the integrated biological data. Secondly, methods to reconstruct, simulate, and analyse detailed metabolic models are presented. We will focus on stoichiometric models and see how different types of data are used to gain new insights into metabolic processes, illustrated with an example of metabolism in plants.


Professor Seok-Hee Hong
ARC Future Fellow
School of Information Technologies
University of Sydney

Biography: Prof. Hong is a Professor and a Future Fellow at the School of IT, University of Sydney. She was a Humboldt Fellow in 2013-2014, and project leader of the VALACON (Visualisation and Analysis of Large and Complex Networks) project at NICTA (National ICT Australia) in 2004-2007. Her research interests include graph drawing, algorithms, information visualisation and visual analytics.

In 2006, she won the CORE (Computing Research and Education Association of Australasia) Chris Wallace Award for Outstanding Research Contribution in the field of Computer Science, for her research "Theory and Practice of Graph Drawing". The award was given for notable breakthroughs and a contribution of particular significance.

Prof. Hong has held research funding of $4.5M, from her three fellowships (Future Fellowship, ARC Research Fellowship and Humboldt Fellowship), three ARC Discovery Projects and two ARC Linkage Projects, including her latest project on "Algorithmics for Visual Analytics of Massive Complex Networks". She has more than 140 publications including 10 edited books, 7 book chapters, 40 journal papers, and 90 conference papers, and she has given 10 invited talks at international conferences as well as 50 invited seminars worldwide. In particular, she has developed the open source visual analytics software GEOMI with her research team members.

Prof. Hong serves as a Steering Committee member of GD (International Symposium on Graph Drawing), IEEE PacificVis (International Symposium on Pacific Visualisation) and ISAAC (International Symposium on Algorithms and Computation) and as an editor of JGAA (Journal of Graph Algorithms and Applications). She has served as a Program Committee Chair of AWOCA 2004, APVIS 2005/2007, GD 2007, ISAAC 2008 and IEEE PacificVis 2013, and as a Program Committee Member of 50 international conferences. In particular, she has formed the Information Visualisation research community in the Asia-Pacific Region by founding the IEEE PacificVis Symposium.

Date: Thursday 10 July 2014

Presentation title: Visual analytics of Big Data

Abstract: Recent technological advances have led to the production of Big Data, and consequently to many massive complex network models in many domains including science and engineering. Examples include biological networks such as phylogenetic networks, gene regulatory networks, metabolic pathways, biochemical networks and protein-protein interaction networks. Other examples are social networks such as Facebook, Twitter, LinkedIn, telephone calls, patents, citations and collaborations.

Visualisation is an effective analysis tool for such networks. Good visualisation reveals the hidden structure of the networks and amplifies human understanding, thus leading to new insights, new findings and predictions. However, constructing good visualisations of Big Data can be challenging.

In this talk, I will present a framework for visual analytics of Big Data. Visual Analytics is the science of analytical reasoning facilitated by interactive visual interfaces. Our framework is based on the tight integration of network analysis methods with visualisation methods to address the scalability and complexity issues.

I will present a number of case studies using various networks derived from Big Data, in particular social networks and biological networks.


Dr Andrew Treloar
Director of Technology
Australian National Data Service (ANDS)

Biography: Dr Andrew Treloar is the Director of Technology for the Australian National Data Service (ANDS) (http://ands.org.au/), with particular responsibility for international engagement. In 2008 he led the project to establish ANDS. He is currently co-chair of the Research Data Alliance (http://rd-alliance.org/) Technical Advisory Board and Visiting Fellow at the Data Archive and Network Services organisation in the Netherlands (http://dans.knaw.nl/). His research interests include data management and scholarly communication. He never seems to be able to make enough time for practising his cello, or reading, but does try to prioritise talking to his chickens and working in his vegetable garden and orchard. Further details at http://andrew.treloar.net/ or follow him on Twitter as @atreloar.

Date: Thursday 10 July 2014

Presentation title: The life-sciences as a pathfinder in data-intensive research practice

Abstract: The advent of the Internet is bringing about fundamental changes in the ways that research is performed and communicated. These changes have been particularly driven by the growing importance of data, as well as the tools available to work with this data. This presentation will examine this shift, drawing on examples from the life-sciences, and try to make some predictions about the next five years.


Dr Alec Zwart
CSIRO, Canberra

Biography: Dr Zwart holds three degrees from the University of Waikato in New Zealand: BSc Honours in Computing and Mathematical Sciences, PhD in Industrial Magnetohydrodynamics and Master of Science in Statistics.

After completing his PhD in 1998, he joined New Zealand's National Institute for Water and Atmospheric Research as a mathematical modeller. He then completed his Master of Science in Statistics in 2002 and worked as a tutor, lecturer and part-time statistical consultant.

Alec joined CSIRO in Canberra in 2006 as a biometrician. He has particular interests in agricultural and horticultural statistics, particularly the robust design of agricultural/horticultural experiments and field trials and the analysis of datasets arising from such experiments.

Date: Thursday 10 July 2014

Presentation title: Statistical experiment design principles for biological studies

Abstract:

“To consult the statistician after an experiment is finished is often merely to ask them to conduct a post mortem examination. They can perhaps say what the experiment died of.”
- Sir Ronald Aylmer Fisher, the father of modern statistics.

Statistical experimental design, accompanied by the appropriate statistical analyses, plays a crucial role in producing valid and precise inferences, and avoiding ‘design disasters’ in empirical science. I will quickly refresh some of the basic elements of experimental design in general, and discuss some key issues and examples that arise in the areas of genetics/genomics and high-throughput data.


Professor David Evans
The University of Queensland Diamantina Institute

Biography: David Evans is Professor of Statistical Genetics and Head of Genomic Medicine at the University of Queensland Diamantina Institute. He obtained his PhD at the University of Queensland in 2003, before undertaking a four-year postdoctoral fellowship in statistical genetics at the Wellcome Trust Centre for Human Genetics, University of Oxford. In 2007 he moved to take up a Senior Lecturer then Reader position at the University of Bristol, where he has led the genome-wide association studies work in the Avon Longitudinal Study of Parents and Children (ALSPAC). His research interests include the genetic study of several complex traits and diseases, including ankylosing spondylitis, osteoporosis, atopic dermatitis and three-dimensional face shape, via genome-wide association and next generation sequencing approaches. His other main research interest is in the development of statistical methodologies in genetic epidemiology, including approaches for gene mapping, individual risk prediction, causal modelling including Mendelian randomisation, and dissecting the genetic architecture of complex traits. On weekends he likes to surf and is enjoying the temperature difference between Queensland waters and the northern coast of Devon.

Date: Thursday 10 July 2014

Presentation title: Genome-wide association studies

Abstract: Genome-wide association studies have been spectacularly successful over the last few years in terms of identifying common genetic variants associated with complex traits and diseases. David will explain how simple statistical tests can be used to map genetic loci associated with complex traits. This will include a discussion of genotype imputation, meta-analysis, approaches to detect and correct for population stratification, as well as some guidelines on how the results from genome-wide association studies should be interpreted and replicated.
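As an illustration of the kind of simple statistical test the abstract refers to (this is not taken from the talk itself), the following is a minimal Python sketch of an allelic chi-square association test for a single biallelic variant. The function name and the genotype counts are hypothetical, and the sketch ignores imputation, covariates and population stratification.

```python
import math


def allelic_chi_square(case_genotypes, control_genotypes):
    """Allelic 2x2 chi-square test (1 degree of freedom) for one variant.

    Each argument is a tuple (n_AA, n_Aa, n_aa) of genotype counts.
    Returns (chi-square statistic, p-value). Hypothetical helper for
    illustration only.
    """
    def allele_counts(genotypes):
        n_AA, n_Aa, n_aa = genotypes
        # Each individual carries two alleles.
        return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa

    a, b = allele_counts(case_genotypes)      # A and a alleles in cases
    c, d = allele_counts(control_genotypes)   # A and a alleles in controls
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        # A row or column is empty, so the test is uninformative.
        return 0.0, 1.0

    chi2 = n * (a * d - b * c) ** 2 / margins
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts: the A allele is more common in cases than controls.
    chi2, p = allelic_chi_square((30, 10, 0), (10, 10, 20))
    print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

In a genome-wide setting a test like this would be repeated for every variant, so the resulting p-values need a stringent multiple-testing threshold before any locus is reported.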


Dr Marie-Laure Martin-Magniette
Mathématiques et Informatique Appliquées
Institut National de la Recherche Agronomique
France

Biography: Marie-Laure Martin-Magniette is a director of research at the French National Institute for Agronomical Research (INRA) in the Unit of Applied Mathematics and Computer Sciences (Statistics & Genome team) and in the Plant Genomics Research Unit (Bioinformatics for predictive genomics team). In 2001 she received her PhD from Université Paris-Sud, France, for the development of new survival models that take into account measurement error in covariates and allow the estimation of a flexible hazard function. She did a one-year postdoctoral fellowship in epidemiology at INRA and at Nantes Hospital and was recruited as a junior researcher at INRA in the Plant Breeding Department in 2003.

Since 2003, Marie-Laure has been strongly involved in the analysis of genomic data and works at the interface between statistics and molecular biology. For 11 years she has been in charge of the statistical analyses of the data produced by the transcriptomic platform of the Plant Genomics Research Unit. Over this time she has acquired strong expertise in data normalisation and differential analysis for microarray and high-throughput sequencing technologies. She has also investigated the analysis of ChIP-chip data to detect enriched regions and differentially methylated regions.

Since 2005 she has focused on the discovery and characterisation of underlying structures in genomic data with mixture models and Hidden Markov Models. She conceived these models in close collaboration with fellow biologists and statisticians. Since September 2013, she has led the Bioinformatics for predictive genomics team of the Plant Genomics Research Unit. Her team project is highly interdisciplinary and deals with the construction of genomic networks of the plant model Arabidopsis thaliana for the discovery of functional modules and the prediction of functions of orphan genes involved in stress responses.

Date: Thursday 10 July 2014

Presentation title: Mixture models for analysing transcriptome and ChIP-chip data

Abstract: Mixture models are useful for identifying underlying structures. In such models, the density of the observations is modelled by a weighted sum of parametric densities (e.g. each component is a Gaussian distribution), and each component represents a subpopulation composed of observations sharing common characteristics. The first part of my talk will be dedicated to a presentation of mixture models. I will explain the concept and the outputs of a mixture-based analysis through easy examples. In the second part of my talk, I will show how mixture models can be applied to analyse transcriptomic data (co-expression analysis of Arabidopsis thaliana genes) and ChIP-chip data (detection of enriched regions and of differentially methylated regions).
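For readers unfamiliar with the notation, the weighted-sum density described in the abstract can be written as below. The symbols (K components, weights \pi_k, component density \phi with parameters \theta_k) are the conventional ones and are an assumption here, not notation taken from the talk.

```latex
\[
  f(x) \;=\; \sum_{k=1}^{K} \pi_k \, \phi(x;\theta_k),
  \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .
\]
% Gaussian case: \theta_k = (\mu_k, \sigma_k^2).
% Posterior probability that observation x_i belongs to component k:
\[
  \tau_{ik} \;=\; \frac{\pi_k \, \phi(x_i;\theta_k)}
                       {\sum_{l=1}^{K} \pi_l \, \phi(x_i;\theta_l)} .
\]
```

The posterior probabilities \tau_{ik} are typically the main output of a mixture-based analysis: assigning each observation to the component with the largest \tau_{ik} gives the subpopulations mentioned in the abstract.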


Dr Kim-Anh Lê Cao
The University of Queensland Diamantina Institute

Biography: Dr Kim-Anh Lê Cao was awarded her PhD in 2008 from Université de Toulouse, France. She was awarded the 2009 "Marie-Jeanne Laurent-Duhamel" prize of the Société Française de Statistique (French Statistical Society) for her PhD thesis.

She started her postdoc in late 2008 in the ARC Centre of Excellence in Bioinformatics with Prof. Geoff McLachlan and then worked as a research-only academic in QFAB Bioinformatics. She is now based in the University of Queensland Diamantina Institute.

Since the beginning of her PhD Kim-Anh has initiated a wide range of valuable collaborative and research opportunities in both statistics and molecular biology. Her research interests are multidisciplinary: they focus on the characterisation of molecular biological systems using mathematical statistics, and she is interested in developing sound statistical frameworks for addressing new biological questions arising from these frontier molecular technologies. Her main research focus is on variable selection for biological data (‘omics’ data) coming from different functional levels by means of dimension reduction approaches.

Date: Thursday 10 July 2014

Presentation title: Multivariate models for dimension reduction and biomarker selection in omics data

Abstract: Recent advances in high-throughput ‘omics’ technologies enable quantitative measurements of expression or abundance of biological molecules of a whole biological system. The transcriptome, proteome and metabolome are dynamic entities, with the presence, abundance and function of each transcript, protein and metabolite being critically dependent on its temporal and spatial location.

Whilst single omics analyses are commonly performed to detect between-group differences from either static or dynamic experiments, the integration or combination of multi-layer information is required to fully unravel the complexities of a biological system. Data integration relies on the currently accepted biological assumption that the functional levels are all related to one another. Therefore, considering all the biological entities (transcripts, proteins, metabolites) as part of a whole biological system is crucial to unravel the complexity of living organisms.

With many contributors and collaborators, we have further developed several multivariate approaches to project high-dimensional data into a smaller space and select relevant biological features, while capturing the largest sources of variation in the data. These approaches are based on variants of partial least squares regression and canonical correlation analysis and enable the integration of several types of omics data.

In this presentation, I will illustrate how various techniques enable exploration, biomarker selection and visualisation for different types of analytical frameworks.


Dr Rob Lanfear
Senior Lecturer
Department of Biological Sciences
Macquarie University

Biography: Rob is a senior lecturer at Macquarie University in Sydney, where he works on molecular evolution and phylogenetics with the aim of understanding the causes and consequences of molecular evolution. His work bridges spatial and temporal scales: from developing methods to identify and understand mutations that occur within a single individual over a few decades, to analysing the long-term evolution of globally distributed clades of species over millions of years. He also investigates theoretical aspects of molecular evolution, and has developed new statistical methods and software to help infer phylogenies from huge DNA datasets. Rob has an undergraduate degree in Ecology (Durham), a Masters in Artificial Intelligence (Sussex), and a PhD in Developmental Biology (Sussex). He switched to studying molecular evolution and phylogenetics full-time during his postdoctoral work at the Australian National University.

Date: Friday 11 July 2014

Presentation title: An introduction to phylogenetic inference

Abstract: Phylogenies are fantastically important in biology. In addition to telling us the relationships among organisms, they can be used to date evolutionary divergences, delineate species, track disease outbreaks, understand molecular evolution, and inform conservation decisions. This talk will give a quick overview of some of these applications, and then delve deeper into the methods that can be used to infer phylogenies from molecular sequence data. The talk will explain and compare parsimony methods, distance methods, maximum likelihood, and Bayesian approaches to phylogenetic inference. It will finish by introducing some of the most recent methodological advances for inferring phylogenies from phylogenomic datasets: gigantic datasets that can include thousands of genes from thousands of species.




Distinguished Professor David Penny
Institute of Fundamental Sciences
Massey University
New Zealand

Biography: David Penny has been involved with reconstructing evolutionary trees from DNA and protein sequences for over 30 years, and is now extending this to predicted tertiary structures. As a biologist, he has worked with mathematicians (particularly Professor Mike Hendy) in order to allow quantitative evaluation of the results, and to measure the rate of convergence as sequences get longer. His interests include any likely deviations from the model of evolution that is assumed, and what effect, if any, this is likely to have on the tree that is produced.

David holds undergraduate degrees in Botany (BSc) and Chemistry (BSc Honours) from Canterbury University College (Christchurch NZ), and a PhD in Biology from Yale University. Following postdoctoral research at McMaster University (Hamilton, Ontario, Canada) he returned to New Zealand (Massey University), where he is now Distinguished Professor of Theoretical Biology. In 2000 he was awarded the Marsden Medal of the NZ Association of Scientists in recognition of his outstanding service to science. He is a Fellow of the Royal Society of New Zealand, and in 2004 was awarded the Rutherford Medal in recognition of his contributions in theoretical biology, molecular evolution, and the analysis of DNA. In 2006 he was made a Companion of the New Zealand Order of Merit for services to science. He is a former president of the NZ Association of Scientists.

Date: Friday 11 July 2014

Presentation title: Loss of information at deeper divergences, and what we can do about it

Abstract: It has been shown by Mossel and Steel (2004) that simple Markov models lose information at the deepest divergences (say, greater than 400 million years ago), and that the fall-off is exponential at deeper times. However, that does not mean that there is no information left; for example, the three-dimensional structure of proteins should still retain information about deeper divergences, although we may not yet know how to use that information. Biologists still want to estimate the deeper divergences, and thus finding additional sources of information is a significant question. Several suggestions are offered that require a more formal analysis. Firstly, we probably expect that where there is a real Gamma distribution of rates, information may be retained for longer. Secondly, if there is really a bimodal distribution of rates, then identifying and eliminating the faster-evolving sites should help. Thirdly, the inference of ancestral sequences at deeper divergences appears quite robust, and there is some evidence that this may help recover deeper divergences. Fourthly, it is increasingly possible to infer three-dimensional structures, and these should retain information longer. Fifthly, there may be differences between the loop regions of Akaryote and Eukaryote proteins, and only taking the regions crossing the central 3D region might help. Sixthly, an approach of weighting, not of characters, but of the partitions they are consistent with, might help. Seventhly, gene order information might be helpful. Several examples of such approaches will be presented, and a challenge issued to theoreticians to solve some of these fundamental issues. There is still a lot to learn about protein evolution.




Professor Lindell Bromham
Centre for Macroevolution & Macroecology,
Evolution, Ecology & Genetics
Research School of Biology
Australian National University

Biography: I am an evolutionary biologist, and I am interested in ways of testing ideas about macroevolutionary patterns and mechanisms, particularly the way that phylogenies constructed from DNA sequence data can be used to understand the evolutionary past and evolutionary processes. I have used comparative analyses to investigate processes of evolutionary change spanning timescales from current patterns of biodiversity to ancient evolutionary patterns. But in order to use molecular data to understand evolution, we need to understand how evolutionary information is recorded in the genome, so I also study the way that patterns and rates of molecular evolution are influenced by species characteristics, environment, and macroevolutionary processes.

Date: Friday 11 July 2014

Presentation title: From mutation to macroevolution

Abstract: Molecular phylogenetics allows us to use the patterns of changes in the genomes of different species to reconstruct evolutionary history. This has revolutionised studies of macroevolution, which focus on the patterns and processes of variation in biodiversity over time, space or lineages. But molecular phylogenies are not just a useful tool in macroevolution; they are also a way of thinking about the connection between change at the genomic level and evolution at the level of global biodiversity. I will use a number of examples to explore how molecular phylogenetic analysis has the potential to overcome the hierarchical distinction between macroevolution and microevolution by allowing us to consider genome-level, population-level and lineage-level patterns in a single analysis.




Professor Mike Wilkinson
Head, School of Agriculture, Food and Wine
The University of Adelaide

Biography: Professor Mike Wilkinson is Head of the School of Agriculture, Food and Wine at the University of Adelaide and Director of the Waite Research Institute. The UK-born research scientist, who joined the University in September 2011, is best known for his work on quantifying the risks associated with GM crops, and has published extensively in this area.

He has a PhD from the University of Leicester in hybridisation and evolutionary processes in wild grasses. Prior to immigrating to Adelaide in 2011, Professor Wilkinson established the world’s first Master of Science focused on training regulators of GM crops, a project funded by the Bill and Melinda Gates Foundation.

A specialist in plant genetics, Professor Wilkinson has previously worked at the Scottish Crop Research Institute in crop research and cytogenetics, was Director of the Institute of Biological Sciences at Aberystwyth University and was also a Trustee of the National Botanic Gardens in Wales.

Professor Wilkinson has over 20 years of research experience in plant and animal genetics and has published several significant works in the area of plant epigenetics. Most recently, his studies into epigenetics have produced several papers in high-impact international journals, including Nature Communications (on the epigenetics of the human parasite schistosomiasis), Analytical Chemistry (two works on the chemistry of DNA methylation) and Journal of Experimental Botany (on heritable epigenetic effects). He also holds three patents in the field and has secured several million dollars of external funding in support of epigenetics research. He is well acquainted with all the methods to be used in the project and co-developed some of them. Over the course of his career, he has supervised more than 30 PhD students to completion (all within 4 years).

Date: Friday 11 July 2014

Presentation title: The application of high throughput DNA barcoding for landscape ecology and management

Abstract: One of the chief justifications for the development of DNA barcoding for species identification rested in its potential for the rapid identification of cryptic species or of representatives from taxonomically problematic groups without the need for detailed anatomical characterisation or reference to a small number of specialists for the group. This need is most keenly felt in poorly studied regions of high biodiversity, in cases where morphological identification is rendered impossible because of incomplete or degraded specimens, or for mixed samples containing multiple species. In this presentation, I will provide a series of case studies to illustrate the value of next-generation sequencing in enhancing the potential of DNA barcoding for the purposes of species discovery, the risk assessment of GM crops, diet reconstruction and the study of ancient DNA.


Sponsored by: