Identifying Behavioral Strategies through Large Scale Phenotyping and Statistical Analysis
Stephen Helms, Ph.D.
April 8, 2014 – EYR Global
FOM Institute AMOLF, Amsterdam, Netherlands
Leon Avery (VCU), Greg Stephens (VU Amsterdam/OIST), Tom Shimizu (AMOLF)
Source: meetings.internet2.edu/media/medialibrary/2014/04/22/20140408-hel… · 22-04-2014

Jul 31, 2020

Transcript
Page 1

Identifying Behavioral Strategies through Large Scale Phenotyping and Statistical Analysis

Stephen Helms, Ph.D.
April 8, 2014 – EYR Global
FOM Institute AMOLF, Amsterdam, Netherlands
Leon Avery (VCU), Greg Stephens (VU Amsterdam/OIST), Tom Shimizu (AMOLF)

Page 2

How Do We Understand Complex Systems With Many Parts?

(Also a general “big data” question!)

• A Model Complex System
• Traditional approaches for understanding complex biological systems
• Statistical approach for understanding biological systems
• Data and computation problems
• Proposed computational platform
• Outlook for the future

Page 3

A Simple Model Nervous System: C. elegans

Stimuli
• Smell (volatile odors)
• Taste (soluble chemicals)
• Feel (touch, heat)

The Worm
• ~1000 total cells
• 302 neurons
• 95 muscles
• ~20000 genes

Response
• Movement
• Neural activity
• Biochemical reactions

Page 4

A Biologist’s Toolbox

Genetics
• Break individual parts, see what happens

Biochemistry
• Look at how parts chemically interact

Cell Biology
• See where the parts are

End result:
• A list of lots of details about what individual genes and proteins are doing
• But no clear view on what the system as a whole does

Page 5

Idea: Finding Simple Models Through Quantitative, Comparative Studies

• Build quantitative models that are just complicated enough to explain the phenotypes we can observe and care about

• Compare models across multiple strains and species to see what phenotypes biology cares about

• The molecular and cellular details can be filled in later using traditional approaches

• Model system: Motile behavior
  – Behavior is the output of all the complicated systems of an organism

Page 6

C. elegans Behavior

• Undulatory motion
• Occasional reversals
• Occasional sharp “omega” turns
• Continuous turning

Gray and Lissmann (1964) J. Exp. Biol. 41:135–54.
Croll (1975) J. Zool. 176:159–176.
Croll (1975) Adv. Parasitol. 13:71–122.
Pierce-Shimomura et al. (1999) J. Neurosci. 19:9557–69.
Iino, Y. & Yoshida, K. (2009) J. Neurosci. 29:5370–80.
Helms (2013) Figshare. http://dx.doi.org/10.6084/m9.figshare.705155

Page 7

Experimental Overview

• Record video of freely moving worms (up to 30 minutes)
• Extract behavioral data
• Develop models

Page 8

Sampling Behavioral Variability: Individual, Intra- and Inter-Species

Up to 20 individuals per strain

Holovachov, O. et al. (2009) Nematology 11(6):927–950.
Chiang, J.-T.A. et al. (2006) J. Exp. Biol. 209(10):1859–73.
Andersen, E.C. et al. (2012) Nat. Genet. 44(3):285–90.

Page 9

Building Quantitative Models

Deterministic dynamics
• Correlation functions
• Phase spaces
• Fitting linear models

Stochastic components
• Distributions

Simulations
• Monte Carlo simulations
• Comparison with statistics of data
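The three steps on this slide can be illustrated with a toy example. The sketch below is not the analysis used in the talk: it assumes a first-order linear model, x[t+1] = a·x[t] + noise, as a stand-in for the deterministic dynamics, Gaussian noise for the stochastic component, and a synthetic series in place of measured worm data. The function names are hypothetical.

```python
import random
import statistics

def autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    mean = statistics.fmean(x)
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(len(x) - lag))
    return cov / var

def fit_ar1(x):
    """Least-squares fit of x[t+1] = a * x[t] + noise; returns (a, noise_sd)."""
    num = sum(x[i] * x[i + 1] for i in range(len(x) - 1))
    den = sum(v * v for v in x[:-1])
    a = num / den
    residuals = [x[i + 1] - a * x[i] for i in range(len(x) - 1)]
    return a, statistics.pstdev(residuals)

def simulate_ar1(a, noise_sd, n, rng):
    """Monte Carlo simulation of the fitted model, starting from x = 0."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(a * x[-1] + rng.gauss(0.0, noise_sd))
    return x

rng = random.Random(0)
# Synthetic stand-in for a measured behavioral time series (assumption).
data = simulate_ar1(0.9, 1.0, 20000, rng)

a_hat, sd_hat = fit_ar1(data)                   # deterministic part + noise level
sim = simulate_ar1(a_hat, sd_hat, 20000, rng)   # Monte Carlo simulation

# Compare a statistic of the simulation with the same statistic of the data.
print("fitted a:", round(a_hat, 3), "noise sd:", round(sd_hat, 3))
print("lag-1 autocorrelation, data:", round(autocorrelation(data, 1), 3))
print("lag-1 autocorrelation, simulation:", round(autocorrelation(sim, 1), 3))
```

If the simulated statistics disagree with the data, the model is too simple and needs another term, which is the "just complicated enough" criterion from the earlier slide.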

Page 10

Comparing Quantitative Models

• Parameter Correlation Matrix
• Patterns (Modes)
• Simulations
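One plausible reading of this slide is that model parameters fitted per strain are correlated across strains, and the dominant patterns (modes) are then extracted from that correlation matrix. The sketch below illustrates the idea under that assumption. The three parameters and the 200 "strains" are synthetic, and power iteration is used here only as a simple way to get the leading mode; the slide does not say which method was used.

```python
import math
import random

def correlation_matrix(rows):
    """Pearson correlation between the columns of rows (one row per strain)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(k)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n * sds[i] * sds[j])
         for j in range(k)]
        for i in range(k)
    ]

def leading_mode(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    k = len(matrix)
    v = [1.0] + [0.0] * (k - 1)
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k))
    return eigenvalue, v

# Synthetic parameters (assumption): the first two share a hidden factor
# with opposite sign, the third is independent of both.
rng = random.Random(1)
rows = []
for _ in range(200):
    latent = rng.gauss(0.0, 1.0)
    rows.append([
        latent + 0.3 * rng.gauss(0.0, 1.0),
        -latent + 0.3 * rng.gauss(0.0, 1.0),
        rng.gauss(0.0, 1.0),
    ])

corr = correlation_matrix(rows)
eigenvalue, mode = leading_mode(corr)
print("correlation matrix:", [[round(c, 2) for c in row] for row in corr])
print("leading mode:", [round(m, 2) for m in mode], "eigenvalue:", round(eigenvalue, 2))
```

The leading mode weights the first two parameters with opposite signs and nearly ignores the third, recovering the hidden trade-off that was built into the synthetic data.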

Page 11

Data Challenges

Storage
• Videos are large
  – 240 GB/h raw
  – 12 GB/h compressed
• Using ~1 TB of storage for a proof of concept project
• Want to scale up:
  – # individuals by 10-fold
  – Sampling rate by 3-fold

Processing
• >3-fold slower than data collection on a desktop computer
• Results in:
  – A backlog of data to analyze
  – A long delay before experiments can be interpreted

Sharing
• Videos are too big to regularly transfer around
• Extracted data is also big
  – 2 GB for the proof of concept project
• Limited ability for others to explore the data themselves

Need to record data on many individuals for a long time at high frequency
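The figures on this slide imply some simple arithmetic, sketched below. Two assumptions are not from the slide: that storage grows linearly with both the number of individuals and the sampling rate, and that processing runs on a single machine alongside recording. Because the slide gives ">3-fold", the backlog figure is a lower bound.

```python
# Figures quoted on the slide.
RAW_GB_PER_HOUR = 240
COMPRESSED_GB_PER_HOUR = 12
PROOF_OF_CONCEPT_TB = 1        # "~1 TB", treated as exactly 1 TB here
INDIVIDUALS_SCALE = 10
SAMPLING_RATE_SCALE = 3
PROCESSING_SLOWDOWN = 3        # ">3-fold", so the backlog is a lower bound

# How much smaller the compressed video is than the raw video.
compression_ratio = RAW_GB_PER_HOUR / COMPRESSED_GB_PER_HOUR

# Assumption: storage scales linearly with both scale-up factors.
scaled_storage_tb = PROOF_OF_CONCEPT_TB * INDIVIDUALS_SCALE * SAMPLING_RATE_SCALE

def backlog_hours(recorded_hours, slowdown=PROCESSING_SLOWDOWN):
    """Processing hours still outstanding when recording stops,
    assuming processing started alongside recording on one machine."""
    return recorded_hours * (slowdown - 1)

print("compression ratio:", compression_ratio)
print("storage after scale-up (TB):", scaled_storage_tb)
print("backlog after 10 h of recording (h):", backlog_hours(10))
```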

Page 12

Proposal: Centrally located data processing and analysis services at SURFsara

SURFsara
• Video storage
• Video processing
• Standard analyses

Experimental Users (AMOLF, VCU, etc.)
• Generate videos
• Visualize data
• Develop analyses

Theory Users (VU, OIST, etc.)
• Visualize data
• Develop analyses

Data flows
• Exchange datasets and analysis results (few GBs, weekly)
• Upload videos, download datasets (hundreds of GBs, daily at peak)
• Download datasets (tens of GBs, weekly)

Processing load
• Loading large (>10 GB) videos
• Processing 10^4–10^6 frames / video
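To put the transfer volumes on this slide into network terms, the sketch below converts a daily volume into a sustained bit rate and a file size into a transfer time. The 300 GB/day and 1 Gbit/s figures are illustrative assumptions (the slide says only "hundreds of GBs" and gives no link speed), decimal units are used (1 GB = 10^9 bytes), and protocol overhead is ignored.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BITS_PER_BYTE = 8

def sustained_mbit_per_s(gb_per_day):
    """Average rate in Mbit/s needed to move gb_per_day spread evenly over a day."""
    megabits = gb_per_day * 1000 * BITS_PER_BYTE   # 1 GB = 1000 MB (decimal)
    return megabits / SECONDS_PER_DAY

def transfer_seconds(size_gb, link_gbit_per_s):
    """Ideal time in seconds to move size_gb over a link, ignoring overhead."""
    return size_gb * BITS_PER_BYTE / link_gbit_per_s

# Illustrative values only; neither number is taken from the slide.
print("300 GB/day needs about", round(sustained_mbit_per_s(300), 1), "Mbit/s sustained")
print("a 10 GB video over 1 Gbit/s takes about", transfer_seconds(10, 1.0), "s")
```

The average rate understates the need if uploads are bursty, since a day's videos are more likely to be sent in a few large batches than spread evenly over 24 hours.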

Page 13

How EYR Is Helping

Storage
• SURFsara will provide up to 20 TB of storage for the video data

Processing
• SURFsara will provide computing resources
  – Cloud or grid
• eScience Center is helping with migrating analysis code to run on HPC infrastructure

Sharing
• Internet2 and SURFnet are connecting the involved institutes with SURFsara using high-speed lightpath connections
  – FOM Institute AMOLF
  – VU
  – Okinawa Institute of Science and Tech
  – Virginia Commonwealth University

Page 14

Growth Prospects

• Open source aspects of C. elegans community
  – WormBook - textbook
  – WormBase - genetics
  – WormAtlas - anatomy
  – etc.

• As an analysis service available to other researchers
  – Motility is widely used as a simple phenotype by C. elegans researchers

• Collaborative development of new analysis methods
  – Other researchers developing statistical analysis approaches for worm behavior

• Integration of neuronal imaging data
  – Ongoing experiments in the systems biology group at AMOLF

R. Doornekamp, FOM Institute AMOLF

Page 15

These Are General Challenges

Advances in imaging sensors
• Increasing temporal and spatial resolution → more data

Advances in experimental techniques
• Increasing experimental throughput → more data, access to statistical approaches

Lack of compression options
• Distortion of data due to compression artifacts is a major concern among experimentalists

Page 16

Acknowledgements

• Enlighten Your Research 4 and Global Teams
  – Nicole Gregoire (SURFnet)
  – Sylvia Kuijpers (SURFnet)
  – Jan Bot (SURFsara)
  – Frank Seinstra (eScience Center)

• eScience Center
  – Rob van Nieuwpoort
  – Elena Ranguelova

• Everyone else involved @ SURFnet, SURFsara, Internet2

• Local ICT members
  – Carl Schulz (AMOLF)