
Towards Incidental Collaboratories For Experimental Data

May 10, 2015


Anita de Waard

Talk at 3D virtual cell meeting, San Diego, CA, December 13-14 2012
Transcript
Page 1: Towards Incidental Collaboratories For Experimental Data

Towards Incidental Collaboratories For Experimental Data

Anita de Waard, VP Research Data Collaborations

Elsevier RDS, Jericho, VT, USA

Thanks:
Maryann Martone, Anita Bandrowski, NIF, UCSD
Nathan Urban, Shreejoy Tripathy, CMU
Ed Hovy, Gully Burns, ISI/CMU
Phil Bourne, UCSD

Page 2: Towards Incidental Collaboratories For Experimental Data

Problem: a rose is not a rose:

• “…there was significant variability of the injected venom composition from specimen to specimen, in spite of their common biogeographic origin.”

Jose A. Rivera-Ortiz, Herminsul Cano, Frank Marí, Intraspecies variability of the injected venom of Conus ermineus, doi:10.1016/j.peptides.2010.11.014

• “…Strains DV-3/84 DV-7/84 (group 3) showed 76.6% similarity to each other and were similar to all other strains at the 67.6% level.”

Zofia Dzierżewicz et al., Intraspecies variability of Desulfovibrio desulfuricans strains determined by the genetic profiles, FEMS Microbiology Letters, Volume 219, Issue 1, 14 February 2003, Pages 69–74, doi:10.1016/S0378-1097(02)01199-0

=> A specimen is not a species!

Page 3: Towards Incidental Collaboratories For Experimental Data

Problem: gene expression varies with:

Age: “SIRT1-Associated genes are deregulated in the aged brain”

Philipp Oberdoerffer et al., SIRT1 Redistribution on Chromatin Promotes Genomic Stability but Alters Gene Expression during Aging, Cell, Volume 135, Issue 5, 28 November 2008, Pages 907–918, doi:10.1016/j.cell.2008.10.025

Smell: “…major urinary proteins […] mediate the pregnancy blocking effects of male urine”

P.A. Brennan et al., Patterns of expression of the immediate-early gene egr-1 in the accessory olfactory bulb of female mice exposed to pheromonal constituents of male urine, Neuroscience, Volume 90, Issue 4, June 1999, Pages 1463–1470, doi:10.1016/S0306-4522(98)00556-9

Hunger: “Out of the ~30K genes, about 10K are differentially expressed in liver cells when an animal is in different states of satiety.”

Zhang F, Xu X, Zhou B, He Z, Zhai Q (2011) Gene Expression Profile Change and Associated Physiological and Pathological Effects in Mouse Liver Induced by Fasting and Refeeding. PLoS ONE 6(11): e27553. doi:10.1371/journal.pone.002755

Light: “Longer-term enrichment training also altered the mRNA levels of many genes associated with structural changes that occur during neuronal growth.”

Cailotto C., et al. (2009) Effects of Nocturnal Light on (Clock) Gene Expression in Peripheral Organs: A Role for the Autonomic Innervation of the Liver. PLoS ONE 4(5): e5650. doi:10.1371/journal.pone.0005650

=> Knowing genes is not knowing how they are expressed!

Page 4: Towards Incidental Collaboratories For Experimental Data

Problem: no man (or mouse) is an island…

• “We found the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals.”

The Human Microbiome Project Consortium, Structure, function and diversity of the healthy human microbiome, Nature 486, 207–214 (14 June 2012) doi:10.1038/nature11234

• “Colonization of an infant’s gastrointestinal tract begins at birth. The acquisition and normal development of the neonatal microflora is vital for the healthy maturation of the immune system.”

Mackie RI, Sghir A, Gaskins HR., Developmental microbial ecology of the neonatal gastrointestinal tract. Am J Clin Nutr. 1999 May;69(5):1035S-1045S

=> An animal is an ecosystem!

Page 5: Towards Incidental Collaboratories For Experimental Data

Interactions create more complexity:

• Computing cancer: “No amount of information about what happens inside a single cell can ever tell you what a tissue is going to do,” [Glazier] said. “Much of the information and complexity of tissues and life is embedded in the way cells talk to each other and the extracellular environment.”

• Megadata: “These complex emergent systems are impossible to understand,” “[we] founded Applied Proteomics to create a protein diagnostic that reveals not just where a cancer is, but how it interacts with the body.”

Nature Special Issue Vol. 491 No. 7425, ‘Physical Scientists Take On Cancer’

=> The whole is more than the sum of its parts!

Page 6: Towards Incidental Collaboratories For Experimental Data

Big problems in biology:

• Intraspecies variability > A specimen is not a species!
• Gene expression variability > Knowing genes is not knowing how they are expressed!
• Microbiome > An animal is an ecosystem!
• Systems biology > Whole is more than the sum of its parts!
• Models vs. experiment > Are we talking about the same things? In a way we can all use?
• Dynamics > Life is not in equilibrium!

Life is complicated!

Reductionism doesn’t work for living systems.

Image: http://en.wikipedia.org/wiki/File:Duck_of_Vaucanson.jpg

Page 7: Towards Incidental Collaboratories For Experimental Data

Statistics to the rescue!

With enough observations, trends and anomalies can be detected:

• “Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far.”

The Human Microbiome Project Consortium, Structure, function and diversity of the healthy human microbiome, Nature 486, 207–214 (14 June 2012) doi:10.1038/nature11234

• “The large sample size – 4,298 North Americans of European descent and 2,217 African Americans – has enabled the researchers to mine down into the human genome.”

Nidhi Subbaraman, Nature News, 28 November 2012, High-resolution sequencing study emphasizes importance of rare variants in disease.

Page 8: Towards Incidental Collaboratories For Experimental Data

Enable ‘incidental collaboratories’:

• Collect: store data at the level of the experiment:
  – Accessible through a single interface
  – Add enough metadata to know what was done/seen
• Connect: allow analyses over:
  – Similar experiment types
  – Experiments done with/on similar biological ‘things’ (species, strains, systems, cells etc.)
  – In a way that can be used by modelers!
• Keep:
  – Long-term preservation of data and software
  – Fulfill Data Management Plan requirements
  – Allow ‘gated’ access when and to whom researcher wants
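The collect / connect / gated-access idea on this slide could be sketched roughly as follows. This is a minimal illustration, not a description of any existing system: the `ExperimentRecord` class, its field names, the `connect` helper and the lab names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentRecord:
    """One experiment stored with enough metadata to know what was done/seen.

    All field names are illustrative assumptions, not a real schema.
    """
    experiment_id: str
    experiment_type: str  # e.g. "patch-clamp"
    species: str
    owner: str
    observations: dict = field(default_factory=dict)
    allowed_readers: set = field(default_factory=set)  # 'gated' access list

    def readable_by(self, user: str) -> bool:
        # The owner always has access; others only if explicitly admitted.
        return user == self.owner or user in self.allowed_readers


def connect(records, user, **criteria):
    """Return the records visible to `user` whose metadata match all criteria."""
    return [
        r for r in records
        if r.readable_by(user)
        and all(getattr(r, key) == value for key, value in criteria.items())
    ]


records = [
    ExperimentRecord("e1", "patch-clamp", "mouse", "lab_a", {"n_cells": 12}, {"lab_b"}),
    ExperimentRecord("e2", "patch-clamp", "rat", "lab_b", {"n_cells": 8}),
    ExperimentRecord("e3", "imaging", "mouse", "lab_a", {"n_slices": 4}),
]

# lab_b asks for every patch-clamp experiment it is allowed to see.
shared = connect(records, "lab_b", experiment_type="patch-clamp")
print([r.experiment_id for r in shared])
```

Keeping the access list on the record itself means the researcher, not the repository, decides when and to whom each experiment is opened.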

Page 9: Towards Incidental Collaboratories For Experimental Data

Problem: biological research is quite insular

• Biology is small: size 10^-5 – 10^2 m, scientist can work alone (‘King’ and ‘subjects’).
• Biology is messy: it doesn’t happen behind a terminal.
• Biology is competitive: many people with similar skill sets, vying for the same grants
• In summary: the structure of biological research does not inherently promote collaboration (vs., for instance, big physics or astronomy).

Diagram labels: Prepare, Observe, Analyze, Ponder, Communicate

Page 10: Towards Incidental Collaboratories For Experimental Data

Let’s look at a typical lab:

• How to get the right antibody IDs
• And messy bits
• From the lab notebook
• Into the PI’s command center?

Page 11: Towards Incidental Collaboratories For Experimental Data

Objections and rebuttals re. data sharing

Objection: “But our lab notebooks are all on paper”
Rebuttal: Develop smart phone/tablet apps for data input

Objection: “I need to see a direct benefit from something I spend my time on”
Rebuttal: Develop ‘data manipulation dashboard’ for PI to allow better access to full experimental output for his/her lab

Objection: “I want things to be peer reviewed before I expose them”
Rebuttal: Allow reviewers access to experimental database before publication (of data or paper)

Objection: “I don’t really trust anyone else’s data – well, except for the guys I went to Grad School with…”
Rebuttal: Add a social networking component to this data repository so you know who (to the individual) created that data point.

Objection: “I am afraid other people might scoop my discoveries”
Rebuttal: => Reward system moves from a competition to a ‘shared mission’

Page 12: Towards Incidental Collaboratories For Experimental Data

Data sharing enables collaboratories:

Diagram labels: Prepare, Analyze, Communicate (repeated three times); Think; Observations (repeated three times)

• Labs go from being information islands to being ‘sensors in a network’
• ‘Conglomeration of evidence’ can happen
• Provides a place to share negative data and to reproduce experiments.

Page 13: Towards Incidental Collaboratories For Experimental Data

So we can do joint experiments:

Diagram labels: Prepare, Analyze, Communicate (repeated twice); Observations (repeated three times)

Across labs, experiments: track reagents and how they are used

Page 14: Towards Incidental Collaboratories For Experimental Data

So we can do joint experiments:

Diagram labels: Prepare, Analyze, Communicate (repeated twice); Observations (repeated three times)

Compare outcome of interactions with these entities

Page 15: Towards Incidental Collaboratories For Experimental Data

So we can do joint experiments:

Diagram labels: Prepare, Analyze, Communicate (repeated twice); Observations (repeated three times)

Build a ‘virtual reagent spectrogram’ by comparing how different entities interacted in different experiments
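One possible reading of the ‘virtual reagent spectrogram’ is a per-reagent profile of outcomes, pooled over every experiment in which that reagent met a given entity. The sketch below assumes that reading. The observation tuples, the outcome score and the `reagent_spectrogram` function are hypothetical, and averaging is only one of many ways the outcomes could be combined.

```python
from collections import defaultdict

# Hypothetical observations pooled from several labs:
# (lab, reagent, entity the reagent was applied to, outcome score in arbitrary units)
observations = [
    ("lab_a", "antibody_1", "cell_type_x", 0.9),
    ("lab_b", "antibody_1", "cell_type_x", 0.7),
    ("lab_b", "antibody_1", "cell_type_y", 0.1),
    ("lab_c", "antibody_2", "cell_type_x", 0.2),
]


def reagent_spectrogram(rows):
    """Map each reagent to {entity: mean outcome across all experiments}."""
    pooled = defaultdict(lambda: defaultdict(list))
    for _lab, reagent, entity, outcome in rows:
        pooled[reagent][entity].append(outcome)
    return {
        reagent: {entity: sum(values) / len(values) for entity, values in by_entity.items()}
        for reagent, by_entity in pooled.items()
    }


spectrogram = reagent_spectrogram(observations)
for reagent, profile in spectrogram.items():
    print(reagent, profile)
```

Each reagent's profile can then be compared across entities, which no single lab's notebook would show.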

Page 16: Towards Incidental Collaboratories For Experimental Data

A single environment to perform, store, share and report on experiments:

1. Store metadata on all materials
2. Track the methods while doing them
3. Write papers that ‘wrap around’ this
4. Don’t ‘send’ your papers – just expose them to the outside world
5. Invite reviews; open data to trusted parties, at trusted time
6. Allow apps/tools to integrate

Diagram labels: metadata (repeated five times); Review, Edit, Revise; Calculate, coordinate…; Compile, comment, compare…

Example paper text: “Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-”

Page 17: Towards Incidental Collaboratories For Experimental Data

Elsevier Research Data Services:

1. Help increase the amount of data shared from the lab, enabling incidental collaboratories

2. Help increase the value of the data shared by increasing annotation, normalization, provenance, enabling enhanced interoperability

3. Help measure and deliver credit for shared data to the researchers, the institute, and the funding body, enabling more sustainable platforms

Page 18: Towards Incidental Collaboratories For Experimental Data

Plans with CMU/Neuroelectro.org:

• Do a pilot in Q3 2013, using:
  – 7” tablets for data input
  – Can we link to barcodes for AB-s, scan on tablet (so we can include the batch’s provenance)?
  – Links to local software to connect to runs
  – Dashboard for the PI to keep track/play with experiments
  – Gated exports to
    • Neuroelectro.org
    • NIF
  – Address NSF Data Management Plan requirements?
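As a rough illustration of the barcode and gated-export items in these plans, the sketch below attaches batch provenance to a notebook entry at scan time and exports only what the researcher has released. Everything here is assumed for the example: the catalogue contents, the entry fields and both function names are hypothetical, and no real Neuroelectro.org or NIF interface is implied.

```python
import json

# Hypothetical catalogue mapping scanned antibody barcodes to batch provenance.
ANTIBODY_BATCHES = {
    "0001": {"antibody": "anti-example", "vendor": "ExampleVendor", "lot": "LOT-A"},
}


def log_entry(barcode, note, release_to=()):
    """Create a notebook entry that carries the scanned batch's provenance."""
    batch = ANTIBODY_BATCHES.get(barcode)
    if batch is None:
        raise KeyError(f"unknown barcode: {barcode}")
    return {
        "barcode": barcode,
        "batch": batch,
        "note": note,
        "release_to": list(release_to),  # destinations the researcher has approved
    }


def gated_export(entries, destination):
    """Serialize only the entries the researcher released to `destination`."""
    released = [e for e in entries if destination in e["release_to"]]
    return json.dumps(released, indent=2)


entries = [
    log_entry("0001", "staining run 1", release_to=["neuroelectro.org"]),
    log_entry("0001", "staining run 2"),  # not released anywhere yet
]
print(gated_export(entries, "neuroelectro.org"))
```

Recording provenance at scan time means the batch details travel with the data, rather than being reconstructed from a paper notebook later.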

Page 19: Towards Incidental Collaboratories For Experimental Data

In summary:

• Life is complicated!
• We need to connect experiments
• To do so, overcome technical barriers and social barriers (more difficult)
• Maybe 3D VC can help define a common mission?

   
