Integrating layers of omics data models and compute spaces needed to build a “Convivial Knowledge Expert” Use of Bionetworks to Build Maps of Diseases Moving beyond the linear Stephen Friend MD PhD Sage Bionetworks (Non-Profit Organization) Seattle/ Beijing/ Amsterdam SciLife September 21st, 2011

Stephen Friend SciLife 2011-09-20

Nov 28, 2014


Health & Medicine

Sage Base

Stephen Friend, Sept 20, 2011. SciLife, Stockholm, Sweden
Transcript
Page 1: Stephen Friend SciLife 2011-09-20

Integrating layers of omics data models and compute spaces needed to build a “Convivial Knowledge Expert”

Use of Bionetworks to Build Maps of Diseases Moving beyond the linear

Stephen Friend MD PhD

Sage Bionetworks (Non-Profit Organization) Seattle/ Beijing/ Amsterdam

SciLife September 21st, 2011

Page 2: Stephen Friend SciLife 2011-09-20

why consider the fourth paradigm- data intensive science

thinking beyond the narrative, beyond pathways

advantages of an open innovation compute space

it is more about how than what

Page 3: Stephen Friend SciLife 2011-09-20

Alzheimer’s, Diabetes, Cancer, Obesity

Treating Symptoms vs. Modifying Diseases

Will it work for me? Biomarkers?

Page 4: Stephen Friend SciLife 2011-09-20

Familiar but Incomplete

Page 5: Stephen Friend SciLife 2011-09-20

Reality: Overlapping Pathways

Page 6: Stephen Friend SciLife 2011-09-20
Page 7: Stephen Friend SciLife 2011-09-20
Page 8: Stephen Friend SciLife 2011-09-20

WHY NOT USE “DATA INTENSIVE” SCIENCE

TO BUILD BETTER DISEASE MAPS?

Page 9: Stephen Friend SciLife 2011-09-20

Equipment capable of generating massive amounts of data

“Data Intensive Science”- “Fourth Scientific Paradigm” For building: “Better Maps of Human Disease”

Open Information System

IT Interoperability

Evolving Models hosted in a Compute Space- Knowledge Expert

Page 10: Stephen Friend SciLife 2011-09-20

It is now possible to carry out comprehensive monitoring of many traits at the population level

Monitor disease and molecular traits in populations

Putative causal gene

Disease trait

Page 11: Stephen Friend SciLife 2011-09-20

what will it take to understand disease?

DNA RNA PROTEIN (dark matter)

MOVING BEYOND ALTERED COMPONENT LISTS

Page 12: Stephen Friend SciLife 2011-09-20

2002 Can one build a “causal” model?

Page 13: Stephen Friend SciLife 2011-09-20

trait

How is genomic data used to understand biology?

“Standard” GWAS Approaches Profiling Approaches

“Integrated” Genetics Approaches

Genome scale profiling provides correlates of disease: many examples, BUT what is cause and effect?

Identifies causative DNA variation but provides NO mechanism

•  Provide unbiased view of molecular physiology as it relates to disease phenotypes

•  Insights on mechanism

•  Provide causal relationships and allow predictions

RNA amplification, microarray hybridization

Gene Index

Tumors

Page 14: Stephen Friend SciLife 2011-09-20

Integration of Genotypic, Gene Expression & Trait Data

Causal Inference

Schadt et al. Nature Genetics 37: 710 (2005) Millstein et al. BMC Genetics 10: 23 (2009)

Chen et al. Nature 452:429 (2008) Zhang & Horvath. Stat.Appl.Genet.Mol.Biol. 4: article 17 (2005)

Zhu et al. Cytogenet Genome Res. 105:363 (2004) Zhu et al. PLoS Comput. Biol. 3: e69 (2007)

“Global Coherent Datasets” •  population based

•  100s-1000s individuals

Page 15: Stephen Friend SciLife 2011-09-20

Constructing Co-expression Networks

Start with expression measures for the most variant genes across 100s+ samples

Note: NOT a gene expression heatmap

Correlation Matrix (axes of the source heatmap: Brain sample, expression)

      1     2     3     4
1     1    -0.1  -0.6  -0.8
2    -0.1   1     0.1   0.2
3    -0.6   0.1   1     0.8
4    -0.8   0.2   0.8   1

Connection Matrix

      1  2  3  4
1     1  0  1  1
2     0  1  0  0
3     1  0  1  1
4     1  0  1  1

Clustered Connection Matrix (gene order as labeled on the slide: 1 2 4 3)

1  0  0  0
0  1  1  1
0  1  1  1
0  1  1  1

Network diagram nodes: 4, 1, 3, 2

Establish a 2D correlation matrix for all gene pairs

Define threshold, e.g. > 0.6, for an edge

Clustered Connection Matrix

Hierarchically cluster sets of genes for which many pairs interact (relative to the total number of pairs in that set)

Network Module

Identify modules
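The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the talk: it takes the four-gene correlation matrix from the slide, marks an edge wherever the absolute correlation reaches the threshold (the slide says "> 0.6", but its own connection matrix treats 0.6 as connected, so `>=` on the absolute value is assumed here), and then groups genes by connected components as a simple stand-in for hierarchical clustering. The function names `connection_matrix` and `modules` are hypothetical.

```python
# Correlation matrix for four genes, copied from the slide.
corr = [
    [1.0, -0.1, -0.6, -0.8],
    [-0.1, 1.0, 0.1, 0.2],
    [-0.6, 0.1, 1.0, 0.8],
    [-0.8, 0.2, 0.8, 1.0],
]


def connection_matrix(corr, threshold=0.6):
    """Mark an edge wherever |correlation| reaches the threshold (assumed >=)."""
    return [[1 if abs(r) >= threshold else 0 for r in row] for row in corr]


def modules(conn):
    """Group genes into connected components (a stand-in for clustering)."""
    n = len(conn)
    seen = set()
    result = []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for other in range(n):
                if conn[node][other] and other not in seen:
                    seen.add(other)
                    stack.append(other)
        # Report genes with 1-based labels, as on the slide.
        result.append(sorted(g + 1 for g in component))
    return result


conn = connection_matrix(corr)
gene_modules = modules(conn)
print(conn)
print(gene_modules)
```

On this toy input, genes 1, 3 and 4 end up in one module and gene 2 stands alone. With hundreds of samples and thousands of genes, a real analysis would use a proper hierarchical clustering routine rather than connected components.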

Page 16: Stephen Friend SciLife 2011-09-20

BY, RM

Yeast segregants

Synthetic complete medium

Logarithmic growth

Gene expression

Yeast segregants

genotypes

Public databases

Protein-protein interactions

Transcription factor binding sites

Bayesian network

Protein-metabolite interactions

Data integration via Bayesian Network

Courtesy of Dr. Jun Zhu

Page 17: Stephen Friend SciLife 2011-09-20

Preliminary Probabilistic Models- Rosetta /Schadt

Gene symbol | Gene name | Variance of OFPM explained by gene expression* | Mouse model | Source
Zfp90 | Zinc finger protein 90 | 68% | tg | Constructed using BAC transgenics
Gas7 | Growth arrest specific 7 | 68% | tg | Constructed using BAC transgenics
Gpx3 | Glutathione peroxidase 3 | 61% | tg | Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12]
Lactb | Lactamase beta | 52% | tg | Constructed using BAC transgenics
Me1 | Malic enzyme 1 | 52% | ko | Naturally occurring KO
Gyk | Glycerol kinase | 46% | ko | Provided by Dr. Katrina Dipple (UCLA) [13]
Lpl | Lipoprotein lipase | 46% | ko | Provided by Dr. Ira Goldberg (Columbia University, NY) [11]
C3ar1 | Complement component 3a receptor 1 | 46% | ko | Purchased from Deltagen, CA
Tgfbr2 | Transforming growth factor beta receptor 2 | 39% | ko | Purchased from Deltagen, CA

Networks facilitate direct identification of genes that are causal for disease

Evolutionarily tolerated weak spots

Nat Genet (2005) 205:370

Page 18: Stephen Friend SciLife 2011-09-20

  50 network papers   http://sagebase.org/research/resources.php

List of Influential Papers in Network Modeling

Page 19: Stephen Friend SciLife 2011-09-20

(Eric Schadt)

Page 20: Stephen Friend SciLife 2011-09-20

Recognition that the benefits of bionetwork based molecular models of diseases are powerful but that they require significant resources

Appreciation that it will require decades of evolving representations as real complexity emerges and needs to be integrated with therapeutic interventions

Page 21: Stephen Friend SciLife 2011-09-20

Sage Mission

Sage Bionetworks is a non-profit organization with a vision to create a “commons” where integrative bionetworks are evolved by

contributor scientists with a shared vision to accelerate the elimination of human disease

Sagebase.org

Data Repository

Discovery Platform

Building Disease Maps

Commons Pilots

Page 22: Stephen Friend SciLife 2011-09-20

Sage Bionetworks Collaborators

•  Pharma Partners: Merck, Pfizer, Takeda, AstraZeneca, Amgen, Johnson & Johnson

•  Foundations: Kauffman, CHDI, Gates Foundation

•  Government: NIH, LSDF

•  Academic: Levy (Framingham), Rosengren (Lund), Krauss (CHORI)

•  Federation: Ideker, Califano, Butte, Schadt

Page 23: Stephen Friend SciLife 2011-09-20

RULES GOVERN

Engaging Communities of Interest

NEW MAPS: Disease Map and Tool Users (Scientists, Industry, Foundations, Regulators...)

PLATFORM: Sage Platform and Infrastructure Builders (Academic, Biotech and Industry IT Partners...)

RULES AND GOVERNANCE: Data Sharing Barrier Breakers (Patient Advocates, Governance and Policy Makers, Funders...)

NEW TOOLS: Data Tool and Disease Map Generators (Global coherent data sets, Cytoscape, Clinical Trialists, Industrial Trialists, CROs…)

PILOTS = PROJECTS FOR COMMONS: Data Sharing Commons Pilots (Federation, CCSB, Inspire2Live....)

Page 24: Stephen Friend SciLife 2011-09-20


Example 1: Breast Cancer

Zhang B et al., manuscript

Bayesian Network

Survival Analysis

Coexpression Networks Module combination

Partition BN

Page 25: Stephen Friend SciLife 2011-09-20

4 Public Breast Cancer Datasets

NKI (295 samples): van de Vijver et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002 Dec 19;347(25):1999-2009.

Wang (286 samples): Wang Y et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet. 2005 Feb 19-25;365(9460):671-9.

Miller (159 samples): Pawitan Y et al. Gene expression profiling spares early breast cancer patients from adjuvant therapy: derived and validated in two population-based cohorts. Breast Cancer Res. 2005;7(6):R953-64.

Christos (189 samples): Sotiriou C et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006 Feb 15;98(4):262-72.

Generation of Co-expression & Bayesian Networks from published Breast Cancer Studies

Page 26: Stephen Friend SciLife 2011-09-20

Recovery of EGFR and Her2 oncoprotein downstream pathways by super modules

Page 27: Stephen Friend SciLife 2011-09-20

Comparison of Super-modules with EGFR and Her2 signaling and resistance pathways

Page 28: Stephen Friend SciLife 2011-09-20

Key Driver Analysis

•  Identify key regulators for a list of genes h and a network N
•  Check the enrichment of h in the downstream of each node in N
•  The nodes significantly enriched for h are the candidate drivers
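The key driver procedure described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by the authors: the network N is a plain dictionary of directed edges, "downstream" means every node reachable from a given node, enrichment is scored with a one-sided hypergeometric test, and no multiple-testing correction is applied. The names `downstream`, `hypergeom_pvalue`, `key_drivers` and the toy network are all hypothetical.

```python
from math import comb


def downstream(network, node):
    """Return all nodes reachable from `node` by following directed edges."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for child in network.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    seen.discard(node)  # a node is not its own downstream, even in a cycle
    return seen


def hypergeom_pvalue(total, hits_total, drawn, hits_drawn):
    """P(X >= hits_drawn) when drawing `drawn` nodes without replacement."""
    upper = min(drawn, hits_total)
    tail = sum(
        comb(hits_total, k) * comb(total - hits_total, drawn - k)
        for k in range(hits_drawn, upper + 1)
    )
    return tail / comb(total, drawn)


def key_drivers(network, gene_list, alpha=0.05):
    """Return nodes whose downstream set is enriched for the gene list h."""
    nodes = set(network) | {c for children in network.values() for c in children}
    h = set(gene_list) & nodes
    drivers = {}
    for node in sorted(nodes):
        down = downstream(network, node)
        if not down:
            continue
        p = hypergeom_pvalue(len(nodes), len(h), len(down), len(down & h))
        if p < alpha:
            drivers[node] = p
    return drivers


# Toy directed network N (12 nodes) and gene list h; both are made up.
toy_network = {
    "A": ["B", "C", "D"],
    "B": ["E"],
    "F": ["G"],
    "G": ["H"],
    "I": ["J"],
    "K": ["L"],
}
toy_h = ["B", "C", "D", "E"]
drivers = key_drivers(toy_network, toy_h)
print(drivers)
```

In the toy network only node A qualifies, because all four of its downstream nodes are in h. On a real network with thousands of nodes, the p-values would need correction for the number of nodes tested.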

Page 29: Stephen Friend SciLife 2011-09-20


A) Cell Cycle (blue)

C) Pre-mRNA Processing (brown)

B) Chromatin modification (black)

D) mRNA Processing (red)

Global driver

Global driver & RNAi validation

Page 30: Stephen Friend SciLife 2011-09-20

Signaling between Super Modules

(View Poster presented by Bin Zhang)

Page 31: Stephen Friend SciLife 2011-09-20

Example: The Sage Non-Responder Project in Cancer

Sage Bionetworks • Non-Responder Project

Purpose:

•  To identify Non-Responders to approved drug regimens so we can improve outcomes, spare patients unnecessary toxicities from treatments that have no benefit to them, and reduce healthcare costs

Leadership:

•  Co-Chairs Stephen Friend, Todd Golub, Charles Sawyers, Gary Nolan & Rich Schilsky

Initial Studies:

•  AML (at first relapse)- Jerry Radich
•  Non-Small Cell Lung Cancer- Roy Herbst
•  Ovarian Cancer (at first relapse)- Beth Karlan
•  Breast Cancer- Dan Hayes
•  Renal Cell- Rob Motzer

Page 32: Stephen Friend SciLife 2011-09-20

Global expression data from 64 human islet donors

Blue module: 3000 genes, associated with
•  Type 2 diabetes
•  Elevated HbA1c
•  Reduced insulin secretion

340 genes in islet-specific open chromatin regions

168 overlapping genes, which have
•  Higher connectivity
•  Markedly stronger association with Type 2 diabetes, elevated HbA1c and reduced insulin secretion
•  Enrichment for beta-cell transcription factors and exocytotic proteins

New Type II Diabetes Disease Models Anders Rosengren

Page 33: Stephen Friend SciLife 2011-09-20

•  Search across 1300 datasets in MetaGEO at Sage for similar expression profiles Top hit: Islet dedifferentiation study where the 168 genes were upregulated in mature islets and downregulated in dedifferentiated islets (Kutlu et al., Phys Gen 2009)

•  Analyses of expression-SNPs and clinical SNPs as well as Causal Inference Test

•  Identification of candidate key genes affecting beta-cell differentiation and chromatin

Working hypothesis:

Normal beta-cell: open chromatin in islet-specific regions, high expression of beta-cell transcription factors, differentiated beta-cells and normal insulin secretion

Diabetic beta-cell: lower expression of beta-cell transcription factors affecting the identified module, dedifferentiation, reduced insulin secretion and hyperglycemia

Next steps: Validation of hypothesis and suggested key genes in human islets

Anders Rosengren

New Type II Diabetes Disease Models

Page 34: Stephen Friend SciLife 2011-09-20

Probing complex biology

•  The more we learn about it, the more complicated it becomes!

•  Cancers' genetic fingerprints are highly diverse; most mutations were unique to individuals

•  How to piece everything together?

Page 35: Stephen Friend SciLife 2011-09-20

observations to models

diseases

perturbations

perturbations

Page 36: Stephen Friend SciLife 2011-09-20

A framework for data integration

probabilistic graphical models

Microarray data

Proteomic data

Genomics

Genetics

Medline Biocarta/Biopathway Biologists

Database

GUI Hypothesis, test

High  throughput  data  

knowledge  

Metabolomic data

Page 37: Stephen Friend SciLife 2011-09-20

BY, RM

Yeast segregants

Synthetic complete medium

Logarithmic growth

Gene expression

metabolites

Yeast segregants

genotypes

Public databases

Protein-protein interactions

Transcription factor binding sites

Bayesian network

Protein-metabolite interactions

Page 38: Stephen Friend SciLife 2011-09-20

Bayesian network: Incorporating TFBS and PPI data as a scale-free network prior

•  DNA-protein binding data – Knowledge of what proteins regulate transcription of a given gene

Page 39: Stephen Friend SciLife 2011-09-20

PPI: Can we find information that overlaps with gene expression?

3-clique

4-clique

4-clique

3-clique

Clique community (partial clique)

Zhu J et al., Nature Genetics, 2008

Page 40: Stephen Friend SciLife 2011-09-20

Integrating transcription factor (TF) binding data and PPI

•  Introducing scale-free priors for TF and large PPI complex

•  Fixed prior for small PPI complex

Zhu J et al., Nature Genetics, 2008

Page 41: Stephen Friend SciLife 2011-09-20

Integration improves network qualities

BN | KO data | GO terms | TF data
w/o any priors | 125 | 55 | 26
w/ genetics priors | 139 | 59 | 34
w/ genetics, TF and PPI priors | 152 | 66 | 52

Zhu et al., Cytogenetics, 2004; Zhu et al., PLoS Comp Biol., 2007; Zhu J et al., Nature Genetics, 2008

•  The pair is independent

•  The pair is causal/reactive

Page 42: Stephen Friend SciLife 2011-09-20

LEU2, GCN4

ILV6, GCN4

LEU2 KO gives rise to small expression signature

•  LEU2 KO sig enriched (p~10E-18)
•  GCN4 downregulated in LEU2 KO (small signature)

ILV6 gives rise to large expression signature

•  ILV6 KO sig enriched (p~10E-52)
•  GCN4 upregulated in ILV6 KO (large signature)

Prospective validation is the gold standard

Zhu J et al., Nature Genetics, 2008

Page 43: Stephen Friend SciLife 2011-09-20

Lung Cancer Bayesian network

•  Built from 240 lung cancer samples
•  7785 genes
•  10642 links

Page 44: Stephen Friend SciLife 2011-09-20

EMT’s proteomic signatures

epithelial mesenchymal

Cell medium signature Cell extract signature Cell surface signature

Page 45: Stephen Friend SciLife 2011-09-20

Direct overlaps of EMT proteomic signatures

        | extract | surface | media
extract | 267     | 75      | 57
surface | 31%     | 240     | 52
media   | 27%     | 25%     | 208
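The percentages in the lower triangle of this table can be reproduced from the counts in the upper triangle. The short sketch below assumes that each percentage is the shared count divided by the size of the smaller of the two signatures; the slide does not state the denominator, but this choice matches all three reported values. The function name `overlap_percent` is hypothetical.

```python
# Signature sizes (diagonal of the slide's table) and pairwise overlap counts.
sizes = {"extract": 267, "surface": 240, "media": 208}
overlaps = {
    ("extract", "surface"): 75,
    ("extract", "media"): 57,
    ("surface", "media"): 52,
}


def overlap_percent(a, b):
    """Overlap as a percentage of the smaller signature (assumed denominator)."""
    shared = overlaps[(a, b)] if (a, b) in overlaps else overlaps[(b, a)]
    return round(100 * shared / min(sizes[a], sizes[b]))


for a, b in overlaps:
    print(f"{a} vs {b}: {overlaps[(a, b)]} shared, {overlap_percent(a, b)}%")
```

For example, extract and surface share 75 proteins, and 75 / 240 is about 31%, as reported on the slide.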

Page 46: Stephen Friend SciLife 2011-09-20

Analysis of EMT signatures through lung cancer network

Cell medium signature

Cell extract signature

Cell surface signature

De novo constructed lung cancer regulatory network signatures subnetworks

overlaps

Page 47: Stephen Friend SciLife 2011-09-20

Hallmark features of EMT

•  Decrease of E-cadherin and increase of Vimentin (criteria used in defining signatures)
   – CTNNA1 is in all three proteomic signatures
      •  There are 6 nodes connected to CTNNA1 in the lung cancer network, including ARF1
   – VIM is in all three proteomic signatures
      •  There are 7 nodes connected to VIM in the lung cancer network, including NOTCH2, EMP3

Page 48: Stephen Friend SciLife 2011-09-20

Lung Network Conclusions

•  All proteomic signatures are coherently co-regulated at transcription level;

•  GOBP annotations for signatures in cell surface, conditioned media fractions are expected;

•  Lipid metabolism for signatures in the total cell extract is also expected.

•  Subnetworks for EMT proteomic signatures contain all known hallmark features of EMT;

•  These subnetworks can provide better context to understand EMT and to identify key regulators of EMT.

Page 49: Stephen Friend SciLife 2011-09-20

2008   2009   2010   2011  

Can we accelerate the pace of scientific discovery?

Page 50: Stephen Friend SciLife 2011-09-20

How is the Federation different from a “traditional” collaboration?

collaboration 2.0

Page 51: Stephen Friend SciLife 2011-09-20

Rules of the game: transparency & trust

•  Shared data, tools, models and prepublications
•  Conflict of interests
•  Intellectual property
•  Authorship

Page 52: Stephen Friend SciLife 2011-09-20

Watch What I Do, Not What I Say Reduce, Reuse, Recycle

Most of the People You Need to Work with Don’t Work with You

My Other Computer is Amazon

sage bionetworks synapse project

Page 53: Stephen Friend SciLife 2011-09-20

sage federation: scientific pilot projects

•  Type-II diabetes
•  Warburg effect in cancer
•  Human aging

Cellular mortality (aging)

Cellular immortality (cancer)

Page 54: Stephen Friend SciLife 2011-09-20

sage federation: warburg effect project

Page 55: Stephen Friend SciLife 2011-09-20

warburg effect project interconnected discovery

bioinformatic (butte lab): bioinformatic confirmation that metabolic genes implicated in the warburg effect are differentially expressed across a wide range of cancer types

coherent prostate dataset: taylor, sawyers, et al., mskcc

network modeling (sage bionetworks): genes associated with poor prognosis in prostate cancer are disproportionately found amongst networks regulating glycolysis genes

network dynamics (califano lab): generate an aerobic glycolysis signature and 'reverse engineer' master transcription factor regulators of the warburg effect transcriptional program

layer in additional tissue types: prostate, breast, lung, colon, renal, brain, b cell

Page 56: Stephen Friend SciLife 2011-09-20

from sawyers, pcctc presentation

warburg effect project so what? discovery integration to translation, a validation path

lists of nodes and key drivers

from thompson, science, 2009

target identification, clinical validation

peter nelson

pre-clinical target validation

jim olson

Page 57: Stephen Friend SciLife 2011-09-20

sage federation: human aging project

Justin Guinney, Stephen Friend*

Greg Hannum, Janusz Dutkowski, Trey Ideker*, Kang Zhang*

Mariano Alvarez, Celine Lefebvre, Andrea Califano*

Page 58: Stephen Friend SciLife 2011-09-20

sage federation: what is the impact of disease/environment on “biological age” ?

Chronological  Age  

Biological  Age  

2001  

2009  

Page 59: Stephen Friend SciLife 2011-09-20

sage federation: model of biological age

Faster Aging

Slower Aging

Clinical Association
-  Gender
-  BMI
-  Disease

Genotype Association

Gene Pathway Expression

[Figure axes: Predicted Age (liver expression) vs. Chronological Age (years); Age Differential]

Page 60: Stephen Friend SciLife 2011-09-20

human aging: clinical associations with differential aging – gene expression in human liver

[Figure: Bioage Difference (years), from Slower Aging to Faster Aging, by group: Fit / Overweight / Underweight; FALSE / TRUE; Male / Female]

Page 61: Stephen Friend SciLife 2011-09-20

human aging: predicting bioage using whole blood methylation

[Figure: Biological Age vs. Chronological Age. Training Cohort: San Diego (n=170), RMSE=3.35. Validation Cohort: Utah (n=123), RMSE=5.44.]

•  Independent training (n=170) and validation (n=123) Caucasian cohorts
•  450k Illumina methylation array
•  Exome sequencing
•  Clinical phenotypes: Type II diabetes, BMI, gender…
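The RMSE values quoted for the training and validation cohorts measure how far predicted biological age falls from chronological age. The sketch below shows that calculation, along with a per-person age differential. The ages are made-up illustration values, not data from the talk, and the sign convention (predicted minus chronological, so that positive values mean "faster aging") is an assumption. The function names are hypothetical.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between predicted and chronological ages."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared) / len(squared))


def age_differential(predicted, actual):
    """Predicted minus chronological age (assumed: positive = faster aging)."""
    return [p - a for p, a in zip(predicted, actual)]


# Hypothetical ages in years, for illustration only.
chronological = [50, 60, 70, 80]
predicted = [53, 58, 74, 80]

error = rmse(predicted, chronological)
differential = age_differential(predicted, chronological)
print(round(error, 2), differential)
```

A model is fit on the training cohort and then scored on the independent validation cohort; a validation RMSE that is higher than the training RMSE, as on this slide, is the usual pattern.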

Page 62: Stephen Friend SciLife 2011-09-20

human aging: clinical associations with differential bioage

[Figure: Diff Bioage vs. clinical traits.
Train: San Diego. BMI vs Diff Bioage (p=0.0362); Diabetes (N vs. Type II) vs Diff Bioage (p=0.000466).
Validation: Utah. BMI vs Diff Bioage (p=0.00173); Diabetes (N vs. Type II) vs Diff Bioage (p=1.34e-10).]

Page 63: Stephen Friend SciLife 2011-09-20

human aging: clinical associations with combined cohorts

Univariate Analysis

[Figure: Diff Bioage vs. clinical traits, combined cohorts. BMI vs Diff Bioage (p=0.00406); Diabetes (N vs. Type II) vs Diff Bioage (p=9.7e-11); Smoker (FALSE vs. TRUE) vs Diff Bioage (p=0.00764).]

Multiple Regression

Gender | BMI | Diabetes | Smoker
p=.45 | p=.619 | p=4.82e-07 | p=.02

Page 64: Stephen Friend SciLife 2011-09-20

human aging: mechanism of biological aging

stochastic vs mechanistic

deterioration by random “hits” vs. systematic process

Page 65: Stephen Friend SciLife 2011-09-20

human aging: mechanism of aging – higher entropy?

[Figure: Entropy vs. Mean age per window. Primary Cohort (p=6.616e-16); Secondary Cohort (p=3.528e-05).]

Page 66: Stephen Friend SciLife 2011-09-20

•  Created predictive model of biological age
•  Type-II diabetes strongly correlates with accelerated aging
•  Identified genetic variant that may induce methyl-driven acceleration of aging
•  General mechanism of aging may involve loss of signal and increased disorder within the methylome

human aging: research summary

Page 67: Stephen Friend SciLife 2011-09-20

sage federation alternative model of scientific collaboration?

Page 68: Stephen Friend SciLife 2011-09-20

Federated Aging Project: Combining analysis + narrative = Sweave Vignette

Sage Lab

Califano Lab

Ideker Lab

Shared Data Repository

JIRA: Source code repository & wiki

R code + narrative

PDF(plots + text + code snippets)

Data objects

HTML

Submitted Paper

Page 69: Stephen Friend SciLife 2011-09-20

Why not share clinical/genomic data and model building in the ways currently used by the software industry (power of tracking workflows and versioning)?

Page 70: Stephen Friend SciLife 2011-09-20

Synapse as a Github for building models of disease

Page 71: Stephen Friend SciLife 2011-09-20

Evolution of a Software Project

Page 72: Stephen Friend SciLife 2011-09-20

Biology Tools Support Collaboration

Page 73: Stephen Friend SciLife 2011-09-20

Potential Supporting Technologies

Taverna

Addama

tranSMART

Page 74: Stephen Friend SciLife 2011-09-20

Platform for Modeling

SYNAPSE  

Page 75: Stephen Friend SciLife 2011-09-20
Page 76: Stephen Friend SciLife 2011-09-20
Page 77: Stephen Friend SciLife 2011-09-20
Page 78: Stephen Friend SciLife 2011-09-20

INTEROPERABILITY  

INTEROPERABILITY (tranSMART)

Page 79: Stephen Friend SciLife 2011-09-20

 TENURE      FEUDAL  STATES      

Page 80: Stephen Friend SciLife 2011-09-20

Group D LEGAL STACK-ENABLING PATIENTS: John Wilbanks

Page 81: Stephen Friend SciLife 2011-09-20

Arch2POCM

Restructuring Drug Discovery

Page 82: Stephen Friend SciLife 2011-09-20

why consider the fourth paradigm- data intensive science

thinking beyond the narrative, beyond pathways

advantages of an open innovation compute space

it is more about how than what

Page 83: Stephen Friend SciLife 2011-09-20

Moving beyond the linear

linear pathways

linear ways of building models

linear ways of working together

Page 84: Stephen Friend SciLife 2011-09-20

“What Technology Wants” pp 264 by Kevin Kelly

Convivial Manifestations of the Sage Synapse Commons

Page 85: Stephen Friend SciLife 2011-09-20

OPPORTUNITIES FOR THE SCILIFE COMMUNITY

Data sets, Tools and Models

Joining Synapse Communities

Joining Federation Projects

Joining Arch2POCM

Change reward structures for sharing data (patients and academics)