The Elucidation of Regulatory Networks in Complex Biological Systems: The Convergence of Biology, Medicine and Computing G. Poste Stanford University,

Post on 20-Dec-2015


The Elucidation of Regulatory Networks in Complex Biological Systems:

The Convergence of Biology, Medicine and Computing

G. Poste
Stanford University, 15 March 2002

gposte@healthtechnetwork.com

The Analysis and Application of Principles of Biological Design

biology

biology chemistry

genomics computing

1750-1980

1980-2010

• the descriptive narrative

• empirical technology

• mechanistic reductionism

systems biology

• the encoded information content of biological systems

• mapping the basis of biological variation

• rational medicine and customized care

Biology and Medicine as Information-Based Sciences

From Reductionism to Integrated Systems Biology

• individual genes and proteins

• molecular interactions in simple systems

• limited, fragmented datasets

• poor annotation

• limited capacity for predictive simulation

• analog information

• biological circuits, pathways and networks

• assembly of higher order systems

• massive, integrated datasets

• stringent, standardised annotation

• robust algorithms for predictive biology
• biology in silico

• digital information

21st Century Biology and Medicine

“SYSTEMS BIOLOGY”
• the design principles of biological order and complexity
• mapping the information content of biopathways and networks

Biotechnology and Systems Biology

New Analytical Capabilities

Large Scale Computing

“BIG BIOLOGY”
• interdisciplinary, massive datasets, information-based
• infrastructure, investment and education

Convergence: The Technological Platforms Shaping the Evolution of Healthcare

Biotechnology and Systems Biology

New Analytical Capabilities

Large Scale Computing

Rule-Based Design Principles

Computational Biology

Exploring “Biospace”

Automation Engineering and Robotics

Materials Science

Micro-/Opto-Electronics

From Reductionism to Integrated Systems Biology

understanding the information content encoded in biological networks

mapping the design rules for progressively greater complexity of biological order

gene(s)

pathways, circuits and networks

progressively ordered assemblies: organelles, cells, tissues, organs

homeostatic integration of myriad, complex, interactive networks (Physiology)

High Level Abstraction of Biological Pathways and Network Systems

Encoded Information

Pathways and Networks

Rule Sets

Plasticity
• adaptive fitness
• pathological perturbation

• directed evolution
• biology in silico

Predictive Biology

Novel Biospace and Carbon : Silicon Union

Global and Nodal Pathway Map of Genomic and Proteomic Elements in Yeast Galactose Utilization

From: T. Ideker et al. 2001. Science 292, 929

Genetic Networks

bioinformation processing involves leverage of interactive feedback loops in diverse domains: physical, chemical, electrical

genomic and proteomic codes represent a dense network of nested hyperlinks

matter becomes code

Nonlinear Complexity in Biological Systems

distinct classes of nonlinear interactions

long-range (fractal) correlations

self-similarity, self-dissimilarity and organized criticality

pattern formation

complex adaptive networks

highly optimized tolerance = robustness with fragility

barriers to cascading failures

deterministic chaos

emergent properties

Nonlinear Complexity in Biological Systems

abrupt changes
- bifurcations; intermittency/bursting; bistability/multistability; phase transitions

nonlinear oscillations
- limit cycles; phase-resetting; entrainment

nonlinear waves
- spirals; scrolls; solitons

complex periodic cycles and quasiperiodicities

scale invariance
- fractal and multifractal scaling; long-range correlations; self-organized criticality

stochastic resonance and related noise-modulated mechanisms

time irreversibility
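
As a concrete illustration of the "abrupt changes" and "bifurcations" listed on this slide, here is a minimal sketch using the logistic map, a standard textbook toy model of nonlinear dynamics. The map, the parameter values and the function names `logistic_orbit` and `period` are assumptions chosen for illustration; none of them come from the slides.

```python
def logistic_orbit(r, x0=0.2, burn_in=1000, keep=16):
    """Iterate x -> r*x*(1-x), discard a transient, return the next `keep` values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


def period(orbit, tol=1e-6):
    """Smallest p such that the orbit repeats every p steps, or None if none is found."""
    for p in range(1, len(orbit)):
        if all(abs(orbit[i] - orbit[i + p]) < tol for i in range(len(orbit) - p)):
            return p
    return None


if __name__ == "__main__":
    # Small changes in the control parameter r change the long-run behaviour
    # qualitatively: a steady state, then a 2-cycle, then a 4-cycle.
    for r in (2.5, 3.2, 3.5):
        print(r, period(logistic_orbit(r)))
```

The same one-line rule yields a fixed point, then oscillations of increasing period as `r` grows, which is the sense in which a smooth parameter change can produce an abrupt change in system behaviour.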

Information and Technology Platform Overload

Principal Themes in the Analysis of Biological Systems

large scale

miniaturization

automation

parallelism

networked systems

real time, interactive, adaptive

Major Technology Gaps

rapid gene ID in complex genomes

structural genomics and protein structure-function prediction

mapping the proteome
- abundance, modification, localisation and protein-protein interactions
- large scale parallelism (protein-arrays)
- small organic molecule networks

mapping the metabolome
- circuits, modules, networks

robust predictive algorithms for ADMET profiling of drug candidate SAR

The Need for Standards and Stringent Semantics

“... without which ….. wanton and luxuriant fancies climbing up into the Bed of Reason, do not only defile it by unchaste and illegitimate embraces, but instead of real conceptions and notices of things do impregnate the mind with nothing but Ayerie and Subventaneous Phantasmes”

Samuel Parker, FRS 1666

standards

standards

STANDARDS

The Analysis and Comprehension of Biological Systems

descriptive ignorance

complexity

defined rule sets

initial mechanistic insights

• elucidation of patterns
• defining rule sets

• disease heterogeneity
• patient heterogeneity
• disease predisposition

burgeoning, bewildering complexity

• elegant simplicity revealed
• predictive biology

• right Rx : right disease
• right Rx : right patient
• from reactive treatment to proactive prevention

molecular phylogenies and genealogy

population genetics

clinical databanks

chemical SAR

biological order

Integrated Distributed Heterogeneous Databases and Databanks

data warehousing and data mining

human-computer interface systems

evolving hardware and electronic evolution

object-oriented and pattern / spatial array recognition

Expert Systems and Knowledge Management

Convergence, Consilience, Cognition and Computing

• more science
• better science
• faster science
• cross-disciplinary science

• interdisciplinary convergence
• technological convergence
• corporate convergence

MEGADATA

Performance

Volume

• burgeoning data volumes
• more transactions
• increasing diversity of datasets/apps
• expanding user communities

• complexity of distributed environments
• rising performance expectations
• confidentiality and privacy

• pressures on network bandwidth

The Scalability Crisis

Major Challenges for Life Sciences Computing

exponentially growing data repositories (10^2 TB/PB)

highly variable data formats and standards as obstacles to data access and mining

inadequate attention to data Q.C./annotation standards

excessive reliance on customized solutions and fragmented data sources

inadequate access and integration of public and private datasets

primitive data visualization tools

80% time spent on data preparation tasks and 20% on productive exploration

Major Challenges for Life Sciences Computing

infrastructure scale and capital investment

new tools for mining, visualization, simulation

data storage conventions and technologies

dynamic, adaptive, scalable systems

active networks
- software into the network
- subnet interoperability
- integration of distributed and collaborative working environments

fast data access at all levels
- storage, I/O and networks to support analysis and simulation

expanded bandwidth for high usage and high transfer rates

Big Biology

Bracing For the Inevitable : Petabyte-Size Databases

1000 terabytes

250 billion text pages

20 million four drawer filing cabinets

2000 mile high tower of 1 billion diskettes

typical US consumer generates 100 Gbytes personal data/lifetime
- education, insurance, credit, medical

100 million consumers: 10,000 petabytes
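
The closing figure on this slide follows from simple unit arithmetic, which the short sketch below reproduces. It assumes decimal units (1 Gbyte = 10^9 bytes, 1 petabyte = 10^15 bytes); the function name `total_petabytes` is illustrative only.

```python
# Decimal storage units in bytes (an assumption; the slide does not say
# whether decimal or binary units are meant).
GB = 10**9
TB = 10**12
PB = 10**15


def total_petabytes(consumers, gbytes_per_consumer):
    """Total lifetime personal data for a population, expressed in petabytes."""
    return consumers * gbytes_per_consumer * GB / PB


if __name__ == "__main__":
    print(PB // TB)                           # terabytes per petabyte
    print(total_petabytes(100_000_000, 100))  # 100 million consumers at 100 Gbytes each
```

With these units one petabyte is 1000 terabytes, and 100 million consumers at 100 Gbytes each come to 10,000 petabytes, matching the slide.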

Data Grids

from Napster and Gnutella

to

ubiquitous peer-to-peer exchange of data sets

to

apportioned distributed computing for solutions of computationally massive problems

Informatics for Big Biology and e.Health Networks

• instructive precedents in high end computing from other disciplines
- cosmology, quantum chromodynamics, climate research, materials

USA
• Scientific Simulation Initiative
• National Computational Science Alliance
• Long Term Ecological Research
• NASA, DOE, NOAA
• Accelerated Strategic Computing Initiative
• Grid Physics Network

Europe
• UNICORE
• Pangea
• E-Science
• LHC Challenge
• E-Grid

The Bibliome


Modified from: T. Berners-Lee and J. Hendler, Nature 2001, 410, 1023

The Global Virtual Archive / Universal Knowledge Web

Metadata

WWWI

Proof, logic and ontology languages

• shared terms / terminology
• machine-machine communication
• inter-memetic translation
• self-evolving translators

• Resource Description Framework
• eXtensible Markup Language

• Metadata tagging standards for interoperable distributed archives
• self-assembling datasets
• self-describing documents

• HyperText Markup Language
• HyperText Transfer Protocol

• The first generation Web

Standardized Lexical Foundations for the Annotation, Archiving and Analysis of Complex Biological Systems

unique complexity of biological systems

multiple levels of abstraction
- organismal
- ecosystem dynamics
- social/memetic networks

qualitative not quantitative data
- diversity of experimental conditions
- inaccessibility/replication of experimental conditions

upgrading to hybrid qualitative/quantitative analysis tools

Standardized Lexical Foundations for the Annotation, Archiving and Analysis of Complex Biological Systems

entity classes : finite elements

action properties : state properties

intramolecular site interactions

intermolecular site interactions

massively parallel networks : unit modules

continuum systems

compartments

economy and parsimony

evolutionary relationships

network pathways
- redundancy (degeneracy), pleiotropy
- complex emergent properties


submodels for searchable characteristics of functional knowledge

integration of submodels into web-based distributed model networks

Jabberwocky

“’Twas brillig and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves
And the mome raths outgrabe”

Lewis Carroll

The Divide Between Syntax and Semantics

“Colorless green ideas sleep furiously”

Noam Chomsky (1957)

syntactically valid

semantically void

The Divide Between Syntax and Semantics

“Colorless green ideas sleep furiously”

Noam Chomsky (1957)

encoded genome structure (syntax) and diverse expression repertoires (semantics)
- alternative splicing
- overlapping reading frames
- nonsense mutations
- differential modulation by different transcription factors

database formats (syntax) and ontology (semantics)

The Conceptual Complexity of Ontology Design

ontology
- set of axioms in a logical language
- representational vocabulary with precise definitions of shared understanding
- axioms constrain interpretation of defined terms

XML versus ontology and evolution of the semantic web
- XML less complex since semantics are not represented
- objective to reduce uncertainty favors ontologies
- objective to reduce complexity favors XML
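
The contrast between database format (syntax) and ontology (semantics) can be made concrete with a minimal sketch. Two records that are both well-formed XML describe the same gene with different tag vocabularies; a parser accepts both, but only a shared vocabulary shows that they mean the same thing. The records, the tag names and the `SHARED_TERMS` mapping are hypothetical, and a flat term mapping like this is far simpler than a real ontology, which would also carry axioms constraining interpretation.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records for the same gene, using different tag vocabularies.
RECORD_A = "<gene><symbol>GAL4</symbol><organism>yeast</organism></gene>"
RECORD_B = "<locus><name>GAL4</name><species>yeast</species></locus>"

# Hypothetical shared vocabulary: maps each source tag to a common concept.
SHARED_TERMS = {
    "symbol": "gene_symbol",
    "name": "gene_symbol",
    "organism": "taxon",
    "species": "taxon",
}


def parse_fields(xml_text):
    """Syntax only: return {tag: text} for the children of the root element."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def normalize(fields, vocabulary):
    """Semantics: rename tags to shared concepts; tags not in the vocabulary are dropped."""
    return {vocabulary[tag]: value for tag, value in fields.items() if tag in vocabulary}


if __name__ == "__main__":
    a, b = parse_fields(RECORD_A), parse_fields(RECORD_B)
    print(a == b)                                                  # raw fields differ
    print(normalize(a, SHARED_TERMS) == normalize(b, SHARED_TERMS))  # shared concepts agree
```

Parsing alone keeps the system simple but leaves the equivalence of the two records uncertain; adding the shared vocabulary removes that uncertainty at the cost of building and maintaining the mapping, which mirrors the trade-off stated on the slide.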

Convergence, Consilience, Cognition and Computing

scientific, technological and economic convergence

data complexity

data scale

data diversity

optimized data representation

optimized data comprehension

optimized data utilization

• novel visualization and mining tools
• human medicine interfaces

• ‘mind in the loop’ computing
• modulation of brain function for optimum perceptualization

• adaptive IT
• novel emergent networks

Bounded Rationality

human mind’s processing capacity is small relative to the size of the problems requiring analysis/comprehension (Simon)

objective solutions require complexity reduction in information, task and coordination

complexity reduction- omission and abstraction- division of labor (systems decomposition)

complexity reduction simultaneously increases uncertainty (Fox)

implications for evolution of ontologies for the semantic web

Enhancing Human Cognitive Capacities for Optimizing Information Utilization

escalating quantities and types of information

real time decision making

new multi-modal, multi-sensory high performance human : information interfaces

representation and comprehensibility of information flows
- optimize information representation (perception)
- modulation of brain function to optimize comprehension

systemic application of advances in cognitive neurobiology

Enhancing Human Cognitive Capacities for Optimizing Information Utilization

optimizing representations of information
- perceptualization

optimizing cognitive capacities
- states of the brain affect states of mind (perception and cognition)
- perceptual modulation techniques

Interdisciplinary Linguistics : Memetic Engineering

molspeak, medspeak, nerdspeak

standardization coding

speech recognition

object-oriented computing

synthetic intelligence

Molecular Medicine, Population Segmentation and Targeted Patient Care

large-scale population genetics

geno-phenotype correlations in subpopulations

‘at-risk’ subpopulations

individual risk profiling

Population Genetics

Linking Clinical Outcomes to Genetic Variation

population genetics

haplotype blocks

SNP maps

low cost high-throughput genotyping

gene-disease associations

ethics

dbases informatics

Large-Scale Disease Association Genetics and Disease Predisposition Risk Profiling

formidable logistics and cost

robust algorithms for combinatorial gene interactions

slow evolution

complex ethical, legal and social issues

public acceptance and legislative controls

evidentiary standards and regulation
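
One reason the slide pairs "formidable logistics and cost" with the need for robust algorithms for combinatorial gene interactions is the sheer number of candidate combinations. The sketch below counts them under stated assumptions: the figure of 1000 loci is a hypothetical example, not a number from the slides, and the function name `interaction_counts` is illustrative.

```python
from math import comb  # available in Python 3.8 and later


def interaction_counts(n_loci, max_order=3):
    """Number of distinct k-way combinations of loci for k = 2 .. max_order."""
    return {k: comb(n_loci, k) for k in range(2, max_order + 1)}


if __name__ == "__main__":
    # Hypothetical study size, chosen only to show how fast the counts grow.
    print(interaction_counts(1000))  # {2: 499500, 3: 166167000}
```

Even a modest panel of loci yields hundreds of thousands of pairs and over a hundred million triples to test, so exhaustive testing scales poorly as the interaction order rises.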

Legislative and Regulatory Considerations in the Creation and Management of Large Scale Population Health Data Networks

consent

identifiable (clinical) versus anonymous (research) data

authentication of communicating parties

compliance
- HIPAA (USA)
- EU Data Directive
- individual nation/US State requirements
- ICH5 Common Technical Document

e.health

Content

Care

Population Databanks and the Rise of Molecular Medicine

individual / family records

privacy and confidentiality

gene-disease correlations

gene-outcome correlations

gene-disease predisposition associations

individual (targeted) care
- optimum Tx
- predisposition and proactive risk management

infection

CNS

CPD

CVD

diabetes

renal

stroke

cancer

Shaping Physician Behaviour

decision support / control Dx/PDx Rx, PRx clinical guidelines education

Rx validation utilization compliance AE avoidance

wellness education compliance risk mitigation remote monitoring

population dBase

individual record and risk profile

Shaping Consumer / Patient Behaviour

Physician Desk-Top Network

e.Pharmacy

e.Home Health

Who Knows Wins!

Health Databanks

Population Segmentation and Individual Patient Profiling

• clinical
• pharmacy
• lab data
• outcomes

“The average person will have three to five internet devices on their body by the end of 2010… not just the mobile phone, but health monitors, maybe even an implanted device, a GPS type of system, etc…”

John Chambers
Cisco Systems

dot.CEO January 2001, p. 53

Consumer Health Information Systems and Services

in-home to physician / pharmacy links

next generation tele-medicine and personal health monitoring

compliance monitoring

independent living

emergency management

integration of new imaging / diagnostic sensor systems

Biology and Medicine as Information-Based Disciplines

on-body / in-body / in-home remote devices for health status / compliance monitoring

interactive computational software and Rx of behavioral disorders

ubiquitous physician decision-support software to optimize clinical care and compliance

Cyber-Medicine

The Evolution of Large-Scale Biology

genome sequencing

comparative genomics

proteomics

functional genomics

structural genomics

genetic circuits

biological order

complex systems

SNPs and gene-disease association studies

large-scale population and statistical genetics

robust geno-phenotype correlations

individual genotyping and disease risk profiling

INFORMATICS

Biology and Medicine as Information-Based Disciplines

understanding the encoded instructions for biological design
- genes, proteins, higher order assemblies
- abnormal information coding in disease

assembly of large-scale population databases
- gene-disease correlations
- gene-Rx outcome correlations
- individual genotyping and disease predisposition risk profiling

Research

Clinical Medicine

Systems Analysis

Biology as an Informational Science

new technological platforms
- automation, miniaturization, high-throughput
- parallelism

new computational tools
- scale, diversity of content
- mining algorithms

new organizational linkages
- convergence of biology and computing (science)
- health / telco / compco (technology)

Systems Analysis

Biology as an Informational Science

new skills
- graduate / post-graduate curricula
- clinical training

new organizational structures
- inter-disciplinary

new policies
- grant agencies
- national / international science
- regulation, legislation

Computational Biology

predictive simulation of gene regulation and genetic networks
- from genotype to phenotype

fast algorithms for molecular simulations

modeling of molecular interactions, chemical dynamics, transport and compartmentalization in cells

metabolic and physiological simulations

scalar modeling
- molecules to cells to tissues to organs to organisms to populations

predictive tools for pre-emptive stabilization of system dysregulation

Grand Challenges

From Bioinformatics to Computational Biology

Bioinformatics : The Phenomenological Era

Computational Biology : The Theoretical Era

• ID and classification of statistical regulation among the most recurrent objects
• optimum database design
• fast classification/clustering algorithms
• data mining software and ontological relationships

• elucidation of robust design rules
• higher order multistate detector and component interactions
• contextual recognition
• pathways, circuits, networks and higher order assemblies
• predictive biology

biology and medicine are in transition to become information-based sciences

this transition will shift R&D focus from the current reductionist framework to the analysis of biological complexity (systems biology)

these transitions will demand adoption of large scale analyses (big biology) and obligate adoption of more stringent standardization
- data QC, annotation, curation
- dBase formats and clinical profiling tools
- massive computational capacity and dynamic, scalable networks
- distributed computing and collaborative networks
- from bioinformatics to ‘rules-based’ computational biology and cybermedicine
