VAISHNAVI GOWRISANKAR 665591371 04/07/2015 Human Disease Symptom Network Zhou et.al., 2014

Dec 22, 2015


Laurence Haynes
VAISHNAVI GOWRISANKAR 665591371 04/07/2015

Human Disease Symptom Network

Zhou et al., 2014

Outline

Introduction

Human Symptom Disease Network (HSDN)

Construction of HSDN

Results
  Performance evaluation of HSDN
  Integrating gene-disease associations
  Integrating shared protein interactions
  Diversity of disease manifestation and molecular mechanism
  Disease groups

Discussion

Limitations

Future Directions

Introduction

Networks are used to study the entangled relationships between diseases.

Network construction has been widely used to infer comorbidity links between disorders from the disease histories of patients [1].

Disease phenotypic networks built from comorbidity patterns have been used to understand disease progression patterns [2].

Introduction

The symptoms and signs that a patient presents have largely been overlooked.

Symptoms are the most directly observable characteristics of a disease and the very basis of clinical disease classification.

A connection between the shared symptoms and the shared genes of two diseases could bridge the gap between biological discovery and bedside clinical observation.

Introduction

In this article, large-scale medical bibliographic records (PubMed, including MEDLINE) and the related Medical Subject Headings (MeSH) metadata were used to generate a symptom-based network of human diseases, the HSDN.

The link weight between two diseases quantifies the similarity of their respective symptoms.

By integrating disease-gene association and protein-protein interaction data, the correlation between the symptom similarity of diseases and their degree of shared genes was investigated.

Construction of HSDN

Basic datasets: construction of a symptom-based disease network requires a basic taxonomy for diseases and symptoms (MeSH) and a corpus of data from which to extract their relations (PubMed).

The MeSH vocabulary and the PubMed literature database were chosen from several possible combinations; the alternatives include ICD9/10, HPO and OMIM.

MeSH is used directly to index all articles in the massive PubMed database.

Construction of HSDN

MeSH is designed as a hierarchical structure with general categories (e.g. Animals, Diseases, Phenomena and Processes).

The Diseases category contains the sub-category Symptoms and Signs: terms related to clinical manifestations observed by physicians and perceived by patients.

All terms in the Diseases category except ‘animal diseases’ were included.

Construction of HSDN

Finally, 4,442 distinct MeSH disease terms and 322 distinct MeSH symptom terms were used in a PubMed query, which returned 7,109,429 PubMed records.

These 7,109,429 records were then filtered for the co-occurrence of at least one disease term and one symptom term, which left 849,103 records.
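The filtering step can be sketched in a few lines of Python. This is only an illustration under assumed data structures: the record layout (a dict with a `mesh_terms` list), the function name `filter_cooccurring` and the toy terms are hypothetical and not taken from the paper.

```python
def filter_cooccurring(records, disease_terms, symptom_terms):
    """Keep records indexed with at least one disease term and one symptom term."""
    kept = []
    for record in records:
        terms = set(record["mesh_terms"])
        # A record qualifies only if it intersects BOTH vocabularies.
        if terms & disease_terms and terms & symptom_terms:
            kept.append(record)
    return kept


# Toy data (hypothetical): four records with their MeSH index terms.
records = [
    {"pmid": 1, "mesh_terms": ["Asthma", "Cough"]},
    {"pmid": 2, "mesh_terms": ["Asthma"]},
    {"pmid": 3, "mesh_terms": ["Fever", "Headache"]},
    {"pmid": 4, "mesh_terms": ["Influenza, Human", "Fever", "Cough"]},
]
disease_terms = {"Asthma", "Influenza, Human"}
symptom_terms = {"Cough", "Fever", "Headache"}

kept = filter_cooccurring(records, disease_terms, symptom_terms)
kept_pmids = [record["pmid"] for record in kept]
```

Records 2 and 3 are dropped because they mention only a disease or only symptoms.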


Construction of HSDN

Illustration of the protocol

Construction of HSDN

Extracting the disease-symptom relationships from the PubMed bibliographic literature database. The associations between symptoms and diseases are based on their co-occurrence in the MeSH metadata fields of PubMed records.
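Counting co-occurrences might look like the sketch below. It assumes each record is simply a list of MeSH terms; the function name `count_cooccurrences` and the toy records are hypothetical.

```python
from collections import Counter
from itertools import product


def count_cooccurrences(records, disease_terms, symptom_terms):
    """Count how often each (symptom, disease) pair is indexed on the same record."""
    counts = Counter()
    for mesh_terms in records:
        terms = set(mesh_terms)
        diseases = terms & disease_terms
        symptoms = terms & symptom_terms
        # Every symptom on the record is paired with every disease on it.
        for symptom, disease in product(symptoms, diseases):
            counts[(symptom, disease)] += 1
    return counts


# Toy data (hypothetical): each record is the list of its MeSH terms.
records = [
    ["Asthma", "Cough"],
    ["Asthma", "Cough", "Wheezing"],
    ["Influenza", "Fever", "Cough"],
]
counts = count_cooccurrences(
    records,
    disease_terms={"Asthma", "Influenza"},
    symptom_terms={"Cough", "Fever", "Wheezing"},
)
```

The resulting counts are the raw association strengths that the weighting step later corrects for abundant symptoms.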

Construction of HSDN

Symptom-based disease similarity

• To quantify the relationship between a symptom and a disease, Tf-Idf is used

• Every disease j is represented by a vector of symptoms $d_j$, where $w_{i,j}$ quantifies the strength of the association between symptom i and disease j

• To correct for highly abundant symptoms and for publication bias towards certain diseases, the Tf-Idf weight is used instead of the absolute co-occurrence count $W_{i,j}$

Construction of HSDN

Term frequency-inverse document frequency (Tf-Idf)

$$w_{i,j} = W_{i,j} \, \log\frac{N}{n_i}$$

$W_{i,j}$ – absolute co-occurrence of symptom i and disease j (the raw strength of their association)

$N$ – number of all diseases in the dataset

$n_i$ – number of diseases in which symptom i appears
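A small Python sketch of this weighting follows. It assumes the standard Tf-Idf form, weight = W[i, j] * log(N / n_i), with the natural logarithm; the base of the logarithm, the function name `tfidf_weights` and the toy counts are assumptions made for illustration.

```python
import math
from collections import defaultdict


def tfidf_weights(cooccurrence):
    """Turn raw counts W[(symptom, disease)] into weights W * log(N / n_i)."""
    diseases = {disease for (_, disease) in cooccurrence}
    n_diseases = len(diseases)  # N: number of all diseases in the dataset

    # n_i: number of diseases in which symptom i appears at least once.
    diseases_per_symptom = defaultdict(set)
    for (symptom, disease), count in cooccurrence.items():
        if count > 0:
            diseases_per_symptom[symptom].add(disease)

    weights = {}
    for (symptom, disease), count in cooccurrence.items():
        if count > 0:
            n_i = len(diseases_per_symptom[symptom])
            weights[(symptom, disease)] = count * math.log(n_diseases / n_i)
    return weights


# Toy counts (hypothetical): three diseases, three symptoms.
raw_counts = {
    ("Cough", "Asthma"): 10,
    ("Cough", "Influenza"): 5,
    ("Wheezing", "Asthma"): 4,
    ("Fever", "Influenza"): 8,
    ("Fever", "Malaria"): 6,
}
weights = tfidf_weights(raw_counts)
```

In this toy example the rare symptom (Wheezing, seen in one of three diseases) is scaled by log(3), while the common ones (seen in two of three) are scaled by only log(1.5). A symptom that appears in every disease would get weight zero.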

Similarity between two diseases is defined by the cosine similarity of their respective disease vectors. For the vectors $d_x$ and $d_y$ of two diseases x and y:

$$\cos(d_x, d_y) = \frac{\sum_i d_{x,i} \, d_{y,i}}{\sqrt{\sum_i d_{x,i}^2} \, \sqrt{\sum_i d_{y,i}^2}}$$

Cosine similarity ranges from 0 (no shared symptoms) to 1 (identical symptoms).
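As an illustration, cosine similarity over sparse symptom vectors can be written as below. Representing each disease vector as a dict from symptom to weight is an assumption of this sketch, and the names and numbers are hypothetical.

```python
import math


def cosine_similarity(dx, dy):
    """Cosine similarity of two sparse vectors stored as {symptom: weight} dicts."""
    shared = dx.keys() & dy.keys()
    dot = sum(dx[s] * dy[s] for s in shared)
    norm_x = math.sqrt(sum(v * v for v in dx.values()))
    norm_y = math.sqrt(sum(v * v for v in dy.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an empty vector shares nothing with any other vector
    return dot / (norm_x * norm_y)


# Toy vectors (hypothetical weights).
asthma = {"Cough": 1.0, "Wheezing": 2.0}
bronchitis = {"Cough": 2.0, "Fever": 1.0}
malaria = {"Fever": 3.0}

sim_same = cosine_similarity(asthma, asthma)        # identical symptoms
sim_partial = cosine_similarity(asthma, bronchitis)  # one shared symptom
sim_none = cosine_similarity(asthma, malaria)        # no shared symptoms
```

Only the symptoms present in both vectors contribute to the numerator, which is why diseases with no shared symptoms score exactly 0.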


Construction of HSDN

A disease network is constructed, in which nodes represent diseases and links represent symptom similarities between diseases
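One possible way to assemble such a network is sketched here, assuming diseases are stored as sparse symptom vectors and that a link is created whenever the cosine similarity exceeds a cutoff. The cutoff parameter `min_similarity`, the function names and the toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine(dx, dy):
    """Cosine similarity of two sparse {symptom: weight} vectors."""
    dot = sum(dx[s] * dy[s] for s in dx.keys() & dy.keys())
    norm = math.sqrt(sum(v * v for v in dx.values())) * math.sqrt(
        sum(v * v for v in dy.values())
    )
    return dot / norm if norm else 0.0


def build_similarity_network(vectors, min_similarity=0.0):
    """Return weighted links {(disease_a, disease_b): similarity}."""
    edges = {}
    for a, b in combinations(sorted(vectors), 2):
        similarity = cosine(vectors[a], vectors[b])
        if similarity > min_similarity:
            edges[(a, b)] = similarity
    return edges


# Toy vectors (hypothetical weights).
vectors = {
    "Asthma": {"Cough": 1.0, "Wheezing": 2.0},
    "Bronchitis": {"Cough": 2.0, "Fever": 1.0},
    "Malaria": {"Fever": 3.0},
}
edges = build_similarity_network(vectors)
```

Comparing every pair is quadratic in the number of diseases, which is manageable for a few thousand nodes but would need an index of shared symptoms for much larger vocabularies.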

Construction of HSDN

Integrating gene-disease association and PPI databases to obtain the shared genes/PPIs between diseases

Construction of HSDN

Resulting disease network, in which links represent shared genes/PPIs

Construction of HSDN

The backbone of the HSDN. Highly clustered regions of the network belong to the same broad disease categories.

Results

Performance evaluation of HSDN: manual evaluation of retrieved co-occurrences

1,000 records were randomly selected from the 849,103 PubMed records, and their disease-symptom relationships were extracted with the help of medical experts.

The evaluation focused on whether the disease-symptom relationship is direct and not influenced by drugs or coincidental co-occurrence.

The reported symptom-disease relations are very specific: 57% of the records point to one disease, 28.5% point to two diseases and only 14.5% point to more than two.

False positives were minimal: only 0.8% (disease x is not related to symptom y).

Results

Performance evaluation of HSDN: reliability test for the disease similarity score

A benchmark disease network was constructed from the Human Phenotype Ontology (HPO) and compared with the HSDN.

Construction of the HPO benchmark network

• HPO is a manually curated database derived from OMIM (an online catalogue of human genes and genetic disorders)

• It covers the phenotypic abnormalities commonly seen in human monogenic diseases

• MeSH disease terms are typically more general, so several OMIM identifiers may map to one MeSH term

• The final HPO network used to benchmark the HSDN contained 940 diseases that map to both OMIM and MeSH, with 121,945 links indicating shared symptoms

Results

This network is much smaller than the HSDN but arguably of higher quality (OMIM disease identifiers are much more specific than the MeSH terms they map to).

Higher symptom similarity in the HSDN is associated with higher edge overlap with the HPO network.

The Pearson correlation coefficient between the ratio of shared disease links and the disease similarity is very high (0.96), indicating that the HSDN yields a reliable disease similarity score.
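For readers unfamiliar with the statistic, the Pearson coefficient can be computed as in the sketch below. The binned similarity values and overlap ratios are invented toy numbers chosen to be perfectly linear; they are not the paper's data and do not reproduce the reported 0.96.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Toy numbers (hypothetical): similarity bin centres and the fraction of
# links in each bin that also appear in the benchmark network.
similarity_bins = [0.1, 0.3, 0.5, 0.7, 0.9]
overlap_ratio = [0.05, 0.15, 0.25, 0.35, 0.45]

r = pearson(similarity_bins, overlap_ratio)
r_reversed = pearson(similarity_bins, overlap_ratio[::-1])
```

A value near 1 means the overlap ratio rises almost linearly with the similarity score, which is the pattern the slide describes.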

Results

Shared symptoms indicate shared genes between diseases

• Three genotype-phenotype databases were integrated to construct a Human Disease Network (HDN) as described by Goh et al. [3]

• In the HDN, two diseases are connected if they share a gene

• Comparing the HSDN with the HDN, the overlapping link ratio shows a strong positive correlation with disease similarity

• The overlapping link ratio is the fraction of disease pairs with both shared symptoms and shared genes among all disease pairs with shared symptoms

It can be inferred that diseases with more similar symptoms are more likely to have common gene associations.
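The two definitions on this slide (a gene-sharing link and the overlapping link ratio) can be sketched as follows. Storing links as frozensets of two disease names is a choice made for this example, and the disease and gene identifiers are placeholders.

```python
from itertools import combinations


def shared_gene_links(disease_genes):
    """Link two diseases whenever their associated gene sets intersect."""
    links = set()
    for a, b in combinations(sorted(disease_genes), 2):
        if disease_genes[a] & disease_genes[b]:
            links.add(frozenset((a, b)))
    return links


def overlapping_link_ratio(symptom_links, gene_links):
    """Fraction of symptom-sharing pairs that also share at least one gene."""
    if not symptom_links:
        return 0.0
    return len(symptom_links & gene_links) / len(symptom_links)


# Toy data (hypothetical identifiers).
disease_genes = {
    "A": {"g1", "g2"},
    "B": {"g2"},
    "C": {"g3"},
    "D": {"g3", "g4"},
}
gene_links = shared_gene_links(disease_genes)

symptom_links = {
    frozenset(("A", "B")),
    frozenset(("A", "C")),
    frozenset(("B", "C")),
    frozenset(("C", "D")),
}
ratio = overlapping_link_ratio(symptom_links, gene_links)
```

To study the trend against similarity, the ratio would be computed separately for the symptom links falling in each similarity bin.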

Results

Shared symptoms indicate shared protein interactions

• Not only gene associations but also close interactions of proteins are considered

• Five publicly available PPI databases were integrated into one binary PPI network

• Disease networks were constructed in which two diseases are linked if they share first- and second-order PPI interactions

• Proteins associated with the same human disease, disease category or phenotype tend to interact with each other. Because the HSDN focuses only on symptoms and includes all disease categories, it provides robust evidence that interacting proteins between diseases are also connected to similar higher-level manifestations.

Results

High symptom similarity strongly correlates with shared genes as well as with first- and second-order protein interactions, suggesting a general relationship between phenotypic similarity and path lengths in the PPI network.

To test this, the shortest path length (Dijkstra’s algorithm) was calculated for all protein pairs, along with the minimum shortest PPI path length between each disease pair.

The higher the symptom similarity, the shorter the PPI network distance between diseases.

Results

• Dijkstra’s algorithm is used to find all shortest paths in the PPI network

• To quantify the PPI distance between disease pairs, the single linkage distance $D_{SL}$ is used

• $D_{SL}$ is the minimum of all shortest path lengths between the related proteins

• For two diseases x and y with corresponding related protein sets $P_x$ and $P_y$, the single linkage distance is given by

$$D_{SL}(x, y) = \min_{p_i \in P_x,\; p_j \in P_y} D(p_i, p_j)$$

where $D(p_i, p_j)$ is the shortest path length between two proteins $p_i$ and $p_j$
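The single linkage distance can be sketched as below. Note the substitution: because the PPI network is described as binary (unweighted), this example uses a multi-source breadth-first search instead of Dijkstra's algorithm, which gives the same shortest path lengths on unweighted graphs. The adjacency-dict layout, the function name and the protein identifiers are hypothetical.

```python
from collections import deque


def single_linkage_distance(ppi, proteins_x, proteins_y):
    """Minimum shortest-path length between any protein of x and any of y.

    ppi is an undirected, unweighted network as {protein: set(neighbours)}.
    Returns None when no path connects the two protein sets.
    """
    targets = set(proteins_y)
    visited = set(proteins_x)
    # Start the search from all proteins of disease x at distance 0.
    queue = deque((protein, 0) for protein in proteins_x)
    while queue:
        protein, distance = queue.popleft()
        if protein in targets:
            return distance  # BFS order guarantees this is the minimum
        for neighbour in ppi.get(protein, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# Toy network (hypothetical): a chain a-b-c-d and a separate pair e-f.
ppi = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
    "e": {"f"},
    "f": {"e"},
}
d_far = single_linkage_distance(ppi, {"a"}, {"d"})
d_near = single_linkage_distance(ppi, {"a", "c"}, {"d"})
d_shared = single_linkage_distance(ppi, {"a", "b"}, {"b"})
d_unreachable = single_linkage_distance(ppi, {"a"}, {"e"})
```

Seeding the search with all proteins of one disease at once avoids computing every pairwise shortest path, and a distance of 0 means the two diseases share a protein.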

Results

Diversity of disease manifestations and molecular mechanisms

• Pleiotropism and genetic heterogeneity cause discrepancies between diverse clinical manifestations and the underlying cellular mechanisms

• To understand these complex relations, genome components are mapped to intermediate phenotype components and environmental factors

• To analyze the relation between the molecular and phenotypic diversity of diseases, the SGPDN is constructed

Shared genes, proteins disease network (SGPDN)

• An integrated disease network was constructed that combines phenotypic relations based on symptom similarity with shared molecular mechanisms based on protein interactions
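One way to picture this integration is the sketch below: a symptom-similarity link is kept only if it is strong enough and is also supported by a shared gene or a protein-interaction link. The cutoff parameter `min_similarity`, the frozenset link representation and the toy values are assumptions of this example.

```python
def build_sgpdn(similarity, gene_links, ppi_links, min_similarity=0.1):
    """Keep similarity links above the cutoff that have molecular support.

    similarity maps frozenset({disease_a, disease_b}) -> similarity score;
    gene_links and ppi_links are sets of such frozensets.
    """
    supported = gene_links | ppi_links
    return {
        pair: score
        for pair, score in similarity.items()
        if score > min_similarity and pair in supported
    }


# Toy data (hypothetical).
similarity = {
    frozenset(("A", "B")): 0.60,  # strong and gene-supported: kept
    frozenset(("A", "C")): 0.05,  # gene-supported but too weak: dropped
    frozenset(("B", "C")): 0.40,  # strong but unsupported: dropped
    frozenset(("C", "D")): 0.25,  # strong and PPI-supported: kept
}
gene_links = {frozenset(("A", "B")), frozenset(("A", "C"))}
ppi_links = {frozenset(("C", "D"))}

sgpdn = build_sgpdn(similarity, gene_links, ppi_links)
```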

Results

• The HSDN is filtered for significant links with a similarity score > 0.1

• All disease links supported by either shared genes or first-/second-order protein interactions are kept

• Betweenness and node diversity are used to measure disease diversity in this network

• Betweenness is a centrality measure quantifying how many shortest paths run through a node

• The diversity ϕ of node j is based on the node bridging coefficient, where k(i) is the degree of node i and N(i) denotes its neighborhood
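The formula itself did not survive in the transcript. The sketch below assumes the slide refers to the usual bridging coefficient from the network literature, the inverse degree of a node divided by the sum of the inverse degrees of its neighbors, which matches the symbols k(i) and N(i) named above. How the paper derives ϕ from it is not recoverable from the slide, so only the coefficient is shown; the toy graph is hypothetical.

```python
def bridging_coefficient(adjacency, node):
    """(1 / k(node)) divided by the sum of 1 / k(neighbour) over N(node).

    adjacency is an undirected graph as {node: set(neighbours)}.
    """
    degree = len(adjacency[node])
    if degree == 0:
        return 0.0  # isolated nodes bridge nothing
    inverse_neighbour_degrees = sum(
        1.0 / len(adjacency[neighbour]) for neighbour in adjacency[node]
    )
    return (1.0 / degree) / inverse_neighbour_degrees


# Toy graph (hypothetical): two triangles joined through a bridge node "m".
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "m"},
    "m": {"c", "d"},
    "d": {"m", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
coefficients = {node: bridging_coefficient(adjacency, node) for node in adjacency}
```

In this toy graph the bridge node m scores highest (0.75), because it has few links of its own while its neighbors are well connected.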

Results

A strong positive correlation was found between the two quantities used to measure disease diversity in the SGPDN and the corresponding maximum diversities of disease-related genes in the PPI network.

These results demonstrate that a disease with diverse clinical manifestations will typically also have more diverse underlying cellular network mechanisms

Results

Disease groups

• Studied to understand the interrelationships between classes of diseases

• In the SGPDN, diseases within the same category form clear, highly interconnected communities, e.g. metabolic diseases and digestive system diseases

• Exceptions include bacterial and viral diseases, which link to all the communities

Discussion

Results indicate strong associations between the symptom similarity of diseases and their shared genes and PPIs

Clear correspondence between the diversity of the clinical manifestations of diseases and the underlying diversity in their cellular mechanisms

Individual level disease phenotypes (symptoms) and molecular level disease components (genes/PPIs) show robust correlations, even though their direct associations are influenced by complicated intermediate factors


Discussion

Observed correlations between clinical manifestations and molecular mechanisms of disease can be highly valuable for functional annotations of genomics and reveal regularities between different disease categories

Another promising use of this broad data across disease categories is a comparison between genetic and infectious diseases

Symptoms also play a crucial role in drug-related research, as most FDA-approved drugs are palliative (they treat symptoms rather than targeting disease-specific genes or pathways)

Limitations

The MeSH vocabulary is relatively old and rigid, with only annual updates.

This could limit the extent to which the identified associations capture the latest research results in the rapidly evolving field of medicine.

Future Directions

How can full-text analysis of large-scale databases be improved to increase the accuracy of search?

How can symptoms be better distinguished from diseases, given that the distinction is not yet well understood?

How can techniques be developed that automatically extract information from clinical records?

How can symptom similarity scores be used for gene prioritization and target identification in viral/bacterial infections?

References

1) Rzhetsky, A., Wajngurt, D., Park, N. & Zheng, T. Probing genetic overlap among complex human phenotypes. Proc. Natl Acad. Sci. USA 104, 11694–11699 (2007)

2) Hidalgo, C. A., Blumm, N., Barabasi, A. L. & Christakis, N. A. A dynamic network approach for the study of human phenotypes. PLoS Comput. Biol. 5, e1000353 (2009)

3) Goh, K. I. et al. The human disease network. Proc. Natl Acad. Sci. USA 104, 8685–8690 (2007)

4) Supplementary Methods, Zhou et al., Nature Communications, 1-22 (2014)

5) Wikipedia


• THANK YOU!


Definitions

Genotype – The genetic makeup of a cell, an organism, or an individual usually with reference to a specific characteristic under consideration

Phenotype – The outward appearance of an organism, the expression of genotype in the form of traits that can be seen or measured.

Definitions

HSDN – Human Symptoms Disease Network

MeSH – Medical Subject Headings: a vocabulary defined by experts that offers comprehensive coverage across all disease categories

PPI – Protein-protein interaction

Definitions

Polygenicity – Inheritance in which multiple genes influence a phenotypic trait

Pleiotropism – A single gene affects a number of phenotypic traits in the same organism. These affected traits often seem unrelated to each other

Genetic heterogeneity – A single phenotype or genetic disorder can be caused by any of a number of alleles or loci. This is in contrast to pleiotropism