Tools and Algorithms in Bioinformatics
GCBA815, Fall 2013
Week 6: Interaction Network Analysis (Cytoscape)
Babu Guda
Department of Genetics, Cell Biology and Anatomy
University of Nebraska Medical Center
10/18/2013

Background:
• Gene products (RNA/proteins) rarely work alone; most often they interact with other gene products to accomplish a task
• Most cellular processes are regulated by protein-protein or DNA/RNA-protein complexes
• Impaired protein interactions can be causative factors for diseases or metabolic abnormalities
• Guilt by association: The unknown function of a protein can be inferred from the proteins it interacts with, if those proteins have a known function
• The field of protein-protein interactions (PPIs) is rapidly advancing on various fronts of biomedical research
Human interactome (Stelzl et al., 2005; figure reproduced from Cell)
• Screened about 25 million protein pairs for PPIs
• Found 3,186 PPIs among 1,705 proteins
• Mapped 195 disease proteins to new partners
• Provided functional annotation for 342 uncharacterized human proteins
Database | URL | Comments on the source and type of data covered
BIND | http://www.bind.ca/Action | All major model species covered.
BioGRID | http://www.thebiogrid.org | Mixture of in vivo, in vitro and Y2H interactions from different sources. Major species covered are Yeast, Drosophila, C. elegans and Human.
DIP | http://dip.doe-mbi.ucla.edu/ | Mostly Y2H studies, all major model species covered.
HPRD | http://www.hprd.org | Only Human, manually curated from the literature.
IntAct | http://www.ebi.ac.uk/intact | Mainly literature-curated. All major model species covered.
MINT | http://mint.bio.uniroma2.it/mint | Both experimental and literature-based. Major species covered are Yeast, Drosophila and Human.
OPHID | http://ophid.utoronto.ca/ophid | Only Human (experimental and predicted).
PRISM | http://gordion.hpc.eng.ku.edu.tr/prism | Predicted interactions based on interacting surfaces in X-ray crystal structures.
STRING | http://string.embl.de | Mostly predicted interactions based on multiple criteria.
• PROSITE: A database of protein profiles and patterns
• PRODOM: PROtein DOMain Database, built from UNIPROT
• PRINTS: A Compendium of Protein Fingerprints
• PFAM: Protein families database of alignments and HMMs
• TIGRfams: Protein families based on HMMs
• SMART: Simple Modular Architectural Research Tool
• BLOCKS: Blocks WWW Server, obtained from PROSITE
• PANTHER: Protein Analysis Through Evolutionary Relationships
• CATH: Class, Architecture, Topology and Homologous superfamily
• SCOP: Structural Classification of Proteins
• Superfamily: Structural and Functional Protein Annotation
• Gene3D: Domain Architecture Classification
• INTERPRO: Integrated Resource of Protein Domains and Functional Sites
Significance of studying domain-domain interactions (DDIs)
• Protein-protein interaction (PPI) data is available as binary data, i.e., an interaction is ‘found’ or ‘not found’.
• About 70% of eukaryotic proteins are multi-domain proteins. In these cases, it is difficult to know which domains actually participate in each interaction.
• Studying interactions at the domain level is vital for understanding the functional significance of PPIs.
• Experimental determination of all DDIs is tedious, hence computational methods can be used to infer DDIs in PPIs and thus can complement experimental investigations.
• Domain definitions were obtained from the InterPro database, which integrates 10 distinct domain databases such as Pfam, Prosite, SMART, Superfamily, etc. Out of 15,064 domains in the InterPro database, 10,389 (~70%) were used in this study.
Positive DDI dataset for validation:
• About 4000 known DDIs were used from the iPfam database.
• iPfam was created based on domain-domain contacts in solved protein structures and complexes. This dataset has been extensively used as a ‘gold standard’ for validating computational prediction methods.
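The ambiguity noted above for multi-domain proteins can be made concrete with a small sketch. It assumes a toy setup: the protein names (P1, P2), the domain identifiers (D1 to D5) and the one-entry gold-standard set are hypothetical placeholders, not records from InterPro or iPfam, and the function name candidate_ddis is illustrative only. The sketch lists every domain pair that could explain one observed PPI and then checks which of those pairs appear in the gold-standard set.

```python
from itertools import product

def candidate_ddis(domains_a, domains_b):
    """Return every unordered domain pair that could mediate an A-B interaction."""
    return {frozenset((da, db)) for da, db in product(domains_a, domains_b)}

# Toy data: hypothetical identifiers, not taken from any database.
protein_domains = {
    "P1": ["D1", "D2"],
    "P2": ["D3", "D4", "D5"],
}
# Hypothetical stand-in for a structure-derived gold-standard DDI set.
gold_standard = {frozenset(("D2", "D4"))}

# One observed PPI (P1-P2) yields 2 x 3 candidate domain pairs.
candidates = candidate_ddis(protein_domains["P1"], protein_domains["P2"])
supported = candidates & gold_standard

print(len(candidates), "candidate pairs;", len(supported), "found in the gold standard")
```

A single binary PPI between a two-domain and a three-domain protein already gives six candidate pairs, which is why a validated DDI set is useful for judging predictions.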
• Directed and undirected graphs
• Cyclical and linear graphs
• Complete and incomplete graphs
• Hub nodes
• Subgraphs
• Graph centrality
• Shortest path
• Graph density
• Power law distribution
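Several of the graph concepts listed above can be illustrated on a toy undirected network. The sketch below is only one possible illustration: the five-node edge list and all function names are made up for this example, and it uses only the Python standard library rather than Cytoscape itself. It computes node degree (the highest-degree node serves as the hub), degree centrality, graph density, and a shortest path length by breadth-first search.

```python
from collections import deque

# Toy undirected interaction network (hypothetical nodes and edges).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

def build_adjacency(edge_list):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def density(adj):
    """Fraction of possible undirected edges that are present: 2m / (n(n-1))."""
    n = len(adj)
    m = sum(len(neighbours) for neighbours in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def shortest_path_length(adj, source, target):
    """Number of edges on a shortest path, found by breadth-first search."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in adj[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # target not reachable from source

adj = build_adjacency(edges)
degree = {node: len(neighbours) for node, neighbours in adj.items()}
hub = max(degree, key=degree.get)  # node with the most interaction partners
degree_centrality = {node: d / (len(adj) - 1) for node, d in degree.items()}

print("hub:", hub, "degree:", degree[hub])
print("density:", density(adj))
print("shortest path C to E:", shortest_path_length(adj, "C", "E"))
```

In this toy network node A is the hub with three partners, the density is 0.5 (5 of 10 possible edges), and the shortest path from C to E has three edges (C, A, D, E).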
• http://www.cytoscape.org
• Interaction network visualization and analysis software, first published in 2003 by Trey Ideker’s group
• Open-source tool with active developer support
• Cytoscape version 2.8
• Cytoscape version 3.0 is newly released with new features
• Available for all platforms (Mac, PC, Linux)
• Contains an extensive collection of plugins to analyze a variety of datasets from biology, social sciences and the semantic web
• Integrated with other tools such as the R package
How to use Cytoscape?
• Register and download the software from the Cytoscape website: http://www.cytoscape.org
• Install on your local computer (PC/Mac/Linux)
• Locate the folder (Program files) where files are stored
• Use example datasets
  • .sif files are network input files; each line gives a source node, an interaction type and a target node, e.g. A pp B (protein-protein) or A pd C (protein-DNA)
  • node or edge attribute files
  • .cys files are Cytoscape session files (contain info on network, attribute and session option data)
  • Other formats: Text, Excel, GML, XGMML, SBML,