Bioinformatics Research Collaboration: NECC & BiND
Post on 21-Sep-2018
NICL Program: Bioinformatics Track
Washington, DC
June 17, 2014
Cathy H. Wu, Ph.D.
Edward G. Jefferson Chair & Director, Center for Bioinformatics & Computational Biology
Program Coordinator, Delaware INBRE
University of Delaware
Bioinformatics Research Collaboration: NECC & BiND
5th Biennial National IDeA Symposium of Biomedical Research Excellence (NISBRE)
Collaborative use of specialized resources & expertise in an integrated process
• Little Skate (Leucoraja erinacea) Clones: MDIBL-Mount Desert Island Biological Lab (ME)
• Next-Generation Sequencing: UD DNA Sequencing & Genotyping Center (DE)
• Sequence Assembly: Vermont Genetics Network (VT) with ME, RI
• Sequence Analysis & Annotation: Bioinformatics pipeline at UD CBCB (DE), ME, RI, NH, VT
• Storage & Access of Sequence/Annotation data: Shared data center (DE, VT, ME)
• Public Dissemination: NCBI (BioProject, SRA, GenBank), SkateBase (skatebase.org)
• Scientific publications: Science [PMC3264428], PNAS [PMC3150877], Database [PMC3308154]
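The division of labor in the list above can be sketched as a small data structure. This is only an illustration: the stage keys (`clones`, `sequencing`, and so on) are shorthand invented for this sketch, and only the stages for which the list names contributing states are included.

```python
# Stage -> contributing states, transcribed from the workflow list above.
# The stage keys are hypothetical shorthand, not project terminology.
WORKFLOW = [
    ("clones", ["ME"]),
    ("sequencing", ["DE"]),
    ("assembly", ["VT", "ME", "RI"]),
    ("annotation", ["DE", "ME", "RI", "NH", "VT"]),
    ("storage", ["DE", "VT", "ME"]),
]


def stages_by_state(workflow):
    """Invert the stage -> states list into state -> stages (in workflow order)."""
    result = {}
    for stage, states in workflow:
        for state in states:
            result.setdefault(state, []).append(stage)
    return result


if __name__ == "__main__":
    # Print each state's contributions, sorted by state abbreviation.
    for state, stages in sorted(stages_by_state(WORKFLOW).items()):
        print(f"{state}: {', '.join(stages)}")
```

Inverting the mapping makes it easy to see that every one of the five states contributes to annotation, while sequencing is concentrated at a single site.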
Skate Genome Project
Northeast Cyberinfrastructure Consortium (NECC)
[Figure: Little Skate Project Workflow. Themes: Collaboration, Scientific Resource, Education. NECC/NEBC states: DE, ME, NH, RI, VT.]
• Skate DNA: Stage 32 embryo (ME)
• Sequencing: DE
• Assembly: VT, ME
• Annotation: DE, ME, VT, NH, RI
• Data Storage: Regional Data Center
• Public Access: DE, ME
The Northeast Cyberinfrastructure Consortium (NECC)
The NECC, established in 2006, is a consortium of five IDeA states: Maine, Vermont, Delaware, Rhode Island, and New Hampshire.
NECC Goals
1. Build cyberinfrastructure in the NE region
2. Workforce training and diversity
3. Collaborative research
The Northeast Bioinformatics Consortium (NEBC) began the Little Skate Genome Project in 2010 as a model for distributed collaboration: a data-intensive project that requires integrating specialized resources and expertise.
http://www.necyberconsortium.org/
Current NEBC research efforts include characterization and comparison of embryonic transcriptomes from three chondrichthyan species: the little skate, Leucoraja erinacea, the small-spotted catshark, Scyliorhinus canicula, and the elephant shark, Callorhinchus milii.
Project-Based Discovery
The project is a valuable resource for comparative biomedical research and evolutionary biology. Publications using little skate genome and transcriptome data are increasingly prevalent in the literature.
1. Boehm, T. & Swann, J. B. Origin and Evolution of Adaptive Immunity. Annu Rev Anim Biosci 2, 259–283 (2014).
2. Braasch, I. et al. Connectivity of vertebrate genomes: Paired-related homeobox (Prrx) genes in spotted gar, basal teleosts, and tetrapods. Comp Biochem Physiol C Toxicol Pharmacol 163, 24–36 (2014).
3. Falcón, J. et al. Drastic neofunctionalization associated with evolution of the timezyme AANAT 500 Mya. Proc Natl Acad Sci U S A 111, 314–9 (2014).
4. Modrell, M. S. et al. A fate-map for cranial sensory ganglia in the sea lamprey. Dev Biol 385, 405–16 (2014).
5. Moore, D. B. et al. Asynchronous evolutionary origins of Aβ and BACE1. Mol Biol Evol 31, 696–702 (2014).
6. Venkatesh, B. et al. Elephant shark genome provides unique insights into gnathostome evolution. Nature 505, 174–9 (2014).
7. Amemiya, C. T. et al. The African coelacanth genome provides insights into tetrapod evolution. Nature 496, 311–6 (2013).
8. Frankenberg, S. & Renfree, M. B. On the origin of POU5F1. BMC Biol 11, 56 (2013).
9. Gillis, J. A., Modrell, M. S. & Baker, C. V. H. Developmental evidence for serial homology of the vertebrate jaw and gill arch skeleton. Nat Commun 4, 1436 (2013).
10. Lopes-Marques, M., Cunha, I., Reis-Henriques, M. A., Santos, M. M. & Castro, L. F. C. Diversity and history of the long-chain acyl-CoA synthetase (Acsl) gene family in vertebrates. BMC Evol Biol 13, 271 (2013).
11. Richards, V. P., Suzuki, H., Stanhope, M. J. & Shivji, M. S. Characterization of the heart transcriptome of the white shark (Carcharodon carcharias). BMC Genomics 14, 697 (2013).
12. Gaudet, P. et al. Recent advances in biocuration: meeting report from the Fifth International Biocuration Conference. Database (Oxford) 2012, bas036 (2012).
13. Tossidou, I. et al. CD2AP regulates SUMOylation of CIN85 in podocytes. Mol Cell Biol 32, 1068–79 (2012).
14. Västermark, Å. et al. Identification of distant Agouti-like sequences and re-evaluation of the evolutionary history of the Agouti-related peptide (AgRP). PLoS One 7, e40982 (2012).
15. Wang, Q. et al. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees. Database (Oxford) 2012, bar064 (2012).
16. King, B. L., Gillis, J. A., Carlisle, H. R. & Dahn, R. D. A natural deletion of the HoxC cluster in elasmobranch fishes. Science 334, 1517 (2011).
17. Schneider, I. et al. Appendage expression driven by the Hoxd Global Control Region is an ancient gnathostome feature. Proc Natl Acad Sci U S A 108, 12782–6 (2011).
NEBC Research
Three workshops and an annotation Jamboree were coordinated by the NECC collaboration. The workshops were designed to teach gene and protein annotation from a next-generation sequencing data perspective to participants with little or no experience. Instructors included regional NEBC experts and leaders from NIH and industry. Lecture materials are linked from SkateBase and serve as a valuable educational resource available to the public.
Workshops
Curriculum
SkateBase includes the infrastructure to teach gene and protein annotation. It has been used by NECC IDeA state institutions in both graduate and undergraduate classes. Through active, domain-targeted outreach, use of SkateBase as an educational model has expanded beyond the NECC institutions to include the Virginia Institute of Marine Science (VIMS). Genes annotated by students are reviewed by SkateBase curators before a gene page is created with structural and functional annotation, linked homology, and relevant PubMed references.
The American Elasmobranch Society (AES) is a non-profit organization that seeks to advance the scientific study of living and fossil sharks, skates, rays, and chimaeras, and to promote education, conservation, and wise utilization of natural resources. SkateBase data and resources were presented to AES members at the Joint Meeting of Ichthyologists and Herpetologists in 2013 and at the 2014 Plant and Animal Genome Conference. Researchers were introduced to the project and invited to use the resource for research and educational purposes. Continued expansion of SkateBase includes community annotation, which will benefit from the participation of domain experts. SkateBase is linked from the AES website as well as the Elephant Shark Genome Project website.
University of Maine at Machias
• Introduction to Biochemistry
University of Rhode Island
• Practical Tools for Molecular Sequence Analysis
University of Delaware
• Bioinformatics
• Experimental Molecular Biology
Georgetown
• Bioinformatics
Virginia Institute of Marine Science
• Molecular Genetic Data Analysis
Acknowledgements
Funding provided by a re-entry career award to JTW: NIGMS INBRE 3P20GM103446-12S1.
Skate genome sequencing was funded by NIH NCRR ARRA Supplements to 5 P20 RR016463-12 (MDIBL), 5 P20 RR016472-12 (UD), 5 P20 RR16462 (UVM).
The North East Cyberinfrastructure Consortium is funded by:
• NIH National Center for Research Resources grants: 5 P20 RR016463-12 (MDIBL), 5 P20 RR016472-12 (UD), 5 P20 RR16462 (UVM), 5 P20 RR016457-11 (URI), 5 P20 RR030360-03 (UNH)
• NIH National Institute of General Medical Sciences grants: 8 P20 GM103423-12 (MDIBL), 8 P20 GM103446-12 (UD), 8 P20 GM103449 (UVM), 8 P20 GM103430-11 (URI), 8 P20 GM103506-03 (Dartmouth)
• National Science Foundation EPSCoR grants: EPS-0904155 (UM), EPS-081425 (UD), EPS-1101317 (UVM), EPS-1004057 (URI), EPS-1101245 (UNH).
Outreach
Cartilaginous fishes are divided into two major groups, elasmobranchs and holocephalans. The skate genome project is currently the only public elasmobranch sequencing project. SkateBase.org serves as the project hub. SkateBase includes all data generated by the research effort, relevant links, tools including SkateBlast for local queries, a gene table containing annotation features, and a project vitae. For community annotation and educational applications, protein and gene annotation guides and examples are provided in addition to the online interface.
Protein Annotation Interface Gene Annotation Workflow
Gene Table
Global Chondrichthyan Genome Sequencing Efforts
The Skate Genome Project: A Model for Scientific Collaboration and Education
Jennifer T. Wyffels, Benjamin L. King, Shawn W. Polson, James Vincent, Chuming Chen and Cathy H. Wu
North East Bioinformatics Collaborative of the North East Cyberinfrastructure Consortium
Skates as Biomedical Models: Fundamental Vertebrate Characteristics
Most evolutionarily distant jawed vertebrate
Pressurized circulatory system
Adaptive immune system
Neural crest
Renal physiology
Reproductive Modes:
Oviparity – Placental Viviparity & Parthenogenesis
[Figure: Research Capabilities/Needs (Expertise, Needs). Areas shown: HPC, Computer Science, NGS, Public Health, Data Mining, Data Management, BioStatistics, Medical Informatics.]
• Research: Foster interdisciplinary, cross-campus and inter-institutional research collaborations synergistic to UD strategic areas
• Education: Establish graduate degree programs
– Fall 2010: Master’s Program in Bioinformatics & Computational Biology
– Fall 2012: PhD program in Bioinformatics & Systems Biology
• Core: Provide scientific expertise and infrastructure support in Bioinformatics & Computational Biology for the Delaware research and education community
• > 60 affiliated faculty from five Colleges
– Engineering (CoE), Arts & Sciences (CAS), Agriculture & Natural Resources (CANR), Earth, Ocean & Environment (CEOE), Health Sciences (CHS)
CBCB: Promote, coordinate and support interdisciplinary activities in Bioinformatics & Computational Biology
http://bioinformatics.udel.edu/
NGS (Next-Gen Sequencing) Data Analysis
Short Reads
Organize
Visualize
Analyze
Analysis Pipelines for
• RNA-Seq
• miRNA
• De novo Genome Assembly
• Reference Mapping
• Genomic Structural Variation: SNP/Indel/CNV
• Reduced Representation Library
• Amplicon Library (16S rRNA)
• Metagenome
• Metatranscriptome
Variant Pathway Analysis
The web-based iProXpress provides tools for functional profiling, such as pathway and GO enrichment analysis, and allows for custom display of selected fields from >160 databases integrated in PIR iProClass, including OMIM and KEGG
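Enrichment analysis of this kind is commonly framed as a one-sided hypergeometric test. The slide does not say which statistic iProXpress uses, so the following is a generic sketch of that common approach, not a description of the tool. The function name and the toy counts are assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: total number of genes in the background
    annotated:  background genes carrying the pathway or GO term
    sample:     number of genes in the list being tested
    hits:       genes in the list that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail: k annotated genes and (sample - k) unannotated ones.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy, made-up counts: 20 background genes, 5 annotated, 4 tested, 2 hits.
    p = hypergeom_enrichment_p(population=20, annotated=5, sample=4, hits=2)
    print(f"enrichment p-value: {p:.4f}")
```

In practice the p-values from many terms would also need a multiple-testing correction, which this sketch leaves out.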
Gene Variant Network Analysis
Pedigree & Phenotype Description
• Patellae dislocation
• Ligamentous laxity
• Small thorax
• Brachydactyly
• Short femoral necks
• Development delay
• Flat midface
• Depressed nasal bridge
• Stub thumb
• STRING Interaction network of variant genes
• CYTOSCAPE clustering and visualization
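As a rough illustration of clustering an interaction network, the sketch below groups genes into connected components after dropping low-scoring edges. This is a deliberately simple stand-in: the slide does not state which clustering method was applied in Cytoscape, and the gene names, scores, and threshold here are hypothetical placeholders rather than STRING output.

```python
def cluster_by_components(edges, min_score=0.0):
    """Group genes into connected components.

    edges: iterable of (gene_a, gene_b, score) tuples
    min_score: edges scoring below this value are ignored
    Returns a list of sorted gene lists, ordered by their first gene.
    """
    graph = {}
    for a, b, score in edges:
        # Register both genes so weakly connected ones still appear as singletons.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= min_score:
            graph[a].add(b)
            graph[b].add(a)

    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    # Hypothetical edges; the scores are invented for this example.
    toy_edges = [
        ("GENE_A", "GENE_B", 0.9),
        ("GENE_B", "GENE_C", 0.8),
        ("GENE_C", "GENE_D", 0.2),
        ("GENE_D", "GENE_E", 0.95),
    ]
    for cluster in cluster_by_components(toy_edges, min_score=0.7):
        print(cluster)
```

Raising `min_score` splits the network into smaller groups, which is one simple way to see which variant genes are most tightly linked.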
Clinical NGS Analysis: Multi-Institution Multi-Core Collaboration
Project Coordination
• Two Cores already using iLab Solutions
• Investigate unifying cores under iLab “Collaborating Cores” integration
• Unified sample submission
• Cores pass collaborative project seamlessly without user intervention
• Project tracking
• Mechanisms for simplified billing/accounting