Taverna Workflow
Jan 13, 2016
A suite of tools for bioinformatics
• Fully featured, extensible and scalable scientific workflow management system
– Workbench, server, portal
– Standards-compliant provenance collection
– Immediate ingest of web services
– Grid services, Beanshell scripts, R-scripts, BioMOBY services…
• Web 2.0 social collaboration environments (“E-Labs”) for sharing
– Methods, workflows
– Systems biology data, models and SOPs
– Statistical methods
• Curated catalogue of Web Services
Taverna Open Suite of Tools
• Client User Interfaces
• Workflow GUI Workbench
• Workflow Repository
• Service Catalogue
• Third Party Tools
• Programming and APIs
• Web Portal
• Activity and Service Plug-in Manager
• Provenance Store
• Workflow Server
• Open Provenance Model
• Secure Service Access
1000s of services developed by the community
Any SOAP-based service, REST services soon
Grid Services, R Scripts, Beanshell scripts, Java programs, BioMart queries
• Gene expression
• SNP arrays and aCGH
• Proteomics
• Pathway analysis
• Systems biology model building
• Sequence analysis
• Protein structure prediction
• Gene/protein annotation
• Microarray data analysis
• QSAR studies
• Medical image analysis
• Epidemiology
• Model simulation
• High throughput screening
• Phenotypical studies
• Phylogeny
• Statistical analysis
• Text mining
• Data retrieval and formatting
• QTL studies
CDK
Taverna Software
Release Information
• Taverna first released 2004.
• Current versions 1.7.2 and Taverna 2.1.2
• Currently 1500+ users per month, 350+ organizations, ~40 countries, 80000+ downloads across versions
Availability
• Freely available, open source LGPL
• On Windows, Mac OS, and Linux platforms
Resources
• http://www.taverna.org.uk, http://www.mygrid.org.uk
• User and developer workshops, documentation, email help desk
• Collaborations with numerous groups including NCI’s cancer biomedical informatics grid (caBIG), EMBL-EBI, NCBI, Concept Web Alliance, Bio2RDF
myExperiment
• A Web 2.0 community for sharing, discovering and reusing workflows and other scientific methods.
• A platform for launching workflows
• Launched late 2007.
• Currently: 3272 members, 223 groups, 1024 workflows, 306 files and 97 packs, 56 different countries.
• 10+ workflow systems: Taverna, Pipeline Pilot, BioExtract, Kepler
• ~3000 unique hits per month
REST APIs
Linked Open Data
Software: open source, BSD
Systems Biology and myGrid
SysMO-SEEK
• e-Laboratory for interlinking and sharing data, models, SOPs and workflows for Systems Biology in Europe
• ISA-TAB & SBML/MIRIAM compliant
• http://www.sysmo-db.org/
ONDEX
• Network based analysis environment for Systems Biology
• Uses Taverna workflows and text mining
• http://www.ondex.org/
Performing Taverna KDA and Pathways pipeline
• A demonstration Taverna Pipeline (workflow)
• Calculates the differentially expressed genes in a TCGA dataset
• Performs KDA using a Sage breast cancer network model and the list of differentially expressed genes
• Reformats the KDA output for Cytoscape
• Launches Cytoscape to visualize the results
• Extracts gene names from TCGA dataset
• Finds pathways for these genes in KEGG using workflow deposited in myExperiment.
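The first and third steps of the pipeline above can be illustrated with a minimal Python sketch. It is not the demonstration workflow itself. It assumes a Welch t-statistic with a fixed cutoff as the differential-expression test, and the Cytoscape SIF text format as the reformatting target. The gene names, sample IDs, expression values, network edges and the threshold are all hypothetical toy inputs, and the KDA step is not implemented; a simple edge filter stands in for it.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two samples (unequal variances allowed).

    Assumes each sample has at least two values and that the two
    variances are not both zero.
    """
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        va / len(a) + vb / len(b)
    )


def differentially_expressed(expr, tumor_ids, normal_ids, threshold=4.0):
    """Return the genes whose |t| exceeds the (hypothetical) threshold."""
    hits = []
    for gene, values in expr.items():
        tumor = [values[s] for s in tumor_ids]
        normal = [values[s] for s in normal_ids]
        if abs(welch_t(tumor, normal)) > threshold:
            hits.append(gene)
    return sorted(hits)


def to_sif(edges, interaction="pp"):
    """Format (source, target) pairs as tab-delimited Cytoscape SIF lines."""
    return [f"{src}\t{interaction}\t{dst}" for src, dst in edges]


# Toy expression matrix: gene -> {sample ID -> expression value}.
expr = {
    "GENE_A": {"t1": 9.1, "t2": 8.9, "t3": 9.3, "n1": 5.0, "n2": 5.2, "n3": 4.9},
    "GENE_B": {"t1": 6.0, "t2": 6.4, "t3": 5.8, "n1": 6.1, "n2": 5.9, "n3": 6.3},
}
tumor_ids = ["t1", "t2", "t3"]
normal_ids = ["n1", "n2", "n3"]

# Toy network edges; a real run would take these from the network model.
network = [("GENE_A", "GENE_C"), ("GENE_B", "GENE_C")]

gene_list = differentially_expressed(expr, tumor_ids, normal_ids)

# Stand-in for the KDA step: keep only edges that touch a selected gene.
selected_edges = [e for e in network if e[0] in gene_list or e[1] in gene_list]
sif_lines = to_sif(selected_edges)

print(gene_list)
print("\n".join(sif_lines))
```

In a workflow system each function would typically be a separate step, with the gene list passed from the differential-expression step to both the network analysis and the pathway lookup.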
Taverna pathway pipeline demo