Service Composition in Biomedical Applications
Pedro Lopes [email protected]
PhD Thesis Proposal
Programa Doutoral em Engenharia Informática
December 17th, 2009
Research Supervisor: José Luís Oliveira [email protected]
Outline
‣ Introduction
‣ Bioinformatics
‣ Objectives
‣ Problems & Requirements
‣ Technologies
‣ Strategies
‣ Workplan
‣ What’s Next?
Introduction
‣ The Internet (and computer science) is undergoing a (r)evolution!
• New application paradigms
‣ Web access anywhere, anytime, and for everyone
• Static
• Mobile
‣ The platform for everything
• New opportunities
• New challenges
Data → Information → Knowledge
Bioinformatics [Motivation]
‣ Binary alphabet (1, 0): 8 bits = 1 byte
• Example: 1 0 1 1 0 0 1 1
• 256 combinations
‣ DNA alphabet (A, T, C, G): 8 bases
• Example: A C C G T T A G
• 65,536 combinations
‣ Wonderful complexity!
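The two counts on this slide follow from raising the alphabet size to the sequence length. A minimal sketch of that arithmetic (the function name `sequence_count` is an assumption made for illustration):

```python
from itertools import product


def sequence_count(alphabet: str, length: int) -> int:
    """Number of distinct sequences of the given length over the alphabet."""
    return len(alphabet) ** length


BINARY = "01"
DNA = "ACGT"

binary_count = sequence_count(BINARY, 8)  # 2 ** 8
dna_count = sequence_count(DNA, 8)        # 4 ** 8

# Cross-check the formula by brute-force enumeration for a short length.
assert sequence_count(DNA, 3) == len(list(product(DNA, repeat=3)))

print(binary_count, dna_count)  # prints "256 65536"
```

Eight positions over a four-letter alphabet already give 256 times as many sequences as eight bits.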
Bioinformatics [Contextualization]
‣ It all started with the Human Genome Project...
• Immense amount of data
‣ New technologies to deal with the “Book of Life”
‣ New projects were born
• More data!
‣ Need for improved, next-generation applications
Data → Information → Knowledge
Bioinformatics [Landscape]
‣ Databases
• KEGG, UniProt, EBI, NCBI, LOVD, UMD... (150 in GeNS)
‣ Service protocols
• DAS, BioMart, EMBOSS, Soaplab, WABI, BioMOBY
‣ Integration applications
• DiseaseCard, GeneBrowser, GeNS, ...
• Taverna, Bioclipse...
• Biozon, Bioconductor, Entrez, Ensembl, ...
• Bio2RDF, RDF Scape, ...
‣ Previous research
• DynamicFlow
Objectives
‣ Dig deep into the life sciences research field
• Understand the problems
• Study state-of-the-art
‣ Propose solutions
• Analyze the requirements
• Develop framework
‣ Internal and external usage
‣ Publish
Promote research and development of novel, next-generation frameworks and strategies to enhance life sciences web applications and systems
Roadmap
‣ Problems & Requirements
• Heterogeneity
• Integration
• Interoperability
• Description
‣ Strategies
• Static Apps
• Dynamic Apps
• Meta Apps
‣ Technologies
• Web-based access
• Web Services
• GRID
• Semantic Web
Heterogeneity [Problems & Requirements]
‣ Subject of many research projects
‣ Occurs at various levels
• Physical: Web Server, FTP Server, File Server, Backup Tape
• Logical: Relational Database, OO Database, Text File, Binary File
• Format: HTML, CSV, XML, TXT, Excel
• Model: Structure, Ontology, Semantics
• Access: Local, Remote APIs, Web Services
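Format-level heterogeneity (for instance CSV versus XML) can be illustrated by mapping two representations of the same data onto one common record shape. This is a minimal sketch, not part of the proposed framework: the field names, the sample payloads, and the functions `from_csv` and `from_xml` are assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Illustrative payloads: the same two records in two different formats.
CSV_TEXT = "gene,chromosome\nBRCA2,13\nTP53,17\n"
XML_TEXT = (
    "<genes>"
    "<gene symbol='BRCA2' chromosome='13'/>"
    "<gene symbol='TP53' chromosome='17'/>"
    "</genes>"
)


def from_csv(text: str) -> list:
    """Parse CSV rows into the common record shape."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"gene": row["gene"], "chromosome": row["chromosome"]} for row in reader]


def from_xml(text: str) -> list:
    """Parse XML elements into the same common record shape."""
    root = ET.fromstring(text)
    return [
        {"gene": node.get("symbol"), "chromosome": node.get("chromosome")}
        for node in root.iter("gene")
    ]


# Both sources now yield identical records despite their different formats.
print(from_csv(CSV_TEXT) == from_xml(XML_TEXT))
```

One adapter per format keeps the rest of an application independent of how each source encodes its data.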
Integration [Problems & Requirements]
‣ To deal with resource heterogeneity
‣ Various solutions, from Centralized (...) to Distributed (...)
• Warehouse
• Mediator
• Link
[Diagram: App, Mediator, and Warehouse components for each approach]
‣ Hybrid framework!
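The three integration approaches named on this slide (Warehouse, Mediator, Link) differ mainly in where the data lives at query time. The sketch below is a simplified illustration under stated assumptions: the two sources are in-memory dictionaries with placeholder values, and the class names are hypothetical rather than taken from any existing system.

```python
# Hypothetical sources standing in for remote databases (placeholder values).
SOURCE_A = {"gene-1": {"protein": "protein-record"}}
SOURCE_B = {"gene-1": {"pathway": "pathway-record"}}


class Warehouse:
    """Centralized: copy every record into one local store ahead of time."""

    def __init__(self, sources):
        self.store = {}
        for source in sources:
            for key, record in source.items():
                self.store.setdefault(key, {}).update(record)

    def query(self, key):
        return self.store.get(key, {})


class Mediator:
    """Intermediate: ask every source at request time and merge the answers."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, key):
        merged = {}
        for source in self.sources:
            merged.update(source.get(key, {}))
        return merged


class LinkIndex:
    """Distributed: keep only pointers to the sources that hold each key."""

    def __init__(self, named_sources):
        self.links = {}
        for name, source in named_sources.items():
            for key in source:
                self.links.setdefault(key, []).append(name)

    def query(self, key):
        return self.links.get(key, [])


sources = {"source_a": SOURCE_A, "source_b": SOURCE_B}
warehouse = Warehouse(sources.values())
mediator = Mediator(list(sources.values()))
links = LinkIndex(sources)

# A source changes after the warehouse was loaded.
SOURCE_B["gene-1"]["pathway"] = "updated-pathway-record"

print(warehouse.query("gene-1"))  # still holds the copy made at load time
print(mediator.query("gene-1"))   # reflects the updated source
print(links.query("gene-1"))      # only says where to look
```

The warehouse answers from a stale copy, the mediator pays for a live lookup on every request, and the link index stores no data at all. A hybrid framework would combine these trade-offs.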
Interoperability [Problems & Requirements]
‣ Facilitate integration and communication between applications
‣ Levels of interoperability (capability for interoperation increases from bottom to top)
• Conceptual interoperability
• Dynamic interoperability
• Pragmatic interoperability
• Semantic interoperability
• Syntactic interoperability
• Technical interoperability
• No interoperability
Description [Problems & Requirements]
‣ Resource description is the key to integration and interoperability
• Provides meaning to content
‣ Apply area-specific terminology
• Ontology
• An extra effort for resource publishers
• Will be very important in the future Internet
Web services [Technologies]
‣ Applications need to communicate with each other through the web
‣ Most widely used technology for the development of distributed web applications
• SOAP
• REST
• XMPP
[Diagram: Service Requester, Service Provider, and Service Broker (UDDI), connected through SOAP and WSDL]
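The Service Broker, Service Requester, and Service Provider roles in the diagram follow the publish, find, and bind pattern. The sketch below mimics that pattern in memory only. It is an assumption-laden illustration: `ServiceBroker` is a hypothetical stand-in for a registry such as UDDI, a plain function call replaces the SOAP message exchange, and no WSDL description is involved.

```python
from typing import Callable, Dict


class ServiceBroker:
    """In-memory stand-in for a service registry (hypothetical, not UDDI itself)."""

    def __init__(self) -> None:
        self._registry: Dict[str, Callable[[str], str]] = {}

    def publish(self, name: str, endpoint: Callable[[str], str]) -> None:
        """Called by a provider to advertise a service."""
        self._registry[name] = endpoint

    def find(self, name: str) -> Callable[[str], str]:
        """Called by a requester to discover a service."""
        return self._registry[name]


def reverse_complement(sequence: str) -> str:
    """Example provider operation: reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(sequence))


broker = ServiceBroker()
broker.publish("reverse_complement", reverse_complement)  # provider publishes
service = broker.find("reverse_complement")               # requester finds
result = service("ACCGTTAG")                              # requester binds and invokes

print(result)  # prints "CTAACGGT"
```

The requester only knows the service name, so providers can be swapped without changing the requester.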
GRID and Semantic Web [Technologies]
‣ GRID
• Combination of software and hardware infrastructures
‣ Pervasive, consistent, low-cost
• Various GRID types (Computing, Data, Knowledge)
‣ Semantic Web
• Resource Description
‣ Complete framework
• OWL + RDF + SPARQL, Microformats
• Link available resources in a meaningful way for both Humans and Machines
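Resource description on the Semantic Web rests on subject, predicate, object statements that can be queried by pattern. The sketch below imitates that idea with plain Python tuples. It is not RDF or SPARQL: the `ex:` names are a hypothetical vocabulary, and `match` is an illustrative helper in which `None` plays the role of a query variable.

```python
# Hypothetical statements in (subject, predicate, object) form.
TRIPLES = {
    ("ex:BRCA2", "rdf:type", "ex:Gene"),
    ("ex:BRCA2", "ex:locatedOn", "ex:Chromosome13"),
    ("ex:TP53", "rdf:type", "ex:Gene"),
    ("ex:TP53", "ex:locatedOn", "ex:Chromosome17"),
}


def match(pattern, triples=TRIPLES):
    """Return the triples matching the pattern; None matches any value."""
    return sorted(
        triple
        for triple in triples
        if all(want is None or want == have for want, have in zip(pattern, triple))
    )


# "Which resources are genes?"
genes = [subject for subject, _, _ in match((None, "rdf:type", "ex:Gene"))]

# "What is known about ex:TP53?"
tp53_facts = match(("ex:TP53", None, None))

print(genes)
print(tp53_facts)
```

Because every statement has the same shape, descriptions published by different providers can be merged into one set and queried together.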
Static or Dynamic Applications [Strategies]
‣ Static
‣ Solve all the problems... “by hand”!
• Hard-coded integration, interoperability, and description
‣ Not a very clever solution
• Adequate for (very) small projects
‣ Dynamic
‣ Take advantage of novel concepts
• Description + Composition
• Intelligent mechanisms for input/output combinations
‣ Generic
• Suitable for the majority of scenarios
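The dynamic strategy (description plus composition, driven by input/output combinations) can be sketched as a search over service descriptions. This is one possible mechanism, shown under stated assumptions: each service declares a single input type and a single output type, the service names and types are hypothetical, and a breadth-first search stands in for whatever matching logic a real framework would use.

```python
from collections import deque

# Hypothetical service descriptions: name -> (input type, output type).
SERVICES = {
    "gene_to_protein": ("gene_id", "protein_id"),
    "protein_to_pathway": ("protein_id", "pathway_id"),
    "gene_to_variants": ("gene_id", "variant_list"),
}


def compose(source_type, target_type, services=SERVICES):
    """Return the shortest chain of services from source to target type, or None."""
    queue = deque([(source_type, [])])
    seen = {source_type}
    while queue:
        current, chain = queue.popleft()
        if current == target_type:
            return chain
        for name, (needs, gives) in services.items():
            # A service can follow when its input matches what is available.
            if needs == current and gives not in seen:
                seen.add(gives)
                queue.append((gives, chain + [name]))
    return None


print(compose("gene_id", "pathway_id"))
print(compose("pathway_id", "gene_id"))
```

No chain is hard-coded here. Adding a new description to `SERVICES` makes new compositions available without touching `compose`, in contrast with the static approach.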
Meta Applications [Strategies]
‣ Applications running applications
• Like metadata is data about data
‣ Software-as-a-service
• Service Oriented Architectures
‣ Mashups
• Workflows
[Workflow diagram]
• Activity 1a: In: A, Out: B
• Activity 1b: In: A, Out: X
• Activity 2a: In: B & Z, Out: C
• Activity 2b: In: X, Out: Y
• Activity 3: In: Y, Out: Z
• Activity 4: In: C, Out: D
• Activity 5: In: D, Out: Final
‣ Advanced usage
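The workflow on this slide can be ordered purely from each activity's declared inputs and outputs. The sketch below encodes the seven activities as listed and derives a valid execution order. The `schedule` function is a hypothetical helper and only one of several ways to do this. Activities that become ready in the same round are independent and are listed alphabetically.

```python
# Activities from the slide: name -> (required inputs, produced output).
ACTIVITIES = {
    "1a": ({"A"}, "B"),
    "1b": ({"A"}, "X"),
    "2a": ({"B", "Z"}, "C"),
    "2b": ({"X"}, "Y"),
    "3": ({"Y"}, "Z"),
    "4": ({"C"}, "D"),
    "5": ({"D"}, "Final"),
}


def schedule(activities, initial):
    """Order activities so each runs only after all of its inputs exist."""
    available = set(initial)
    pending = dict(activities)
    order = []
    while pending:
        ready = sorted(
            name for name, (inputs, _) in pending.items() if inputs <= available
        )
        if not ready:
            raise ValueError("unsatisfiable inputs: " + ", ".join(sorted(pending)))
        for name in ready:
            _, output = pending.pop(name)
            available.add(output)
            order.append(name)
    return order


print(schedule(ACTIVITIES, {"A"}))
# prints ['1a', '1b', '2b', '3', '2a', '4', '5']
```

Activity 2a has to wait for Z, so the branch 1b, 2b, 3 completes before it can run, even though 2a's other input B is available after the first round.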
Workplan [Calendar]
‣ Thesis
• State of the Art
• Domain Analysis
• Proposal
• Main corpus
• Delivery
‣ Software
• Preliminary Research
• System Analysis
• Modelling
• Active Development
• Deliveries
‣ Publications
• High Impact Factor
• Medium Impact Factor
[Gantt chart: tasks plotted over Years 1 to 4, by quarter (Q1 to Q4)]
Workplan [Publications]
‣ Medium impact factor
• International Conferences & Workshops
‣ High impact factor
• Science, BMC Bioinformatics, Hindawi, Oxford Journals
‣ Published work
• Dynamic Service Integration using Web-based Workflows
‣ 10th International Conference on Information Integration and Web Applications and Services; Linz, Austria; November 2008
• DynamicFlow: A Client-side Workflow Management System
‣ 3rd International Workshop on Practical Applications of Computational Biology; Salamanca, Spain; June 2009
• Arabella: A Directed Web Crawler
‣ International Conference on Knowledge Discovery and Information Retrieval; Madeira, Portugal; October 2009
• Link Integrator: A Link-based Data Integration Architecture
‣ International Conference on Knowledge Discovery and Information Retrieval; Madeira, Portugal; October 2009
• Integration of Variome Data using a Link Discovery Strategy
‣ Iberian Bioinformatics Conference 2009; Lisbon, Portugal; November 2009
What’s Next?
‣ Research and Development
• Enabling knowledge
‣ Semantic Web as a technology to ease integration and interoperability
• Well-defined competition
• Ongoing “hands-on” work
• Promote internal and external usage
‣ One framework, multiple projects
• EU-ADR, GEN2PHEN, DiseaseCard, OralCard, VarCard
‣ Publish
Promote research and development of novel, next-generation frameworks and strategies to enhance life sciences web applications and systems
Thank You!
Questions?