BiodiversityWorld Grid Workshop, NeSC, Edinburgh, 30 June - 1 July 2005
Design decisions: architecture
Richard J. White, Cardiff School of Computer Science
1 July 2005
BDWorld architecture – user perspective (original ideas)
[Diagram labels: User; Local tools; Problem Solving Environment user interface; Problem Solving Environment: Broker agents, Facilitator agents, Presentation agents; BDGrid; Ontology; Metadata; Intelligent links; Resource & analytic tool descriptions; Maintenance tools; Taxonomic index (Species 2000 & ITIS Catalogue of Life); GSD (×4); Thematic data source; Abiotic data source; Analytic tool (×2); Proxy (×6)]
Design principles 1: the Grid
• Creating a Grid for biodiversity informatics
• Current Grid practice and software keep changing
• Architecture and much of the implementation should be insulated from changes in Grid technology: such changes should require no change to resource software (other than re-building wrappers); only our interface to the Grid would need to change
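The insulation principle above can be illustrated with a minimal sketch. It is not the BDWorld code: the class names `GridTransport`, `InProcessTransport`, `BGI` and `NameChecker` are invented for illustration, and the "Grid" here is just a local dictionary. The point is only that callers talk to one interface, so swapping the Grid technology means writing a new transport and nothing else.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class GridTransport(ABC):
    """Hypothetical stand-in for whichever Grid middleware is currently in use."""

    @abstractmethod
    def invoke(self, resource: str, operation: str, args: Dict[str, Any]) -> Any:
        ...


class InProcessTransport(GridTransport):
    """Toy transport: looks resources up in a local dictionary, not a real Grid."""

    def __init__(self, resources: Dict[str, Any]):
        self._resources = resources

    def invoke(self, resource, operation, args):
        # Find the named operation on the named resource and call it.
        return getattr(self._resources[resource], operation)(**args)


class BGI:
    """Only this class knows which transport is in use; callers never see it."""

    def __init__(self, transport: GridTransport):
        self._transport = transport

    def call(self, resource: str, operation: str, **args: Any) -> Any:
        return self._transport.invoke(resource, operation, args)


class NameChecker:
    """Toy resource offering a single operation."""

    def normalise(self, name: str) -> str:
        # Collapse runs of whitespace and capitalise the first letter only.
        return " ".join(name.split()).capitalize()


bgi = BGI(InProcessTransport({"names": NameChecker()}))
result = bgi.call("names", "normalise", name="  quercus   robur ")
```

Replacing `InProcessTransport` with another subclass of `GridTransport` would leave `NameChecker` and every caller of `bgi.call` untouched, which is the property the slide asks for.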
Architecture 1: interfacing to a Grid
[Diagram labels: A Software Component in BDWorld; BGI API; BDWorld-GRID Interface (BGI); The GRID]
Design principles 2: services
• Data sources, analytical tools, etc. should be made available as services which can be invoked remotely by clients
• Service-oriented computing is a Good Thing – users do not need to install or adapt resources to their own environments
• Potential for interoperability with other Grids in related domains such as environmental, molecular and genomic biology (e.g. myGrid)
Architecture 2: invocation via the Grid
[Diagram labels: A Software Component; Another Software Component; BDWorld-GRID Interface (BGI) with BGI API (×2); The GRID]
Design principles 3: wrappers
• The services are made available as Operations provided by a Resource
• A Resource is connected to the BDWorld Grid through a Wrapper
• Any program could therefore be a resource; only the difficulty of wrapping it would vary
• It should be possible to implement resources and wrappers in any language
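As a rough illustration of the Resource / Operation / Wrapper vocabulary above, the sketch below wraps an ordinary function that knows nothing about BDWorld. The `Wrapper` class, its methods and the `legacy_area` function are all hypothetical; the real wrappers are not described on these slides.

```python
from typing import Any, Callable, Dict, List


def legacy_area(width: float, height: float) -> float:
    """Stands in for an existing program that knows nothing about BDWorld."""
    return width * height


class Wrapper:
    """Hypothetical wrapper: presents a program as a Resource with named Operations."""

    def __init__(self, resource_name: str):
        self.resource_name = resource_name
        self._operations: Dict[str, Callable[..., Any]] = {}

    def expose(self, operation: str, func: Callable[..., Any]) -> None:
        # Register one callable under the operation name clients will use.
        self._operations[operation] = func

    def operations(self) -> List[str]:
        # Alphabetical list of the operations this resource provides.
        return sorted(self._operations)

    def invoke(self, operation: str, **params: Any) -> Any:
        if operation not in self._operations:
            raise KeyError(f"{self.resource_name} has no operation {operation!r}")
        return self._operations[operation](**params)


wrapper = Wrapper("geometry")
wrapper.expose("area", legacy_area)
area = wrapper.invoke("area", width=3, height=4)
```

Here the wrapped program is trivial; for a harder case the body of the exposed callable would grow, but the interface seen by clients would stay the same.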
Architecture 3: wrapped resources
[Diagram labels: User; Workflow enactment engine; BDWorld-GRID Interface (BGI) with BGI API (×2); The GRID; Wrapper; Remote Resource]
Design principles 4: workflows
• User metaphor based on the concept of workflows – requires a workflow manager for design and enactment of workflows
• Flexible use and re-use of workflows
• Resource interoperability with heterogeneous data, complex in structure
• Need to be able to select suitable resources which “fit together” in a workflow – requires metadata
• Need to record activities, data generated, etc.
• Did I say we also need a user interface?
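One way to picture how metadata could decide whether resources "fit together" is a simple type check between adjacent workflow steps. This is a sketch under stated assumptions: the `OperationMetadata` record, the two helper functions and all operation and type names are invented here, and real BDWorld metadata is richer than a pair of type labels.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OperationMetadata:
    """Hypothetical metadata record: what an operation consumes and produces."""
    name: str
    input_type: str
    output_type: str


def mismatches(workflow: List[OperationMetadata]) -> List[str]:
    """Report each adjacent pair whose output and input types do not match."""
    problems = []
    for upstream, downstream in zip(workflow, workflow[1:]):
        if upstream.output_type != downstream.input_type:
            problems.append(
                f"{upstream.name} -> {downstream.name}: "
                f"{upstream.output_type} is not {downstream.input_type}"
            )
    return problems


def candidates(after: OperationMetadata, catalogue: List[OperationMetadata]) -> List[str]:
    """Names of catalogued operations whose input matches the output of `after`."""
    return [op.name for op in catalogue if op.input_type == after.output_type]


# Illustrative catalogue; the operation and type names are invented.
check_name = OperationMetadata("check_name", "ScientificName", "AcceptedName")
find_records = OperationMetadata("find_records", "AcceptedName", "OccurrenceRecords")
build_model = OperationMetadata("build_model", "OccurrenceRecords", "DistributionModel")
catalogue = [check_name, find_records, build_model]

good = mismatches([check_name, find_records, build_model])
bad = mismatches([check_name, build_model])
next_steps = candidates(check_name, catalogue)
```

A workflow designer could use something like `candidates` to offer only the resources that can follow the step the user has just placed.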
Architecture 4: (as we planned it)
[Diagram labels: User; User interface; Legacy user interfaces; Presentation layer; Local tools e.g. Input and Output Units; Workflow enactment engine; Metadata repository; BDWorld-GRID Interface (BGI) with BGI API; The GRID; Native BDWorld resources; Wrapped “legacy” resources]
Design principles 5: desiderata
Extensibility and flexibility are important:
• Minimise the joining ‘cost’ for
  • users (easy installation of local components)
  • providers (adding a new resource of a type not previously encountered)
• Adding attributes to the metadata requires no change to the Metadata Repository (MDR)
• Challenges with handling non-portable resources and inflexible user interfaces …
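The requirement that new metadata attributes need no change to the repository can be met in several ways. The sketch below shows one generic technique, storing metadata as (resource, attribute, value) rows so that a new attribute is just a new row. It is an assumption for illustration only: the table layout, function names and sample values are invented and say nothing about how the MDR itself is built.

```python
import sqlite3

# In-memory database standing in for a metadata store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata ("
    " resource TEXT, attribute TEXT, value TEXT,"
    " PRIMARY KEY (resource, attribute))"
)


def set_attribute(resource: str, attribute: str, value: str) -> None:
    """Add or overwrite one attribute; no schema change is ever needed."""
    conn.execute(
        "INSERT OR REPLACE INTO metadata (resource, attribute, value) VALUES (?, ?, ?)",
        (resource, attribute, value),
    )


def describe(resource: str) -> dict:
    """Return every attribute currently recorded for a resource."""
    rows = conn.execute(
        "SELECT attribute, value FROM metadata WHERE resource = ?",
        (resource,),
    )
    return dict(rows.fetchall())


# Invented sample data: the second attribute was never planned for in advance.
set_attribute("climate_model", "platform", "Windows")
set_attribute("climate_model", "licence", "academic use")
description = describe("climate_model")
```

The trade-off of this layout is that the store cannot check attribute names or value types by itself; that knowledge would have to live elsewhere, for example in an ontology.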
Legacy resource issues addressed
• We planned to deal with resources which:
  • interact with their user locally in real time
  • have not been designed to be scripted
  • cannot support multiple simultaneous invocations
  • run on specific platforms only
  • have other unexpected requirements
• Using techniques such as:
  • capturing input and/or output and emulating a real user’s actions
  • providing user access to remote desktops
  • limiting where the user of a workflow can be sited
  • providing instructions for direct user control
  • modifying the source code
  • avoiding the use of the resource altogether
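Two of the techniques listed above, capturing input and output to emulate a user and coping with a resource that cannot handle simultaneous invocations, can be sketched together. Everything here is a stand-in: the "legacy program" is a one-line Python script launched as a child process, and `run_legacy` and the lock are hypothetical names, not BDWorld components.

```python
import subprocess
import sys
import threading

# Stand-in for an interactive legacy program: it prompts, reads one line, answers.
LEGACY_PROGRAM = [
    sys.executable,
    "-c",
    "name = input('Species? '); print('Checked: ' + name.strip().title())",
]

# The program is assumed unable to cope with concurrent runs, so calls are serialised.
_one_at_a_time = threading.Lock()


def run_legacy(answers):
    """Feed scripted answers to the program's stdin and return its captured stdout."""
    with _one_at_a_time:
        completed = subprocess.run(
            LEGACY_PROGRAM,
            input="\n".join(answers) + "\n",  # what a user would have typed
            capture_output=True,
            text=True,
            timeout=30,
            check=True,  # raise if the program exits with an error
        )
    return completed.stdout


output = run_legacy(["quercus robur"])
```

This only works for programs that read standard input; a program with a graphical interface would need one of the other techniques on the list, such as remote desktop access.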
Architecture (as we built it)
[Diagram labels: User; User interface; User interface (Protégé); Legacy user interfaces; Presentation layer; Local tools e.g. WFDA, Input and Output Units; MDA; Workflow enactment engine; Metadata repository (MDR); BDWorld-GRID Interface (BGI) with BGI API; The GRID; Native BDWorld resources; Wrapped “legacy” resources]
Glossary of components (existing or Real Soon Now)
• Problem-solving Environment (PSE)
  • Workflow Designer, Enactment Engine, User Interface [Triana]
  • local Units in Toolboxes
    • proxies for remote Operations
    • local functions, including
      • Input and Output Units
      • Workflow Design Assistant (WFDA)
  • Metadata Agent (MDA)
• BDWorld-Grid Interface (BGI)
  • BGI Comms Layer, API, Wrappers
• Remote Resources
  • provide Operations (services)
• Metadata Repository (MDR)
  • BDWorld ontology, metadatabase, user interface
Current evaluation; future flexibility
We believe our architecture ensures that BDWorld is:
• not limited to a specific application domain
• extensible to cope with unanticipated uses and resources
Because:
• new resources can be added
• domain-specific knowledge resides
  • only in the resources and the MDR
  • not in the BGI or the workflow engine or its user interface or the Metadata Agent
• MDR contents come from the resources and from humans customising the MDR to assist in new domains
A dream
• Desktop environment in which scientists “drag & drop” data sources, analysis and modelling tools, and visualisation interfaces into a desired sequence of operations which can be run automatically
• Essentially a component-based visual programming environment for scientific tasks
• With additional features (some described earlier), the environment could be made richer and more productive, and could support research groups.
• Not just for biodiversity!
Where do we go from here?
• Present system is a proof of concept
  • Limited
  • Restricted domain of exemplars
• Needs
  • more data resources
  • more PSE functionality (described next)
  • additional features
    • User interaction (described earlier)
    • Virtual organisations (described later)
Extra PSE functionality
Some of these topics are becoming available within the present BDWorld project
• Enhanced metadata
  • Provenance and data lineage
  • Automatic electronic “notebook”
• Stored workflows
  • Repeatability, reproducibility
  • Re-use with different data, changed parameters
• Ontologies
  • Resource discovery and improved selection
• Usability
  • Dynamic interaction of users with resources
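To make the "notebook" and stored-workflow ideas above concrete, here is a minimal sketch of recording each step as it runs and replaying the record later with changed parameters. The `Notebook` class, its methods and the `scale` operation are invented for illustration and do not describe the BDWorld provenance design.

```python
import json
from datetime import datetime, timezone


class Notebook:
    """Hypothetical automatic notebook: records every step that is run through it."""

    def __init__(self):
        self.entries = []

    def run(self, operation_name, func, **params):
        """Run one step and record its name, parameters, result and time."""
        result = func(**params)
        self.entries.append({
            "operation": operation_name,
            "parameters": params,
            "result": result,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return result

    def replay(self, registry, overrides=None):
        """Re-run the recorded steps; `overrides` maps operation name to changed parameters."""
        overrides = overrides or {}
        results = []
        for entry in self.entries:
            params = {**entry["parameters"], **overrides.get(entry["operation"], {})}
            results.append(registry[entry["operation"]](**params))
        return results

    def to_json(self):
        """Serialise the record; assumes parameters and results are JSON-friendly."""
        return json.dumps(self.entries)


def scale(values, factor):
    """Toy operation standing in for a real analysis step."""
    return [v * factor for v in values]


notebook = Notebook()
first = notebook.run("scale", scale, values=[1, 2], factor=2)
again = notebook.replay({"scale": scale})
changed = notebook.replay({"scale": scale}, {"scale": {"factor": 3}})
```

Replaying without overrides addresses repeatability; replaying with overrides is the "re-use with changed parameters" case. Replay does not add new entries in this sketch, which keeps the original record intact.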
Virtual organisations
These are not going to be addressed during the present BDWorld project, but would make a good Computer Science component in future proposals
• Collaborative working environments
  • Shared and private resources: data, tools
  • Shared experimentation
• User authentication
• Access control
  • Controlled release of data, tools and results
• Dynamic
  • Membership
  • Resources
The way forward
• New domain exemplars
• Links with national and international organisations, resources
• “End users”
  • Applied use, driven by scientific priorities
  • Input for planning
  • Feedback for evaluation and improvement
• …