Yolanda Gil (gil@isi.edu), USC Information Sciences Institute. AAAI-08 Tutorial, July 13, 2008. Part III: Computational Workflows in Wings/Pegasus. AAAI-08 Tutorial on Computational Workflows for Large-Scale Artificial Intelligence Research.
Page 1

Part III

Computational Workflows in Wings/Pegasus

AAAI-08 Tutorial on Computational Workflows for Large-Scale Artificial Intelligence Research

Page 2

Our Approach

Express analysis as distributed workflows
• Data analysis as distributed application
User-centric workflow refinement process
• Start with high-level problem description, add layers of detail, map to distributed execution environment
Knowledge-rich descriptions of workflows -- OWL/RDF
• Descriptions of input data and data products (aka “metadata”)
• Models of components in terms of I/O data and their function
Automation of resource allocation and optimization
• Efficient scheduling algorithms for workflow graphs
• Optimization techniques of broad applicability
Build on distributed computing research -- GRID
• Designed, by definition, to be robust, secure, flexible

Page 3

The Wings/Pegasus Workflow System
[Gil et al 07; Deelman et al 03; Deelman et al 05; Kim et al 08; Gil et al forthcoming]

WINGS: Knowledge-based workflow environment (www.isi.edu/ikcap/wings)
• Ontology-based reasoning on workflows and data (W3C’s OWL)
• Workflow library of useful analyses
• Proactive assistance + automation
• Execution-independent workflows

Pegasus: Automated workflow refinement and execution (pegasus.isi.edu)
• Optimize for performance, cost, reliability
• Assign execution resources
• Manage execution through DAGMan

Grid services (condor.uwisc.edu, www.globus.org)
• Daily operational use in many domains
• Secure and controlled sharing of distributed services, computing, data
• Scalable service-oriented architecture
• Commercial quality, open source

Page 4

Wings: Workflow Instance Generation and Selection

[Diagram labels]
• User roles: STUDENT, SEASONED NL RESEARCHER, ALGORITHM DEVELOPER
• User requests:
– “Show me workflows that classify datasets”
– “Run this workflow with the weather1980 data set”
– “Validate this workflow based on the component specs”
– “Here is a new classification algorithm, has a parameter for smoothing, is compiled for MPI”
• WINGS activities and products: Workflow Creation, Workflow Selection, Workflow Template, Data Selection, Workflow Instance, Component Specification
• Workflow Libraries
– Workflow templates specify complex analyses sequences
– Workflow instances specify data
• Data Repositories
– Preexisting data collections
– Workflow execution results
• Application Components
– Specifies data requirements
– Specifies execution requirements
• Ontologies (OWL): Domain terms, Component types, Workflow Products
• Execution: Executable Workflow, Pegasus, DAGMan/Grid

Page 5

[Diagram: software stack]
• Wings: Workflow validation, Data/Comp selection, Metadata generation, Workflow generation
• Pegasus: Site selection, Replica selection, Workflow optimization
• Workflow submission
• National Middleware Infrastructure (NMI) software:
– Globus RLS: replica mgmt
– GRAM: remote submission
– GridFTP: data transfer
– Condor DAGMan: execution engine
– Condor-G: job manager
– Nagios: monitoring probes

LEGEND: Workflow System; National Middleware Infrastructure (NMI) software

All software is open source

Page 6

Workflow Structure

We take to heart the separation of “programming” from “analysis” activities
– Components are designed by programmers and can be complex (and need testing, debugging, loops should terminate, etc)
– Workflows are composed by non-programmers and should have simple structure -- focus is on selecting application components and data
Therefore, our workflow structure is very streamlined
• Only iterations handled are parallel data processing pipelines
• Only conditionals handled are data-driven component selections
• Standard workflow languages offer much more complex constructs
Workflow structure designed to:
• Be accessible to users
• Facilitate automation and failure recovery

Page 7

Core Workflow Concepts

Workflow consists of
• Components: software to be executed
• Links: data flow among components
Directed Acyclic Graphs (DAGs)
• Facilitate automation, esp. execution monitoring and repair
Data always handled through files
Special handling of some control constructs, e.g. loops (more on this later)
• Choices of components
• Iterations over data sets
Layered workflow refinement process
• Select application components -> select data -> select execution resources
Each layer adds more information to the same basic workflow structure

[Diagram: example workflow with components C1, C2, C3 linked by files F1 through F6]
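The core concepts above (components, file links, acyclic structure) can be sketched in a few lines of Python. This is only an illustration, not WINGS code: the wiring of C1, C2, C3 and F1 through F5 is hypothetical, and the function name is invented for the example.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical workflow: each component lists the files it reads and writes.
components = {
    "C1": {"inputs": ["F1"], "outputs": ["F2", "F3"]},
    "C2": {"inputs": ["F2"], "outputs": ["F4"]},
    "C3": {"inputs": ["F3", "F4"], "outputs": ["F5"]},
}

def execution_order(components):
    """Derive component dependencies from shared files and order them.

    Raises graphlib.CycleError if the links do not form a DAG.
    """
    # Map every produced file to the component that writes it.
    producer = {f: name for name, c in components.items() for f in c["outputs"]}
    # A component depends on the producers of its input files.
    deps = {
        name: {producer[f] for f in c["inputs"] if f in producer}
        for name, c in components.items()
    }
    return list(TopologicalSorter(deps).static_order())

print(execution_order(components))
```

Because data always flows through files, the dependency graph falls out of the file names alone, which is one reason a DAG-only structure is easy to monitor and repair.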

Page 8

Workflow Abstraction Layers

We use several layers of description of workflows

Page 9

WINGS: Workflow Representation

Page 10

[Figure: Domain Ontology and Ontology of Components]

Terms shown: Parameter; IMR-Input-Parameter; Field-2000-Input-Parameter; Fault-Type; Basin-Depth; Distance; IMT; IMR; probability-function; Hazard-Level; Hazard-Level-with-SA; Hazard-Level-with-PGA; Hazard-Level-with-PGV; Hazard-Level-with-SA-Median; Hazard-Level-with-SA-Std-Dev; Hazard-Level-with-SA-Prob-Exc; Hazard-Level-with-Median; Hazard-Level-with-Std-Dev; F2-Hazard-Level; F2-SA-Median-wrt-VS30; . . .

Components shown:
• Compute-Hazard-Level-given-IMR-input-parameters
• Compute-Hazard-Level-with-SA-given-IMR-input-parameters
• Compute-Hazard-Level-with-PGA-given-IMR-input-parameters
• Compute-Hazard-Level-with-PGV-given-IMR-input-parameters
• Compute-Hazard-Level-with-SA-Median-given-IMR-input-parameters
• Compute-Hazard-Level-with-SA-Std-Dev-given-IMR-input-parameters
• Compute-Hazard-Level-with-SA-Prob-Exc-given-IMR-input-parameters
• Compute-F2-Hazard-Level-given-Field-2000-input-parameters
• Compute-F2-SA-Median-given-Field-2000-input-parameters
• Compute-F2-SA-Median-wrt-Distance-JB-given-Fault-Type-&-Basin-Depth-&-…
• Compute-F2-SA-MEDIAN-wrt-VS30-given-Fault-Type-&-Basin-Depth-&-…
• . . .

Other labels: F2-operation-SA-Median-Distance-JB; F2-operation-SA-Median-VS30

Page 11

WINGS: Representing Components

Inputs and outputs through files
• Files are typed
Each input is uniquely identified by a file descriptor (~ parameterID)
Ordered lists of file descriptors for both I and O
Any input or output can be defined as a file collection
• Same file type
• Unspecified cardinality
• Ordered

[Diagram labels: component C-one with file descriptors D1, D2, D3; component C-many with file descriptors DC11, D12, D13 and files F1]
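A minimal sketch of this component model, assuming a Python representation that is not part of WINGS: the class names, the `accepts` helper, and the file types ("text", "image", "count") are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class FileDescriptor:
    name: str                    # e.g. "D1"; plays the role of a parameter ID
    file_type: str               # files are typed
    is_collection: bool = False  # collections: same type, any cardinality, ordered

@dataclass
class Component:
    name: str
    inputs: list = field(default_factory=list)   # ordered file descriptors
    outputs: list = field(default_factory=list)  # ordered file descriptors

def accepts(descriptor, files):
    """Check an ordered list of (filename, file_type) pairs against a descriptor."""
    if not descriptor.is_collection and len(files) != 1:
        return False  # a plain descriptor binds exactly one file
    return all(ftype == descriptor.file_type for _, ftype in files)

c_many = Component(
    "C-many",
    inputs=[FileDescriptor("DC11", "text", is_collection=True),
            FileDescriptor("D12", "text")],
    outputs=[FileDescriptor("D13", "count")],
)
print(accepts(c_many.inputs[0], [("a.txt", "text"), ("b.txt", "text")]))
```

The collection flag is the only difference between a C-one style input and a C-many style input in this sketch.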

Page 12

Data Descriptions

Metadata of different kinds can be organized in ontology

Files represented as instances and classified in ontology according to their metadata

File collections also represented as instances and defined as ordered sets of file instances

A file Skolem is created for each class as a representative instance (more on this later)

Similarly, a file collection Skolem is created for each class

[Diagram labels: Application-Specific Metadata Ontologies; Content Metadata; Format Metadata; File Collection; Kim-Homepage; Gil-Homepage; IKCAP-pages; EHS-T; EHCS-T]

Page 13

A Component in a Workflow Template

Nodes correspond to individual application components

Links include file descriptors for origin and destination and a file Skolem

Notation: “S” marks a Skolem

[Diagram labels: components C-one (D1, D2, D3) and C67 (D6, D7); nodes N1, N2, N3; links L1, L2, L3, L4; file Skolems FS-A, FS-B, FS-C, FS-D]

Page 14

File Collections in a Workflow Template

Links that include file descriptors that are collections refer to file collection Skolems

Using the same file Skolem ID or file collection Skolem ID in different links indicates identity

[Diagram labels: component C-many (DC11, D12, D13; files F1); node N1; links L1, L2, L3; file collection Skolem FCS-A; file Skolems FS-B, FS-C]
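One way to picture template links and the identity rule for Skolem IDs is the sketch below. It assumes a simple Python record for links; the particular wiring of L1 to L4 is hypothetical and only reuses label names from the diagrams.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Link:
    link_id: str
    skolem: str                                   # "FS-..." file or "FCS-..." collection Skolem
    origin: Optional[Tuple[str, str]] = None      # (node, file descriptor); None for an input link
    destination: Optional[Tuple[str, str]] = None # (node, file descriptor); None for an output link

# Hypothetical wiring: N1's output D3 feeds descriptor D6 of both N2 and N3.
links = [
    Link("L1", "FS-A", None, ("N1", "D1")),
    Link("L2", "FS-B", None, ("N1", "D2")),
    Link("L3", "FS-C", ("N1", "D3"), ("N2", "D6")),
    Link("L4", "FS-C", ("N1", "D3"), ("N3", "D6")),
]

def shared_skolems(links):
    """Group links by Skolem ID; a repeated ID means the links carry the same file."""
    by_skolem = defaultdict(list)
    for link in links:
        by_skolem[link.skolem].append(link.link_id)
    return {s: ids for s, ids in by_skolem.items() if len(ids) > 1}

print(shared_skolems(links))
```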

Page 15

Iteration Over File Collections in a Workflow Template

Iteration over sets compactly represented with single nodes that contain component collections

Will be expanded to as many jobs as files are specified for the executable workflow

Links capture formation of file collections as input

[Diagram labels: component-collection node NC1 (C-one; D1, D2, D3); node N2 (C-many; DC11, D12, D13); links L1, L2, L3, L4; file collection Skolems FCS-G, FCS-K, FCS-Z; file Skolem FS-Y; expanded files G1, K1, Z1, G2, K2, Z2, ..., G88, K88, Z88; Y1]

Page 16

Iteration With a Constant in a Workflow Template

Nodes that represent component collections can take the same file from the same link when the link contains a file Skolem instead of a file collection Skolem

[Diagram labels: component-collection node NC1 (C-one; D1, D2, D3); node N2 (C-many; DC11, D12, D13); links L1, L2, L3, L4; file collection Skolems FCS-G, FCS-Z; file Skolems FS-K1, FS-Y; expanded files G1, K1, Z1, G2, K1, Z2, ..., G88, K1, Z88; Y1]
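The expansion of a component-collection node into individual jobs, with and without a constant input, might look like the following sketch. The function and the dictionary layout are assumptions made for illustration; only the file names (G1, G2, G88, K1, K2, K88) come from the diagrams.

```python
def expand_jobs(component, bindings):
    """Expand one component-collection node into one job per file.

    bindings maps a file descriptor to either a list of files (a file
    collection, iterated in order) or a single file name (a constant that
    every job reuses).
    """
    sizes = {len(v) for v in bindings.values() if isinstance(v, list)}
    if len(sizes) > 1:
        raise ValueError("all bound collections must have the same number of files")
    n = sizes.pop() if sizes else 1
    return [
        {
            "component": component,
            "inputs": {
                d: (v[i] if isinstance(v, list) else v) for d, v in bindings.items()
            },
        }
        for i in range(n)
    ]

# Iteration over two collections: job i reads G[i] and K[i].
paired = expand_jobs("C-one", {"D1": ["G1", "G2", "G88"], "D2": ["K1", "K2", "K88"]})
# Iteration with a constant: every job reads the same K1.
constant = expand_jobs("C-one", {"D1": ["G1", "G2", "G88"], "D2": "K1"})
print(len(paired), constant[2]["inputs"])
```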

Page 17

Constraints on Workflow Templates

Constraints on number of elements in different collections

Constraints on files/collections of different workflow components

[Diagram labels: CybershakeTemplate; hasLink; InputLink_SiteNameFile_to_BoxNameCheck; InputLink_RuptureVars_to_SeisgmogramGen; InputLink_SGTCollforRup_to_SeismogramGen; hasFile; F-RV; C-RuptVars; CC-RuptureVariations; F-SGT; C-SGT-forRups; CC-SGTs; SGTs; SiteNameFile; hasSiteName; SiteName; hasN_Items; N_Rups; … isSameAs]
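Both kinds of constraint can be read as equalities between metadata properties of different template elements. The sketch below checks such equalities over plain dictionaries; the property values (4 items, site "PAS") and the pairing of elements are assumptions for illustration, not taken from the CyberShake template itself.

```python
def violated(constraints, metadata):
    """Return the equality constraints that the metadata does not satisfy.

    Each constraint pairs two (element, property) references whose values
    must be equal. A property missing on both sides compares as equal here.
    """
    bad = []
    for (a, prop_a), (b, prop_b) in constraints:
        if metadata[a].get(prop_a) != metadata[b].get(prop_b):
            bad.append(((a, prop_a), (b, prop_b)))
    return bad

# Hypothetical metadata for elements of a template.
metadata = {
    "CC-RuptureVariations": {"hasN_Items": 4},
    "CC-SGTs": {"hasN_Items": 4},
    "SiteNameFile": {"hasSiteName": "PAS"},
    "SGTs": {"hasSiteName": "PAS"},
}
constraints = [
    # Same number of elements in two collections.
    (("CC-RuptureVariations", "hasN_Items"), ("CC-SGTs", "hasN_Items")),
    # Same site name on files used by different components.
    (("SiteNameFile", "hasSiteName"), ("SGTs", "hasSiteName")),
]
print(violated(constraints, metadata))
```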

Page 18

Workflow Instances

Input data selected from the file library by querying for files of the type of file Skolems

Logical names created for new data products with metadata based on file Skolems

Compact Workflow Instance = WT + bindings

Easy to understand, and easily transformed into an expanded WI and a DAX for Pegasus

[Diagram labels: components C-one (D1, D2, D3), C67 (D6, D7), C-plenty (D8, DC9); nodes N1, N2, N3, N4; links L1 through L7; Skolems FS-A, FS-B, F-C, FCS-D, FS-E, FS-F, FS-G; Bindings; Existing data: File85, File28, FileColl54; New data products: F34254-05-06-08, F34255-05-06-08, F34256-05-06-08, F34257-05-06-08]
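Selecting input data by querying for files of each Skolem's type can be sketched as below, assuming a toy in-memory file library. The type names and the extra file "File31" are hypothetical; "File85" and "File28" are borrowed from the diagram only as names.

```python
from itertools import product

# Hypothetical file library: (logical file name, file type).
file_library = [("File85", "TypeA"), ("File28", "TypeB"), ("File31", "TypeB")]

def candidate_bindings(input_skolems, file_library):
    """Enumerate bindings of input file Skolems to library files of matching type.

    input_skolems maps a Skolem ID to the file type it stands for. Each
    returned dict, together with the template, is one compact workflow instance.
    """
    choices = [
        [name for name, ftype in file_library if ftype == wanted]
        for wanted in input_skolems.values()
    ]
    return [dict(zip(input_skolems, combo)) for combo in product(*choices)]

bindings = candidate_bindings({"FS-A": "TypeA", "FS-B": "TypeB"}, file_library)
for b in bindings:
    print(b)
```

If any Skolem has no matching file, the product is empty and no instance is produced, which mirrors a request that cannot be satisfied from existing data.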

Page 19

AUTOMATED WORKFLOW INSTANCE GENERATION IN WINGS

Page 20

Workflow Instance Expressions

• Compact expression for efficient search and matching
• Expanded expression when further details are needed

[Diagram labels: Corpus; Kernel_Rules; Split; Filter_Rules; Prune_Rules; Binarize; Generate_Rule_Map; Compile; XRS_Rules; BRF_Rules; Lexicon_Dictionary; 1…n; WSJ-2001; KR-09-05]

Page 21

Expanded Workflow Instance

<rdf:RDF ..(xmlns definitions)....>
  <wflns:WorkflowInstance rdf:ID="WFT0b">
    <wflns:hasDescription rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
      Count the number of unique words in a file
    </wflns:hasDescription>
    <wflns:hasNode rdf:resource="#N1"/>
    <wflns:hasNode rdf:resource="#N2"/>
    <wflns:hasLink rdf:resource="#L12"/>
    <wflns:hasLink rdf:resource="#L01"/>
    <wflns:hasLink rdf:resource="#L2Output"/>
  </wflns:WorkflowInstance>
  <wflns:InOutLink rdf:ID="L12">
    <wflns:hasOriginFileDescription rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/componentLibrary.owl#remDupesOutputFile"/>
    <wflns:hasFile rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/fileLibrary.owl#F12_WFT0b_1117161532484"/>
    <wflns:hasDestinationFileDescription rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/componentLibrary.owl#CountWordsInputFile"/>
    <wflns:hasDestinationNode>
      <wflns:Node rdf:ID="N2">
        <wflns:hasComponent rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/componentLibrary.owl#countWordsV1"/>
      </wflns:Node>
    </wflns:hasDestinationNode>
    <wflns:hasOriginNode>
      <wflns:Node rdf:ID="N1">
        <wflns:hasComponent rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/componentLibrary.owl#removeDuplicatesV1"/>
      </wflns:Node>
    </wflns:hasOriginNode>
  </wflns:InOutLink>
  <wflns:InputLink rdf:ID="L01">
    <wflns:hasDestinationNode rdf:resource="#N1"/>
    <wflns:hasFile rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/fileLibrary.owl#test_txt_WFT0b_1117161532484"/>
    <wflns:hasDestinationFileDescription rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/componentLibrary.owl#remDupesInputFile"/>
  </wflns:InputLink>
  <wflns:OutputLink rdf:ID="L2Output">
    <wflns:hasOriginFileDescription rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/componentLibrary.owl#CountWordsOutputFile"/>
    <wflns:hasOriginNode rdf:resource="#N2"/>
    <wflns:hasFile rdf:resource="http://www.isi.edu/ikcap/wings/domains/linguistics/fileLibrary.owl#F2Output_WFT0b_1117161532484"/>
  </wflns:OutputLink>
</rdf:RDF>

Page 22

W Instance: “dax” for Pegasus

<?xml version="1.0" encoding="UTF-8"?>
<!-- generated: 2004-08-18T10:53:01-05:00 -->
<adag xmlns="http://www.griphyn.org/chimera/DAX"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.griphyn.org/chimera/DAX http://www.griphyn.org/chimera/dax-1.8.xsd"
      version="1.7" count="1" index="0" name="WorkFlow0b">
  <!-- part 1: list of all files used (may be empty) -->
  <filename file="vahi.f.a" link="input"/>
  <filename file="vahi.f.b1" link="inout"/>
  <filename file="vahi.f.b2" link="output"/>
  <!-- part 2: definition of all jobs (at least one) -->
  <job id="ID000001" namespace="vds" name="removeDups" version="1.0" level="3" dv-namespace="vds" dv-name="top" dv-version="1.0">
    <argument>-a top -T60 -i <filename file="vahi.f.a"/> -o <filename file="vahi.f.b1"/> </argument>
    <uses file="vahi.f.a" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="vahi.f.b1" link="output" dontRegister="true" dontTransfer="true" temporaryHint="true"/>
  </job>
  <job id="ID000002" namespace="vds" name="countWords" version="1.0" level="2" dv-namespace="vds" dv-name="left" dv-version="1.0">
    <argument>-a left -T60 -i <filename file="vahi.f.b1"/> -o <filename file="vahi.f.b2"/> -p 0.5</argument>
    <uses file="vahi.f.b1" link="input" dontRegister="false" dontTransfer="false" temporaryHint="true"/>
    <uses file="vahi.f.b2" link="output" dontRegister="true" dontTransfer="true" temporaryHint="true"/>
  </job>
  <!-- part 3: list of control-flow dependencies (empty for single jobs) -->
  <child ref="ID000002">
    <parent ref="ID000001"/>
  </child>
</adag>
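To show how the three parts of a DAX fit together, here is a small sketch that reads a trimmed copy of the document above with Python's standard library and orders the jobs by their parent/child dependencies. It is not how Pegasus itself processes a DAX; the trimmed XML string and the function name `job_order` are assumptions for the example.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # Python 3.9+

# Trimmed copy of the DAX: only jobs, their file uses, and dependencies.
DAX = """
<adag xmlns="http://www.griphyn.org/chimera/DAX" version="1.7" name="WorkFlow0b">
  <job id="ID000001" name="removeDups">
    <uses file="vahi.f.a" link="input"/>
    <uses file="vahi.f.b1" link="output"/>
  </job>
  <job id="ID000002" name="countWords">
    <uses file="vahi.f.b1" link="input"/>
    <uses file="vahi.f.b2" link="output"/>
  </job>
  <child ref="ID000002">
    <parent ref="ID000001"/>
  </child>
</adag>
"""

NS = {"dax": "http://www.griphyn.org/chimera/DAX"}

def job_order(dax_text):
    """Return job names in an order that respects the <child>/<parent> edges."""
    root = ET.fromstring(dax_text.strip())
    names = {job.get("id"): job.get("name") for job in root.findall("dax:job", NS)}
    deps = {job_id: set() for job_id in names}
    for child in root.findall("dax:child", NS):
        for parent in child.findall("dax:parent", NS):
            deps[child.get("ref")].add(parent.get("ref"))
    return [names[job_id] for job_id in TopologicalSorter(deps).static_order()]

print(job_order(DAX))
```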

Page 23

AUTOMATED METADATA GENERATION IN WINGS

Page 24

Metadata Reasoning for file name generation and workflow validation

Filename Generation
• Explicit representation of metadata in ontology (e.g. source id, rupture id)
• Propagate metadata attributes for all data products when creating workflow instance
• Names for intermediate files are created automatically from the metadata
Workflow Validation
• Explicit representation of metadata constraints (examples are shown below)
– Constraints on individual files and collections
– Constraints on component inputs and outputs
– Constraints among components in a workflow
• Check constraints while generating workflow instantiations

Page 25

Propagation of metadata for filename generation: an example

Component: SeismogramGen_Li

RVM
• 127_6.rvm (source_id: 127, rupture_id: 6)
Rupture_variation
• 127_6.txt.variation-s0000-h0000 (source_id: 127, rupture_id: 6, slip_realization_#: 0, hypo_center_#: 1)
• 127_6.txt.variation-s0000-h0001 (source_id: 127, rupture_id: 6, slip_realization_#: 0, hypo_center_#: 1)
SGT
• FD_SGT/PAS_1/A/SGT161 (site_name: PAS, tensor_direction: 1, time_period: A, xyz_volumn_id: 161)
• …
Seismogram
• Seismogram_PAS_127_6.grm (site_name: PAS, source_id: 127, rupture_id: 6)
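The propagation in this example can be sketched as two small functions: one that copies metadata attributes from the inputs to the new data product, and one that builds the file name from that metadata. The function names and the choice of which attributes to copy are assumptions inferred from the example values; WINGS expresses such rules over its ontologies rather than in Python.

```python
def seismogram_metadata(rupture_variation, sgt):
    """Propagate input metadata to the output seismogram (assumed rule)."""
    return {
        "site_name": sgt["site_name"],
        "source_id": rupture_variation["source_id"],
        "rupture_id": rupture_variation["rupture_id"],
    }

def seismogram_filename(metadata):
    """Build the logical file name from the propagated metadata."""
    return "Seismogram_{site_name}_{source_id}_{rupture_id}.grm".format(**metadata)

# Metadata values taken from the example on this page.
rupture_variation = {
    "source_id": 127, "rupture_id": 6,
    "slip_realization_#": 0, "hypo_center_#": 1,
}
sgt = {"site_name": "PAS", "tensor_direction": 1, "time_period": "A", "xyz_volumn_id": 161}

print(seismogram_filename(seismogram_metadata(rupture_variation, sgt)))
```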

Page 26

AUTOMATIC WORKFLOW GENERATION IN WINGS

Page 27

Automatic Template-Based Workflow Generation Algorithm

Workflow request = Workflow Template + Seed Constraints

WR0: Workflow Template

WR0: Seed Constraints
dataVariable5 data:contains data:Muti-party-communication
dataVariable0 data:creator 5048
dataVariable1 data:creator 5048

Algorithm steps and the workflows each step produces:
unified well-formed request
-> Seed workflow from request -> seeded workflows
-> Find input data requirements -> binding-ready workflows
-> Data source selection -> bound workflows
-> Parameter selection -> configured workflows
-> Workflow instantiation -> workflow instances
-> Workflow grounding -> ground workflows
-> Workflow ranking -> top-k workflows
-> Workflow mapping -> executable workflows

Page 28

Step 1: Workflow Template is Seeded

Page 29

Step 2: Backward Sweep

Page 30

Step 3: Select Data Sources

[Figure labels: E-07, S-NY]

Page 31

Step 3: Select Data Sources

[Figure labels: E-07, S-NY]

Page 32

Step 4: Forward Sweep

[Figure labels: E-07, S-NY]

Page 33

Step 5: Workflow Instantiation

[Figure labels: E-07, S-NY, Result-PartA, Result-PartB]

Page 34

Step 5: Workflow Instantiation

[Figure labels: E-07, S-NY, Result-PartA, Result-PartB]

Page 35

Step 6: Workflow Grounding

Ground Workflow

<job id = “j42” name=“Neuman-BC”> <argument> -i E-07 17.5 -o ES-07….

[Figure labels: E-07, S-NY, Result-PartA, Result-PartB; “parent” dependency edges between jobs]

Page 36

Step 7: Workflow Ranking

W1: estimated exec time 3hrs
W2: estimated exec time 20hrs
W3: estimated exec time 3d
W4: estimated exec time 5hrs

Page 37

Step 7: Workflow Ranking

W1: estimated exec time 3hrs
W2: estimated exec time 20hrs
W3: estimated exec time 3d
W4: estimated exec time 5hrs
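A ranking step over these four candidates could be as simple as the sketch below. It assumes that shorter estimated execution time ranks higher, which the slides suggest but do not state, and the parsing of "hrs" and "d" suffixes is invented for the example.

```python
import heapq

# Estimated execution times as shown on the slide.
estimates = {"W1": "3hrs", "W2": "20hrs", "W3": "3d", "W4": "5hrs"}

UNIT_HOURS = {"hrs": 1, "d": 24}  # assumed meaning of the unit suffixes

def to_hours(estimate):
    """Convert strings such as '3hrs' or '3d' to a number of hours."""
    for unit, factor in UNIT_HOURS.items():
        if estimate.endswith(unit):
            return float(estimate[: -len(unit)]) * factor
    raise ValueError(f"unknown time unit in {estimate!r}")

def top_k(estimates, k):
    """Return the k workflows with the shortest estimated execution time."""
    best = heapq.nsmallest(k, estimates.items(), key=lambda kv: to_hours(kv[1]))
    return [name for name, _ in best]

print(top_k(estimates, 2))
```

Other criteria named earlier in the tutorial (cost, reliability) would slot in by changing the key function.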

Page 38

Step 8: Workflow Mapping

Ground workflow: 15 compute nodes devoid of resource assignment

Executable workflow: mapped to 3 sites
• 13 data stage-in nodes
• 11 compute nodes (1-2 & 5-6 reduced based on available intermediate data)
• 8 inter-site data transfers
• 14 data stage-out nodes to long-term storage
• 14 data registration nodes (data cataloging)

[Figure: ground and executable workflow graphs with numbered nodes]
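The reduction of compute nodes "based on available intermediate data" can be illustrated with a simplified sketch: walk backward from the requested outputs and skip any job whose product already exists. This is a hedged illustration of the idea only, not the Pegasus reduction algorithm; the job and file names are hypothetical.

```python
def reduce_workflow(jobs, requested, available):
    """Return the set of jobs that still need to run.

    jobs maps a job name to (input files, output files); requested is the
    set of final outputs wanted; available is the set of files that already
    exist (for example, registered in a replica catalog).
    """
    producer = {f: name for name, (_, outs) in jobs.items() for f in outs}
    needed, stack = set(), list(requested)
    while stack:
        f = stack.pop()
        if f in available or f not in producer:
            continue  # already materialized, or a raw input with no producer
        job = producer[f]
        if job not in needed:
            needed.add(job)
            stack.extend(jobs[job][0])  # the job's inputs must exist too
    return needed

# Hypothetical three-job chain: a -> j1 -> b -> j2 -> c -> j3 -> d
jobs = {
    "j1": (["a"], ["b"]),
    "j2": (["b"], ["c"]),
    "j3": (["c"], ["d"]),
}
print(sorted(reduce_workflow(jobs, {"d"}, available={"a", "c"})))
```

With the intermediate file "c" already available, only the last job remains, which is the same effect as the drop from 15 to 11 compute nodes on this page.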

Page 39: 1 Yolanda Gil (gil@isi.edu) AAAI-08 Tutorial July 13, 2008 USC Information Sciences Institute Part III Computational Workflows in Wings/Pegasus AAAI-08.

39Yolanda Gil ([email protected]) AAAI-08 Tutorial July 13, 2008USC Information Sciences Institute

Why Do We Automate All This? So You Don’t Have To

Columns:
(a) # Binding-Ready Workflow Candidates
(b) # Bound Workflow Candidates
(c) # Configured Workflow Candidates
(d) # Calls to c:find-DODs-given-output-requirements
(e) # Calls to d:find-data-objects
(f) # Calls to c:predict-DODs-given-input-requirements
(g) Workflow Generation Time

Request ID  (a)  (b)  (c)  (d)  (e)  (f)  (g)
R1            6    8    8    1    6    8   5 s
R2            6    8    8    7    6   16   4 s
R3            6   24   24    7    6   48   7 s
R4            6   24   24   13    6   72   8 s
R5           18   64   48    7   18  128  22 s
R6           18  288  216    7   18  576  81 s
R7           18   16   12    7   18   32  10 s
R8            6    0    0    1    6    0   1 s


Workflow candidates generated + considered (many are eliminated)

Queries about data

Queries about tools


WINGS DEMO


Editing a Seed & Template, Generating a DAX

WR0: Workflow Template

WR0: Seed Constraints
dataVariable5 data:contains data:Muti-party-communication
dataVariable0 data:creator 5048
dataVariable1 data:creator 5048

Workflow seed = Workflow Template + Seed Constraints
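The relation "workflow seed = workflow template + seed constraints" can be pictured as a small data structure. This is only a sketch: the class names `Constraint` and `WorkflowSeed` and the triple layout are assumptions made for illustration, while the constraint values are the ones shown for WR0.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Constraint:
    """One seed constraint, written as a (subject, predicate, value) triple."""
    subject: str
    predicate: str
    value: str


@dataclass
class WorkflowSeed:
    """A seed pairs a template identifier with constraints on its data variables."""
    template: str
    constraints: list = field(default_factory=list)

    def constraints_on(self, variable):
        """Return the constraints whose subject is the given data variable."""
        return [c for c in self.constraints if c.subject == variable]


seed = WorkflowSeed(
    template="WR0",
    constraints=[
        Constraint("dataVariable5", "data:contains", "data:Muti-party-communication"),
        Constraint("dataVariable0", "data:creator", "5048"),
        Constraint("dataVariable1", "data:creator", "5048"),
    ],
)
print(seed.constraints_on("dataVariable0"))
```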

[Figure: workflow generation pipeline sidebar. On this slide the label "binding-ready workflows" reads "candidate workflows" and the label "ground workflows" reads "ground workflows (DAXes)".]


SCEC WORKFLOWS IN WINGS


InSAR Image of the Hector Mine Earthquake

• A satellite-generated Interferometric Synthetic Aperture Radar (InSAR) image of the 1999 Hector Mine earthquake.

• Shows the displacement field in the direction of radar imaging.

• Each fringe (e.g., from red to red) corresponds to a few centimeters of displacement.

Seismic Hazard Model
• Seismicity
• Paleoseismology
• Local site effects
• Geologic structure
• Faults
• Stress transfer
• Crustal motion
• Crustal deformation
• Seismic velocity structure
• Rupture dynamics

Seismic Hazard Analysis in Southern California Earthquake Center (SCEC) [Slide from T. Jordan]


Reusable High-Level Workflow Templates

Intensional descriptions of data sets
Intensional descriptions of parallel computations
Querying results of other data creation subworkflows
Rich metadata descriptions for all data products


Workflows for Seismic Hazard Analysis [Gil et al 06; Kim et al 06; Gil et al 07]

Input data: a site and an earthquake forecast model

• thousands of possible fault ruptures and rupture variations, each a file, unevenly distributed

• ~110,000 rupture variations to be simulated for a given site

High-level template combines 11 application codes

8048 application nodes in the workflow instance generated by Wings

24,135 nodes in the executable workflow generated by Pegasus, including:

• data stage-in jobs, data stage-out jobs, data registration jobs

Executed on the USC HPCC cluster (1820 nodes w/ dual processors), but only < 144 available

• Including MPI jobs, each running on hundreds of processors for 25-33 hours

• Runtime was 1.9 CPU years

Provenance records kept throughout the generation and execution process for 100,000 workflow data products


DAX automatically generated from WINGS

14,639 jobs for 4,626 ruptures with 106,124 rupture variations for USC site


Summary: Creating Workflows with WINGS

Separates analysis spec from data
• Workflow template as reusable well-defined acceptable analysis process
• Workflow instance binds template to data for particular analyses
Ensures that the data complies with the component specifications and their constraints within the workflow
Represents data collections (nominal or otherwise) within the workflow specification
Automatically generates descriptions and metadata for new data products to be created by the workflow execution
Compact workflow instance is user-friendly and reusable
• Separates data provenance (workflow instance) and pedigree (workflow template)
Expands workflow instance into DAX for Pegasus, which creates the executable workflow


Key Benefits

Efficient and correct creation of new workflows
• By retrieving a template and filling in the data
Framework ensures adherence to methodology
• Represents as templates widely-accepted analysis methodologies
• Supports repeatability of experiments/analyses
• Enables controlled variations
Ensures better quality of data analysis results
• Attaches provenance and pedigree information


Ongoing and Future Work

Interactive assistance in creating valid workflow templates
• Based on CAT (Composition Analysis Tool) [Kim et al 05]
More sophisticated models of components
Automatic completion of workflow’s data conversion and formatting steps through AI planning techniques
Tracking new versions of components, invalidating data and workflows from old versions
Workflow template libraries
• Indexing, retrieval
Managing collections of workflows as part of an overall analysis activity
• E.g.: parameter sweeping, variants of analysis


BACKUP SLIDES


Extension 1: Handle Collections of Collections

A set of ruptures, each with a set of variations
Each variation in a separate file
For rupture 127_6 (source ID 127, rupture ID 6), there are 8 variations
For rupture 20_0 (source ID 20, rupture ID 0), there are 1352 variations

Rupture variation files:
127_6.txt.variation-s0000-h0000
127_6.txt.variation-s0000-h0001
127_6.txt.variation-s0001-h0000
127_6.txt.variation-s0001-h0001
…
20_0.txt.variation-s0000-h0000
…
150_11.txt.variation-s0000-h0000
…

[Figure: SGT file icons shown next to the rupture variation files.]
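A collection of collections of this kind can be pictured by grouping variation file names by rupture. The sketch below assumes the file name pattern shown in the examples (source ID, rupture ID, four-digit slip realization, four-digit hypocenter); the regular expression and the function `group_by_rupture` are illustrative and are not part of Wings.

```python
import re
from collections import defaultdict

# Pattern inferred from the example names, e.g. 127_6.txt.variation-s0000-h0001
VARIATION = re.compile(r"^(\d+)_(\d+)\.txt\.variation-s(\d{4})-h(\d{4})$")


def group_by_rupture(filenames):
    """Group variation files into one inner collection per (source_id, rupture_id)."""
    nested = defaultdict(list)
    for name in filenames:
        match = VARIATION.match(name)
        if match is None:
            raise ValueError(f"not a rupture variation file name: {name}")
        key = (int(match.group(1)), int(match.group(2)))
        nested[key].append(name)
    return dict(nested)


files = [
    "127_6.txt.variation-s0000-h0000",
    "127_6.txt.variation-s0000-h0001",
    "127_6.txt.variation-s0001-h0000",
    "127_6.txt.variation-s0001-h0001",
    "20_0.txt.variation-s0000-h0000",
    "150_11.txt.variation-s0000-h0000",
]
ruptures = group_by_rupture(files)
print(sorted(ruptures))
```

The outer dictionary plays the role of the collection of ruptures, and each list plays the role of one rupture's variation collection.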


Extending Wings to Handle Collections of Collections

[Figure: file types and example instances. Type labels: File, File Collection, Collection of Collections, Variation File, Variation File Collection, connected by has-type links. Instance labels: Ruptures-PAS; Vars_127_6 (127_6.txt.variation-s000-h000, 127_6.txt.variation-s000-h001, …); Vars_127_7 (127_7.txt.variation-s000-h000, 127_7.txt.variation-s000-h001, …). The flat list of rupture variation files (127_6.txt.variation-s0000-h0000, 127_6.txt.variation-s0000-h0001, 127_6.txt.variation-s0001-h0000, 127_6.txt.variation-s0001-h0001, …, 20_0.txt.variation-s0000-h0000, …, 150_11.txt.variation-s0000-h0000, …) is shown alongside SGT file icons.]


Wings Coll/Coll

[Figure: workflow template over nested collections and its expansion. Template labels: RupVar, SGT, SeismogramGen_Li (NC1), seism, PeakValCalc_Okaya (NC2), SA; links L1 to L4; collection markers FCS-S, FCS-SA, FCS-Var, CCS-Rup, FCS-SGTCol, CCS-SGT. Expansion labels, one chain per rupture:
127_6: 127_6.txt.variation-s000-h000, SeisGen_Li, Seismograms_PAS_127_6.grm, PeakValCalc, PeakVals_allPAS_127_6.bsa
127_7: 127_7.txt.variation-s000-h000, SeisGen_Li, Seismograms_PAS_127_7.grm, PeakValCalc, PeakVals_allPAS_127_7.bsa
150_11: 150_11.txt.variation-s000-h000, SeisGen_Li, Seismograms_PAS_151_11.grm, PeakValCalc, PeakVals_allPAS_151_11.bsa
SGT inputs shown: SGT161, SGT282. Other labels: RV_127_6, S_127_6.]


Constraints (in OWL ontology)

Constraints on files
• Metadata attributes: data types and default values
  e.g. simulation_out_timesamples of SeisParamValsFile should be an integer and the default value is 1801
• File name format with respect to metadata attributes
  e.g. rupture variation file: 127_6.txt.variation-s0002-h0000
  Format: <source_id>_<rupture_id>.txt.variation-s[4 digit slip_realization#]-h[4 digit hypo center #]
Constraints on collections and collections of collections
• Type of each element
• Relations between metadata of a collection and metadata of individual items
  e.g. Each rupture variation has the same source/rupture ids as the rupture variation collection
Component-level constraints on metadata attributes of input/output files or collections
• Deriving metadata of output files from metadata of input files
  e.g. The output of PeakValCalc_Okaya (SA output file) should have the same site name as the seismogram file
Template-level constraints on metadata attributes of files or collections
• Input/output files of different components can have the same metadata
  e.g. The RVM collection input for SeismogramGen_Li should have the same site name as the CollOfCollection rupture variations input
• Checking number of items in collections
  e.g. number of RVM files and the number of rupture var collections should be equal
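Two of the constraint kinds listed above can be written down as plain checks. This is a minimal sketch that uses Python dictionaries in place of the OWL representation; the function names and the metadata keys `source_id` and `rupture_id` are assumptions chosen for illustration.

```python
def collection_is_consistent(collection_meta, item_metas):
    """Collection constraint: every item shares the collection's source/rupture IDs."""
    keys = ("source_id", "rupture_id")
    return all(item[k] == collection_meta[k] for item in item_metas for k in keys)


def counts_match(rvm_files, rupture_var_collections):
    """Template constraint: as many RVM files as rupture variation collections."""
    return len(rvm_files) == len(rupture_var_collections)


collection = {"source_id": 127, "rupture_id": 6}
items = [
    {"source_id": 127, "rupture_id": 6, "slip_realization": 0, "hypo_center": 0},
    {"source_id": 127, "rupture_id": 6, "slip_realization": 0, "hypo_center": 1},
]
print(collection_is_consistent(collection, items))
print(counts_match(["rvm_127_6"], [items]))
```

In Wings these constraints are stated declaratively in the ontology; the procedural form here only shows what each constraint requires.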


Constraints on Files

[Figure: ontology graph split into domain independent definitions and SCEC dependent definitions, with a legend for classes, instances and roles. Class labels: File, Metadata, FileNameFormat (List of Metadata or StringConstant), RuptureVarFile, Int, Metadata:4DigitInt, xsd:int. Role labels: hasMetadata, hasNameFormat, hasDefaultVal, hasDefaultValue, hasSourceID, hasRuptureID, hasSlipRealization, hasHypoCenter, usedAs. Skolem instances: RupVar-SK, SourceID1, RuptureID1, SlipRealz1, HypoCent1, RupVar_FileNameFormat1, with string constants "_" and ".txt.variation" and default value 0. Annotations: constraints on default values; constraints on file names.]


Constraints on Collections

[Figure: ontology graph for collection constraints. Class labels: CollOfCollection, Collection, File, RuptureVariations, RuptureVarsForForRupture, RuptureVarFile, Metadata:String, Metadata:Int. Role labels: hasType:hasCollectionType, hasType:hasFileType, hasCollectionType, hasFileType, hasSiteName, hasSourceID, hasRuptureID. Skolem instances: CC-RuptureVariations-SK, C-RuptVars-SK, RupVar-SK, SiteName1, SourceID1, RuptureID1. Annotations: constraints on collection element types; metadata constraints on collections & their elements.]


Constraints on Components

[Figure: ontology graph for component constraints. Class labels: ComponentType, SeismogramGen, FileOrCollection. Role labels: hasInputs, hasOutputs, hasSourceID, hasRuptureID, hasSiteName. Skolem instances: SeismogramGen_Li, SeismogramGenLi_Inputs, SeismogramGenLi_Outputs, RVM1, Seismogram1, S-RV1, S-RuptVarsForRup1, SGT1, C-SGT1, RVM_SourceID1, RVM_RuptureID1, SGTsSiteName1. Annotations: metadata constraints on input and output files; constraints on the types of input and output files and collections.]


Workflow Templates: a set of nodes and links

Template
• hasNode Node
• hasLink Link (Input, Output, InOut, LinkMaping)
hasComponent: ComponentType or ComponentCollection
hasFile: File or Collection
hasDestinationNode, hasOriginNode, hasDestinationFileDesc, hasOriginFileDesc, …

[Figure: Skolem instances for CybershakeTemplate1. Labels: Node_SeismogramGen_Collection; ComponentCollection_SeismogramGen; SeismogramGen_Li; InputLink_RuptureVars_to_SeisgmogramGen; InputOutLink_Seismogram_from_SeismGen_to_PeakValCalc; F-RV1; C-RuptVars1; CC-RuptureVariations1; S-RV1; S-RuptVarsForRup1. Role labels: hasNode, hasLink, hasComponent, hasComponentType, hasFile, hasDestinationNode, hasDestinationFileDesc.]


Constraints on Templates

[Figure: Skolem instances for CybershakeTemplate1. Link labels: InputLink_SiteNameFile_to_BoxNameCheck, InputLink_RuptureVars_to_SeisgmogramGen, InputLink_SGTCollforRup_to_SeismogramGen. File and collection labels: SiteNameFile1, F-RV1, C-RuptVars1, CC-RuptureVariations1, F-SGT1, C-SGT-forRups1, CC-SGTs1. Metadata labels: SiteName1, SGTsSiteName1, N_Rups. Role labels: hasLink, hasFile, hasSiteName, hasN_Items, isSameAs. Annotations: constraints on number of elements in different collections; metadata constraints on files/collections of different components.]


Example OWL definitions

Filename format for rupture variation files

Definitions for metadata propagation (SynthSGT)

Constraints on files/collections of different components


Extension 3: Creating many workflow instantiations

4262 independent instances for each rupture, >100,000 variations for a site
Memory Bottleneck: handling many files in the file library, e.g. rupture variations

[Figure: independent workflow instances, one chain per rupture, each with SeisGen_Li and PeakValCalc:
127_6: 127_6.txt.variation-s000-h000, Seismograms_PAS_127_6.grm, PeakVals_allPAS_127_6.bsa
127_7: 127_7.txt.variation-s000-h000, Seismograms_PAS_127_7.grm, PeakVals_allPAS_127_7.bsa
150_11: 150_11.txt.variation-s000-h000, Seismograms_PAS_151_11.grm, PeakVals_allPAS_151_11.bsa
SGT inputs shown: SGT161, SGT282. Additional node labels: BNC, GenMD.]


Creating many workflow instantiations (on-going work)

Independent instances are generated separately
• Instantiations for different ruptures are generated separately
On-demand creation of files and collections in the file library
• If files or collections are not used in metadata reasoning, we don’t need to create file library objects for them (e.g. rupture variations) and only an ID is generated for them
Currently Wings needs 5-6 hrs to generate DAXes for 4626 ruptures with 106,124 variations
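The on-demand idea can be sketched as a file library that records only an ID for each file and builds a full object the first time metadata is requested. The class `FileLibrary`, its methods, and the `loader` callback are hypothetical names for illustration and do not describe the Wings implementation.

```python
class FileLibrary:
    """Keeps IDs for all files; creates metadata objects only when asked."""

    def __init__(self, loader):
        self._loader = loader      # callback that builds the metadata for an ID
        self._ids = []             # cheap: one string per registered file
        self._objects = {}         # expensive: created lazily

    def register(self, file_id):
        """Record that a file exists without creating a library object."""
        self._ids.append(file_id)

    def metadata(self, file_id):
        """Create the library object on first use, then reuse it."""
        if file_id not in self._objects:
            self._objects[file_id] = self._loader(file_id)
        return self._objects[file_id]

    @property
    def registered(self):
        return len(self._ids)

    @property
    def materialized(self):
        return len(self._objects)


library = FileLibrary(loader=lambda file_id: {"id": file_id})
for i in range(3):
    library.register(f"127_6.txt.variation-s0000-h000{i}")
library.metadata("127_6.txt.variation-s0000-h0001")
print(library.registered, library.materialized)
```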


Extension 4: Interleaving execution with workflow generation

Extensions in the WF template representations
• System links: a link from a component that generates results needed in template instantiation
  E.g. BoxNameCheck generates a file that contains SGT file names
Template navigation algorithm: while navigating links, identify partial workflows that can be executed based on system links & steps that are already executed
Wings and Pegasus interaction
• On-going work: Client/server style interaction
  e.g. use secure shell
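The navigation idea can be sketched as follows: a step may enter the next partial workflow only if every step it depends on through a system link has already been executed, and every step it depends on through an ordinary data link is either executed or in the same partial workflow. This is an illustrative reading of the description above, not the Wings algorithm; the function name and the example link from BoxNameCheck to SeismogramGen are assumptions.

```python
def next_partial_workflow(nodes, data_links, system_links, executed):
    """Return the set of nodes that can go into the next partial workflow.

    data_links, system_links: sets of (origin, destination) pairs.
    executed: set of nodes whose results already exist.
    """
    ready = set()
    changed = True
    while changed:
        changed = False
        for node in nodes:
            if node in executed or node in ready:
                continue
            # Results arriving over system links must exist before instantiation.
            system_ok = all(o in executed for (o, d) in system_links if d == node)
            # Ordinary inputs may come from steps in the same partial workflow.
            data_ok = all(
                o in executed or o in ready for (o, d) in data_links if d == node
            )
            if system_ok and data_ok:
                ready.add(node)
                changed = True
    return ready


nodes = ["BoxNameCheck", "SeismogramGen", "PeakValCalc"]
system_links = {("BoxNameCheck", "SeismogramGen")}   # hypothetical system link
data_links = {("SeismogramGen", "PeakValCalc")}

print(next_partial_workflow(nodes, data_links, system_links, executed=set()))
print(next_partial_workflow(nodes, data_links, system_links, executed={"BoxNameCheck"}))
```

In the first call only BoxNameCheck can be scheduled; once it has run, the remaining two steps form the second partial workflow.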


Partial DAX generation: Workflow Navigation Algorithm

System link

Template navigation

Used for Partial DAX generation


Summary: Current System

[Figure: system architecture. Labels: User; WINGS; TemplateSelection; CAT TemplateValidator; TemplateInstantiator; DAXgenerator; Metadata reasoner; Jena; Ontology API; file & metadata API; MCS; Pegasus; OWL ontologies (Wings File Ont, Wings Component Ont, Domain File Ont, Domain component Ont, Metadata constraints); Template Library; File Library (CC-Rup-Vars, C-Rup-Vars-for-Rup, F-RV1). State shown: current wf instance, logical files used, bindings, new file objects and metadata created.]


Ongoing Work

Approaches for handling many thousands of files
• Use of MCS for storing logical file names and metadata
• Use of more efficient OWL reasoners (e.g. Sesame can handle 100 million triples)
Client/server style interactions with Pegasus


Mappings in a Workflow Template

Link mappings specify the order of inputs to a node that accepts a collection

[Figure: example template with link mappings #1 and #2. Component labels: C-one, C-plenty, C-spl, NC1. Collection labels: FCS-G, FCS-K, FCS-Z, FCS-T, FS-Y, DC9, DC11. Item labels include G1, G2, G88; K1, K2, K88; Z1, Z2, Z88; D1, D2, D3, D8, D17, D18; H1; Y1; links L1 to L4.]




Component Ontology in OWL

<owl:Class rdf:ID="ComponentType"/>
<owl:FunctionalProperty rdf:ID="hasInputs">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
  <rdfs:domain rdf:resource="#ComponentType"/>
  <rdfs:range rdf:resource="#FileAndPrefixList"/>
</owl:FunctionalProperty>
<owl:ObjectProperty rdf:ID="hasOutputs">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
  <rdfs:domain rdf:resource="#ComponentType"/>
  <rdfs:range rdf:resource="#FileAndPrefixList"/>
</owl:ObjectProperty>
<owl:FunctionalProperty rdf:ID="hasFile">
  <rdfs:range>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <rdf:Description rdf:about="http://www.isi.edu/ikcap/wings/fileOntology.owl#File"/>
        <rdf:Description rdf:about="http://www.isi.edu/ikcap/wings/fileOntology.owl#FileCollection"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:range>
  <rdfs:domain rdf:resource="#FileAndPrefix"/>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
</owl:FunctionalProperty>
<owl:DatatypeProperty rdf:ID="hasPrefix">
  <rdfs:domain rdf:resource="#FileAndPrefix"/>
  <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
</owl:DatatypeProperty>
<owl:FunctionalProperty rdf:ID="hasVersion">
  <rdfs:domain rdf:resource="#ComponentType"/>
  <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#DatatypeProperty"/>
</owl:FunctionalProperty>
<owl:FunctionalProperty rdf:ID="hasExecutionRequirements">
  <rdfs:domain rdf:resource="#ComponentType"/>
  <rdfs:range rdf:resource="http://www.isi.edu/ikcap/wings/executionRequirements.owl#ExecutionRequirements"/>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
</owl:FunctionalProperty>
<owl:DatatypeProperty rdf:ID="hasExecutablePath">
  <rdfs:domain rdf:resource="#ComponentType"/>
  <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
</owl:DatatypeProperty>
<owl:FunctionalProperty rdf:ID="hasNamespace">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#DatatypeProperty"/>
  <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  <rdfs:domain rdf:resource="#ComponentType"/>
</owl:FunctionalProperty>
<owl:FunctionalProperty rdf:ID="hasTranslationArgument">
  <rdfs:domain rdf:resource="#ComponentType"/>
  <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#DatatypeProperty"/>
</owl:FunctionalProperty>
<owl:Class rdf:ID="ComponentCollection"/>
<owl:ObjectProperty rdf:ID="hasComponentType">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
  <rdfs:range rdf:resource="#ComponentType"/>
  <rdfs:domain rdf:resource="#ComponentCollection"/>
</owl:ObjectProperty>


A Component Description from the Library

<owl:Class rdf:ID="RemoveCommonWords">
  <rdfs:subClassOf rdf:resource="http://www.isi.edu/ikcap/wings/componentOntology.owl#ComponentType"/>
</owl:Class>
<RemoveCommonWords rdf:ID="removeCommonWordsV1">
  <clns:hasInputs>
    <clns:FileAndPrefixList rdf:ID="componentLibrary_RDFResource_5">
      <rdf:rest rdf:resource="#componentLibrary_RDFResource_6"/>
      <rdf:first rdf:resource="#removeCommonWordsInput1"/>
    </clns:FileAndPrefixList>
  </clns:hasInputs>
  <clns:hasExecutionRequirements rdf:resource="#countWordsExecutionReq"/>
  <clns:hasNamespace rdf:datatype="http://www.w3.org/2001/XMLSchema#string">vds</clns:hasNamespace>
  <clns:hasOutputs>
    <clns:FileAndPrefixList rdf:ID="componentLibrary_RDFResource_7">
      <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
      <rdf:first rdf:resource="#removeCommonWordsOutput"/>
    </clns:FileAndPrefixList>
  </clns:hasOutputs>
  <clns:hasVersion rdf:datatype="http://www.w3.org/2001/XMLSchema#string">1</clns:hasVersion>
  <clns:hasExecutablePath rdf:datatype="http://www.w3.org/2001/XMLSchema#string">/nfs/isd/varunr/wings/removeCommonWords</clns:hasExecutablePath>
</RemoveCommonWords>
<clns:FileAndPrefix rdf:ID="removeCommonWordsOutput">
  <clns:hasFile rdf:resource="#removeCommonWordsOutputFile"/>
  <clns:hasPrefix rdf:datatype="http://www.w3.org/2001/XMLSchema#string">-o</clns:hasPrefix>
</clns:FileAndPrefix>
<lingflns:EnglishFile rdf:ID="removeCommonWordsOutputFile"/>
<clns:FileAndPrefixList rdf:ID="componentLibrary_Individual_34">
  <rdf:rest>
    <clns:FileAndPrefixList rdf:ID="componentLibrary_Individual_37">
      <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
      <rdf:first>
        <clns:FileAndPrefix rdf:ID="removeCommonWordsInput2">
          <clns:hasFile>
            <lingflns:EnglishFile rdf:ID="removeCommonWordsInputFile"/>
          </clns:hasFile>
        </clns:FileAndPrefix>
      </rdf:first>
    </clns:FileAndPrefixList>
  </rdf:rest>
  <rdf:first>
    <clns:FileAndPrefix rdf:ID="removeCommonWordsInput1">
      <clns:hasPrefix rdf:datatype="http://www.w3.org/2001/XMLSchema#string">-i</clns:hasPrefix>
      <clns:hasFile>
        <lingflns:EnglishFile rdf:ID="CommonWordsFile"/>
      </clns:hasFile>
    </clns:FileAndPrefix>
  </rdf:first>
</clns:FileAndPrefixList>
<clns:FileAndPrefixList rdf:ID="componentLibrary_Individual_40">
  <rdf:first rdf:resource="#removeCommonWordsOutput"/>
  <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
</clns:FileAndPrefixList>
<clns:FileAndPrefixList rdf:ID="componentLibrary_RDFResource_6">
  <rdf:first rdf:resource="#removeCommonWordsInput2"/>
  <rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
</clns:FileAndPrefixList>
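One plausible use of the `hasPrefix` and `hasExecutablePath` properties is to assemble a command line for the component. The sketch below assumes that inputs are emitted before outputs and that a file without a prefix is passed as a bare argument; both are assumptions, as are the function `build_argv` and the concrete file names in `bindings`.

```python
def build_argv(executable, inputs, outputs, bindings):
    """Assemble an argument list from (prefix, file_id) pairs.

    inputs, outputs: lists of (prefix or None, file_id), in list order.
    bindings: dict mapping file_id -> concrete file name (hypothetical here).
    """
    argv = [executable]
    for prefix, file_id in inputs + outputs:
        if prefix:
            argv.append(prefix)
        argv.append(bindings[file_id])
    return argv


# Values read from the component description; the bound file names are made up.
inputs = [("-i", "CommonWordsFile"), (None, "removeCommonWordsInputFile")]
outputs = [("-o", "removeCommonWordsOutputFile")]
bindings = {
    "CommonWordsFile": "common.txt",
    "removeCommonWordsInputFile": "document.txt",
    "removeCommonWordsOutputFile": "filtered.txt",
}
print(build_argv("/nfs/isd/varunr/wings/removeCommonWords", inputs, outputs, bindings))
```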


Formats for Filenames (examples)

SGT file: e.g. FD_SGT/USC_1/A/SGT161
Format: FD_SGT/<site_id>_[1-2]/[A-L]/SGT[3-digit-alphanumeric]
 - site_name: e.g. USC
 - tensor direction [1-2]: 1 (EW), 2 (NS)
 - time_period [A-L]: A (0-15 seconds), B (15-30 seconds), etc.
 - 3-digit-alphanumeric: xyz volume id

Rupture variation file: e.g. 127_6.txt.variation-s0002-h0000
Format: <source_id>_<rupture_id>.txt.variation-s[4 digit slip_realization#]-h[4 digit hypo center #]
 - source_id: e.g. 127
 - rupture_id: e.g. 6
 - 4 digit slip_realization#: 2
 - 4 digit hypo center #: 0

SA output file: e.g. PeakVals_allLADT_127_6.bsa
Format: PeakVals_all<site_id>_<source_id>_<rupture_id>.bsa

Seismogram file: e.g. Seismogram_LADT_127_6.grm
Format: Seismogram_<Site>_<source_id>_<rupture_id>.grm

SRL file: e.g. USC-sorted_by_rupture_variations.srl
Format: <site_id>-sorted_by_rupture_variations.srl

additional metadata:
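The file name formats above lend themselves to regular expressions that recover metadata from a name. This is a sketch only: the patterns assume that site IDs are alphanumeric without underscores, and the dictionary `FORMATS` and the function `parse` are hypothetical helpers, not part of Wings.

```python
import re

# One pattern per file kind; group names follow the metadata names in the formats.
FORMATS = {
    "sgt": re.compile(
        r"^FD_SGT/(?P<site_id>[A-Za-z0-9]+)_(?P<tensor>[12])/"
        r"(?P<time_period>[A-L])/SGT(?P<volume>[A-Za-z0-9]{3})$"
    ),
    "sa_output": re.compile(
        r"^PeakVals_all(?P<site_id>[A-Za-z0-9]+)_(?P<source_id>\d+)_(?P<rupture_id>\d+)\.bsa$"
    ),
    "seismogram": re.compile(
        r"^Seismogram_(?P<site_id>[A-Za-z0-9]+)_(?P<source_id>\d+)_(?P<rupture_id>\d+)\.grm$"
    ),
    "srl": re.compile(r"^(?P<site_id>[A-Za-z0-9]+)-sorted_by_rupture_variations\.srl$"),
}


def parse(kind, name):
    """Return the metadata encoded in a file name, or None if it does not match."""
    match = FORMATS[kind].match(name)
    return match.groupdict() if match else None


print(parse("sgt", "FD_SGT/USC_1/A/SGT161"))
print(parse("sa_output", "PeakVals_allLADT_127_6.bsa"))
```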


All Data Products Have Rich Metadata

<flns:File rdf:about="http://www.isi.edu/ikcap/wings/domains/NLP/fileLibrary.owl#kernelRules_RulePruningWorkflow1_1118895460046">

<flns:usedAs rdf:resource="http://www.isi.edu/ikcap/wings/domains/NLP/componentLibrary.owl#KernelRulesFile"/>

<wflns:createdBy rdf:resource="http://www.isi.edu/ikcap/wings/domains/NLP/workflows/RulePruningInstance1.owl#"/>

<wflns:usedBy rdf:resource="http://www.isi.edu/ikcap/wings/domains/NLP/workflows/RulePruningInstance2.owl#"/>

</flns:File>

<nlpflns:TextFile rdf:about="http://www.isi.edu/ikcap/wings/domains/NLP/fileLibrary.owl#TextFileCollection_RulePruningWorkflow1_1118895460046_item_1"/>

<flns:FileCollection rdf:about="http://www.isi.edu/ikcap/wings/domains/NLP/fileLibrary.owl#RulePruningWorkflow1_1119042891296_FilteredRulesCollection">

<flns:usedAs rdf:resource="http://www.isi.edu/ikcap/wings/domains/NLP/componentLibrary.owl#FilterRulesOutputFile"/>

<flns:usedAs rdf:resource="http://www.isi.edu/ikcap/wings/domains/NLP/componentLibrary.owl#PruneRulesInputFile"/>

<wflns:createdBy rdf:resource="http://www.isi.edu/ikcap/wings/domains/NLP/workflows/RulePruningInstance2.owl#"/>

<flns:hasFiles rdf:parseType="Collection">

<flns:File rdf:about="http://www.isi.edu/ikcap/wings/domains/NLP/fileLibrary.owl#RulePruningWorkflow1_1119042891296_FilteredRulesCollection_item_0"/>

<flns:File rdf:about="http://www.isi.edu/ikcap/wings/domains/NLP/fileLibrary.owl#RulePruningWorkflow1_1119042891296_FilteredRulesCollection_item_1"/>

</flns:hasFiles>

</flns:FileCollection>