Outline
- Introduction
- The Model and The Problem
- Exchanging Intensional Data
- Safe Rewriting
- Possible Rewriting
- Implementation
- Conclusion and Related Work
Introduction
What are intensional documents? XML documents in which:
- some parts of the document are defined explicitly;
- some parts are defined by programs (i.e. Web services) that generate data.
Materialisation: the process of evaluating some of the programs included in an XML document and replacing them by their results.
Introduction (cont’d)
The goals of the paper:
- Study the new issues raised by the exchange of intensional XML documents between applications.
- Decide which data should be materialised before it is sent and which should not.
Introduction (cont’d)
[Figure: Data exchange scenario for intensional documents. Labels: Sender (capabilities, ACL, cost, ...), Receiver (capabilities, ACL, cost, ...), Data Exchange Schema.]
The Model and The Problem
- Simple intensional XML model: intensional document, simple schema, instance of a schema, rewritings
- A richer data model: function patterns, restricted service invocations
The Model and The Problem: Simple intensional XML
Assume the existence of some disjoint domains:
- N: domain of nodes
- L: domain of labels
- F: domain of function names
- D: domain of data values
Model intensional XML documents as labelled trees consisting of two types of nodes:
- Data nodes: nodes with a label in L ∪ D.
- Function nodes: nodes with a label in F; they correspond to "service calls". The children subtrees of a function node are the function parameters. When the function is called, these subtrees are passed to it, and the return value replaces the function node in the document.
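The tree model and the effect of calling a function node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the `services` registry and the stub `get_temp` are hypothetical names, and a call result is assumed to be a list of trees that is spliced in where the function node was.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    label: str                                  # a label in L, a data value in D, or a function name in F
    children: List["Node"] = field(default_factory=list)
    is_function: bool = False                   # True when the label is a function name (in F)


# Hypothetical service registry: function name -> callable taking the parameter subtrees.
Service = Callable[[List[Node]], List[Node]]


def materialise(parent: Node, index: int, services: Dict[str, Service]) -> None:
    """Invoke the function node parent.children[index] and replace it by its result."""
    call = parent.children[index]
    if not call.is_function:
        raise ValueError(f"{call.label!r} is not a function node")
    result = services[call.label](call.children)    # the children subtrees are the parameters
    parent.children[index:index + 1] = result       # the return value replaces the function node


def get_temp(params: List[Node]) -> List[Node]:
    # Stub standing in for a real Web service call (assumption).
    return [Node("temp", [Node("16 ºC")])]


doc = Node("newspaper", [
    Node("title", [Node("The Sun")]),
    Node("date", [Node("04/10/2002")]),
    Node("Get_Temp", [Node("city", [Node("Paris")])], is_function=True),
    Node("TimeOut", [Node("Exhibits")], is_function=True),
])

materialise(doc, 2, {"Get_Temp": get_temp})
print([child.label for child in doc.children])
```

After the call, the `Get_Temp` node is gone and a `temp` subtree sits in its place, while `TimeOut` remains intensional.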
The Model and The Problem: Simple intensional XML (cont’d)
[Figure: An example of an intensional XML document. Node labels: newspaper, title ("The Sun"), date ("04/10/2002"), Get_Temp with city ("Paris"), TimeOut with "Exhibits", and temp ("16 ºC").]
Simple schema
A document schema s is an expression (L, F, τ) where
- L is a finite set of labels;
- F is a finite set of function names;
- τ is a function that maps
  - each label name l ∈ L to a regular expression over L ∪ F or to the keyword "data";
  - each function name f ∈ F to a pair of expressions: τin(f), the input type of f, and τout(f), the output type of f.
The Model and The Problem: Simple intensional XML (cont’d)
Data labels: τ(title) = data, τ(date) = data, τ(temp) = data, τ(city) = data, τ(exhibit) = data
Functions:
- τin(Get_Temp) = city, τout(Get_Temp) = temp
- τin(TimeOut) = data, τout(TimeOut) = (exhibit|performance)
- τin(Get_Date) = title, τout(Get_Date) = date
Instances of a schema
An intensional document t is an instance of a schema s = (L, F, τ) if, for each data node n ∈ t with label l ∈ L, the labels of n's children form a word in lang(τ(l)), where lang(τ(l)) denotes the regular language defined by τ(l).
The same holds for function nodes, with respect to their input type τin(f).
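The instance check can be sketched with Python's `re` module, by encoding the children labels of a node as a word and matching it against τ. The encoding is an assumption made for illustration: a tree is a `(label, children)` pair, a data value is a plain string, each label in a word is followed by one space, and the type given for `newspaper` is an assumed one.

```python
import re


def tok(name):
    """Regex fragment matching one child label followed by a space."""
    return f"(?:{re.escape(name)} )"


# tau(newspaper) is an assumed type for illustration; the other entries follow the example schema.
TAU = {
    "newspaper": tok("title") + tok("date")
                 + f"(?:{tok('Get_Temp')}|{tok('temp')})"
                 + f"(?:{tok('TimeOut')}|{tok('exhibit')}*)",
    "title": "data", "date": "data", "temp": "data", "city": "data", "exhibit": "data",
}
TAU_IN = {"Get_Temp": tok("city"), "TimeOut": "data"}


def is_instance(tree, tau=TAU, tau_in=TAU_IN):
    """True if every node's children labels form a word in lang(tau(l)) (or lang(tau_in(f)))."""
    if isinstance(tree, str):                       # a data value has nothing to check
        return True
    label, children = tree
    expected = tau_in.get(label, tau.get(label))
    if expected is None:                            # unknown label or function name
        return False
    if expected == "data":
        return len(children) == 1 and isinstance(children[0], str)
    if any(isinstance(child, str) for child in children):
        return False
    word = "".join(child[0] + " " for child in children)
    return re.fullmatch(expected, word) is not None and all(
        is_instance(child, tau, tau_in) for child in children
    )


doc = ("newspaper", [
    ("title", ["The Sun"]),
    ("date", ["04/10/2002"]),
    ("Get_Temp", [("city", ["Paris"])]),
    ("TimeOut", ["Exhibits"]),
])
no_date = ("newspaper", [("title", ["The Sun"]), ("Get_Temp", [("city", ["Paris"])])])

print(is_instance(doc), is_instance(no_date))
```

Labels containing spaces would break this word encoding; a real implementation would more likely work on automata directly.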
About rewritings
Let t and t' be trees. If t' is obtained from t by
- selecting a function node v in t with some label f, and
- replacing it by an arbitrary output instance of f,
then we write t →v t' (the arrow is labelled by the invoked node v).
About rewritings (cont’d)
- If t →v1 t1 →v2 t2 ... →vn tn, then we write t →* tn and say that t rewrites into tn. The nodes v1, ..., vn are called a rewriting sequence.
- The set of all trees t' such that t →* t' is denoted ext(t).
About rewritings (cont’d)
Let t be a tree and s be a schema.
1. If ext(t) contains some instance of s, then we say that t possibly rewrites into s.
2. If either t is already an instance of s, or there exists some node v in t such that all trees t' where t →v t' safely rewrite into s, then we say that t safely rewrites into s.
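The difference between possible and safe rewriting can be illustrated on words of labels with a brute-force check. This is a sketch under simplifying assumptions, not the algorithm of the paper: only functions in the original word are invoked (1-depth), the word is processed left to right, each function has a small finite set of possible outputs, and the target language is a Python regex over labels that are each followed by a space. All names are hypothetical.

```python
import re


def tok(name):
    """Regex fragment matching one label followed by a space."""
    return f"(?:{re.escape(name)} )"


def in_lang(regex, word):
    return re.fullmatch(regex, "".join(label + " " for label in word)) is not None


def rewrites(word, outputs, target, safe, prefix=()):
    """Check whether `word` safely (safe=True) or possibly (safe=False) rewrites into `target`.

    outputs maps a function name to the finite list of words it may return (assumption).
    """
    if not word:
        return in_lang(target, prefix)
    head, rest = word[0], word[1:]
    # Option 1: keep the label as it is (do not invoke it).
    keep = rewrites(rest, outputs, target, safe, prefix + (head,))
    if head not in outputs:
        return keep
    # Option 2: invoke the function. Safe: every output must work. Possible: one is enough.
    results = (rewrites(rest, outputs, target, safe, prefix + out) for out in outputs[head])
    invoke = all(results) if safe else any(results)
    return keep or invoke


OUT = {"Get_Temp": [("temp",)], "TimeOut": [("exhibit",), ("performance",)]}
w = ("title", "date", "Get_Temp", "TimeOut")
R1 = tok("title") + tok("date") + tok("temp") + f"(?:{tok('TimeOut')}|{tok('exhibit')}*)"
R2 = tok("title") + tok("date") + tok("temp") + f"{tok('exhibit')}*"

print(rewrites(w, OUT, R1, safe=True))    # invoke Get_Temp, keep TimeOut
print(rewrites(w, OUT, R2, safe=True))    # TimeOut must be invoked and may return performance
print(rewrites(w, OUT, R2, safe=False))   # but it may also return exhibit
```

With R1 the rewriting is safe because `TimeOut` can stay in the document; with R2 it is only possible, since success depends on what `TimeOut` happens to return.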
The Model and The Problem: Simple intensional XML (cont’d)
Safe rewriting of a schema
Let s be a schema and r a distinguished label called the root label.
If all the instances t of s with root label r rewrite safely into instances of s', then we say that s safely rewrites into s'.
Problems:
The Model and The Problem: Simple intensional XML (cont’d)
[Figure: the data exchange scenario. Labels: Sender (capabilities, ACL, cost, ...), Receiver (capabilities, ACL, cost, ...), Data Exchange Schema.]
The Model and The Problem: A Richer Data Model
Function patterns
A function belongs to the pattern if its name satisfies the boolean predicate and its signature is the same as the required one.
The Model and The Problem: A Richer Data Model (cont’d)
Restricted service invocations
- We assumed so far that all the functions appearing in a document may be invoked in a rewriting in order to match a given schema.
- This is not always the case, for reasons such as security, cost, access rights, etc.
- Thus, function names/patterns in the schema can be partitioned into two disjoint groups: invocable and noninvocable ones.
- A legal rewriting is then one that invokes only invocable functions.
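Function patterns and the invocable/noninvocable partition might be represented as follows. The `FunctionPattern` class, the example predicates and the signature encoding (a pair of input and output type strings) are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Signature = Tuple[str, str]          # (input type, output type), written as plain strings


@dataclass
class FunctionPattern:
    name_ok: Callable[[str], bool]   # boolean predicate over function names
    signature: Signature             # the required signature
    invocable: bool                  # which side of the partition the pattern is on

    def matches(self, name: str, signature: Signature) -> bool:
        return self.name_ok(name) and signature == self.signature


def is_legal(invoked: List[str], signatures: Dict[str, Signature],
             patterns: List[FunctionPattern]) -> bool:
    """A rewriting is legal if every invoked function matches some invocable pattern."""
    return all(
        any(p.invocable and p.matches(name, signatures[name]) for p in patterns)
        for name in invoked
    )


# Hypothetical policy: any Get_* function from city to temp may be called, TimeOut may not.
patterns = [
    FunctionPattern(lambda n: n.startswith("Get_"), ("city", "temp"), invocable=True),
    FunctionPattern(lambda n: n == "TimeOut", ("data", "(exhibit|performance)"), invocable=False),
]
signatures = {
    "Get_Temp": ("city", "temp"),
    "TimeOut": ("data", "(exhibit|performance)"),
}

print(is_legal(["Get_Temp"], signatures, patterns))
print(is_legal(["Get_Temp", "TimeOut"], signatures, patterns))
```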
Exchanging Intensional Data
- Rewriting process: safe rewriting, possible rewriting, mixed approach
- Restriction
Exchanging Intensional Data: rewriting process
Safe rewriting:
- Check whether t safely rewrites to s.
- If so, find a rewriting sequence, i.e. a sequence of functions that need to be invoked to transform t into the required structure.
- Preferably, the shortest/cheapest such sequence.
For a rewriting sequence t →v1 t1 ... →vn tn:
- If the node vj was returned by the invocation of the function vi (relating vj to tj and vi to tj-1), then we say that the function node vj depends on the function node vi.
- If the dependency graph among the nodes contains no paths of length greater than k, then we say that the rewriting sequence is of depth k.
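The depth of a rewriting sequence can be computed from the dependency relation. The sketch below assumes that path length is counted in nodes, so that a sequence invoking only functions of the original document has depth 1; the `returned_by` mapping and the node names are hypothetical.

```python
def rewriting_depth(returned_by):
    """Length (in nodes) of the longest dependency chain among the invoked function nodes.

    returned_by[v] is the function node whose invocation returned v,
    or None when v was already present in the original document.
    """
    def chain(v):
        parent = returned_by[v]
        return 1 if parent is None else 1 + chain(parent)

    return max((chain(v) for v in returned_by), default=0)


# v1 and v3 come from the original document; v2 was returned by the call v1.
print(rewriting_depth({"v1": None, "v2": "v1", "v3": None}))
```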
Safe Rewriting (Dec 16, 2004)
Algorithm for k-depth left-to-right safe rewriting
Given:
- a word w
- the output types Rf1, ..., Rfn of the available functions
- a target regular language R
Purpose of the algorithm:
- to test whether w can be safely rewritten into a word in R
- if so, to find a safe rewriting sequence
Safe Rewriting (cont’d)
Note: for illustration purposes we use the newspaper example:
- the word w formed by the children labels of newspaper
- the target language R = title.date.temp.(TimeOut|exhibit*)
- the safe rewriting of the above word into a word in R
The algorithm, main idea: in regular-language terms, the intersection of the language generated by the k-depth invocations with the complement of the target language R should be empty.
Safe Rewriting (cont’d)
1. Build finite state automata for the following regular languages:
(1) Aw, accepting the word w = title.date.Get_Temp.TimeOut:
    q0 -title-> q1 -date-> q2 -Get_Temp-> q3 -TimeOut-> q4
(2) Automata Afi, each accepting the regular language Rfi (the output types of the available functions).
Safe Rewriting (cont’d)
(3) Build an automaton A accepting the complement of the regular language R. The automaton should be deterministic and complete.
[Figure: the complement automaton A for the schema τ'(newspaper) = title.date.temp.(TimeOut|exhibit*), with states p0 to p6 and transitions labelled title, date, temp, TimeOut, exhibit and *.]
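Complementing a deterministic automaton amounts to completing it with a sink state and swapping accepting and non-accepting states. The sketch below does this for R = title.date.temp.(TimeOut|exhibit*). The state names p0 to p6 follow the figure, but the exact transition table and the choice of p6 as the sink are assumptions reconstructed for illustration.

```python
ALPHABET = {"title", "date", "temp", "TimeOut", "exhibit", "performance", "Get_Temp"}

# Partial deterministic automaton for R (assumed layout).
R_STATES = {"p0", "p1", "p2", "p3", "p4", "p5"}
R_DELTA = {
    ("p0", "title"): "p1", ("p1", "date"): "p2", ("p2", "temp"): "p3",
    ("p3", "TimeOut"): "p4", ("p3", "exhibit"): "p5", ("p5", "exhibit"): "p5",
}
R_ACCEPT = {"p3", "p4", "p5"}


def complete_and_complement(states, alphabet, delta, accepting, sink):
    """Return (states, total transition table, accepting states) of the complement automaton."""
    all_states = set(states) | {sink}
    # Completion: every missing transition goes to the sink.
    total = {(s, a): delta.get((s, a), sink) for s in all_states for a in alphabet}
    # Complement: swap accepting and non-accepting states.
    return all_states, total, all_states - set(accepting)


def accepts(delta, accepting, word, start="p0"):
    state = start
    for label in word:
        state = delta[(state, label)]
    return state in accepting


_, C_DELTA, C_ACCEPT = complete_and_complement(R_STATES, ALPHABET, R_DELTA, R_ACCEPT, "p6")

print(accepts(C_DELTA, C_ACCEPT, ["title", "date", "temp", "TimeOut"]))      # word in R
print(accepts(C_DELTA, C_ACCEPT, ["title", "date", "Get_Temp", "TimeOut"]))  # word not in R
```

Determinism and completeness matter here: swapping accepting states only yields the complement language when every word has exactly one run.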
Safe Rewriting (cont’d)
2. Construct the automaton Aw^k, which represents all the words that can be generated by such a k-depth rewriting process (by iteration).
[Figure: the 1-depth automaton Aw^1 for the word w = title.date.Get_Temp.TimeOut, with states q0 to q7, ε-transitions and transitions labelled title, date, Get_Temp, TimeOut, temp, exhibit and performance. At each fork node, one option represents the choice of invoking the function and the other the choice of not invoking it.]
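One way to build the 1-depth automaton is to add, next to every function letter of w, an ε-detour through a copy of that function's output automaton. The sketch below follows that idea; the data layout (edge triples with `None` for ε, integer main-line states, tagged copies of the output automata) is an assumption, and the output types are taken from the example schema.

```python
def build_depth1(word, outputs):
    """Build the 1-depth automaton for `word`.

    outputs maps a function name to (start, accepting states, edges) of its output automaton.
    Returns (start, accept, edges, forks); an edge is (src, label, dst), label None means ε,
    and forks maps each fork state to (target when not invoking, target when invoking).
    """
    edges, forks = [], {}
    for i, label in enumerate(word):
        src, dst = i, i + 1                          # main-line states 0 .. len(word)
        edges.append((src, label, dst))              # option: do not invoke
        if label in outputs:
            o_start, o_accept, o_edges = outputs[label]
            edges.append((src, None, (i, o_start)))  # option: invoke, ε into a fresh copy
            edges += [((i, a), lab, (i, b)) for a, lab, b in o_edges]
            edges += [((i, a), None, dst) for a in o_accept]   # ε back to the main line
            forks[src] = (dst, (i, o_start))
    return 0, len(word), edges, forks


def accepts(automaton, word):
    """ε-NFA simulation: can `word` be generated by the rewriting process?"""
    start, accept, edges, _ = automaton

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            s = stack.pop()
            for a, lab, b in edges:
                if a == s and lab is None and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen

    current = closure({start})
    for label in word:
        current = closure({b for a, lab, b in edges if a in current and lab == label})
    return accept in current


OUTPUTS = {
    "Get_Temp": (0, {1}, [(0, "temp", 1)]),                                 # temp
    "TimeOut": (0, {1}, [(0, "exhibit", 1), (0, "performance", 1)]),        # (exhibit|performance)
}
A1 = build_depth1(["title", "date", "Get_Temp", "TimeOut"], OUTPUTS)

print(accepts(A1, ["title", "date", "temp", "TimeOut"]))
print(accepts(A1, ["title", "date", "temp", "temp"]))
```

Iterating the same construction on the function letters inside the inserted copies would give deeper automata, which is presumably what "by iteration" refers to.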
Safe Rewriting (cont’d)
3. Construct the cartesian product automaton AX = Aw^k × A.
[Figure 6: the cartesian product automaton AX, with states of the form [q,p] and transitions labelled title, date, Get_Temp, temp, TimeOut, exhibit, performance and ε.]
Safe Rewriting (cont’d)
4. Mark nodes in AX (Figure 6).
Safe Rewriting (cont’d)
Try to obtain a safe rewriting.
"A safe rewriting exists iff the initial state is not marked."
- Follow a non-marked path (corresponding to w) starting from the initial state of AX to a state [q, p] where q is an accepting state of Aw^k.
- Non-marked fork options on the path determine the rewriting choices (i.e. which functions to call).
- When a function is invoked, we continue the path with the new rewritten word rather than the word w.
Safe Rewriting (cont’d)
- To minimize the rewriting cost, choose a path with minimal number/cost of function invocations.
EXIT % End of the algorithm
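The product and marking steps can be sketched as a small fixpoint computation. The marking rules used here are an assumption, since the slides do not spell them out: a product state is marked initially when its q-part is accepting and its p-part is accepting in the complement automaton; an ordinary state becomes marked when some successor is marked; a fork state becomes marked only when both of its options are marked. State names beyond those in the figures (q8, sink) are hypothetical.

```python
# 1-depth automaton for w = title.date.Get_Temp.TimeOut; label None is an ε-move.
AW_EDGES = [
    ("q0", "title", "q1"), ("q1", "date", "q2"),
    ("q2", "Get_Temp", "q3"), ("q2", None, "q5"), ("q5", "temp", "q6"), ("q6", None, "q3"),
    ("q3", "TimeOut", "q4"), ("q3", None, "q7"),
    ("q7", "exhibit", "q8"), ("q7", "performance", "q8"), ("q8", None, "q4"),
]
FORKS = {"q2", "q3"}        # states where we choose between invoking and not invoking
AW_ACCEPT = "q4"


def safe_rewriting_exists(r_delta, r_accept, start=("q0", "p0"), sink="sink"):
    """r_delta / r_accept describe a deterministic automaton for the target language R."""
    def successors(node):
        q, p = node
        # ε-moves keep the p-part; missing transitions of R lead to the sink.
        return [(b, p if label is None else r_delta.get((p, label), sink))
                for a, label, b in AW_EDGES if a == q]

    reachable, stack = {start}, [start]
    while stack:
        for nxt in successors(stack.pop()):
            if nxt not in reachable:
                reachable.add(nxt)
                stack.append(nxt)

    # Initially marked: a generated word that is not in R.
    marked = {(q, p) for q, p in reachable if q == AW_ACCEPT and p not in r_accept}
    changed = True
    while changed:
        changed = False
        for node in reachable - marked:
            hits = [nxt in marked for nxt in successors(node)]
            bad = all(hits) if node[0] in FORKS else any(hits)
            if bad:
                marked.add(node)
                changed = True
    return start not in marked


# R1 = title.date.temp.(TimeOut|exhibit*)
R1_DELTA = {("p0", "title"): "p1", ("p1", "date"): "p2", ("p2", "temp"): "p3",
            ("p3", "TimeOut"): "p4", ("p3", "exhibit"): "p5", ("p5", "exhibit"): "p5"}
# R2 = title.date.temp.exhibit*
R2_DELTA = {("p0", "title"): "p1", ("p1", "date"): "p2", ("p2", "temp"): "p3",
            ("p3", "exhibit"): "p3"}

print(safe_rewriting_exists(R1_DELTA, {"p3", "p4", "p5"}))
print(safe_rewriting_exists(R2_DELTA, {"p3"}))
```

For R1 the initial state stays unmarked (invoke Get_Temp, keep TimeOut); for R2 it is marked, because TimeOut would have to be invoked and may return performance.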
Safe Rewriting (Safe Rewriting (cont’dcont’d)) The complement automaton A for schema
Possible Rewriting (cont’d)
2. Construct the cartesian product automaton AX = Aw^k × A.
[Figure 11: the cartesian product automaton for possible rewriting, with states of the form [q,p] and transitions labelled title, date, temp, exhibit and ε.]
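For possible rewriting, a natural reading of the product construction is a plain reachability test: the word possibly rewrites into R if some product state with both components accepting is reachable. The sketch below assumes that here A is an automaton for R itself (not its complement) and that missing transitions are simply dropped; these assumptions and the state names q8, p4 and p5 are not taken from the slides.

```python
# 1-depth automaton for w = title.date.Get_Temp.TimeOut; label None is an ε-move.
AW_EDGES = [
    ("q0", "title", "q1"), ("q1", "date", "q2"),
    ("q2", "Get_Temp", "q3"), ("q2", None, "q5"), ("q5", "temp", "q6"), ("q6", None, "q3"),
    ("q3", "TimeOut", "q4"), ("q3", None, "q7"),
    ("q7", "exhibit", "q8"), ("q7", "performance", "q8"), ("q8", None, "q4"),
]
AW_ACCEPT = "q4"


def possibly_rewrites(r_delta, r_accept, start=("q0", "p0")):
    """True if some word generated by the rewriting process is accepted by R."""
    seen, stack = {start}, [start]
    while stack:
        q, p = stack.pop()
        for a, label, b in AW_EDGES:
            if a != q:
                continue
            if label is None:
                nxt = (b, p)                       # ε-move: R's state is unchanged
            elif (p, label) in r_delta:
                nxt = (b, r_delta[(p, label)])
            else:
                continue                           # R has no transition: dead end
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return any(q == AW_ACCEPT and p in r_accept for q, p in seen)


# R2 = title.date.temp.exhibit*
R2_DELTA = {("p0", "title"): "p1", ("p1", "date"): "p2", ("p2", "temp"): "p3",
            ("p3", "exhibit"): "p3"}
# R3 = title.date.temp.performance.performance (needs two performances)
R3_DELTA = {("p0", "title"): "p1", ("p1", "date"): "p2", ("p2", "temp"): "p3",
            ("p3", "performance"): "p4", ("p4", "performance"): "p5"}

print(possibly_rewrites(R2_DELTA, {"p3"}))
print(possibly_rewrites(R3_DELTA, {"p5"}))
```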
Implementation
In the implementation:
- An intensional XML document is a well-formed XML document.
- To distinguish intensional parts from the rest of the document, the namespace http://www.activexml.com/ns/int is used.
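Locating the intensional parts of such a document could look like the following, using Python's standard `xml.etree.ElementTree`. The namespace URI is the one given above; the element name `fun` and its `name` attribute are hypothetical, chosen only to make the example concrete.

```python
import xml.etree.ElementTree as ET

INT_NS = "http://www.activexml.com/ns/int"   # namespace marking intensional parts

# The int:fun element and its name attribute are assumptions for this sketch.
DOC = """<newspaper xmlns:int="http://www.activexml.com/ns/int">
  <title>The Sun</title>
  <date>04/10/2002</date>
  <int:fun name="Get_Temp"><city>Paris</city></int:fun>
</newspaper>"""

root = ET.fromstring(DOC)

# ElementTree expands qualified names to "{namespace-uri}local-name".
calls = [el for el in root.iter() if el.tag.startswith("{" + INT_NS + "}")]
found = [(el.tag.split("}")[1], el.get("name")) for el in calls]
print(found)
```

Because the document stays well-formed XML, an application unaware of the namespace can still parse it and simply ignore the intensional elements.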