Page 1
Information Diffusion in Web Services Networks
Shahab Mokarizadeh, Royal Institute of Technology (KTH), Sweden
Peep Küngas, University of Tartu (UT), Estonia
Mihhail Matskin, Royal Institute of Technology (KTH), Sweden
Marco Crasso, Marcelo Campo, Alejandro Zunino, UNICEN University, Argentina
Contact: [email protected]
Page 2
Outline
Background of Information Flow Analysis
Roadmap and Computational Model
Web service Annotation
Web service Categorization
Experimental Results
Discussion & Conclusion
Page 3
Background – Information Diffusion
Information diffusion: the communication of knowledge over time among members of a social system.
It reveals intrinsic properties of real-world phenomena.
It has already been studied in contexts such as the biosphere, microblogs, publication citations, and others where a network structure is present.
Page 4
Information Diffusion among Web Service Domains
Observation: Services published on the Web form a conceptual ecology of knowledge, where information is shared and flows along the input and output parameters of service operations.
Case study: How have Web services in different commodities been designed from an information-exchange perspective?
Introducing value-added Web services
Web service adoption spots
Page 5
Roadmap
1 • Semantically annotate Web services
2 • Assign Web services to their respective categories
3 • Construct the Web service network
4 • Compute the information flow matrix
5 • Analyze the matrix
Page 6
1-Web service Annotation
Image from: Web Services and Security, 1/17/2006, Marco Cova
- Only the basic elements of the input and output parameters of Web service operations are semantically annotated
- SAWSDL annotation model
- We use our semi-automated ontology learning method, which relies on lexico-syntactic patterns: "Ontology Learning for Cost-Effective Large-Scale Semantic Annotation of XML Schemas and Web Service Interfaces", EKAW 2010, pp. 401-410
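To make the annotation step concrete, the sketch below reads SAWSDL annotations from a small XML Schema fragment. The schema, element names, and ontology URIs are invented for illustration; only the `sawsdl` namespace and its `modelReference` attribute are taken from the SAWSDL recommendation. It is a minimal sketch, not the annotation tooling used in this work.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SAWSDL_NS = "http://www.w3.org/ns/sawsdl"

# Hypothetical schema fragment; element names and ontology URIs are invented.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="companyName" type="xs:string"
              sawsdl:modelReference="http://example.org/onto#CompanyName"/>
  <xs:element name="regCode" type="xs:string"
              sawsdl:modelReference="http://example.org/onto#RegistryCode"/>
  <xs:element name="comment" type="xs:string"/>
</xs:schema>
"""


def extract_annotations(schema_text):
    """Map each annotated element name to its list of model references."""
    root = ET.fromstring(schema_text)
    annotations = {}
    for element in root.iter(f"{{{XSD_NS}}}element"):
        refs = element.get(f"{{{SAWSDL_NS}}}modelReference")
        if refs:
            # modelReference holds a whitespace-separated list of URIs.
            annotations[element.get("name")] = refs.split()
    return annotations


annotations = extract_annotations(SCHEMA)
```

Elements without a `modelReference` (here `comment`) are simply skipped, which mirrors the idea that only some basic elements carry a semantic annotation.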
Page 7
Tax and Customs Board service
Output message content fragment
Page 8
Business Registry service
Input message content fragment
Page 9
A Business Registry service
Output message content fragment
Page 10
Registry of Economic Activities Service
Output message content fragment
Page 11
2-Web service Categorization
A category (a.k.a. commodity) describes the general kind of service provided, for example "B2B", "Health", or "E-Commerce".
Each Web service can belong to multiple categories.
Standard software taxonomy, e.g. UNSPSC: http://www.unspsc.org/
We use the AWSC classifier: "AWSC: An approach to Web Service classification based on machine learning techniques", Inteligencia Artificial, ISSN 1137-3601, vol. 12, no. 37, pp. 25-36, Asociación Española para la Inteligencia Artificial, Valencia, España, 2008.
Example UNSPSC categories:
- Instant messaging
- Adventure games
- Internet directory services
- Music or sound editing
- Calendar and scheduling
- Mobile operator specific
- Medical software
- Video conferencing software
Page 12
3-Web service Network Construction
1- Represent the annotated Web services as a bipartite (2-mode) graph
2- Create the semantic network (1-mode graph)
3- Create the weighted category network from the semantic network
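The first two construction steps can be sketched as follows. The operations and concept names are hypothetical, and the projection rule (link concept u to concept v whenever some operation takes u as input and produces v as output) is an assumption about how the 1-mode graph is derived, not a rule stated on the slides.

```python
from collections import defaultdict

# Hypothetical annotated operations: (input concepts, output concepts).
OPERATIONS = {
    "getCompany": ({"RegistryCode"}, {"CompanyName", "Address"}),
    "getTaxDebt": ({"CompanyName"}, {"TaxDebt"}),
}


def build_bipartite(operations):
    """Directed 2-mode graph: concept -> operation -> concept."""
    edges = set()
    for op, (inputs, outputs) in operations.items():
        edges.update((("concept", c), ("operation", op)) for c in inputs)
        edges.update((("operation", op), ("concept", c)) for c in outputs)
    return edges


def project_to_semantic_network(operations):
    """1-mode graph: weight of (u, v) counts operations mapping u to v."""
    weights = defaultdict(int)
    for inputs, outputs in operations.values():
        for u in inputs:
            for v in outputs:
                weights[(u, v)] += 1
    return dict(weights)


bipartite = build_bipartite(OPERATIONS)
semantic = project_to_semantic_network(OPERATIONS)
```

Tagging each node with its mode (`"concept"` or `"operation"`) keeps the two node sets disjoint even if a concept and an operation happen to share a name.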
Page 13
Bipartite Web Service Network
Page 14
Bipartite Web Service Network (categorized)
Page 15
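Network Transformation
Semantic Network → Category Network
Propagate the categories to the semantic nodes ($C_u$: semantic node; $q_{u,k}$: weight of node $C_u$ in category $k$):
\[ Q_u = (q_{u,1}, \dots, q_{u,k}, \dots, q_{u,n}) \]
\[ q_{u,s} = \frac{\mathrm{frequency}(C_u \text{ in } D_s)}{\sum_{i=1}^{n} \mathrm{frequency}(C_u \text{ in } D_i)} \]
$D_s$, $D_t$: category nodes
Label each category edge with weights:
\[ w_{u,v}(D_s, D_t) = q_{u,s} \cdot q_{v,t} \]
\[ W(D_s, D_t) = \sum_{\mathrm{edge}(u,v)} w_{u,v}(D_s, D_t) \]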
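The transformation from the semantic network to the category network might be computed as in the sketch below. The frequency counts, concept names, and category names are hypothetical; the two functions follow the slide's definitions, with each concept's category weights normalized to sum to 1 and each semantic edge (u, v) contributing the product of the endpoint weights to the category pair (s, t).

```python
from collections import defaultdict


def category_distribution(frequency):
    """q[u][s] = frequency(C_u in D_s) / sum_i frequency(C_u in D_i)."""
    q = {}
    for concept, counts in frequency.items():
        total = sum(counts.values())
        q[concept] = {cat: n / total for cat, n in counts.items()}
    return q


def category_weights(semantic_edges, q):
    """W(D_s, D_t) = sum over semantic edges (u, v) of q[u][s] * q[v][t]."""
    W = defaultdict(float)
    for u, v in semantic_edges:
        for s, q_us in q[u].items():
            for t, q_vt in q[v].items():
                W[(s, t)] += q_us * q_vt
    return dict(W)


# Hypothetical counts: how often each concept occurs in each category.
FREQUENCY = {
    "CompanyName": {"Procurement": 3, "Portal": 1},
    "TaxDebt": {"Procurement": 1, "Portal": 1},
}
SEMANTIC_EDGES = [("CompanyName", "TaxDebt")]

q = category_distribution(FREQUENCY)
W = category_weights(SEMANTIC_EDGES, q)
```

Because each concept's weights sum to 1, every semantic edge distributes a total weight of exactly 1 across the category pairs it touches.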
Page 16
4-Normalizing Weights (Z-score)
Edge category weight: $W_{i,j} = W(D_i, D_j)$
Sum of the weights of all links from category i: $W_{i*} = \sum_{j} W(D_i, D_j)$
Sum of the weights of all links to category j: $W_{*j} = \sum_{i} W(D_i, D_j)$
Sum of the weights over all categories: $W = \sum_{i,j} W(D_i, D_j)$
Expected weight from category i to category j: $\frac{W_{i*} W_{*j}}{W}$
Normalized category weight (Z-score):
\[ \Phi_{i,j} = \frac{W_{i,j} - \left( \frac{W_{i*} W_{*j}}{W} \right)}{\frac{W_{i*} W_{*j}}{W}} \]
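A minimal sketch of this normalization, assuming the category weights are held in a dense square list of lists and that the normalized value is the observed weight minus the expected weight, divided by the expected weight. The sample matrix is invented, and setting cells with zero expected weight to 0.0 is a choice made here for illustration.

```python
def z_score_matrix(W):
    """Phi[i][j] = (W[i][j] - E) / E, where E = W_i* * W_*j / W_total."""
    n = len(W)
    row = [sum(W[i]) for i in range(n)]                       # W_i*
    col = [sum(W[i][j] for i in range(n)) for j in range(n)]  # W_*j
    total = sum(row)                                          # W
    phi = [[0.0] * n for _ in range(n)]
    if total == 0:
        return phi
    for i in range(n):
        for j in range(n):
            expected = row[i] * col[j] / total
            # Leave 0.0 where no flow is expected (avoids division by zero).
            if expected > 0:
                phi[i][j] = (W[i][j] - expected) / expected
    return phi


# Hypothetical 2x2 category weight matrix.
W = [[8.0, 2.0],
     [2.0, 8.0]]
phi = z_score_matrix(W)
```

In this example every expected weight is 5.0, so the diagonal cells come out positive (more flow than expected) and the off-diagonal cells negative.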
Page 17
Matrix of Information flow
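Matrix of information flow between pairs of categories:
\[ \Phi = \begin{pmatrix}
\Phi_{1,1} & \cdots & \Phi_{1,j} & \cdots & \Phi_{1,n} \\
\vdots & & \vdots & & \vdots \\
\Phi_{i,1} & \cdots & \Phi_{i,j} & \cdots & \Phi_{i,n} \\
\vdots & & \vdots & & \vdots \\
\Phi_{n,1} & \cdots & \Phi_{n,j} & \cdots & \Phi_{n,n}
\end{pmatrix} \]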
A high proximity ($\Phi_{i,j}$) between categories i and j reveals a strong tendency for semantic concepts associated with category j to result from invoking services that take semantic concepts associated with category i as input.
Page 18
5-Experimental Settings
27,000 public Web services (WSDLs), collected 2005-2011
Semantic annotation
- Lexico-syntactic ontology learning
- Annotation accuracy: precision = 31%, recall = 19%
Categorization
- AWSC classifier
- Training dataset: 1,500 WSDLs
- Categorization accuracy: 91%
Page 19
Excerpt of Identified Service Categories
1-Communications server
2-Instant messaging
3-Adventure games
4-Internet directory services
5-Music or sound editing
6-Calendar and scheduling
7-Mobile operator specific
8-Medical software
9-Video conferencing
10-Map creation software
11-Network operation system
12-Database management system
13-Analytical or scientific
14-Portal server
15-Foreign language software
16-Procurement software
17-Inventory management software
18-Dictionary software
19-Fax software
20-Object oriented database management
Page 20
Visualization of Matrix of Information Flow
Page 21
Information Exchange Patterns - 1:
Self-referential pattern: a category mainly provides inputs for its own services and mostly consumes the information it provides itself (i.e. it is self-contained).
Appears on the main diagonal of the matrix.
Categories: Financial Analysis Software, Web Platform Development Software, Map Creation Software, Video Conferencing Software, and Accounting Software.
The APIs exposed by these Web services frequently use domain-specific concepts as input and output elements.
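One possible way to flag self-referential categories from the information flow matrix is sketched below. The criterion (the diagonal cell is the largest value in both its row and its column) is an assumption made for illustration, since the slides give no explicit threshold, and the matrix values are invented.

```python
def self_referential_categories(phi, names):
    """Categories whose diagonal cell dominates its own row and column."""
    n = len(phi)
    result = []
    for i in range(n):
        row_max = max(phi[i])
        col_max = max(phi[k][i] for k in range(n))
        if phi[i][i] >= row_max and phi[i][i] >= col_max:
            result.append(names[i])
    return result


# Hypothetical normalized flow values for three categories.
NAMES = ["Map creation", "Fax", "Portal server"]
PHI = [
    [0.9, -0.2, -0.1],
    [0.1, 0.0, 0.3],
    [-0.3, 0.2, 0.4],
]

found = self_referential_categories(PHI, NAMES)
```

A stricter variant could additionally require the diagonal value to exceed a fixed positive threshold.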
Page 22
Information Exchange Patterns - 2:
Outside the main diagonal:
- Foreign Language category, Presentation category
- Financial Analysis category, Enterprise Resource Planning category
Least volume of information flow:
- Video Conferencing software and Financial Analysis software
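Category pairs outside the main diagonal can be ranked by their matrix value to surface the strongest and weakest flows. The sketch below assumes that "volume of information flow" is read directly from the normalized matrix cell; the category names and values are invented.

```python
def rank_off_diagonal(phi, names):
    """Return (source, target, value) triples for i != j, strongest first."""
    pairs = [
        (names[i], names[j], phi[i][j])
        for i in range(len(phi))
        for j in range(len(phi))
        if i != j
    ]
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical normalized flow values for three categories.
NAMES = ["Map creation", "Fax", "Portal server"]
PHI = [
    [0.9, -0.2, -0.1],
    [0.1, 0.0, 0.3],
    [-0.3, 0.2, 0.4],
]

ranked = rank_off_diagonal(PHI, NAMES)
strongest, weakest = ranked[0], ranked[-1]
```

The ranking is directional: the pair (Fax, Portal server) is scored separately from (Portal server, Fax).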
Page 23
Threats to Validity
The presented model relies heavily on the accuracy of the underlying semantic annotation and matching scheme.
The examined Web services account for only a small proportion of those existing on the Web.
The collection of Web service interface descriptions may also suffer from an unintentional preference toward some specific categories.
In the absence of a timing factor, our analysis is a rather static analysis of information flow.
Page 24
Conclusion and Future Work
The presented approach can discover information exchange patterns.
In general, our approach is applicable to any other kind of machine-understandable API, not just WSDLs.
Future work:
- Examine how the presence of service compositions or mashups influences the information exchange patterns
- Recommend value-added Web services based on the identified information exchange patterns and Web service network properties
Page 25
Thanks!
Questions Please!
Page 26
Partial category weight for edge $(D_s, D_t)$:
\[ w_{u,v}(D_s, D_t) = q_{u,s} \cdot q_{v,t} \]
Augmented category weight for edge $(D_s, D_t)$:
\[ W(D_s, D_t) = \sum_{\mathrm{edge}(u,v)} w_{u,v}(D_s, D_t) \]
Page 27
Ontology Learning for Web Service Annotation [1]
[Figure: ontology learning pipeline. Labels: Reference Ontology; Adding Relations; Ontology Organization; Term Extraction; Syntactic Refinement; Information Elicitation; Pattern-based Semantic Analysis; Term Disambiguation; Class and Relation Determination; Ontology Discovery]
Ontology learning input:
- Message part names of input/output parameters
- XML Schema leaf element names of complex types
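Names such as message part names and schema leaf element names usually need to be split into terms before any pattern-based analysis. The sketch below shows one common tokenization heuristic (splitting on separators, camel-case boundaries, and digit runs); it is an assumption about what such a term extraction step could look like, not the method of [1], and the sample identifiers are invented.

```python
import re

# Acronym before a capitalized word, capitalized or lowercase word,
# remaining uppercase run, or digit run.
TERM_PATTERN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_identifier(name):
    """Split an element or message part name into lowercase terms."""
    terms = []
    # First break on non-alphanumeric separators such as '_' or '-'.
    for part in re.split(r"[^A-Za-z0-9]+", name):
        terms.extend(TERM_PATTERN.findall(part))
    return [t.lower() for t in terms]


company_terms = split_identifier("getCompanyAddress")
schema_terms = split_identifier("XMLSchema_leafName")
debt_terms = split_identifier("tax_debt2010")
```

The lookahead in the first alternative keeps acronyms such as `XML` together while still starting a new term at the following capitalized word.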
[1] "Ontology Learning for Cost-Effective Large-Scale Semantic Annotation of XML Schemas and Web Service Interfaces", in Proc. EKAW 2010, LNAI 6317, pp. 401-410, 2010