DL.Org (Digital Library Interoperability, Best Practices and Modeling Foundations)
Functionality Working Group Mtg, 29-30 June 2009, Athens
“Functionality modeling and functionality interoperability, Session 1”
Functionality and Interoperability with 5S
by Edward A. Fox
• [email protected] • http://fox.cs.vt.edu
• Dept. of Computer Science, Virginia Tech
• Students, colleagues, co-investigators
• Robert France, Marcos André Gonçalves, Doug Gorton, Yi Ma, Uma Murthy, Rao Shen, Hussein Suleman, Ricardo da Silva Torres, ...
• Barbara Wildemuth, Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang
2
Theses and Dissertations
• Douglas Gorton, "Practical Digital Library Generation into DSpace with the 5S Framework", April 2007, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-04252007-161736/
• Rao Shen, "Applying the 5S Framework To Integrating Digital Libraries", April 2006, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-04212006-135018/
• Ananth Raghavan, "Schema Mapper: A Visualization Tool for Incremental Semi-automatic Mapping-based Integration of Heterogeneous Collections into Archaeological Digital Libraries: The ETANA-DL Case Study", May 2005, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-05182005-114155/
• Marcos Andre Goncalves, "Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications", Nov. 2004, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-12052004-135923/
• Rohit Dilip Kelapure, "Scenario-Based Generation of Digital Library Services", June 2003, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-06182003-055012/
• Qinwei Zhu, "5SGraph: A Modeling Tool for Digital Libraries", Nov. 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-11272002-210531/
• Jun Wang, "VIDI: A Lightweight Protocol Between Visualization Systems and Digital Libraries", May 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-07012002-145841/
3
Other Selected References
• Marcos Andre Goncalves, Robert K. France, Edward A. Fox. MARIAN: Flexible Interoperability for Federated Digital Libraries. ECDL 2001, 173-186, 2001
• Hussein Suleman and Edward Fox. The Open Archives Initiative: Realizing Simple and Effective Digital Library Interoperability. J. Library Administration, 35(1/2):125-145, 2002
• Marcos Andre Goncalves, Edward A. Fox. 5SL - A Language for Declarative Specification and Generation of Digital Libraries. JCDL 2002, 263-272
• Marcos Andre Goncalves, Ming Luo, Rao Shen, Mir Farooq Ali, Edward A. Fox. An XML Log Standard and Tool for Digital Library Logging Analysis. ECDL 2002, 129-143
• Marcos Andre Goncalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox, Filip Jagodzinski, Lillian Cassel. The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment. JCDL 2003, 312-314
• Hussein Suleman, Edward A Fox, Rohit Kelapure, Aaron Krowne, Ming Luo. Building digital libraries from simple building blocks, Online Information Review 27(5): 301-310, 2003
• M. Goncalves, E. Fox, L. Watson, N. Kipp. Streams, Structures, Spaces, Scenarios, Societies (5S): A Formal Model for Digital Libraries. TOIS, 22(2): 270-312, 2004
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Ricardo da S. Torres, E. A. Fox. Exploring Digital Libraries: Integrating Browsing, Searching, and Visualization. JCDL 2006, 1-10
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. What is a Successful Digital Library? ECDL 2006, 208-219
4
Other Selected References - 2
• Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang, Edward A. Fox, Barbara M. Wildemuth. The Core: Digital Library Education in Library and Information Science Programs. D-Lib Magazine, 12(11), Nov. 2006
• Marcos Andre Goncalves, Barbara L. Moreira, Edward A. Fox, Layne T. Watson. "What is a good digital library?" - A quality model for digital libraries. Information Processing and Management, 43(5): 1416-1437, 2007
• Uma Murthy, Douglas Gorton, Ricardo Torres, Marcos Goncalves, Edward Fox, Lois Delcambre. Extending the 5S Digital Library (DL) Framework: From a Minimal DL towards a DL Reference Model. JCDL 2007 Workshop on Digital Library Foundations
• Barbara L. Moreira, Marcos A. Goncalves, Alberto H. F. Laender, Edward A. Fox, Evaluating Digital Libraries with 5SQual. ECDL 2007: pp. 466-470
• Yi Ma, Edward A. Fox, Marcos A. Goncalves. Personal Digital Library: PIM upon 5S Framework. CIKM 2007 Workshop: PIKM07, Lisbon, Nov. 2007, 117-124
• Marcos Andre Goncalves, Edward A. Fox, Layne T. Watson. Towards a Digital Library Theory: A Formal Digital Library Ontology. Int. J. Digital Libraries 8(2): 91-114, 2008
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. Integration of Complex Archaeology Digital Libraries: An ETANA-DL Experience. Information Systems. 33(7-8): 699-723, 2008
• Barbara L. Moreira, Marcos Andre Goncalves, Alberto H.F. Laender, Edward A. Fox. Automatic Evaluation of Digital Libraries with 5SQual. J. Informetrics, 3(2): 102-123, 2009
• Module 1-b: History of digital libraries and library automation
• Module 2-c: File Formats, Transformation, and Migration
• Module 3-b: Digitization
• Module 4-b: Metadata
• Module 5-a: Architecture overviews
14
DL Curric. Modules - 2
• Module 5-b: Application software
• Module 5-d: Protocols
• Module 6-a: Information needs/relevance
• Module 6-b: Online information seeking behaviors and search strategies
• Module 6-d: Interaction design and usability assessment
15
DL Curric. Modules - 3
• Module 7-b: Reference Services
• Module 7-g: Personalization
• Module 8-b: Web Archiving
• Module 9-c: Digital library evaluation, user studies
16
Interoperability Approaches
• Browsers (Mosaic)
• Federation
• Heterogeneous, Homogeneous
• Protocols (OAI-PMH)
• Repositories
• Content Standards (XML), Mapping
• Integration (ETANA)
• Services (Superimposed Information)
17
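The protocol-based approach listed above (OAI-PMH) can be illustrated with a minimal sketch of how a harvester might form a request. `ListRecords`, `metadataPrefix`, `set`, and `from` are standard OAI-PMH request arguments; the base URL and the helper name `list_records_url` are assumptions made for illustration and do not come from the slides.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; not taken from the slides.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for selective harvesting."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec      # restrict harvesting to one set
    if from_date is not None:
        params["from"] = from_date    # only records changed since this date
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL, from_date="2009-06-29")
print(url)
```

A harvester would issue an HTTP GET on this URL and parse the XML response; that step is left out so the sketch runs without network access.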
18
Integration: Challenges
• “Semantic Web” is vision, not reality.
• How can we integrate without a theory?
• How can we interoperate without a common framework?
• How can we have a science of DLs if we lack agreement on definitions (so we can reason and discuss) and measures of quality (so we can compare and improve)?
19
Informal 5S & DL Definitions
DLs are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)
20
5S Layers
Societies
Scenarios
Spaces
Structures
Streams
21
5Ss
Ss | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
• Discovery of content
• Classification and cataloguing
• Acquisition and/or linking; referencing
• Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged
• Access to massive real-time or archived datasets
• Software tool suites for analysis, modeling, simulation, or visualization
• Reviewed commentary on learning materials and pedagogy
• System descriptions and comparisons
– Personal DLs; Institutional to global
– DSpace, Eprints, Fedora, Greenstone, Kepler
• ODL
• 5S Suite: language, visualization, generation, logging
46
Architectural Issues
• Independent system vs. part of federation
• Centralized vs. distributed vs. open services
• Monolithic vs. modular vs. componentized
• Topologies: bus vs. star vs. hierarchical vs. network
• Decompositions vary
– search engine, browser, DBMS, MM support
– repository, handle server, client
– information resources + mediators, bus or agent collection + client with workspace/environment
47
NSDL Information Architecture
Essentially as developed by the Technical Infrastructure Workgroup
[Figure: a core NSDL “bus” with the following labeled parts. User Interfaces: portals & clients. Usage Enhancement: CI services (annotation, discussion, personalization, authentication, browsing) and core services (information retrieval). Collection Building: core collection-building services (harvesting, protocols) and core services (metadata gathering). Also shown: NSDL collections, NSDL services, other NSDL services, special databases, and referenced items & collections.]
48
5S Modeling -> Systems
[Figure: diagram with nodes Domain Concepts (theory), Modeling Language (Meta-Model), Model, DL Architecture, Running DL, DL Actors, “Real” World, and “real” world object; edges are labeled instance of, used to compose, abstracted from, represented by, and interpreted as.]
49
Tools/Applications
[Figure: tool chain with labels 5S MetaModel, 5SGraph, DL Expert, DL Designer, 5SL DL Model, 5SLGen, Practitioner, Researcher, Teacher, Tailored DL, component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …), Logging Module, and XMLLog.]
50
[Figure: the phases Requirements, Analysis, Design, Implementation, and Test, annotated with the labels 5S, 5SL, OO Classes, Workflow, Components, DL Evaluation, 5SGraph, 5SLGen, Formal Theory/Metamodel, and DL XMLLog.]
51
5SL: a DL design language
• Domain specific languages
– Address a particular class of problems by offering specific abstractions and notations for the domain at hand
– Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping.
• XML-based realization of 5S
– Interoperability
– Use of many sub-languages (e.g., MIME types, XML Schemas, UML notations)
52
5SL – The Minimal DL Metamodel
[Figure: metamodel diagram. Meta-models shown: Stream (Meta-) Model (primitives: Text, Audio, Video, Image), Structural (Meta-) Model, Spatial (Meta-) Model, Scenarios (Meta-) Model, and Societal (Meta-) Model. Concepts shown: Document, Metadata, Collection, Catalog, Index, Retrieval Model, User Interface, Service, Scenario, Event, Community, Actor, Service Manager, Interface Manager, Search Manager, Browsing Manager, Index Manager, and Repository Manager. Edge labels: uses, runs, receiver.]
53
<document name='ETD'>
  <stream_enumeration>
    <stream value='ETDText'/>
    <stream value='ETDAudio'/>
    ...
  </stream_enumeration>
  <structured_stream>
    %XMLSchema%
  </structured_stream>
</document>
Example of Document declaration in the Structures Model
<Society>
  <Actor>
    <Community name='Patron'>
      <Attribute name='name' type='String'/>
      <Attribute name='ID' type='Integer'/>
    </Community>
    <Community name='Student'>
      <Service>Converting</Service>
    </Community>
    <Community name='ETDReviewer'>
      <Service>Reviewing</Service>
    </Community>
    <Community name='ETDCataloguer'>
      <Service>Cataloguing</Service>
    </Community>
  </Actor>
  ………
Example of Actors declaration in the Societies Model
<SERVICE name='Searching'>
  <SCENARIO name='SimpleSearching'>
    <NOTE>Simple scenario for an NDLTD site searching service</NOTE>
    <EVENT>
      <SENDER>Patron</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <OPERATION name='SearchCriteria'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>InterfaceManager</SENDER>
      <RECEIVER>SearchManager</RECEIVER>
      <OPERATION name='Search'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>SearchManager</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <PARAMETER name='Results'>WtdSet</PARAMETER>
    </EVENT> ….
Example of Service declaration in the Scenario Model
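A scenario declaration of this kind is machine-readable, so a generator or analysis tool can walk its events. The following is a minimal sketch, not the 5SLGen implementation: it parses a completed copy of the first two events with Python's standard `xml.etree.ElementTree`. The closing `</SCENARIO>` and `</SERVICE>` tags are added as an assumption so the fragment parses, and the helper name `scenario_events` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Completed copy of the first two events from the slide. The closing
# </SCENARIO> and </SERVICE> tags are an assumption added so this parses.
FIVE_SL = """
<SERVICE name='Searching'>
  <SCENARIO name='SimpleSearching'>
    <EVENT>
      <SENDER>Patron</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <OPERATION name='SearchCriteria'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>InterfaceManager</SENDER>
      <RECEIVER>SearchManager</RECEIVER>
      <OPERATION name='Search'/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
  </SCENARIO>
</SERVICE>
""".strip()


def scenario_events(xml_text):
    """Return (sender, receiver, operation, parameters) for each EVENT."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.iter("EVENT"):
        op = event.find("OPERATION")
        events.append((
            event.findtext("SENDER"),
            event.findtext("RECEIVER"),
            op.get("name") if op is not None else None,  # OPERATION may be absent
            [p.text for p in event.findall("PARAMETER")],
        ))
    return events


events = scenario_events(FIVE_SL)
for sender, receiver, operation, params in events:
    print(f"{sender} -> {receiver}: {operation}({', '.join(params)})")
```

Reading the scenario as a sequence of sender/receiver messages is what makes it usable for generating component wiring or for checking that every receiver is a declared manager.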
54
5SGraph: A DL Modeling Tool
• Help users model their own instances of a digital library (DL) in the 5S language (5SL).
• A simple modeling process which enables rapid generation of digital libraries
• Features
– 5SGraph loads and displays a metamodel in a structured toolbox.
– The structured editor of 5SGraph provides a top-down visual building environment for the DL designer.
– 5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.
55
Overview of 5SGraph
[Screenshot: the workspace (instance model) and the structured toolbox (metamodel).]
56
57
5SGen
• Version 1 -- MARIAN as the target system
– Focused on rich structures: semantic networks
– Behavior attached to nodes/links
• Version 2 -- Shifted for later work to componentized (ODL) approach
– Focused on scenarios/societies
– Structures/Spaces encapsulated within components (e.g., relational tables, indexes)
– Only textual streams supported
• Effectiveness
– Very common measures: Precision, Recall, F1, 10-precision, R-Precision
– Other services may have different measures: e.g., Recommending, etc.
• Efficiency
– Let $t(e)$ be the time of an event $e$.
– Let $e_{i_x}$ and $e_{f_x}$ be the initial and the final event of service $se_x$.
– For service $se_x$, efficiency is defined as:
• $\mathrm{Efficiency}(se_x) = t(e_{f_x}) - t(e_{i_x})$
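The effectiveness and efficiency measures above can be sketched in a few lines. This is an illustrative sketch only: the function names, the document identifiers, and the timestamps are assumptions, and the efficiency function simply takes the difference between the final and initial event times, as in the definition above.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def efficiency(event_times):
    """Elapsed time between the initial and final event of one service run.

    event_times is the ordered list of t(e) values for the run's events.
    """
    return event_times[-1] - event_times[0]


# Hypothetical query result and relevance judgments.
p, r, f1 = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
# Hypothetical event timestamps in seconds, e.g. taken from a log.
elapsed = efficiency([10.0, 10.2, 10.75])
print(p, r, f1, elapsed)
```

In practice the event times would come from logged scenario events, so the same log can serve both usage analysis and efficiency measurement.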
64
DL Integration
• What is “DL Integration”
– Hide distribution
– Hide heterogeneity
– Enable autonomy of individual component
• Why Integration
– island-DLs
– inability to seamlessly and transparently access knowledge across DLs
Utilize various autonomous DLs in concert
65
Integration: Urgency, Longevity
• If we collect, capture, acquire, or produce information, will it be usable in 100 years?
• NSF Digital Archiving Program
• Library of Congress National Digital Information Infrastructure and Preservation Program
66
DL interoperability approach
[Figure: concept map. Node labels: intermediary-based, mapping-based, mediator, wrapper, agent, two architectures (federation, union archiving), hybrid mapper, composite mapper, schema mapping, GA, and DL integration formalization. Edge labels: consists of, use, used in, interrelated with, trained by, based on.]
Union DL Definitions
• A Minimal Union Digital Library integrated from n DLs is given as a four-tuple: MinUnionDL=(Union Repository, Union Catalog, Minimal Union Services, Union Society).
• DL Integration Problem Definition: Given n individual digital libraries (DL1, DL2, …, DLn), each defined as described above, to integrate the n DLs is to create a Union DL.
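The four-tuple above maps naturally onto a small data structure. The sketch below is a loose illustration under stated assumptions, not the formal definition: the class and field names are hypothetical, object identifiers are prefixed with the DL name to avoid collisions, and because the slides do not say how the minimal union services are derived, the sketch simply keeps the services offered by every member DL as a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class DL:
    """One member digital library (hypothetical, simplified structure)."""
    name: str
    repository: set = field(default_factory=set)   # digital object ids
    catalog: dict = field(default_factory=dict)    # object id -> metadata
    services: set = field(default_factory=set)
    society: set = field(default_factory=set)


@dataclass
class MinUnionDL:
    """The four-tuple from the slide."""
    union_repository: set
    union_catalog: dict
    minimal_union_services: set
    union_society: set


def integrate(dls):
    """Naive integration of n DLs into a MinUnionDL (illustrative only)."""
    repo, catalog, society = set(), {}, set()
    for dl in dls:
        for obj in dl.repository:
            repo.add(f"{dl.name}:{obj}")            # qualify ids by source DL
        for obj, meta in dl.catalog.items():
            catalog[f"{dl.name}:{obj}"] = meta
        society |= dl.society
    # Placeholder assumption: keep only services every member offers.
    services = set.intersection(*(dl.services for dl in dls)) if dls else set()
    return MinUnionDL(repo, catalog, services, society)


# Hypothetical data for two ETANA-DL member sites.
lahav = DL("Lahav", {"bone1"}, {"bone1": {"type": "bone"}},
           {"Searching", "Browsing"}, {"Archaeologists"})
madaba = DL("Madaba", {"pot7"}, {"pot7": {"type": "pottery"}},
            {"Searching", "Browsing", "Annotating"}, {"Archaeologists"})
union = integrate([lahav, madaba])
print(union)
```

A real integration also has to map each member's metadata schema onto a common one, which this sketch leaves out.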
68
Union Catalog Quality Measurement
• Complete
– All the catalogs to be integrated are complete.
• Consistent
– All the catalogs to be integrated are consistent.
– Each descriptive metadata specification in the union catalog describes only one digital object.
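The two quality properties can be checked mechanically once a catalog is represented as pairs of (metadata specification id, digital object id). The sketch below is an assumption-laden illustration: the pair representation and function names are hypothetical, and completeness is read here as "every object in the repository has at least one metadata specification", which the slide itself does not spell out.

```python
from collections import defaultdict


def is_consistent(catalog_entries):
    """True if each metadata specification describes exactly one object."""
    described = defaultdict(set)
    for spec_id, object_id in catalog_entries:
        described[spec_id].add(object_id)
    return all(len(objs) == 1 for objs in described.values())


def is_complete(catalog_entries, repository):
    """True if every repository object has at least one metadata spec.

    This reading of completeness is an assumption made for the sketch.
    """
    described_objects = {object_id for _, object_id in catalog_entries}
    return set(repository) <= described_objects


# Hypothetical union catalog entries: (spec id, object id).
good = [("m1", "o1"), ("m2", "o2"), ("m3", "o2")]
bad = [("m1", "o1"), ("m1", "o2")]   # m1 describes two objects

print(is_consistent(good), is_consistent(bad))
print(is_complete(good, ["o1", "o2"]), is_complete(good, ["o1", "o2", "o3"]))
```

Note that consistency in this sense still allows several specifications to describe the same object, as with `m2` and `m3` above.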
Member DLs of ETANA-DL
[Figure: four member DLs are shown (Lahav, Madaba, Megiddo, Umayri, …). Each is drawn with the same labeled parts: Repository, Catalog, Database, Searching and Browsing (Service), and Archaeologists (Society).]
Architecture of ETANA-DL, with centralized catalog and partially