ETANA-DLNSF Digital Library Project
Edward A. Fox, Virginia Tech
ASOR Annual Meeting, 2004
http://fox.cs.vt.edu
http://fox.cs.vt.edu/talks/2004/
Problems
• Delay in publication of primary archaeological data
• Lack of sustainable solutions to long-term preservation of valuable information
• Lack of services useful to the archaeology community, including “traditional DL services”
• Difficulty in understanding complex archaeological information systems
• Difficulty in requirements elicitation for archaeological systems
• Interoperability among heterogeneous archaeological systems
Solution – our approach
Applying and extending Digital Library (DL) techniques to solve the following problems: making primary data available, data preservation, and interoperability
Modeling archaeological information systems using 5S theory to better understand the domain and design the system and the supported services
Rapidly prototyping DLs that handle heterogeneous archaeological data using componentized frameworks: eliciting requirements, providing useful services
ETANA-DL
Archaeological Digital Library
Applies and extends the OAI-PMH
• Open Archives Initiative Protocol for Metadata Harvesting
Design considerations
• Componentized
• Distributed architecture
• Extensible
• Portable
Heterogeneous data handling
Site            Artifact type        Original data source      Number of records harvested
Bab edh-Dhra’   Pottery              cp6 database file         786
Lahav           Figurine             Tab-delimited text file   563
Madaba          Locus field record   Tables in Access DB       786
Mozan           Publication          PDF files                 19
Nimrin          Bone field record    Table in Oracle DB        7419
Nimrin          Seed field record    Table in Oracle DB        429
Nimrin          Locus field record   Table in Oracle DB        2101
Umayri          Bone field record    2 tables in Access DB     2122
Total                                                          18404
ETANA-DL Searching Service
Search
ETANA-DL Multi-dimensional Browsing
3 new sites
2 new types of artifacts
ETANA-DL Visual Browsing Service
Visual Browse
By site
Visual Browsing Nimrin: Topographical Drawings
Full site
North west quadrant
Square: N40/W20
Visual Browsing Nimrin: Square information
Square: N40/W20
Locus: 86
Loci layout
Visual Browsing Nimrin: Locus sheet
Visual Browsing Bab edh-Dhra' Cemetery
Pottery # 25
5S Archaeological DL Modeling
Modeling archaeological information systems using the 5S theory to better understand the domain and design the system and the supported services
A Minimal DL in the 5S Framework
[Figure: concept map built on the five S constructs (Streams, Structures, Spaces, Scenarios, Societies). Intermediate concepts: structured stream, hypertext, and the indexing, browsing, and searching services. Minimal DL concepts: digital object, descriptive metadata specification, structural metadata specification, metadata catalog, collection, repository, minimal DL.]
A Minimal ArchDL in the 5S Framework
[Figure: concept map over the five S constructs (Streams, Structures, Spaces, Scenarios, Societies), with structured stream, hypertext, and the indexing, browsing, and searching services, extended for archaeology. Archaeology-specific concepts: SpaTemOrg, StraDia, ArchObj, ArchDO, descriptive metadata specification, Arch descriptive metadata specification, Arch metadata catalog, ArchColl, ArchDColl, ArchDR, minimal ArchDL.]
Modeling ETANA-DL: An Archaeological DL Meta-model
[Figure: archaeological DL meta-model organized by the five 5S sub-models. Labels shown:
• Stream model: Text, Video, Audio, Drawing, Photo, 3D
• Structure model: Region, *Site, *Partition, *Sub-partition, *Locus, *Container, *Artifact; Metadata; Taxonomies (Spatial, Temporal, Artifact-specific)
• Space model: Geographic space, Metric space, User interface
• Society model: Archaeologist, General public, Service Manager
• Scenario model: Services (Repository building, Information Satisfaction, Value added, Domain specific)]
Modeling ETANA-DL: ETANA Model
[Figure: the meta-model instantiated for ETANA-DL. Labels shown:
• Stream model: Figurine image (photo), Site/field plan (drawing), Preliminary/Final Report (application/pdf)
• Structure model: Field record, locus sheet; site hierarchies for Jordan, Umayri (*Field, *Square, *Locus, *Pail, *Bone), Jordan Valley, Nimrin (*Quadrant, *Square, *Locus, *Bag), and Southern Israel, Halif (*Field, *Area, *Locus, *Basket, *Seed, *Figurine); Taxonomies (Spatial, Archaeological periods, Bone type, Seed species)
• Space model: Site-specific coordinate system, Vector space, Web interface
• Society model: Archaeologist, Generic public, ETANA-DL Service Manager
• Scenario model: Services (Searching, Browsing; Annotation, binding; Harvesting, Converting; Object comparison, marking item for analysis)]
5SGraph: A DL Modeling Tool
Overall objective of 5SGraph: help users model their own instances of a digital library (DL) in the 5S language (5SL).
A simple modeling process that enables rapid generation of digital libraries is needed.
• Support non-expert users
• Speed up the development process
• Increase the quality of the final product
Goals of 5SGraph
To help digital library designers understand the 5S model quickly and easily
To help digital library designers build their own digital libraries without difficulty
To help digital library designers transform their models into 5SL files automatically
To help digital library designers understand, maintain, and upgrade existing digital library models conveniently
5SGraph
How does 5SGraph work?
5SGraph loads and displays a metamodel in a structured toolbox.
The structured editor of 5SGraph provides a top-down visual environment for the DL designer.
5SGraph produces correct 5SL files according to the visual model built by the designer.
Overview of 5SGraph
[Figure: the 5SGraph interface, showing the workspace (instance model) and the structured toolbox (metamodel).]
Stream Model
Structure Model
Space Model
Scenario Model
Society Model
Component Reuse
• Components can be loaded and saved: load and save sub-trees
• Component reuse saves time and effort
• Full reuse: from the component pool
• Partial reuse: adapting components
Semantic Constraints
There are inherent semantic constraints in the hierarchical structure of the 5S model.
5SGraph maintains the constraints and enforces these constraints over the instance model to ensure correctness.
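As a rough illustration of what enforcing hierarchical constraints over an instance model can look like, the sketch below checks a small tree against a table of permitted child types. The table contents and the names `ALLOWED_CHILDREN`, `Node`, and `violations` are hypothetical and are not taken from 5SGraph itself.

```python
from dataclasses import dataclass, field

# Hypothetical metamodel: which child types each node type may contain.
ALLOWED_CHILDREN = {
    "DL": {"StreamModel", "StructureModel", "SpaceModel",
           "ScenarioModel", "SocietyModel"},
    "StreamModel": {"Text", "Image"},
    "SocietyModel": {"Actor", "ServiceManager"},
}

@dataclass
class Node:
    type: str
    children: list = field(default_factory=list)

def violations(node, path=""):
    """Return messages for every child the metamodel does not permit."""
    here = f"{path}/{node.type}"
    allowed = ALLOWED_CHILDREN.get(node.type, set())
    problems = []
    for child in node.children:
        if child.type not in allowed:
            problems.append(f"{child.type} not allowed under {here}")
        problems.extend(violations(child, here))
    return problems

model = Node("DL", [
    Node("StreamModel", [Node("Text")]),
    Node("SocietyModel", [Node("Image")]),  # misplaced on purpose
])
print(violations(model))
```

A structured editor can avoid such violations up front by offering only the permitted child types at each node; the check above is the after-the-fact equivalent.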
The World According to OAI
[Figure: Data Providers expose metadata; Service Providers obtain it through metadata harvesting and offer services such as Discovery, Current Awareness, and Preservation.]
Data and Service Providers
Data Providers possess metadata and share it (internally / externally) via well-defined OAI protocols (e.g., database servers)
Service Providers harvest metadata from Data Providers and provide higher-level services to users (e.g., search engines)
Who will fit where in ETANA-DL?
• Data Provider: YOUR PROJECT
• Service Provider: ETANA-DL
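To make the harvesting relationship concrete, here is a minimal sketch of the service-provider side: building an OAI-PMH `ListRecords` request and reading record identifiers out of a response. The base URL, the identifiers, and the function names are made up for illustration; the sample response is hard-coded so the sketch runs without a network connection.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a data provider."""
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

def record_identifiers(response_xml):
    """Extract the record identifiers from a ListRecords response."""
    root = ET.fromstring(response_xml)
    path = ".//oai:record/oai:header/oai:identifier"
    return [el.text for el in root.findall(path, OAI_NS)]

# Hypothetical, trimmed-down response from a pottery data provider.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:pottery-1</identifier></header></record>
    <record><header><identifier>oai:example:pottery-2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
print(record_identifiers(SAMPLE))
```

A real harvester would also fetch the URL, follow resumption tokens for large result sets, and handle error responses; those steps are left out here.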
Why be an OAI Provider
Speed up publication
Long-term preservation
Do not need to worry about providing services
How to be an OAI Provider
Requirements
• Perl
• Web server with ability to run CGI scripts
Download OAI-XMLFile-2.1.tar.gz from http://www.dlib.vt.edu/projects/OAI/software/xmlfile/xmlfile.html
Extract the files into a directory from which CGI scripts may be run
• gunzip OAI-XMLFile-2.1.tar.gz
• tar -xvf OAI-XMLFile-2.1.tar
How to be an OAI Provider (Cont.)
Want your pottery collection to be an OAI data provider?
Create a directory “mySitePottery” under ‘OAI-XMLFile-2.1/XMLFile’
• Copy the contents of the test5 directory to the “mySitePottery” directory
Modify the config.xml under ‘OAI-XMLFile-2.1/XMLFile/mySitePottery’
<?xml version="1.0" ?>
…
  <repositoryName> pottery repository name </repositoryName>
  <adminEmail> YourAdmin@yourServer </adminEmail>
  <archiveId> pottery Archive ID </archiveId>
  <recordlimit>500</recordlimit>
  <datadir> directory of pottery XML collection </datadir>
  …
  <metadata>
    <prefix> prefix of pottery repository </prefix>
    <namespace> namespace of your schema </namespace>
    <schema> location of your XML file schema </schema>
  </metadata>
</xmlfile>
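The directory setup described above (create “mySitePottery” and copy the contents of test5 into it) can also be scripted. The following is only a sketch: the function name is hypothetical, and the demonstration runs in a temporary directory that stands in for a real OAI-XMLFile-2.1 installation.

```python
import shutil
import tempfile
from pathlib import Path

def create_provider_dir(xmlfile_root, name, template="test5"):
    """Copy the template directory to a new provider directory."""
    root = Path(xmlfile_root)
    target = root / name
    # copytree creates the target and fails if it already exists,
    # which protects an existing configuration from being overwritten.
    shutil.copytree(root / template, target)
    return target

# Demonstration: a temporary stand-in for OAI-XMLFile-2.1/XMLFile.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "OAI-XMLFile-2.1" / "XMLFile"
    (root / "test5").mkdir(parents=True)
    (root / "test5" / "config.xml").write_text("<xmlfile></xmlfile>")
    new_dir = create_provider_dir(root, "mySitePottery")
    print(sorted(p.name for p in new_dir.iterdir()))
```

After the copy, config.xml in the new directory still has to be edited by hand with the repository name, admin email, archive ID, data directory, and metadata settings.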
Apply the 5S Framework in Integrating Archaeological DLs
Architecture of a Union DL
[Figure: DL1 and DL2, each with its own repository (Repository1, Repository2), catalog (Catalog1, Catalog2), society (archaeologists, general public), and service (searching, browsing), are combined into a Union DL. The Union DL has a Union Repository, a Union Catalog (Union Catalog Integration), a Union Society (archaeologists, general public), and a Union Service (Union Services Automation): harvesting, mapping, searching, browsing, clustering, visualization.]
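Architecture of a Union DL: Union Catalog Integration
[Figure: Virtual Nimrin (VN) and Halif DigMaster (HD) each have a catalog (VNCatalog, HDCatalog) in their own metadata format (VN Metadata Format, HD Metadata Format). For each source, a Mapping Tool relates the local format to the Global Metadata Format, and a Wrapper feeds the Union Catalog.]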
Visual Mapping Tool Architecture
[Figure: labels shown are Visualizing Components, Composite Mapper, Mapper1, Mapper2, Mapper3, Mapper4.]
local schema global schema
Mapping recommendation
Mapping confirmation
Mapping history
No recommendation for “Tomb_Area”
User-decided mapping
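The slides do not say how the mapping tool computes its recommendations. Purely as an illustration of the recommend, confirm, and override workflow, the sketch below suggests a global-schema field for each local-schema field by name similarity, using Python's standard `difflib` as a stand-in technique. All field names except “Tomb_Area” are invented, and the function name is hypothetical.

```python
import difflib

def recommend_mappings(local_fields, global_fields, cutoff=0.6):
    """Suggest a global field for each local field, or None if no
    global field name is similar enough."""
    lowered = {g.lower(): g for g in global_fields}
    result = {}
    for name in local_fields:
        match = difflib.get_close_matches(
            name.lower(), list(lowered), n=1, cutoff=cutoff)
        result[name] = lowered[match[0]] if match else None
    return result

# Hypothetical schemas; only "Tomb_Area" appears in the slides.
local_schema = ["site_name", "Tomb_Area"]
global_schema = ["SiteName", "Locus", "Period"]

recommended = recommend_mappings(local_schema, global_schema)
print(recommended)

# Where there is no recommendation, the user decides the mapping.
confirmed = dict(recommended)
confirmed["Tomb_Area"] = "Locus"  # hypothetical user choice
print(confirmed)
```

Keeping the confirmed mappings separate from the recommendations mirrors the distinction between recommended and user-decided mappings, and a stored copy of `confirmed` could serve as the mapping history.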
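[Figure: DL generation pipeline. A DL Expert supplies the 5S MetaModel to 5SGraph; a DL Designer uses 5SGraph to produce a 5SL DL Model; 5SLGen combines the model with a component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …) to generate Tailored DL Services for the Practitioner, Researcher, and Teacher.]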
5SSuite
[Figure: development phases Requirements (1), Analysis (2), Design (3), Implementation (4), with the 5SSuite tools: 5SGraph, 5SGen, Mapping Tool.]
[Figure: applying the tools to ETANA-DL. An ArchDL Expert supplies the 5S Archaeology MetaModel to 5SGraph, and an ArchDL Designer produces the Structure Sub-model and the Scenario Sub-model (harvesting description, mapping description, browsing description, …). The Mapping Tool relates the VN Metadata Format and the HD Metadata Format to the ETANA-DL Metadata Format and yields Wrapper4VN and Wrapper4HD. 5SGen draws on the Component Pool (Browsing, …) to generate the services. Other labels shown: VNCatalog and XOAI, Union Catalog, Index, Inverted Files, Services DB, Browse DB, Search Service, Browse Service, other ETANA-DL services, Web Interface.]