Digital Preservation: The Multimedia Standards way
Mario Döller, Assistant Professor
University of Passau, Germany
1st International Digital Preservation Interoperability Framework (DPIF) Symposium
Universität Passau, Lehrstuhl für Verteilte Informationssysteme, Prof. Dr. Harald Kosch
Agenda
• Heterogeneity in Digital Preservation and Multimedia Retrieval in general
• Selected solutions based on Multimedia Standards
  o MPEG Query Format
  o JPEG JPSearch
• Conclusion
Digital Preservation (DP) Efforts
• National Programs (US, EU, Australia, etc.)
  o NDIIPP, RAMA, ORION, CASPAR, PLANETS, PADI, PANIC, …
• Digital Preservation Europe (DPE)
  o improve collaboration and synergies between existing preservation initiatives across Europe
• Developed Metadata Formats
  o General: MPEG-7, Dublin Core, …
  o DP: VRA (Visual Resources Association) Core 4.0, CIDOC Conceptual Reference Model (CRM, ISO 21127:2006), museumdat, …
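Multimedia Retrieval: Current Situation
[Diagram: professional and private content providers annotate content with heterogeneous metadata, which is stored in and queried through heterogeneous retrieval systems]
• Content providers: professional, private
• Metadata annotation: MPEG-7, Dublin Core, TV-Anytime, NISO, proprietary
• MMRS: Oracle InterMedia, Informix, Blobworld, IBM MMRS
• Query languages: SQL/MM, MOQL, XIRQL, MMDOC-QL, IMAQL, XQuery, XPath, SVQL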
Example for heterogeneous Image Annotation
MPEG-7
<Creator>
  <Role href="urn:…:CS:creator"></Role>
  <Agent xsi:type="PersonType">
    <Name>
      <GivenName>Mario</GivenName>
      <FamilyName>Döller</FamilyName>
    </Name>
  </Agent>
</Creator>

Dublin Core
<metadata>
  …
  <title>Alps</title>
  <creator>Mario Döller</creator>
  …
</metadata>
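The two annotations carry the same fact (the creator's name) in different structures. The following is a minimal sketch of how a client might read both into one common string, assuming Python's standard `xml.etree.ElementTree`. The function names, the namespace-free element names and the `urn:example:CS:creator` placeholder are illustrative assumptions and are not taken from either standard.

```python
import xml.etree.ElementTree as ET

# Sample descriptions modelled on the slide. Namespaces are left out of the
# element names for brevity; real MPEG-7 and Dublin Core documents declare them.
# The href value is a placeholder (assumption), not a real classification term.
MPEG7_SAMPLE = """
<Creator xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Role href="urn:example:CS:creator"></Role>
  <Agent xsi:type="PersonType">
    <Name>
      <GivenName>Mario</GivenName>
      <FamilyName>Döller</FamilyName>
    </Name>
  </Agent>
</Creator>
"""

DUBLIN_CORE_SAMPLE = """
<metadata>
  <title>Alps</title>
  <creator>Mario Döller</creator>
</metadata>
"""


def creator_from_mpeg7(xml_text: str) -> str:
    """Join given and family name from the structured MPEG-7 style description."""
    root = ET.fromstring(xml_text)
    given = root.findtext("Agent/Name/GivenName", default="")
    family = root.findtext("Agent/Name/FamilyName", default="")
    return f"{given} {family}".strip()


def creator_from_dublin_core(xml_text: str) -> str:
    """Read the flat creator element of the Dublin Core style description."""
    root = ET.fromstring(xml_text)
    return root.findtext("creator", default="").strip()


if __name__ == "__main__":
    print(creator_from_mpeg7(MPEG7_SAMPLE))
    print(creator_from_dublin_core(DUBLIN_CORE_SAMPLE))
```

Each format needs its own access path, which is the interoperability problem the standards discussed in this talk address.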
Current Standardization Efforts
• MPEG – Query Format (ISO/IEC SC29 WG11) (ISO/IEC 15938-12)
• JPEG - JPSearch (ISO/IEC SC29 WG1) (ISO/IEC 24800)
• W3C – Media Annotations Working Group (http://www.w3.org/2008/WebVideo/Annotations/)
MPEG – Query Format (ISO/IEC SC29 WG11) (ISO/IEC 15938-12)
The MPEG Query Format (MPQF)
• International Standard since the end of 2008
• Standardizes messages to and from multimedia repositories and provides extended functionality for service discovery, service selection and service capability description
• General concepts
  o based on XML and defined by an XML Schema
  o decoupled from any other metadata standard (including MPEG-7)
  o supports any XML-based multimedia metadata description
  o integrates limited XQuery functionality
  o MPQF is divided into 3 main categories
    • Management
    • Input Query Format
    • Output Query Format
MPQF Scenario
MPQF Concepts: Query I
• MPQF supports:
  o Synchronous/asynchronous mode
  o Timeout functionality
• MPQF combines:
  o Exact matches
  o Fuzzy requests
• How to query MMRS satisfactorily?
[Diagram: Query Design. A Query element contains QFDeclaration (0..1), OutputDescription (0..1) and QueryCondition (0..1)]
MPQF Concepts: Query - Condition
[Diagram: two QbMedia conditions combined by AND. Each condition carries a preferenceValue and a thresholdValue, the AND operator carries a scoringFunction, and values lie in the range [0 .. 1]]
• Assign a preferenceValue and a thresholdValue to every condition
• Assign a scoringFunction to every "Boolean operator" (AND, OR, XOR); it is recommended to follow t-norm and t-conorm rules
• This results in a rank and confidence evaluation for every item
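A small sketch of how such scores could be combined. The slide does not spell out the exact semantics of preferenceValue and thresholdValue, so this example assumes that a score below its threshold counts as 0 and that the preference acts as a multiplicative weight. The class and function names are hypothetical. The minimum and product t-norms and the maximum and probabilistic-sum t-conorms are standard fuzzy-logic operators, chosen here only for illustration.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Iterable


@dataclass
class ConditionScore:
    score: float             # raw similarity of one condition, in [0, 1]
    preference: float = 1.0  # assumed meaning: multiplicative weight in [0, 1]
    threshold: float = 0.0   # assumed meaning: scores below this count as 0


def effective(c: ConditionScore) -> float:
    """Apply the assumed threshold and preference semantics to one condition."""
    if c.score < c.threshold:
        return 0.0
    return c.score * c.preference


# Two common t-norms (fuzzy AND) and their dual t-conorms (fuzzy OR).
def t_norm_min(a: float, b: float) -> float:
    return min(a, b)


def t_norm_product(a: float, b: float) -> float:
    return a * b


def t_conorm_max(a: float, b: float) -> float:
    return max(a, b)


def t_conorm_prob_sum(a: float, b: float) -> float:
    return a + b - a * b


def combine_and(conditions: Iterable[ConditionScore],
                t_norm: Callable[[float, float], float] = t_norm_min) -> float:
    # 1.0 is the neutral element of every t-norm.
    return reduce(t_norm, (effective(c) for c in conditions), 1.0)


def combine_or(conditions: Iterable[ConditionScore],
               t_conorm: Callable[[float, float], float] = t_conorm_max) -> float:
    # 0.0 is the neutral element of every t-conorm.
    return reduce(t_conorm, (effective(c) for c in conditions), 0.0)
```

With two conditions scoring 0.8 and 0.4, both with threshold 0.5, the second condition drops to 0, so the AND combination yields 0 while the OR combination keeps 0.8.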
MPQF Examples: Management I
• Request: give me all available MMRS!

<MpegQuery>
  <Management>
    <Input>
    </Input>
  </Management>
</MpegQuery>

• Request: give me all available MMRS that fit my desired requirements

<MpegQuery>
  <Management>
    <Input>
      <DesiredCapability>
        <SupportedMetadata href="urn:mpeg:mpeg7:schema:2004" />
        <SupportedQueryTypes href="QueryByMedia" />
        <SupportedQueryTypes href="QueryByFreeText" />
      </DesiredCapability>
    </Input>
  </Management>
</MpegQuery>
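A client could assemble such management requests programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` and takes the element names from the slide; the function name `build_capability_request` is hypothetical, and the XML namespaces a schema-valid MPQF message would need are omitted, as they are on the slide.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def build_capability_request(metadata_uris: Sequence[str] = (),
                             query_types: Sequence[str] = ()) -> str:
    """Build a management request as an XML string.

    With no arguments the Input element stays empty, which corresponds to
    asking for all available services. Namespaces are omitted (assumption:
    a real message would declare the MPQF namespace).
    """
    root = ET.Element("MpegQuery")
    management = ET.SubElement(root, "Management")
    input_el = ET.SubElement(management, "Input")
    if metadata_uris or query_types:
        desired = ET.SubElement(input_el, "DesiredCapability")
        for uri in metadata_uris:
            ET.SubElement(desired, "SupportedMetadata", href=uri)
        for query_type in query_types:
            ET.SubElement(desired, "SupportedQueryTypes", href=query_type)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_capability_request(
        metadata_uris=["urn:mpeg:mpeg7:schema:2004"],
        query_types=["QueryByMedia", "QueryByFreeText"],
    ))
```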
MPQF Examples: Query I
• Browsing query

<MpegQuery>
  <Query>
    <Input />
  </Query>
</MpegQuery>

• QueryByFreeText

<MpegQuery>
  <Query>
    <Input>
      <QueryCondition>
        <Condition xsi:type="QueryByFreeText">
          <FreeText>This is a free text query</FreeText>
        </Condition>
      </QueryCondition>
    </Input>
  </Query>
</MpegQuery>
MPQF Examples
• Assume: DB contains images annotated with the Corine Land Cover specification combined with MPEG-7.
• Example images show industrial or commercial units (121) in the area of Sines/Portugal [European ]
MPQF Example: Spatial Query
Give me all satellite images that show an industrial unit to the south of something else!

<MpegQuery mpqfID="">
  …
  <QueryCondition>
    <Condition xsi:type="SpatialQuery">
      <SpatialRelation relationType="urn:...:SpatialRelationCS:2008:south" sourceResource="ID1">
      </SpatialRelation>
    </Condition>
  </QueryCondition>
  …
</MpegQuery>

[Figure: example image with the source resource labelled ID1]
JPEG - JPSearch (ISO/IEC SC29 WG1) (ISO/IEC 24800)
current status
JPSearch Objectives
• Provide a standard for interoperability between image search and retrieval systems
  o by defining the interfaces and protocols for data exchange between them
• Provide an abstract framework and flexible search architecture that allows:
  o adding, updating or querying metadata of images and image collections
  o federated search across different systems
  o the integration of best-of-breed independent search components, provided by different companies
JPSearch Overall Structure
Part 2: Registration, Identification and Management of Metadata Schema (1)
• Schema identification
  o each schema is identified by a URI and an XML namespace
• Schema and transformation rules management
  o central authority hosted by JPEG
• Create a single core schema
• Definition of transformation rules
  o rules that perform semantic, structural and syntactic mapping between XML-encoded metadata descriptions in different formats
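The idea of registering schemas by URI and attaching transformation rules to them can be sketched as follows. This is only an illustration under stated assumptions: JPSearch Part 2 defines its own rule syntax, which is not shown on the slide. The `Rule` type, the registry layout, the `urn:example:...` schema URIs and the core field names `Title` and `Creator` are all hypothetical, and metadata records are simplified to flat dictionaries.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class Rule(NamedTuple):
    source_fields: Tuple[str, ...]  # field names in the registered (native) schema
    core_field: str                 # target field name in the core schema
    combine: Callable[..., str]     # how the source values become one core value


# Hypothetical registry: schema URI -> transformation rules towards the core schema.
REGISTRY: Dict[str, List[Rule]] = {
    "urn:example:schema:dublin-core-like": [
        Rule(("title",), "Title", lambda value: value),        # one-to-one
        Rule(("creator",), "Creator", lambda value: value),    # one-to-one
    ],
    "urn:example:schema:mpeg7-like": [
        # many-to-one: two structured name parts become one core field
        Rule(("GivenName", "FamilyName"), "Creator",
             lambda given, family: f"{given} {family}"),
    ],
}


def to_core(schema_uri: str, record: Dict[str, str]) -> Dict[str, str]:
    """Translate a flat native record into core-schema fields."""
    rules = REGISTRY.get(schema_uri)
    if rules is None:
        raise KeyError(f"no transformation rules registered for {schema_uri}")
    core: Dict[str, str] = {}
    for rule in rules:
        # A rule applies only when all of its source fields are present.
        if all(name in record for name in rule.source_fields):
            values = [record[name] for name in rule.source_fields]
            core[rule.core_field] = rule.combine(*values)
    return core
```

Because every registered schema maps to the one core schema, N schemas need N rule sets rather than a rule set for every pair of schemas.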
Workflow of a JPSearch request
Query Transformation
AIR: ARCHITECTURE FOR INTEROPERABLE RETRIEVAL ON DISTRIBUTED AND HETEROGENEOUS MULTIMEDIA REPOSITORIES
Summary - Outlook
• Introduced the MPEG Query Format and the JPSearch approach for improving interoperability during multimedia retrieval
• Future work
  o Establish the standards for cultural heritage projects?
Questions?
Part 3: JPSearch Query Format (FCD)
• Derived from the MPEG Query Format
• Only minor changes (namespaces, etc.)
• Restrictions:
  o No TemporalQueryType
  o Only the image domain is allowed
• Modifications:
  o QueryByROI
  o QueryByMedia
  o MIME-Type
Part 4: JPSearch File Format (CD)
• Extension of the JPEG-1/JPEG2000 file format
• Fully compatible with JPEG-1/JPEG2000; adds the ability to carry associated metadata within a file
[Figure: Overall structure of the JPEG-1-compliant version of the JPSearch file format]
Part 5: Data Interchange Format between Image Repositories (CD)
• Should be able to perform synchronization and consolidation/aggregation of repositories
• Synchronization of:
  o Meta part: identifies the content of the data part
  o Data part: image, collection of images, metadata, ontology or URI
• Relies on the image collection format of ISO/IEC 23000-3 (Photo player MAF), MP4
Workflow of a JPSearch system (1)
1. A JPQF query is received.
2. Metadata based on the core schema is transformed to the metadata of the N native schemas.
3. Additional metadata is transformed to the N native schemas (if possible; otherwise the information is discarded).
4. The query is transformed N times (optional).
5. The N queries are forwarded to the N native systems, where the interpreter transforms each query to the native query language and executes it.
6. The N individual result sets are transformed to the core schema.
7. The N result sets are aggregated into 1 result set.
8. The final result set is forwarded to the user.
[Diagram: a JPQF query enters the JPSearch-compliant system (core schema + JPTags, transformation rules, JPQF query reformulation, aggregation service) and is dispatched to several MPEG-7 MM databases, each behind an interpreter for its native query language. The numbers 1 to 8 in the diagram mark the steps listed above.]
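The eight workflow steps can be sketched as a small federated search loop. This is a simplified illustration, not the JPSearch reference behaviour: the `Hit` and `NativeSystem` types and the `federated_search` function are hypothetical names, queries and results are plain dictionaries instead of JPQF messages, and the aggregation strategy (keep the best-scoring copy of each image, then sort by score) is an assumption, since the slide only says that the result sets are aggregated.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Hit:
    image_id: str
    score: float
    metadata: Dict[str, str] = field(default_factory=dict)  # in core-schema terms


@dataclass
class NativeSystem:
    name: str
    to_native_query: Callable[[dict], dict]  # steps 2 to 4: core query -> native query
    execute: Callable[[dict], List[dict]]    # step 5: run the query in the native system
    to_core_hit: Callable[[dict], Hit]       # step 6: native result -> core schema


def federated_search(core_query: dict,
                     systems: List[NativeSystem],
                     limit: int = 10) -> List[Hit]:
    """Step 1 is the call itself; step 8 is the returned list."""
    best: Dict[str, Hit] = {}
    for system in systems:
        native_query = system.to_native_query(core_query)
        for raw in system.execute(native_query):
            hit = system.to_core_hit(raw)
            # Step 7 (assumed strategy): keep the best-scoring copy of each image.
            current = best.get(hit.image_id)
            if current is None or hit.score > current.score:
                best[hit.image_id] = hit
    ranked = sorted(best.values(), key=lambda h: h.score, reverse=True)
    return ranked[:limit]
```

Each `NativeSystem` bundles the per-repository translation functions, so adding a repository means adding one entry to the list without touching the aggregation logic.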
AIR: Planned search concepts