Current design issues for digital archives Robert Munro (presented by David Nathan) Endangered Languages Archive (ELAR), School of Oriental and African Studies, London

Mar 26, 2015


Diego Murray

Current design issues for digital archives

Robert Munro (presented by David Nathan)

Endangered Languages Archive (ELAR), School of Oriental and African Studies, London


Outline

1. Introduction

2. Archive architectures

3. Current issues
   1. value-adding interaction from ‘end’ users
   2. flexibility in access to materials
   3. granularity of description of materials

4. Conclusions


Introduction – ELAR

Part of the Hans Rausing Endangered Languages Project (HRELP).

Open for deposits since October 2005.

In the process of designing and implementing key systems.


Introduction – ELAR

ELAR will be the first language archive that allows users to:
  - add metadata in the language of their choice
  - add new metadata (comments, descriptions, links) to existing materials
  - translate metadata into a language of their choice
  - select language preference(s) for viewing existing metadata
  - add metadata to archived materials at different levels of granularity


Introduction – current issues

‘End’ users adding value to archive materials:
  - who will moderate such additions?

Flexible support of access:
  - can an archive explicitly support multilingual users?

Metadata (comments / description of materials):
  - should the granularity of description be at the level of files, collections of files, and/or subsections of a file?


Archive architectures

The classic ‘silo’ view of an archive:
  - little more than disaster-proof backup

[Diagram: Producers → Silo]


Archive architectures

The producers are not the only users:
  - different dissemination formats are required…

[Diagram: Producers → Silo → Dissemination]


Archive architectures

The producers are not the only users:
  - different dissemination formats are required…
  - …for different user communities

[Diagram: Producers → Silo → Dissemination → Designated communities]


Archive architectures

Working formats are not preservation formats:
  - materials may need to be transformed on ingest

[Diagram: Producers → Ingestion → Silo → Dissemination → Designated communities]


Archive architectures

You cannot rigidly preserve digital data:
  - files need to be refreshed and migrated to current formats

[Diagram: Producers → Ingestion → Archive → Dissemination → Designated communities]


Archive architectures

…but the objects, metadata and structures are still backed up in disaster-proof silos.

[Diagram: Producers → Ingestion → Archive → Dissemination → Designated communities]


Archive architectures

Archives need to define three types of ‘packages’: ingestion, archive and dissemination.

[Diagram: Producers → Ingestion → Archive → Dissemination → Designated communities]
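The three package types can be pictured as stages in a small pipeline. The sketch below is only an illustration, not ELAR's implementation: the `Package` class, the `ingest` and `disseminate` functions, and both format tables are hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical format choices, for illustration only.
PRESERVATION_FORMAT = {".doc": ".xml", ".aif": ".wav"}
DISSEMINATION_FORMAT = {"researchers": ".wav", "speakers": ".mp3"}


@dataclass
class Package:
    kind: str  # "ingestion", "archive" or "dissemination"
    files: list[str]
    metadata: dict[str, str] = field(default_factory=dict)


def swap_extension(name: str, new_ext: str) -> str:
    """Replace the file extension; names without a dot are returned unchanged."""
    stem, dot, _ = name.rpartition(".")
    return f"{stem}{new_ext}" if dot else name


def ingest(submitted: Package) -> Package:
    """Turn an ingestion package into an archive package.

    Working formats listed in PRESERVATION_FORMAT are mapped to their
    preservation format; everything else is kept as it is.
    """
    files = []
    for name in submitted.files:
        ext = "." + name.rpartition(".")[2]
        files.append(swap_extension(name, PRESERVATION_FORMAT.get(ext, ext)))
    return Package("archive", files, dict(submitted.metadata))


def disseminate(archived: Package, community: str) -> Package:
    """Derive a dissemination package for one designated community."""
    audio_ext = DISSEMINATION_FORMAT[community]
    files = [
        swap_extension(name, audio_ext) if name.endswith(".wav") else name
        for name in archived.files
    ]
    return Package("dissemination", files, dict(archived.metadata))
```

Only file names are rewritten here; a real pipeline would also convert the content and record each transformation in the metadata.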


Ingestion (Accession) packages

Formats & structures that can be converted to archive formats with minimal effort:
  - open file formats
  - well-documented structures: XML with schema ideal

The content needs to take into account the many potential uses of the materials:
  - high quality sound and video
  - a variety of genres
  - detailed metadata and structural information
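A first-pass check on an ingestion package might look like the sketch below. It is an assumption-laden illustration: the `OPEN_FORMATS` whitelist and the `check_submission` function are hypothetical, and the check covers XML well-formedness only. Validating against a schema would need a third-party library and is not shown.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePath

# Hypothetical whitelist of open formats; a real archive would publish its own.
OPEN_FORMATS = {".wav", ".xml", ".txt"}


def check_submission(files: dict[str, bytes]) -> list[str]:
    """Return a list of problems found in a submitted set of files.

    `files` maps each file name to its raw content.
    """
    problems = []
    for name, content in files.items():
        ext = PurePath(name).suffix.lower()
        if ext not in OPEN_FORMATS:
            problems.append(
                f"{name}: {ext or 'no extension'} is not an accepted open format"
            )
        elif ext == ".xml":
            try:
                ET.fromstring(content)  # well-formedness only, no schema check
            except ET.ParseError as err:
                problems.append(f"{name}: XML is not well-formed ({err})")
    return problems
```

An empty list means the package passed these two checks; it says nothing about the quality or completeness of the content.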


Dissemination packages

Many potential users of archived materials:
  - researchers
  - speakers
  - educators
  - publishers

With many different requirements:
  - access to materials by various methods
  - archive services
  - continuation of ownership of language materials


Current issues – value adding

The current model is fairly uni-directional:
  - but users can/should add value to archive materials

[Diagram: Producers → Ingestion → Archive → Dissemination → Designated communities]


Current issues – value adding

Users should be able to add to existing materials:
  - speakers’ comments on content
  - results of recent research

[Diagram: Producers → Ingestion → Archive → Dissemination → Designated communities]


Current issues – value adding

The archive needs to trust certain users to add metadata to existing materials:
  - should the identity of users be recorded / open?
  - should users be able to challenge existing metadata?

Who to trust?
  - depositors cannot moderate all comments on objects, especially if comments can be in any language
  - but can an archive deny a speaker’s request to add comments to a recording of them speaking?
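One way to make these questions concrete is to write down a candidate policy. The sketch below records each comment's author and language, and publishes it without moderation only when the author is on a trusted list or is a speaker in the recording concerned. This rule is an assumption chosen for illustration, not a policy stated in the presentation, and the `Comment`, `Status` and `submit` names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # waiting for a moderator
    PUBLISHED = "published"  # visible to other users


@dataclass
class Comment:
    object_id: str  # the archived object the comment is attached to
    author: str     # identity is recorded; whether it is shown is a separate choice
    language: str
    text: str
    status: Status = Status.PENDING


def submit(
    comment: Comment,
    trusted: set[str],
    recorded_speakers: dict[str, set[str]],
) -> Comment:
    """Apply one possible trust rule to a newly submitted comment.

    The comment is published at once if its author is trusted by the archive
    or is one of the speakers in the recording; otherwise it stays pending.
    """
    speakers = recorded_speakers.get(comment.object_id, set())
    if comment.author in trusted or comment.author in speakers:
        comment.status = Status.PUBLISHED
    return comment
```

Under this rule a speaker's request to comment on a recording of themselves is never denied, and the moderation load falls only on comments from everyone else.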


Current issues – flexibility of access

The archive cannot create different dissemination packages for every language and/or user:

[Diagram: Producers → Ingestion → Archive → Dissemination → Designated communities]


Current issues – flexibility of access

Users should be able to personalize access:
  - language preference(s) for metadata
  - preference on type of materials

[Diagram: Producers → Ingestion → Archive → Dissemination → Designated communities]


Current issues – flexibility of access

Flexibility of search / browse:
  - keyword ‘search engine’ type search
  - rich relationships between objects for browsing
  - geographic searches
  - research community specific search


Current issues – flexibility of access

Flexibility of language:
  - most metadata in most archives is in English
  - should metadata be multilingual?


Current issues – flexibility of access

If a user prefers to speak Quechua, then Spanish, then English:
  - rather than accessing via one interface per language…

bread
Photograph by Juan Pérez Martínez
January 2006

OR

pan
Fotografía tomada por Juan Pérez Martínez
Enero 2006

OR …


Current issues – flexibility of access

If a user prefers to speak Quechua, then Spanish, then English:
  - …users should get all languages at once, according to availability of data and their preferences:
      label in Quechua: t’anta
      photographer in Spanish: Fotografía tomada por Juan Pérez Martínez
      date in English: January 2006
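The per-field fallback in this example can be sketched in a few lines. This is a minimal illustration rather than ELAR's design: the `resolve` function and the record layout are hypothetical, the language codes (`qu`, `es`, `en`) are ISO 639-1 codes chosen for the example, and the record assumes, as the slide does, that the date is available only in English.

```python
def resolve(
    record: dict[str, dict[str, str]],
    preferences: list[str],
) -> dict[str, str]:
    """Pick, for each field, the value in the most preferred available language.

    `record` maps a field name to its translations, keyed by language code.
    Fields with no value in any preferred language are left out of the result.
    """
    view = {}
    for field_name, translations in record.items():
        for lang in preferences:
            if lang in translations:
                view[field_name] = translations[lang]
                break
    return view


# Metadata for the photograph in the example above.
record = {
    "label": {"qu": "t’anta", "es": "pan", "en": "bread"},
    "photographer": {
        "es": "Fotografía tomada por Juan Pérez Martínez",
        "en": "Photograph by Juan Pérez Martínez",
    },
    "date": {"en": "January 2006"},
}

# Quechua first, then Spanish, then English.
view = resolve(record, ["qu", "es", "en"])
```

Because the choice is made field by field, adding a Quechua translation of the date later changes that one field for this user without any new dissemination package being built.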


Current issues – granularity

Archives tend to treat archived files as ‘atomic’:
  - metadata only refers to files as a whole

What about:
  - a specific comment about a 20 second subsection of the file?
  - a general comment applying to many files?


Current issues – granularity

For example, suppose we have an annotated sound recording of some event:


Current issues – granularity

Some metadata is about the file as a whole:
  - date recorded, speakers, title


Current issues – granularity

Some metadata is about sub-segments:
  - name of a significant person or place
  - specific linguistic phenomena


Current issues – granularity

It is likely that users will want to:
  - add comments to such subsections
  - richly link subsections to other items
  - make unambiguous reference to subsections

At the time of deposit, no one can predict which subsections of files will later be significant:
  - users need to be able to explicitly define subsections of archive objects
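User-defined granularity can be modelled by letting every piece of metadata name its own target: one file, several files, or a time span within a file. The sketch below is a hypothetical data model, not ELAR's. The `Target`, `Annotation` and `annotations_at` names are invented for this example, and the `#t=start,end` reference string is loosely modelled on the W3C Media Fragments syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    """What a piece of metadata is about: whole file(s), or a time span."""

    file_ids: tuple[str, ...]
    start_s: Optional[float] = None  # None means "the file(s) as a whole"
    end_s: Optional[float] = None

    def __post_init__(self) -> None:
        if (self.start_s is None) != (self.end_s is None):
            raise ValueError("give both start_s and end_s, or neither")

    def ref(self) -> str:
        """An unambiguous reference string for citing or linking the target."""
        files = ",".join(self.file_ids)
        if self.start_s is None:
            return files
        return f"{files}#t={self.start_s:g},{self.end_s:g}"


@dataclass
class Annotation:
    target: Target
    text: str


def annotations_at(
    annotations: list[Annotation], file_id: str, time_s: float
) -> list[Annotation]:
    """All annotations that apply to `file_id` at `time_s` seconds."""
    return [
        a
        for a in annotations
        if file_id in a.target.file_ids
        and (a.target.start_s is None or a.target.start_s <= time_s < a.target.end_s)
    ]
```

For example, `Target(("rec-01",), 95.0, 115.0)` describes a 20 second subsection of one recording, while `Target(("rec-01", "rec-02"))` carries a general comment applying to two files. Neither has to be anticipated at the time of deposit.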


Conclusions

Archives are not static repositories:
  - an archive supports materials for multiple different user communities in parallel

Value-adding interaction:
  - archived materials can be further enriched by users

Flexibility in access to materials:
  - personalizable interaction with archive materials

Granularity of description of materials:
  - user defined granularity of materials


Thank you