Experiences (mis)managing archaeological data

Post on 09-May-2015


DESCRIPTION

Presentation given by Jeremy Huggett, Head of Archaeology, University of Glasgow, made at the 'Managing Archaeology Data' event on Monday 7th March 2011.

Transcript

Experiences (mis)managing archaeological data

What’s on my hard disk?

• Cemetery PhD research data, results published

• Two unpublished excavation projects

• Approx 14 surveys, some published, some not

• Backed up by virtue of University network

• None archived

The Primacy of Archaeological Data

“Archaeological research data has a primacy which requires that it must be preserved at all costs. ‘Excavation is destruction’ – the ‘unrepeatable experiment’ – and the digital record may be the only record of precious heritage assets”

ADS report to AHRC (2011)

Professional standards

“Digital material should be fully documented and created according to recognised standards and guidelines as made available by the Archaeology Data Service …”

Institute for Archaeologists: A Standard and Guidance for the creation, compilation, transfer and deposition of archaeological archives

Proof of Problem

• Newham Museum Archaeological Service closed in 1998

• About 230 floppy disks containing over 6000 files totalling over 130 Mb of data.

• Files in a variety of proprietary software formats and versions, some of which are now 'archaic'

http://archaeologydataservice.ac.uk/archives/view/newham/

Proof of Problem (II)

• Closure of GUARD

• Files stored on College of Arts network drives

• “Copy to DVD and send to RCAHMS”

• …

People, we have a crisis …

“Archaeology is in a special position with respect to archiving because the act of data creation, e.g. archaeological excavation, results in the destruction of the primary archaeological evidence itself. Increasingly, the digital record may be the only source of precious research materials”

… or do we?

“All born digital material should be included in the archive, together with appropriate digital material compiled from paper records.”

Institute for Archaeologists: A Standard and Guidance for the creation, compilation, transfer and deposition of archaeological archives

Distinctively Different

“All born digital material should be included in the archive, together with appropriate digital material compiled from paper records.” (IfA)

• Born digital or digital surrogate?

Born Digital

• Data originating in digital form

• Data not intended to have an analog equivalent

• Born Digital Data are not:

– created as a result of converting analog originals

– printed to paper

Digital Preservation Coalition: Digital Preservation Handbook

Digital surrogates

• Digitised from physical items (a scan of a slide or plan, a data table)

• May be a partial, incomplete view

• May be captured at a limited resolution

• Include research databases captured from analog originals

SCAN: 150dpi sharpen filter

Surrogates

Surrogates

RASTER to VECTOR TRACE

Surrogates in action

Daub with surfaces Nails

How many data are born digital?

• How many archived projects solely consist of born digital data?

• How many data are actually surrogates?

• How many surrogates are suitable for re-use?

• If the originals are retained, what are the cost-benefits of archiving digital surrogates?

– Improved access

– Enhanced analysis compared to physical examination

The tyranny of archaeological primacy?

• Do we over-state the case?

• Does the hi-tech approach scare off well-meaning people?

– Metadata schema

– Ontologies …

• Can there be a scaled approach which includes the less-than-ideal?

An inclusive approach

Tim Berners-Lee “Open, Linked Data for a Global Community” Gov 2.0 Expo 2010

http://www.youtube.com/watch?v=ga1aSJXCFe0

make it available

make it available in machine readable form (e.g. Excel file rather than scan of a table)

make it available in a non-proprietary format (e.g. CSV rather than Excel)

link data format with URLs to identify things so people can point at it

link your data to other people’s data
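As a rough illustration of the "machine readable" and "non-proprietary" steps, the sketch below writes a small table to CSV text using only the Python standard library. The records, field names and the `to_csv` helper are hypothetical and stand in for whatever a project database actually holds; it is a minimal sketch, not a recommended archive workflow.

```python
import csv
import io

# Hypothetical finds records; field names and values are illustrative only.
finds = [
    {"context": "101", "material": "daub", "count": 12},
    {"context": "102", "material": "nail", "count": 3},
]


def to_csv(records, fieldnames):
    """Serialise a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(finds, ["context", "material", "count"])
print(csv_text)
```

The resulting text can be opened in any spreadsheet or text editor, which is the point of preferring it over a proprietary file format.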

Levels of Archiveishness

1. Data simply made available for access

2. Data made available for re-use with some contextual information

3. Data made available for re-use with complete metadata

= A Tiered Archive?
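One way to picture the three levels is as a simple classification of what accompanies a dataset. The sketch below is only an assumption about how such a tiering might be recorded: the `Dataset` fields and the `archive_level` function are hypothetical names, and level 0 (not available at all) is added here for completeness.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical flags describing what has been done with a dataset.
    accessible: bool = False         # data can be obtained by others
    has_context: bool = False        # some contextual information supplied
    has_full_metadata: bool = False  # complete metadata supplied


def archive_level(ds: Dataset) -> int:
    """Return 0 (not available) or the tier 1-3 from the list above."""
    if not ds.accessible:
        return 0
    if ds.has_full_metadata:
        return 3  # assumed to subsume "some contextual information"
    if ds.has_context:
        return 2
    return 1


print(archive_level(Dataset(accessible=True, has_context=True)))
```

Under this reading, data that is merely backed up on a network drive but not accessible to anyone else would sit at level 0.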

I promise I will archive my data

I promise I will archive my data

I promise I will archive my data

I promise I will archive my data

… eventually
