The Data Documentation Initiative: more discussion Chuck Humphrey University of Alberta Atlantic DLI Workshop 2005, Acadia University

Jan 21, 2016


Claire Turner

The Data Documentation Initiative: more discussion

Chuck Humphrey

University of Alberta

Atlantic DLI Workshop 2005, Acadia University


Outline

• Two metadata challenges

• The evolution of DDI

• The story of the RDC / DDI project


Metadata Challenge #1

The correspondence between metadata and the tools that make use of it tends to be one-to-one.

Metadata            Tool
MARC                OPAC
SPSS syntax file    SPSS
PDF file            Acrobat
SAS commands        SAS


Metadata Challenge #1

And we have to create and re-create metadata for each application.

Metadata            Tool
MARC                OPAC
SPSS syntax file    SPSS
PDF file            Acrobat
SAS commands        SAS


Metadata Models for Data

• Here are some examples of historical metadata models for social science data. Notice that the characteristics of the metadata were bound by the tools of the day.


Metadata Challenge #2

Our application tools have tended to constrain our metadata.

Direction of control:

1. Desired service: title search
2. Choose a tool: card catalog
3. Tool defines the metadata format: 3x5 card
4. We create metadata to fit a format: small (e.g., 3 subject headings max.)


Metadata Challenge #2

Consider how the maximum length of variable labels in various statistical packages has constrained our brief variable descriptions.

Statistical Package    Max. Length of Var. Labels (characters)
SPSS 7.5               120
SPSS 12                255
SAS 6.12               40
SAS 8.0                256
STATA 6.0              80
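
To make the constraint concrete, here is a minimal Python sketch that uses the limits from the table above to show how much of a description each package can hold. The label text and the function names are hypothetical, chosen only for illustration, and plain truncation is an assumption about how a too-long label would be made to fit.

```python
# Maximum variable-label lengths in characters, taken from the table above.
MAX_LABEL_LENGTH = {
    "SPSS 7.5": 120,
    "SPSS 12": 255,
    "SAS 6.12": 40,
    "SAS 8.0": 256,
    "STATA 6.0": 80,
}


def fit_label(label: str, package: str) -> str:
    """Cut a label down to the longest form the given package accepts."""
    return label[:MAX_LABEL_LENGTH[package]]


def characters_lost(label: str, package: str) -> int:
    """Count how many characters of the label the package cannot hold."""
    return max(0, len(label) - MAX_LABEL_LENGTH[package])


# Hypothetical description, written without any tool limit in mind.
label = (
    "Number of hours per week the respondent usually spent on unpaid "
    "child care in the past twelve months"
)

for package in MAX_LABEL_LENGTH:
    lost = characters_lost(label, package)
    print(f"{package:10} keeps {len(fit_label(label, package)):3d}, loses {lost:3d}")
```

A description written to fit the smallest limit would carry that limit into every later tool, which is the constraint this slide describes.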


Metadata Challenge #2

The dilemma created by limiting our metadata to current tools is that when new tools arise or new services are sought that can make use of richer metadata, we will not have created it and must face re-creating the metadata.


Lessons from These Challenges

Metadata should be created to go beyond simple one-to-one use and should be reusable for more than one purpose.

Metadata should be created to describe data, not to meet the needs of one system, one service.
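
One way to picture these lessons is a single tool-neutral description of a variable that is rendered into each tool's syntax on demand, rather than being written separately for each tool. The sketch below is only an illustration: the `Variable` class and the function names are hypothetical, and it assumes labels contain no apostrophes. The SAS side is limited to the LABEL statement, since value labels in SAS are normally defined separately with PROC FORMAT.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """Tool-neutral description of one variable (hypothetical structure)."""
    name: str
    label: str
    categories: dict[int, str] = field(default_factory=dict)


def to_spss(var: Variable) -> str:
    """Render the description as SPSS syntax."""
    lines = [f"VARIABLE LABELS {var.name} '{var.label}'."]
    if var.categories:
        pairs = " ".join(f"{code} '{text}'" for code, text in var.categories.items())
        lines.append(f"VALUE LABELS {var.name} {pairs}.")
    return "\n".join(lines)


def to_sas(var: Variable) -> str:
    """Render the same description as a SAS LABEL statement."""
    return f"label {var.name} = '{var.label}';"


# The description is written once and reused for both packages.
sex = Variable("SEX", "Sex of respondent", {1: "Male", 2: "Female"})
print(to_spss(sex))
print(to_sas(sex))
```

Adding support for another tool then means adding another rendering function, not re-creating the metadata.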


Applying These Lessons

Function: Proposal, Sample design, Questionnaire, Pre-test, Revisions, Collection, Processing, Dissemination

Tools: Blaise, SAS, IMDB, PDF, Word, Paper

Metadata: DDI

Uses: IMDB, Nesstar, Oracle, Library OPAC, Stat Software, Google, PDF, print, html, RSS, DDI 3, 4 ...


DDI Versions 1 & 2

1.0 Document Description
2.0 Study Description
3.0 Data Files Description
4.0 Variable Description
5.0 Other Study-related Materials

The first two versions of DDI were modeled after the traditional ‘codebook’ made up of a user’s guide, data dictionary and record layout.
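
As a rough illustration of this codebook structure, the Python sketch below builds an empty XML skeleton with one element per section listed above. The element names (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, `otherMat`, `var`, `labl`) are the ones DDI Codebook is generally understood to use, but treat them as an assumption here. The sample variable is hypothetical, and the result is not a schema-valid DDI instance.

```python
import xml.etree.ElementTree as ET

# Assumed element name for each section of the outline above.
SECTIONS = {
    "docDscr": "Document Description",
    "stdyDscr": "Study Description",
    "fileDscr": "Data Files Description",
    "dataDscr": "Variable Description",
    "otherMat": "Other Study-related Materials",
}


def build_codebook_skeleton() -> ET.Element:
    """Create a codeBook root with one empty child per section."""
    root = ET.Element("codeBook")
    for tag in SECTIONS:
        ET.SubElement(root, tag)
    return root


root = build_codebook_skeleton()

# Hypothetical variable entry placed in the variable description section.
var = ET.SubElement(root.find("dataDscr"), "var", name="AGE")
ET.SubElement(var, "labl").text = "Age of respondent"

for tag, title in SECTIONS.items():
    print(f"{title}: <{tag}>")

ET.indent(root)  # pretty-printing; requires Python 3.9 or later
print(ET.tostring(root, encoding="unicode"))
```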


DDI Version 3 (Draft)

The draft for Version 3 is based on a process model and attempts to describe the stages within data creation using a life cycle perspective.

1. Start up 2. Planning 3. Execution 4. Close Out


Project Partnerships

• RDC Network

  • RDCs in the pilot include McMaster, Prairie and Alberta

  • RDC Central has a Nesstar licence

• DLI

  • DLI Central shares the Nesstar licence and is working on converting PUMFs to DDI

  • DLI EAC approved joining the DDI Alliance


Project Partnerships

• General Social Survey

  • Permission to use Cycle 17 in the pilot

  • Provided a contact to assist with the data documentation

• Standards Division

  • Interested in a pilot that would expose the issues of using DDI to document data


Project Operation

• No formal budget at this point. All contributions to the project are in kind.

• Irene Wong is conducting the evaluation and creation of DDI documentation in the Alberta RDC.

• Sharon Neary, associated with the Prairie RDC, is coordinating training for end-users.


Project Operation

• Byron Spencer is coordinating an evaluation of the Nesstar application of DDI in the McMaster RDC with end-users.

There is a need for data discovery tools in DLI and the RDCs.


Project Status

• The DDI-compliant documentation for the GSS Cycle 17 master file has been completed and is now being tested at McMaster’s RDC.

• Irene is completing a report describing the process of creating the DDI version of the documentation and an assessment of DDI strengths and weaknesses.


Metadata Life-Cycle Research

One outcome of this project will be an account of how much metadata is produced over the life cycle of a survey and of the existing tools in which this metadata is created and stored.