The Data Documentation Initiative (DDI) Ron Nakao Social Science Data and Software (SSDS) Stanford University Libraries With input from Gretchen Gano, Sanda Ionescu, Jim Jacobs, Nancy McGovern, Wendy Thomas, Mary Vardigan Presented to the DLF Fall Forum 2007 - Philadelphia, PA A Metadata Specification for Social Science Data
The Data Documentation Initiative (DDI)
Ron Nakao
Social Science Data and Software (SSDS)
Stanford University Libraries
With input from Gretchen Gano, Sanda Ionescu, Jim Jacobs, Nancy McGovern, Wendy Thomas, Mary Vardigan
Presented to the DLF Fall Forum 2007 - Philadelphia, PA
A Metadata Specification for Social Science Data
Presentation Overview
What is the DDI? What is the DDI Alliance?
A taste of the DDI specification
Futures
What is the DDI?
“The Data Documentation Initiative (DDI) is an effort to establish an international XML-based standard for the content, presentation, transport, and preservation of documentation for datasets in the social and behavioral sciences.”
What is the DDI Alliance?
- Host Institutions
- Member Institutions
- Organization Structure: Director, Steering Committee, Expert Committee, Working Groups
DDI Alliance Host Institutions and Associations
Inter-university Consortium for Political and Social Research (ICPSR) - b.1962
Roper Center for Public Opinion Research - b.1946+
Council of European Social Science Data Archives (CESSDA) - b.1976
International Federation of Data Organizations (IFDO) - b.1977
International Association of Social Science Information, Service, and Technology (IASSIST) - b.1974+
DDI Alliance Member Institutions (30)
- University of Alberta, Canada
- University of California, Berkeley: Computer-Assisted Survey Methods Program and UCDATA
- University of California, California Digital Library
- Centre for Survey Research and Methodology (ZUMA)
- Centro De Investigaciones Sociologicas (CIS), Spain
- CEPS/INSTEAD, Luxembourg
- Danish Data Archive
- Data Archiving and Networked Services (DANS), The Netherlands
- Emory University
- Finnish Social Science Data Archive
- German Socio-Economic Panel Study (SOEP)
- University of Guelph, Canada
- Harvard-MIT Data Center
- Inter-university Consortium for Political and Social Research (ICPSR)
- Massachusetts Institute of Technology (MIT)
- University of Minnesota
- National Opinion Research Center (NORC)
- Norwegian Social Science Data Service (NSD)
- Open Data Foundation, Tucson, Arizona
- Princeton University
- Roper Center
- Stanford University
- University of Surrey, United Kingdom
- Swedish Social Science Data Service (SSD)
- Swiss Data Archive for the Social Sciences (SIDOS)
- United Kingdom Data Archive (UKDA)
- University of Wisconsin
- World Bank, Development Data Group (DECDG)
- Yale University
- Zentralarchiv fuer Empirische
Concept of DDI and definition of needs grew out of the data archival community
1995 - DDI efforts initiated by ICPSR
1997 - XML DTD released
2000 - DDI 1.0 released
2003 - DDI 2.0 released; DDI Alliance formed
2007 - DDI 3.0 Candidate Draft Release
2008 - DDI 3.0 Final Release
DDI: Early Development
2000 – DDI 1.0
- Simple survey
- Archival data formats
- Microdata only
2003 – DDI 2.0
- Aggregate data (based on matrix structure)
- Added geographic material to aid geographic search systems and GIS users
DDI versions 1 & 2
- Document Description
- Study Description
- Data Files Description
- Variable Description
- Other Study-Related Materials
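The five sections listed above map onto the top-level elements of a DDI 1/2 codebook. The following is a minimal sketch of that skeleton, not a valid DDI instance: the element names (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, `otherMat`) are given as recalled from the DDI codebook DTD and do not appear on the slides, so check them against the published DTD before relying on them. The helper name `build_codebook_skeleton` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Slide section name -> top-level element name in the DDI 1/2 codebook DTD.
# Assumption: the element names are recalled from the DTD, not from the slides.
SECTIONS = [
    ("Document Description", "docDscr"),
    ("Study Description", "stdyDscr"),
    ("Data Files Description", "fileDscr"),
    ("Variable Description", "dataDscr"),
    ("Other Study-Related Materials", "otherMat"),
]


def build_codebook_skeleton():
    """Return an empty <codeBook> element with one child per section."""
    root = ET.Element("codeBook")
    for label, tag in SECTIONS:
        child = ET.SubElement(root, tag)
        # Record the human-readable section name as an XML comment.
        child.append(ET.Comment(label))
    return root


skeleton = build_codebook_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

A real codebook would fill each section with required child elements (titles, variable definitions, file descriptions); the sketch only shows how the five sections sit side by side under one root.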
DDI 3: The Data Life Cycle
Capturing the Data Life Cycle

Study Unit
- Research question
- Funding
- Concepts
- Background research

Data Collection
- Instrument
- Data collection process
- Questionnaire

Logical Product
- Intellectual content of data
- Relationship to questions and concepts
- Relationship to processing (recodes, weighting, derivations, imputations)

Physical Data Product
- Describes the structure (microdata, tabular, aggregate, Ncube…)

Physical Instance
- Each describes a single data file (e.g., Census data by state... each state is an instance)

“Instance” (METS-inspired)
- An instance module “wraps” the other modules. Like a table of contents to a group of studies and files and modules, it brings everything together.

Archive
- Each archive can add its own local information with an archive module.

Group module
- Describe concepts, questions, and variables that occur in several studies.
- Describe a series (e.g., CPS, Eurobarometer)
- Describe a collection of studies (not a series) and identify the common comparable concepts, questions and variables.

Comparative module
- The Comparative module contains information for comparing concepts, questions, and variables between or among Study Units that have been housed in a Group.
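The way the life-cycle modules nest can be sketched as a small data model. This is an illustration under stated assumptions, not the DDI 3.0 schema: the class and field names are hypothetical, the study titles and variable names are invented sample data, and the "comparison" is reduced to finding variable names shared by every study in a group.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhysicalInstance:
    file_name: str  # one data file, e.g. one state's file in a census release


@dataclass
class StudyUnit:
    title: str
    research_question: str = ""                                   # Study Unit
    instruments: List[str] = field(default_factory=list)          # Data Collection
    variables: List[str] = field(default_factory=list)            # Logical Product
    structure: str = "microdata"                                  # Physical Data Product
    files: List[PhysicalInstance] = field(default_factory=list)   # Physical Instances


@dataclass
class Group:
    name: str
    studies: List[StudyUnit] = field(default_factory=list)

    def shared_variables(self) -> List[str]:
        """Variables found in every study: a stand-in for the Comparative module."""
        if not self.studies:
            return []
        common = set(self.studies[0].variables)
        for study in self.studies[1:]:
            common &= set(study.variables)
        return sorted(common)


@dataclass
class Instance:
    """Wraps everything, like a table of contents."""
    groups: List[Group] = field(default_factory=list)

    def table_of_contents(self):
        return [(g.name, s.title, f.file_name)
                for g in self.groups for s in g.studies for f in s.files]


# Invented sample data, for illustration only.
wave1 = StudyUnit("Example Survey, wave 1", variables=["age", "income", "region"],
                  files=[PhysicalInstance("wave1.dat")])
wave2 = StudyUnit("Example Survey, wave 2", variables=["age", "employment", "income"],
                  files=[PhysicalInstance("wave2.dat")])
series = Group("Example series", studies=[wave1, wave2])
instance = Instance(groups=[series])

print(series.shared_variables())
print(instance.table_of_contents())
```

The point of the nesting is that study-level description is written once per Study Unit, while the Group and Instance layers only reference and organize it.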
DDI 3.0 Geography Example
    <r:SouthLatitude>+13.71</r:SouthLatitude>
    <r:NorthLatitude>+76.63</r:NorthLatitude>
  </r:BoundingBox>
  <r:Description translated="false" translatable="true">
    <xhtml:p>United States, Region, Division, State, County, County Subdivision, Place, Tract/Block Numbering Area within Place/Remainder within County Subdivision.</xhtml:p>
  </r:Description>
  <r:SpatialObject>Polygon</r:SpatialObject>
  <r:GeographicStructure>
    <r:Geography>
      <r:Identification>
        <r:ID>G001</r:ID>
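Because the geography is plain XML, the bounding-box values in the excerpt can be read by any XML parser, which is what makes it usable by geographic search systems. The sketch below rests on assumptions: the slide shows only a fragment, so the snippet wraps the two latitude elements in a complete `r:BoundingBox`, and the namespace URI bound to the `r` prefix is a placeholder rather than the real DDI 3.0 namespace. The function name `latitude_range` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (assumption); substitute the URI declared in a real DDI 3.0 file.
NS = {"r": "urn:example:ddi-reusable"}

# A well-formed stand-in for the fragment on the slide, using its element names and values.
SNIPPET = (
    '<r:BoundingBox xmlns:r="urn:example:ddi-reusable">'
    "<r:SouthLatitude>+13.71</r:SouthLatitude>"
    "<r:NorthLatitude>+76.63</r:NorthLatitude>"
    "</r:BoundingBox>"
)


def latitude_range(xml_text):
    """Return (south, north) in degrees from a BoundingBox element."""
    box = ET.fromstring(xml_text)
    south = float(box.findtext("r:SouthLatitude", namespaces=NS))
    north = float(box.findtext("r:NorthLatitude", namespaces=NS))
    # Basic sanity check before handing the values to a search or GIS tool.
    if not -90.0 <= south <= north <= 90.0:
        raise ValueError(f"implausible latitude range: {south}..{north}")
    return south, north


south, north = latitude_range(SNIPPET)
print(f"Coverage spans {south} to {north} degrees latitude")
```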
DDI - User Community
- Data archives and libraries worldwide (e.g., ICPSR, CESSDA)
- Health Canada
- Statistics Canada
- World Bank
- WHO (World Health Surveys)
- Gallup-Europe
- Metadata Management Toolkit (IHSN)
International Household Survey Network (IHSN)
To coordinate and improve survey collecting operations in developing countries
Metadata Management Toolkit: developed to support the survey collection activities of the International Household Survey Network (IHSN)
Sponsors: 18 organizations, such as ILO, UNESCO, World Bank, UNICEF, WHO, UNDP, Eurostat
Goal: improve the quality of collected data and encourage more dissemination and long-term preservation
100% DDI compliant
Futures
- Continued development of DDI
- Outreach, train, promote
- Expand Alliance membership
- Foster tools development
- Build ties & interoperability with other metadata specifications
- Funding
- ISO Standard status