Data Citation for the Social Sciences Mary Vardigan ICPSR CODATA Conference on Data Attribution and Citation August 22-23, 2011

Dec 17, 2015

Delphia Hardy

Data Citation for the Social Sciences

Mary Vardigan ICPSR

CODATA Conference on Data Attribution and Citation August 22-23, 2011

Today’s Presentation

• Norms in the social sciences and implications for data citation

• Summary of major citation issues for social science

Knowledge claims

• Social science advances through knowledge claims published in the literature

• Need to verify and extend claims; secondary analysis encouraged

• Follows that data need to be available for reuse and cited

Data sharing

• Strong tradition of data sharing, both formal and informal

• Active social science data archives around the world

• Some PIs distribute data on Web sites

• Pienta, Alter, and Lyle found 88.5% of data generated not publicly archived (since 1985)

Metadata

• Metadata play important role – Documentation necessary to understand the data

• Questionnaires, user guides, methodology descriptions, record layouts also provided

• Heterogeneous in format – most unstructured

• Data Documentation Initiative (DDI) seeks to provide a structured metadata standard

Granularity and versioning

• “Studies” may be single datasets or aggregations

• Also a need to cite data subsets that support the findings in publications

• Data are sometimes updated and need to be versioned
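
One way to picture the versioning and subset points above is a citation record that carries both explicitly. This is only a minimal sketch: the `DataCitation` class, its field names, the layout produced by `render`, and the example values are all hypothetical and do not reproduce ICPSR's or any other citation standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical citation record; field names are illustrative only."""
    authors: str
    year: int
    title: str
    distributor: str
    version: str                  # data may be updated, so the version is cited
    identifier: str               # persistent identifier, e.g. a DOI or handle
    subset: Optional[str] = None  # description of the subset actually analyzed

    def render(self) -> str:
        """Return a one-line citation in an assumed, non-standard layout."""
        text = (f"{self.authors} ({self.year}). {self.title} "
                f"[Version {self.version}]. {self.distributor}. {self.identifier}")
        if self.subset:
            text += f" Subset: {self.subset}."
        return text


# Placeholder values, not a real study or a registered identifier.
study = DataCitation(
    authors="Example, A.",
    year=2011,
    title="Example Survey",
    distributor="Example Archive",
    version="2",
    identifier="doi:10.0000/example",
)
print(study.render())
```

Keeping the version and subset as separate fields means a later update to the data yields a visibly different citation, while the study-level identifier stays the same.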

Content and formats

• Mostly quantitative data and some qualitative

• Boundaries blurring between social science and other domains

• Survey data supplemented by biomarker data

• Survey data merged with administrative records

• Trend toward complex collections

• Social media data

• Video, audio data

Confidentiality concerns

• Survey respondents promised anonymity, a critical pledge to uphold

• Legal agreements required for restricted data use

• New mechanisms to analyze restricted data online emerging – virtual enclaves and virtual datasets

• Often a public-use version and restricted versions coexist

Replication

• Most claims not able to be replicated based on information in publications

• Replication archives -- ICPSR, Dataverse, etc.

• What is required is chain of evidence and record of decisions – deep citation and provenance

• Need both production transparency (record of decisions in transforming data) and analytic transparency (how conclusions drawn)
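
The "record of decisions" idea above can be sketched as a small provenance log that fingerprints the data before and after each transformation. This is an illustrative assumption, not a method described in the talk: the `record_step` helper, the log fields, the sample data, and the missing-value rule are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest used as a fingerprint of the data."""
    return hashlib.sha256(data).hexdigest()


def record_step(log: list, description: str, before: bytes, after: bytes) -> None:
    """Append one transformation decision, with input and output fingerprints."""
    log.append({
        "step": len(log) + 1,
        "description": description,
        "input_sha256": sha256_of(before),
        "output_sha256": sha256_of(after),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    })


# Made-up data; the rule that -9 means "missing" is an assumption for the example.
raw = b"id,income\n1,52000\n2,-9\n"
cleaned = raw.replace(b"-9", b"")

log = []
record_step(log, "Recode -9 (missing) to blank in income", raw, cleaned)
print(json.dumps(log, indent=2))
```

A log like this covers production transparency only; analytic transparency would additionally need the analysis scripts themselves.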

Some tradition of citation

• Citation standard for machine-readable files created in 1979

• Citations available from data providers -- Census Bureau and ICPSR since late 1980s

• Journals just beginning to cite data

• Persistent identifiers: DOIs or handles
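
As a small illustration of what a persistent identifier buys a citation, the sketch below normalizes a DOI written in several common forms into a link through the public `https://doi.org/` resolver. The `doi_to_url` helper is hypothetical, the example DOI is a placeholder rather than a registered identifier, and the function does not percent-encode special characters.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI written in a common form into an https://doi.org/ link."""
    cleaned = doi.strip()
    lowered = cleaned.lower()
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if lowered.startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    cleaned = cleaned.strip()
    # DOI names begin with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Placeholder DOI used only for illustration.
print(doi_to_url("doi:10.0000/example"))
```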

Journal practices

• Historically little effort to standardize or verify data references in publications

• Growing movement to require data behind findings to be publicly available

• AER: Will publish only if “data used in the analysis are clearly and precisely documented and readily available for replication.”

Influencing journals

• Data-PASS campaign to influence journals sponsored by professional associations

• Wrote to major professional associations demonstrating inconsistencies in citing data

• Success with American Sociological Review, which changed submission criteria

Linking data and publications

• ICPSR has done this since the beginning in 1962

• Now a Bibliography of 60K citations to publications with two-way linking to data

• Vendors like Thomson Reuters now interested in these linkages

Summary -- Citation issues for social science

• Versioning – Data can be dynamic

• Unit/Granularity – What is optimal?

• Importance of metadata – How to create durable link?

• Replication – Cite subsets and replication/workflow files containing scripts?

Thank you…

–Mary Vardigan [email protected]