The Role of Abstract and Citation Databases in Supporting Data Repositories
DataCite Workshop: Möglichkeiten und neue Lösungen im Forschungsdatenmanagement
Köln, 12/12/2012
Michael Habib, MSLS
Product Manager, Scopus
[email protected]
Broadest source for research answers
A rich and extended coverage including 19,804 active titles:
• 18,819 peer-reviewed journals
• 405 trade journals
• 332 book series
• 248 conference series
2 million new records are added each year via daily updates
Total average processing time: 5 days
Breadth of coverage across subject areas
More than 19,500 titles in Scopus; a title can be in more than one subject area
Health Sciences 6,300
• (100% Medline)
• Nursing
• Dentistry
• etc.
Social Sciences 6,350
• Psychology
• Economics
• Business
• Arts & Humanities (A&H)
• etc.
Life Sciences 4,050
• Neuroscience
• Pharmacology
• Biology
• etc.
Physical Sciences 6,600
• Chemistry
• Physics
• Engineering
• etc.
Broader coverage than nearest peer
• Scopus (total: 19,809)
• Web of Science (total: 12,311)
• Scopus only: 8,432
• Both: 11,377
• Web of Science only: 934
Source: www.jisc-adat.com
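The five figures on this slide fit a two-set overlap: 8,432 titles in Scopus only, 934 in Web of Science only, and 11,377 in both. The short check below is a sketch rather than part of the original study; the function names are illustrative. It confirms that the parts add up to the stated totals and derives the overlap as a share of each database.

```python
# Figures quoted on the slide (source: www.jisc-adat.com).
SCOPUS_TOTAL = 19_809
WOS_TOTAL = 12_311
SCOPUS_ONLY = 8_432
WOS_ONLY = 934
BOTH = 11_377


def venn_is_consistent(total_a, total_b, only_a, only_b, both):
    """Return True if the exclusive parts plus the overlap match both totals."""
    return only_a + both == total_a and only_b + both == total_b


def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


consistent = venn_is_consistent(SCOPUS_TOTAL, WOS_TOTAL, SCOPUS_ONLY, WOS_ONLY, BOTH)
wos_titles_also_in_scopus = share(BOTH, WOS_TOTAL)
scopus_titles_also_in_wos = share(BOTH, SCOPUS_TOTAL)

print(consistent, wos_titles_also_in_scopus, scopus_titles_also_in_wos)
```

On these figures, roughly 92% of Web of Science titles are also in Scopus, while roughly 57% of Scopus titles are also in Web of Science.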
Scopus added value
Broadest coverage of quality global content including Asia and emerging countries
• …
[Figure: charts comparing Scopus with its nearest competitor]
Elsevier titles constitute approximately 15% of the titles in Scopus
More expansive coverage does not mean lower standards
Scopus Content Selection & Advisory Board (CSAB)
Scopus selection criteria
Journal policy
• Convincing editorial concept/policy
• Level of peer-review
• Diversity in geographic distribution of editors
• Diversity in geographic distribution of authors
Quality of content
• Academic contribution to the field
• Clarity of abstracts
• Quality and conformity with stated aims & scope
• Readability of articles
Journal standing
• Citedness of journal articles in Scopus
• Editor standing
Regularity
• No delay in publication schedule
Online availability
• Content available online
• English-language journal home page
• Quality of home page
Minimum criteria
• Peer-review
• English abstracts
• Regular publication
• References in Roman script
• Publication ethics statement
Titles reviewed (n = 2,279, January 2011 – 15 May 2012)
2,279 titles reviewed, of which 41% were accepted
[Figure: number of titles reviewed and acceptance rate]
High importance but not easily accessible
(Researchers, N = 3,824; study by Publishing Research Consortium, 2010)
What is DataCite?
– establish easier access to research data on the Internet
– increase acceptance of research data as legitimate, citable contributions to the scholarly record
– support data archiving that will permit results to be verified and re-purposed for future study.
From: http://datacite.org/whatisdatacite (emphasis my own)
Supplementary Material
Pros
• Coupling of data and article
• Peer review
• Preservation (byte-wise)
• Citation mechanism
Cons
• Limited data type support
• Compatibility (format support)
• Limited capacity
• Data not centrally stored
• Supplementary material is not a perfect solution
• Many poor solutions in use: data on PCs, university websites, personal homepages, ...
• Data repositories: the community’s answer?
– Scientists prefer independent data repositories above publishers
– Domain-specific coordination
– Centralized information “hubs”
• “Raw data should be freely accessible to researchers”
“... believe that, as a general principle, data sets, raw data outputs of research, and sets or subsets of that data should wherever possible be made freely accessible to other scholars ...” (Statement from STM & ALPSP, June 2006)
Connecting with Data Repositories
Database | Subject | Type of linking
CCDC | Crystallography | Article-level
PANGAEA | Earth Sciences | Article-level*
EMBL Molecular Interactions | Chemistry | Entity, tagging
Molecular INTeraction DB | Chemistry | Entity, tagging
GenBank | Nucleotides | Entity, tagging
UniProt | Proteins | Entity, tagging
Protein Data Bank | Proteins | Entity, tagging
ClinicalTrials | Medicine | Entity, tagging
TAIR (Arabidopsis) | Model organism | Entity, tagging
Mendelian Inheritance in Man | Genetics, inheritance | Entity, tagging
* with application
ScienceDirect Examples
http://dx.doi.org/10.1016/0377-8398(86)90033-2
PANGAEA Supplementary Data
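Article-level linking of this kind depends on both sides agreeing on the article's identifier. As a minimal sketch, assuming the link is stored as a resolver URL like the one above, the bare DOI can be recovered before it is compared with a repository's records. The function name `bare_doi` and the set of resolver hosts are illustrative assumptions, not part of any ScienceDirect or PANGAEA interface.

```python
from urllib.parse import urlparse, unquote

# Resolver hosts treated as plain DOI prefixes (an assumption for this sketch).
RESOLVER_HOSTS = {"dx.doi.org", "doi.org"}


def bare_doi(link):
    """Strip a 'doi:' label or a DOI resolver URL down to the bare DOI string."""
    link = link.strip()
    if link.lower().startswith("doi:"):
        return link[4:].strip()
    parsed = urlparse(link)
    if parsed.scheme in ("http", "https") and parsed.netloc.lower() in RESOLVER_HOSTS:
        # The DOI is everything after the leading slash of the URL path.
        return unquote(parsed.path.lstrip("/"))
    # Anything else is returned unchanged rather than guessed at.
    return link


example = bare_doi("http://dx.doi.org/10.1016/0377-8398(86)90033-2")
print(example)
```

Both the resolver URL and a `doi:`-prefixed string reduce to the same key, so either form can be matched against a data set record that cites the article.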
Scopus Example
Scopus priorities moving forward
1. Pilot with specific community of authors, publishers, and data repositories, to try and change behaviours (in concept phase)
2. Track, count, and analyze citations to Documents as proof of Data impact (research needs to be done)
3. Establish links from Scopus Document Records to related Data sets to improve discovery (PANGAEA first step, looking to expand)
4. Ingest and index Data Repository (DataCite) records and enable searching from Scopus (the future)
5. Track Citations from Documents to Data sets (the more distant future)
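To make the last priority concrete, here is a minimal sketch of what tracking citations from Documents to Data sets could involve: scan each document's reference strings for DOIs and count how many documents cite each known data set DOI. This is not how Scopus works or plans to work. The DOI pattern is a simplification, and every DOI, document ID, and reference string below is made up for illustration.

```python
import re
from collections import Counter

# Simplified DOI pattern: "10.", a 4-9 digit prefix, "/", then non-space characters.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_dois(reference):
    """Return lower-cased DOIs found in one reference string."""
    # Trailing sentence punctuation is trimmed; DOIs are compared case-insensitively.
    return [match.rstrip(".,;").lower() for match in DOI_PATTERN.findall(reference)]


def count_dataset_citations(references_by_document, dataset_dois):
    """Count, per data set DOI, the number of documents that cite it."""
    known = {doi.lower() for doi in dataset_dois}
    counts = Counter()
    for references in references_by_document.values():
        cited = set()
        for reference in references:
            cited.update(doi for doi in extract_dois(reference) if doi in known)
        counts.update(cited)  # a document counts once per data set
    return counts


# Hypothetical data: these DOIs and references do not exist.
dataset_dois = {"10.9999/example.dataset.1", "10.9999/example.dataset.2"}
references_by_document = {
    "doc-A": [
        "Hypothetical Author (2011). Sample data set. doi:10.9999/example.dataset.1",
        "Another Author (2010). Some article. doi:10.9999/example.article.7",
    ],
    "doc-B": [
        "Hypothetical Author (2011). Sample data set. https://doi.org/10.9999/EXAMPLE.DATASET.1.",
    ],
}

citation_counts = count_dataset_citations(references_by_document, dataset_dois)
print(dict(citation_counts))
```

The sketch only works when data sets carry persistent identifiers that authors actually cite, which is why the earlier priorities (changing citation behaviour, then linking and indexing DataCite records) come first.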
Thank you
Michael Habib, MSLS
Product Manager, Scopus
[email protected]
http://twitter.com/habib
http://orcid.org/0000-0002-8860-7565