Making research findings visible – the future of the scientific paper
Matthew Cockerill, Publisher, BioMed Central
Dec 18, 2015
"There is nothing more amusing than watching business interests work themselves up into a righteous frenzy over a threat to their monopoly profits from a new technology or some upstart with a different business model. Invariably, the monopolists… try to present themselves as champions of the consumer, or defenders of a level playing field, as if they hadn't become ridiculously rich by sticking it to consumers and enjoying years in which the playing field was tilted to their advantage."
Steven Pearlstein in the Washington Post, July 19 2006
Momentum for transition to OA
We are seeing action (not just words) from funding agencies and governments
– Wellcome and several UK research councils now require OA deposit as a condition of grants
– Federal Research Public Access Act may do the same in US
OA journals continue to grow rapidly
Impressive impact factors demonstrate OA and quality are absolutely compatible
Move to OA basically unstoppable
Growth of OA
[Figure: rolling 28-day count of submissions to BioMed Central journals, Jul-00 to Jul-06; vertical axis: submissions, 0 to 1400]
Impact factors
Genome Biology – IF 9.71
BMC Bioinformatics – IF 4.96
BMC Genomics – IF 4.09
Genome Biology is:
– 10th of 124 in GENETICS & HEREDITY
– 4th of 139 in BIOTECHNOLOGY & APPLIED MICROBIOLOGY
Why did we start BioMed Central as an open access publisher?
Limited access to research articles makes further research needlessly inefficient
Barriers to access obstruct interdisciplinary cross-fertilization
It is in the interest of researchers for their research to be read and cited as widely as possible
Traditional scientific publishing is not an effective market, and so high serials prices mean a poor deal for the scientific community
The main reason we started BioMed Central
Publications and data are a continuum
Publications include data
Publications are data
To make sense of data and publications delivered by post-genomic science, we need
– The best possible tools
– The widest possible collection of raw material
Open access stimulates the creation of tools by providing access to the raw material
Text mining
Open access facilitates text mining
BioMed Central XML corpus of full text articles is freely downloadable
The more semantics that are captured in the XML, the richer the possibilities for mining
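As a rough illustration of why richer XML markup helps mining, the sketch below parses a tiny, made-up article fragment with Python's standard library. The element and attribute names (`article`, `p`, `gene`, `id`) and the identifiers are hypothetical; they are not BioMed Central's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical article fragment; tag names, attribute names and
# identifiers are illustrative only, not a real publisher schema.
SAMPLE = """<article>
  <title>Example article</title>
  <p>Loss of <gene id="GENE:0001">abc1</gene> alters expression of
     <gene id="GENE:0002">xyz2</gene> in liver.</p>
</article>"""


def extract_genes(xml_text):
    """Return (identifier, surface text) pairs for every tagged gene mention."""
    root = ET.fromstring(xml_text)
    return [(g.get("id"), g.text) for g in root.iter("gene")]


def plain_text(xml_text):
    """Flatten the article to free text with whitespace normalised.

    This is all a text miner has to work with when no semantic markup exists.
    """
    root = ET.fromstring(xml_text)
    return " ".join("".join(root.itertext()).split())


genes = extract_genes(SAMPLE)
text = plain_text(SAMPLE)
```

With the markup present, the gene mentions come back with unambiguous identifiers; from the flattened text alone they would have to be guessed by pattern matching.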
Semantic enrichment
Ensure that the rest of the knowledge represented in scientific articles is structured to be computer-readable
Ideally capture semantics unambiguously at time of publication
Mining of free text is a stopgap/fall-back
It is not just articles that need semantic enrichment, but data sets too
Appropriate standards are now emerging
RDF
Useful common technical standard for expressing semantics
Subject-predicate-object triples
BioMed Central already exposes bibliographic RDF for all articles
Tools like the PiggyBank can capture RDF and then store it in triple-stores (local or networked)
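To make the subject-predicate-object idea concrete, here is a minimal in-memory triple store in plain Python. It is a toy sketch, not how PiggyBank or any production triple-store works. The article URI and the author name are made up; the two predicates are the Dublin Core `title` and `creator` element URIs, used here only as plausible examples of bibliographic properties.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

DC = "http://purl.org/dc/elements/1.1/"        # Dublin Core element namespace
ARTICLE = "http://example.org/articles/1"      # hypothetical article URI


class TripleStore:
    """Toy store: a set of triples plus wildcard pattern matching."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern, sorted; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(want is None or want == got for want, got in zip(pattern, t))
        )


store = TripleStore()
store.add(ARTICLE, DC + "title", "Example article")
store.add(ARTICLE, DC + "creator", "A. Author")
store.add(ARTICLE, DC + "creator", "B. Author")

creators = [t.object for t in store.match(predicate=DC + "creator")]
```

Because every statement has the same three-part shape, bibliographic facts and scientific facts can sit in the same store and be queried the same way.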
Semantic Laundry List
Scientific stuff
– Genes
– Proteins
– Anatomy
– Taxonomy
– Small molecules/drugs
– Macromolecules
– Diseases
– Experimental methodologies
– Experimental data types
General stuff
– People, Places, Organizations, Relationships
Neurocommons.org
A ScienceCommons project
Working with open access articles from BioMed Central and PLoS
Attempting to define best practices/gold standard for semantic enrichment of articles
Text mining and enhanced authoring tools both have a role
The role of wikis
The challenge: Ontologies, to be useful, must stay up-to-date and receive ongoing maintenance and curation
Scope of problem is enormous - every entity and relationship of relevance to science
Wikis provide a promising approach - perhaps the only viable approach
e.g. AuthorIDs
Projects at BioMed Central to capture structured info
Case reports
Clinical trials
Biological processes
Chemical structures
Taxonomic descriptions
Publishing research articles in a more structured form allows the results to be treated as a database
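One way to picture "results as a database": if case reports were captured as structured records, finding related reports becomes an ordinary query. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table layout, field names and records are all hypothetical and do not describe any actual BioMed Central system.

```python
import sqlite3

# Hypothetical structured case reports: (id, diagnosis, drug, patient age).
REPORTS = [
    ("CR-1", "condition A", "drug X", 34),
    ("CR-2", "condition A", "drug Y", 61),
    ("CR-3", "condition B", "drug X", 47),
]


def build_db(reports):
    """Load the structured records into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE case_report ("
        "id TEXT PRIMARY KEY, diagnosis TEXT, drug TEXT, age INTEGER)"
    )
    conn.executemany("INSERT INTO case_report VALUES (?, ?, ?, ?)", reports)
    return conn


def reports_with_diagnosis(conn, diagnosis):
    """Return the ids of all case reports sharing the given diagnosis."""
    rows = conn.execute(
        "SELECT id FROM case_report WHERE diagnosis = ? ORDER BY id",
        (diagnosis,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_db(REPORTS)
related = reports_with_diagnosis(conn, "condition A")
```

The same question asked of free-text articles would need text mining first; with structured capture it is a single query.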
Incentivize authors
Ideally, create structured authoring tools that remove work rather than add it (e.g. EndNote)
If you do create extra work for authors, find a way to provide the author with an immediate return on investment
Reduce work - smart authoring e.g. auto suggest
Standard way to disambiguate contacts
Why not chemicals, genes, species too?
– Unambiguously capture semantics
– Increase accuracy, save time, encourage uptake
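The auto-suggest idea above can be sketched as a prefix lookup against a controlled vocabulary: the author types a few letters, picks a suggestion, and the tool records the identifier rather than the free text. The vocabulary, the `TAXON:` identifiers and the `suggest` function below are hypothetical, for illustration only.

```python
# Hypothetical controlled vocabulary: identifier -> preferred label.
VOCABULARY = {
    "TAXON:0001": "Homo sapiens",
    "TAXON:0002": "Homo neanderthalensis",
    "TAXON:0003": "Mus musculus",
}


def suggest(prefix, vocabulary=VOCABULARY, limit=5):
    """Return up to `limit` (identifier, label) pairs whose label starts
    with `prefix`, ignoring case, sorted alphabetically by label."""
    needle = prefix.strip().lower()
    if not needle:
        return []  # nothing typed yet, so nothing to suggest
    hits = [
        (ident, label)
        for ident, label in vocabulary.items()
        if label.lower().startswith(needle)
    ]
    hits.sort(key=lambda pair: pair[1])
    return hits[:limit]


species_hits = suggest("homo")
```

Storing the chosen identifier alongside the text is what captures the semantics unambiguously, while the author does less typing rather than more.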
Return on investment
Automatic update of meta-analysis based on clinical trial data
Automatic list of closely-related case reports from database
Automatic deposit of taxonomic information in registry (Zoobank)
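As a hedged illustration of the first item, the sketch below recomputes a pooled effect each time a new structured trial result arrives. It uses standard fixed-effect inverse-variance pooling, chosen here only as a simple, well-known method; the slides do not say which meta-analysis method would be used, and the trial numbers are invented.

```python
import math


def pooled_estimate(trials):
    """Fixed-effect inverse-variance pooling.

    `trials` is a list of (effect, standard_error) pairs. Each trial is
    weighted by 1 / standard_error**2. Returns (pooled effect, pooled
    standard error).
    """
    weights = [1.0 / se ** 2 for _, se in trials]
    total = sum(weights)
    effect = sum(w * e for w, (e, _) in zip(weights, trials)) / total
    return effect, math.sqrt(1.0 / total)


# Invented example data: (effect size, standard error) per trial.
trials = [(0.2, 0.1), (0.4, 0.1)]
before = pooled_estimate(trials)

# A newly published trial arrives in structured form; the
# meta-analysis is simply recomputed over the enlarged list.
trials.append((0.6, 0.1))
after = pooled_estimate(trials)
```

The point is the workflow rather than the statistics: once trial results are captured in structured form, updating the analysis is a recomputation, not a manual literature review.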