Mars Target Encyclopedia: Rock and Soil Composition Extracted
from the Literature

Kiri L. Wagstaff1, Raymond Francis1, Thamme Gowda1,2∗, You Lu1,
Ellen Riloff3, Karanjeet Singh1∗, and Nina L. Lanza4

1Jet Propulsion Laboratory, California Institute of Technology,
Pasadena, CA 91109, {firstname.lastname}@jpl.nasa.gov
2Information Sciences Institute, University of Southern
California, Marina Del Rey, CA 90292, [email protected]
3School of Computing, University of Utah, Salt Lake City, UT 84112,
[email protected]
4Los Alamos National Laboratory, Los Alamos, NM 87545,
[email protected]

Abstract

We have constructed an information extraction system called the
Mars Target Encyclopedia that takes in planetary science
publications and extracts scientific knowledge about target
compositions. The extracted knowledge is stored in a searchable
database that can greatly accelerate the ability of scientists to
compare new discoveries with what is already known. To date, we
have applied this system to ∼6000 documents and achieved 41–56%
precision in the extracted information.

Introduction

Scientists everywhere are overwhelmed by the stream of new
information that is published by their disciplines’ conferences,
workshops, and journals. It is increasingly difficult to come up to
speed in a new area and to stay current with the latest
discoveries. In planetary exploration, new discoveries can occur
each time new data is transmitted. For example, our rovers on Mars
have sent back compositional data for thousands of individual
targets (e.g., rocks, soils), and some of those observations have
transformed our understanding of past environments on the planet
(Grotzinger et al. 2014).

To interpret new observations correctly, it is necessary to be able
to compare them with what is already known. For example, if we
observe high manganese content at a particular location, we want to
know whether it is consistent with previous observations or
indicates an anomalous new discovery. However, no central database
exists in which planetary scientists can quickly make that
determination.

We have created a system called the Mars Target Encyclopedia (MTE)
that uses information extraction methods to analyze planetary
science publications and identify stated compositional
relationships between Mars surface targets and elements or
minerals. The extracted information is stored in a searchable
database that allows users to ask questions such as “Which targets
contain hematite?” or “What is known about target Dillinger?” It
also enables entirely new kinds of information visualization, such
as a map display of all locations where the Mars rover Curiosity
has detected hematite. Ultimately, the MTE may serve as a resource
to inform decisions about the next steps in Mars exploration.

∗This work was done when the authors were at the Jet Propulsion
Laboratory.
Copyright © 2018, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.

In this paper, we describe the MTE system and its component
technologies, the empirical performance of the system on labeled
data, and results from a large-scale evaluation on ∼6000 documents.
The MTE is currently being integrated into a public website called
the PDS (Planetary Data System) Analyst’s Notebook for Mars
scientists and the public to access. The automated pipeline can be
used to ingest and analyze new publications as they become
available.

Related Work

A variety of text analysis methods exist for extracting information
from text. Some methods focus on extracting meta-data such as the
document title, authors, and publication venue or analyzing and
linking citations between papers (Ronzano and Saggion 2016).
However, understanding the content of a scientific publication
requires a deeper analysis. Information extraction (IE) of this
nature is generally broken into two steps: (1) named entity
recognition or concept extraction, to identify references to
people, locations, concepts, etc., and (2) relation extraction, to
identify relationships between pairs of entities (Mooney and
Bunescu 2005). Many of the recent advances in IE have been
motivated by problems from the biomedical research world, such as
the desire to identify protein-protein interactions (Tikk et al.
2010; Bui, Katrenko, and Sloot 2011) or chemical-protein and
chemical-disease relations (Krallinger et al. 2017). Tsutsui, Ding,
and Meng (2016) used topic modeling and open IE, which does not
require the prior identification of entities, to build a knowledge
database about Alzheimer’s disease.

To date, little such work has been done in the domain of planetary
science. The closest existing work is the geology-based GeoDeepDive
project, which performs text data mining on scientific publications
about (Earth) rock formations and stratigraphy (Zhang et al. 2013).
By applying and extending information extraction methods to
planetary science publications, we have the opportunity to benefit
an entirely new population of scientists, researchers, and
interested members of the public.

Machine Learning for Information Extraction

The Mars Target Encyclopedia (MTE) is an information extraction
system that takes in scientific publications in PDF format and
extracts knowledge that is useful to scientists

[Figure: MTE system overview. An example input document (the LPSC
abstract “Sedimentology and Stratigraphy of the Pahrump Hills
Outcrop, Lower Mount Sharp, Gale Crater, Mars” by K. M. Stack et
al.) passes through automated information extraction: Entities
(find Elements, Minerals, Targets), then Relations (classify pairs
of Target + (Element or Mineral)). The results are stored in the
MTE Database, which users query via the web.]

Table 1: Manual annotations for LPSC documents.

Annotation   2015        2016        Total
             (62 docs)   (55 docs)   (117 docs)
Element      1195        1029        2224
Mineral       748         708        1456
Target        566         347         913
Contains      434         262         696
Total        2943        2346        5289

Figure 2: Excerpt from document lpsc16-1155 showing compositional
annotations created with the brat web annotation tool.

Corpus

Our corpus consists of two-page extended LPSC abstracts in PDF
format. We selected 62 documents from LPSC 2015 and 55 documents
from LPSC 2016 that mentioned “ChemCam”, and used the brat
annotation tool (Stenetorp et al. 2012) to manually label entities
and relations within these documents (see Table 1). This data set
contains thousands of annotations, which are available here:
https://doi.org/10.5281/zenodo.1048419. We estimate that it took an
average of 30 minutes to annotate each document, or a total of more
than 58 hours of labor for the full corpus.

The annotated relationships ranged from simple (e.g., a pattern
such as “X contains Y” within a sentence) to complex (e.g.,
relationships that crossed sentence boundaries or involved pronouns
like “it” and other anaphora). Figure 2 shows an excerpt from one
document that contains several statements about the composition of
the target Big Sky. The vocabulary used to indicate a compositional
relationship varies, and the final relationship crosses a sentence
boundary.

Named Entity Recognition Results

The Named Entity Recognizer operates on individual words or tokens.
We used the 2015 documents for training and divided the 2016
documents into validation (n = 20) and testing (n = 35) sets. As
shown in Table 2, the baseline approach of employing the known
lists of elements, minerals, and targets achieved an F1 score of
0.76. Training a basic NER classifier using the CoreNLP system
yielded an improved F1 score of 0.805. Virtually all of the
improvement came from increased precision (from 0.83 to 0.95).
Recall was highest (0.84) for the Element class, as expected; it
was 0.73 for Minerals and only 0.28 for Targets. The Target class
is the most difficult one to recognize due to the lack of a naming
convention and ambiguous names. In addition, the Target class grows
much faster than the set of known elements or minerals, so there
will always be new targets in future documents that never appeared
in the training set.

Table 2: Named entity recognition performance on LPSC 2016 test
documents. The best result for each metric is shown in bold.

                          Prec.   Recall  F1
Baseline: Lists only      0.831   0.699   0.760
CoreNLP NER trained on:
  LPSC 2015               0.948   0.700   0.805
  LPSC 2015 + gazettes    0.945   0.777   0.853

However, we were able to improve NER recall as well by including
the gazettes as described above. These term lists augment the
manually labeled documents and provide relevant domain knowledge.
With the gazettes, the F1 score increased to 0.853 by boosting
recall to 0.777. Recall for the Target class, in particular, more
than doubled, to 0.67.
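
The list-only baseline and the reported metrics can be sketched as follows. This is a simplified illustration, not the evaluation code used for Table 2: it assumes token-level scoring with an "O" label for non-entities, and the tokens, gazette entries, and gold labels are invented.

```python
def tag_with_gazette(tokens, gazette):
    """List-only baseline: label a token only if it appears in a known term list."""
    return [gazette.get(tok.lower(), "O") for tok in tokens]


def precision_recall_f1(gold, pred, negative="O"):
    """Token-level precision, recall, and F1 over all non-negative labels."""
    true_pos = sum(1 for g, p in zip(gold, pred) if p != negative and p == g)
    pred_pos = sum(1 for p in pred if p != negative)
    gold_pos = sum(1 for g in gold if g != negative)
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented example: the target name is missing from the term lists,
# so the baseline cannot recall it.
tokens = ["Rock1", "contains", "hematite", "and", "iron"]
gold = ["Target", "O", "Mineral", "O", "Element"]
gazette = {"hematite": "Mineral", "iron": "Element"}

pred = tag_with_gazette(tokens, gazette)
precision, recall, f1 = precision_recall_f1(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case every predicted label is correct but one gold entity is missed, which mirrors the pattern in the text: previously unseen Target names depress recall rather than precision.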

Relation Extraction Results

The decision about whether or not a relation exists is made for a
given pair of entities. Processing all possible pairs of entities
in a document would be infeasible (and likely unnecessary). For
simplicity, we adopted the strategy used in previous work
(Giuliano, Lavelli, and Romano 2006) of generating all pairs of
entities that occur within a single sentence. We used CoreNLP’s
sentence splitter to divide the corpus into sentences and the NER
model trained above to identify entities. For each (Target,
Element) or (Target, Mineral) pair, we generated a jSRE example
that encoded the sentence content. If the pair of entities was
connected by a relation in the manual annotations, we gave the
example a positive label; otherwise, we gave it a negative label.
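
The candidate generation and labeling step described above can be sketched as below. The sketch assumes entities have already been grouped by sentence and that gold relations are available as (target id, component id) pairs; the ids are invented, and the code does not produce jSRE's actual example format.

```python
from itertools import product


def sentence_candidates(sentences, gold_relations):
    """Pair every Target with every Element or Mineral in the same sentence.

    sentences: list of sentences, each a list of (entity_id, entity_type).
    gold_relations: set of (target_id, component_id) pairs from manual labels.
    Returns (sentence_index, target_id, component_id, label) tuples,
    where label is 1 for an annotated relation and 0 otherwise.
    """
    examples = []
    for sent_idx, entities in enumerate(sentences):
        targets = [e for e in entities if e[1] == "Target"]
        components = [e for e in entities if e[1] in ("Element", "Mineral")]
        for target, component in product(targets, components):
            label = 1 if (target[0], component[0]) in gold_relations else 0
            examples.append((sent_idx, target[0], component[0], label))
    return examples


# Invented entity ids; the relation (T4, T5) spans two sentences.
sentences = [
    [("T1", "Target"), ("T2", "Element"), ("T3", "Mineral")],
    [("T4", "Target")],
    [("T5", "Element")],
]
gold = {("T1", "T3"), ("T4", "T5")}

examples = sentence_candidates(sentences, gold)
print(examples)
```

Because pairs are formed only within a sentence, the annotated relation between T4 and T5 never becomes a candidate, so no classifier trained on these examples can recover it.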

To simulate how the system would be used in practice, we trained
and validated the relation classifier using text from LPSC 2015 and
tested it on LPSC 2016. We used the first 42 LPSC 2015 documents
for training and the remaining 20 for validation. The number and
distribution of the resulting jSRE examples (relationships) are
given in Table 3.

Table 3: Number and distribution of relationships between Targets
and Elements or Minerals. The number in parentheses is the
percentage of positive relationships.

             Element     Mineral     Merged
Train        279 (38%)   150 (41%)   429 (39%)
Validation    93 (27%)    70 (69%)   163 (45%)
Test         111 (37%)    62 (50%)   173 (42%)

Table 4: Relation extraction performance on LPSC 2016 (test)
documents. The best result for each metric is shown in bold.

                      Precision   Recall   F1
Elements (n = 111)
  Baseline: All-yes   0.369       1.000    0.539
  jSRE-Elements       0.531       0.415    0.466
Minerals (n = 62)
  Baseline: All-yes   0.500       1.000    0.667
  jSRE-Minerals       0.679       0.613    0.644
Merged (n = 173)
  Baseline: All-yes   0.416       1.000    0.588
  jSRE-Indiv.         0.598       0.447    0.511
  jSRE-Merged         0.640       0.444    0.525

We trained three different relation classifiers: one on
Target-Element relations only; one on Target-Mineral relations
only; and one on the merged set. We were curious as to how a
specialized model that was trained on less data would compare to a
more generic model trained on more data. For each model, we
performed a grid search over the jSRE parameters by trying each of
the SVM kernels (LC, GC, SL) and window sizes within the set
{ 1, 2, 5, 10, 15, 20 }. We selected the model parameters that led
to the highest performance on the validation set in terms of
precision. We found that the max-precision model did not employ the
same parameters across the three models. jSRE-Elements and
jSRE-Merged used an LC kernel with a window of 5, while
jSRE-Minerals used an SL kernel with a window of 5. Notably, the GC
kernel that the original authors found to be most powerful for the
biomedical domain did not perform well in this corpus.

We found that the individual models (“jSRE-Elements” and
“jSRE-Minerals”) achieved much higher precision than a baseline
approach that always predicted that a relationship was present
(“All-yes”) (see Table 4). While this baseline always achieves a
recall of 1.00 and therefore appears superior in terms of
F-measure, this application domain values precision much more than
recall. Content included in the MTE must be of the highest
reliability, even if this means it is not comprehensive. We also
found that the merged model (“jSRE-Merged”) achieved higher
precision than both the baseline and the individual models when
they were applied to the full (Merged) data set (“jSRE-Indiv.”).

Large-scale Evaluation

We collected all LPSC documents that were published in 2014, 2015,
and 2016, omitting the training documents from LPSC 2015, and
ingested them into the MTE (n = 5897).

It would be infeasible to ask humans to manually label all 5897
documents to evaluate our results, so instead we performed a manual
review of only the extracted relations. This allows us to measure
precision, but not recall. However, as noted above, precision is
far more important than recall in this domain, as it captures the
true utility of the extracted information when used in practice.

Table 5: Manual review of 817 relations extracted from 5897
documents.

             LPSC14   LPSC15   LPSC16   Total
Correct      55%      57%      29%      41%
Partial       9%      15%      14%      13%
Irrelevant   19%       6%       9%      11%
Wrong        11%      21%       2%       8%
Unsure        6%       0%      47%      28%

The manual review results are shown in Table 5. Our manual reviewer
examined each extracted relation and its source sentence to judge
the relation as Correct, Partial (e.g., only one word of a
multi-word Target name was extracted), Irrelevant (an appropriate
extraction from the sentence, but the content was not about Mars),
Wrong, or Unsure (the reviewer could not determine whether the
relation was correct).

Overall, the fraction of Correct relations was 41%. Performance on
the 2016 documents was significantly lower than for the preceding
years. Since the system was trained on documents from 2015, it is
likely that targets mentioned in 2015 would encompass those
discovered in 2014 and 2015, while the documents from 2016 contain
many newly discovered targets and therefore present a more
difficult generalization task. Many of the Partial relations
occurred due to limited support for extracting multi-word entities.
This is an area for future improvement.

The Irrelevant relations are in some ways quite interesting; there
are several relations that express the composition of meteorites
that happen to have the same names as (real) Mars targets. The
system correctly interpreted the source sentences, but the
information does not (strictly) belong in the MTE. For example, the
system inferred that “Gibeon” contains “chromite” from this
sentence: “Gibeon was found in several studies to have both
chromite and daubrelite inclusions.” Gibeon is the name of a Mars
target and of a meteorite. Disambiguating the two requires more
context than a single sentence. Most of the Unsure relations came
from tables whose formatting was lost in the conversion from PDF to
text. A useful future direction might be to omit table content or
to capture its structure in some way, e.g., by using the Tabula
tool (footnote 4). If we omit these unparseable sections, we obtain
56% Correct relations, 18% Partial, 15% Irrelevant, and 11% Wrong.
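
The adjusted figures can be approximated from the Total column of Table 5 by dropping the Unsure category and renormalizing over the four remaining categories. This is a back-of-the-envelope sketch that works from the rounded percentages rather than the underlying relation counts, so exact agreement with the reported values should not be assumed in general.

```python
# Total column of Table 5 (rounded percentages of the 817 reviewed relations).
totals = {"Correct": 41, "Partial": 13, "Irrelevant": 11, "Wrong": 8, "Unsure": 28}


def renormalize_without(percentages, omitted):
    """Drop one category and rescale the rest so they sum to about 100%."""
    kept = {k: v for k, v in percentages.items() if k != omitted}
    remaining = sum(kept.values())
    return {k: round(100 * v / remaining) for k, v in kept.items()}


adjusted = renormalize_without(totals, "Unsure")
print(adjusted)
```

Read this way, the 41% and 56% figures quoted for precision bracket the system's performance: the lower value counts every Unsure relation against the system, while the higher value sets them aside.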

4 https://github.com/tabulapdf/tabula

Deployment of the MTE

We created a simple web interface to allow users to query the MTE
for information about targets, elements, or minerals. This
interface is currently only available inside JPL, but we are in the
process of integrating it with a public PDS website as discussed
below.

The MTE enables scientists to ask new questions that previously
could not be answered. For example, Figure 3 shows the results of a
query on “hematite.” Nine targets that contain hematite were
returned. The user can click on any target to see the extracted
information and sentence excerpts that support the conclusion about
the presence of hematite. Below is a map of the Curiosity rover’s
traverse on Mars, with the locations of the matching targets marked
in red. One can immediately see whether hematite is localized or
has been identified throughout the mission.

Figure 3: MTE search results for “hematite.” Nine results
(individual Mars targets) are returned, and a map displays the
location of each hit, in red.
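
A query such as “Which targets contain hematite?” can be sketched against a small relational table. The schema, table name, and rows below are entirely hypothetical placeholders; the text does not describe how the MTE database is actually organized, and the target names are not real extraction results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per extracted (target, component) relation,
# with the supporting document and sentence kept for display.
conn.execute(
    """CREATE TABLE contains (
           target TEXT, component TEXT, component_type TEXT,
           doc_id TEXT, sentence TEXT)"""
)
rows = [  # placeholder rows, not real MTE content
    ("Target_A", "hematite", "Mineral", "doc-001", "Target_A contains hematite."),
    ("Target_B", "hematite", "Mineral", "doc-002", "Hematite was seen in Target_B."),
    ("Target_A", "iron", "Element", "doc-001", "Target_A is rich in iron."),
]
conn.executemany("INSERT INTO contains VALUES (?, ?, ?, ?, ?)", rows)


def targets_containing(connection, component):
    """Return the distinct targets linked to the given element or mineral."""
    cursor = connection.execute(
        "SELECT DISTINCT target FROM contains WHERE component = ? ORDER BY target",
        (component,),
    )
    return [row[0] for row in cursor]


hematite_targets = targets_containing(conn, "hematite")
print(hematite_targets)
```

Storing the source sentence alongside each relation is what would allow an interface to show supporting excerpts when a user clicks on a target.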

Limitations

The MTE is not comprehensive. There may be compositional
information that was never written up in a scientific publication
and therefore would not be included in the MTE. Instead, the MTE
extracts and indexes only the information that was judged by
scientists to be worthy of publication to the scientific community.
The MTE leverages and mirrors this selection bias, and its holdings
(like the source publications) contain only the most valuable and
salient information. This incompleteness is important to convey to
the user so that they interpret results correctly.

On the technical side, there are two important limitations to the
MTE content. First, the current MTE cannot generate overlapping
annotations. For example, the phrase “calcium sulfate” was manually
labeled as “calcium” (Element), “sulfate” (Mineral), and “calcium
sulfate” (Mineral). However, the MTE’s NER model only classifies
individual tokens, so it misses the “calcium sulfate” phrase.

Second, the relation extraction module only generates candidate
relation pairs within a sentence. In this corpus, 32% of the
manually annotated relations cross sentence boundaries. Therefore,
the current system cannot yet retrieve those relations. One way to
access sentence-crossing relations would be to expand the number of
candidate entity pairs to include all pairs within a paragraph. We
plan to evaluate that strategy in future work.

Conclusions and Next Steps

This work lies at the intersection of information extraction,
machine learning, and planetary science. The MTE uses current IE
technology to provide the first database of Mars target
compositional knowledge as expressed in the scientific literature.
The pipeline is fully automated, and we can employ web crawlers to
seek out new (publicly accessible) papers as they are published. We
also plan to enable users to submit their own publications for
analysis and augmentation of the database.

We are in the process of integrating the MTE’s content into the MSL
Analyst’s Notebook, an interactive web resource for mission
scientists and the interested public (Stein and Arvidson 2013). The
Analyst’s Notebook allows users to browse mission plans, targets
discovered, data collected, and summaries of each mission day on
Mars. The MTE content will enable the Analyst’s Notebook to also
connect targets to publications.

The MTE currently contains information about ChemCam targets that
was extracted from three years of papers published at the Lunar and
Planetary Science Conference. A next logical step is to expand the
MTE to encompass targets identified by other instruments on the
Curiosity (MSL) rover and other missions such as the Mars
Exploration Rovers (Spirit and Opportunity). In addition, we plan
to extend the MTE to be able to ingest journal papers that have
been published by MSL science team members and the broader
community. This information will carry more weight because it comes
from peer-reviewed sources; users will be able to restrict their
searches to journal papers only, or to obtain all possible results.

The automatic extraction of knowledge from scientific publications
can benefit many other areas of scientific inquiry. In addition to
biology and medicine, there are opportunities at the intersection
between fields such as planetary science and astronomy. For
example, there were 3,550 confirmed exoplanets (planets outside our
solar system) as of November 2, 2017 (NASA 2017). Hundreds of new
planet candidates are announced each year in new publications.
Desirable properties to extract and store for each planet include
its radius, temperature, period, distance from its host star, and
more. Compositional relationships exist for elements present in the
host star and for constituents in exoplanet atmospheres, with
implications for the possible presence of life on other planets. In
general, extracting information and relationships into a central,
searchable database can help inform new hypotheses and direct
future science investigations.

Acknowledgments

This research was carried out in part at the Jet Propulsion
Laboratory, California Institute of Technology, under a contract
with the National Aeronautics and Space Administration. We thank
the Multimission Ground System and Services (MGSS) program and the
Mars Science Laboratory project for funding and enthusiastically
supporting this work. We also thank Chris Mattmann for his support
and partial funding as provided by the DARPA XDATA/Memex/D3M
programs and NSF award numbers ICER-1639753, PLR-1348450, and
PLR-144562.

References

Bui, Q.-C.; Katrenko, S.; and Sloot, P. M. A. 2011. A hybrid
approach to extract protein-protein interactions. Bioinformatics
27(2):259–265.

Finkel, J. R.; Grenager, T.; and Manning, C. 2005. Incorporating
non-local information into information extraction systems by Gibbs
sampling. In Proceedings of the 43rd Annual Meeting of the
Association for Computational Linguistics (ACL 2005), 363–370.

Giuliano, C.; Lavelli, A.; and Romano, L. 2006. Exploiting shallow
linguistic information for relation extraction from biomedical
literature. In Proceedings of the 11th Conference of the European
Chapter of the Association for Computational Linguistics (EACL
2006), 401–408.

Grotzinger, J. P.; Sumner, D. Y.; Kah, L. C.; Stack, K.; Gupta, S.;
Edgar, L.; Rubin, D.; Lewis, K.; Schieber, J.; Mangold, N.;
Milliken, R.; Conrad, P. G.; DesMarais, D.; Farmer, J.; Siebach,
K.; Calef, F.; Hurowitz, J.; McLennan, S. M.; Ming, D.; Vaniman,
D.; Crisp, J.; Vasavada, A.; Edgett, K. S.; Malin, M.; Blake, D.;
Gellert, R.; Mahaffy, P.; Wiens, R. C.; Maurice, S.; Grant, J. A.;
Wilson, S.; Anderson, R. C.; Beegle, L.; Arvidson, R.; Hallet, B.;
Sletten, R. S.; Rice, M.; Bell, J.; Griffes, J.; Ehlmann, B.;
Anderson, R. B.; Bristow, T. F.; Dietrich, W. E.; Dromart, G.;
Eigenbrode, J.; Fraeman, A.; Hardgrove, C.; Herkenhoff, K.;
Jandura, L.; Kocurek, G.; Lee, S.; Leshin, L. A.; Leveille, R.;
Limonadi, D.; Maki, J.; McCloskey, S.; Meyer, M.; Minitti, M.;
Newsom, H.; Oehler, D.; Okon, A.; Palucis, M.; Parker, T.; Rowland,
S.; Schmidt, M.; Squyres, S.; Steele, A.; Stolper, E.; Summons, R.;
Treiman, A.; Williams, R.; Yingst, A.; and MSL Science Team. 2014.
A habitable fluvio-lacustrine environment at Yellowknife Bay, Gale
Crater, Mars. Science 343(6169).

Krallinger, M.; Rabal, O.; Lourenço, A.; Oyarzabal, J.; and
Valencia, A. 2017. Information retrieval and text mining
technologies for chemistry. Chemical Reviews 117(12):7673–7761.

Mattmann, C., and Zitting, J. 2011. Tika in Action. New York:
Manning Publications.

Maurice, S., and 70 others. 2012. The ChemCam instrument suite on
the Mars Science Laboratory (MSL) rover: Science objectives and
mast unit description. Space Science Reviews 170(1):95–166.
doi:10.1007/s11214-012-9912-2.

Mooney, R. J., and Bunescu, R. 2005. Mining knowledge from text
using information extraction. ACM SIGKDD Explorations Newsletter
7(1):3–10.

NASA. 2017. Exoplanet archive.
https://exoplanetarchive.ipac.caltech.edu/.

Ronzano, F., and Saggion, H. 2016. Knowledge extraction and
modeling from scientific publications. In Proceedings of the
Enhancing Scholarly Data Workshop, 11–25. Cham: Springer
International Publishing.

Stein, T. C., and Arvidson, R. E. 2013. PDS Analyst’s Notebook for
MSL. In Proceedings of the 44th Lunar and Planetary Science
Conference, Abstract 1570.

Stenetorp, P.; Pyysalo, S.; Topić, G.; Ohta, T.; Ananiadou, S.; and
Tsujii, J. 2012. brat: A web-based tool for NLP-assisted text
annotation. In Proceedings of the Demonstrations Session at EACL
2012.

Tikk, D.; Thomas, P.; Palaga, P.; Hakenberg, J.; and Leser, U.
2010. A comprehensive benchmark of kernel methods to extract
protein-protein interactions from literature. PLoS Computational
Biology 6(7).

Tsutsui, S.; Ding, Y.; and Meng, G. 2016. Machine reading approach
to understand Alzheimer’s disease literature. In Proceedings of the
Tenth International Workshop on Data and Text Mining in Biomedical
Informatics (DTMBIO).

Zhang, C.; Govindaraju, V.; Borchardt, J.; Foltz, T.; Ré, C.; and
Peters, S. 2013. GeoDeepDive: Statistical inference using familiar
data-processing languages. In Proceedings of the 2013 ACM SIGMOD
International Conference on Management of Data, 993–996.