Multilingual Information Services in the area of agricultural data The AGRIS use case Fabrizio Celli – FAO of the UN – 06/02/2014
Multilingual information services in the area of agricultural data: the use case of AGRIS

Nov 17, 2014

Technology

Agro-Know

The purpose of this presentation was to present a real problem that could be solved using the multilingual framework produced by Organic.Lingua.
Transcript
Page 1: Multilingual information services in the area of agricultural data: the use case of AGRIS

Multilingual Information Services in the area of agricultural data

The AGRIS use case

Fabrizio Celli – FAO of the UN – 06/02/2014

OVERVIEW

The setting

• The AGRIS database is a collection of more than 7.7 million bibliographic references in the agricultural domain

• They are enhanced by the AGROVOC thesaurus, which is extensively used by cataloguers to enrich data indexing in agricultural information systems

• AGRIS is an RDF-aware system (http://agris.fao.org), a mashup application that allows users to query the AGRIS-RDF content, interlinking all records to external sources of information

• 7 million bibliographic records become 7 million mashup pages!

Some statistics

• 7.7 million bibliographic references

• 190 million triples

• ~300,000 visits/month

• Used worldwide (accessed from more than 200 countries)

How data come to AGRIS

• Centralization: bibliographic references in the AGRIS domain (agriculture, forestry, animal husbandry, aquatic sciences and fisheries, and human nutrition)

• Interlinking: other kinds of information related to the AGRIS domain (statistics, maps, country profiles, etc.)

Data consuming

• AGRIS consumes metadata provided by the community and publishes them as open data

• Metadata are captured either by pulling data through harvesting from clients (e.g. aggregators, institutional repositories, using protocols such as OAI-PMH)

• or by pushing data to AGRIS from clients (e.g. national libraries or journal publishers)
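
The pull route can be sketched as follows. This is a minimal illustration of an OAI-PMH `ListRecords` request, not the AGRIS harvester itself: the repository URL and the sample response are made up, and only the `verb`, `metadataPrefix` and `resumptionToken` arguments come from the OAI-PMH protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # When continuing a harvest, resumptionToken is sent on its own.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(response_xml):
    """Extract the header identifiers from a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(f"{{{OAI_NS}}}identifier")]


# Hand-written sample response (assumption), so the sketch runs offline.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical repository endpoint.
url = list_records_url("https://repository.example.org/oai")
identifiers = record_identifiers(SAMPLE_RESPONSE)
print(url)
print(identifiers)
```

In a real harvest the URL would be fetched over HTTP and the request repeated with the returned resumption token until the repository stops issuing one.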

Accept any input format!

The AGRIS metadata format

• AGRIS tries to accept any input format

• The AGRIS input module is responsible for the translation of the source input format to the AGRIS RDF

• The translation currently requires an intermediate step, in which metadata are converted to the AGRIS AP, a metadata standard based on Dublin Core
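
The two-step translation can be pictured with a toy pipeline. This is a sketch under stated assumptions, not the AGRIS input module: the provider field names, the field map and the record URI are hypothetical, and plain Dublin Core terms stand in for the AGRIS AP and AGRIS RDF vocabularies.

```python
DCTERMS = "http://purl.org/dc/terms/"


def to_intermediate(source_record, field_map):
    """Step 1: rename provider-specific fields to Dublin Core-style names.

    Fields that are not in the map are dropped.
    """
    return {field_map[k]: v for k, v in source_record.items() if k in field_map}


def to_triples(record_uri, intermediate):
    """Step 2: emit (subject, predicate, object) triples for the record."""
    return [(record_uri, DCTERMS + field, value)
            for field, value in intermediate.items()]


# Hypothetical provider record and mapping.
source = {"Title": "Effects of straw returned to the field",
          "Year": "2011",
          "internal_id": "x-17"}
field_map = {"Title": "title", "Year": "issued"}

intermediate = to_intermediate(source, field_map)
triples = to_triples("http://example.org/record/1", intermediate)
for triple in triples:
    print(triple)
```

Keeping the intermediate format means each new provider only needs its own field map, while the RDF step stays the same.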

MULTILINGUAL METADATA

Multilingual metadata

• 80% of AGRIS references have English content: title, abstract, etc.

• Most of the time, when a reference comes in another language, an English translation is provided for both the title and the abstract

• Data providers send us multilingual records, in which English is usually the default language

<dc:title xml:lang="en">Effects of straw returned to the field on growth and …</dc:title>
<dc:title xml:lang="Zh">砂姜黑土区秸秆还田对玉米生育及水分利用效率的影响</dc:title>
<dc:creator>
  <ags:creatorPersonal>Shen Xueshan, Anhui Agricultural University, Hefei(China)</ags:creatorPersonal>
  <ags:creatorPersonal>Li Jincai, Anhui Agricultural University, Hefei(China)</ags:creatorPersonal>
  <ags:creatorPersonal>Qu Huijuan, Anhui Agricultural University, Hefei(China)</ags:creatorPersonal>
</dc:creator>
<dc:date>
  <dcterms:dateIssued>Apr. 2011</dcterms:dateIssued>
</dc:date>
<dc:subject>
  <ags:subjectClassification scheme="ags:ASC">F01</ags:subjectClassification>
  <ags:subjectThesaurus scheme="ags:AGROVOC">…</ags:subjectThesaurus>
</dc:subject>
<dc:description>
  <dcterms:abstract xml:lang="Zh">摘要:为了在淮北砂姜黑土区推广小麦玉米秸秆全量还田技术,采用大田定位试验,设置小麦玉米秸秆不还田、小麦玉米秸秆单季还田和小麦玉米秸秆两季还田4种秸秆还田方式,研究了小麦、玉米秸秆全量粉碎还田对机播夏玉米出苗、...</dcterms:abstract>
  <dcterms:abstract xml:lang="En">The effects of straw returned to the field which including no straw returning ( CK ) ,wheat straw returning ( T1 ) ,maize straw returning ( T2 ) and wheat and maize straw returning ( T3 ) on emergence,growth...</dcterms:abstract>
</dc:description>
<dc:language scheme="ags:ISO639-1">Zh</dc:language>
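
A record like this can be read by language tag. The sketch below is only an illustration: the `<record>` wrapper element is an assumption added so the fragment parses on its own, and the helper names are hypothetical. Language tags are lower-cased because the sample mixes "en", "En" and "Zh".

```python
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this namespace by the XML specification.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Cut-down copy of the record above, wrapped in a hypothetical root element.
RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title xml:lang="en">Effects of straw returned to the field on growth and …</dc:title>
  <dc:title xml:lang="Zh">砂姜黑土区秸秆还田对玉米生育及水分利用效率的影响</dc:title>
  <dc:language>Zh</dc:language>
</record>"""


def titles_by_language(xml_text):
    """Map each lower-cased xml:lang value to the title in that language."""
    root = ET.fromstring(xml_text)
    return {t.get(XML_LANG, "").lower(): (t.text or "").strip()
            for t in root.findall("dc:title", NS)}


def pick_title(titles, preferred, fallback="en"):
    """Return the title in the preferred language, else the fallback one."""
    return titles.get(preferred.lower()) or titles.get(fallback)


titles = titles_by_language(RECORD)
print(pick_title(titles, "zh"))
print(pick_title(titles, "fr"))  # no French title, so the English one is used
```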

What about AGROVOC?

• AGRIS records are indexed with the AGROVOC thesaurus, the FAO multilingual vocabulary containing more than 40 000 concepts in 21 languages

• Each record can contain one or more AGROVOC strings in a specific language

• The translation to RDF makes it possible to assign AGROVOC URIs to AGRIS records

• From an AGROVOC URI, the user can extract a great deal of information, such as the translations of AGROVOC strings into many languages
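
The role of the URI as a language-neutral pivot can be sketched with a tiny in-memory table. The concept URI below is a placeholder, not a real AGROVOC identifier, and the table is hand-written; the labels are the ones quoted in this presentation. A real system would read them from the AGROVOC thesaurus.

```python
import unicodedata

# Placeholder URI (assumption), standing in for a real AGROVOC concept URI.
CONCEPT_URI = "http://example.org/agrovoc/aromatic-compounds"

LABELS = {
    CONCEPT_URI: {
        "en": "Aromatic compounds",
        "es": "Compuestos aromáticos",
        "it": "Composti aromatici",
        "zh": "芳香类",
    }
}


def normalize(label):
    """Strip accents and case so 'AROMATICOS' matches 'aromáticos'."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return no_accents.casefold().strip()


# Reverse lookup: any label in any language leads back to the concept URI.
URI_BY_LABEL = {normalize(label): uri
                for uri, by_lang in LABELS.items()
                for label in by_lang.values()}


def translate(label, target_lang):
    """Go from a label to its concept URI, then to the target language."""
    uri = URI_BY_LABEL.get(normalize(label))
    if uri is None:
        return None
    return LABELS[uri].get(target_lang)


print(translate("COMPUESTOS AROMATICOS", "it"))
print(translate("Aromatic compounds", "zh"))
```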

MULTILINGUALITY PROBLEMS AND NEEDS

The scope of this presentation

• Multilinguality problems and needs for the AGRIS online service

Two issues

• Displaying multilingual information

• Multilingual search

Displaying multilingual information

• AGRIS can display its content in all the languages available in the source metadata

• For other languages, a naive translation is provided by the Google translator gadget (this step could be improved)

• http://agris.fao.org/agris-search/search.do?request_locale=en&recordID=CN2012002999

Multilingual search

• Currently not available

• AGRIS records are indexed with AGROVOC keywords in a specific language

• The translation to RDF provides AGROVOC URIs, which could be used to perform a multilingual search

• Currently, only AGROVOC strings go to the Apache Solr index

An example of the issue

• The search +agrovoc:(AROMATIC COMPOUNDS) +agrovoc:(EXTRACTION) returns 467 results, but they do not include the article «Degradacion de compuestos aromaticos por microorganismos y sus aplicaciones biotecnologicas», which was indexed with the Spanish keyword «Compuestos aromaticos»
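
The miss can be reproduced on a toy index. The sketch below is an in-memory stand-in, not Apache Solr: the two records, their ids and the concept URI are hypothetical. It compares an index keyed by the AGROVOC string with one keyed by the concept URI.

```python
from collections import defaultdict

# Placeholder concept URI (assumption), shared by both language labels.
URI = "http://example.org/agrovoc/aromatic-compounds"
LABEL_TO_URI = {"AROMATIC COMPOUNDS": URI, "COMPUESTOS AROMATICOS": URI}

# Two hypothetical records, each indexed in the language of its provider.
RECORDS = [
    {"id": "rec-en", "agrovoc": ["AROMATIC COMPOUNDS"]},
    {"id": "rec-es", "agrovoc": ["COMPUESTOS AROMATICOS"]},
]


def build_index(records, key_fn):
    """Inverted index from key_fn(label) to the set of record ids."""
    index = defaultdict(set)
    for record in records:
        for label in record["agrovoc"]:
            index[key_fn(label)].add(record["id"])
    return index


# Strings only: the key is the label exactly as the cataloguer wrote it.
string_index = build_index(RECORDS, lambda label: label)
# URIs: both labels collapse onto the same language-neutral key.
uri_index = build_index(RECORDS, lambda label: LABEL_TO_URI[label])

string_hits = string_index["AROMATIC COMPOUNDS"]
uri_hits = uri_index[LABEL_TO_URI["AROMATIC COMPOUNDS"]]
print(sorted(string_hits))  # the Spanish record is missing
print(sorted(uri_hits))     # both records are found
```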

A possible need

• It would be great for the AGRIS community if, when the user looks for «Aromatic compounds», the system also returned records indexed with «Composti aromatici», «Compuestos aromáticos», «芳香类», etc.

• AGROVOC could help
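
One way AGROVOC could help is query-time expansion: the user's term is replaced by all labels of the same concept before the query reaches the index, so the index itself does not change. The sketch below is a possible approach rather than an AGRIS feature. The concept table is hand-written with the labels quoted above, and only the `agrovoc` field name and the `+field:(...)` syntax come from the search example in this presentation.

```python
# Hand-written concept table (assumption); a real system would load
# the labels from the AGROVOC thesaurus.
CONCEPTS = [
    {
        "en": "Aromatic compounds",
        "it": "Composti aromatici",
        "es": "Compuestos aromáticos",
        "zh": "芳香类",
    },
]


def expand_term(term, concepts=CONCEPTS):
    """Return every label of the concept matching term, else the term itself."""
    wanted = term.casefold()
    for labels in concepts:
        if any(label.casefold() == wanted for label in labels.values()):
            return list(labels.values())
    return [term]


def solr_clause(field, labels):
    """Build a required clause that matches any of the labels as a phrase."""
    quoted = " OR ".join('"{}"'.format(label.replace('"', '\\"'))
                         for label in labels)
    return f"+{field}:({quoted})"


clause = solr_clause("agrovoc", expand_term("AROMATIC COMPOUNDS"))
print(clause)
```

Whether the expanded labels match the indexed strings still depends on how the field is analysed, for example for accents and case.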

Thank you!