Terminology and Ontologies Section 1: Basics Anne-Kathrin Schumann Saarland University “Expert“ Winter School Birmingham November 13, 2013

16. Anne Schumann (USAAR) Terminology and Ontologies 1

Aug 28, 2014

Transcript
Page 1: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Terminology and Ontologies Section 1: Basics

Anne-Kathrin Schumann

Saarland University

“Expert“ Winter School

Birmingham

November 13, 2013

Page 2: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Why terminology?

Terms and concepts

Conceptual relations

Concept systems and concept-oriented terminology work

Resources and references

Overview

Page 3: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

“Founder“ of terminology: Eugen Wüster, an engineer

Encyclopedic dictionary Esperanto-German

1931: „Die Internationale Sprachnormung in der Technik, besonders in der Elektrotechnik“ (International language standardization in technology, particularly in electrical engineering)

Founder of TC37 (later ISO)

Teacher at University of Vienna

Interlinguistics/planned languages

Why terminology?

Page 4: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Why terminology?

Page 5: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Controlled languages:

“Controlled language … can be defined as a subset of a language with a restricted grammar and a domain specific vocabulary designed to allow domain specialists to unambiguously formulate texts pertaining to their subject fields“

(Wright, Sue E./Budin, Gerhard: Handbook of Terminology Management, vol. 2, p. 872)

Planned languages, e.g. Esperanto, Ido:

Avoidance of lexical ambiguity by means of the construction of an ambiguity-free lexicon

Avoidance of grammatical ambiguities and preference for easy-to-use structures

Why terminology?

Page 6: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Means of expert communication

Text reception (what is the text about?)

Text production (production of comprehensible texts: correctness, univocity, acceptability of specialised texts)

Means of knowledge transfer for education

Instructive texts (text books)

Expert-to-layman communication: introduction and explicitation of terminology

Popularising texts

Why terminology?

Page 7: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Example: specialised text (journal abstract)

Why terminology?

Terminology 13(1): 2007, 35

Page 8: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Without knowing the meaning of the terms it is impossible to understand specialised texts

Terms work as “handles“ to units of knowledge

(or “units of understanding“, Temmerman)

Terminology is a means of reducing complexity

Correct use of terminology is a prerequisite for membership (credibility, social status, comprehensibility) in a community of experts: need for correct translation!

Means of social distinction?

Why terminology?

Page 9: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Example: popularising text (Wikipedia)

Why terminology?

* Terms are linked to (canonical) definitions and/or explanations

* Humans typically acquire this kind of knowledge from specific types of text (educational texts)

Page 10: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Knowledge management (industry, big organisations)

* Strategic management of the knowledge stock of an organisation

* Identification of relevant rules, processes and concepts

* Provision of information about these items (e.g. intranet, knowledge base) – knowledge transfer

* Monitoring and management of knowledge evolution

* Research and comparison with other communities’ knowledge

Why terminology?

Page 11: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Why terminology?

Page 12: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Other applications

domain adaptation of statistical MT systems

ontology-based information retrieval

QA- and expert systems

Why terminology?

Page 13: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Concept vs. term – the general language view

(graphic by Elke Teich)

Terms and concepts

The basics of structuralist semantics

Page 14: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Concept vs. term – the general language view

but in general language, ambiguities are ubiquitous: the relation between linguistic symbols (words, lexical units) and concepts is m:n

Terms and concepts

(diagram: all three mappings are labelled m:n)

Page 15: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Concept vs. term – the general language view

(www.leo.org)

Terms and concepts

Page 16: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Concept vs. term – the terminological view

Why are m:n-mappings (read: inconsistent terminology) problematic for specialised domains?

They hamper the comprehensibility of specialised texts

They create semantic ambiguities (to be avoided at all costs in safety-sensitive environments, e. g. medicine, engineering or construction!)

They reduce retrieval results

They increase translation costs

They lower translation quality (from the translation studies point of view, not necessarily in terms of BLEU points)
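A minimal sketch of how such inconsistencies could be detected automatically, assuming term usage has already been recorded as (term, concept) pairs. The data, the concept identifiers and the function name are invented for illustration only.

```python
from collections import defaultdict

# Hypothetical usage records: (term, concept identifier) pairs, e.g. collected
# from a legacy glossary. All entries are invented for illustration.
USAGE = [
    ("windscreen", "C1"),
    ("windshield", "C1"),
    ("front glass", "C1"),
    ("bonnet", "C2"),
    ("hood", "C2"),
    ("hood", "C3"),  # the same term also designates a second concept
]


def find_inconsistencies(pairs):
    """Return (synonyms, ambiguous).

    synonyms:  concepts designated by more than one term (n terms : 1 concept)
    ambiguous: terms designating more than one concept (1 term : n concepts)
    """
    terms_by_concept = defaultdict(set)
    concepts_by_term = defaultdict(set)
    for term, concept in pairs:
        terms_by_concept[concept].add(term)
        concepts_by_term[term].add(concept)
    synonyms = {c: sorted(t) for c, t in terms_by_concept.items() if len(t) > 1}
    ambiguous = {t: sorted(c) for t, c in concepts_by_term.items() if len(c) > 1}
    return synonyms, ambiguous


synonyms, ambiguous = find_inconsistencies(USAGE)
print(synonyms)
print(ambiguous)
```

Both directions together make up the m:n situation: synonymy on the term side, ambiguity on the concept side.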

Terms and concepts

Page 17: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Concept vs. term – the terminological view

Why are m:n-mappings (read: inconsistent terminology) problematic?

Examples: Ana Hoffmeister, Volkswagen After Sales Language Service http://fr46.uni-saarland.de/fileadmin/user_upload/personen/wurm/Workshops/Hoffmeister_Terminology_Processes_and_Quality_Assurance.pdf

Terms and concepts

Page 18: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Concept vs. term – the terminological view

Terms and concepts

(diagram relating individual objects, concept and term)

individual objects: • material • immaterial • extension

concept: „unit of thought“ – abstract mental representation of typical features (intension)

term: • name, designation • arbitrary linguistic symbol

mapping labels in the diagram: 1:n, 1:1, n:1

Page 19: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Wüster’s answer to lexical ambiguities: active language planning/standardization -> prescriptive intervention into the lexicon of a specialised domain („bewußte Sprachgestaltung“, i.e. deliberate language design; „Soll-Norm“, i.e. prescriptive norm)

descriptive branches of terminology: corpus-based investigations, term extraction, use of (automatically acquired) terms in other applications

Terms and concepts

Page 20: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

What is the added value of the distinction between concepts and terms?

allows us to work with culture- and language-independent concepts rather than language-specific terms: terminology is not really a linguistic enterprise

concepts are understood as universal (independent of cultures and languages) representations of knowledge

Terms and concepts

Page 21: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

What is the added value of the distinction between concepts and terms?

concepts are understood as universal (independent of cultures and languages) representations of knowledge

Abstract away from irrelevant differences

BREAD

Terms and concepts

Page 22: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

What is the added value of the distinction between concepts and terms?

thus, we can easily map multilingual terms onto one single concept

rather than mapping incommensurable multilingual terms onto each other (difficult: lexical gaps, slight shifts in meaning)

Brot, bread, pain, pane, maize, хлеб, … ∈ BREAD
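The mapping of multilingual terms onto one concept can be sketched as a small data structure. This is only an illustration: the class name, the method names and the language codes are assumptions (the slide does not state which language each term belongs to; "maize" is read here as the Latvian word for bread).

```python
from dataclasses import dataclass, field


@dataclass
class ConceptEntry:
    """One concept-oriented entry: a concept plus its designations per language."""
    concept_id: str
    terms: dict = field(default_factory=dict)  # language code -> list of terms

    def add_term(self, lang, term):
        self.terms.setdefault(lang, []).append(term)

    def equivalents(self, term, source_lang, target_lang):
        """Target-language terms, or [] if the term does not designate this concept."""
        if term not in self.terms.get(source_lang, []):
            return []
        return list(self.terms.get(target_lang, []))


# Language codes are assumptions added for this sketch.
bread = ConceptEntry("BREAD")
for lang, term in [("de", "Brot"), ("en", "bread"), ("fr", "pain"),
                   ("it", "pane"), ("lv", "maize"), ("ru", "хлеб")]:
    bread.add_term(lang, term)

print(bread.equivalents("Brot", "de", "ru"))  # ['хлеб']
```

Adding a further language only requires one more designation on the entry, not a new bilingual mapping for every existing language pair.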

Terms and concepts

Page 23: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

What is the added value of the distinction between concepts and terms?

we can distinguish between:

conceptual (semantic) relations – relations between concepts (e. g. HUT is-a HOUSE)

lexical relations – relations between lexical units (lemmas) – (e. g. house, n. vs. to house, v.)

grammatical relations – relations between word forms (e. g. house vs. houses)

only conceptual relations are relevant to terminology

no interest in stylistic or connotational differences between terms (designations)

Terms and concepts

Page 24: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Terms are also words, but what is the difference between general language words and specialised terms?

Terms and concepts

general language word | term

has no specialised meaning | can be a homonym of a general language word, but with a distinct specialised meaning (-> mapping to another concept)

– | can be an abbreviation, an acronym or a unit of measurement, a proper name or a symbol (e. g. mathematical symbols)

meaning often highly dependent on linguistic context (co-text) | meaning defined independently from context

less likely to be a foreign word | more likely to be a foreign word

meaning transparent to competent speakers of given language | meaning is part of expert knowledge, non-experts have to look up the concept definition

Page 25: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Terms are often (but not always!) complex noun phrases

(patterns developed within TTC project: www.ttc-project.eu)

Terms and concepts

Page 26: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

* Terminological phraseology:

DIN 2342: a fixed group of words containing a verb serving as a designation of a given concept within a specialised language

→ einen Wechsel ziehen, den Hochofen anstechen, in Phase sein

→ to pass a bill, to file for divorce

less strict definition: fixed, reproducible, lexicalised and recurrent group of words that is typical for a specialised domain

(cf. Gläser (2007): Fachphraseologie, HSK 28:1, 482-505, my translations from German)

Terms and concepts

Page 27: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

* Terminological phrases have properties similar to those of single word terms:

* no expressive or stylistic connotations

* reference to a context- and culture-independent concept

* not generally comprehensible (need for explanations!)

* non-compositional

Boundary cases:

support verb constructions: Einwände erheben vs. einwenden, to make a decision vs. to decide

collocations: to levy taxes/soldiers/troops

multi word terms (MWT)

(cf. Gläser (2007): Fachphraseologie, HSK 28:1, 482-505)

Terms and concepts

Page 28: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Conceptual relations

relations between concepts

define where a concept is located within the concept system

important for understanding the concept and for distinguishing it from neighbouring concepts

“Semantic relations are at the core of any representational system, and are keys to enable the next generation of information processing systems with semantic and reasoning capabilities.“

(Auger/Barrière 2008:1)

Conceptual relations

Page 29: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Which kinds of relations are relevant to terminology (for concept analysis)?

Wüster: logical relations (similarity between concepts – hierarchical: is-a, siblings etc.) vs. ontological relations (temporal, spatial or causal relations)

terminologies can be represented as graphs:

concepts are nodes

relations are edges

relation types are edge labels

additional information is in the node attributes
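The graph view can be sketched in a few lines. This is an illustrative toy, not a termbase implementation: the class and method names are assumptions, and apart from HUT is-a HOUSE (taken from an earlier slide) the concepts and relations are invented.

```python
class ConceptGraph:
    """Concepts are nodes, relations are labelled edges, extra data sits on the nodes."""

    def __init__(self):
        self.nodes = {}  # concept id -> attribute dict (definition, terms, ...)
        self.edges = []  # (source concept, relation label, target concept)

    def add_concept(self, concept_id, **attributes):
        self.nodes[concept_id] = attributes

    def relate(self, source, relation, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both concepts must be added before relating them")
        self.edges.append((source, relation, target))

    def related(self, source, relation):
        """Targets reachable from `source` over one edge with the given label."""
        return [t for s, r, t in self.edges if s == source and r == relation]

    def ancestors(self, concept_id):
        """All superordinate concepts, following is-a edges transitively."""
        found, frontier = [], [concept_id]
        while frontier:
            current = frontier.pop()
            for parent in self.related(current, "is-a"):
                if parent not in found:
                    found.append(parent)
                    frontier.append(parent)
        return found


graph = ConceptGraph()
graph.add_concept("BUILDING")
graph.add_concept("HOUSE", terms={"en": ["house"]})
graph.add_concept("HUT", terms={"en": ["hut"]})
graph.add_concept("ROOF")
graph.relate("HUT", "is-a", "HOUSE")       # from the slides
graph.relate("HOUSE", "is-a", "BUILDING")  # invented
graph.relate("ROOF", "part-of", "HOUSE")   # invented

print(graph.ancestors("HUT"))  # ['HOUSE', 'BUILDING']
```

Keeping the relation type as an edge label means that new relation types (temporal, causal, domain-specific) need no change to the structure itself.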

Conceptual relations

Page 30: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Conceptual relations

(Nuopponen 1994: 533)

* ISO 12620:2009:

* Generic

* Partitive

* Temporal

* Sequential

* Causal

* generic, broad-coverage relations, no domain-specific relations!

* no consensus

* synonymy, antonymy?

Page 31: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

* To choose the right TL term candidate, information about semantic relations is needed (esp. in the legal domain)

* e.g. retrieved from definitions

* but termbases/dictionaries often do NOT provide this information

Conceptual relations

Page 32: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Conceptual relations

Can we improve the representation of terminological information by providing richer descriptions for language workers? For example, by mining explanations, definitions or semantic relations?

Page 33: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

* terminography is concept-oriented (onomasiological approach)

* structures descriptions around concepts, not around terms

* lexicography is normally designation-oriented (semasiological approach) -> list of lemmas with corresponding enumeration of “word senses“

(www.leo.org)
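The difference between the two orientations is essentially a difference in what serves as the key of an entry. A small sketch, assuming the word senses have already been identified with concept identifiers (lemmas and identifiers are invented for illustration):

```python
from collections import defaultdict

# Semasiological (designation-oriented): lemma -> enumerated word senses.
# Concept identifiers are invented for this sketch.
LEXICON = {
    "bank": ["FINANCIAL_INSTITUTION", "RIVER_SIDE"],
    "shore": ["RIVER_SIDE"],
    "credit institution": ["FINANCIAL_INSTITUTION"],
}


def to_concept_oriented(lexicon):
    """Invert a lemma-keyed lexicon into concept-keyed (onomasiological) entries."""
    entries = defaultdict(list)
    for lemma, senses in lexicon.items():
        for concept in senses:
            entries[concept].append(lemma)
    return {concept: sorted(lemmas) for concept, lemmas in entries.items()}


TERMBASE = to_concept_oriented(LEXICON)
print(TERMBASE)
```

In the inverted structure, each entry collects all designations of one concept, so synonyms end up together and an ambiguous lemma appears in several entries.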

Concept systems and concept-oriented terminology work

Page 34: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

* a typical “sense enumerative“ dictionary entry (Tildes Birojs 2013)

Concept systems and concept-oriented terminology work

Page 35: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

What are shortcomings of “sense enumerative“ lexicography/terminography?

no method for handling multilinguality, since semantic structures do not coincide across languages (language industry projects may involve up to 20-30 languages or even more including translation to/from pivot languages)

no method for dealing with term variation, since variants are kept apart from preferred terms

no 1:1-mappings between multilingual designations – backtranslation normally leads to a different result -> inconsistent translation
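The back-translation problem can be reproduced with two toy bilingual word lists. The sketch below uses invented data and hypothetical function names, and it assumes that a translator (or a system) simply takes the first listed equivalent.

```python
# Invented sense-enumerative bilingual dictionaries: headword -> equivalents.
EN_DE = {"screw": ["Schraube"], "bolt": ["Schraube", "Bolzen"]}
DE_EN = {"Schraube": ["bolt", "screw"], "Bolzen": ["bolt", "pin"]}


def first_equivalent(word, dictionary):
    """Pick the first listed equivalent, as a hurried user might."""
    return dictionary[word][0]


def back_translate(word, forward, backward):
    """Translate into the other language and back again."""
    return first_equivalent(first_equivalent(word, forward), backward)


# Concept-oriented alternative: terms point to a concept, and each concept
# has exactly one preferred term per language (again invented data).
TERM_TO_CONCEPT = {("en", "screw"): "C_SCREW", ("de", "Schraube"): "C_SCREW"}
PREFERRED = {"C_SCREW": {"en": "screw", "de": "Schraube"}}


def translate_via_concept(term, source_lang, target_lang):
    concept = TERM_TO_CONCEPT[(source_lang, term)]
    return PREFERRED[concept][target_lang]


lexical = back_translate("screw", EN_DE, DE_EN)
conceptual = translate_via_concept(
    translate_via_concept("screw", "en", "de"), "de", "en")
print(lexical, conceptual)  # bolt screw
```

The word-list round trip drifts to a different term, while the route via the concept returns to the starting point, because the preferred term is fixed per concept rather than chosen per lookup.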

Concept systems and concept-oriented terminology work

Page 36: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Concept systems and concept-oriented terminology work

Page 37: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Separate entries for different concepts in MultiTerm

Concept systems and concept-oriented terminology work

Page 38: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Onomasiological approaches were not “invented“ by terminology, but are ancient achievements of lexicography proper

Thesauri structure our knowledge of the world according to semantic relations, building a hierarchically organised inventory of concepts (similar to the old philosophical understanding of „ontology“)

Dornseiff: Der deutsche Wortschatz nach Sachgruppen

Roget’s Thesaurus of the English language

О. С. Баранов: Идеографический словарь русского языка

Concept systems and concept-oriented terminology work

Page 39: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Onomasiological approaches were not “invented“ by terminology, but are old achievements of lexicography

Thesauri structure the lexicon according to semantic relations

Concept with identifier as part of concept hierarchy

Related Concepts

Designations for the concept “cosmos“ + related terms

Concept systems and concept-oriented terminology work

Page 40: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Onomasiological approaches were not “invented“ by terminology, but are old achievements of lexicography

Semantic field dictionaries structure the lexicon according to a notion of „semantic proximity“

Schumacher: Verben in Feldern

Шведова: Русский семантический словарь

Concept systems and concept-oriented terminology work

Page 41: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Other kinds of onomasiological resources

A taxonomy is traditionally a scientific system of categories of concepts and hierarchical relations between them

But there are also “folk taxonomies“

Taxonomic approaches have been applied to the description of the lexicon of a given language (e. g. WordNet)

(but are they really language-independent?)

Concept systems and concept-oriented terminology work

Page 42: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Other kinds of onomasiological resources

A nomenclature is a list of designations in a given domain, especially in science

e. g. Bacterial nomenclature

http://www.dsmz.de/fileadmin/Bereiche/ChiefEditors/BacterialNomenclature/DSMZ_Bactnames.pdf

Concept systems and concept-oriented terminology work

Page 43: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Other kinds of onomasiological resources

Finally, ontologies

Traditionally a discipline of theoretical philosophy/metaphysics: categorisation of elements of existence

In the narrower AI sense: form of knowledge representation that makes explicit concepts and the relations between them and imposes functions, restrictions, rules, axioms and the like

Ontologies can be lexicalised, but don’t have to be

Gruber: “An ontology is an explicit specification of a conceptualization.”

(http://tomgruber.org/writing/onto-design.pdf)

Concept systems and concept-oriented terminology work

Page 44: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Other kinds of onomasiological resources

Finally, ontologies

Examples:

Cyc, an ontology of common sense knowledge for AI

DOLCE, a descriptive ontology for linguistic and cognitive engineering

SUMO, the suggested upper merged ontology

… and many others

Concept systems and concept-oriented terminology work

Page 45: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Ontology languages and knowledge representation specifications:

RDF and RDF Schema

OWL, the web ontology language

SKOS, the simple knowledge organization system (builds on RDF and RDFS)

lemon, a lexicon model for ontologies

RDF, RDF Schema, OWL and SKOS are W3C standards
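To give an impression of what a concept looks like in SKOS, the following sketch writes a single concept in Turtle syntax using plain string formatting. The `skos:` properties (`skos:Concept`, `skos:prefLabel`, `skos:broader`) are part of the W3C vocabulary; the `ex:` namespace, the concept identifiers and the function name are assumptions made for this example. Labels containing quotation marks would need escaping, which is left out here.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
EX_NS = "http://example.org/concepts/"  # hypothetical namespace for this sketch


def concept_to_turtle(concept_id, pref_labels, broader=None):
    """Serialise one concept with language-tagged preferred labels as Turtle."""
    lines = [f"ex:{concept_id} a skos:Concept"]
    for lang, label in sorted(pref_labels.items()):
        lines.append(f'    skos:prefLabel "{label}"@{lang}')
    if broader:
        lines.append(f"    skos:broader ex:{broader}")
    header = f"@prefix skos: <{SKOS_NS}> .\n@prefix ex: <{EX_NS}> .\n\n"
    return header + " ;\n".join(lines) + " ."


turtle = concept_to_turtle(
    "BREAD",
    {"de": "Brot", "en": "bread", "fr": "pain"},
    broader="BAKED_GOODS",  # invented superordinate concept
)
print(turtle)
```

The language tags on the labels correspond to the idea of several designations attached to one language-independent concept.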

Resources

Page 46: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Tools and semantic resources:

Protégé, an ontology editor with reasoning component

SNOMED CT, Systematized Nomenclature of Medicine – Clinical Terms

UMLS, Unified Medical Language System

Resources

Page 47: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Relevant Standards:

ISO 704 (2000): Terminology work – Principles and Methods

ISO 1087-1 (2000): Terminology work – Vocabulary – Part 1: Theory and application

ISO 12620 (2009): Terminology and other language and content resources – Specification of data categories and management of a Data Category Registry for language resources

ISO 30042 (2008): Systems to manage terminology, knowledge and content – Termbase eXchange (TBX)

(http://www.ttt.org/oscarStandards/tbx/tbx_oscar.pdf)

Resources

Page 48: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Web pages:

www.isocat.org – ISO TC 37 Terminology and Other Language and Content Resources: data category registry

www.taus.net – association of companies in the translation industry with interesting resources, downloadable TMs for members

termcoord.eu – web page of the European Parliament’s terminology coordination unit

tekom.de – German association for technical communication

Resources

Page 49: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Journals and conferences:

Terminology (Benjamins)

TIA, Terminologie et Intelligence Artificielle

TKE, Terminology and Knowledge Engineering

TEKOM

Resources

Page 50: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Auger, Alain / Barrière, Caroline (2008): “Pattern-based approaches to semantic relation extraction”. Terminology 14 (1), pp. 1-19.

Baranov, Oleg S. (1995): Ideografičeskij slovar’ russkogo jazyka. Moskva: ETS.

Dornseiff, Franz (2004): Der deutsche Wortschatz nach Sachgruppen. Berlin: de Gruyter.

Gläser, Rosemarie (2007): “Fachphraseologie”. In Burger et al. (eds.): Phraseologie. Vol. 1, pp. 482-505.

Gruber, Thomas (1993): “Toward Principles for the Design of Ontologies Used for Knowledge Sharing”. International Journal of Human-Computer Studies 43, pp. 907-928.

International Organization for Standardization (2000a): International Standard ISO 704:2000 (E) – Terminology Work – Principles and Methods. Geneva: ISO.

International Organization for Standardization (2000b): International Standard ISO 1087-1:2000 – Terminology Work – Vocabulary – Part 1: Theory and application. Geneva: ISO.

International Organization for Standardization (2008): International Standard ISO 30042:2008 – Systems to manage terminology, knowledge and content – Termbase eXchange (TBX). Geneva: ISO.

International Organization for Standardization (2009): International Standard ISO 12620:2009 – Terminology and Other Language and Content Resources – Specification of Data Categories and Management of a Data Category Registry for Language Resources. Geneva: ISO.

References: Literature

Page 51: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Kipfer, Barbara A. (2010): Roget’s International Thesaurus. New York: Collins Reference.

Nuopponen, Anita (1994): “Wüster revisited: On Causal Concept Relationships and Causal Concept Systems”. 9th European Symposium on LSP, Bergen, Norway, August 2-6, 1993, pp. 532-539.

Schumacher, Helmut (1986): Verben in Feldern: Valenzwörterbuch zur Syntax und Semantik deutscher Verben. Berlin: de Gruyter.

Švedova, N. Ju. (2002): Russkij semantičeskij slovar’: Tolkovyj slovar’, sistematizirovannyj po klassam slov i značenij. Moskva: Azbukovnik.

Wright, Sue Ellen / Budin, Gerhard (eds.) (2001): Handbook of Terminology Management. Vol. 2: Application-Oriented Terminology Management. Amsterdam/Philadelphia: John Benjamins.

References: Literature

Page 52: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

http://www.cyc.com/platform/opencyc

http://www.loa.istc.cnr.it/DOLCE.html

http://www.ontologyportal.org/

http://lemon-model.net/

http://protege.stanford.edu/

http://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html

https://uts.nlm.nih.gov/home.html

References: Tools and Resources

Page 53: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

Dr. Ana Hoffmeister, Volkswagen After Sales Language Service

Prof. Elke Teich, Saarland University

Prof. Klaus Schubert, University of Hildesheim

Contributions to this Presentation

Page 54: 16. Anne Schumann (USAAR) Terminology and Ontologies 1

End of part 1 …

Thanks for your attention!