Dec 16, 2015
Terminology standards – enhancing language
ISO/TC 37 Semantic Interoperability
ISO TC 37 Secretariat c/o Infoterm
Christian Galinski
Bamako (Mali) 2005-05-06/07
ISO/TC 37 – Bamako 2005-05-06/07
Overview
• UNESCO’s IFAP Area 4
• IFAP
• UNESCO and multilinguality
• Advocating open access solutions
• Language in industry
• eContent development
• Global semantic interoperability
• Standards for ...
• Terminology standardization
• Terminology? Content entities
• Terminology eContent
• Terminology in ISO/TC 37
• + Language resources & LR management
• + Content resources
• Standardization of terminological principles and methods
• ISO/TC 37
• ISO/TC 37/SC 1 ~ 4
• ISO/TC 37 Outlook
• Semantic interoperability – HOW?
What is terminology?
• The description of the specialized vocabulary of an application domain
• Cf. Eugen Wüster: conceptual view
  • knowledge representation at concept level
• Monolingual or multilingual
• Mainly nouns (incl. multi-word nominal units), some verbs, adjectives and adverbs
• A strong yet practical simplification of lexical description
• Increasing occurrence of non-verbal knowledge representations
IFAP Areas of intervention
What are IFAP’s areas of intervention?
• Area 1: Development of international, regional and national information policies
• Area 2: Development of human resources and capabilities for the information age
• Area 3: Strengthening institutions as gateways for information access
• Area 4: Development of information processing and management tools and systems (Multilingualism) standards
ISO/TC 37 methodology standards:
• terminology
• language resources (at the level of concepts)
• other content entities (at the level of concepts)
UNESCO and multilinguality
• Promoting a wider, more equitable access to information (« Recommendation on the promotion of multilingualism and universal access to Cyberspace » / Initiative B@bel)
• Raising awareness of issues of equitable access and multilingualism
• Encouraging Member States to
  • Develop strong policies which promote and facilitate language diversity on the Internet: Guidelines for Terminology Policies
  • Create widely available online tools and applications (such as terminologies, automatic translators, dictionaries) for content in local languages
  • Share best practices and information: ISO/TC 37
Advocating open access solutions
“Member States and international organizations should encourage open access solutions including the formulation of technical and methodological standards for information exchange, portability and interoperability, as well as online accessibility of public domain information on global information networks.” (UNESCO Recommendation on Multilingualism and Access to Cyberspace)
“Governments should promote the development and use of open, interoperable, non-discriminatory and demand-driven standards.” (WSIS Action Plan)
Open source software? + Open content?
Language in industry
Exchange of content entities: e.g. entry in a product catalogue
• Name of company (® enterprise)
• Name of product (model) (™ enterprise)
• Generic name of product (e.g. © Harmonized System)
• Class (name under which the product falls) (e.g. © eCl@ss)
• Verbal/textual description (© enterprise)
• Picture (© rights owner)
• Technical data
  • (unified) branch properties (e.g. © OAGi)
  • Standardized characteristics (e.g. © DIN)
  • Enterprise product specific data (e.g. for collaborative business)
  • Enterprise internal data (maybe confidential/secret)
225/55/16 V
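The catalogue entry described on this slide can be pictured as a small record type. The following is only a sketch: the class name, the field names, the sample company and product, and the reading of "225/55/16 V" as a tyre size (width / aspect ratio / rim diameter, plus a speed symbol) are assumptions made for illustration and are not taken from any standard named in the presentation.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Hypothetical record mirroring the content entities listed on the slide."""
    company_name: str          # name of company
    product_name: str          # name of product (model)
    generic_name: str          # generic name of product
    product_class: str         # class under which the product falls
    description: str           # verbal/textual description
    picture_ref: str = ""      # reference to a picture, if any
    technical_data: dict = field(default_factory=dict)


def parse_designation(text: str) -> dict:
    """Split a designation such as '225/55/16 V' into named properties.

    Assumption: the pattern is width/aspect ratio/rim diameter + speed symbol.
    """
    match = re.fullmatch(r"(\d{3})/(\d{2})/(\d{2})\s*([A-Z])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized designation: {text!r}")
    width, aspect, rim, speed = match.groups()
    return {
        "section_width_mm": int(width),
        "aspect_ratio_percent": int(aspect),
        "rim_diameter_inch": int(rim),
        "speed_symbol": speed,
    }


# Illustrative values only; the company and model are invented.
entry = CatalogueEntry(
    company_name="Example Tyres Ltd",
    product_name="Model X",
    generic_name="pneumatic tyre",
    product_class="tyres for passenger cars",
    description="Summer tyre for passenger cars",
    technical_data=parse_designation("225/55/16 V"),
)
print(entry.technical_data)
```

The point of the sketch is that a short string in a catalogue carries several distinct properties, each of which needs an agreed name and meaning before two enterprises can exchange the entry.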
eContent DEVELOPMENT
Workflow management for content development: net-based, distributed, cooperative creation of structured content
CO-OPERATION INTEROPERABILITY
STANDARDIZATION
Re-use in applications (based on the “single-source” principle):
• eLearning
• eGovernment
• eHealth
• eBusiness
• other e...s
multilingual, multimodal, multimedia, multi-channel output, complying with accessibility requirements
THE CHALLENGE: (user point-of-view)
• throughout the enterprise/organization: requested e.g. in e-government
• between enterprises/organizations: requested by the market
• within industry consortia: requested by industry branches
• between industry consortia: ??? (urgently needs harmonization and especially open standards)
• between different e…s: requested by the user
• between different language communities: requested by the end user
within the standardization world
Global Semantic Interoperability
STANDARDS FOR:
hw, sw, methodology standards
• Technology: ITU, ISO, IEC, industry
• Business models: UN/ECE, ISO, industry
• “Language”: ISO/TC 37, research consortia
• Transfers/transactions: ITU, UN/ECE, industry
• Standards*: MoU/MG – why?
• Content: ? Methodology!!! semantic interoperability
• Legal issues: ?
*Standards should be examined as to whether they support, allow or hinder multilinguality and cultural diversity (very important for SMEs) and semantic interoperability at large
Terminology standardization
Standardization of terminologies
• Terminological data
  • Linguistic and non-linguistic representations
    • Designations: term, abbreviation, graphic symbol, formula, acoustic symbol, etc.
    • Descriptions: definition, explanation, non-linguistic [descriptive] representation, etc.
  • Source-related data
  • Data management related data (field, record, holding)
  • Classification (multiple)
• Terminology-related data: names, phraseology, ...
Standardization of terminological principles and methods
generic for many types of content entities
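The kinds of terminological data listed on this slide (designations, descriptions, source-related data, data management data, classification) can be sketched as one concept-oriented record. This is an illustration under stated assumptions: all class and field names, the sample concept and the wording of its definition are hypothetical and do not reproduce any ISO data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Designation:
    kind: str            # e.g. "term", "abbreviation", "graphic symbol", "formula"
    value: str
    language: str = ""   # empty for non-linguistic designations


@dataclass
class Description:
    kind: str            # e.g. "definition", "explanation"
    text: str
    language: str
    source: str = ""     # source-related data


@dataclass
class TerminologicalEntry:
    """One entry per concept; every language hangs off the same concept."""
    entry_id: str
    designations: List[Designation] = field(default_factory=list)
    descriptions: List[Description] = field(default_factory=list)
    classifications: List[str] = field(default_factory=list)  # multiple allowed
    record_status: str = "draft"                               # data management data

    def terms(self, language: str) -> List[str]:
        """Return the terms recorded for one language."""
        return [d.value for d in self.designations
                if d.kind == "term" and d.language == language]


# Illustrative content only.
entry = TerminologicalEntry(
    entry_id="c-0001",
    designations=[
        Designation("term", "tyre", "en"),
        Designation("term", "pneu", "fr"),
        Designation("term", "Reifen", "de"),
    ],
    descriptions=[
        Description("definition",
                    "flexible covering fitted around the rim of a wheel",
                    "en", source="hypothetical example"),
    ],
    classifications=["automotive", "rubber products"],
)
print(entry.terms("fr"))
```

Grouping all languages under one concept, rather than keeping one word list per language, is what makes the same record usable monolingually or multilingually.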
Terminology? content entities Terminology? knowledge representations
• Nomenclature, taxonomy, typology, partonomy, ...
• Glossary, vocabulary, ...
• Terminological phraseology
• Graphical symbols and other non-linguistic representations?
• Properties, characteristics, attributes, ...
• Ontology
• Names? to be further studied
+ closely related:
• Thesauri, classification schemes, keywords
• Encyclopedic (knowledge) entries
  • Knowledge-enriched terminology entries
  • Names, proper names, ...
• Ontologies, topic maps, ...
ONE methodology
Terminology eContent
embedded terminology (or combination of terminology + …)
• Texts: translation, localization, internationalization…
• Speech: communication…
• Image: CAD/CAM…
• Multimedia: video, presentations…
knowledge-rich terminology
• Encyclopedic knowledge: Wikipedia…
• “Knowledge” management: incl. true “content management”
  • document management
  • communication management
  • information management
“popularized” terminology
“Terminology and other language and content resources”
ONE methodology
Terminology today
Given its pervasive occurrence in all (written or spoken) domain communication, terminology today has to be considered an economic factor, especially in
• product data description and management (incl. eCatalogues and product classification)
• quality management
• inter-cultural aspects of management and marketing
• translation and localization
• information, documentation, software development
• knowledge transfer, teaching and training, …
Multilinguality and cultural diversity
Terminology science as a field of fundamental research as well as applied R&D
Impact on standardization
Terminology in ISO/TC 37
Multifunctional nature of terminology:
• Terminology as knowledge representation
• Terminologies as means of domain communication
• Terminologies as means of access to other kinds of information (objects)
• Terminologies as means of knowledge ordering at micro-level
+ Language resource management
Language resources:
• Text corpora tagging (on the basis of grammar models)
• Lexicographical data
  • Words
  • Collocations
  • Morphology
• Terminology
• Speech data
LR management:
• Input / import
• Metadata (incl. bundling/bindings etc.)
• Data modelling & metamodel(s)
• Exchange / interoperability
• etc.
+ other kinds of content entities
Textual & non-linguistic types of content:
• Audio information (e.g. read-out written content)
• av information (e.g. sign language)
• Multimedia information
• Haptic information (e.g. in “intelligent cars”)
• …
Increasingly different (technical) types of content co-occur or are embedded in each other or are combined with each other – e.g. traffic telematics
ISO/TC 37 – Standardization of terminological principles and methods
• Fundamental principles
• Vocabulary of terminology
• Terminography
• Language resource management
• Terminology work (especially systematic ~~)
• Applications based on terminology methods
Content management? eContent, mContent
• Multilingual, multimodal, multimedia, universal accessibility, multi-channel
• Re-usability interoperability/ies
• Resource-sharing peer2peer
ISO/TC 37
Old title: Terminology and other language resources
Old scope: Standardization of principles, methods and applications relating to terminology and other language resources
New title: Terminology and other language and content resources
New scope: Standardization of principles, methods and applications relating to terminology and other language and content resources in the contexts of multilingual communication and cultural diversity
As is the case with terminologies, language resources in general have to be considered as multilingual, multimedia and multimodal from the outset.
Generic fundamental standards for all activities involving language
ISO/TC 37/SC 1 (1)
Title: Principles and methods
Old scope: Standardization of basic principles and methods for developing scientific and technical terminologies and other language resources
New scope: ??? still under discussion
ISO/TC 37/SC 1 prepares the meta-standards for the documents prepared by ISO/TC 37/SCs 2, 3 and 4, which cannot be consistent and coherent without these standards. The same applies to the documentation of content management in organizations.
ISO/TC 37/SC 1 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 1:
ISO 704:2000 Terminology work – Principles and methods
ISO 860:1996 Terminology work – Harmonization of concepts and terms
ISO 1087-1:2000 Terminology work – Vocabulary – Part 1: Theory and application
The following standards are under preparation:
ISO/CD 704 Terminology work – Principles and methods
ISO/CD 860 Terminology work – Harmonization of concepts and terms
ISO/PWI 1087-1 Terminology work – Vocabulary – Part 1: Theory and application
ISO/WD 22134 Practical guide for socioterminology
ISO/TC 37/SC 2 (1)
Title: Terminography and lexicography
New scope: Standardization of terminological and lexicographical working methods, procedures, coding systems, workflows, and cultural diversity management, as well as related certification schemes
Tens of thousands of terminology commissions, committees and other terminological entities (especially terminology standardizing SCs and WGs within the standardization framework) are using ISO/TC 37/SC 2 standards. This indirectly improves the overall degree of re-usability and interoperability of the resulting data and documents.
ISO/TC 37/SC 2 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 2:
ISO 639-1:2002 Codes for the representation of names of languages – Part 1: Alpha-2 code
ISO 639-2:1998 Codes for the representation of names of languages – Part 2: Alpha-3 code
ISO 1951:1997 Lexicographical symbols and typographical conventions for use in terminography
ISO 10241:1992 International terminology standards -- Preparation and layout
ISO 12199:2000 Alphabetical ordering of multilingual terminological and lexicographical data represented in the Latin alphabet
ISO 12616:2002 Translation-oriented terminography
ISO 15188:2001 Project management guidelines for terminology standardization
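The two language code standards listed above differ in code length: ISO 639-1 uses two-letter codes and ISO 639-2 three-letter codes. A minimal sketch of converting between them follows. The function names and the tiny sample table are assumptions for illustration; a real application would load the complete code tables published for the standards. For French and German, ISO 639-2 has two variants, and the table below uses the terminology (T) codes.

```python
import re
from typing import Optional

# Small illustrative sample, not the full code tables.
ALPHA2_TO_ALPHA3 = {
    "bm": "bam",  # Bambara
    "en": "eng",  # English
    "fr": "fra",  # French (terminology code; bibliographic code is "fre")
    "de": "deu",  # German (terminology code; bibliographic code is "ger")
}


def is_alpha2(code: str) -> bool:
    """Check only the shape of an ISO 639-1 code: two lowercase letters."""
    return re.fullmatch(r"[a-z]{2}", code) is not None


def to_alpha3(code: str) -> Optional[str]:
    """Map a two-letter code to its three-letter code, or None if not in the sample."""
    if not is_alpha2(code):
        raise ValueError(f"not a two-letter language code: {code!r}")
    return ALPHA2_TO_ALPHA3.get(code)


for code in ("bm", "fr", "xx"):
    print(code, to_alpha3(code))
```

Returning `None` for an unknown code, instead of guessing, keeps the distinction between "well-formed" and "actually assigned" visible to the caller.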
ISO/TC 37/SC 2 (3)
The following standards are under preparation:
ISO/CD 639-3 Codes for the representation of names of languages – Part 3: Alpha-3 code for comprehensive coverage of languages
ISO/WD 639-4 Codes for the representation of names of languages – Part 4: Implementation guidelines and general principles for language coding
ISO/WD 639-5 Codes for the representation of names of languages – Part 5: Alpha-3 code for language families and groups
ISO/CD 639-6 Codes for the representation of names of languages – Part 6: Extension coding for language variation
ISO/DIS 1951 Presentation/representation of entries in dictionaries
ISO/CD 10241-1 Terminological entries in standards – Part 1: General requirements
ISO/AWI 10241-2 Terminological entries in standards
ISO 12615 Bibliographic references and source identifiers for terminology
ISO/PWI TR 22128 Quality assurance guidelines for terminology products
ISO/PWI 22130 Additional language coding
ISO/NP 23185 Assessment and benchmarking of terminological holdings
ISO/TC 37/SC 3 (1)
Old title: Computer applications for terminology
New title: Terminology management systems and content interoperability
New scope: Standardization of principles and requirements for semantic interoperability, terminology and content management systems, and knowledge ordering tools
Software developers use the documents of ISO/TC 37/SC 3 to design terminology management systems (TMS) or terminology management modules to be integrated into content management as well as information and knowledge management systems. In this way the terminological principles and methods (provided by ISO/TC 37/SC 1) are directly integrated as ‘defaults’ into concrete system design for handling all kinds of information.
ISO/TC 37/SC 3 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 3:
ISO 1087-2:2000 Terminology work – Vocabulary – Part 2: Computer applications
ISO 6156:1987 Magnetic tape exchange format for terminological/ lexicographical records (MATER) - withdrawn
ISO 12200:1999 Computer applications in terminology – Machine-readable terminology interchange format (MARTIF) – Negotiated interchange
ISO 12620:1999 Computer applications in terminology – Data categories
ISO 16642:2003 Computer applications in terminology – Terminological markup framework
ISO/TC 37/SC 3 (3)
The following standards are under preparation:
ISO/PWI TR 12618 Computational aids in terminology – Design, implementation and use of terminology management systems
ISO/CD 12620-1 Computer applications in terminology – Data categories – Part 1: Model for description and procedures for maintenance of data category registries for language resources
ISO/CD 12620-2 Computer applications in terminology – Data categories – Part 2: Terminological data categories
ISO/TC 37/SC 4 (1)
Title: Language resource management
Scope: Standardization of specifications for computer-assisted language resource management
Given the fact that
• linguistic infrastructures are being established or reinforced as part of the rapidly evolving information and communication society;
• professional activities involving language resource sharing and standardization are increasing in diverse areas: governmental or non-governmental organizations, public or private institutions, educational institutions, commercial enterprises, etc.;
• both globalization and localization necessitate multilingual communication;
there is an increasing need for new standardization as well as urgent recognition of existing de facto standards and their transformation into International Standards.
ISO/TC 37/SC 4 (2)
The following standards are under preparation:
ISO/AWI 21829 Terminology for language resources
ISO/CD 24610-1 Language resource management – Feature structures – Part 1: Feature structure representation
ISO/WD 24611 Language resource management – Morphosyntactic annotation framework
ISO/WD 24612 Language resource management – Linguistic annotation framework
ISO/WD 24613 Language resource management – Lexical markup framework
ISO/AWI 24614-1 Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 1: General principles and methods
ISO/AWI 24614-2 Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 2: Word segmentation for Chinese, Japanese and Korean
ISO/NP 24614-3 Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 3: Word segmentation for other languages
State-of-the-art (diagram, from METHODOLOGY to APPLICATIONS):
• (family of) metamodels*: ISO 16642
• Data models**: ISO 12200; eBusiness; other e...s
• Data categories***: ISO 12620; domain data dictionaries (DDDs)
Basic principles and requirements concerning multilingual e/m-content development, data categories/metadata, data modelling, rules for repositories (maintained in MAs/RAs/Reg’s)
*ISO 16642 TMF; ISO 10303-11 EXPRESS; ISO 10303-21 SDAI; …
**ISO 12200 MARTIF; ISO 13584-42 PLIB ~ IEC 61360-2
***ISO 12620 Data categories; ISO 13584-511 Fastener dictionary; IEC 61360-4 Core dictionary; …
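The relation between data categories and data models on this slide can be illustrated with a toy registry check: a record is only exchangeable if every field it uses is a registered data category. This is a sketch under assumptions. The category names, their grouping and the function name below are hypothetical; they are not copied from ISO 12620 or from any domain data dictionary.

```python
from typing import Dict, List

# Hypothetical mini-registry: data category name -> kind of information it holds.
DATA_CATEGORIES: Dict[str, str] = {
    "term": "designation",
    "abbreviation": "designation",
    "definition": "description",
    "source": "administrative",
    "subjectField": "classification",
}


def unregistered_categories(record: Dict[str, str]) -> List[str]:
    """Return the field names of a record that are missing from the registry."""
    return sorted(name for name in record if name not in DATA_CATEGORIES)


# Illustrative record; "colour" is deliberately not a registered category.
record = {
    "term": "tyre",
    "definition": "flexible covering fitted around the rim of a wheel",
    "colour": "black",
}
print(unregistered_categories(record))
```

A shared registry of this kind is what lets separately designed data models for different applications still agree on the meaning of each field.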
Semantic interoperability standards
• Content-related requirements
• Workflow methodology
• Metadata
• Metadata repositories
• Data modelling principles and requirements
• Micro data models
• Metamodels
• Content repositories
• Federation of repositories
• …
CONFERENCES
Terminology Summer School – Cologne (Germany) 2005-07-14/23
TAMA 2005 “Terminology in Advanced Management Applications” – Wiesbaden (Germany) 2005-11-09
TKE 2005 “Terminology and Knowledge Engineering” – Copenhagen (Denmark) 2005-08-15/19
OFMR 2006 “Open Forum on Metadata Registries” – Japan 2006-03-20/22
Thank you for your attention
ADDRESS:
ISO/TC 37
c/o Infoterm – International Information Centre for Terminology
Aichholzgasse 6/12
A-1120 Vienna – Austria
Tel: +43-1-817 44 99
Fax: +43-1-817 44 [email protected]
www.infoterm.info
ISO/TC 37 Secretariat:
Secretary: Christian Galinski
Chairman: Håvard Hjulstad (SN)