Linguistic Representation of Finnish in the Medical Domain Spoken Language Translation System

Marianne Santaholma, University of Geneva, TIM/ISSCO

Post on 26-Dec-2015



Outline

MedSLT system overview
MedSLT Finnish language resources:
• Corpora
• Generation grammar
• Lexicon
• Interlingua-to-Finnish mapping rules
Initial evaluation results
Summary

MedSLT system (1)

• Open source medical domain SLT system
• Diagnosis tool for doctors
• One-way dialog
• Multilingual
• Coverage: medical sub-domains
• Architecture: based on general linguistic resources

[Figure: MedSLT architecture diagram. Component labels: Speech Platform Interface Process; Translation Server; Unification Grammar Database; Recognition Package; Nuance Voice Platform (recognition, playback); Application Specific Data; Regulus Run-Time System; Regulus Compile-Time Component; Generation Grammar]


Finnish corpora (1)

• Headache and chest pain sub-domains
• Created by translating the original English corpora
• Serve as the primary source for deciding which structure rules and vocabulary need to be introduced into the Finnish language module

Finnish corpora (2)

Concepts covered: frequency of pain, duration of pain, location of pain, etc.

Examples:
Do you have headaches in the morning?
• In the evening?
Is your headache stabbing?
• Severe?
Are your headaches caused by coffee?
• By cheese?


FIN generation grammar (1)

• Specialized grammar for spoken language
• Reflects the specific text type and discourse of the domain
• 57 grammar rules
• Unification formalism
• Developed on the Regulus platform: https://sourceforge.net/projects/regulus/

FIN generation grammar (2)

• The FIN grammar was developed by manually adapting the Regulus general English grammar
• The Finnish structure rules are highly similar to their English counterparts
• In Finnish, more phenomena are resolved at the morphology level than in syntax

(Rayner et al., 2000. Spoken Language Translator)

FIN generation grammar (2)

English: 'How frequent are your headaches?'

    s:[sem= @fronting_sem(Adj, S), wh=y\/rel, wh=Wh, vform=VForm,
       inv=Inv, whmoved=y, takes_adv_type=none, gapsin=null, gapsout=null] -->
        adjp:[sem=Adj, wh=Wh, adjpos=pred, gapsin=null, gapsout=null],
        s:[sem=S, wh=n, vform=VForm, inv=Inv, whmoved=n, gapsin=adjp_gap, gapsout=null].

Finnish: 'Kuinka yleisiä päänsärkynne ovat?' (literally: 'how frequent your_headaches are?')

    s:[sem= @fronting_sem(Adj, S), wh=y, inv=n, vform=inf,
       whmoved=y, takes_adv_type=none, gapsin=null, gapsout=null] -->
        adjp:[sem=Adj, wh=y, agr=Agr, adj_pos=pred, adj_case=Case, adj_degr=positive, gapsin=null, gapsout=null],
        s:[sem=S, wh=n, agr=Agr, vform=inf, inv=n, whmoved=n, gapsin=adjp_gap, gapsout=null].
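The shared variable Agr in the Finnish rule is what forces the fronted adjective phrase and the clause to agree. The following is a minimal Python sketch of that idea, assuming flat feature sets and treating capitalised strings as variables. The helper names (is_var, unify_features, rule_applies) and the feature values given to the example constituents are hypothetical illustrations and are not part of Regulus.

```python
def is_var(value):
    """Treat capitalised strings (Agr, Case) as variables, as in the rule notation."""
    return isinstance(value, str) and value[:1].isupper()


def unify_features(pattern, actual, bindings):
    """Unify a rule's feature pattern with a constituent's features.

    Returns an extended copy of the bindings, or None on a clash.
    """
    new_bindings = dict(bindings)
    for feature, wanted in pattern.items():
        if feature not in actual:
            continue  # an unspecified feature unifies with anything
        found = actual[feature]
        if is_var(wanted):
            # A variable must take the same value everywhere it occurs.
            if new_bindings.get(wanted, found) != found:
                return None
            new_bindings[wanted] = found
        elif wanted != found:
            return None
    return new_bindings


# Simplified right-hand side of the Finnish fronting rule (subset of features).
ADJP_PATTERN = {"wh": "y", "agr": "Agr", "adj_pos": "pred", "adj_case": "Case"}
S_PATTERN = {"wh": "n", "agr": "Agr", "vform": "inf", "inv": "n"}


def rule_applies(adjp, s):
    """Return the variable bindings if both daughters unify, else None."""
    bindings = unify_features(ADJP_PATTERN, adjp, {})
    if bindings is None:
        return None
    return unify_features(S_PATTERN, s, bindings)


# Hypothetical feature values for 'kuinka yleisiä' and two candidate clauses.
adjp = {"wh": "y", "agr": "pl", "adj_pos": "pred", "adj_case": "ptv"}
s_plural = {"wh": "n", "agr": "pl", "vform": "inf", "inv": "n"}
s_singular = {"wh": "n", "agr": "sg", "vform": "inf", "inv": "n"}

agreeing = rule_applies(adjp, s_plural)    # Agr is bound to "pl" in both daughters
clashing = rule_applies(adjp, s_singular)  # "pl" vs "sg" clash, so no analysis
```

In this sketch a plural adjective phrase combines only with a plural clause, which is the extra constraint the Finnish rule adds over its English counterpart.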


Finnish Lexicon (1)

• Domain specific
• ~530 lexical entries
• Difficulty: enumeration of all word forms

Example: lievittää, 'to relieve', question form, 3rd person singular, present.

    verb:[sem=[[event, lievittää], [tense, present]], vform=q_ko, agr=sg,
          subcat=trans, subj_n_case=nom, subj_sem_n_type=(cause\/activity),
          obj_sem_n_type=perception_body, obj_case=ptv,
          takes_adv_type=frequency] --> lievittääkö.

Finnish Lexicon (2)

Use of macros in lexical entries:

    macro(noun_perception_body([SgNom, PlNom, SgPtv, PlPtv], Sem),
          (noun:[sem=[Sem], sem_n_type=perception_body, agr=sg, case=nom] --> SgNom)).
    macro(noun_perception_body([SgNom, PlNom, SgPtv, PlPtv], Sem),
          (noun:[sem=[Sem], sem_n_type=perception_body, agr=pl, case=nom] --> PlNom)).
    macro(noun_perception_body([SgNom, PlNom, SgPtv, PlPtv], Sem),
          (noun:[sem=[Sem], sem_n_type=perception_body, agr=sg, case=ptv] --> SgPtv)).
    macro(noun_perception_body([SgNom, PlNom, SgPtv, PlPtv], Sem),
          (noun:[sem=[Sem], sem_n_type=perception_body, agr=pl, case=ptv] --> PlPtv)).

    @noun_perception_body([särky, säryt, särkyä, särkyjä], [symptom, särky]).
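The effect of the macro call can be illustrated outside Regulus: one call supplies four word forms and is expanded into four lexical entries, one per number and case combination. The Python sketch below mimics that expansion. The function name and the dictionary layout are assumptions made for illustration, not the way Regulus stores entries.

```python
def noun_perception_body(forms, sem):
    """Expand one macro call into four lexical entries (sg/pl x nom/ptv)."""
    sg_nom, pl_nom, sg_ptv, pl_ptv = forms
    slots = [
        ("sg", "nom", sg_nom),
        ("pl", "nom", pl_nom),
        ("sg", "ptv", sg_ptv),
        ("pl", "ptv", pl_ptv),
    ]
    return [
        {
            "cat": "noun",
            "sem": [sem],
            "sem_n_type": "perception_body",
            "agr": agr,
            "case": case,
            "surface": surface,
        }
        for agr, case, surface in slots
    ]


# Equivalent of: @noun_perception_body([särky, säryt, särkyä, särkyjä], [symptom, särky]).
entries = noun_perception_body(
    ["särky", "säryt", "särkyä", "särkyjä"], ["symptom", "särky"]
)

for entry in entries:
    print(entry["surface"], entry["agr"], entry["case"])
```

Adding a further noun of the same semantic type then costs one line listing its four forms instead of four full entries.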


Interlingua to FIN mapping

MedSLT interlingua:
    interlingua_constant([<key>, <value>])
    'interlingua_constant([symptom, headache])'

Interlingua mapping rules:
    Transformation: Source → Interlingua, Interlingua → Target

Two types of rules:
• Simple: interlingua transfer_lexicon entries
• Complex: interlingua transfer_rules

SOURCE → INTERLINGUA → TARGET

ENG: Does the red wine make your headache worse?
FIN: Pahentaako punaviini päänsärkyä?

Source representation:
    [[adj,worse], [cause,red_wine], [event,make_adj], [prep,subj],
     [secondary_symptom,headache], [spec,the_sing], [tense,present],
     [utterance_type,ynq], [voice,active]]

Interlingua representation:
    [[sc,when],
     [clause, [[utterance_type,dcl], [pronoun,you], [tense,present],
               [voice,active], [action,drink], [cause,red_wine]]],
     [event,become_worse], [symptom,headache], [tense,present],
     [utterance_type,ynq], [voice,active]]

Target representation:
    [[cause,punaviini], [event,pahentaa], [symptom,päänsärky],
     [tense,present], [utterance_type,ynq]]

Rules applied:
    transfer_rule([[sc,when],
                   [clause, [[utterance_type,dcl], [pronoun,you], [tense,present],
                             [voice,active], [action,drink], ECause]],
                   [event,become_worse], [voice,active]],
                  [[event,pahentaa], @efin_cause(ECause)]).

    transfer_lexicon([symptom, headache], [symptom, päänsärky]).
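The interplay of the two rule types can be traced in a few lines of Python. This is a hypothetical re-implementation for illustration only, not the Regulus transfer engine. It assumes that the macro @efin_cause maps [cause, red_wine] to [cause, punaviini] (inferred from the target representation) and that elements matched by no rule pass through unchanged.

```python
# Interlingua for "Does the red wine make your headache worse?" (from the slide).
INTERLINGUA = [
    ["sc", "when"],
    ["clause", [["utterance_type", "dcl"], ["pronoun", "you"], ["tense", "present"],
                ["voice", "active"], ["action", "drink"], ["cause", "red_wine"]]],
    ["event", "become_worse"],
    ["symptom", "headache"],
    ["tense", "present"],
    ["utterance_type", "ynq"],
    ["voice", "active"],
]

# Simple rules: one interlingua pair maps to one Finnish pair.
TRANSFER_LEXICON = {
    ("symptom", "headache"): ("symptom", "päänsärky"),
    ("cause", "red_wine"): ("cause", "punaviini"),  # assumed stand-in for @efin_cause
}

# Fixed parts of the complex rule's left-hand side.
TOP_FIXED = [["sc", "when"], ["event", "become_worse"], ["voice", "active"]]
CLAUSE_FIXED = [["utterance_type", "dcl"], ["pronoun", "you"], ["tense", "present"],
                ["voice", "active"], ["action", "drink"]]


def apply_complex_rule(rep):
    """Rewrite 'when you drink X ... become worse' as [event, pahentaa] plus X."""
    clause = next((item[1] for item in rep if item[0] == "clause"), None)
    if clause is None or any(item not in rep for item in TOP_FIXED):
        return rep
    causes = [item for item in clause if item[0] == "cause"]
    if (any(item not in clause for item in CLAUSE_FIXED)
            or len(causes) != 1
            or len(clause) != len(CLAUSE_FIXED) + 1):
        return rep
    rest = [item for item in rep if item not in TOP_FIXED and item[0] != "clause"]
    return [["event", "pahentaa"], causes[0]] + rest


def apply_lexicon(rep):
    """Replace each atomic [key, value] pair that has a transfer_lexicon entry."""
    out = []
    for key, value in rep:
        if isinstance(value, str):
            key, value = TRANSFER_LEXICON.get((key, value), (key, value))
        out.append([key, value])
    return out


def transfer(rep):
    """Apply the complex rule first, then the lexicon; sort by key for readability."""
    return sorted(apply_lexicon(apply_complex_rule(rep)), key=lambda item: item[0])


target = transfer(INTERLINGUA)
print(target)
```

Under these assumptions the result coincides with the target representation shown on the slide.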


Evaluation (1)

• Evaluation of Eng-Fin translation performance on a headache sub-domain corpus of 870 utterances
• Comparison with Eng-Fre translation performance

Evaluation in two phases:
1. Judging of speech recognition: good vs. bad
2. Judging of translations: good/acceptable/bad

Evaluation (2)

[Bar chart: translation judgements for FIN and FRE, vertical axis 0 to 80]

                            FIN     FRE
    Good translation        60      75.8
    Acceptable translation  4.4     19.2
    Bad translation         0.5     0.7
    No translation          35      4.4

Evaluation (3)

Lexical gaps

Example:
"Does the pain radiate to the neck?" (in-coverage sentence)
"Is the pain in the neck?" (not in-coverage sentence)
- Finnish allative vs. adessive case: 'kaulalle' vs. 'kaulalla'

Summary

• Development of the MedSLT Finnish language module by partly adapting the existing resources
• English and Finnish grammar rules are highly similar despite the differences between the languages
• Difficulty: the rich morphology of Finnish, which can however be handled to some degree by using macros in the lexicon
• Initial evaluation of translation performance

References

MedSLT: http://sourceforge.net/projects/medslt/ and http://www.issco.unige.ch/projects/medslt

Regulus: https://sourceforge.net/projects/regulus/ and http://www.issco.unige.ch/projects/regulus
