Translation Quality Assessment: Five Easy Steps Using Multidimensional Quality Metrics to improve quality assessment and management Prepared by the QTLaunchPad project ([email protected]) version 1.0 (26.April 2013)

Overview of Multidimensional Quality Metrics (QTLaunchPad)

Jun 20, 2015


Arle Lommel
Transcript
Page 1: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Translation Quality Assessment:

Five Easy Steps

Using Multidimensional Quality Metrics to improve quality assessment and management

Prepared by the QTLaunchPad project ([email protected])

Version 1.0 (26 April 2013)

Page 2: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Who does this apply to?

Requesters of translation services looking for relevant quality metrics

Language Service Providers (LSPs) delivering translation services to their clients

The following materials will apply to negotiation between requesters and providers

This description does not apply to individual translators (although they may want to be aware of the contents)

Page 3: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Step 1: Specifications

Page 4: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Basic questions about your project

E.g.,

What languages are you working in?

What is your subject field?

What sort of project is it (e.g., user interface, documentation, advertising)?

What technology are you using (MT, CAT, etc.)?

What register and style are you using?

Page 5: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Step 2: Select Metrics

Page 6: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Based on your specifications…

The MQM recommendation tool will:

• suggest a pre-defined metric used for similar projects, or
• recommend a custom metric that applies to your project

You are free to modify the metric as needed

Create a metrics specification file that:

• defines the issues to be examined
• provides weights (descriptions of how important the issues are)

The metrics specification file can be used by an MQM-compliant tool
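
As a rough illustration of what a metrics specification holds, the sketch below models one in Python. It is not the MQM file format: the class name `IssueSpec`, the field names, and the `WEIGHT_VALUES` mapping are assumptions made for this example. The issue types and weight labels are taken from the worked example later in this deck, where High corresponds to 1.0 and Low to 0.2; the value 0.5 for Medium is a guess.

```python
from dataclasses import dataclass

# Numeric weights: 1.0 (high) and 0.2 (low) match the worked example in this deck.
# The 0.5 value for "medium" is an assumption; the deck never shows it.
WEIGHT_VALUES = {"high": 1.0, "medium": 0.5, "low": 0.2}


@dataclass(frozen=True)
class IssueSpec:
    """One issue type in a (hypothetical) metrics specification."""
    name: str
    branch: str       # top-level group, e.g. "Fluency" or "Accuracy"
    weight: str       # "high", "medium", or "low"
    note: str = ""


# The metric recommended in the worked example.
METRIC = [
    IssueSpec("Orthography", "Fluency", "high"),
    IssueSpec("Grammar", "Fluency", "high"),
    IssueSpec("Mistranslation", "Accuracy", "high"),
    IssueSpec("Omission", "Accuracy", "low", "Captions lose some information"),
    IssueSpec("Untranslated", "Accuracy", "high"),
    IssueSpec("Legal requirements", "Accuracy", "high"),
]


def numeric_weights(metric):
    """Map each issue type name to its numeric weight."""
    return {spec.name: WEIGHT_VALUES[spec.weight] for spec in metric}


if __name__ == "__main__":
    for name, weight in numeric_weights(METRIC).items():
        print(f"{name}: {weight}")
```

Keeping the weight as a label and converting it to a number in one place makes it easy to renegotiate the numeric scale without touching the list of issues.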

Page 7: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Step 3: Evaluation Method

Page 8: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Three options:

1. Sampling: Examine a portion of the text to determine whether to pass or fail the entire text. Sampling can utilize quality estimation for better results

2. Full error analysis: Review the entire text (needed for critical legal or safety texts)

3. Rubric: Rate the text on a numerical scale (suitable for quick assessment of suitability)
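
Option 1 (sampling) can be sketched as follows. This is a minimal illustration, not an MQM tool: the function names, the 10% sampling fraction, and the 95% pass threshold are all assumptions chosen for the example, and the sample is drawn at random rather than guided by quality estimation.

```python
import random


def draw_sample(segments, fraction, seed=0):
    """Return a reproducible random subset of segments, kept in original order."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not segments:
        return []
    size = max(1, round(len(segments) * fraction))
    rng = random.Random(seed)  # fixed seed so both parties can re-draw the same sample
    chosen = sorted(rng.sample(range(len(segments)), size))
    return [segments[i] for i in chosen]


def passes(sample_score, threshold=0.95):
    """Pass or fail the entire text based on the score of the sample (assumed threshold)."""
    return sample_score >= threshold


if __name__ == "__main__":
    text = [f"segment {n}" for n in range(1, 101)]
    sample = draw_sample(text, fraction=0.10)
    print(len(sample), "segments selected for review")
    print("pass" if passes(0.96) else "fail")
```

A fixed seed is used so that requester and provider can verify that the same portion of the text was examined.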

Page 9: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Automated Metrics

If sampling is used, MQM’s quality estimation tools will help focus sampling on those parts of the text that need attention

Automatic metrics can be used in some cases where human evaluation is too expensive or time-consuming

Page 10: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Step 4: Evaluation

Page 11: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Evaluation…

Can be conducted by the requester or LSP in accordance with the agreement between the parties

Follows the method chosen in Step 3 (evaluation method)

Issues must match the metric chosen in Step 2: issues not found in the metric should not be considered errors
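
The rule that issues outside the metric are not errors can be enforced mechanically. The sketch below is illustrative only; the annotation layout (a dict with a `"type"` key) and the function name are assumptions, and the issue names come from the worked example later in this deck.

```python
# Issue types included in the chosen metric (from the worked example).
METRIC_ISSUES = {
    "Orthography", "Grammar", "Mistranslation",
    "Omission", "Untranslated", "Legal requirements",
}


def split_by_metric(annotations, metric_issues):
    """Separate annotations that count as errors from those outside the metric."""
    counted, ignored = [], []
    for annotation in annotations:
        target = counted if annotation["type"] in metric_issues else ignored
        target.append(annotation)
    return counted, ignored


if __name__ == "__main__":
    marked = [
        {"type": "Grammar", "segment": 4},
        {"type": "Style", "segment": 7},      # not in the metric, so not an error
        {"type": "Omission", "segment": 9},
    ]
    counted, ignored = split_by_metric(marked, METRIC_ISSUES)
    print("counted:", [a["type"] for a in counted])
    print("ignored:", [a["type"] for a in ignored])
```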

Page 12: Overview of Multidimensional Quality Metrics (QTLaunchPad)

MQM provides capabilities

For human evaluation

• Inline markup provides an audit trail:
  • Allows independent verification of errors
  • Helps ensure that issues are corrected

• Full reporting functions:
  • See what types of errors are reported
  • Understand where errors come from

For automatic evaluation

• Integrated use of existing quality metrics to help provide evaluation

Page 13: Overview of Multidimensional Quality Metrics (QTLaunchPad)

translate5

These capabilities are being integrated into an open-source editing tool, translate5 (http://www.translate5.net)

All results are free to implement in additional tools (both open source and proprietary)

Parties interested in development should contact [email protected]

Page 14: Overview of Multidimensional Quality Metrics (QTLaunchPad)

The source matters

Full MQM evaluation includes the source

Source quality evaluation can help identify reasons for problems and resolve them

Translators can be rewarded for addressing source deficiencies (scores over 100% are possible!)

Page 15: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Step 5: Scoring

Page 16: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Scoring Formula

(Q = the set of issues being counted within the larger formula)

Provides consistency with LISA QA Model scoring method

Can be customized to support other legacy systems

Can be applied to individual parts of the overall formula, so that subscores (e.g., fluency, accuracy, grammar) can be derived

Weights (not shown) can be used to adjust importance of various issue types

Page 17: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Scores help guide decisions

Scores are given on a 100% basis

Scores can be broken down into more fine-grained reports. E.g., a score of 96% could have 100% accuracy but 92% fluency. This helps target actions for quality control.

Page 18: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Example

Page 19: Overview of Multidimensional Quality Metrics (QTLaunchPad)

1. Specifications

Parameter               Value
Language/Locale         Source: English; Target: Japanese
Subject field/domain    Medical
Text type               Narrative
Audience                Educated readers with an interest in medicine
Purpose                 Education about a new procedure for managing diabetes
Register                Moderately formal
Style                   No specified style; match source if possible
Content correspondence  Literal translation
Output modality         Subtitles (speech to text)
File format             Time-coded XML for dotSub
Production technology   Human translation

Page 20: Overview of Multidimensional Quality Metrics (QTLaunchPad)

2. Recommended Metric

Issue type            Weight (high, medium, low)  Notes
Fluency
  Orthography         High
  Grammar             High
Accuracy
  Mistranslation      High
  Omission            Low                         Due to nature as captions, some information loss is expected. Captions should be 60% of spoken dialogue
  Untranslated        High
  Legal requirements  High                        Must make sure that legal claims are admissible under Japanese law

Page 21: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Chosen from…

Issue types are a subset of the full catalog of types

Page 22: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Chosen from…

Page 23: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Quality Formula (1)

TQ = (Atr + At - As) + (Ft – Fs)

with respect to specifications

TQ = translation quality
Atr = accuracy (transfer)
At = accuracy for the target text
As = accuracy for the source text
Ft = fluency score for target text
Fs = fluency score for source text

Page 24: Overview of Multidimensional Quality Metrics (QTLaunchPad)

Quality Formula (2)

TQ = (Atr + At - As) + (Ft – Fs)

with respect to specifications

Definition: A quality translation demonstrates required accuracy and fluency for the audience and purpose and complies with all other negotiated specifications, taking into account end-user needs.

The portion highlighted in gold on the slide ("with respect to specifications") = dimensions (specifications)

Page 25: Overview of Multidimensional Quality Metrics (QTLaunchPad)

3. Evaluation method

In this example, portions of the text are marketing: sampling is an acceptable evaluation method for these parts

Other portions contain legal and regulatory claims: full error analysis is required for those portions

Inline markup can be used via MQM namespace (because text is in XML) to ensure corrections are made.
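
To make the idea of inline markup concrete, the sketch below parses a caption that carries an issue annotation in a separate XML namespace. Everything about the markup is hypothetical: the namespace URI `urn:example:mqm`, the `issue` element, and the `type` and `severity` attributes are placeholders invented for this example, not the actual MQM namespace. Only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element/attribute names, for illustration only.
NS = {"mqm": "urn:example:mqm"}

CAPTION = """\
<caption xmlns:mqm="urn:example:mqm" begin="00:00:01.000">
  The patient uses <mqm:issue type="Grammar" severity="minor">a insulin pump</mqm:issue> daily.
</caption>"""


def list_issues(xml_text):
    """Return (type, severity, marked text) for each inline issue annotation."""
    root = ET.fromstring(xml_text)
    return [
        (el.get("type"), el.get("severity"), el.text)
        for el in root.iterfind(".//mqm:issue", NS)
    ]


if __name__ == "__main__":
    for issue_type, severity, text in list_issues(CAPTION):
        print(f"{severity} {issue_type}: {text!r}")
```

Because the annotations live in their own namespace, a reviewer can list every marked span and confirm that each one was corrected, without disturbing the caption's own elements.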

Page 26: Overview of Multidimensional Quality Metrics (QTLaunchPad)

4. Evaluation

• Evaluation includes subsegment markup with issues in metric

• Issues stored in MQM namespace to allow audit and revision

• Users can select three severity levels:
  • critical: the issue renders the text unusable
  • major: the issue leaves the text usable, but is an obstacle to understanding
  • minor: the issue does not impact usability of the text

screenshot: translate5.net showing MQM markup tool
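
Before scoring, the marked issues have to be counted per issue type and severity. The sketch below shows one way to do that; the `Severity` enum and the `tally` function are names invented for this example, and annotations are assumed to arrive as simple (issue type, severity) pairs.

```python
from collections import Counter
from enum import Enum


class Severity(Enum):
    """The three severity levels an evaluator can select."""
    MINOR = "minor"        # does not impact usability of the text
    MAJOR = "major"        # text usable, but an obstacle to understanding
    CRITICAL = "critical"  # renders the text unusable


def tally(annotations):
    """Count annotations per (issue type, severity) pair."""
    counts = Counter()
    for issue_type, severity in annotations:
        # Severity(...) raises ValueError for labels outside the three levels.
        counts[(issue_type, Severity(severity))] += 1
    return counts


if __name__ == "__main__":
    marked = [
        ("Grammar", "minor"),
        ("Grammar", "minor"),
        ("Omission", "critical"),
    ]
    for (issue_type, severity), count in tally(marked).items():
        print(f"{issue_type} / {severity.value}: {count}")
```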

Page 27: Overview of Multidimensional Quality Metrics (QTLaunchPad)

5. Scoring

Issue type            Weight  Minor  Major  Critical  Penalty  Adjusted  Total
Fluency
  Orthography         1.0     8      2      1         28       28        97.2%
  Grammar             1.0     6      2      0         16       16        98.4%
  Subtotal                                                     44        95.6%
Accuracy
  Mistranslation      1.0     4      0      0         4        4         99.6%
  Omission            0.2     12     4      1         42       8.4       99.2%
  Untranslated        1.0     1      0      0         1        1         99.9%
  Legal requirements  1.0     0      0      1         10       10        99.0%
  Subtotal                                                     23.4      97.7%
Total                                                          67.4      93.3%

Assumes 1000-word sample

Because Omission is considered a low priority in this case, it is given a low weight
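
The figures in this table can be reproduced with a short script. The severity multipliers (minor = 1, major = 5, critical = 10) are not stated on the slide; they are inferred from the Penalty column (for Orthography, 8×1 + 2×5 + 1×10 = 28). The score is likewise assumed to be 100% minus the adjusted penalty per word of the 1000-word sample, which matches every row shown. Function and variable names are chosen for this sketch.

```python
# Severity multipliers inferred from the Penalty column, not stated in the deck.
SEVERITY_PENALTY = {"minor": 1, "major": 5, "critical": 10}
SAMPLE_WORDS = 1000

# (branch, issue type, weight, minor, major, critical), copied from the table.
ROWS = [
    ("Fluency", "Orthography", 1.0, 8, 2, 1),
    ("Fluency", "Grammar", 1.0, 6, 2, 0),
    ("Accuracy", "Mistranslation", 1.0, 4, 0, 0),
    ("Accuracy", "Omission", 0.2, 12, 4, 1),
    ("Accuracy", "Untranslated", 1.0, 1, 0, 0),
    ("Accuracy", "Legal requirements", 1.0, 0, 0, 1),
]


def adjusted_penalty(weight, minor, major, critical):
    """Weighted penalty for one issue type."""
    raw = (minor * SEVERITY_PENALTY["minor"]
           + major * SEVERITY_PENALTY["major"]
           + critical * SEVERITY_PENALTY["critical"])
    return weight * raw


def score(adjusted_total, words=SAMPLE_WORDS):
    """Score in percent: 100% minus adjusted penalty points per word."""
    return 100.0 * (1.0 - adjusted_total / words)


def branch_total(branch):
    """Sum of adjusted penalties for all issue types in one branch."""
    return sum(adjusted_penalty(w, mi, ma, cr)
               for b, _, w, mi, ma, cr in ROWS if b == branch)


fluency = branch_total("Fluency")
accuracy = branch_total("Accuracy")
total = fluency + accuracy

if __name__ == "__main__":
    for label, value in [("Fluency", fluency), ("Accuracy", accuracy), ("Total", total)]:
        print(f"{label}: adjusted penalty {value:.1f}, score {score(value):.1f}%")
```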

Page 28: Overview of Multidimensional Quality Metrics (QTLaunchPad)

5. Scoring

Without weighting of Omission, the score would be 89.9%
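
The effect of the Omission weight can be checked directly from the Penalty column of the scoring table. The sketch below assumes, as the table's figures suggest, that the score is 100% minus the weighted penalty per word of the 1000-word sample; the names are chosen for this example.

```python
# Unweighted penalties and weights, copied from the scoring table.
PENALTIES = {
    "Orthography": 28, "Grammar": 16, "Mistranslation": 4,
    "Omission": 42, "Untranslated": 1, "Legal requirements": 10,
}
WEIGHTS = {
    "Orthography": 1.0, "Grammar": 1.0, "Mistranslation": 1.0,
    "Omission": 0.2, "Untranslated": 1.0, "Legal requirements": 1.0,
}


def overall_score(penalties, weights, words=1000):
    """Score in percent for the whole sample, given per-issue weights."""
    weighted_total = sum(weights[name] * points for name, points in penalties.items())
    return 100.0 * (1.0 - weighted_total / words)


weighted = overall_score(PENALTIES, WEIGHTS)
unweighted = overall_score(PENALTIES, {name: 1.0 for name in PENALTIES})

if __name__ == "__main__":
    print(f"weighted:   {weighted:.1f}%")
    print(f"unweighted: {unweighted:.1f}%")
```

With every weight set to 1.0 the penalties sum to 101 points, which is where the 89.9% figure comes from.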

We can see that the translator has more problems with fluency than with accuracy

Page 29: Overview of Multidimensional Quality Metrics (QTLaunchPad)

5. Full scoring (including source)

Issue type            Source   Target   Adjusted
Fluency
  Orthography         96.1%    97.2%    101.1%
  Grammar             99.0%    98.4%    99.4%
  Subtotal            95.1%    95.6%    ☞ 100.5%
Accuracy
  Mistranslation      (100%)   99.6%    99.6%
  Omission            (100%)   99.2%    99.2%
  Untranslated        (100%)   99.9%    99.9%
  Legal requirements  (100%)   99.0%    99.0%
  Subtotal            100%     97.7%    97.7%
Total                 95.1%    93.3%    98.2%

Assumes 1000-word sample. Source accuracy set to 100% for computational purposes.

Page 30: Overview of Multidimensional Quality Metrics (QTLaunchPad)

5. Scoring (including source)

In many cases, some problems in a translation are not caused by the translator.

In this case, the translator fixed problems in the source, resulting in better quality for fluency in the target. The translator should be recognized for this work.
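
The source adjustment can be expressed as a one-line rule. The deck does not spell the rule out; the sketch below assumes that the adjusted score is the target score plus the source's shortfall from 100%, which reproduces the Orthography row and the Fluency subtotal of the full scoring table. The function name is chosen for this example.

```python
def source_adjusted(target_pct, source_pct):
    """Credit the translation for defects already present in the source.

    Assumed rule: adjusted = target + (100 - source), all in percent.
    """
    return target_pct + (100.0 - source_pct)


if __name__ == "__main__":
    # Orthography: source 96.1%, target 97.2%
    print(f"Orthography: {source_adjusted(97.2, 96.1):.1f}%")
    # Fluency subtotal: source 95.1%, target 95.6%
    print(f"Fluency subtotal: {source_adjusted(95.6, 95.1):.1f}%")
    # Accuracy rows use a source score of 100%, so they are unchanged.
    print(f"Mistranslation: {source_adjusted(99.6, 100.0):.1f}%")
```

Under this rule a translation scores above 100% whenever the target has fewer problems than the source, which is how the translator's fixes to the source become visible in the report.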