How STE and Analytical Tools Enabled Intuit MT Program Welocalize TAUS 2013


DESCRIPTION

How STE and Analytical Tools Enabled Intuit MT Program, presented by Render Chiu, Intuit Group Manager, Global Content & Localization, and Alex Yanishevsky, Welocalize Senior Product Architect. A look at how machine translation and automation helped Intuit translate products into 11 languages in three months. Presented at TAUS, October 14-15, 2013 in Portland, Oregon, covering globalization, product localization, translation, language services, software, and machine translation.

Transcript

How STE and Analytical Tools Enabled

Intuit MT Program

Render Chiu, Intuit Group Manager, Global Content & Localization

Alex Yanishevsky, Welocalize Senior Product Architect

TAUS, October 14-15, 2013

• $4.15 billion revenue in 2012
• Flagship products: QuickBooks, TurboTax and Quicken
• New: Mint.com, Intuit Money Manager
• Markets: North America, Europe, Singapore, Australia, India

The World is a big place…


• 29 million SMBs
• 500,000 accountants
• 50 million employees

• 600 million SMBs
• 2.4 million accountants
• 1 billion employees

…and we’ve barely opened the door…


…but we have a clear vision…


…to be the World’s SMB Operating System!

…Then the Business Tells Us to Have 10 Languages in 3 Months

Challenges (or Reality Check)

How do you go global ASAP when you start from ground zero?

Requirement              Status
Bilingual translations   None, except for FR-CA
In-house MT expertise    None
MT engine/technology     None
TMS + MT connector       None
Structured content       One major plus we had going for us: STE

What Were Our Options?

Between the extreme options, we chose collaboration:

• Lower cost by spreading the risk
• Speed with immediate expertise
• Scalability via a deep supply chain

Four Questions on STE

• Q1. When is English not really English?
• Q2. Does our choice of words hinder translation?
• Q3. Where does simplified English fit in?
• Q4. Does size really matter?

Q1. When is English not really English?

A. When the same words mean something different to different people:
• Clear (v)
• Disable (v)
• Table (v)

B. When we put words together in unexpected ways:
• One-trick pony
• Shoebox accounting

C. When we use slang or make up words:
• Huh, Oops, Whoopsie, Psst, Hmmm

Q2. Does our choice of words hinder translation?

A. Yes, when we use a word to mean more than one thing and don’t provide context.
• “First” - does it mean first name (Prénom)? Or first day of the month (Premier)? First line (Première)? First, run the export (D’abord)?
• “Tax” - is this sales tax? Income tax? Or payroll tax?

B. Yes, when we use two words with very similar meanings.
• “Refund” and “rebate” - in French, they’re the same word.
  - English: Enter a refund or rebate you receive
  - French: Entrer un remboursement ou un remboursement que vous recevez.

C. Yes, when we don’t provide an English glossary of product-specific terms.
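The ambiguity checks above lend themselves to simple automation. Below is a minimal sketch (not Intuit’s actual tooling) that scans source strings for words the slides call out as ambiguous, so a writer can add context or a glossary entry before the text reaches MT. The word list and hints are illustrative assumptions.

```python
# Hypothetical checker: flag ambiguous source words before translation.
# The AMBIGUOUS table is an illustrative assumption, not a real glossary.
AMBIGUOUS = {
    "first": "first name (Prénom)? first of the month (Premier)? do first (D'abord)?",
    "tax": "sales tax? income tax? payroll tax?",
    "clear": "as a verb this can mean delete, approve, or empty",
}

def flag_ambiguous(sentence, ambiguous=AMBIGUOUS):
    """Return (word, reviewer_hint) pairs for ambiguous words in `sentence`."""
    words = [w.strip(".,!?\"'").lower() for w in sentence.split()]
    return [(w, ambiguous[w]) for w in words if w in ambiguous]

hits = flag_ambiguous("First, enter the tax amount.")
# each hit pairs the flagged word with a hint for the writer
```

In practice such a list would be built from translator feedback, with the hints pointing at the approved glossary entry.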

Q3. Where does simplified English fit in?

There are two basic types of controlled language:

1. Simplified or technical, used to improve documentation.

2. Logic based, used for software specifications, queries, proving theorems.

Simplified English belongs to the first category.


Q4. Does size really matter?

YES!

• Size of words. Use shorter words when possible:
  - Help… not facilitate
  - Tell… not communicate or notify
  - Show… not indicate
• Size of sentences. Limit to 25 words (descriptions), 20 words (tasks).
• Size of paragraphs. Limit to 6 sentences.
• Size of titles, labels, and buttons. Content expands in other languages (buttons, field labels, etc.): FR +30%, DE +50%.
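The sizing rules above are mechanical enough to lint automatically. This rough sketch encodes the slide’s limits and word swaps; the function name and structure are ours, not Intuit’s.

```python
# Sketch of an STE size/word-choice linter, using the limits from the slide.
WORD_SWAPS = {"facilitate": "help", "communicate": "tell",
              "notify": "tell", "indicate": "show"}
MAX_WORDS = {"description": 25, "task": 20}   # per-sentence word limits

def ste_check(sentence, kind="description"):
    """Return a list of STE warnings for one sentence."""
    words = sentence.split()
    issues = []
    if len(words) > MAX_WORDS[kind]:
        issues.append(f"too long: {len(words)} words (limit {MAX_WORDS[kind]})")
    for w in words:
        bare = w.strip(".,!?").lower()
        if bare in WORD_SWAPS:
            issues.append(f"prefer '{WORD_SWAPS[bare]}' over '{bare}'")
    return issues
```

A real pipeline would also check paragraph length (6 sentences) and flag UI strings likely to overflow after +30-50% expansion.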

Comprehensive MT Approach

Welocalize has a multi-tiered approach to machine translation (MT) implementation:

1) Evaluate content for MT readiness
   • Source content audit
   • Pre-translation editing
   • Style and glossary verification

2) Assist in selection and integration of one or multiple MT engines into the localization technology ecosystem

3) Perform MT post-editing services
   • Evaluation of MT output quality via workbench
   • Human assessment and automated scoring
   • Engine training feedback / engine improvement

4) Support transition from a SaaS/hosted “black box” model to a hosted glass-box or in-house model

Predictable, Controllable, Progressive Quality

Does Simplified English Work?

Proof based on:
• “Stinker” scores based on POS
• Perplexity scores based on LM
• Tag density

Candidate Scorer

How good is the candidate text?
• Take historically “bad” text
• Run a POS tagger on the “bad” text and the candidate text
• Use an exclude list to reduce false positives
• A lower score is better, i.e. the candidate text does NOT match the “bad” text

RESULT: On a sample of 10K sentences, Intuit’s “stinker” score was 1.5-2 times LOWER than 3 other companies!
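The actual scorer is not public, but the idea above can be reconstructed as a toy: tag both corpora, collect POS trigrams from the “bad” text, and score the candidate by how many of its trigrams reappear. The tiny tag lexicon and trigram window here are our assumptions purely for illustration.

```python
# Toy "stinker" scorer: share of candidate POS trigrams seen in bad text.
# Lower is better. TOY_TAGS is an assumed miniature lexicon, not a real tagger.
TOY_TAGS = {"the": "DET", "a": "DET", "is": "VERB", "are": "VERB",
            "very": "ADV", "quickly": "ADV"}

def pos_tags(text):
    """Crude lookup tagger: dictionary hit, or NOUN by default."""
    return [TOY_TAGS.get(w.lower(), "NOUN") for w in text.split()]

def stinker_score(candidate, bad_texts, exclude=frozenset()):
    """Fraction of candidate POS trigrams that occur in the bad corpus,
    skipping trigrams on the exclude list (the false-positive filter)."""
    bad = set()
    for t in bad_texts:
        tags = pos_tags(t)
        bad.update(zip(tags, tags[1:], tags[2:]))
    tags = pos_tags(candidate)
    tris = [t for t in zip(tags, tags[1:], tags[2:]) if t not in exclude]
    if not tris:
        return 0.0
    return sum(t in bad for t in tris) / len(tris)
```

A production version would use a real POS tagger and a bad-text corpus mined from segments that historically produced poor MT output.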

Capturing Perplexity for LMs

• Build a language model (LM) of “bad” text
• Is the candidate text closer to the “bad” LM or not?
• A higher score is better

RESULT: On a sample of Intuit’s content (10K sentences), the perplexity score was 2 times HIGHER, i.e. further away from the “bad” text, than 3 other companies!
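As a minimal sketch of this check (assuming a bigram model with add-one smoothing; the slide does not say what LM was actually used): train on the “bad” text, then measure candidate perplexity under it. Higher perplexity means the candidate looks less like the bad text, which is the desired outcome.

```python
import math
from collections import Counter

def train_bigram(texts):
    """Count bigrams/unigrams over `texts` for an add-one-smoothed bigram LM."""
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for t in texts:
        toks = ["<s>"] + t.lower().split()
        vocab.update(toks)
        unigrams.update(toks[:-1])          # context counts
        bigrams.update(zip(toks, toks[1:]))
    return bigrams, unigrams, len(vocab)

def perplexity(text, model):
    """Perplexity of `text` under the bad-text LM; higher = less like bad text."""
    bigrams, unigrams, v = model
    toks = ["<s>"] + text.lower().split()
    logp = 0.0
    for a, b in zip(toks, toks[1:]):
        p = (bigrams[(a, b)] + 1) / (unigrams[a] + v)  # add-one smoothing
        logp += math.log(p)
    return math.exp(-logp / (len(toks) - 1))
```

A real evaluation would use a larger n-gram (or neural) LM and a held-out sample, but the comparison logic is the same.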

Tag Density

More tags result in increased post-editing effort and reduced efficiency

RESULT: On sample of 10K sentences, Intuit’s tag density was 3-4 times LESS than 3 other companies!

The fewer tags the better
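Tag density is straightforward to measure. This sketch counts inline markup per word; the regex covers HTML-style tags and `{0}`-style placeholders as an assumption about what the segments contain.

```python
import re

# Tags per word in a segment; lower is better for MT and post-editing.
TAG = re.compile(r"<[^>]+>|\{\d+\}")   # assumed tag/placeholder formats

def tag_density(segment):
    """Ratio of inline tags to translatable words in one segment."""
    tags = TAG.findall(segment)
    words = TAG.sub(" ", segment).split()
    return len(tags) / max(len(words), 1)
```

Averaging this over a 10K-sentence sample gives the corpus-level figure the slide compares across companies.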

How good is the MS Hub engine?

• Trained the engine on 4,500 TUs and 3,000 glossary terms
• Automatic scoring
• Human evaluations for adequacy and fluency

GOING LIVE!

Results – Auto scoring 1

[Bar chart: BLEU scores (scale 0-80) comparing Bing vs. MS Hub for PT-BR, ES, DA, and NL]

Results – Auto scoring 2

[Bar chart: BLEU scores (scale 0-80) comparing Bing vs. MS Hub for ID, ZH-CN, IT, and DE]
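The bars above report BLEU. As a rough illustration of the metric (production scoring would use a standard tool such as sacreBLEU, not this sketch), here is a minimal pure-Python sentence-level BLEU with clipped n-gram precision, a small epsilon for zero matches, and the brevity penalty:

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against a single reference (simplified sketch)."""
    c, r = candidate.split(), reference.split()
    if not c:
        return 0.0
    logs = []
    for n in range(1, max_n + 1):
        cand = Counter(zip(*[c[i:] for i in range(n)]))
        ref = Counter(zip(*[r[i:] for i in range(n)]))
        # clipped matches: a candidate n-gram counts at most as often as in ref
        match = sum(min(v, ref[g]) for g, v in cand.items())
        total = max(sum(cand.values()), 1)
        logs.append(math.log(max(match, 1e-9) / total))
    # brevity penalty punishes candidates shorter than the reference
    bp = 1.0 if len(c) > len(r) else math.exp(1 - len(r) / max(len(c), 1))
    return bp * math.exp(sum(logs) / max_n)
```

Corpus-level BLEU (as plotted in the charts) aggregates n-gram counts over all segments before taking precisions, rather than averaging per-sentence scores.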

Results – Human evals 1

[Bar chart: Adequacy and Fluency ratings (scale 0-5) comparing Bing vs. MS Hub for ID, ZH-CN, IT, and DE]

Results – Human evals 2

[Bar chart: Adequacy and Fluency ratings (scale 0-5) comparing Bing vs. MS Hub for PT-BR, ES, DA, and NL]

Lessons Learned

• Good wine comes from great grapes

• You can hire a professional tennis player to play for you

• You need a great team and a great partner

contact us
welocalize
www.welocalize.com
241 East 4th St., Suite 207
Frederick, Maryland 21701 USA
[t] +1.301.668.0330
[t] +1.800.370.9515 Toll Free
[f] +1.301.668.0335
[e] marketing@welocalize.com
