Changes in Moses Hieu Hoang TAUS October 2014
Jun 19, 2015
Changes in Moses
Hieu Hoang
TAUS
October 2014
Thanks for inviting me to come.
I'm here to tell you a little about the things I've been doing to Moses over the past two years. I'll mainly concentrate on the past year, but will quickly tell you about the things I did prior to that.
MosesCore Year 1 (2012)
• Easier installation
  – Binary releases
  – Pre-built models
• Testing and Releases
  – Linux, Mac OSX, Windows
  – 32 and 64-bit
• Faster training
  – Parallelism at all stages
In the first year I picked off the low-hanging fruit and fixed many of the easy issues that just required time and effort.
I made installation easier. We run a lot of experiments anyway, so we gave some of the resulting models away, together with all the scripts and configuration used to run them, so students can see how to replicate our results.
There was lots of testing on all major platforms.
I made the obvious speed improvements, parallelising as much of the training as possible.
MosesCore Year 2 (2013)
• Even Easier installation
  – Binary releases
  – Pre-built models
  – Virtual Machines
  – Amazon EC2
• Refactored Decoder
In year 2 I made it even easier to install. If you can't be bothered to compile, or even to download the binaries, you can download a virtual machine with Moses and friends installed, or rent an Amazon server with Moses and friends installed.
However, the main reason I came here today is to talk about the major changes I made in the decoder and elsewhere. They make it easier for us coders to add and change things in Moses.
Why did you Refactor?
• Feature Function Framework
  – easier to implement new features
  – use sparse features
• Simplify class structure
  – easier to develop with Moses
• Delete functionality
  – easier to refactor code
  – very little deletion
What is a feature function? It is something that gives a translation a score.
Over the last few years, people have gotten bored with existing features like language models and reordering models. The trend in MT is to create novel features which give a score to a translation. Like any feature, a new one tries to give bigger scores to better translations.
The new feature function framework is designed to make it easy to add new features.
This is not totally new to Moses. It has always been possible to add new LM implementations and new phrase-table implementations. Now this is generalized to multiple implementations of arbitrary features that give a score to a translation. We have always been able to add new features; I just made it easier.
Another trend is that a feature function shouldn't just have a fixed, limited number of scores. It can have an unknown number of scores that flicker on when a particularly good, or bad, translation is used. These are usually called sparse features.
The aim of the feature function framework is to give sparse features equal prominence to dense features, rather than have them as adjuncts that are easy to forget. All feature functions can have sparse features, and you don't need to turn them on. A feature function can have dense AND sparse features; they are not mutually exclusive.
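The relationship between dense and sparse scores described in these notes can be illustrated with a small sketch. This is not Moses code: the function name `model_score`, the dictionary layout, and the sparse feature name `pair_the_le` are all hypothetical, and only the `KENLM0` weight of 0.142 is taken from the slides. It assumes the usual linear model, where the total score is the weighted sum of every score a feature function produces.

```python
def model_score(dense_scores, sparse_scores, dense_weights, sparse_weights):
    """Weighted sum of dense scores plus whichever sparse features fired.

    Hypothetical layout: dense scores are fixed-length lists keyed by
    feature function name; sparse scores are a dict holding only the
    features that fired for this translation.
    """
    total = 0.0
    for name, values in dense_scores.items():
        weights = dense_weights[name]
        if len(values) != len(weights):
            raise ValueError(f"{name}: expected {len(weights)} scores")
        total += sum(w * v for w, v in zip(weights, values))
    for name, value in sparse_scores.items():
        # A sparse feature without a tuned weight contributes nothing.
        total += sparse_weights.get(name, 0.0) * value
    return total


# One dense score from a language model, one sparse feature that fired.
dense_weights = {"KENLM0": [0.142]}
sparse_weights = {"pair_the_le": 0.5}  # hypothetical sparse feature

score = model_score(
    dense_scores={"KENLM0": [-10.0]},
    sparse_scores={"pair_the_le": 1.0, "pair_unseen": 1.0},
    dense_weights=dense_weights,
    sparse_weights=sparse_weights,
)
```

The point of the sketch is that both kinds of score go through the same weighted sum, so a feature function can emit either kind, or both, without special handling.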
The second reason was to simplify the class structure, to make it easier for us to develop with Moses. Moses has been around for 8 years now. Everyone has the freedom to add what they want, and no-one is in overall control. This way of organising an open-source project is great: we have gotten lots of contributions and lots of features. The downside is that it has grown organically, and things are not as well structured as they could be. Now that I have the time, and with the benefit of hindsight, I can go back and put some structure on what we've done.
Why did I delete things? I deleted very little. I'm not the gatekeeper of Moses and I don't control it. If a piece of functionality was deleted, it's not a comment on its usefulness. It's purely because it got in the way of the refactoring.
I'll quickly go through the last two before telling you about feature functions.
Specify a Feature Function
• New Feature Function
  – New sections
    ● [feature-function-file]
    ● [weight-?]
• Custom code
  – Parse ini file
  – Initialize feature function
Then, the ini file:
  [lmodel-file]
  8 0 3 europarl.en.srilm.gz
  [weight-l]
  0.142
This was completely bespoke. There was no framework to help you, and if you didn't do it right, it wouldn't work.
Adding new Feature Function
• New Feature Function
  – No new section
    ● Line in [feature] section
    ● Line in [weight] section
  – Framework
    ● parse ini file
    ● initialize feature
Now, the ini file:
  [feature]
  KENLM file=path order=0
  [weight]
  KENLM0= 0.142
Write a class that implements the feature function.
The framework does the rest. There is no need to create a custom section in the ini file, or to change the StaticData class or the Parameter class.
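To make "the framework does the rest" concrete, here is a minimal sketch of the idea in Python rather than the Moses C++ code. Everything in it is illustrative: `REGISTRY`, `register`, `FeatureFunction`, `ToyLanguageModel`, `parse_ini` and `load_features` are hypothetical names, not the Moses API. It also assumes that a weight line refers to a feature by its type name followed by a running index, which is inferred from the `KENLM0` line on the slide.

```python
REGISTRY = {}  # hypothetical: maps a [feature] type name to a class


def register(kind):
    """Class decorator that adds a feature function class to the registry."""
    def wrap(cls):
        REGISTRY[kind] = cls
        return cls
    return wrap


class FeatureFunction:
    def __init__(self, name, args):
        self.name = name      # e.g. "KENLM0"
        self.args = args      # key=value pairs from the [feature] line
        self.weights = []     # filled in from the [weight] section


@register("KENLM")
class ToyLanguageModel(FeatureFunction):
    """Stand-in for a real language model; only the wiring matters here."""


def parse_ini(text):
    """Group non-empty lines under the most recent [section] header."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


def load_features(text):
    """Instantiate every [feature] line and attach its [weight] line."""
    sections = parse_ini(text)
    features, counts = {}, {}
    for line in sections.get("feature", []):
        kind, *pairs = line.split()
        args = dict(pair.split("=", 1) for pair in pairs)
        index = counts.get(kind, 0)
        counts[kind] = index + 1
        name = f"{kind}{index}"  # assumed naming scheme: type + index
        features[name] = REGISTRY[kind](name, args)
    for line in sections.get("weight", []):
        name, values = line.split("=", 1)
        features[name.strip()].weights = [float(v) for v in values.split()]
    return features


INI = """
[feature]
KENLM file=path order=0

[weight]
KENLM0= 0.142
"""

features = load_features(INI)
```

With this arrangement, adding a feature only requires writing a class and registering it. The parsing and initialization code does not change, which is the property the slide describes.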
MosesCore Year 3 (2014)
• Exploit new framework
  – Updatable phrase-table
    ● Dynamic suffix array
    ● Stores training data
      – Extract translation rule on-the-fly
  – Neural network language model
    ● Continuous space LM
  – Bilingual language models
    ● Replicate Devlin et al, 2014
    ● Large quality gains
  – Transliteration
    ● Character level translation
    ● Learns from parallel data
    ● Integrate into decoder
• Translation rule properties
  – Extra information for each rule
    ● Context, syntax, domain etc
• Syntax decoding
  – Faster, memory efficient decoding
  – More syntactic models
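As a rough illustration of the suffix array idea behind the updatable phrase-table, the sketch below indexes a tokenized corpus and finds every position where a phrase occurs. It is a toy under stated assumptions, not the Moses implementation: the function names and the example corpus are made up, the array is static rather than dynamic (it would have to be rebuilt when data is added), and the word alignments needed to extract an actual translation rule are left out.

```python
def build_suffix_array(tokens):
    """Return the start positions of all suffixes, in sorted suffix order."""
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def find_phrase(tokens, suffix_array, phrase):
    """Binary-search the suffix array for every occurrence of `phrase`."""
    n = len(phrase)

    def prefix(rank):
        start = suffix_array[rank]
        return tokens[start:start + n]

    # Lower bound: first suffix whose n-token prefix is >= phrase.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < phrase:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose n-token prefix is > phrase.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= phrase:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


corpus = "the house is small the house is big".split()
suffix_array = build_suffix_array(corpus)
positions = find_phrase(corpus, suffix_array, ["the", "house"])
```

Because the training data itself is stored, occurrences of a source phrase can be looked up at decoding time instead of being read from a precomputed table.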
Technical Breakout
• Organization and Releases
  – Academic and commercial needs
  – Prevent forks
  – Development/Stable versions
  – Forwards/Backward compatibility
  – Upgradability
• Features
• Deployment
  – Platform/Clouds
  – Docker containers
  – Priorities
  – Interaction and data formats
• Future development
  – User-friendliness
  – End-to-end solution
  – Users