Machine Learning for NLP Introduction session Aurélie Herbelot 2018 Centre for Mind/Brain Sciences University of Trento 1

Machine Learning for NLP - aurelieherbelot.netaurelieherbelot.net/resources/slides/teaching/ml-intro.pdf · Machine Learning for NLP Introduction session Aurélie Herbelot ... The

Jun 07, 2018


Machine Learning for NLP

Introduction session

Aurélie Herbelot

2018

Centre for Mind/Brain Sciences
University of Trento


Machine Learning: what is it?


Machines that learn...


Everybody wants to do it...

Forrester (Nov. 2016):

‘Insight-driven businesses’ will steal $1.2 trillion/year from competitors by 2020.


Why should machines be able to learn?

• “Good old AI” assumed it might be possible to program an intelligent machine by hand. It failed.

• The world is complex and ‘rules’ are not so easy to write down.
Exercise: what is a chair?

• For AI purposes, it is essential that a machine has the flexibility to change in response to your new data.


Difference with human learning

• There are parallels between human and machine learning: incremental procedure involving interaction with the world (data) and possibly human beings as well as fellow machines, but...

• Children grow in environments that are very different from what we can offer machines. They are born with sensory-motor capabilities that current machines do not have. They have innate knowledge.

• Machines, on the other hand, can be trained 24/7, on a lot more data (and are not particularly good with small data...)


Why learn language?


Applications

• ML for NLP is used in many real-life applications. Classic uses are:

  • Information retrieval (search engines).
  • Machine translation.
  • Automatic essay grading.
  • Recommendation systems.
  • Spam filtering.
  • (Not-so-clever) conversational agents...


Applications

• More recent developments:

  • Automatic medical diagnoses (more today).
  • Automatic court judgments.
  • Cleverer conversation agents.
  • Fixing the world: fact-checking, debiasing, etc. (????)


One of the most fundamental human abilities

• Language lets us communicate about things that are not here:

  • Please sit down. (You haven’t yet.)
  • Bring me the chair from the living room.
  • If it rains tomorrow...
  • Once upon a time, there was a unicorn...
  • Let x be a variable...


One of the most fundamental human abilities

• Language allows us to speak about complex objects and processes:

  • Look at that chair with the velvet back, the one with the flowery English pattern.

  • Insulin is a peptide hormone produced by beta cells of the pancreatic islets.

  • I’m jealous. It’s not that I want that car, but I don’t think he should have it either.

  • Bring the curd to the boil, let it boil for exactly three minutes whilst gently stirring.


One of the most fundamental human abilities

• Language allows us to change the world as we know it:

  • Let’s build a bridge.
  • What if the Universe didn’t have 3, but 10 dimensions?
  • We must change our political system.
  • I love you.


Language and AI

• See A Roadmap towards Machine Intelligence (Mikolov et al. 2016)

• The last decades have been spent on research focusing on specific applications.

• We have today tremendous computational power and huge data. Can we go back to the goal of simulating general intelligence?

• One crucial characteristic of an intelligent machine is the ability to communicate. How do we get a machine to learn language?


Communication and language

• Language is the most powerful communication device at our disposal.

• A system mastering natural language can ‘teach itself’ through written material.

• It can also learn through different modes of interaction (see lecture vs reading group).


Communication and perception

• Natural language can convey non-linguistic information via descriptions of the environment and associated perceptions.

• ‘Hallucinate’ perception from language.

Odena et al (2017)


Several levels of question answering

• Q: What is the density of gold?
The machine searches the Internet for the answer: 19.3 g/cm³.

• Q: What is a good starting point to study reinforcement learning?
The machine searches several websites, gets an idea of their popularity, and matches them to the user’s learning style.

• Q: What is the most promising direction to cure cancer, and where should I start to meaningfully contribute?
1) The machine reads many research articles about the topic. 2) It finds out about the user’s perspective and current specialisation. 3) It may engage with other experts/machines to answer the question.


Language and cognition

• Input and output are linguistic, but no claim is made about internal representations (is there a ‘language of thought’?)

• The language of thought hypothesis (LOTH): Jerry Fodor.

• Thoughts have compositional structure, like language.

• Concepts combine to produce thoughts, following grammar rules.


Language and cognition

• There might not be a language of thought. And still, language and concepts are tightly related.

• See some results in vector-space semantics:

  • Psycholinguistic validity:
    • Vectors reproduce similarity judgements.
    • They account for priming effects.
    • Also for some aspects of language acquisition.

  • Neurolinguistic validity:
    • At least at a coarse level, vectors map onto brain activity.
    • There is actually some evidence that ‘composition’ by vector addition reproduces brain imaging obtained for simple sentences.
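To make the two vector operations mentioned above concrete, here is a minimal sketch of cosine similarity (the usual stand-in for similarity judgements) and of composition by vector addition. The three-dimensional vectors are invented for illustration; real distributional vectors have hundreds of dimensions and are learnt from corpora.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def compose_by_addition(u, v):
    """Additive composition: the phrase vector is the sum of the word vectors."""
    return [a + b for a, b in zip(u, v)]

# Toy vectors, made up for this example (not taken from any trained model).
vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
    "black": [0.3, 0.3, 0.3],
}

sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
black_cat = compose_by_addition(vectors["black"], vectors["cat"])

print(f"cat~dog: {sim_cat_dog:.3f}, cat~car: {sim_cat_car:.3f}")
print("black cat:", black_cat)
```

With these toy values, "cat" comes out closer to "dog" than to "car", and the composed phrase vector stays close to its head noun.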


What do we have to learn to learn language?


ML and complexity

• Deep learning as multiple feature learning stages, followed by a logistic regression.

• Typically, implemented as several layers of a neural network, learning more and more fine-grained features of some input data.

• Last layer generally corresponds to some classification task.

Bojarski et al (2016)
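The "feature learning stages followed by a logistic regression" view can be sketched as a forward pass in plain Python. The weights, layer sizes and the ReLU activation below are arbitrary choices made for illustration, not values from the lecture or from Bojarski et al.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, hidden_layers, output_weights, output_bias):
    """Stack feature-learning layers, then apply a logistic regression on top."""
    features = x
    for weights, biases in hidden_layers:
        features = dense(features, weights, biases, relu)
    z = sum(w * f for w, f in zip(output_weights, features)) + output_bias
    return sigmoid(z)  # probability of the positive class

# Hypothetical, hand-picked parameters (a trained network would learn these).
hidden_layers = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),
    ([[0.3, 0.7], [-0.6, 0.2]], [0.05, 0.0]),
]
output_weights = [1.0, -1.0]
output_bias = 0.0

probability = forward([1.0, 2.0], hidden_layers, output_weights, output_bias)
print(f"P(class = 1) = {probability:.4f}")
```

Only the last two lines of `forward` are the logistic regression; everything before them transforms the raw input into the features it operates on.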


A standard ‘NLP pipeline’

Example NLP pipeline for a Spoken Dialogue System.

http://www.nltk.org/book_1ed/ch01.html.


Learning morphology

• The meaning of a word is to some extent predictable from its parts.

• The predictable aspects are learnable, using e.g. distributional semantics techniques.

Marelli & Baroni (2015)


Learning syntax (recap)

Probabilistic CKY

S → NP VP     1.0
VP → VP PP    0.7
VP → V NP     0.5
VP → eats     0.1
PP → P NP     0.8
NP → Det N    0.7
NP → NP PP    0.2
NP → she      0.1
NP → cake     0.1
V → eats      0.1
P → with      0.2
N → fork      0.1
Det → a       0.2

     1     2      3      4      5     6
1    NP    V,VP   NP     P      Det   N
2    S     VP                   NP
3    S                   PP
4
5          VP
6    S

     she   eats   cake   with   a     fork

(she (eats (cake)) (with (a (fork))))

$P(T) = 0.1 \times 0.1 \times 0.1 \times 0.2 \times 0.2 \times 0.1 \times 0.5 \times 0.7 \times 0.8 \times 0.7 \times 1.0 = 7.84 \times 10^{-7}$


Learning syntax (recap)

Probabilistic CKY

S → NP VP     1.0
VP → VP PP    0.7
VP → V NP     0.5
VP → eats     0.1
PP → P NP     0.8
NP → Det N    0.7
NP → NP PP    0.2
NP → she      0.1
NP → cake     0.1
V → eats      0.1
P → with      0.2
N → fork      0.1
Det → a       0.2

     1     2      3      4      5     6
1    NP    V,VP   NP     P      Det   N
2    S                          NP
3    S                   PP
4                 NP
5          VP
6    S

     she   eats   cake   with   a     fork

(she (eats) (cake (with (a (fork)))))

$P(T) = 0.1 \times 0.1 \times 0.1 \times 0.2 \times 0.2 \times 0.1 \times 0.7 \times 0.8 \times 0.2 \times 0.5 \times 1.0 = 2.24 \times 10^{-7}$
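The two tree probabilities can be checked with a few lines of Python. The sketch below does not run CKY itself; it only multiplies the rule probabilities used in a given tree. The grammar is copied from the slide, while the nested-tuple tree encoding and the function name are assumptions made for this example.

```python
# Rule probabilities from the slide, keyed by (left-hand side, right-hand side).
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("VP", "PP")): 0.7,
    ("VP", ("V", "NP")): 0.5,
    ("VP", ("eats",)): 0.1,
    ("PP", ("P", "NP")): 0.8,
    ("NP", ("Det", "N")): 0.7,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("she",)): 0.1,
    ("NP", ("cake",)): 0.1,
    ("V", ("eats",)): 0.1,
    ("P", ("with",)): 0.2,
    ("N", ("fork",)): 0.1,
    ("Det", ("a",)): 0.2,
}

def tree_probability(tree):
    """Probability of a tree: the product of the probabilities of its rules.

    A tree is a tuple (label, child, ...); a child is a subtree or a word.
    """
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULES[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_probability(child)
    return p

with_a_fork = ("PP", ("P", "with"), ("NP", ("Det", "a"), ("N", "fork")))

# Reading 1: the PP attaches to the verb phrase (eating with a fork).
vp_attachment = (
    "S",
    ("NP", "she"),
    ("VP", ("VP", ("V", "eats"), ("NP", "cake")), with_a_fork),
)

# Reading 2: the PP attaches to the noun phrase (cake that has a fork).
np_attachment = (
    "S",
    ("NP", "she"),
    ("VP", ("V", "eats"), ("NP", ("NP", "cake"), with_a_fork)),
)

p_vp = tree_probability(vp_attachment)
p_np = tree_probability(np_attachment)
print(f"VP attachment: {p_vp:.3g}")
print(f"NP attachment: {p_np:.3g}")
```

Under this grammar the verb-phrase attachment is the more probable parse, which is the one a probabilistic CKY parser would return.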


Learning semantics

• The many facets of meaning:

  • meaning is extension / reference: it ‘points’ at things in the world;

  • meaning is intension: the Morning Star and the Evening Star point at the same object (their extension), but linguistically, they are not the same;

  • meaning is conceptual: linguistic constituents activate reproducible cognitive processes involving extra-linguistic features;

  • meaning is use: words that occur in similar contexts are semantically similar.
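The "meaning is use" view is the one that is easiest to turn into code: represent each word by counts of the words around it, then compare those count vectors. The sketch below uses a five-sentence corpus and a window of two words, both invented for illustration; real systems use large corpora and weighted counts.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy corpus, made up for this example.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
    "she drives the car",
    "he parks the car",
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
print(f"cat~dog: {sim_cat_dog:.3f}, cat~car: {sim_cat_car:.3f}")
```

Even on this tiny corpus, "cat" and "dog" share more contexts than "cat" and "car". Note that the frequent word "the" dominates all vectors, which is why practical systems reweight raw counts.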


Learning semantics

• Meaning is probably all of those things...

Lazaridou et al (2017)


Learning pragmatics

• How does the broader context affect meaning? (E.g. situation of utterance, community of the speaker, etc.)

• Example 1: (In a reference letter for an academic job) “Mr Smith was always very punctual.”
Does the letter writer think much of Mr Smith?

• Example 2: how does a community contribute to the emergence and spread of meanings? (del Tredici & Fernández, 2017)


Integration with the world

A system that learns reference must be able to link language to the world, including to its perception of the world.


Productivity, creativity: learning to generalise

• A proficient speaker should understand that turning left and turning right share properties.

• Fregean compositionality: the meaning of the whole is given by the meaning of the parts.

• Also Frege’s ‘context principle’: parts have meaning in virtue of the whole.


Productivity, creativity: learning to generalise

• The described AI requires a long-term memory to store concepts and algorithms. The content of the long-term memory is extendable.

• It is essential for the agent to understand which primitives and composition processes it should store to be as efficient and flexible as possible.

• Hard question: to what extent are linguistic expressions decomposable? Are there semantic primitives? (Boleda and Erk 2015)


Is there a pipeline?

• Problem: it is not clear that ‘the NLP pipeline’ is so cleanly divided into task-specific modules.

• See e.g. Baayen et al (2015): language is not a formal calculus but an information-theoretic process over phonemes, producing so-called lexomes, which encode experience-dependent meaning.

• Fundamental question: when learning, what should we learn from, and what can we expect to learn?


Is there a pipeline?


ML and complexity

• Coming back to our deep learning car...

• If learning language is a collection of NN layers... what do the layers encode?

• Is language even a clean stack of linguistic skills? Probably not...

Bojarski et al (2016)


Oh, and check it works for all languages...

Languages by proportion of native speakers,

https://commons.wikimedia.org/w/index.php?curid=41715483


What do we have to do to learn?


Example: IBM Watson for oncology

• IBM Watson:

• Watson for Oncology (WFO): an AI expert for automatically diagnosing cancer and making appropriate treatment recommendations.

• The instance of WFO in this paper has learnt from 300 medical journals and textbooks, treatment guidelines, actual breast cancer cases, including patient characteristics and laboratory findings.

• 93% concordance between medical experts and system when recommending treatment for breast cancer.


A Twitter review of WFO

https://twitter.com/EnricoCoiera/status/971647548875186178


Comparing learning data and real-world data

Are we learning from the right data? Can the learning be transferred to the setting where the application will be deployed?
(Say you learn to drive a car, does that mean you can drive anything? Or even any car? What if you learnt on an automatic?)


How to represent the data we are learning from?

What is important in the data? How are we going to present it to the learner?
(Is the brand of your car an important factor in knowing how to drive? Perhaps, perhaps not.)


How to deal with human data used to train the system?

If there is manual intervention by humans, either in training or testing, what is human performance on the task?


How to learn?

What algorithm is used to learn?


How to evaluate the system?

What did we want to learn? Are we sure we learnt it? Does our evaluation measure strictly assess the behaviour we want to train?
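One way to see why the choice of evaluation measure matters is a spam filter that never flags anything. The sketch below is a made-up example, not a system from the course: accuracy looks good because most messages are not spam, while recall on the spam class shows the filter has learnt nothing useful.

```python
def accuracy(gold, predicted):
    """Proportion of items whose predicted label matches the gold label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

def precision_recall(gold, predicted, positive):
    """Precision and recall for one class; 0.0 when the denominator is empty."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, predicted))
    fp = sum(g != positive and p == positive for g, p in zip(gold, predicted))
    fn = sum(g == positive and p != positive for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Invented data: nine legitimate messages, one spam message.
gold = ["ham"] * 9 + ["spam"]
# A degenerate "classifier" that labels everything as legitimate.
predicted = ["ham"] * 10

acc = accuracy(gold, predicted)
precision, recall = precision_recall(gold, predicted, positive="spam")
print(f"accuracy={acc}, spam precision={precision}, spam recall={recall}")
```

If the behaviour we want is "catch spam", accuracy alone does not assess it here; recall on the spam class does.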


Course overview


Goals

1. Understand core machine learning algorithms for NLP.

2. Be able to read and criticise related literature.

3. Acquire some fundamental computational skills to run ML code and interpret its output.


Session structure

• An introductory week, followed by 9 topics, each associated with 3 classes:

1. A lecture presenting the topic for that week.
2. A lecture (with audience participation!) presenting one or two papers using the presented algorithm(s) / metric(s).
3. A practical with a task and/or some code to play with.


Week 1: introduction

• Today’s lecture!

• Basic principles of statistical NLP.

• Run a simple authorship classification algorithm.


Week 2: data preparation

• How to choose data. Inter-annotator agreement metrics.

• How to fool an image captioning system? (Hint: give it difficult data.) How to fool oneself? (Hint: by thinking one’s annotation scheme was detailed enough.)

• Hands-on intro to crowdsourcing. Annotate and calculate your inter-annotator agreement.
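As a preview of what an inter-annotator agreement metric looks like, here is a minimal sketch of Cohen's kappa for two annotators. Choosing kappa is an assumption made for this example (the slide does not say which metrics the session covers), and the labels are invented.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled at random
    # according to their own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators always use the same single label
    return (observed - expected) / (1.0 - expected)

# Invented annotations of ten items by two annotators.
annotator_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "pos", "neg"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

Here the annotators agree on 8 items out of 10, but since half of that agreement would be expected by chance, kappa is lower than the raw 80%.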


Week 3: supervised learning

• Introduction to regression techniques.

• Using regression to understand the performance of a system for compositional morphology.

• Introducing regression in Python.
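For a first taste of regression in Python, the sketch below fits a straight line by ordinary least squares using only the standard library. The data points are invented, and the practical itself may well rely on a library instead; this is just the closed-form solution written out.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Proportion of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Invented data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(f"y = {slope:.2f} * x + {intercept:.2f} (R^2 = {r2:.4f})")
```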


Week 4: unsupervised learning

• Clustering and dimensionality reduction.

• Latent Semantic Analysis: “How do children learn as much as they do, given the little information they get?”

• Document clustering for information retrieval. We’ll be playing with the code for the PeARS search engine.
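To give an idea of what document clustering involves, here is a minimal k-means sketch over toy two-dimensional "document vectors". It is not the PeARS code; the vectors, the choice of k-means and the fixed initial centroids are all assumptions made for illustration.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]

def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = mean(members)
    return assignments, centroids

# Toy document vectors: the two dimensions could stand for two topics.
documents = [
    [1.0, 0.1], [0.9, 0.2], [0.8, 0.0],  # mostly topic 1
    [0.1, 1.0], [0.2, 0.9], [0.0, 0.8],  # mostly topic 2
]

assignments, centroids = kmeans(documents, [documents[0], documents[3]])
print("cluster of each document:", assignments)
print("centroids:", centroids)
```

In a retrieval setting, a query vector could then be compared to the centroids first, and only to the documents of the closest cluster afterwards.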


Week 5: Support Vector Machines

• Introduction to kernel machines.

• Detection of semantic errors in the prose of non-English speakers with SVMs.

• Introduction to running SVMlight.


Week 6: intro to Neural Nets

• Basics of NNs and general AI concepts.

• What do NNs really have to do with neuroscience?

• Implement a Neural Net from scratch! http://www.wildml.com/2015/09/implementing-a-neural-network-from-scratch


Week 7: RNNs and LSTMs

• Sequence learning with neural networks.

• How to generate text with RNNs.

• Implement an RNN in Theano.

https://colah.github.io/posts/2015-08-Understanding-LSTMs/
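The core of a simple recurrent network fits in a few lines. The sketch below is plain Python rather than Theano, covers only the forward pass of a basic (Elman-style) RNN with no training and no LSTM gating, and uses hand-picked weights that are assumptions for this example.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One time step: h_t = tanh(W_xh . x_t + W_hh . h_{t-1} + b_h)."""
    h_new = []
    for i in range(len(b_h)):
        total = b_h[i]
        total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(w_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def run_rnn(inputs, h0, w_xh, w_hh, b_h):
    """Feed a sequence through the network, returning every hidden state."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b_h)
        states.append(h)
    return states

# Hypothetical parameters: 2 input units, 2 hidden units.
w_xh = [[0.5, -0.3], [0.2, 0.8]]
w_hh = [[0.1, 0.4], [-0.2, 0.3]]
b_h = [0.0, 0.0]

# One-hot encodings of a two-symbol vocabulary.
a, b = [1.0, 0.0], [0.0, 1.0]

states_ab = run_rnn([a, b], [0.0, 0.0], w_xh, w_hh, b_h)
states_ba = run_rnn([b, a], [0.0, 0.0], w_xh, w_hh, b_h)
print("final state for 'a b':", states_ab[-1])
print("final state for 'b a':", states_ba[-1])
```

The two final states differ although both sequences contain the same symbols: the hidden state carries information about order, which is what makes the architecture suitable for sequence learning.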


Week 8: Reinforcement learning

• Basics of RL.

• Multi-agent emergence of natural language.

• Try the OpenAI Gym! A gym for artificial agents...


Week 9: ML and ethics

• Ethical issues with ML. Bias in distributional vectors.

• Literature on bias and on de-biasing.

• Visualisation of word embeddings for bias detection.


Material

All material will be posted at: http://aurelieherbelot.net/teaching/

Any question, worry, complaint... write to: [email protected]
