Master Thesis Slide

Jan 10, 2017

Page 1: Master Thesis Slide

Dipartimento di Informatica

Università degli studi di Salerno, Fisciano

Capturing Emotions with Deep Learning

Jacek Filipczuk

25 February 2016

Page 2: Master Thesis Slide

Content

Introduction
    Problem Definition

The goal
    Emotions
    Emotion Models
    Deep Learning Neural Networks
    MLP
    CNN
    LSTM

Preprocessing
    Datasets
    Feature Creation
    NLTK
    Feature Extraction
    Labeling

Training and Testing

Conclusions

Jacek Filipczuk | Emotions with Deep Learning

Page 4: Master Thesis Slide

Emotion Extraction from Text

- Why is it important?
- Is it useful?
- How can it be done?

Page 5: Master Thesis Slide

Problem Definition

Figure: A positive review

Page 6: Master Thesis Slide

Problem Definition

Figure: A negative review

Page 8: Master Thesis Slide

Sentiment Analysis

Sentiment Analysis is the process of determining the emotional tone behind a text written by a user. It helps to understand the attitudes, opinions and emotions expressed in online content.

Page 9: Master Thesis Slide

Why emotions?

Social media sentiment analysis can be an excellent source of information and can provide insights that can:

- determine market strategy;
- improve campaign success;
- improve product messaging;
- improve customer service;
- test business performance.

Page 10: Master Thesis Slide

Different Models

Figure: Different Emotion Models.

Page 11: Master Thesis Slide

Plutchik

Figure: Plutchik Emotion Model.

Page 12: Master Thesis Slide

The hourglass

Figure: The Hourglass of Emotions Model.

Page 13: Master Thesis Slide

The emotions

The target emotions were inspired by the previous models.

Page 14: Master Thesis Slide

Deep Neural Networks

The process of hierarchical feature learning is divided into two steps. First, multiple layers of non-linear features are extracted. Second, these features are passed to a classifier that combines them to make predictions.

Deep Neural Networks were inspired by the human brain.
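
The two steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the network used in the thesis: the layer sizes, the weight values and the helper names (`dense`, `softmax`, `forward`) are all invented for the example.

```python
import math

def dense(inputs, weights, biases):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, hidden_layers, classifier):
    # Step 1: stacked non-linear layers extract features.
    features = x
    for weights, biases in hidden_layers:
        features = [math.tanh(v) for v in dense(features, weights, biases)]
    # Step 2: a classifier combines the features into a prediction.
    weights, biases = classifier
    return softmax(dense(features, weights, biases))

# Hypothetical toy weights: 2 inputs -> 2 hidden -> 2 hidden -> 3 classes.
hidden_layers = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),
    ([[0.3, 0.8], [-0.6, 0.2]], [0.0, 0.0]),
]
classifier = ([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0])

probabilities = forward([1.0, 2.0], hidden_layers, classifier)
predicted_class = probabilities.index(max(probabilities))
```

In a trained network the weights would be learned from labelled examples rather than written by hand.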

Page 15: Master Thesis Slide

A Neural Network

Figure: The architecture of the first known deep network, trained by Alexey Grigorevich Ivakhnenko in 1965.

Page 16: Master Thesis Slide

MLP architecture

Figure: The architecture of a generic MLP neural network.

Page 17: Master Thesis Slide

Convolutional Neural Network

A CNN takes into account the spatial structure of the input data.
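
A small sketch of what "spatial structure" means in practice: a convolution slides one short filter along the input, so each output depends only on neighbouring values. The function name `conv1d` and the filter values are assumptions made for this example; for text, the positions would be consecutive words.

```python
def conv1d(signal, kernel):
    """Slide `kernel` over `signal` (no padding, stride 1).

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    width = len(kernel)
    return [
        sum(k * s for k, s in zip(kernel, signal[start:start + width]))
        for start in range(len(signal) - width + 1)
    ]

# The same three weights are reused at every position, so the filter
# responds to a local pattern wherever it occurs in the sequence.
edge_filter = [-1.0, 0.0, 1.0]
response = conv1d([0.0, 0.0, 1.0, 1.0, 1.0, 0.0], edge_filter)
# response is [1.0, 1.0, 0.0, -1.0]: positive on the rise, negative on the fall
```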

Page 18: Master Thesis Slide

Convolutional Neural Network

Figure: An example of a CNN.

Page 19: Master Thesis Slide

LSTM Neural Network

Long short-term memory (LSTM) networks are a special kind of recurrent neural network capable of learning long-term dependencies. LSTMs are explicitly designed to remember information for long periods of time.
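
How an LSTM keeps information over many steps can be sketched with a scalar cell. The update below follows the standard LSTM equations; the parameter names and values are assumptions picked for the example (a forget gate biased towards 1 and an input gate biased towards 0), not values from the thesis.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell with parameters `p`."""
    # Each gate looks at the current input and the previous hidden state.
    forget_gate = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])
    input_gate = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])
    output_gate = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])
    candidate = math.tanh(p["wc"] * x + p["uc"] * h_prev + p["bc"])
    # The cell state keeps part of its old content and adds new content.
    c = forget_gate * c_prev + input_gate * candidate
    h = output_gate * math.tanh(c)
    return h, c

# Hypothetical parameters: keep the memory, let almost nothing new in.
params = {"wf": 0.0, "uf": 0.0, "bf": 10.0,
          "wi": 0.0, "ui": 0.0, "bi": -10.0,
          "wo": 1.0, "uo": 0.0, "bo": 0.0,
          "wc": 1.0, "uc": 0.0, "bc": 0.0}

h, c = 0.0, 1.0  # start with a value stored in the cell state
for x in [0.3, -0.8, 0.5, 0.1, -0.2]:
    h, c = lstm_step(x, h, c, params)
# After five steps the cell state is still close to its initial value.
```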

Page 20: Master Thesis Slide

The architecture

Figure: An example of the repeating module in an LSTM neural network.

Page 21: Master Thesis Slide

The cell

Figure: An example of the cell.

Page 23: Master Thesis Slide

The datasets

Two different datasets were used in this work:
- the IMDB dataset;
- the ISEAR dataset.

Page 24: Master Thesis Slide

IMDB

- Who created the dataset? -> Users
- What data does it contain? -> Reviews
- How big is it? -> 25 000

Page 25: Master Thesis Slide

ISEAR

- Who created the dataset? -> Psychologists
- What data does it contain? -> Short sentences
- How big is it? -> 7500

Page 26: Master Thesis Slide

Natural Language problems

Natural language processing problems are directly caused by language problems. In fact, there are some major language problems:

- Languages change every day: new words, new rules, etc.
- Words can have different meanings depending on context, and they can acquire new meanings over time (apple [a fruit], Apple [a company]); they can even change their part of speech (Google -> to google).
- Computers do not understand words. They do not understand their true meanings as humans do.

Page 27: Master Thesis Slide

Natural Language Toolkit

A general NLP pipeline:
- Tokenization;
- Tagging;
- Chunking.
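
The first two stages can be illustrated with the standard library alone. This is a toy sketch, not NLTK's implementation: the regular expression, the hand-written `TOY_LEXICON` and the fallback rule are assumptions made for the example. In NLTK itself these steps are provided by `nltk.word_tokenize` and `nltk.pos_tag`, which rely on trained models.

```python
import re

def tokenize(text):
    """Split text into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

# Hypothetical hand-written lexicon; a real tagger is trained on a corpus.
TOY_LEXICON = {"the": "DT", "little": "JJ", "yellow": "JJ",
               "dog": "NN", "cat": "NN", "barked": "VBD", "at": "IN"}

def tag(tokens):
    """Attach a part-of-speech tag to every token.

    Unknown words default to NN; punctuation is tagged with itself.
    """
    return [(tok, TOY_LEXICON.get(tok.lower(), "NN" if tok.isalnum() else tok))
            for tok in tokens]

tokens = tokenize("The little yellow dog barked at the cat.")
tagged = tag(tokens)
```

The tagged pairs are the input that a chunker then groups into phrases.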

Page 28: Master Thesis Slide

Chunking grammar

grammar = "NP: {<DT>?<JJ>*<NN>}"

Figure: Segmentation and Labeling at both the Token and Chunk Levels.
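
To show what this tag pattern matches (an optional determiner, any number of adjectives, then a noun), here is a standard-library sketch that applies the same pattern to a tagged sentence. It is a hand-rolled stand-in for illustration, not NLTK code; in NLTK the grammar string would be passed to `nltk.RegexpParser`. The function name `chunk_noun_phrases` and the example sentence are assumptions.

```python
import re

# Tag pattern from the slide: optional determiner, any adjectives, one noun.
NP_PATTERN = re.compile(r"(?:<DT>)?(?:<JJ>)*<NN>")

def chunk_noun_phrases(tagged):
    """Return the noun-phrase chunks found in a list of (word, tag) pairs."""
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN><VBD>".
    tag_string = "".join(f"<{tag}>" for _, tag in tagged)
    # Character offset at which each token's tag starts in tag_string.
    starts = []
    pos = 0
    for _, tag in tagged:
        starts.append(pos)
        pos += len(tag) + 2
    chunks = []
    for match in NP_PATTERN.finditer(tag_string):
        first = starts.index(match.start())
        last = first + match.group(0).count("<")
        chunks.append([word for word, _ in tagged[first:last]])
    return chunks

sentence = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"),
            ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]
chunks = chunk_noun_phrases(sentence)
# chunks == [["the", "little", "yellow", "dog"], ["the", "cat"]]
```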

Page 29: Master Thesis Slide

Building the dataset

Adverbs + Adjectives = Emotion Score

Feature[i] += 2 + coeff

where coeff = ±1
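
One possible reading of this update rule is sketched below. The slide does not say how coeff is chosen or which words map to which emotion, so everything beyond the formula itself is an assumption: the three-emotion list, the two tiny lexicons, the restriction to adverb + adjective pairs, and the choice of +1 for intensifying and -1 for weakening adverbs.

```python
# Hypothetical lexicons, invented for this example only.
EMOTIONS = ["joy", "sadness", "anger"]
ADJECTIVE_EMOTION = {"happy": 0, "gloomy": 1, "furious": 2}
ADVERB_COEFF = {"very": +1, "slightly": -1}

def emotion_features(tagged):
    """Score each emotion from adverb + adjective pairs in a tagged text."""
    feature = [0] * len(EMOTIONS)
    for (word, tag), (next_word, next_tag) in zip(tagged, tagged[1:]):
        is_pair = tag.startswith("RB") and next_tag.startswith("JJ")
        if is_pair and word in ADVERB_COEFF and next_word in ADJECTIVE_EMOTION:
            i = ADJECTIVE_EMOTION[next_word]
            coeff = ADVERB_COEFF[word]   # assumed: +1 or -1
            feature[i] += 2 + coeff      # update rule from the slide
    return feature

tagged = [("very", "RB"), ("happy", "JJ"), ("but", "CC"),
          ("slightly", "RB"), ("gloomy", "JJ")]
scores = emotion_features(tagged)
# scores == [3, 1, 0]
```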

Page 30: Master Thesis Slide

Labeling the Dataset

The target emotion is the one with the greatest score.
- Expanded
- Compressed
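
Picking the emotion with the greatest score is an argmax over the feature vector. The sketch below assumes a hypothetical three-emotion label set and breaks ties by taking the first maximum; the slide does not say how ties were handled.

```python
EMOTIONS = ["joy", "sadness", "anger"]  # hypothetical label set

def label(scores):
    """Return the emotion whose score is greatest (first one on ties)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return EMOTIONS[best]

target = label([3, 1, 0])
# target == "joy"
```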

Page 32: Master Thesis Slide

The goal

Train and test with different Deep Neural Networks!

Page 33: Master Thesis Slide

MLP Testing

Table: 16x7586 ISEAR

Accuracy   Loss       Learning Rate   Epochs
40.93%     1.594191   0.01            300
43.07%     1.577638   0.01            300
43.60%     1.531735   0.01            300
43.27%     1.519117   0.001           300
44.53%     1.483311   0.001           300
42.87%     1.535861   0.001           300
33.93%     1.821047   0.0001          300
35.33%     1.826459   0.0001          300
36.36%     1.803543   0.0001          300
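
With three runs per learning rate, one simple way to read the table is to average the accuracy within each group. The sketch below does this for the 16x7586 ISEAR runs; the numbers are copied from the table, while the grouping itself is an added reading aid rather than an analysis from the thesis.

```python
from collections import defaultdict

# (accuracy %, learning rate) pairs copied from the 16x7586 ISEAR table.
runs = [(40.93, 0.01), (43.07, 0.01), (43.60, 0.01),
        (43.27, 0.001), (44.53, 0.001), (42.87, 0.001),
        (33.93, 0.0001), (35.33, 0.0001), (36.36, 0.0001)]

by_rate = defaultdict(list)
for accuracy, rate in runs:
    by_rate[rate].append(accuracy)

# Mean accuracy of the three runs at each learning rate.
mean_accuracy = {rate: sum(vals) / len(vals) for rate, vals in by_rate.items()}
best_rate = max(mean_accuracy, key=mean_accuracy.get)
# Means are about 42.53 (0.01), 43.56 (0.001) and 35.21 (0.0001).
```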

Page 34: Master Thesis Slide

MLP Testing

Table: 116x7586 ISEAR

Accuracy   Loss       Learning Rate   Epochs   Accuracy   Loss       Epochs
40.13%     1.592692   0.01            300      43.51%     1.531616   500
42.67%     1.576612   0.01            300      43.93%     1.571319   500
42.91%     1.531140   0.01            300      41.79%     1.583024   500
42.71%     1.529107   0.001           300      44.69%     1.521561   500
44.31%     1.482301   0.001           300      44.74%     1.433542   500
43.97%     1.515163   0.001           300      42.82%     1.492315   500
34.33%     1.721917   0.0001          300      37.96%     1.798342   500
35.53%     1.821449   0.0001          300      37.12%     1.802671   500
34.76%     1.803183   0.0001          300      35.61%     1.828191   500

Page 35: Master Thesis Slide

MLP Testing

Table: 267x7586 ISEAR

Accuracy   Loss       Learning Rate   Epochs
23.50%     1.826602   0.01            300
24.13%     1.807206   0.01            300
23.71%     1.796651   0.01            300
22.73%     1.795133   0.001           300
25.13%     1.773156   0.001           300
24.73%     1.776218   0.001           300
22.60%     1.856014   0.0001          300
24.07%     1.852020   0.0001          300
22.67%     1.861846   0.0001          300

Page 36: Master Thesis Slide

ISEAR results

Page 37: Master Thesis Slide

MLP Testing

Table: 16x2156 IMDB

Accuracy   Loss       Learning Rate   Epochs
88.00%     0.402361   0.01            300
86.50%     0.466989   0.01            300
91.00%     0.382211   0.01            300
89.00%     0.468194   0.001           300
87.25%     0.548809   0.001           300
89.00%     0.433285   0.001           300
78.75%     1.212943   0.0001          300
79.25%     1.158447   0.0001          300
75.50%     1.218672   0.0001          300

Page 38: Master Thesis Slide

MLP Testing

Table: 116x2156 IMDB

Accuracy   Loss       Learning Rate   Epochs   Accuracy   Loss       Epochs
89.00%     0.394762   0.01            300      88.25%     0.409069   500
89.75%     0.391763   0.01            300      88.50%     0.432282   500
89.50%     0.404379   0.01            300      86.00%     0.430368   500
90.00%     0.389020   0.001           300      91.50%     0.364005   500
87.00%     0.510808   0.001           300      90.50%     0.416307   500
87.50%     0.492533   0.001           300      89.75%     0.407521   500
82.00%     1.174111   0.0001          300      83.12%     1.162348   500
77.75%     1.247049   0.0001          300      79.32%     1.152141   500
75.50%     1.235449   0.0001          300      80.43%     1.167145   500

Page 39: Master Thesis Slide

MLP Testing

Table: 267x2156 IMDB

Accuracy   Loss       Learning Rate   Epochs
91.75%     0.388250   0.01            300
92.50%     0.366930   0.01            300
93.00%     0.346140   0.01            300
92.25%     0.443516   0.001           300
90.00%     0.475815   0.001           300
87.25%     0.598331   0.001           300
77.75%     1.151705   0.0001          300
81.75%     1.058237   0.0001          300
78.50%     1.200452   0.0001          300

Page 40: Master Thesis Slide

MLP Testing

Table: 267x10200 IMDB

Accuracy   Loss       Learning Rate   Epochs
96.25%     0.277372   0.01            300
94.45%     0.335562   0.01            300
95.84%     0.289589   0.01            300
95.55%     0.316808   0.001           300
95.30%     0.323706   0.001           300
96.25%     0.297388   0.001           300
87.95%     0.594173   0.0001          300
87.60%     0.604254   0.0001          300
88.30%     0.567574   0.0001          300

Page 41: Master Thesis Slide

IMDB results

Page 42: Master Thesis Slide

CNN Testing

Table: Results of the test with Convolutional Neural Network.

Dataset   Accuracy   Loss       Learning Rate   Epochs
IMDB      96.65%     0.113728   0.001           300
ISEAR     36.21%     1.689331   0.001           300

Page 43: Master Thesis Slide

LSTM Testing

Table: Results of the test with Long Short-Term Memory Neural Network.

Dataset   Accuracy   Loss       Learning Rate   Epochs
IMDB      93.20%     0.28546    0.001           300
ISEAR     41.00%     1.586147   0.001           300

Page 44: Master Thesis Slide

Comparing results

Page 45: Master Thesis Slide

The big difference

Page 47: Master Thesis Slide

Final thoughts

In conclusion, deep neural networks are a really powerful tool that can achieve incredibly good results even for difficult problems like capturing emotion.

Page 48: Master Thesis Slide

Future improvements

In the future it will be interesting to improve the emotion labeling step presented in this work and to experiment more with deep neural networks, adding more layers or trying different types of layers.
