
Today - cs.utexas.edu

Jan 08, 2022


dariahiddleston
Transcript
Page 1: Today - cs.utexas.edu

CS 378 lecture 16

Today: RNNs
- LSTMs (the type of RNN you will be using)
- Implementation
- Midterm back soon

LM: P(w) or P(w_i | w_1, ..., w_{i-1})

"predict the next word"

Recap: RNNs + language modeling

d-dim (d = 50)

[Diagram: RNN unrolled over the input words]

P(w | I saw the dog) = softmax(W h)
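The softmax step above turns one score per vocabulary word into the next-word distribution. A minimal sketch, assuming a hypothetical 4-word vocabulary and made-up scores standing in for the matrix-vector product inside the softmax (neither comes from the lecture):

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into a probability distribution."""
    m = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and scores for the context "I saw the dog".
vocab = ["ran", "barked", "the", "."]
scores = [1.0, 2.0, -1.0, 0.5]

probs = softmax(scores)
next_word_dist = dict(zip(vocab, probs))
most_likely = max(next_word_dist, key=next_word_dist.get)
```

The probabilities are positive and sum to 1, and the word with the highest score gets the highest probability.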

Page 2: Today - cs.utexas.edu

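Training

"Backpropagation through time" = backprop

Updated during training: (1) params, (2) embeds

[Diagram: unrolled RNN. Inputs x_1, x_2 pass through the embeddings into the RNN cells, which output P(w | ...) at each step]

loss: -log P(w_i*)

Multiple updates for V (the same parameters are reused at every step) ⇒ no problem

Many types of RNNs:
- Elman network
- long short-term memory networks

[Diagram: RNN cell. Input and previous state go in; output and next state come out]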
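The training loss noted above (negative log probability of the correct word at each step) can be sketched as follows. The per-step distributions and gold indices are made-up numbers for illustration, and `sequence_nll` is a hypothetical helper name, not something from the course code:

```python
import math

def sequence_nll(step_distributions, gold_indices):
    """Sum of -log P(gold word) over all time steps of one sequence."""
    return sum(-math.log(dist[g])
               for dist, g in zip(step_distributions, gold_indices))

# Made-up model outputs over a 3-word vocabulary at three time steps.
step_distributions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.25, 0.25, 0.5],
]
gold_indices = [0, 1, 2]  # index of the correct next word at each step

loss = sequence_nll(step_distributions, gold_indices)
```

Because the per-step losses are added, the gradient with respect to a shared parameter is the sum of its per-step gradients, which is why reusing the same parameters at every step causes no problem.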

Page 3: Today - cs.utexas.edu

LSTMs (1997)

Short-term memory: what the model can remember in its state

[Diagram: chain of RNN states. Does the model "remember" the first input by the end?]

LONG short-term memory (remember for longer)

Problem w/ Elman networks: vanishing/exploding gradients

[Diagram: three RNN cells in a chain]

h_i = tanh(W x_i + V h_{i-1})

h_3 = tanh(W x_3 + V tanh(W x_2 + V tanh(W x_1)))
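To make the recurrence concrete, here is a small sketch that runs the Elman update for three steps and checks it against the nested expression for the third state. The 2x2 weights and the inputs are made up, and the initial state is assumed to be zero:

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def elman_step(W, V, x, h_prev):
    """One Elman update: h_i = tanh(W x_i + V h_{i-1})."""
    wx = matvec(W, x)
    vh = matvec(V, h_prev)
    return [math.tanh(a + b) for a, b in zip(wx, vh)]

# Made-up weights and three 2-dimensional inputs, for illustration only.
W = [[0.5, -0.3], [0.1, 0.8]]
V = [[0.9, 0.0], [0.2, 0.7]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

h = [0.0, 0.0]  # assumed h_0 = 0, so the innermost term is tanh(W x_1)
for x in xs:
    h = elman_step(W, V, x, h)
h3_loop = h

# The same value written out as the nested expression for h_3.
h1 = [math.tanh(a) for a in matvec(W, xs[0])]
h2 = [math.tanh(a + b) for a, b in zip(matvec(W, xs[1]), matvec(V, h1))]
h3_nested = [math.tanh(a + b) for a, b in zip(matvec(W, xs[2]), matvec(V, h2))]
```

Every extra time step wraps the earliest input in one more multiplication by V and one more tanh.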

Page 4: Today - cs.utexas.edu

Assume tanh is the identity:

h_3 = W x_3 + V W x_2 + V^2 W x_1

After n steps ⇒ x_1 is multiplied by about V^n, which shrinks toward zero or blows up depending on V
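A scalar version of this linear recurrence shows the effect numerically. This is a sketch under the same simplification (tanh replaced by the identity), with V and W reduced to made-up scalars `v` and `w`; with exact indexing the factor on the first input after n steps is v^(n-1) * w, which behaves the same way as V^n:

```python
def run_linear_rnn(v, w, xs):
    """Scalar recurrence h_i = w * x_i + v * h_{i-1}, starting from h_0 = 0."""
    h = 0.0
    for x in xs:
        h = w * x + v * h
    return h

def influence_of_first_input(v, w, n):
    """How much h_n changes per unit change in x_1: v^(n-1) * w."""
    return (v ** (n - 1)) * w

n = 20
small = influence_of_first_input(0.5, 1.0, n)   # |v| < 1: vanishes
large = influence_of_first_input(1.5, 1.0, n)   # |v| > 1: explodes

# Check the formula by bumping x_1 by 1 and rerunning the recurrence.
xs = [1.0] * n
bumped = [2.0] + xs[1:]
measured = run_linear_rnn(0.5, 1.0, bumped) - run_linear_rnn(0.5, 1.0, xs)
```

With v = 0.5 the first input barely affects the state 20 steps later, while with v = 1.5 its effect is amplified by more than a thousand.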

LSTM gating

Elman: h_i = tanh(W x_i + V h_{i-1})

Gated: h_i = h_{i-1} ⊙ f + function(x_i, h_{i-1}) ⊙ i

h_{i-1}: previous state; ⊙: elementwise multiplication

f: forget gate, values in [0, 1]

[Diagram: h_{i-1} multiplied elementwise by f]

If f = 1: h_{i-1} is totally preserved
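The gated update can be sketched elementwise as below. The vectors are made-up values, and `candidate` is a hypothetical name standing in for the output of function(x_i, h_{i-1}):

```python
def gated_update(h_prev, f, i, candidate):
    """h_i = h_{i-1} * f + candidate * i, with every operation elementwise."""
    return [hp * fk + ck * ik
            for hp, fk, ik, ck in zip(h_prev, f, i, candidate)]

h_prev = [0.2, -0.5, 0.9]
candidate = [0.7, 0.7, 0.7]  # stand-in for function(x_i, h_{i-1})

# f = 1, i = 0: the previous state is copied through unchanged.
preserved = gated_update(h_prev, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0], candidate)

# f = 0, i = 1: the previous state is forgotten and replaced.
overwritten = gated_update(h_prev, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0], candidate)

# Each dimension can make its own choice.
mixed = gated_update(h_prev, [1.0, 0.0, 0.5], [0.0, 1.0, 0.5], candidate)
```

Because the previous state reaches the new state through an elementwise product rather than a repeated matrix multiplication, a gate value near 1 lets information survive many steps.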

Page 5: Today - cs.utexas.edu

Where do f, i come from?

f = sigmoid(W^{xf} x_i + W^{hf} h_{i-1} + b_forget)
i = sigmoid(W^{xi} x_i + W^{hi} h_{i-1} + b_input)

b_forget, b_input: bias (added for completeness)
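Both gates follow the same pattern: an affine function of the current input and the previous state, squashed by a sigmoid so every entry lands strictly between 0 and 1. A minimal sketch with made-up shapes and values; the helper name `gate` is hypothetical:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gate(W_x, W_h, b, x, h_prev):
    """sigmoid(W_x x + W_h h_prev + b), computed elementwise."""
    zx = matvec(W_x, x)
    zh = matvec(W_h, h_prev)
    return [sigmoid(a + c + d) for a, c, d in zip(zx, zh, b)]

x = [1.0, -1.0]
h_prev = [0.3, 0.1]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# With zero weights the gate depends only on its bias.
neutral = gate(zeros, zeros, [0.0, 0.0], x, h_prev)        # sigmoid(0) = 0.5
remember = gate(zeros, zeros, [5.0, 5.0], x, h_prev)       # close to 1

# Made-up nonzero weights: the gate now reacts to x and h_prev.
f = gate([[0.4, -0.2], [0.1, 0.3]], [[0.5, 0.0], [0.0, 0.5]],
         [0.0, 0.0], x, h_prev)
```

A large positive forget bias pushes f toward 1, so the state is kept by default until the weights learn otherwise.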

[Diagram: h_{i-1} and x_i feed the gates, which produce the updated state h_i]

Page 6: Today - cs.utexas.edu

Chris Olah's blog

[Diagram from the blog: LSTM cell, with the forget gate, the output gate and the hidden state labeled]

LSTM: 8 weight matrices

hidden state h, cell state c: the tuple (h, c) is the LSTM state
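The count of 8 weight matrices matches the standard LSTM formulation: four components (forget gate, input gate, output gate, candidate update), each with one matrix for the input and one for the hidden state. The sketch below follows that standard formulation in plain Python rather than the course's own code; sizes, initialization and all names are assumptions:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def init_params(input_dim, hidden_dim, seed=0):
    """Small random weights; 4 components x 2 matrices = 8 weight matrices."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        name: {"W_x": matrix(hidden_dim, input_dim),
               "W_h": matrix(hidden_dim, hidden_dim),
               "b": [0.0] * hidden_dim}
        for name in ("forget", "input", "output", "candidate")
    }

def pre_activation(p, x, h_prev):
    """W_x x + W_h h_prev + b for one component."""
    return [a + b + c for a, b, c in
            zip(matvec(p["W_x"], x), matvec(p["W_h"], h_prev), p["b"])]

def lstm_step(params, x, state):
    """One LSTM step; the state is the tuple (h, c)."""
    h_prev, c_prev = state
    f = [sigmoid(z) for z in pre_activation(params["forget"], x, h_prev)]
    i = [sigmoid(z) for z in pre_activation(params["input"], x, h_prev)]
    o = [sigmoid(z) for z in pre_activation(params["output"], x, h_prev)]
    g = [math.tanh(z) for z in pre_activation(params["candidate"], x, h_prev)]
    # Gated, additive update of the cell state.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # The output gate decides how much of the cell state is exposed as h.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

params = init_params(input_dim=3, hidden_dim=2)
state = ([0.0, 0.0], [0.0, 0.0])  # initial (h, c)
for x in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
    state = lstm_step(params, x, state)
h, c = state
num_weight_matrices = sum(2 for _ in params)
```

In this formulation the gated update from the earlier pages is applied to the cell state c, and h is a gated view of c.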

Page 7: Today - cs.utexas.edu

Poll :

Discussed lstm_lecture.py.

I

4-pith I [1/1,2][0,1,2 ] ← [0,43]I