
1 Decision making. 2 How does the brain learn the values?

Dec 19, 2015

Transcript
Page 1: 1 Decision making. 2 How does the brain learn the values?

1

Decision making

Page 2: 1 Decision making. 2 How does the brain learn the values?

2

How does the brain learn the values?

Page 3: 1 Decision making. 2 How does the brain learn the values?

3

The computational problem

The goal is to maximize the sum of rewards

$V = E\left[\sum_{t}^{\text{end}} r_t\right]$

Page 4: 1 Decision making. 2 How does the brain learn the values?

4

The computational problem

The value of the state S1 depends on the policy

If the animal chooses ‘right’ at S1,

$V(S_1) = r_{\text{ice cream}} + V(S_2)$

Page 5: 1 Decision making. 2 How does the brain learn the values?

5

How to find the optimal policy in a complicated world?

Page 6: 1 Decision making. 2 How does the brain learn the values?

6

How to find the optimal policy in a complicated world?

• If values of the different states are known then this task is easy

$V(S_t) = r_t + V(S_{t+1})$
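To illustrate why known values make the choice easy, here is a minimal sketch: for each available action, add the immediate reward to the value of the state it leads to, and pick the largest sum. The action names, the numbers, and the helper `best_action` are hypothetical and are not taken from the slides.

```python
def best_action(options):
    """Pick the action with the largest (immediate reward + value of next state).

    `options` maps an action name to a pair (reward, value_of_next_state).
    """
    return max(options, key=lambda action: options[action][0] + options[action][1])


# Hypothetical numbers: 'left' gives no reward now but leads to a state worth 1,
# 'right' gives a reward of 2 now and leads to a state worth 0.
options = {"left": (0.0, 1.0), "right": (2.0, 0.0)}
choice = best_action(options)
print(choice)
```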

Page 7: 1 Decision making. 2 How does the brain learn the values?

7

How to find the optimal policy in a complicated world?

• If values of the different states are known then this task is easy

How can the values of the different states be learned?

Page 8: 1 Decision making. 2 How does the brain learn the values?

8

$V(S_t) = r_t + V(S_{t+1})$

$V(S_t)$ = the value of the state at time $t$

$r_t$ = the (average) reward delivered at time $t$

$V(S_{t+1})$ = the value of the state at time $t+1$
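When the rewards are known, this recursion can be unrolled backward from the last state. The sketch below assumes a deterministic chain of states and a value of 0 after the final state; the reward vector (a reward of size 1 in state 8 of 9) and the function name `state_values` are illustrative choices.

```python
def state_values(rewards):
    """Return V(S_t) for every state of a chain, using V(S_t) = r_t + V(S_{t+1})."""
    values = [0.0] * (len(rewards) + 1)  # extra entry: value after the last state is 0
    for t in reversed(range(len(rewards))):
        values[t] = rewards[t] + values[t + 1]
    return values[:-1]


# Illustrative chain: 9 states, reward of size 1 delivered in state 8
rewards = [0, 0, 0, 0, 0, 0, 0, 1, 0]
values = state_values(rewards)
print(values)
```

Every state up to and including the rewarded one has value 1, and the state after the reward has value 0.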

Page 9: 1 Decision making. 2 How does the brain learn the values?

9

The TD (temporal difference) learning algorithm

$V(S_t) \leftarrow V(S_t) + \eta\,\delta_t$

where

$\delta_t = r_t + V(S_{t+1}) - V(S_t)$

is the TD error and $\eta$ is the learning rate.
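A single TD update can be sketched as follows. This is a minimal illustration, assuming a learning-rate parameter named `eta` and a list `values` that holds one extra entry for the state after the last one; the function name `td_update` and the numbers are hypothetical.

```python
def td_update(values, t, reward, eta):
    """Apply one TD update to values[t] in place and return the TD error."""
    delta = reward + values[t + 1] - values[t]  # TD error
    values[t] += eta * delta                    # move V(S_t) toward r_t + V(S_{t+1})
    return delta


# Hypothetical example: all values start at 0, a reward of 1 arrives in the second state
values = [0.0, 0.0, 0.0]
delta = td_update(values, t=1, reward=1.0, eta=0.5)
print(delta, values)
```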

Page 10: 1 Decision making. 2 How does the brain learn the values?

10

Page 11: 1 Decision making. 2 How does the brain learn the values?

11

Dopamine

Page 12: 1 Decision making. 2 How does the brain learn the values?

12

Dopamine is good

• Dopamine is released by rewarding experiences, e.g., sex, food

• Cocaine, nicotine and amphetamine directly or indirectly lead to an increase of dopamine release

• Neutral stimuli that are associated with rewarding experiences result in a release of dopamine

• Drugs that reduce dopamine activity reduce motivation and cause anhedonia (inability to experience pleasure)

• Long-term use of such drugs may result in dyskinesia (diminished voluntary movements and the presence of involuntary movements)

Page 13: 1 Decision making. 2 How does the brain learn the values?

13

No dopamine is bad

Page 14: 1 Decision making. 2 How does the brain learn the values?

14

No dopamine is bad (Parkinson’s disease)

• Bradykinesia – slowness in voluntary movement such as standing up, walking, and sitting down. This may lead to difficulty initiating walking and, when more severe, can cause “freezing episodes” once walking has begun.

• Tremors – often occur in the hands, fingers, forearms, feet, mouth, or chin. Typically, tremors take place when the limbs are at rest rather than during movement.

• Rigidity – otherwise known as stiff muscles; often produces muscle pain that increases during movement.

• Poor balance – caused by the loss of the reflexes that maintain posture. The resulting unsteadiness often leads to falls.

Page 15: 1 Decision making. 2 How does the brain learn the values?

15

Schultz, Dayan and Montague, Science, 1997

Page 16: 1 Decision making. 2 How does the brain learn the values?

16

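[Figure: timeline of states 1-9 with the CS and the reward marked]

Before trial 1:

$V(S_1) = V(S_2) = \dots = V(S_9) = 0$

In trial 1 (with learning rate $\eta$):

• no reward in states 1-7

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = 0$

$V(S_t) \leftarrow V(S_t) + \eta\,\delta_t = 0$

• reward of size 1 in state 8

$\delta_t = r_t + V(S_9) - V(S_8) = 1$

$V(S_8) \leftarrow V(S_8) + \eta\,\delta_t = \eta$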

Page 17: 1 Decision making. 2 How does the brain learn the values?

17

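[Figure: timeline of states 1-9 with the CS and the reward marked]

Before trial 2:

$V(S_1) = V(S_2) = \dots = V(S_7) = V(S_9) = 0$

$V(S_8) = \eta$

In trial 2, for states 1-6

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = 0$

$V(S_t) \leftarrow V(S_t) + \eta\,\delta_t = 0$

For state 7,

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = \eta$

$V(S_7) \leftarrow V(S_7) + \eta\,\delta_t = \eta^2$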

Page 18: 1 Decision making. 2 How does the brain learn the values?

18

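[Figure: timeline of states 1-9 with the CS and the reward marked]

Before trial 2:

$V(S_1) = V(S_2) = \dots = V(S_7) = V(S_9) = 0$

$V(S_8) = \eta$

For state 8,

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = 1 - \eta$

$V(S_8) \leftarrow V(S_8) + \eta\,\delta_t = \eta + \eta(1 - \eta) = \eta(2 - \eta)$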

Page 19: 1 Decision making. 2 How does the brain learn the values?

19

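[Figure: timeline of states 1-9 with the CS and the reward marked]

Before trial 3:

$V(S_1) = V(S_2) = \dots = V(S_6) = V(S_9) = 0$

$V(S_7) = \eta^2, \quad V(S_8) = \eta(2 - \eta)$

In trial 3, for states 1-5

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = 0$

$V(S_t) \leftarrow V(S_t) + \eta\,\delta_t = 0$

For state 6,

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = \eta^2$

$V(S_6) \leftarrow V(S_6) + \eta\,\delta_t = \eta^3$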

Page 20: 1 Decision making. 2 How does the brain learn the values?

20

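[Figure: timeline of states 1-9 with the CS and the reward marked]

Before trial 3:

$V(S_1) = V(S_2) = \dots = V(S_6) = V(S_9) = 0$

$V(S_7) = \eta^2, \quad V(S_8) = \eta(2 - \eta)$

For state 7,

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = \eta(2 - \eta) - \eta^2 = 2\eta(1 - \eta)$

$V(S_7) \leftarrow V(S_7) + \eta\,\delta_t = \eta^2 + \eta \cdot 2\eta(1 - \eta) = 3\eta^2 - 2\eta^3$

For state 8,

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = 1 - \eta(2 - \eta)$

$V(S_8) \leftarrow V(S_8) + \eta\,\delta_t = \eta(2 - \eta) + \eta\left[1 - \eta(2 - \eta)\right]$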
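The hand calculations for the first three trials can be checked with exact arithmetic. The sketch below assumes that the states are visited in order within a trial, that each update uses the current value of the next state, and that the learning rate is 1/2; the name `eta`, the helper `run_trial`, and the choice of 1/2 are illustrative.

```python
from fractions import Fraction


def run_trial(values, rewards, eta):
    """Visit the states in order, apply the TD update, and return the TD errors."""
    deltas = []
    for t, r in enumerate(rewards):
        delta = r + values[t + 1] - values[t]  # TD error for the state at index t
        values[t] += eta * delta
        deltas.append(delta)
    return deltas


eta = Fraction(1, 2)             # assumed learning rate
rewards = [0] * 7 + [1, 0]       # states 1..9, reward of size 1 in state 8
values = [Fraction(0)] * 10      # V(S_1)..V(S_9), plus 0 after the last state

run_trial(values, rewards, eta)  # trial 1
run_trial(values, rewards, eta)  # trial 2
before_trial_3 = list(values)

run_trial(values, rewards, eta)  # trial 3
print(before_trial_3[6], before_trial_3[7])  # V(S_7), V(S_8) before trial 3
print(values[5], values[6], values[7])       # V(S_6), V(S_7), V(S_8) after trial 3
```

With a learning rate of 1/2 the closed-form expressions evaluate to 1/4 and 3/4 before trial 3, and to 1/8, 1/2 and 7/8 after it.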

Page 21: 1 Decision making. 2 How does the brain learn the values?

21

[Figure: timeline of states 1-9 with the CS and the reward marked]

After many trials

$V(S_1) = \dots = V(S_8) = 1, \quad V(S_9) = 0$

$\delta_t = r_t + V(S_{t+1}) - V(S_t) = 0$

Except for the CS, whose time is unknown
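The long-run behaviour can be reproduced with a short simulation. This is a sketch under stated assumptions: a learning rate of 0.5, 200 trials, states visited in order, and a value fixed at 0 for the moment just before the CS (because the CS cannot be predicted). The names `run_trial`, `eta` and `cs_delta` are illustrative.

```python
def run_trial(values, rewards, eta):
    """Visit the states in order, apply the TD update, and return the TD errors."""
    deltas = []
    for t, r in enumerate(rewards):
        delta = r + values[t + 1] - values[t]  # TD error for the state at index t
        values[t] += eta * delta
        deltas.append(delta)
    return deltas


eta = 0.5                           # assumed learning rate
rewards = [0.0] * 7 + [1.0, 0.0]    # states 1..9, reward of size 1 in state 8
values = [0.0] * 10                 # V(S_1)..V(S_9), plus 0 after the last state

for _ in range(200):
    deltas = run_trial(values, rewards, eta)

# Assumption: the value just before the unpredictable CS stays at 0,
# so the TD error at CS onset is V(S_1) - 0.
cs_delta = values[0] - 0.0

print([round(v, 3) for v in values[:9]])
print(max(abs(d) for d in deltas), cs_delta)
```

Under these assumptions the TD error vanishes at the time of the reward and in every state within the trial, while a TD error close to 1 remains at CS onset.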

Page 22: 1 Decision making. 2 How does the brain learn the values?

22

Page 23: 1 Decision making. 2 How does the brain learn the values?

23

Schultz, 1998

Page 24: 1 Decision making. 2 How does the brain learn the values?

24

Bayer and Glimcher, 2005

“We found that these neurons encoded the difference between the current reward and a weighted average of previous rewards, a reward prediction error, but only for outcomes that were better than expected.”
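One simple way to read "the difference between the current reward and a weighted average of previous rewards" is with an exponentially weighted average, which is what repeated updates of the form V ← V + η(r − V) compute. The sketch below makes that assumption; the exponential weighting, the value of `eta`, and the reward sequence are illustrative and are not taken from the paper.

```python
def prediction_errors(rewards, eta):
    """Return, for each trial, the reward minus a running weighted average of earlier rewards."""
    expected = 0.0  # weighted average of previous rewards, starts at 0
    errors = []
    for r in rewards:
        error = r - expected      # reward prediction error on this trial
        errors.append(error)
        expected += eta * error   # recent rewards get exponentially more weight
    return errors


# Illustrative sequence: two rewarded trials followed by an omitted reward
errors = prediction_errors([1.0, 1.0, 0.0], eta=0.5)
print(errors)
```

The error shrinks as the reward becomes expected and turns negative when an expected reward is omitted.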

Page 25: 1 Decision making. 2 How does the brain learn the values?

25

Bayer and Glimcher, 2005