How to remove an outlier tester

Lucjan Janowski

Faculty of Electrical Engineering, Automatics, Computer Science and Electronics, Department of Telecommunications

Post on 21-Jan-2016



Agenda

• Can a tester be an outlier?
• The detection philosophy
• Latent variables
• Rasch model
• WinSteps
• The final decision
• Conclusion

2008 I 05-07

Can a tester be an outlier?

What would we like to model?

• Why do we use testers?
• A tester represents human perception, which is difficult to model
• People are different, and so are our users/clients. Our goal is to take such differences into account
• Some of us are critical and others are uncritical
• A tester can be tired or not focused enough, and therefore his/her answers can be random

A tired tester problem

• A user can be tired too. Should we remove all tired testers?
• Can a tester score randomly? What are the consequences?
• Note that a tester scoring a picture differently from the average score does not mean that the tester is scoring randomly
• We have to be very careful with tester removal, since our goal is to build a model of the average user, not the proper user

Why are some scores different?

• Different effects can affect a tester’s judgement differently (e.g. motion intensity, color, etc.)
• Testers have different experience (e.g. watching mainly YouTube or films on a DVD set)
• Each of us is more or less critical of anything that he/she judges
• The words describing the opinion scale can be understood differently (in Poland "OK" means good; in England "OK" means fair)

What can we do?

• We have to detect random scores
• A tester who often scores randomly should be removed from the model building
• An answer that differs from the average score is not necessarily a random one; therefore we have to consider the average score corrected for the tester’s individual tendencies
• We need a mathematical model of user behavior that takes these properties into account

Latent variable

[Diagram with three elements: any distortion that influences QoE (this is what a tester sees), the latent variable, and the OS (opinion score).]

Latent variable manifestation

[Figure: four opinion score scales, each labelled 5 4 3 2 1.]

An example

Tester ID  Video ID (increasing distortion)
            0   1   2   3   4   5   6   7   8   9  10
147        10   9  10   7   4   2   5   4   2   1   1
148        10   9   8   5   4   3   1   3   2   1   1
149         8  10   9   2   7   4   3   1   1   0   1
150         9   9   9   5   6   5   3   2   5   2   2
151         8   7   8   7   6   6   5   2   5   3   2
152        10   9   7   8   7   4   3   3   2   1   1
153         3   6   4   3   3   3   3   3   3   2   1

[The next four slides repeat the score table from "An example" under these headings:]

• Non-extreme-value testers
• Wide range for 10 and 1
• Critical tester
• Are the answers random?

Rasch model

• We assume that a latent variable is what the testers really score
• We assume that the opinion score probability is a logistic function of the model parameters
• The function has parameters describing:
  – a tester “criticism” factor
  – a film/picture/… quality
  – an average threshold value for a particular score

Rasch model equation

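• n is the tester number
• i is the object number (what is scored)
• x is the opinion score value (1-5, 0-10, …)

$$P_{nix} = \frac{e^{(\beta_n - \delta_i - \tau_x)}}{1 + e^{(\beta_n - \delta_i - \tau_x)}}$$

Here \beta_n, \delta_i and \tau_x stand for the tester, object and score-threshold parameters. The slide's own symbols and signs did not survive extraction, so the usual Rasch convention is assumed.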
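A minimal sketch of how such probabilities could be computed, assuming the usual Rasch rating scale form in which the logistic expression gives the probability of score x relative to the neighbouring score x-1. The parameter names beta (tester), delta (object) and taus (score thresholds), and both function names, are assumptions made for illustration; this is not the WinSteps implementation.

```python
import math


def adjacent_probability(beta, delta, tau):
    """Logistic term e^z / (1 + e^z) with z = beta - delta - tau."""
    z = beta - delta - tau
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(beta, delta, taus):
    """Probabilities of every score category under a rating scale model.

    Returns len(taus) + 1 values, ordered from the lowest score upward.
    Category k has log-weight sum_{j<=k} (beta - delta - tau_j); the
    lowest category has an empty sum, i.e. log-weight 0.
    """
    log_weights = [0.0]
    for tau in taus:
        log_weights.append(log_weights[-1] + (beta - delta - tau))
    # Subtract the maximum before exponentiating to avoid overflow.
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical parameter values for a five-point scale.
    print(category_probabilities(0.5, -0.2, [-1.0, 0.0, 1.0, 2.0]))
```

Under this assumed form, the ratio of two neighbouring category probabilities, P(x) / (P(x) + P(x-1)), equals the logistic term for threshold x, which ties the full distribution back to the single logistic expression.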


Rasch model

• We assume that the Rasch model is correct and that the data that do not fit this model are incorrect [sic]
• Note that without any assumption we are not able to detect randomly scoring testers

[Figure labels: Data, Model values, Observed values.]

$$E_{ni} = \sum_{x=1}^{5} x \, P_{nix}$$
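The model value for a tester and an object is the probability-weighted average of the possible scores. A minimal sketch, assuming the score probabilities are already available as a list ordered from the lowest score upward; the function names and the example probabilities are hypothetical.

```python
def expected_score(probabilities, first_score=1):
    """Model value: sum over scores x of x * P(x).

    probabilities[k] is the probability of score first_score + k.
    """
    return sum((first_score + k) * p for k, p in enumerate(probabilities))


def residual(observed, probabilities, first_score=1):
    """Observed score minus the model value."""
    return observed - expected_score(probabilities, first_score)


if __name__ == "__main__":
    # Hypothetical probabilities for scores 1..5.
    probs = [0.1, 0.2, 0.4, 0.2, 0.1]
    print(expected_score(probs), residual(5, probs))
```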


OMS (Outfit Mean Square)

• Knowing the model probability and the tester’s answer, we can estimate how far a tester is from the model
• A tester’s accuracy or quality is assessed using the OMS (Outfit Mean Square)
• The Rasch model can be computed with the WinSteps software (http://www.winsteps.com/)
• The OMS can be interpreted on the basis of heuristically obtained ranges
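The outfit mean square is conventionally the average of the squared standardised residuals, that is, (observed - expected)^2 divided by the model variance, taken over all objects a tester scored. The sketch below assumes that definition and assumes that the per-object score probabilities have already been estimated (for example by WinSteps); outfit_mean_square is a hypothetical name and the numbers are made up.

```python
def outfit_mean_square(observed, probability_rows, first_score=1):
    """Average squared standardised residual for one tester.

    observed: the tester's scores, one per object.
    probability_rows: for each object, the model probabilities of the
        possible scores, ordered from first_score upward.
    Assumes every object has a non-zero model variance.
    """
    squared = []
    for score, probs in zip(observed, probability_rows):
        values = range(first_score, first_score + len(probs))
        expected = sum(v * p for v, p in zip(values, probs))
        variance = sum((v - expected) ** 2 * p for v, p in zip(values, probs))
        squared.append((score - expected) ** 2 / variance)
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Hypothetical model probabilities for scores 1..5, reused for two objects.
    probs = [0.1, 0.2, 0.4, 0.2, 0.1]
    print(outfit_mean_square([4, 2], [probs, probs]))
    print(outfit_mean_square([5, 1], [probs, probs]))
```

Answers close to the model values give an OMS well below 1, while answers far from them push it above 1, which is what the heuristic ranges are applied to.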

Results interpretation

• OMS > 2: the tester is not relevant and should be removed
• 1.5 < OMS < 2: we should be suspicious
• 0.5 < OMS < 1.5: correct tester
• OMS < 0.5: the tester fits the model too well
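The ranges can be written as a small lookup. This sketch has to assume how the boundary values 0.5, 1.5 and 2 themselves are assigned, which the slide leaves open; interpret_oms and its labels are hypothetical.

```python
def interpret_oms(oms):
    """Map an Outfit Mean Square value to one of the heuristic ranges.

    Assumption: 0.5 and 1.5 count as "correct" and 2.0 counts as
    "suspicious"; the slide uses strict inequalities only.
    """
    if oms > 2.0:
        return "remove"       # tester is not relevant
    if oms > 1.5:
        return "suspicious"   # look at this tester more closely
    if oms >= 0.5:
        return "correct"
    return "overfit"          # tester fits the model too well


if __name__ == "__main__":
    for value in (1.78, 2.81, 0.67):
        print(value, interpret_oms(value))
```

Applied to OMS values of 1.78, 2.81 and 0.67 it returns "suspicious", "remove" and "correct" respectively.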

Example results

Tester ID  Video ID (increasing distortion)
            0   1   2   3   4   5   6   7   8   9  10    OMS
147        10   9  10   7   4   2   5   4   2   1   1   1.78
148        10   9   8   5   4   3   1   3   2   1   1   1.23
149         8  10   9   2   7   4   3   1   1   0   1   2.81
150         9   9   9   5   6   5   3   2   5   2   2   0.90
151         8   7   8   7   6   6   5   2   5   3   2   0.76
152        10   9   7   8   7   4   3   3   2   1   1   1.36
153         3   6   4   3   3   3   3   3   3   2   1   0.67

Rasch model disadvantages

• It is more accurate with more data, but it is difficult to collect many results since the tests are expensive
• Not all types of correct tester behavior can be modeled
• The algorithms are not implemented in Matlab, so it is difficult to include them in an automatic analysis done in Matlab

Conclusion

• A tester’s answers make it possible to model human perception, but not all of his/her answers are correct
• Outliers should be removed
• The Rasch model helps to detect testers who are not relevant
• The final decision should be checked, since not all correct behaviors can be modeled by the Rasch model