The Green Lab - [09 B] Experiment validity

Feb 14, 2017

Ivano Malavolta
Transcript
Page 1: The Green Lab - [09  B] Experiment validity

It starts with an idea

Experiment validity

Ivano Malavolta

Page 2: The Green Lab - [09  B] Experiment validity

Vrije Universiteit Amsterdam

Ivano Malavolta / S2 group / Empirical software engineering

Planning phases

[Figure: the experiment planning phases, with the scope of this lecture marked]

Page 3: The Green Lab - [09  B] Experiment validity

Ivano Malavolta / S2 group / Green Lab

Experiment validity

Validity is the extent to which our results are sound and applicable to the real world

● We aim for adequate validity, not universal validity

○ What matters is our population of interest

● Validity is in trade-off with experiment scope

Page 4: The Green Lab - [09  B] Experiment validity

Threats Identification

● Identifying threats helps to plan for adequate validity

● Each threat needs appropriate mitigation

● Several classifications of validity threats:

○ Campbell and Stanley [1]

○ Cook and Campbell [2]

Page 5: The Green Lab - [09  B] Experiment validity

Types of threat to validity

[Figure: at the theory level, a cause (e.g. encoding algorithms) is linked by causation to an effect (e.g. energy efficiency); at the observation level, a treatment (e.g. JPEG) is linked by the experiment to an outcome (e.g. energy per image)]

Page 6: The Green Lab - [09  B] Experiment validity

Types of threat to validity

[Figure: the same theory/observation diagram (cause and effect, e.g. encoding algorithms and energy efficiency; treatment and outcome, e.g. JPEG and energy per image), annotated with the four types of validity: conclusion, internal, construct, and external]

Page 7: The Green Lab - [09  B] Experiment validity

Internal validity

Internal Validity: causality between treatment and outcome

● Strongly related to the experiment design and operation

○ Are my results caused by the treatment?

○ Have I considered all possible factors?

Page 8: The Green Lab - [09  B] Experiment validity

Internal validity: types of threat

● History

○ Different trials of the experiment are performed in different time frames (e.g. after holidays vs. normal days)

● Maturation

○ Subjects may react differently over time (e.g. learning effect, tiredness, boredom)

● Selection

○ Some subjects may abandon the experiment

○ Even worse, some specific type of subjects may leave it

● Direction of causal influence

○ A causes B, B causes A, or X causes A and B?

Page 9: The Green Lab - [09  B] Experiment validity

Internal validity: mitigation

Analyze and identify confounding factors/noise

Choose appropriate experiment design
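
One design choice that can help here is randomizing the order of the runs, so that time-related confounders such as history and maturation do not line up with a single treatment. The sketch below is only an illustration: the function name `randomized_run_order`, the second treatment `PNG`, the number of repetitions, and the seed are assumptions, not part of the lecture.

```python
import random


def randomized_run_order(treatments, repetitions, seed=42):
    """Return every (treatment, repetition) run in shuffled order."""
    # One entry per run: each treatment appears `repetitions` times.
    runs = [t for t in treatments for _ in range(repetitions)]
    # A fixed seed (an assumption) keeps the schedule reproducible.
    rng = random.Random(seed)
    rng.shuffle(runs)
    return runs


# "JPEG" comes from the lecture example; "PNG" is a hypothetical second treatment.
schedule = randomized_run_order(["JPEG", "PNG"], repetitions=5)
print(schedule)
```

Shuffling does not remove confounders, it only spreads their influence across treatments; a blocked or counterbalanced design may be a better fit when the run order itself is expected to matter.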

Page 10: The Green Lab - [09  B] Experiment validity

Conclusion validity

Conclusion Validity: statistical correctness and significance

● Are my conclusions correct?

● Are my results significant enough?

Page 11: The Green Lab - [09  B] Experiment validity

Conclusion validity: types of threat

● Low statistical power

○ Results are not statistically significant

○ There is a real difference, but the statistical test does not reveal it because there are too few data points

● Violated assumptions of statistical tests

○ e.g. many tests assume normally distributed samples

● Fishing and error rate

○ If you combine multiple statistical tests, their significance level should be adjusted as well

● Reliability of measures

○ If you repeat the measurement, you should get similar results → same conclusions
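
The low statistical power threat can be made concrete with a rough calculation. The sketch below uses the normal approximation for a two-sided comparison of two equally sized groups. The function names, the standardized effect size of 0.5, and the 5% significance / 80% power defaults are assumptions chosen for illustration; an exact t-test calculation would give slightly larger sample sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect is real.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def required_n_per_group(effect_size, power=0.8, alpha=0.05):
    """Smallest group size reaching the target power (same approximation)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical scenario: a medium standardized effect (0.5).
print(round(approx_power(0.5, n_per_group=10), 2))  # power with only 10 runs per group
print(required_n_per_group(0.5))                    # runs per group for 80% power
```

With only 10 data points per group the test would miss a medium-sized effect most of the time, which is the situation the second sub-bullet describes.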

Page 12: The Green Lab - [09  B] Experiment validity

Conclusion validity: mitigation

Select appropriate tests

Use only as much significance as needed

Keep environment under control
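
When several statistical tests are combined, one possible way to adapt their significance is the Holm step-down procedure, a less conservative variant of the Bonferroni correction. The lecture does not prescribe a specific correction, so the technique, the function name `holm_reject`, and the example p-values below are assumptions.

```python
def holm_reject(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected under Holm's procedure."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared against alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is retained as well
    return reject


# Hypothetical p-values from four tests run on the same experiment.
p_values = [0.01, 0.04, 0.03, 0.005]
print(holm_reject(p_values))
```

Without any correction all four hypothetical p-values would fall below 0.05; after the adjustment only the two smallest lead to a rejection.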

Page 13: The Green Lab - [09  B] Experiment validity

Construct validity

Construct Validity: relation between theory and observation

● Have I defined my constructs properly?

● Am I analyzing the correct variables for the effects?

Page 14: The Green Lab - [09  B] Experiment validity

Construct validity: types of threat

● Inadequate preoperational explication of constructs

○ The construct is not well defined before being translated into measures

○ The theory is unclear

○ e.g. comparing two methods without making clear what it means for one method to be better than another

● Mono-operation bias

○ I have only one independent variable, one single object or treatment → the experiment may not represent the theory

○ e.g. an inspection conducted on a single document that is not representative of the set of documents on which the technique is usually applied

● Mono-method bias

○ When you use a single type of measure or observation

○ The experimenter may bias the measures

Page 15: The Green Lab - [09  B] Experiment validity

Construct validity: mitigation

Early definition of constructs (GQM)

Use appropriate experiment design

Introduce redundancy for cross-checks
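
As an illustration of redundancy for cross-checks, the same runs could be measured with two independent methods and the two series compared. The sketch below assumes a software-based energy estimate and a hardware power meter; both data series, the variable names, and the helper functions `pearson` and `mean_relative_gap` are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mean_relative_gap(xs, ys):
    """Average absolute difference between the series, relative to ys."""
    return mean(abs(x - y) / y for x, y in zip(xs, ys))


# Hypothetical energy readings (joules) for the same five runs.
software_estimate = [10.2, 12.1, 9.8, 15.0, 11.4]
power_meter = [10.0, 12.5, 9.5, 15.3, 11.0]

print(round(pearson(software_estimate, power_meter), 3))
print(round(mean_relative_gap(software_estimate, power_meter), 3))
```

A high correlation together with a small relative gap suggests the two methods capture the same construct; a disagreement is a signal to revisit how the construct was translated into measures.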

Page 16: The Green Lab - [09  B] Experiment validity

External validity

External Validity: generalizability of the results

● Are my results valid for the whole target population?

● Have I selected a representative sample?
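
One possible way to work towards a representative sample is proportional stratified sampling, where each subgroup of the target population is sampled according to its share. The sketch below is an assumption-laden illustration: the function name `stratified_sample`, the 60/40 split between professionals and students, the sample size, and the seed are all hypothetical.

```python
import random


def stratified_sample(population, stratum_of, sample_size, seed=7):
    """Draw a sample whose strata proportions mirror the population."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        # Proportional quota; rounding may shift the total by a few items.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample


# Hypothetical target population: 60 professionals and 40 students.
population = [(f"P{i}", "professional") for i in range(60)]
population += [(f"S{i}", "student") for i in range(40)]

sample = stratified_sample(population, lambda subject: subject[1], sample_size=10)
print(sample)
```

Stratification only helps for the characteristics that are modeled explicitly, so the choice of strata is itself part of defining the context of the experiment.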

Page 17: The Green Lab - [09  B] Experiment validity

External validity: types of threat

● Interaction of selection and treatment

○ The population of subjects is not representative of the one to which I would like to generalize my results

○ e.g. performing experiments with students to use the results in industry

● Interaction of setting and treatment

○ The experimental setting or the material is not representative

○ e.g. I let the subjects use tools that they do not use in reality

○ e.g. Web development using textual editors

○ Use of toy objects

● Interaction of history and treatment

○ The experiment is conducted at a special time or on a special day, which affects the results

○ e.g. our experiment on green software is performed after a big congress in which some subjects participated

Page 18: The Green Lab - [09  B] Experiment validity

External validity: mitigation

Use an environment as realistic as possible

Explicitly define and model your context

Page 19: The Green Lab - [09  B] Experiment validity

What does this lecture mean to you?

● You know that you have to explicitly take into account the threats to validity of your experiment

● Discussing threats actually makes your experiment stronger

○ You are not exposing weaknesses; you are supporting replicability

● You will make trade-offs between threats to validity in your experiment

● Consider threats to validity from the beginning

○ Reasoning about them will make you more confident about the scope and design of your experiment

Page 20: The Green Lab - [09  B] Experiment validity

Readings

Chapter 8

[1] Campbell and Stanley, Experimental and Quasi-Experimental Designs for Research (1963). (Blackboard)

[2] Cook and Campbell, Quasi-Experimentation: Design and Analysis Issues for Field Settings (1979). Available at the VU library.

Page 21: The Green Lab - [09  B] Experiment validity

Acknowledgements

Some contents of this lecture are extracted from:

● Giuseppe Procaccianti’s lectures at VU