Lecture 5: Causality and Feature Selection
Isabelle Guyon, isabelle@clopinet.com

Dec 17, 2015

Page 2:

Variable/feature selection

Remove features Xi to improve (or degrade as little as possible) the prediction of Y.

[Figure: X and Y.]

Page 3:

What can go wrong?

Guyon-Aliferis-Elisseeff, 2007

[Figure: plot with axes X1 and X2.]

Page 4:

What can go wrong?

[Figure: plots with axes X1 and X2.]

Page 5:

What can go wrong?

Guyon-Aliferis-Elisseeff, 2007

[Figure: diagrams relating X1, X2, and Y.]

Page 6:

Causal feature selection

Uncover causal relationships between Xi and Y.

[Figure: X and Y.]

Page 7:

Causal feature relevance

[Figure: graph around the target node Lung cancer.]


Page 10:

Markov Blanket

Strongly relevant features (Kohavi-John, 1997)
Markov Blanket (Tsamardinos-Aliferis, 2003)

[Figure: graph around the target node Lung cancer.]

Page 11:

Feature relevance

• Surely irrelevant feature Xi:
$P(X_i, Y \mid S^{\setminus i}) = P(X_i \mid S^{\setminus i}) \, P(Y \mid S^{\setminus i})$ for all $S^{\setminus i} \subseteq X^{\setminus i}$ and all assignments of values to $S^{\setminus i}$

• Strongly relevant feature Xi:
$P(X_i, Y \mid X^{\setminus i}) \neq P(X_i \mid X^{\setminus i}) \, P(Y \mid X^{\setminus i})$ for some assignment of values to $X^{\setminus i}$

• Weakly relevant feature Xi:
$P(X_i, Y \mid S^{\setminus i}) \neq P(X_i \mid S^{\setminus i}) \, P(Y \mid S^{\setminus i})$ for some assignment of values to $S^{\setminus i} \subset X^{\setminus i}$
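The relevance definitions can be checked numerically on a small discrete distribution. The sketch below is only an illustration: the XOR target, the joint table `P`, and the helper `factorizes` are assumptions introduced here, not part of the lecture. It tests whether P(Xi, Y | S) = P(Xi | S) P(Y | S) holds for every assignment of S, using the equivalent product form P(Xi, Y, S) P(S) = P(Xi, S) P(Y, S) to avoid dividing by zero.

```python
import itertools
import numpy as np

# Assumed toy distribution: two fair independent bits and Y = X1 XOR X2.
# P[x1, x2, y] is the joint probability table (Y is on the last axis).
P = np.zeros((2, 2, 2))
for x1, x2 in itertools.product((0, 1), repeat=2):
    P[x1, x2, x1 ^ x2] = 0.25


def factorizes(joint, i, cond=()):
    """True if P(Xi, Y | S) = P(Xi | S) P(Y | S) for all assignments of S.

    `i` is the axis of feature Xi and `cond` the axes of the features in S.
    """
    y = joint.ndim - 1
    keep = {i, y, *cond}
    drop = tuple(ax for ax in range(joint.ndim) if ax not in keep)
    p = joint.sum(axis=drop, keepdims=True) if drop else joint  # P(Xi, Y, S)
    p_s = p.sum(axis=(i, y), keepdims=True)                     # P(S)
    p_is = p.sum(axis=y, keepdims=True)                         # P(Xi, S)
    p_ys = p.sum(axis=i, keepdims=True)                         # P(Y, S)
    return bool(np.allclose(p * p_s, p_is * p_ys))


# X1 alone carries no information about Y ...
x1_marginally_independent = factorizes(P, 0)
# ... yet the factorization fails once X2 is given: X1 is strongly relevant.
x1_strongly_relevant = not factorizes(P, 0, cond=(1,))
print(x1_marginally_independent, x1_strongly_relevant)
```

In this toy case a univariate filter would discard X1, although it is strongly relevant in the sense defined above.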


Page 13:

Markov Blanket: PARENTS

[Figure: graph around the target node Lung cancer.]

Page 14:

Markov Blanket: CHILDREN

[Figure: graph around the target node Lung cancer.]

Page 15:

Markov Blanket: SPOUSES

[Figure: graph around the target node Lung cancer.]
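When the graph is known, the Markov blanket of a target can be read off as its parents, its children, and the other parents of its children (spouses). The sketch below assumes a small edge list built from node names used in the lecture's lung cancer example; the edges themselves are an assumption made for illustration, and `markov_blanket` is a hypothetical helper.

```python
# Assumed directed edges (cause, effect); illustrative only.
edges = [
    ("Smoking", "Lung cancer"),
    ("Genetic factor1", "Lung cancer"),
    ("Anxiety", "Smoking"),
    ("Genetic factor1", "Other cancers"),
    ("Lung cancer", "Coughing"),
    ("Allergy", "Coughing"),
    ("Lung cancer", "Metastasis"),
    ("Hormonal factor", "Metastasis"),
]


def markov_blanket(edges, target):
    """Return (parents, children, spouses) of `target` in a DAG edge list."""
    parents = {a for a, b in edges if b == target}
    children = {b for a, b in edges if a == target}
    # Spouses: other parents of the target's children.
    spouses = {a for a, b in edges if b in children and a != target}
    return parents, children, spouses


parents, children, spouses = markov_blanket(edges, "Lung cancer")
print(sorted(parents))   # direct causes
print(sorted(children))  # direct consequences
print(sorted(spouses))   # other direct causes of the children
```

Under these assumed edges, Anxiety (an indirect cause) and Other cancers (a sibling) fall outside the blanket.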

Page 16:

Causal relevance

• Surely irrelevant feature Xi:
$P(X_i, Y \mid S^{\setminus i}) = P(X_i \mid S^{\setminus i}) \, P(Y \mid S^{\setminus i})$ for all $S^{\setminus i} \subseteq X^{\setminus i}$ and all assignments of values to $S^{\setminus i}$

• Causally relevant feature Xi:
$P(X_i, Y \mid \mathrm{do}(S^{\setminus i})) \neq P(X_i \mid \mathrm{do}(S^{\setminus i})) \, P(Y \mid \mathrm{do}(S^{\setminus i}))$ for some assignment of values to $S^{\setminus i}$

• Weak/strong causal relevance:
– Weak = ancestors, indirect causes
– Strong = parents, direct causes
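The definition above relies on the difference between conditioning on a variable and setting it with do(·). A minimal sketch of that difference, under an assumed binary model in which a hidden common cause Z drives both X and Y and there is no edge from X to Y (all probability values are made up for illustration):

```python
import numpy as np

# Assumed model: Z -> X and Z -> Y, with no edge X -> Y.
p_z = np.array([0.7, 0.3])                        # P(Z)
p_x_given_z = np.array([[0.8, 0.2],               # P(X | Z=0)
                        [0.3, 0.7]])              # P(X | Z=1)
p_y_given_z = np.array([[0.9, 0.1],               # P(Y | Z=0)
                        [0.4, 0.6]])              # P(Y | Z=1)
# P(Y | Z, X) indexed [z, x, y]; Y ignores X by construction.
p_y_given_zx = np.repeat(p_y_given_z[:, None, :], 2, axis=1)

# Observational joint P(z, x, y) = P(z) P(x | z) P(y | z, x).
joint = p_z[:, None, None] * p_x_given_z[:, :, None] * p_y_given_zx

# Conditioning: P(Y | X = x).
p_xy = joint.sum(axis=0)
p_y_given_x = p_xy / p_xy.sum(axis=1, keepdims=True)

# Intervening: do(X = x) cuts Z -> X, so P(y | do(x)) = sum_z P(z) P(y | z, x).
p_y_do_x = np.einsum("z,zxy->xy", p_z, p_y_given_zx)

print("P(Y=1 | X=x)     :", p_y_given_x[:, 1])
print("P(Y=1 | do(X=x)) :", p_y_do_x[:, 1])
```

In this assumed model X predicts Y under observation, but manipulating X leaves the distribution of Y unchanged, so X would not count as a cause of Y.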

Page 17:

Examples

[Figure: graph around the target node Lung cancer.]

Page 18:

Immediate causes (parents)

[Figure nodes: Smoking, Genetic factor1, Lung cancer.]

Page 19:

Immediate causes (parents)

[Figure nodes: Smoking, Lung cancer.]

Page 20:

Non-immediate causes (other ancestors)

[Figure nodes: Anxiety, Smoking, Lung cancer.]

Page 21:

Non-causes (e.g. siblings)

[Figure nodes: Genetic factor1, Other cancers, Lung cancer.]

Page 22:

[Figure: two graphs over X, C, and Y, labeled CHAIN and FORK.]

$X \perp Y \mid C$
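The chain and fork patterns can be checked on linear Gaussian models, where conditional independence corresponds to a zero partial correlation. The sketch below assumes the orientations X → C → Y for the chain and X ← C → Y for the fork, with unit edge weights and unit-variance noise; `sem_cov` and `partial_corr` are hypothetical helpers that work on the exact population covariance, so no sampling is involved.

```python
import numpy as np


def sem_cov(W):
    """Covariance of x = W^T x + e with independent unit-variance noise.

    W[i, j] is the weight of the edge i -> j.
    """
    n = W.shape[0]
    M = np.linalg.inv(np.eye(n) - W.T)
    return M @ M.T


def partial_corr(cov, a, b, cond=()):
    """Partial correlation of variables a and b given the variables in cond."""
    idx = [a, b, *cond]
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


X, C, Y = 0, 1, 2

chain = np.zeros((3, 3))
chain[X, C] = chain[C, Y] = 1.0        # X -> C -> Y

fork = np.zeros((3, 3))
fork[C, X] = fork[C, Y] = 1.0          # X <- C -> Y

results = {}
for name, W in (("chain", chain), ("fork", fork)):
    cov = sem_cov(W)
    results[name] = (partial_corr(cov, X, Y), partial_corr(cov, X, Y, (C,)))
    print(name, "corr(X, Y) =", round(results[name][0], 3),
          "| given C =", round(results[name][1], 3))
```

In both graphs X and Y are correlated marginally, and the correlation vanishes once C is given.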

Page 23:

Hidden more direct cause

[Figure nodes: Anxiety, Smoking, Tar in lungs, Lung cancer.]

Page 24:

Confounder

[Figure nodes: Genetic factor2, Smoking, Lung cancer.]

Page 25:

Immediate consequences (children)

[Figure nodes: Lung cancer, Coughing, Metastasis, Biomarker1.]

Page 26:

Strongly relevant features (Kohavi-John, 1997)
Markov Blanket (Tsamardinos-Aliferis, 2003)

[Figure: graphs over X, C, and Y, with Lung cancer as the target node.]

$X \perp Y$ but $X \not\perp Y \mid C$
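Marginal independence that turns into dependence once C is given is the signature of a collider, X → C ← Y. A minimal sketch under an assumed linear Gaussian model, C = X + Y + noise with unit variances (the covariance matrix is written out by hand from that assumption):

```python
import numpy as np

# Assumed model: X and Y independent with unit variance, C = X + Y + e.
# Variable order: X, Y, C.
cov = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 3.0],
])

# Marginal correlation between X and Y.
marginal_corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

# Partial correlation of X and Y given C, read off the precision matrix.
prec = np.linalg.inv(cov)
partial_corr_given_c = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

print("corr(X, Y)     =", marginal_corr)
print("corr(X, Y | C) =", partial_corr_given_c)
```

Conditioning on the common effect induces a negative dependence between its two causes, which is why a spouse can become relevant once a child of the target is in the feature set.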

Page 27:

Non-relevant spouse (artifact)

[Figure nodes: Lung cancer, Biomarker1, Biomarker2.]

[Figure: plot with axes X1 and X2.]

Page 28:

Another case of confounder

[Figure nodes: Lung cancer, Biomarker1, Biomarker2, Systematic noise.]

[Figure: plot with axes X1 and X2.]

Page 29:

Truly relevant spouse

[Figure nodes: Coughing, Allergy, Lung cancer.]

Page 30:

Sampling bias

[Figure nodes: Hormonal factor, Metastasis, Lung cancer.]

Page 31:

Causal feature relevance

[Figure nodes: Coughing, Allergy, Smoking, Anxiety, Genetic factor1, Hormonal factor, Metastasis(b), Other cancers, Lung cancer, Genetic factor2, Tar in lungs, Biomarker2, Biomarker1, Systematic noise.]

Page 32:

Formalism: Causal Bayesian networks

• Bayesian network:
– Graph with random variables X1, X2, …, Xn as nodes.
– Dependencies represented by edges.
– Allows us to compute P(X1, X2, …, Xn) as $\prod_i P(X_i \mid \mathrm{Parents}(X_i))$.
– Edge directions have no causal meaning.

• Causal Bayesian network: edge directions indicate causality.
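The factorization of the joint distribution over parents can be sketched in a few lines. The three-node network and all probability values below are assumptions chosen for illustration; `parents`, `cpt`, and `joint_probability` are hypothetical names.

```python
import itertools

# Assumed binary network: Smoking -> Lung cancer -> Coughing.
parents = {
    "Smoking": (),
    "Lung cancer": ("Smoking",),
    "Coughing": ("Lung cancer",),
}
# cpt[node][parent values] = P(node = 1 | parents); values are made up.
cpt = {
    "Smoking": {(): 0.3},
    "Lung cancer": {(0,): 0.01, (1,): 0.10},
    "Coughing": {(0,): 0.2, (1,): 0.8},
}


def joint_probability(assignment):
    """P(X1, ..., Xn) as the product over nodes of P(Xi | Parents(Xi))."""
    prob = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assignment[p] for p in pa)]
        prob *= p_one if assignment[node] == 1 else 1.0 - p_one
    return prob


nodes = list(parents)
# Summing the factorized joint over all assignments should give 1.
total = sum(
    joint_probability(dict(zip(nodes, values)))
    for values in itertools.product((0, 1), repeat=len(nodes))
)
p_all_ones = joint_probability({"Smoking": 1, "Lung cancer": 1, "Coughing": 1})
print(total, p_all_ones)
```

Each node needs only a table indexed by its parents, which is what keeps the representation compact compared with a full joint table.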

Page 33:

Example of Causal Discovery Algorithm

Algorithm: PC (Peter Spirtes and Clark Glymour, 1999)

Let $A, B, C \in X$ and $V \subseteq X$. Initialize with a fully connected un-oriented graph.

1. Find un-oriented edges by using the criterion that variable A shares a direct edge with variable B iff no subset of other variables V can render them conditionally independent ($A \perp B \mid V$).

2. Orient edges in "collider" triplets (i.e., of the type $A \rightarrow C \leftarrow B$) using the criterion that if there are direct edges between A and C and between C and B, but not between A and B, then $A \rightarrow C \leftarrow B$ iff there is no subset V containing C such that $A \perp B \mid V$.

3. Further orient edges with a constraint-propagation method by adding orientations until no further orientation can be produced, using the two following criteria:

(i) If $A \rightarrow B \rightarrow \dots \rightarrow C$ and $A - C$ (i.e. there is an undirected edge between A and C), then $A \rightarrow C$.
(ii) If $A \rightarrow B - C$, then $B \rightarrow C$.
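Steps 1 and 2 can be sketched as follows. This is a simplified illustration, not the reference PC implementation: it assumes a perfect independence oracle obtained from the population covariance of a linear Gaussian model, it uses the separating set found in step 1 as a shortcut for the collider criterion, and it leaves out the propagation rules of step 3. The ground-truth graph and the helper names are assumptions.

```python
import itertools
import numpy as np


def sem_cov(W):
    """Covariance of x = W^T x + e with unit-variance noise (W[i, j]: i -> j)."""
    n = W.shape[0]
    M = np.linalg.inv(np.eye(n) - W.T)
    return M @ M.T


def make_oracle(cov, tol=1e-8):
    """Return indep(a, b, cond): True when the partial correlation vanishes."""
    def indep(a, b, cond):
        idx = [a, b, *cond]
        prec = np.linalg.inv(cov[np.ix_(idx, idx)])
        return abs(prec[0, 1]) / np.sqrt(prec[0, 0] * prec[1, 1]) < tol
    return indep


def pc_skeleton(n, indep):
    """Step 1: remove the edge a - b once some set V separates a from b."""
    adj = {v: set(range(n)) - {v} for v in range(n)}
    sepset = {}
    for size in range(n - 1):
        for a, b in itertools.combinations(range(n), 2):
            if b not in adj[a]:
                continue
            candidates = sorted((adj[a] | adj[b]) - {a, b})
            for cond in itertools.combinations(candidates, size):
                if indep(a, b, cond):
                    adj[a].discard(b)
                    adj[b].discard(a)
                    sepset[(a, b)] = set(cond)
                    break
    return adj, sepset


def orient_colliders(adj, sepset):
    """Step 2: a -> c <- b if a - c - b, a and b are not adjacent,
    and c is not in the set that separated a from b."""
    colliders = []
    for c in sorted(adj):
        for a, b in itertools.combinations(sorted(adj[c]), 2):
            if b not in adj[a] and c not in sepset[(a, b)]:
                colliders.append((a, c, b))
    return colliders


# Assumed ground truth over variables 0..3: 0 -> 2 <- 1 and 2 -> 3.
W = np.zeros((4, 4))
W[0, 2] = W[1, 2] = W[2, 3] = 1.0
adj, sepset = pc_skeleton(4, make_oracle(sem_cov(W)))
colliders = orient_colliders(adj, sepset)
print(adj)
print(colliders)
```

With finite data the oracle would be replaced by a statistical test, and criterion (ii) of step 3 would then orient the remaining edge 2 - 3 away from the collider.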

Page 34:

Computational and statistical complexity

Computing the full causal graph poses:
• Computational challenges (intractable for large numbers of variables)
• Statistical challenges (difficulty of estimating conditional probabilities for many variables with few samples).

Compromise:
• Develop algorithms with good average-case performance, tractable for many real-life datasets.
• Abandon learning the full causal graph and instead develop methods that learn a local neighborhood.
• Abandon learning the fully oriented causal graph and instead develop methods that learn unoriented graphs.

Page 35:

A prototypical MB algo: HITON

(Aliferis-Tsamardinos-Statnikov, 2003)

[Figure: graph with target Y.]

Page 36:

1 – Identify variables with direct edges to the target (parent/children)

(Aliferis-Tsamardinos-Statnikov, 2003)

[Figure: graph with target Y.]

Page 37:

1 – Identify variables with direct edges to the target (parent/children)

(Aliferis-Tsamardinos-Statnikov, 2003)

Iteration 1: add A
Iteration 2: add B
Iteration 3: remove A because $A \perp Y \mid B$
etc.

[Figure: graph with target Y and candidate nodes A and B.]
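The interleaved add/remove iterations can be sketched as below. This is a simplified illustration in the spirit of the parent/children step, not the published HITON procedure: it assumes a perfect independence oracle from the population covariance of a linear Gaussian model, ranks candidates by absolute correlation with the target, and tests every subset of the current set. The example graph and the names `sem_cov`, `indep`, and `parents_children_sketch` are assumptions.

```python
import itertools
import numpy as np


def sem_cov(W):
    """Covariance of x = W^T x + e with unit-variance noise (W[i, j]: i -> j)."""
    n = W.shape[0]
    M = np.linalg.inv(np.eye(n) - W.T)
    return M @ M.T


def indep(cov, a, b, cond=(), tol=1e-8):
    """True when the partial correlation of a and b given cond vanishes."""
    idx = [a, b, *cond]
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return abs(prec[0, 1]) / np.sqrt(prec[0, 0] * prec[1, 1]) < tol


def parents_children_sketch(cov, target):
    """Add candidates one at a time; after each addition, remove any member
    that some subset of the other members separates from the target."""
    n = cov.shape[0]

    def association(v):
        return abs(cov[v, target]) / np.sqrt(cov[v, v] * cov[target, target])

    candidates = sorted((v for v in range(n) if v != target),
                        key=association, reverse=True)
    pc = []
    for v in candidates:
        pc.append(v)                                  # add step
        for member in list(pc):                       # remove step
            others = [u for u in pc if u != member]
            if any(indep(cov, member, target, s)
                   for k in range(len(others) + 1)
                   for s in itertools.combinations(others, k)):
                pc.remove(member)
    return sorted(pc)


# Assumed graph over variables 0..3: 0 -> 1 -> 2 -> 3, with target 2.
W = np.zeros((4, 4))
W[0, 1] = W[1, 2] = W[2, 3] = 1.0
pc_of_target = parents_children_sketch(sem_cov(W), target=2)
print(pc_of_target)
```

In this assumed chain, variable 0 is admitted last and removed immediately, because variable 1 separates it from the target.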

Page 38:

2 – Repeat algorithm for parents and children of Y (get depth two relatives)

(Aliferis-Tsamardinos-Statnikov, 2003)

[Figure: graph with target Y.]

Page 39:

3 – Remove non-members of the MB

(Aliferis-Tsamardinos-Statnikov, 2003)

A member A of PCPC that is not in PC is a member of the Markov blanket if there is some member B of PC such that A becomes conditionally dependent on Y when conditioning on B together with any subset of the remaining variables.

[Figure: graph with target Y and nodes A and B.]
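The membership test for depth-two relatives can be sketched as follows, again with a perfect independence oracle built from an assumed linear Gaussian model. The graph, the variable names, and the helper `is_spouse` are illustrative assumptions; the parent/children set of the target is taken as given rather than computed.

```python
import itertools
import numpy as np


def sem_cov(W):
    """Covariance of x = W^T x + e with unit-variance noise (W[i, j]: i -> j)."""
    n = W.shape[0]
    M = np.linalg.inv(np.eye(n) - W.T)
    return M @ M.T


def indep(cov, a, b, cond=(), tol=1e-8):
    """True when the partial correlation of a and b given cond vanishes."""
    idx = [a, b, *cond]
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return abs(prec[0, 1]) / np.sqrt(prec[0, 0] * prec[1, 1]) < tol


def is_spouse(cov, a, target, pc):
    """Keep `a` if, for some b in PC, `a` stays dependent on the target given
    b together with every subset of the remaining variables."""
    n = cov.shape[0]
    for b in pc:
        rest = [v for v in range(n) if v not in (a, b, target)]
        subsets = (s for k in range(len(rest) + 1)
                   for s in itertools.combinations(rest, k))
        if all(not indep(cov, a, target, (b, *s)) for s in subsets):
            return True
    return False


# Assumed graph: Allergy -> Coughing <- Y and Anxiety -> Smoking -> Y.
ALLERGY, COUGHING, Y, SMOKING, ANXIETY = range(5)
W = np.zeros((5, 5))
W[ALLERGY, COUGHING] = W[Y, COUGHING] = 1.0
W[ANXIETY, SMOKING] = W[SMOKING, Y] = 1.0
cov = sem_cov(W)

pc_of_y = [COUGHING, SMOKING]          # assumed output of step 1
allergy_in_mb = is_spouse(cov, ALLERGY, Y, pc_of_y)
anxiety_in_mb = is_spouse(cov, ANXIETY, Y, pc_of_y)
print(allergy_in_mb, anxiety_in_mb)
```

Under these assumptions Allergy is kept as a spouse through the common child Coughing, while Anxiety is rejected because Smoking separates it from the target.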

Page 40:

Conclusion

• Feature selection focuses on uncovering subsets of variables X1, X2, … predictive of the target Y.

• Multivariate feature selection is in principle more powerful than univariate feature selection, but not always in practice.

• Taking a closer look at the type of dependencies in terms of causal relationships may help refine the notion of variable relevance.

Page 41:

Acknowledgements and references

1) Feature Extraction, Foundations and Applications. I. Guyon et al., Eds. Springer, 2006. http://clopinet.com/fextract-book

2) Causal feature selection. I. Guyon, C. Aliferis, A. Elisseeff. To appear in "Computational Methods of Feature Selection", Huan Liu and Hiroshi Motoda, Eds., Chapman and Hall/CRC Press, 2007. http://clopinet.com/isabelle/Papers/causalFS.pdf