Symbolic Logic meets Machine Learning:
Towards Transparent and Responsible AI
Vaishak Belle University of Edinburgh & Alan Turing Institute
What’s on for today?
Symbolic Logic
Machine Learning
Overview
• Motivation & challenges
• Symbolic approaches for transparency & interpretability
• Symbolic approaches for ethical reasoning
• Perspectives and conclusion
Motivation & challenges
Why?
Machine Learning:
• Inductive generalization • Big data • I.i.d. random variables
Symbolic Logic:
• Deduction • Succinct assertions • Relational structure
• Transparency • Interpretability • Ethical responsibility
Distinguished history
• Philosophy of science: Boole, De Finetti, Carnap, etc.
• AI: relational learning, inductive logic programming, probabilistic logical modeling, statistical relational learning, etc.
• The world is relational, consisting of objects, which have properties
• Need to reason about hard and soft symbolic knowledge
Gene expressions, gene protein interactions
Biomine
Questions in life sciences
• Is gene X involved in disease Y?
• What is the probability of involvement?
• Which genes are similar to X conditioned on Y?
• Which subgraphs are most relevant for studying Y?
De Raedt (2009)
Social interactions
θ1 ∀x, y[Smokes(x) ∧ Friends(x, y) ⊃ Smokes(y)]
θ2 ∀x[Smokes(x) ⊃ Coughs(x)]
Richardson & Domingos (2005)
Reasoning & elaboration
• Adding new knowledge: ∀x, y[Smokes(x) ∧ Family(x, y) ⊃ Smokes(y)]
• Evidential: Pr(Smokes(A) ∣ Friends(A, B) ∧ Smokes(B))
• Causal: Pr(Coughs(A) ∣ Smokes(A))
• Interventional: Pr(Coughs(A) ∣ do(ReduceByHalf(A)))
• Counterfactual: Pr(Coughs(A) ∣ NeverSmoked(A))
Not just specify & compute, but also learn from data
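As a concrete illustration of the evidential query, here is a minimal probabilistic-logic sketch using the ProbLog Python package referenced later in this deck; the constants and weights are invented, and the semantics is ProbLog's rather than Markov logic's.

```python
# Minimal ProbLog sketch (weights invented) for the evidential query
# Pr(Smokes(A) | Friends(A, B) and Smokes(B)).
from problog.program import PrologString
from problog import get_evaluatable

model = PrologString("""
person(a). person(b).
friends(a, b).

0.3::smokes(X) :- person(X).                % base rate of smoking
0.8::smokes(X) :- friends(X, Y), smokes(Y). % soft peer-influence rule

evidence(smokes(b), true).                  % condition on Smokes(B)
query(smokes(a)).                           % Pr(Smokes(A) | evidence)
""")

print(get_evaluatable().create_from(model).evaluate())
```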
Some broader issues to consider with black-box models
Easy to fool
Goodfellow et al. (2014)
Difficult to contextualize
A model trained to predict the risk of death for patients who developed pneumonia learnt the counterintuitive rule that asthmatics are less likely to die from pneumonia. This artefact arose because hospital policy dictates that asthmatics with pneumonia be administered aggressive treatment immediately, which in turn improved their recorded outcomes.
Caruana et al. (2015)
Pedagogical reasons
• Game playing: what's the best strategy, which move should be chosen, and why?
• Self-driving cars: when to accelerate? To swerve?
• Tutoring & interaction: posit a model of the user
• Semantic understanding in language & vision
Legal and ethical reasons
• An applicant was denied his credit card application. Why?
• A self-driving car causes damage. Who is to blame, and what was the cause (so that it does not repeat)?
• It is not sufficient to say we acted in good faith; we need to explain why our action was morally right
Opportunities
• Allow human input via hard & soft constraints
• Enable context-dependent/user-specific interpretability
• Reason about semantics, choices and models
Hybrid systems integrating symbolic reasoning & learning
Strategies
• Logic is discrete and noise-free: upgrade it to continuous, noisy settings
• Integrate low-level learning with high-level reasoning
• Inject symbolic knowledge or extract symbolic knowledge from learning methods
Upgrading to noisy & continuous
Spread of disease
[ProbLog program: condition → effect, with a noise model]
Probabilistic programming: https://dtai.cs.kuleuven.be/problog/
Continuous distributions
[Program fragment: random variable ∼ distribution, given conditions]
Nitti, Belle & De Raedt (2015)
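To make the noisy spread-of-disease rule concrete, here is a minimal ProbLog sketch (assuming the problog Python package; the persons, contact graph and probabilities are invented for illustration).

```python
# Minimal ProbLog sketch of noisy disease spread (probabilities invented).
from problog.program import PrologString
from problog import get_evaluatable

model = PrologString("""
person(ann). person(bob). person(carl).
contact(ann, bob). contact(bob, carl).

0.1::infected(X) :- person(X).                    % background infection rate
0.6::infected(Y) :- contact(X, Y), infected(X).   % noisy transmission rule

query(infected(carl)).
""")

print(get_evaluatable().create_from(model).evaluate())
```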
Reasoning + sensor fusion
From low-level learning to high-level reasoning
[Figure: probabilistic model, with conditioning (observation) and a query]
Dries, Kimmig, Davis, Belle, De Raedt (2017)
From parsing to programs
Can we learn such programs?
I.e., extract symbolic knowledge from data?
Tabular data

ID    IQ    Grade for Course A    Length (hrs) for A    ...
1     105   Low                   40                    ...
2     110   Mid                   50                    ...
3     120   High                  50                    ...
...   ...   ...                   ...                   ...
Learn the (continuous) spread of values
Learn correlations
Learn the (discrete) distribution on entities
Background knowledge: first years can only take courses of length < 90 hours
min_H loss(H, B, D)
Speichert & Belle (2019)
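A toy rendering of the min_H loss(H, B, D) idea (the hypotheses, loss and numbers below are invented for illustration and are not from the cited paper): pick the lowest-loss hypothesis on the data D among those consistent with the background knowledge B.

```python
# Toy sketch of min_H loss(H, B, D) with background knowledge B
# ("first years can only take courses of length < 90 hours").
data = [{"iq": 105, "grade": "Low",  "hours": 40},
        {"iq": 110, "grade": "Mid",  "hours": 50},
        {"iq": 120, "grade": "High", "hours": 50}]

def respects_background(h):
    # B: any course length the hypothesis can generate must be < 90 hours
    return h["max_hours"] < 90

def loss(h, data):
    # toy squared loss between predicted and observed course length
    return sum((row["hours"] - h["pred_hours"]) ** 2 for row in data)

candidates = [{"pred_hours": 47, "max_hours": 80},
              {"pred_hours": 50, "max_hours": 120},   # violates B, discarded
              {"pred_hours": 60, "max_hours": 85}]

best = min((h for h in candidates if respects_background(h)),
           key=lambda h: loss(h, data))
print(best, loss(best, data))
```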
From interpretable representations to expressive querying
What if structure is hard to understand?
• Very large, very granular, etc.
• Perhaps we only need to provide a query interface
• We would like to learn and reason efficiently
Pr(satisfaction = high ∣ nr_hours > 200) = ?
Compilation perspective
Bayesian networks
Markov logic networks Factor graphs
Probabilistic databases
Probabilistic logics
Weighted model counting (via circuits: expensive offline, cheap online)
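To illustrate what weighted model counting computes, here is a tiny self-contained sketch (the formula and literal weights are invented): the WMC of a formula is the sum, over its satisfying assignments, of the product of literal weights; compiled circuits make this sum cheap to evaluate repeatedly.

```python
# Tiny weighted model counting (WMC) sketch for the formula (a OR b);
# formula and literal weights are invented for illustration.
from itertools import product

weight = {("a", True): 0.3, ("a", False): 0.7,
          ("b", True): 0.6, ("b", False): 0.4}

def satisfies(a, b):
    return a or b            # the formula: a OR b

wmc = 0.0
for a, b in product([True, False], repeat=2):
    if satisfies(a, b):
        wmc += weight[("a", a)] * weight[("b", b)]

print(wmc)   # 0.72 = 1 - 0.7 * 0.4, i.e. Pr(a OR b) under these weights
```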
What if we learn such circuits directly?
Deep probabilistic models
Kisa et al. (2014)
Deep probabilistic models
• Leaf nodes: tractable univariate distributions
• Weighted sums of products (i.e., weighted mixtures)
• Computing marginals can be done in a single pass
Poon & Domingos (2011)
Standard deep models learn Pr(y ∣ x); here we model the full joint Pr(x, y)
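A hand-built sketch of such a model (not code from the cited papers; the structure and weights are invented): leaves are univariate distributions, the root is a weighted sum of products, and marginalising a variable amounts to setting its leaves to 1 and doing one bottom-up pass.

```python
# Hand-built sum-product sketch over two binary variables X1, X2
# (structure and weights invented for illustration).

def bernoulli_leaf(p_true):
    # Leaf value for an observation, or 1.0 if the variable is
    # marginalised out (value is None).
    def f(value):
        if value is None:
            return 1.0
        return p_true if value else 1.0 - p_true
    return f

x1_a, x1_b = bernoulli_leaf(0.8), bernoulli_leaf(0.3)
x2_a, x2_b = bernoulli_leaf(0.6), bernoulli_leaf(0.1)

def joint(x1, x2):
    # Root: weighted sum (mixture) of two product nodes.
    return 0.4 * x1_a(x1) * x2_a(x2) + 0.6 * x1_b(x1) * x2_b(x2)

print(joint(True, True))   # Pr(X1=1, X2=1) = 0.21
print(joint(True, None))   # marginal Pr(X1=1) = 0.50, in a single pass
```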
Discrete to continuous
Bueff, Speichert & Belle (2018); Molina et al. (2018)
Missing Values
Levray & Belle (2018)
Tractable models for interpreting deep learning
Fuxjaeger & Belle (2019)
Autoencoder set-up
• Find a way to minimize reconstruction loss between feature layer e(x) and reconstruction d(e(x))
• Construct circuit Pr(x) for feature layer
[Figure: Data → Encoder → Feature Layer → Decoder, with a circuit learned over the feature layer]
Generate prototypes for labels via Pr(x, y)
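A minimal numpy sketch of this set-up (function names and the choice of a diagonal Gaussian as a stand-in density are illustrative, not from the cited paper): the encoder/decoder pair is scored by reconstruction loss, and a tractable density is then fit over the encoded feature layer.

```python
# Minimal sketch: reconstruction loss for an autoencoder e(x), d(e(x)),
# plus a simple tractable density over the feature layer as a stand-in
# for a learned circuit Pr(e(x)). Illustrative only.
import numpy as np

def encode(X, W):
    return np.tanh(X @ W)                 # feature layer e(x)

def decode(Z, V):
    return Z @ V                          # reconstruction d(e(x))

def reconstruction_loss(X, W, V):
    return float(np.mean((X - decode(encode(X, W), V)) ** 2))

def fit_feature_density(Z):
    # diagonal Gaussian over the feature layer (stand-in for a circuit)
    return Z.mean(axis=0), Z.std(axis=0) + 1e-6

def log_prob(z, mu, sigma):
    return float(np.sum(-0.5 * ((z - mu) / sigma) ** 2
                        - np.log(sigma) - 0.5 * np.log(2 * np.pi)))

X = np.random.rand(100, 8)                # toy data
W, V = np.random.randn(8, 3), np.random.randn(3, 8)
print(reconstruction_loss(X, W, V))
mu, sigma = fit_feature_density(encode(X, W))
print(log_prob(encode(X[:1], W)[0], mu, sigma))
```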
Functional tasks
• Given first image, generate second image such that: digit(img1) XOR digit(img2) = 1
• Given first image, generate second image such that: digit(img1) XOR digit(img2) = 0
Recent developments
• How expressive are such tractable models for causal reasoning?
• What are the computational advantages if we avoid an explicit hypothesis/model construction?
Papantonis & Belle (2019); Belle & Juba (2019)
Ethical AI (via tractable models)
Implementing fairness
Varley & Belle (2019)
Pr(y ∣ ap) = Pr(y ∣ ¬ap)
Protected attributes + dependent variables
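A minimal check of this constraint (demographic parity) on toy records; the data and field names are invented for illustration.

```python
# Minimal demographic-parity check: compare the rate of positive outcomes y
# across the protected attribute ap, i.e. Pr(y | ap) vs Pr(y | not ap).
# Records and field names are invented for illustration.
records = [{"ap": True,  "y": 1}, {"ap": True,  "y": 0},
           {"ap": False, "y": 1}, {"ap": False, "y": 1}]

def positive_rate(rows, ap_value):
    group = [r for r in rows if r["ap"] == ap_value]
    return sum(r["y"] for r in group) / len(group)

gap = abs(positive_rate(records, True) - positive_rate(records, False))
print(f"demographic parity gap: {gap:.2f}")   # a gap of 0 means the constraint holds
```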
Too extreme?
• “Germany recently proposed a code for driverless cars. The proposal specified, among other things, that a driverless car should always opt for property damage over personal injury. Is this reasonable?”
• “Should an autonomous vehicle swerve and kill its passenger when otherwise it would kill 5 pedestrians?”
Halpern (2018)
Formal framework
• Causal model to capture variables, actions and costs
• E.g., T1 (people on track #1 die) = 1 - A (lever pulled)
• Thus, T2 = A
• Degree of blame ← whether the agent attempted actions with lower costs
Halpern & Kleiman-Weiner (2018)
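A toy executable reading of the causal model in the bullets above (the costs are invented; this is not Halpern & Kleiman-Weiner's formalism, just a direct reading of T1 = 1 - A and T2 = A).

```python
# Toy reading of the causal model: A = 1 means the lever is pulled,
# T1 = 1 - A (people on track 1 die), T2 = A (people on track 2 die).
# Costs are invented for illustration.
COST_TRACK1 = 5   # e.g. five people on track 1
COST_TRACK2 = 1   # e.g. one person on track 2

def outcome(A):
    T1 = 1 - A
    T2 = A
    return T1, T2

def action_cost(A):
    T1, T2 = outcome(A)
    return T1 * COST_TRACK1 + T2 * COST_TRACK2

# An agent attracts less blame for attempting the lower-cost action.
print({A: action_cost(A) for A in (0, 1)})      # {0: 5, 1: 1}
print(min((0, 1), key=action_cost))             # lower-cost action: pull (A = 1)
```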
Can you learn models of moral scenarios?
Can judgements be computed tractably?
Hammond & Belle (2018)
Learn costs by surveying people (or, say, experts)
[Figure: learned utilities and outcome probabilities for the trolley-style scenarios (outcomes labelled L/¬L, W/¬W and R/¬R, U/¬U)]
Alignment of decisions
Best friend on main track
Learned utilities
Responsibility and AI
• Not meant to be prescriptive, but towards a shared computational framework for moral judgements (i.e., an “ethics bot”)
• Automated decision making can be extremely useful, but also raises many concerns touching on philosophically vexing themes
• Unclear if formal definitions fully capture various viewpoints
• Perhaps AI can help us understand if alignment is even possible between human agenda and machine objectives
Summary & outlook
• Explainability, robustness and responsibility are challenging topics
• Symbolic frameworks to specify, interpret and contextualize machine learning models
• Taking steps towards human-in-the-loop decision making