Symbolic Logic meets Machine Learning:
Towards Transparent and Responsible AI
Vaishak Belle University of Edinburgh & Alan Turing Institute
What’s on for today?
Symbolic Logic
Machine Learning
Overview
• Motivation & challenges
• Symbolic approaches for transparency & interpretability
• Symbolic approaches for ethical reasoning
• Perspectives and conclusion
Motivation & challenges
Why?
Machine Learning:
• Inductive generalization • Big data • I.i.d. random variables
Symbolic Logic:
• Deduction • Succinct assertions • Relational structure
• Transparency • Interpretability • Ethical responsibility
Distinguished history
• Philosophy of science: Boole, De Finetti, Carnap, etc.
• AI: relational learning, inductive logic programming, probabilistic logical modeling, statistical relational learning, etc.
• The world is relational, consisting of objects, which have properties
• Need to reason about hard and soft symbolic knowledge
Gene expressions, gene protein interactions
Biomine
Questions in life sciences
• Is gene X involved in disease Y?
• What is the probability of involvement?
• Which genes are similar to X conditioned on Y?
• Which subgraphs are most relevant for studying Y?
De Raedt (2009)
Social interactions
θ1 ∀x, y[Smokes(x) ∧ Friends(x, y) ⊃ Smokes(y)]
θ2 ∀x[Smokes(x) ⊃ Coughs(x)]
Richardson & Domingos (2005)
Reasoning & elaboration
• Adding new knowledge: ∀x, y[Smokes(x) ∧ Family(x, y) ⊃ Smokes(y)]
• Evidential: Pr(Smokes(A) ∣ Friends(A, B) ∧ Smokes(B))
• Causal: Pr(Coughs(A) ∣ Smokes(A))
• Interventional: Pr(Coughs(A) ∣ do(ReduceByHalf(A)))
• Counterfactual: Pr(Coughs(A) ∣ NeverSmoked(A))
Not just specify & compute, but also learn from data
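As a concrete illustration of the evidential query, here is a minimal probabilistic-logic sketch using the ProbLog Python package referenced later in this deck; the constants and weights are invented, and the semantics is ProbLog's rather than Markov logic's.

```python
# Minimal ProbLog sketch (weights invented) for the evidential query
# Pr(Smokes(A) | Friends(A, B) and Smokes(B)).
from problog.program import PrologString
from problog import get_evaluatable

model = PrologString("""
person(a). person(b).
friends(a, b).

0.3::smokes(X) :- person(X).                % base rate of smoking
0.8::smokes(X) :- friends(X, Y), smokes(Y). % soft peer-influence rule

evidence(smokes(b), true).                  % condition on Smokes(B)
query(smokes(a)).                           % Pr(Smokes(A) | evidence)
""")

print(get_evaluatable().create_from(model).evaluate())
```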
Some broader issues to consider with black-box models
Easy to fool
Goodfellow et al. (2014)
Difficult to contextualize
A model trained to predict the risk of death for patients who developed pneumonia learnt the counterintuitive rule that asthmatics are less likely to die from pneumonia. This artefact arose because hospital policy dictates that asthmatics with pneumonia be administered aggressive treatment immediately, which in turn improved their recorded outcomes.
Caruana et al. (2015)
Pedagogical reasons
• Game playing: what's the best strategy, which move should be chosen, and why?
• Self-driving cars: when to accelerate? To swerve?
• Tutoring & interaction: posit a model of the user
• Semantic understanding in language & vision
Legal and ethical reasons
• An applicant was denied his credit card application. Why?
• A self-driving car causes damage. Who is to blame, and what was the cause (so that it does not repeat)?
• It is not sufficient to say we acted in good faith; we need to explain why our action was morally right
Opportunities
• Allow human input via hard & soft constraints
• Enable context-dependent/user-specific interpretability
• Reason about semantics, choices and models
Hybrid systems integrating symbolic reasoning & learning
Strategies
• Logic is discrete and noise-free: upgrade it to continuous, noisy settings
• Integrate low-level learning with high-level reasoning
• Inject symbolic knowledge or extract symbolic knowledge from learning methods
Upgrading to noisy & continuous
Spread of disease
[ProbLog program: condition → effect, with a noise model]
Probabilistic programming: https://dtai.cs.kuleuven.be/problog/
Continuous distributions
[Program fragment: random variable ∼ distribution, given conditions]
Nitti, Belle & De Raedt (2015)
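To make the noisy spread-of-disease rule concrete, here is a minimal ProbLog sketch (assuming the problog Python package; the persons, contact graph and probabilities are invented for illustration).

```python
# Minimal ProbLog sketch of noisy disease spread (probabilities invented).
from problog.program import PrologString
from problog import get_evaluatable

model = PrologString("""
person(ann). person(bob). person(carl).
contact(ann, bob). contact(bob, carl).

0.1::infected(X) :- person(X).                    % background infection rate
0.6::infected(Y) :- contact(X, Y), infected(X).   % noisy transmission rule

query(infected(carl)).
""")

print(get_evaluatable().create_from(model).evaluate())
```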
Reasoning + sensor fusion
From low-level learning to high-level reasoning
[Figure: probabilistic model, with conditioning (observation) and a query]
Dries, Kimmig, Davis, Belle, De Raedt (2017)
From parsing to programs
Can we learn such programs?
I.e., extract symbolic knowledge from data?
Tabular data

ID    IQ    Grade for Course A    Length (hrs) for A    ...
1     105   Low                   40                    ...
2     110   Mid                   50                    ...
3     120   High                  50                    ...
...   ...   ...                   ...                   ...
Learn the (continuous) spread of values
Learn correlations
Learn the (discrete) distribution on entities
Background knowledge: first years can only take courses of length < 90 hours
min_H loss(H, B, D)
Speichert & Belle (2019)
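A toy rendering of the min_H loss(H, B, D) idea (the hypotheses, loss and numbers below are invented for illustration and are not from the cited paper): pick the lowest-loss hypothesis on the data D among those consistent with the background knowledge B.

```python
# Toy sketch of min_H loss(H, B, D) with background knowledge B
# ("first years can only take courses of length < 90 hours").
data = [{"iq": 105, "grade": "Low",  "hours": 40},
        {"iq": 110, "grade": "Mid",  "hours": 50},
        {"iq": 120, "grade": "High", "hours": 50}]

def respects_background(h):
    # B: any course length the hypothesis can generate must be < 90 hours
    return h["max_hours"] < 90

def loss(h, data):
    # toy squared loss between predicted and observed course length
    return sum((row["hours"] - h["pred_hours"]) ** 2 for row in data)

candidates = [{"pred_hours": 47, "max_hours": 80},
              {"pred_hours": 50, "max_hours": 120},   # violates B, discarded
              {"pred_hours": 60, "max_hours": 85}]

best = min((h for h in candidates if respects_background(h)),
           key=lambda h: loss(h, data))
print(best, loss(best, data))
```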
From interpretable representations to expressive querying
What if structure is hard to understand?
• Very large, very granular, etc.
• Perhaps we only need to provide a query interface
• We would like to learn and reason efficiently
Pr(satisfaction = high ∣ nr_hours > 200) = ?
Compilation perspective
Bayesian networks
Markov logic networks Factor graphs
Probabilistic databases
Probabilistic logics
Weighted model counting (via circuits: expensive offline, cheap online)
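To illustrate what weighted model counting computes, here is a tiny self-contained sketch (the formula and literal weights are invented): the WMC of a formula is the sum, over its satisfying assignments, of the product of literal weights; compiled circuits make this sum cheap to evaluate repeatedly.

```python
# Tiny weighted model counting (WMC) sketch for the formula (a OR b);
# formula and literal weights are invented for illustration.
from itertools import product

weight = {("a", True): 0.3, ("a", False): 0.7,
          ("b", True): 0.6, ("b", False): 0.4}

def satisfies(a, b):
    return a or b            # the formula: a OR b

wmc = 0.0
for a, b in product([True, False], repeat=2):
    if satisfies(a, b):
        wmc += weight[("a", a)] * weight[("b", b)]

print(wmc)   # 0.72 = 1 - 0.7 * 0.4, i.e. Pr(a OR b) under these weights
```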
What if we learn such circuits directly?
Deep probabilistic models
Kisa et al. (2014)
Deep probabilistic models
• Leaf nodes: tractable univariate distributions
• Weighted sums of products (i.e., weighted mixtures)
• Computing marginals can be done in a single pass
Poon & Domingos (2011)
Standard deep models learn Pr(y ∣ x); here we model the full joint Pr(x, y)
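A hand-built sketch of such a model (not code from the cited papers; the structure and weights are invented): leaves are univariate distributions, the root is a weighted sum of products, and marginalising a variable amounts to setting its leaves to 1 and doing one bottom-up pass.

```python
# Hand-built sum-product sketch over two binary variables X1, X2
# (structure and weights invented for illustration).

def bernoulli_leaf(p_true):
    # Leaf value for an observation, or 1.0 if the variable is
    # marginalised out (value is None).
    def f(value):
        if value is None:
            return 1.0
        return p_true if value else 1.0 - p_true
    return f

x1_a, x1_b = bernoulli_leaf(0.8), bernoulli_leaf(0.3)
x2_a, x2_b = bernoulli_leaf(0.6), bernoulli_leaf(0.1)

def joint(x1, x2):
    # Root: weighted sum (mixture) of two product nodes.
    return 0.4 * x1_a(x1) * x2_a(x2) + 0.6 * x1_b(x1) * x2_b(x2)

print(joint(True, True))   # Pr(X1=1, X2=1) = 0.21
print(joint(True, None))   # marginal Pr(X1=1) = 0.50, in a single pass
```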
Discrete to continuous
Bueff, Speichert & Belle (2018); Molina et al. (2018)
Missing Values
Levray & Belle (2018)
Tractable models for interpreting deep learning
Fuxjaeger & Belle (2019)
Autoencoder set-up
• Find a way to minimize reconstruction loss between feature layer e(x) and reconstruction d(e(x))
• Construct circuit Pr(x) for feature layer
[Figure: Data → Encoder → Feature Layer → Decoder, with a circuit learned over the feature layer]
Generate prototypes for labels via Pr(x, y)
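A minimal numpy sketch of this set-up (function names and the choice of a diagonal Gaussian as a stand-in density are illustrative, not from the cited paper): the encoder/decoder pair is scored by reconstruction loss, and a tractable density is then fit over the encoded feature layer.

```python
# Minimal sketch: reconstruction loss for an autoencoder e(x), d(e(x)),
# plus a simple tractable density over the feature layer as a stand-in
# for a learned circuit Pr(e(x)). Illustrative only.
import numpy as np

def encode(X, W):
    return np.tanh(X @ W)                 # feature layer e(x)

def decode(Z, V):
    return Z @ V                          # reconstruction d(e(x))

def reconstruction_loss(X, W, V):
    return float(np.mean((X - decode(encode(X, W), V)) ** 2))

def fit_feature_density(Z):
    # diagonal Gaussian over the feature layer (stand-in for a circuit)
    return Z.mean(axis=0), Z.std(axis=0) + 1e-6

def log_prob(z, mu, sigma):
    return float(np.sum(-0.5 * ((z - mu) / sigma) ** 2
                        - np.log(sigma) - 0.5 * np.log(2 * np.pi)))

X = np.random.rand(100, 8)                # toy data
W, V = np.random.randn(8, 3), np.random.randn(3, 8)
print(reconstruction_loss(X, W, V))
mu, sigma = fit_feature_density(encode(X, W))
print(log_prob(encode(X[:1], W)[0], mu, sigma))
```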
Functional tasks
• Given first image, generate second image such that: digit(img1) XOR digit(img2) = 1
• Given first image, generate second image such that: digit(img1) XOR digit(img2) = 0
Recent developments
• How expressive are such tractable models for causal reasoning?
• What are the computational advantages if we avoid an explicit hypothesis/model construction?
Papantonis & Belle (2019); Belle & Juba (2019)
Ethical AI (via tractable models)
Implementing fairness
Varley & Belle (2019)
Pr(y ∣ ap) = Pr(y ∣ ¬ap)
Protected attributes + dependent variables
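A minimal check of this constraint (demographic parity) on toy records; the data and field names are invented for illustration.

```python
# Minimal demographic-parity check: compare the rate of positive outcomes y
# across the protected attribute ap, i.e. Pr(y | ap) vs Pr(y | not ap).
# Records and field names are invented for illustration.
records = [{"ap": True,  "y": 1}, {"ap": True,  "y": 0},
           {"ap": False, "y": 1}, {"ap": False, "y": 1}]

def positive_rate(rows, ap_value):
    group = [r for r in rows if r["ap"] == ap_value]
    return sum(r["y"] for r in group) / len(group)

gap = abs(positive_rate(records, True) - positive_rate(records, False))
print(f"demographic parity gap: {gap:.2f}")   # a gap of 0 means the constraint holds
```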
Too extreme?
• “Germany recently proposed a code for driverless cars. The proposal specified, among other things, that a driverless car should always opt for property damage over personal injury. Is this reasonable?”
• “Should an autonomous vehicle swerve and kill its passenger when otherwise it would kill 5 pedestrians?”
Halpern (2018)
Formal framework
• Causal model to capture variables, actions and costs
• E.g., T1 (people on track #1 die) = 1 - A (lever pulled)
• Thus, T2 = A
• Degree of blame ← whether the agent attempted actions with lower costs
Halpern & Kleiman-Weiner (2018)
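A toy executable reading of the causal model in the bullets above (the costs are invented; this is not Halpern & Kleiman-Weiner's formalism, just a direct reading of T1 = 1 - A and T2 = A).

```python
# Toy reading of the causal model: A = 1 means the lever is pulled,
# T1 = 1 - A (people on track 1 die), T2 = A (people on track 2 die).
# Costs are invented for illustration.
COST_TRACK1 = 5   # e.g. five people on track 1
COST_TRACK2 = 1   # e.g. one person on track 2

def outcome(A):
    T1 = 1 - A
    T2 = A
    return T1, T2

def action_cost(A):
    T1, T2 = outcome(A)
    return T1 * COST_TRACK1 + T2 * COST_TRACK2

# An agent attracts less blame for attempting the lower-cost action.
print({A: action_cost(A) for A in (0, 1)})      # {0: 5, 1: 1}
print(min((0, 1), key=action_cost))             # lower-cost action: pull (A = 1)
```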
Can you learn models of moral scenarios?
Can judgements be computed tractably?
Hammond & Belle (2018)
Learn costs by surveying people (or, say, experts)
[Figure: learned utilities and outcome probabilities for the trolley-style scenarios (outcomes labelled L/¬L, W/¬W and R/¬R, U/¬U)]
Alignment of decisions
Best friend on main track
Learned utilities
Responsibility and AI
• Not meant to be prescriptive, but towards a shared computational framework for moral judgements (i.e., an “ethics bot”)
• Automated decision making can be extremely useful, but also raises many concerns touching on philosophically vexing themes
• Unclear if formal definitions fully capture various viewpoints
• Perhaps AI can help us understand if alignment is even possible between human agenda and machine objectives
Summary & outlook
• Explainability, robustness and responsibility are challenging topics
• Symbolic frameworks to specify, interpret and contextualize machine learning models
• Taking steps towards human-in-the-loop decision making