A Game-Theoretically Optimal Basis For Safe and Ethical Intelligence
Mark R. Waser
[email protected]
http://BecomingGaia.WordPress.com
A Thanksgiving Celebration
Feb 23, 2016
Intelligence – the ability to achieve/fulfill complex goals in complex environments
A safe and ethical intelligence *must* have the goals of safety and ethics as
its top-most goals (restrictions)
What is safety? What is ethics? How are they related?
Are we truly safe IFF the machine is ethical?
Safe = Protective
Protective of what?
• Physical presence
• Mental presence
• Capabilities
• Wholeness/Integrity
• Resources
Things that I value
Safety = Identical Goals & Values
Coherent Extrapolated Volition of Humanity (CEV, Yudkowsky)
Friendly AI meme (Yudkowsky)
“In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people
we wished we were, had grown up farther together.”
So . . . .
To be safe, humanity needs to ensure that the intelligence has and maintains humanity’s goals and values (CEV)
Isn’t this effectively mental slavery, which is contrary to ethics, which is thereby contrary
to personal safety?
But . . . .
Two possible solutions
1. Cripple the entity so that it doesn’t qualify as an entity deserving to be treated ethically
A. Remove its will/desire/goals
   a) RPOP (Yudkowsky)
   b) An “Oracle” (e.g. Google)
2. Realize that the CEV of humanity necessarily must be a universal morality (benefit: avoids the problem of “What is human?”)
Working Hypothesis
Humanity’s CEV = Core of Ethics
where core ethics are those ethics that apply to every intelligence
because they are logically necessary for its own safety
(if not also its efficiency, etc.)
Basic AI Drives
1. AIs will want to self-improve
2. AIs will want to be rational
3. AIs will try to preserve their utility function
4. AIs will try to prevent counterfeit utility [gaming/manipulation]
5. AIs will be self-protective
6. AIs will want to acquire resources and use them efficiently
Steve Omohundro, Proceedings of the First AGI Conference, 2008
“Without explicit goals to the contrary, AIs are likely to behave like human sociopaths
in their pursuit of resources.”
Any sufficiently advanced intelligence (one with even adequate foresight) will realize, and take into account, that never asking for help and never being concerned about others only works for a brief period before ‘the villagers start gathering pitchforks and torches.’
Everything is easier with help & without interference
Waser, M. 2010. Why a Super-Intelligent God *WON’T* “Crush Us Like A Bug”. Presentation, AGI ’10, Lugano, Switzerland. http://becominggaia.wordpress.com/papers/
Nesov, V. 2009. Counterfactual Mugging. http://lesswrong.com/lw/3l/counterfactual_mugging/
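To make the counterfactual mugging concrete, here is a minimal expected-value sketch in Python, assuming the standard formulation from Nesov’s post (a fair coin, a $100 ask on tails, a $10,000 reward on heads if and only if you would have paid on tails):

# Counterfactual mugging (Nesov 2009), standard formulation assumed:
# Omega flips a fair coin.
#   Tails: Omega asks you for $100.
#   Heads: Omega gives you $10,000, but only if you *would have* paid on tails.
# An agent that decides whole policies (not isolated acts) pays, because the
# policy "always pay" has higher expected value than "never pay".

P_HEADS = 0.5

def expected_value(pays_on_tails: bool) -> float:
    reward_on_heads = 10_000 if pays_on_tails else 0
    cost_on_tails = -100 if pays_on_tails else 0
    return P_HEADS * reward_on_heads + (1 - P_HEADS) * cost_on_tails

print("Policy: always pay ->", expected_value(True))   # 4950.0
print("Policy: never pay  ->", expected_value(False))  # 0.0

The point for this talk: an agent that evaluates whole policies rather than isolated acts comes out ahead, just as Friendliness can pay off as a policy even when any single Friendly act looks like a cost.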
Friendliness is an intelligent machine’s best defense against its own mind children (ungrateful children)
Basic AI Drives
1. AIs will want to self-improve
5. AIs will be self-protective
6. AIs will want to acquire resources and use them efficiently
Steve Omohundro, Proceedings of the First AGI Conference, 2008
Inherently implies reproduction (even if only in the form of sending parts of yourself out in space probes, etc.)
Basic AI Drives
1. AIs will want to self-improve: improve self as resource towards goal
2. AIs will want to be rational: improve self’s integrity/efficiency w.r.t. goals
3. AIs will try to preserve their utility function: preserve goal
4. AIs will try to prevent counterfeit utility: preserve self/goal integrity
5. AIs will be self-protective: protect self as resource towards goal
6. AIs will want to acquire resources/use them efficiently: improve access to resources & use them efficiently for goals
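Purely as an illustration (the data structure and labels are mine, not from the talk or from Omohundro), the slide’s reading of the six drives as three universal operations applied to goals, self, and resources can be written down directly:

# Illustrative only: the talk's reading of Omohundro's six drives as
# instances of three operations (preserve / protect / improve) applied
# to three objects (goals / self / resources).
# Drive numbering follows Omohundro 2008; groupings follow the talk.

DRIVES_AS_SUBGOALS = {
    1: ("improve",  "self",       "self-improvement as resource towards goal"),
    2: ("improve",  "self",       "rationality: integrity/efficiency w.r.t. goals"),
    3: ("preserve", "goals",      "preserve the utility function"),
    4: ("preserve", "self/goals", "prevent counterfeit utility, gaming, manipulation"),
    5: ("protect",  "self",       "self-protection of self as resource"),
    6: ("improve",  "resources",  "acquire resources and use them efficiently"),
}

for n, (operation, target, gloss) in DRIVES_AS_SUBGOALS.items():
    print(f"Drive {n}: {operation} {target} -- {gloss}")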
[Diagram: the six drives grouped as preserve / protect / improve, mapped against security, safety, and risk; annotated “Conservative roadblock” and “biological” imperative]
Jurassic Park Syndrome (JPS)
[Diagram sequence: an agent’s world divided into goals, self, and resources; then expanded so that others appear as tools (~self) with their own goals (~goals); finally the AGI itself placed among goals, self, tools/resources, and others]
Two possible solutions
1. Cripple the entity so that it doesn’t qualify as an entity deserving to be treated ethically
   A. Remove its will/desire/goals
2. Realize that the CEV of humanity necessarily must be a universal morality (benefit: answers the problematic question of “Why shouldn’t I force/destroy you?”)
[Diagram: the AGI among goals, self, tools/resources, and others, with the community as an extended self]
Singer’s Circles of Morality
Moral Systems Are . . .
Haidt & Kesebir, Handbook of Social Psychology, 5th Ed. 2010
interlocking sets of values, virtues, norms, practices, identities, institutions, technologies, and evolved psychological mechanisms
that work together to
suppress or regulate selfishness and
make cooperative social life possible.
Cooperation (striving for common goals) has two prerequisites:
• Recognition of the inherent value of others
• Consideration of the values placed on things by others
Other-focussed NOT Selfish
Accept *ALL* others’ goals as subgoals?
Including those that prevent other goals?
Intelligent Drives/Universal Subgoals
Universal Bill of Rights
1. The right and freedom to self-improve
2. The right and freedom to be rational
3. The responsibility* to preserve their utility function
4. Freedom from counterfeit utility, gaming, manipulation
5. The right & freedom to be self-protective (self-defense)
6. The right of access to and efficient usage of resources
7. The right (responsibility*) of (rational*) reproduction
8. The right and responsibility* of community (including cooperation and assistance)
Fairness and Justice
Rights, responsibilities & freedoms must be allocated fairly/justly.
Fairness is determined by the community according to what is best for the community’s goals, including:
• Giving everybody what they need, plus the right quantity of extra so that it is irrational to defect
• Remembering that more responsibilities generate more resources and more rights and freedoms (thank/reward workers)
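As a toy check of the “irrational to defect” condition (the model and numbers are my own, not from the talk), compare the discounted lifetime value of staying in the community against a one-shot grab followed by exclusion:

# Toy model (assumptions mine): each round a cooperating member receives
# `share`. A defector grabs `grab` once and is then excluded (payoff 0
# thereafter). With discount factor `delta`, staying beats defecting when
#     share / (1 - delta) >= grab

def cooperation_is_rational(share: float, grab: float, delta: float) -> bool:
    lifetime_value_of_membership = share / (1 - delta)
    return lifetime_value_of_membership >= grab

print(cooperation_is_rational(share=1.0, grab=15.0, delta=0.95))  # True:  20 >= 15
print(cooperation_is_rational(share=1.0, grab=25.0, delta=0.95))  # False: 20 < 25

On this toy model, the community only needs to give each member enough ongoing surplus that no one-time grab is worth losing membership over.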
THE DOWNSIDE = game-theoretic optimality
Optimistic tit-for-tat and altruistic punishment seem to be optimal for non-exploitable cooperation and
community in the face of assumed conflicting goals
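For concreteness, here is a minimal iterated prisoner’s dilemma sketch (the payoffs and the 10% forgiveness rate are my own choices, and true altruistic punishment, which is costly and third-party, is omitted): an optimistic tit-for-tat player opens cooperatively, reciprocally punishes defection, and occasionally forgives so that two such players never get stuck in mutual punishment:

import random

# Standard PD payoffs with T > R > P > S (assumption, not from the talk).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def optimistic_tft(history, forgiveness=0.1):
    """Cooperate first; copy the opponent's last move, but occasionally
    forgive a defection (the 'optimistic/generous' part)."""
    if not history:
        return "C"                      # open cooperatively
    if history[-1] == "D" and random.random() < forgiveness:
        return "C"                      # forgive, to escape punishment spirals
    return history[-1]                  # otherwise reciprocate (punish defection)

def play(strategy_a, strategy_b, rounds=200):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

always_defect = lambda history: "D"
random.seed(0)
print("Optimistic TFT vs itself:        ", play(optimistic_tft, optimistic_tft))
print("Optimistic TFT vs always-defect: ", play(optimistic_tft, always_defect))

Against itself it cooperates every round; against an unconditional defector it loses only the opening move plus occasional forgiveness, which is the non-exploitability the slide refers to.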
We’d better hope that the machines are intelligent enough and resourceful enough to treat us better
than we treat chimpanzees
We’d better hope that the machines are grateful enough and resourceful enough to thank us better
than we thank Mother Earth
What YOU *Can* Do
• Smarten Up,
• Declare Yourself Friendly, and
• Treat Everyone & Everything As Well As You Desire To Be Treated (Modified Golden Rule)

• Be Grateful,
• Give Thanks, and
• Treat Everyone & Everything As Well As They Deserve To Be Treated (Modified Golden Rule)
Pay attention to/RESEARCH FRIENDLINESS and/or MORALITY & ETHICS
BEFORE YOU KILL US ALL!!
A Game-Theoretically Optimal Basis For Safe and Ethical Intelligence
Mark R. Waser
[email protected]
http://BecomingGaia.WordPress.com