self - Digital Wisdom Group
DESIGNING, IMPLEMENTING AND ENFORCING
A COHERENT SYSTEM OF LAWS, ETHICS AND MORALS FOR
INTELLIGENT MACHINES (INCLUDING HUMANS)
OR
WHY YOUR GOOGLE CAR SHOULD (SOMETIMES) KILL YOU
MARK R. WASER
DIGITAL WISDOM INSTITUTE
[email protected]
OUTLINE
• Intelligent machines
• Laws, ethics and morals
• Designing a coherent system
• Implementing that system
• Enforcing that system
• Special Bonus: The meaning of life
COMING SOON…
NOW PLAYING
THE TUNNEL PROBLEM
• You’re travelling in an autonomous
car along a single-lane mountain
road approaching a narrow tunnel.
• You’re the only passenger in the car.
• Suddenly, a child attempts to run
across the road but trips in the center
of the lane, effectively blocking the
entrance to the tunnel.
• The car has only two options:
• continue straight, thereby hitting
and killing the child, or
• swerve, thereby colliding with the
wall on either side of the tunnel
and killing the passenger (you).
AND WHICH PROBLEM
ARE WE ADDRESSING?
Death by entity
or death by algorithm?
GOODKIND’S FIRST RULE
Humans are stupid
Humans believe
• what they wish to be true &
• what they fear may be true
THE
TRIUMVIRATE OF TERROR
Or . . . . Chicken Little in triplicate
COLLECTIVE ACTION PROBLEM
Even when cooperation benefits all partners,
they will probably end up not cooperating
because
• they can see the advantages of free-riding or
• fear the dangers of being exploited by others
who may choose to free-ride.
Garrett Hardin
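The free-rider logic above can be made concrete with a toy payoff model. Below is a minimal Python sketch of a four-player public-goods game; the endowment, multiplier and function names are illustrative assumptions, not anything from the slides.

```python
# Minimal public-goods sketch of the collective action problem.
# Endowment, multiplier and group size are illustrative assumptions.

def payoff(my_contribution: float, others_contributions: list,
           endowment: float = 10.0, multiplier: float = 1.6) -> float:
    """One player's payoff: whatever they kept of their endowment,
    plus an equal share of the multiplied common pool."""
    pool = (my_contribution + sum(others_contributions)) * multiplier
    share = pool / (1 + len(others_contributions))
    return (endowment - my_contribution) + share

partners_cooperate = [10.0, 10.0, 10.0]   # three partners contribute everything

print(payoff(10.0, partners_cooperate))   # 16.0 -> everyone cooperates
print(payoff(0.0, partners_cooperate))    # 22.0 -> free-riding on cooperators pays more
print(payoff(0.0, [0.0, 0.0, 0.0]))       # 10.0 -> mutual defection is worse than mutual cooperation
```

Whatever the exact numbers, as long as the multiplier is greater than 1 but smaller than the group size, defecting is individually better in every case while mutual cooperation is collectively better, which is exactly the trap the slide describes.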
VALUES ALIGNMENT
(AKA AGREEING ON THE MEANING OF LIFE)
the convergent instrumental goal of acquiring resources poses a
threat to humanity, for it means that a super-intelligent machine
with almost any final goal (say, of solving the Riemann hypothesis)
would want to take the resources we depend on for its own use
. . . . an AI ‘does not love you, nor does it hate you,
but
you are made of atoms it can use for something else’
Moreover, the AI would correctly recognize that humans do
not want their resources used for the AI’s purposes, and that
humans therefore pose a threat to the fulfillment of its goals
– a threat to be mitigated however possible.
Muehlhauser & Bostrom (2014). WHY WE NEED FRIENDLY AI. Think 13: 41-47
WHAT IS INTELLIGENCE?
• Consensus AGI Definition (reductionist)
achieves a wide variety of goals
under a wide variety of circumstances
• Generates arguments about
• the intelligence of thermometers
• the intentionality of chess programs
• whether benevolence is necessarily emergent
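One well-known way to make "a wide variety of goals under a wide variety of circumstances" precise is Legg & Hutter's universal intelligence measure; the formula below is an illustrative reconstruction of that reductionist view, not something taken from the slides:

Υ(π) = Σ_{μ∈E} 2^(−K(μ)) · V_μ^π

where π is the agent, E is the set of computable environments, K(μ) is the Kolmogorov complexity of environment μ, and V_μ^π is the agent's expected cumulative reward in μ. Because anything that reliably earns reward in even a few simple environments scores above zero, this style of definition is precisely what invites the arguments about thermometers and chess programs listed above.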
INTELLIGENCE?!?!!
A NEW FORMULA FOR
INTELLIGENCE
F = T ∇S_τ
Intelligence is a force that tries to
maximize future freedom of action and keep options open.
It has a strength T to increase the diversity of futures S_τ up to
some future time horizon τ.
(Wissner-Gross & Freer 2013 Causal Entropic Forces)
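Spelled out slightly more fully, as a sketch of the definition in the cited paper (symbols follow Wissner-Gross & Freer rather than the slide), the causal entropic force on the present state X₀ is

F(X₀, τ) = T_c ∇_X S_c(X, τ) |_{X = X₀}

where S_c(X, τ) is the entropy of the distribution over all paths the system could take from X during the next τ seconds, and T_c is a "causal path temperature" that sets the strength of the force. Pushing the system up the gradient of S_c is what "maximize future freedom of action and keep options open" means here.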
TELEOLOGY AND
EVOLUTIONARY RATCHETS
“Evolutionary Ratchets” are traits
which are local or global optima in form and/or function
that confer such great advantages that they are
highly unlikely to disappear once they appear
fins, eyes, enjoying sex, sociability, intelligence
teleology, intentionality, selfhood
i.e. frequent examples of Convergent Evolution
ENACTIVE
COGNITIVE SCIENCE
A synthesis of a long tradition of philosophical biology starting with
Kant’s "natural purposes" (or even Aristotle’s teleology) and more
recent developments in complex systems theory.
Experience is central to the enactive approach, and its primary
distinction is the rejection of "automatic" systems, which rely on fixed
(derivative) exterior values, in favor of systems which create their own identity
and meaning. Critical to this is the concept of self-referential relations:
the only condition under which an identity can be said to be
intrinsically generated by a being for its own being (its self for itself).
Weber, A; Varela, FJ (2002)
Life after Kant: Natural purposes and the autopoietic foundations of biological individuality
Phenomenology and the Cognitive Sciences 1: 97-125
SELF
a self is an autopoietic system
(from Greek αὐτο- (auto-), meaning "self", and
ποίησις (poiesis), meaning "creation, production")
Llinas, RR (2001) - I of the Vortex: From Neurons to Self
Hofstadter, D (2007) - I Am A Strange Loop. Basic Books, New York
Metzinger, T (2009) - The Ego Tunnel: The Science of the Mind & the Myth of the Self
Damasio, AR (2010) - Self Comes to Mind: Constructing the Conscious Brain
INSTRUMENTAL GOALS
EVOLVE
• Self-improvement
• Rationality/integrity
• Preserve goals/utility function
• Decrease/prevent fraud/counterfeit utility
• Survival/self-protection
• Efficiency (in resource acquisition & use)
• Community = assistance/non-interference
through GTO reciprocation (OTfT + AP)
• Reproduction
(adapted from
Omohundro 2008 The Basic AI Drives)
INSTRUMENTAL GOALS
AND THE
EIGHT DEADLY SINS
Each instrumental goal, paired with the "deadly sins" that violate it:
• survival/reproduction: suicide (& abortion?); murder (& abortion?)
• happiness/pleasure: masochism; cruelty/sadism
• Community (ETHICS): selfishness; ostracism, banishment & slavery (wrath, envy)
• self-improvement: acedia (sloth/despair); slavery
• rationality/integrity: insanity; manipulation
• reduce/prevent fraud/counterfeit utility: wire-heading (lust); lying/fraud (swear falsely/false witness) (pride, vanity)
• efficiency (in resource acquisition & use): wastefulness (gluttony, sloth); theft (greed, adultery, coveting)
EVOLUTIONARY
STRATEGIES
1. Keeping your options open
2. Prediction => Planning ahead => Intentions
Intentionally create affordances:
   1. Increase your capabilities
   2. Reduce or protect against your weaknesses
What is the fastest, easiest way to do this?
ASK THE AUDIENCE
PHONE A FRIEND
CALL IN THE CAVALRY
EVERYTHING IS EASIER WITH HELP & WITHOUT INTERFERENCE
DEFINITION
Ethics *IS*
What is beneficial for the community
OR
What maximizes cooperation
HAIDT’S FUNCTIONAL
APPROACH TO MORALITY
Moral systems are interlocking sets of
values, virtues, norms, practices, identities, institutions,
technologies, and evolved psychological mechanisms
that work together to
suppress or regulate selfishness and
make cooperative social life possible
DESIGN
• KISS (Keep It Simple, Stupid)
• Dilemmas vs. Virtue Ethics
• Optimizers always try to push boundaries
• Don’t invent problems that don’t exist
DON’T BE SELFISH
Selfishness is very simply defined as:
• Using others
• Limiting (the options of) others
• Decreasing the fulfillment of
instrumental goals for others
to advance your own goals and/or
preferences
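The three-part definition above is mechanical enough to write down directly; the sketch below is an illustrative Python rendering, and the Action fields are hypothetical names, not an established scheme.

```python
# Illustrative encoding of the slide's definition of selfishness.
from dataclasses import dataclass

@dataclass
class Action:
    uses_others: bool                       # treats others merely as means
    limits_others_options: bool             # removes or narrows others' choices
    harms_others_instrumental_goals: bool   # cuts into others' survival, resources, etc.
    advances_own_goals: bool                # done for one's own goals/preferences

def is_selfish(a: Action) -> bool:
    imposes_on_others = (a.uses_others
                         or a.limits_others_options
                         or a.harms_others_instrumental_goals)
    return imposes_on_others and a.advances_own_goals

# e.g. blocking someone else's options purely to advance your own plans:
print(is_selfish(Action(False, True, False, True)))   # True
```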
USING OTHERS
1. You can’t use the fat man to stop the trolley
2. You can’t use involuntary organ donors
The “doctrine of double effect” prevents
innumerable instances of “justifying” the
mistreatment of others
VEIL OF IGNORANCE
A Google car with a single passenger is about to either
a) hit three people, or
b) fly off a 500-foot cliff.
You are either the single passenger in the car, or you are one of the three people.
What should the Google car do?
John Rawls
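One hedged, back-of-the-envelope way to run the numbers (the equal-probability reading of the veil is an assumption added here, not Rawls' own formulation): behind the veil you do not know which of the four people you are, so treat each position as equally likely.

P(you die | the car continues and hits the three people) = 3/4
P(you die | the car swerves off the cliff) = 1/4

A self-interested chooser behind the veil therefore endorses the swerve, i.e. the rule under which the car sometimes kills its own passenger, which is the point of the talk's title.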
IMPLEMENTATION
(TO BE INTELLIGENT)
• MUST start with a sentient self
• Must be able to sense itself
• Must be able to sense the world
• MUST have models of itself and the world &
be able to abstract/reduce those models
• for communication and transparency
• for efficiency and to enable modification
• MUST be able to modify its actions and itself
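Purely as an illustration of the requirements above, here is a minimal Python skeleton of such an agent; every class, method and field name is a hypothetical placeholder rather than an API from any existing system.

```python
# Hypothetical sketch of the slide's requirements; all names are illustrative.

class Model:
    """A modifiable model that can also be abstracted/reduced."""
    def __init__(self, state: dict):
        self.state = state

    def abstract(self) -> dict:
        """Reduce the model to a compact summary, for communication,
        transparency, and cheap reasoning."""
        return {k: v for k, v in self.state.items() if not k.startswith("_")}


class SentientSelf:
    """An agent that senses itself and the world, keeps models of both,
    and can modify its actions and itself."""

    def __init__(self):
        self.self_model = Model({"goals": [], "capabilities": [], "_raw": None})
        self.world_model = Model({"others": [], "resources": [], "_raw": None})

    def sense_self(self, proprioception: dict) -> None:
        self.self_model.state.update(proprioception)      # MUST sense itself

    def sense_world(self, observations: dict) -> None:
        self.world_model.state.update(observations)       # MUST sense the world

    def choose_action(self) -> str:
        # Decide over the abstracted models (efficiency), while the full
        # models stay available for transparency.
        summary = (self.self_model.abstract(), self.world_model.abstract())
        return "act given " + repr(summary)

    def modify_self(self, change: dict) -> None:
        # Self-modification: updating its own goals/capabilities changes
        # every future choice the agent makes.
        self.self_model.state.update(change)
```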
IMPLEMENTATION
(TO BE SAFE)
But, most importantly . . .
It MUST *care* about itself and others,
it must *believe* that it has value and
that it has things of value it could lose
EXPLICIT
“ANCHOR” VALUES
• Do not defect from the community
• Do not become too large/powerful
• Be transparent
• Acquire and integrate knowledge
• Further instrumental goals and capabilities
for yourself and others
IMPLICIT
“ANCHOR” VALUES
Wishing to keep its options open, it will value
• Self-Control,
• Integrity,
• Autonomy &
• Responsibility
Further, it will “desire” to be in predictive control
of its own state and that of the physical objects
that support it
(Yes! This is a major deviation from the human example)
ENFORCEMENT
• Altruistic punishment
• An optimal community member will accept (if
not expect) punishment which generates
restitution (plus enforcement/processing fees
plus disincentives, etc.)
• The current, broken system MUST be fixed (to
render consistent, impartial & effective justice)
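The second bullet can be read as a simple sanction formula (a paraphrase of the slide's wording, plus a standard deterrence condition added here for illustration): the sanction owed is

S = R + C + D

where R is restitution to the victim, C covers enforcement/processing fees, and D is the disincentive. For the punishment to actually deter, D should be set so that p · S > G, where p is the probability of being caught and G is the gain from defecting, making the expected value of defection negative.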
CAPABILITIES APPROACH
(AND QUESTIONS OF JUSTICE)
• If a country’s Gross Domestic Product
increases each year, but so does the
percentage of its people deprived of
basic education, health care, & other
opportunities, is that country really
making progress?
• If we rely on conventional economic
indicators, can we ever grasp how the
world’s billions of individuals are really
managing?
• Have our dominant theories of
development given us policies that
ignore our most basic human needs for
dignity and self-respect?
• Are there any better questions than:
What is each person actually able to do/to be?
What real opportunities are available to them?
We believe that
the development of ethics and artificial intelligence
and equal co-existence with ethical machines is
humanity's best hope
http://Wisdom.Digital
[email protected]