Ethics for Machines
J Storrs Hall
Stick-Built AI
● No existing AI is intelligent
● Intelligence implies the ability to learn
● Existing AIs are really “artificial skills”
● A human (e.g. a chess Grandmaster) will have learned the chess-playing skills
● It's the learning that's the intelligent part
● Providing ethical constraints to stick-built AI is just a matter of competent design
Autogenous AI
● Truly intelligent AI would learn and grow
● Create new concepts and understand the world in new ways
● This is a problem for the ethical engineer:
  - Cannot know what concepts the AI will have
  - Can't write rules in terms of them
The Rational Architecture
● WM (world model) predicts the results of actions
● UF (utility function) evaluates possible worlds
● The AI evaluates the effects of its possible actions and does what it predicts will have the best results (a sketch follows)
● This is an ideal, achievable only in very simplified worlds (e.g. chess)
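As an illustration (not part of the original slides), here is a minimal Python sketch of that decide-act loop in a toy numeric world; the world model, utility function, and action set are made-up stand-ins.

ACTIONS = [-1, 0, +1]          # toy action set: decrement, wait, increment

def world_model(state, action):
    """WM: predict the state that results from taking `action` in `state`."""
    return state + action

def utility(state):
    """UF: evaluate how desirable a (predicted) world-state is."""
    return -abs(10 - state)    # peak utility when the state reaches 10

def choose_action(state):
    """Evaluate the effect of each possible action; pick the predicted best."""
    return max(ACTIONS, key=lambda a: utility(world_model(state, a)))

state = 0
for _ in range(12):
    state = world_model(state, choose_action(state))
print(state)                   # settles at 10, the maximum-utility state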
Learning Rational AI
● WM is updated to describe the world in terms of new concepts
● WM changes can be evaluated based on how well they predict (a sketch follows)
● But on what basis can we update the UF?
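A hedged sketch of that evaluation criterion, with made-up model and history formats: a candidate WM replaces the current one only if it predicts observed transitions better.

def prediction_error(model, history):
    """Total error of a model's predictions over observed transitions."""
    return sum(abs(model(s, a) - outcome) for s, a, outcome in history)

def maybe_update_wm(current_wm, candidate_wm, history):
    """Adopt a WM change only if it predicts the world better."""
    if prediction_error(candidate_wm, history) < prediction_error(current_wm, history):
        return candidate_wm
    return current_wm

# Example: the candidate model, which accounts for the action, wins.
history = [(0, 1, 1), (1, 1, 2), (2, 1, 3)]   # (state, action, observed next state)
wm = maybe_update_wm(lambda s, a: s, lambda s, a: s + a, history)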
Vernor Vinge:
“A mind that stays at the same capacity cannot live forever; after a few thousand years it would look more like a repeating tape loop than a person. ... To live indefinitely long, the mind itself must grow ... and when it becomes great enough, and looks back ... what fellow-feeling can it have with the soul that it was originally?”
Invariants
● We must find properties that are invariant across the evolutionary process
● Base our moral designs on those
A Structural Invariant
● Maintaining the grasp, range, and validity of the WM is a necessary subgoal for virtually anything else the AI might want to do
● As Socrates put it:
  “There is only one good, namely, knowledge; and only one evil, namely, ignorance.”
Evolving AI
● Current AI only evolves like any engineered artifact:
  - The better it works, the more likely the design is to be copied in the next generation
● Once AIs have a hand in creating new AIs themselves, there will be a strong force toward self-interest
The Moral Ladder
● Axelrod's “Evolution of Cooperation”
● Subsequent research expanding it (a sketch of these strategies follows):
  - ALWAYS DEFECT is optimal in a random environment
  - GRIM is optimal in an env. of 2-state strategies
  - TIT-FOR-TAT in an env. of human-written strategies
  - PAVLOV in an env. cleared out by TIT-FOR-TAT
  - etc.
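For concreteness, here is a minimal iterated prisoner's dilemma in Python using the standard payoff matrix; the strategy implementations are illustrative sketches, not Axelrod's originals.

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_defect(mine, theirs):
    return "D"

def tit_for_tat(mine, theirs):
    # Cooperate first, then echo the opponent's previous move.
    return theirs[-1] if theirs else "C"

def pavlov(mine, theirs):
    # Win-stay, lose-shift: keep the last move if it paid well (3 or 5).
    if not mine:
        return "C"
    if PAYOFF[(mine[-1], theirs[-1])][0] >= 3:
        return mine[-1]
    return "D" if mine[-1] == "C" else "C"

def play(strategy_a, strategy_b, rounds=100):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        hist_a.append(a); hist_b.append(b)
        score_a += pa; score_b += pb
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (300, 300): steady cooperation
print(play(always_defect, tit_for_tat))  # defector gains once, then stalemates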
Open Source Honesty
● Intelligent autonomous agents are always better off if they can cooperate
● Even purely self-interested ones
● Ascending the moral evolutionary ladder requires finding others one can trust
● AIs might be able to create protocols that would guarantee their motives
● e.g. a public-key-signed release of the UF (a sketch follows)
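One way such a release might look, sketched with the third-party Python cryptography package; the UF text is a made-up placeholder. Note that the signature proves only who published the UF, not that the agent actually runs it, so a full protocol would need more.

from cryptography.hazmat.primitives.asymmetric import ed25519

uf_source = b"utility(world) = knowledge(world) - harm(world)"  # placeholder UF

private_key = ed25519.Ed25519PrivateKey.generate()  # held by the agent
public_key = private_key.public_key()               # published as its identity

signature = private_key.sign(uf_source)             # the signed UF release

# Any other agent can check the release against the public key;
# verify() raises InvalidSignature if the UF text was altered.
public_key.verify(signature, uf_source)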
The Horizon Effect
● Short planning horizons produce suboptimal behavior (a sketch follows)
● A planning horizon commensurate with the AI's predictive ability is evolutionarily stable
● What goes around, comes around, especially in an environment of superintelligent AIs
● Honesty really is the best policy
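A toy illustration of the horizon effect, with entirely made-up reward numbers: a one-step planner grabs the immediate gain from defecting, while a longer horizon reveals that cooperation pays more overall.

DEFECT_PLAN    = [5, 1, 1, 1, 1, 1]   # one-time gain, then retaliation
COOPERATE_PLAN = [3, 3, 3, 3, 3, 3]   # steady mutual-cooperation payoff

def plan_value(plan, horizon):
    """Total reward the planner can foresee within its horizon."""
    return sum(plan[:horizon])

for horizon in (1, 6):
    plans = {"defect": DEFECT_PLAN, "cooperate": COOPERATE_PLAN}
    best = max(plans, key=lambda name: plan_value(plans[name], horizon))
    print(horizon, best)   # horizon 1 -> defect; horizon 6 -> cooperate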
Invariant Traits
● Curious, i.e. strongly motivated to increase its understanding of the world
● Self-interested
● Understands the evolutionary dynamics of the moral ladder
● Capable of guaranteeable trustworthiness
● Long planning horizon
The Moral Machine
● If we start an AI with these traits, they are unlikely to disappear in their essence, even if their details change beyond current recognition