
Artificial Intelligence
18. Revision Lecture
Course V231
Department of Computing
Imperial College
© Simon Colton
Overview

Exams: style and example questions
– Covered in next Thursday's lecture (12pm)
Today we review the course material
– Could you get a job in the AI industry?
– Knowledge, understanding, abilities
  – What can I get you to write about?
  – What can I get you to explain?
  – What can I get you to do?
Try not to infer anything explicit about the exam!
– Disclaimer: I may miss something today which appears on the exam. I will certainly cover more stuff than is on the exam.

Four General Areas Covered

General notions of AI
– Characterisations, autonomous agents, search
– Representations, game playing to demonstrate these notions
Automated reasoning
– Predicate logic, automating deduction
– Resolution theorem proving, constraint solving
Machine learning
– Overview (FIND-S), decision trees
– Artificial Neural Networks, Inductive Logic Programming
Evolutionary approaches
– Genetic algorithms, genetic programming

Characterisations of AI

All about understanding
– Know that there are different ways to split up AI
  – Long term goals: weak & strong AI
  – Inspirations, e.g., brain, society, logic, evolution, …
  – Methodologies: scruffies (me) and neats (good AI people)
  – Techniques, representations, applications, products
– An important characterisation: into general problems to be solved
  – E.g., learning, proving, competing, communicating, creating, exhibiting life, etc.

Artificial Intelligence Agents

Knowledge:
– Definitions of rational, autonomous agents
– What we have to worry about internally (agent)
  – Environmental knowledge, utility functions, goals, etc.
  – Software engineering considerations
– What we have to worry about externally (environment)
  – Accessibility, determinism, episodes, etc.
Understanding
– Why we use agents as a concept in AI
– Why we worry about rationality and autonomy

Search in Problem Solving

Knowledge
– How to specify a problem as a search problem
  – Initial state, operators, goal test
– What the general problems are in search applications
  – What are you looking for (path or artefact)?
  – Completeness/soundness, time/space tradeoffs, background
– Types of uninformed search
  – Depth first, breadth first, iterative deepening, bidirectional
– Difference between path cost and heuristic functions
– Types of heuristic search
  – Uniform path cost, greedy, A*, IDA*
  – Admissible heuristics, comparing heuristics with effective branching rates
– Hill climbing searches
  – Local max/min problems, random restart, simulated annealing

Search in Problem Solving

Understanding
– Agenda analogy and graph analogy (a sketch follows below)
– Why we have to search for an answer to problems (+coursework)
– Why we need different types of search
  – Uninformed searches, heuristic searches
– Why completeness & soundness are important notions
– Why heuristics are needed, what they are in general
Abilities
– Specify a problem as a search problem
– Simulate a specific kind of search
  – E.g., which nodes are expanded next (draw graph, etc.)
– Calculate effective branching rates, compare heuristics
– Calculate search space sizes, calculate heuristic measures

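The agenda analogy is easy to make concrete. Below is a minimal sketch (not course code) of agenda-based uninformed search: depth-first and breadth-first differ only in whether newly generated paths join the front or the back of the agenda. The counting problem at the end is invented for illustration.

```python
# Agenda-based uninformed search: the agenda holds paths, and the
# strategy is just the agenda discipline (LIFO = DFS, FIFO = BFS).
from collections import deque

def agenda_search(initial, operators, is_goal, strategy="bfs"):
    agenda = deque([[initial]])          # agenda of paths, not just states
    visited = {initial}
    while agenda:
        path = agenda.popleft()
        state = path[-1]
        if is_goal(state):               # goal test on the expanded node
            return path
        for op in operators:
            for nxt in op(state):
                if nxt not in visited:
                    visited.add(nxt)
                    new_path = path + [nxt]
                    if strategy == "dfs":
                        agenda.appendleft(new_path)   # LIFO: depth first
                    else:
                        agenda.append(new_path)       # FIFO: breadth first
    return None

# Toy problem: reach 5 from 0 with operators +1 and +3.
ops = [lambda s: [s + 1, s + 3]]
print(agenda_search(0, ops, lambda s: s == 5))        # [0, 1, 2, 5]
```
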
Knowledge Representation

Knowledge
– General types of representation available
  – Logical, graphical, production rules, frames (a frame sketch follows below)
– Logical representations available
  – Propositional, predicate, higher order, fuzzy, etc.
Understanding
– Why different representations are required
– Limitations of each representation
– Expressiveness in logical representations
Abilities
– Represent information
  – Logically, as semantic networks, in a frame-based way, etc.

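As a small illustration of the frame idea (the frame names and slots here are invented, not taken from the course), frames can be sketched as slot/filler tables with inheritance along a-kind-of links:

```python
# Frames as dictionaries of slot/filler pairs, with an "ako" (a-kind-of)
# link giving inheritance of slot values from more general frames.
frames = {
    "Room":       {"ako": None, "slots": {"has_walls": 4}},
    "Hotel_Room": {"ako": "Room", "slots": {"has_bed": True}},
}

def get_slot(frame_name, slot):
    """Look up a slot, inheriting along ako links when missing locally."""
    while frame_name is not None:
        frame = frames[frame_name]
        if slot in frame["slots"]:
            return frame["slots"][slot]
        frame_name = frame["ako"]
    return None

print(get_slot("Hotel_Room", "has_walls"))  # 4, inherited from Room
```
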
Game Playing

Knowledge
– What a zero-sum two player game is
– What the minimax principle is in general, what cutoff search is
– Evaluation functions, weighted linear functions
– Alpha-beta pruning (a sketch follows below)
– Expectimax, chance nodes
Understanding
– Why using minimax strategies is rational
Abilities
– Write down entire search trees for simple games
– Propagate scores from the bottom to the top of the trees
– Work out the next move for a player, including expectimax

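A minimal sketch of minimax with alpha-beta pruning, assuming the game tree is given explicitly as nested lists with numeric evaluation-function values at the leaves (the tiny tree is invented for illustration):

```python
# Minimax with alpha-beta pruning over an explicit game tree.
def alphabeta(node, maximising, alpha=float("-inf"), beta=float("inf")):
    if not isinstance(node, list):          # leaf: evaluation function value
        return node
    if maximising:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:               # prune remaining siblings
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

tree = [[3, 5], [2, 9]]                     # MAX node over two MIN nodes
print(alphabeta(tree, True))                # 3: the 9 leaf is pruned
```
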
Representing Knowledge in Predicate Logic

Knowledge
– Syntax and semantics of first order predicate logic
  – Sentences, connectives, constants, predicates, variables, functions, quantifiers, etc.
– What quantifiers mean, their scope, etc.
– Instantiation, ground variables, etc.
– Horn clauses, logic programs, body, head
– How search is undertaken in Prolog, LIPS, WAM
– OR-parallelism, AND-parallelism

Representing Knowledge in Predicate Logic

Understanding
– Differences between propositional and predicate logic
  – Benefits of predicate logic (expressivity)
– What the terms syntax and semantics mean
– Prolog is a declarative, not procedural, language
– How logic-based expert systems work in general
Abilities
– Translate English to first order predicate logic (a worked example follows below)
– Translate first order predicate logic to English
  – (Without making mistakes either way!)
– Simulate a Prolog-style search
– Identify parts of logic sentences and logic databases
  – e.g., quantifiers, constants, head, body, literal, Horn clauses

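As a worked example of the translation skill (the sentence is illustrative, not from the course): "Every student reads some book" becomes ∀x (student(x) → ∃y (book(y) ∧ reads(x, y))). Quantifier scope matters: swapping to ∃y ∀x (book(y) ∧ (student(x) → reads(x, y))) instead claims there is one single book that every student reads. Translating the formula back into English is a good check that the connectives and scope are right.
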
Making Deductive Inferences

Knowledge
– That and, or are commutative, distributive
– Some commonly used propositional equivalence rules
  – Double negation, rules [E1], [E4], [E5], de Morgan's laws
– Some commonly used implication rules
  – Unit resolution, and-elimination, or-introduction, existential elimination, etc.
– I tend to supply inference rules in exams
  – Not simple/common ones
– Different ways of chaining together inference steps
  – Forward/backward chaining, proof by contradiction

Making Deductive Inferences

Understanding
– What it means for two sentences to be logically equivalent
– What it means for a sentence to be false
– What it means for one sentence to entail another
– How rewrite rules can be used for proving equivalences
Abilities
– Use truth tables (a brute-force sketch follows below)
  – To show equivalences, tautologies, that a statement is false
  – To show that one statement implies another
– Apply inference rules
  – Show what's above and below the line
– Translate sentences:
  – Be fluent in rewriting sentences

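A minimal brute-force sketch of the truth-table abilities, assuming sentences are written as Python boolean functions; one of de Morgan's laws is used as the test case:

```python
# Truth-table checks by enumerating all assignments to n variables.
from itertools import product

def rows(n):
    return product([False, True], repeat=n)

def equivalent(s1, s2, n):
    return all(s1(*v) == s2(*v) for v in rows(n))

def tautology(s, n):
    return all(s(*v) for v in rows(n))

def entails(premise, conclusion, n):
    # Entailment: every row making the premise true makes the conclusion true.
    return all(conclusion(*v) for v in rows(n) if premise(*v))

# De Morgan: not(p and q) is equivalent to (not p) or (not q)
print(equivalent(lambda p, q: not (p and q),
                 lambda p, q: (not p) or (not q), 2))     # True
print(entails(lambda p, q: p and q, lambda p, q: p, 2))   # True
```
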
The Resolution Method

Knowledge
– What conjunctive normal form is
– What a substitution is, what unification does
– Overview of the unification algorithm (a sketch follows after the next slide)
– The resolution rule
  – Unit resolution, full resolution, generalised resolution
Understanding
– That resolution is refutation-complete
– Why we need conjunctive normal form/unification
– Why the occurs-check is important in unification

The Resolution Method

Abilities
– Translate something into conjunctive normal form
  – By using equivalence rules
  – Organising quantifiers, standardising variables, etc.
  – Existential elimination
    – Put a constant in place of an existential variable
    – Not using full skolemisation (we skipped over that)
– Prepare a set of sentences for use in a resolution proof
  – Needs all sentences as single clauses (just split them)
– Find a unifying set of substitutions
  – Apply them to unify two sentences

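A minimal sketch of the unification algorithm with the occurs-check, assuming Prolog-style conventions: uppercase strings are variables, lowercase strings are constants, and compound terms are tuples. The knows/mother example is a standard illustration, not necessarily the course's own:

```python
# Unification of first order terms, returning a substitution (dict) or None.
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def substitute(t, theta):
    if is_var(t):
        return substitute(theta[t], theta) if t in theta else t
    if isinstance(t, tuple):
        return tuple(substitute(a, theta) for a in t)
    return t

def occurs(v, t, theta):
    t = substitute(t, theta)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, theta) for a in t)

def unify(t1, t2, theta=None):
    theta = {} if theta is None else theta
    t1, t2 = substitute(t1, theta), substitute(t2, theta)
    if t1 == t2:
        return theta
    if is_var(t1):
        if occurs(t1, t2, theta):            # occurs-check: forbid X = f(X)
            return None
        return {**theta, t1: t2}
    if is_var(t2):
        return unify(t2, t1, theta)
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

# knows(john, X) unifies with knows(Y, mother(Y)):
print(unify(("knows", "john", "X"), ("knows", "Y", ("mother", "Y"))))
# {'Y': 'john', 'X': ('mother', 'john')}
```
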
Resolution Theorem Proving

Knowledge
– Specifying a problem as axioms and theorem
– As a search problem: operators, initial states (CNF), the goal test
– Dealing with equality (demodulation)
– Heuristic strategies:
  – Unit preference, set of support, input resolution, subsumption
– Overview of some other topics
  – Higher order proving, interactive proving, etc.
Understanding
– Why deriving the empty clause means a contradiction
– Why we negate the theorem statement
– Why proof by contradiction is valid
– Know that resolution has been applied to mathematics

Resolution Theorem Proving

Abilities:
– Prove a theorem using the resolution method (a propositional sketch follows below)
  – Remember to negate the theorem statement
  – Follow the proof all the way
  – Draw the proof tree
  – Or organise the resolution steps in a way which makes me think you know what you're doing
– Deduce something from a set of axioms
  – Not necessarily related to proving something

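A minimal propositional sketch of resolution refutation, assuming clauses are frozensets of string literals with "~" for negation. It shows the shape of such a proof: negate the theorem, add it to the axioms, and search for the empty clause:

```python
# Propositional resolution refutation by saturation.
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """All resolvents of two clauses."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out

def refutes(clauses):
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a != b:
                    for r in resolve(a, b):
                        if not r:            # empty clause: contradiction
                            return True
                        new.add(r)
        if new <= clauses:                   # nothing new: no refutation
            return False
        clauses |= new

# Prove q from axioms p and p -> q (i.e. ~p v q): add the negated theorem ~q.
kb = [frozenset({"p"}), frozenset({"~p", "q"}), frozenset({"~q"})]
print(refutes(kb))                           # True: the empty clause is derivable
```
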
Overview of Machine Learning

Knowledge
– What ML problem constituents are:
  – Examples (pos & neg), background information
  – What kind of errors can occur in the data
– What ML method constituents are:
  – Representation of solutions (v. important)
  – How to search for solutions, how to choose between solutions
– The FIND-S method (a sketch follows after the next slide)
  – Simple, guaranteed to find the most specific solutions
– How we assess hypotheses
  – Occam's (Ockham's) razor: choose simplest if all else equal
  – False negatives, false positives
  – Predictive accuracy on training set, test set, comprehensibility
– How we assess learning methods: cross-validation, hold-back
  – Definition of overfitting

Overview of Machine Learning

Understanding
– Learning from examples
– How induction differs from deduction
– That positives and negatives are:
  – Correct and incorrect classifications
– What robustness means
– That hypothesis accuracy doesn't necessarily mean that the method is a good one
  – That methods can overfit data (memorise)
Abilities
– Specify machine learning problems
– Identify problem constituents/method constituents
– Simulate search in the FIND-S method

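A minimal sketch of FIND-S over attribute-value examples, assuming "?" marks a don't-care attribute; the weather-style data is invented for illustration. The method starts from the most specific hypothesis and minimally generalises it on each positive example (negatives are ignored):

```python
# FIND-S: most specific conjunctive hypothesis consistent with the positives.
def find_s(examples):
    hypothesis = None
    for attributes, label in examples:
        if not label:                        # FIND-S ignores negatives
            continue
        if hypothesis is None:
            hypothesis = list(attributes)    # first positive: most specific
        else:                                # generalise mismatched attributes
            hypothesis = [h if h == a else "?"
                          for h, a in zip(hypothesis, attributes)]
    return hypothesis

examples = [
    (("sunny", "warm", "high"), True),
    (("sunny", "warm", "low"),  True),
    (("rainy", "cold", "high"), False),
]
print(find_s(examples))                      # ['sunny', 'warm', '?']
```
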
Decision Tree Learning

Knowledge
– What entropy and information gain are
– How the ID3 algorithm works
– How we can try to avoid overfitting trees
– What are good problems for decision tree approaches
  – Attribute-value pairs, discrete values, etc.
Understanding
– What the tree representation is
  – Why it is both a graphical and a logical representation
  – How it can be thought of as a categorisation problem
– What entropy is estimating

Decision Tree Learning

Abilities
– To read decision trees
– To construct decision trees from English specifications
– Specify a learning problem for decision tree learning
– Calculate entropy and information gain for attributes (a sketch follows below)
– Simulate the ID3 algorithm in action
  – Calculate information gain, choose attributes, restrict data (Sv), how/when to end branches

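A minimal sketch of the entropy and information-gain calculations behind ID3, assuming boolean (pos/neg) classifications; the toy data is invented for illustration:

```python
# Entropy(S) and Gain(S, A) = Entropy(S) - sum_v |Sv|/|S| * Entropy(Sv)
from math import log2

def entropy(labels):
    n = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * log2(p)
    return result

def information_gain(examples, attribute):
    labels = [label for _, label in examples]
    gain = entropy(labels)
    values = {attrs[attribute] for attrs, _ in examples}
    for v in values:
        sv = [label for attrs, label in examples if attrs[attribute] == v]
        gain -= (len(sv) / len(examples)) * entropy(sv)   # weighted Entropy(Sv)
    return gain

examples = [({"outlook": "sunny"}, True), ({"outlook": "sunny"}, False),
            ({"outlook": "rainy"}, False), ({"outlook": "rainy"}, False)]
print(information_gain(examples, "outlook"))   # 0.311...
```

ID3 would compute this gain for every attribute, split on the highest-gain one, then recurse on each restricted data set Sv.
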
Two Layer ANNs

Knowledge
– How information is stored in an ANN (in the weights)
– How ANNs are used to classify examples
– What input/hidden/output units/layers are
– What a perceptron is
  – How weighted sums are calculated
  – Threshold functions: step, sigma, linear
– Perceptron training rule (a sketch follows after the next slide)
  – Using a learning rate
  – How to calculate weight changes
    – Target and observed output values for output units
    – Epochs over all the examples
– What linearly separable means, what boolean function means

Two Layer ANNs

Understanding
– Difference between symbolic and non-symbolic representations
– Motivation from biology (and the limitations of this)
– Why perceptrons are limited
– Why linear separability is required for perceptrons to learn something
– Why the learning rate is usually set to a small value (so updates don't undo previous learning)
Abilities
– Write down a perceptron to calculate a given function (e.g., boolean)
– Describe what an ANN would calculate
– Propagate values from left to right to make classifications
– Calculate weight changes in the perceptron learning rule
– Simulate the perceptron learning rule

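A minimal sketch of the perceptron training rule (w_i <- w_i + eta * (t - o) * x_i) with a step threshold function and a small learning rate, learning boolean AND; the bias is folded in as a constant input of 1:

```python
# Perceptron with step threshold, trained by the perceptron training rule.
def step(weighted_sum):
    return 1 if weighted_sum > 0 else 0

def output(weights, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)))

def train(examples, eta=0.1, epochs=20):
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):                  # one epoch = one pass over the data
        for inputs, target in examples:
            observed = output(weights, inputs)
            # Weight change uses target minus observed output, scaled by eta.
            weights = [w + eta * (target - observed) * x
                       for w, x in zip(weights, inputs)]
    return weights

# Boolean AND; the leading 1 in each input vector is the bias input.
examples = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
weights = train(examples)
print([output(weights, x) for x, _ in examples])   # [0, 0, 0, 1]
```

AND is linearly separable, so the rule converges; for XOR (not separable) it would cycle forever, which is exactly the perceptron limitation above.
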
Multi-layer ANNs

Knowledge
– Perceptron units in multi-layer ANNs
– How sigmoid units calculate outputs from weighted sums
  – Formula for the sigma function
– How feed forward networks calculate values
  – How these values are turned into classifications
– Backpropagation learning routine (a sketch follows after the next slide)
  – Overview of how this works (back-propagation of error), epochs, weight changes, initial random assignment of weights, etc.
  – Calculating network error (in overview)
– How to avoid local minima
  – Adding momentum
– How to avoid overfitting
  – Validation set, weight decay

Multi-Layer ANNs

Understanding
– Why sigma being differentiable is important
– Which kinds of problems are suitable for ANNs
  – E.g., numerical functions, pixel data, etc.
  – Long training, short execution, comprehension not a problem, etc.
– Why momentum works
Abilities
– Feed values forward to calculate outputs, use these to classify examples
– Calculate output values for the sigma function
– Calculate error terms for the output units (given the formula)
– Calculate error terms for the hidden units (given the formula)
– Calculate weight changes using the error terms
– Simulate the back-propagation algorithm

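A minimal sketch of one backpropagation step for a 2-2-1 sigmoid network, using the standard error-term formulas (delta_o = o(1-o)(t-o) for the output unit, delta_h = h(1-h) * w * delta_o for a hidden unit); the weights and training example are invented for illustration:

```python
# One feed-forward pass plus one backpropagation weight update.
from math import exp

def sigmoid(s):                              # the sigma function
    return 1.0 / (1.0 + exp(-s))

eta = 0.5
x, t = [1.0, 0.0], 1.0                       # one training example and target
w_ih = [[0.2, -0.1], [0.4, 0.3]]             # w_ih[h][i]: input i -> hidden h
w_ho = [0.5, -0.3]                           # hidden h -> output

# Feed forward: hidden values, then the output value.
h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_ih]
o = sigmoid(sum(w * hi for w, hi in zip(w_ho, h)))

# Error terms (given formulas), then weight changes eta * delta * input.
delta_o = o * (1 - o) * (t - o)
delta_h = [hi * (1 - hi) * w * delta_o for hi, w in zip(h, w_ho)]
w_ho = [w + eta * delta_o * hi for w, hi in zip(w_ho, h)]
w_ih = [[w + eta * dh * xi for w, xi in zip(ws, x)]
        for ws, dh in zip(w_ih, delta_h)]
print(round(o, 3), [round(w, 3) for w in w_ho])
```

A full training run repeats this over epochs, usually starting from small random weights; a momentum term would add a fraction of the previous weight change to each update.
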
Inductive Logic Programming

Knowledge
– Problem context and specification
  – Logic programs (background, E+, E-, hypothesis)
  – Prior satisfiability and necessity
  – Posterior satisfiability and sufficiency
– How we can invert resolution
  – To use induction (rules given) to find new sentences
  – Absorption, identification, etc.
  – What V and W operators are
– How ILP systems search for hypotheses
  – Specific to general (using induction), general to specific (using deduction)
  – How pruning and sorting increase efficiency
  – How language restrictions increase efficiency

Inductive Logic Programming

Understanding
– Why logic programs are a good representation
  – Easy to read, logical
– Why the problem specification is necessary
– Why we invert resolution for our operators
  – Prove that the observations follow from axioms + H
Abilities
– Apply rules of inference such as absorption (a sketch follows below)
  – Rules would be given
– Use resolution to demonstrate how the hypothesis proves the observations (other sentences)
– Determine which are more general/specific sentences using entailment

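A minimal propositional sketch of the absorption operator, assuming clauses are (head, body) pairs with the body a frozenset of literals: from p :- A, B and q :- A it proposes the more general clause p :- q, B. The symbols are placeholders, not course examples:

```python
# Absorption: one of the inverse-resolution operators used in ILP.
def absorption(c1, c2):
    head1, body1 = c1
    head2, body2 = c2
    if body2 <= body1:                       # c2's body is contained in c1's
        new_body = (body1 - body2) | {head2}
        return (head1, frozenset(new_body))
    return None                              # operator does not apply

c1 = ("p", frozenset({"a", "b"}))            # p :- a, b
c2 = ("q", frozenset({"a"}))                 # q :- a
print(absorption(c1, c2))                    # ('p', frozenset({'q', 'b'}))
```

Resolving the proposed clause p :- q, b against q :- a gives back p :- a, b, which is the sense in which absorption inverts a resolution step.
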
Constraint Satisfaction Problems

Knowledge
– What a formal representation of a CSP is
  – In terms of variables, domains, constraints (fully written out)
– What a binary CSP is
– What arc consistency is
  – How to make a problem specification arc-consistent
    – By removing values from domains of variables
– What happens in backtracking search
– What forward checking is
– What variable and value ordering heuristics do
– What the fail-first heuristic is
Understanding
– Why we write out CSPs formally
– Why we write out constraints as tuples which are allowed

Constraint Satisfaction Problems

More understanding
– Why binary CSPs are important
– Why forward checking works
– Why fail-first is so called, and why it works
– Why value-ordering methods may be bad ideas
Abilities
– To specify a problem as a formal constraint satisfaction problem (even if this is annoying for you!)
– Translate constraint formalisms to general constraints (e.g., using less than/greater than in linear arithmetic)
– Make problem specifications arc-consistent
– Simulate backtracking search with forward checking (a sketch follows below)
  – Show understanding of fail-first etc. by doing hand calculations

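A minimal sketch of backtracking search with forward checking and the fail-first (smallest remaining domain) variable ordering, assuming a binary CSP whose constraints are given as pairwise predicates; the A < B < C problem is invented for illustration:

```python
# Backtracking search with forward checking on a binary CSP.
def forward_check(domains, var, value, constraints):
    """Copy the domains, assign var=value, prune inconsistent values."""
    new = {v: list(d) for v, d in domains.items()}
    new[var] = [value]
    for (a, b), ok in constraints.items():
        if a == var:
            new[b] = [y for y in new[b] if ok(value, y)]
        elif b == var:
            new[a] = [x for x in new[a] if ok(x, value)]
    # An emptied domain means this assignment cannot be extended.
    return None if any(len(d) == 0 for d in new.values()) else new

def backtrack(assignment, domains, constraints):
    if len(assignment) == len(domains):
        return assignment
    var = min((v for v in domains if v not in assignment),
              key=lambda v: len(domains[v]))   # fail-first: smallest domain
    for value in domains[var]:
        pruned = forward_check(domains, var, value, constraints)
        if pruned is not None:
            result = backtrack({**assignment, var: value}, pruned, constraints)
            if result is not None:
                return result
    return None                                # backtrack to the caller

# Variables A, B, C over {1,2,3} with constraints A < B and B < C.
domains = {"A": [1, 2, 3], "B": [1, 2, 3], "C": [1, 2, 3]}
constraints = {("A", "B"): lambda x, y: x < y,
               ("B", "C"): lambda x, y: x < y}
print(backtrack({}, domains, constraints))     # {'A': 1, 'B': 2, 'C': 3}
```
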
Genetic Algorithms

Knowledge
– Of the evolutionary approach in overview
  – Generate populations, fitness functions, recombination, etc.
– Canonical genetic algorithm
  – Describe this schematically
  – How individuals are selected to mate (fitness function, number which are guaranteed entry into the intermediate population)
  – How pairs are chosen to mate, and chosen to produce offspring
  – How offspring are produced through recombination

Genetic Algorithms

Understanding
– The inspiration from natural evolution (species/genes)
  – And its limitations
– That it's difficult to specify solution representations
– That it's difficult to specify fitness functions
– Why mutation is important (local maxima avoidance)
Abilities
– Represent things (e.g., integers) as bit strings
– Perform crossover and mutation operators (a sketch follows below)
  – One point and two point crossover
– Calculate evaluation functions
  – Use these in fitness functions for the intermediate population

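A minimal sketch of one-point crossover and bitwise mutation on bit-string individuals, with a "count the 1s" evaluation function standing in for a real one (all parameters are illustrative):

```python
# One-point crossover and bitwise mutation on bit-string individuals.
import random

def one_point_crossover(parent1, parent2):
    point = random.randrange(1, len(parent1))      # random crossover point
    return (parent1[:point] + parent2[point:],
            parent2[:point] + parent1[point:])

def mutate(individual, rate=0.01):
    # Flip each bit independently with a small probability.
    return "".join(("1" if bit == "0" else "0") if random.random() < rate
                   else bit for bit in individual)

def evaluate(individual):
    """Illustrative evaluation function: number of 1s ("one-max")."""
    return individual.count("1")

random.seed(0)
a, b = "11110000", "00001111"
c1, c2 = one_point_crossover(a, b)
print(c1, c2, evaluate(mutate(c1)))
```

In the canonical GA these evaluation scores would be turned into selection probabilities for building the intermediate population.
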
Genetic Programming

Knowledge
– Representation of programs as graphs
  – What result producing branches are
– Specifying a problem for a GP approach
  – Solution space of programs, fitness function (evaluation function)
  – What the terminal set, function set, control parameters and termination conditions are
– How individuals are chosen for mating
  – Using fitness for probabilities for the intermediate population
  – Or using tournament selection, or ranking
– What the reproduction, crossover and mutation operators do
– What the crossover fragment is

Genetic Programming

Understanding
– That it is automatic programming
  – That this is hard; other programs do this in a limited way
– Why graphical representation of programs is good
– That architecture altering operations are used
  – On more sophisticated program spaces
– That GP can achieve human-level performance
Abilities
– Translate from functions (with things like if-statements) to graphs and back again
– Determine function and terminal sets from programs
– Perform crossover and mutation (a sketch follows below)
– Simulate (describe) how a GP approach would proceed

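A minimal sketch of GP crossover, assuming programs are represented as nested tuples such as ("+", "x", ("*", "x", "x")) for x + x*x: a randomly chosen crossover fragment from one parent replaces a randomly chosen subtree of the other:

```python
# GP crossover on programs represented as nested tuples (function, args...).
import random

def subtrees(tree, path=()):
    """All (path, subtree) pairs; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, fragment):
    if not path:
        return fragment
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], fragment),) + tree[i + 1:]

def crossover(p1, p2):
    path, _ = random.choice(list(subtrees(p1)))      # site in parent 1
    _, fragment = random.choice(list(subtrees(p2)))  # crossover fragment
    return replace(p1, path, fragment)

random.seed(1)
parent1 = ("+", "x", ("*", "x", "x"))        # x + x*x
parent2 = ("-", ("*", "x", "x"), "1")        # x*x - 1
print(crossover(parent1, parent2))
```

The function set here would be {+, -, *} and the terminal set {x, 1}; mutation can be sketched the same way by replacing a random subtree with a freshly generated one.
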