module overview and introduction
CS256 Intelligent Systems
-Vision Systems
Module Overview
Timetable
Week (mode)   Topic
1 (2L)        Introduction to the module and vision systems
2 (2L)        Case studies and basic concepts
3 (2L)        Java and image fundamentals
4 (LP)        Feature extraction and image transforms
5 (LP)        Edge detection and segmentation
6 (LP)        Colour and texture
7 (LP)        Recover 3D information
8 (LP)        System architecture
9 (LP)        Knowledge and reasoning
10 (2L)       Image classification and retrieval (including revision)
Coursework
• Develop a system that is able to identify key
features in selected images.
• Write a report to describe the design,
implementation and evaluation of the
system. Please see details in separate
document on coursework assignment.
• Questions will be asked during lab sessions
• Deadline: Monday 18th April, 2005
Assessment
• Examination
– 60%
– three questions from four
• Coursework
– 40%
– Report based on experiments
Recommended Texts
• Nick Efford, Digital Image Processing, A Practical
Introduction using Java, Addison Wesley, ISBN 0201596237,
May 2000
• Tim Morris (2004), Computer Vision and Image Processing,
Palgrave MacMillan, ISBN 0333994515
• Patrick H Winston, (1992), Artificial Intelligence (Third
Edition), Addison Wesley Publishers Co. ISBN 0201533774
• Rob Callan (2003), Artificial Intelligence, Palgrave
MacMillan, ISBN 0333801369
• Paul F Whelan and Derek Molloy (2001), Machine Vision
Algorithms in Java: Techniques and Implementation,
Springer, ISBN 1852332182
Objectives of the module
• Understand the fundamentals in machine
intelligence
– Focus on vision systems, but will relate to other
domains
• Understand components in vision systems
– Be familiar with common operations for processing
images
– Be able to implement simple image processing
operations
• Evaluate a vision system
• Additionally: encourage students to practise both
basic and advanced Java programming
Intelligence and Perception
• First understand how we perceive the world,
then teach the machine to interpret the world
from the primitive data it receives
• Human Perceptual Modalities
– Tactile (touch)
– Gustatory (taste)
– Visual (sight)
– Auditory (hearing)
– Olfactory (smell)
Intelligent Systems
• intelligent robots and intelligent machines
– With artificial intelligence principles
– reason about the world and take appropriate
actions by manipulating knowledge
– sense the world directly
• Vision - computational perception
– a diverse and interdisciplinary body of knowledge
and techniques
– to understand the principles behind the processes
that interpret perceptual signals provided by
various sensors.
Intelligent Systems
• In vision, software’s job is to process the input
from the hardware or sensors
• Humans have natural abilities to speak, see,
think, smell, sense and so on. Machines have no
such inborn abilities, only simple engines that
follow logical algorithms.
• Giving a computer similar abilities, such as
speech and vision, is closely related to building
a knowledge system, but it also requires
simulating the perception process and combining
it with knowledge
Intelligent Systems
• Integrate different levels of processing for
bridging different gaps – sensors, raw data,
low level processing, high level processing
and knowledge, for building a complete
intelligent system
• Reflected in this module structure
Figure 5-10: image B95-00016-01.3.S1.X5.4.jpg (above) and its
annotation window generated in the I-Browse system
Applications
• Classical
– robot
– medical imaging
– remote sensing
– astronomy
• Today
– DTV
– image interpretation
– biometry
– GIS (Earth/planetary observation, monitoring, exploration)
– human genome project
– creative media and art, entertainment
Sample applications
- Biometry
• Using personal characteristics to identify a
person
– fingerprints
– face
– iris
– DNA
– gait
– etc
Iris Scan
• Striations on iris are individually unique
• Obvious applications
– security
– PIN
• Processing steps
– Locate the eye in the head image
– Radial resampling of the iris (fixed number of samples)
– Analysis
– Numerical description
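The radial resampling step can be sketched as follows. This is only an illustration under stated assumptions, not the method used by any particular iris scanner: the class name `IrisUnwrap`, the `image[y][x]` layout, the nearest-neighbour lookup, and the idea that the eye centre and the pupil and iris radii are already known from the "locate the eye" step are all hypothetical choices. The point it shows is that every iris, whatever its size in the photograph, is reduced to the same fixed number of samples.

```java
/** Hypothetical sketch: unwrap the iris ring into a fixed-size grid of samples. */
final class IrisUnwrap {

    /**
     * Samples the ring between pupilRadius and irisRadius around (cx, cy).
     * Rows of the result are radial steps, columns are angular steps.
     * The grey-level image is assumed to be indexed as image[y][x].
     */
    static int[][] resample(int[][] image, double cx, double cy,
                            double pupilRadius, double irisRadius,
                            int radialSamples, int angularSamples) {
        if (radialSamples <= 0 || angularSamples <= 0 || irisRadius <= pupilRadius) {
            throw new IllegalArgumentException("invalid sampling parameters");
        }
        int height = image.length;
        int width = image[0].length;
        int[][] out = new int[radialSamples][angularSamples];

        for (int i = 0; i < radialSamples; i++) {
            // Radius at the middle of the i-th ring.
            double r = pupilRadius
                    + (irisRadius - pupilRadius) * (i + 0.5) / radialSamples;
            for (int j = 0; j < angularSamples; j++) {
                double theta = 2.0 * Math.PI * j / angularSamples;
                // Nearest-neighbour lookup of the pixel under the sample point.
                int x = (int) Math.round(cx + r * Math.cos(theta));
                int y = (int) Math.round(cy + r * Math.sin(theta));
                if (x >= 0 && x < width && y >= 0 && y < height) {
                    out[i][j] = image[y][x];
                }
                // Samples that fall outside the image are left at 0.
            }
        }
        return out;
    }
}
```

Calling `IrisUnwrap.resample(image, cx, cy, 20, 60, 8, 128)` would always return an 8 x 128 grid, which later analysis could turn into a numerical description.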
Image Representation
[Figure: pixel grid with axes x (1 to m) and y (1 to n); each pixel
holds f(x,y), or (r(x,y), g(x,y), b(x,y)) for a colour image]
An array F: a digital image consists of an array
of m x n pixels; the pixel in the xth column and
the yth row has an intensity equal to f(x,y).
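A minimal sketch of the array F in Java follows. The class name `GreyImage`, the 0 to 255 intensity range, and the row-major storage are assumptions made for illustration; the 1-based coordinates follow the figure, with m taken as the number of columns and n as the number of rows.

```java
/** Hypothetical sketch of a grey-level image F with m columns and n rows. */
final class GreyImage {
    private final int width;   // m: number of columns
    private final int height;  // n: number of rows
    private final int[] data;  // row-major storage, one intensity per pixel

    GreyImage(int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("image must have positive size");
        }
        this.width = width;
        this.height = height;
        this.data = new int[width * height];
    }

    int width() { return width; }

    int height() { return height; }

    /** Returns f(x, y): the intensity in column x, row y (both 1-based). */
    int f(int x, int y) {
        return data[index(x, y)];
    }

    /** Sets the intensity in column x, row y (assumed range 0 to 255). */
    void set(int x, int y, int intensity) {
        if (intensity < 0 || intensity > 255) {
            throw new IllegalArgumentException("intensity out of range");
        }
        data[index(x, y)] = intensity;
    }

    // Maps 1-based (x, y) to a position in the row-major array.
    private int index(int x, int y) {
        if (x < 1 || x > width || y < 1 || y > height) {
            throw new IndexOutOfBoundsException("(" + x + ", " + y + ")");
        }
        return (y - 1) * width + (x - 1);
    }
}
```

Note that Java's own image classes count pixels from 0 rather than 1, so the offset in `index` would disappear when working with them directly.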
Colour image and video sequence
• Colour can be conveyed by combining different
colours of light, using three components (red, green
and blue): R = r(x,y); G = g(x,y); B = b(x,y), where R,
G, B are defined in a similar way to F.
• The vector (r(x,y), g(x,y), b(x,y)) defines the intensity
and colour at the point (x,y) in the colour image.
• A video sequence is, in effect, a time-sampled
representation of the original moving scene.
• Each frame in the sequence is a standard colour, or
monochrome image and can be coded as such.
• A monochrome video sequence may be represented
digitally as a sequence of 2-D arrays [F1, F2, F3, ..., FN].
Java example for image representation:
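The original example is not part of this transcript, so what follows is a stand-in sketch rather than the code shown in the lecture. It uses the standard `java.awt.image.BufferedImage` class to read the colour vector (r(x,y), g(x,y), b(x,y)) at a pixel, and a list of 2-D arrays for a monochrome video sequence [F1, ..., FN]. The class name `ColourImageDemo` and its helper methods are hypothetical, and `BufferedImage` coordinates start at 0.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of colour-image and video-sequence representations. */
final class ColourImageDemo {

    /** Returns {r, g, b} at pixel (x, y), each component in 0 to 255. */
    static int[] rgbAt(BufferedImage image, int x, int y) {
        int packed = image.getRGB(x, y);   // packed as 0xAARRGGBB
        int r = (packed >> 16) & 0xFF;
        int g = (packed >> 8) & 0xFF;
        int b = packed & 0xFF;
        return new int[] {r, g, b};
    }

    /** Packs three 0 to 255 components into one opaque pixel value. */
    static int pack(int r, int g, int b) {
        return (0xFF << 24) | (r << 16) | (g << 8) | b;
    }

    /** Builds a blank monochrome sequence; each frame is indexed [y][x]. */
    static List<int[][]> blankVideo(int frames, int width, int height) {
        List<int[][]> sequence = new ArrayList<>();
        for (int i = 0; i < frames; i++) {
            sequence.add(new int[height][width]);
        }
        return sequence;
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        image.setRGB(2, 1, pack(255, 128, 0));
        int[] rgb = rgbAt(image, 2, 1);
        System.out.println("r=" + rgb[0] + " g=" + rgb[1] + " b=" + rgb[2]);

        List<int[][]> video = blankVideo(3, 4, 3);
        System.out.println("frames: " + video.size());
    }
}
```

A colour video sequence could be held the same way, as a list of `BufferedImage` frames instead of a list of 2-D arrays.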
The Difficulty in Vision Computing –
Taking the Human Visual System for
Granted
• The processing capability of human visual
systems is often taken for granted
• The subtlety of the subconscious functions,
and the difficulty of describing exactly how
they operate, make it hard to develop
algorithms that emulate human visual
behaviour
• If we were a computer…
Difficulties in vision computing
- the sensory gap
• The sensory gap is the gap between the
object in the world and the information
in a (computational) description
derived from a recording of that scene.
• disambiguation processing
Difficulties in vision computing
- The semantic gap
• The semantic gap is the lack of coincidence
between the information that one can
extract from the visual data and the
interpretation that the same data have for a
user in a given situation. (Arnold, 2000)
• The higher the level of interpretation, the
more domain knowledge, and management of
that knowledge, is required.