Towards Solving Metric Labeling Problems in Computer Vision

Image Basics
Hao Jiang
Computer Science Department
Sept. 4, 2014
Image Formation
• The most common way to obtain an image is with a camera.
A “Simple” Camera
Let’s hold a sensor (a film) in front of the object.
Hopefully we will have an image…
A “Simple” Camera
Unfortunately, at the same image point, light may come
from different source points on an object.
The Pinhole Camera
Camera with Lens
The Imaging Model
An image depends on:
• Lighting
• Camera pose and optical properties
• Surface properties: material, geometry
Images as Surfaces
An image can be treated as a 2D function z = f(x, y).
Image Profile
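As a small illustration of the surface view, the sketch below treats a tiny grayscale image as samples of z = f(x, y) and reads one row out as an intensity profile. The pixel values and the helper name `row_profile` are made up for illustration and are not from the slides.

```python
# A tiny grayscale image stored as rows of intensities (illustrative values).
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]

def f(x, y):
    """Intensity z at column x, row y: the image viewed as a 2D function."""
    return image[y][x]

def row_profile(y):
    """Intensity profile along row y, i.e. z = f(x, y) with y held fixed."""
    return [f(x, y) for x in range(len(image[y]))]

print(row_profile(1))
```

Plotting such a profile against x gives a 1D slice through the image surface.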
Sampling
• To “digitize” the continuous image, we first need to sample it.
Sampling on a grid
Sampling problem
The image of Barbara
Aliasing due to sampling
[Figure: two plots of amplitude (-1 to 1) versus t (0 to 100) for a signal of frequency f sampled at rate fs. Top: fs = 2.5f. Bottom: fs = 1.67f, where a new component is added to the original signal. This is called aliasing.]
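The two sampling rates in the plots, fs = 2.5f and fs = 1.67f, sit on either side of the limit fs = 2f. The sketch below is a minimal illustration, assuming a pure cosine of frequency f; the helper name `apparent_frequency` is hypothetical. It folds the signal frequency into the band [0, fs/2] to show which frequency the samples appear to have.

```python
import math

def apparent_frequency(f, fs):
    """Frequency seen after sampling a sinusoid of frequency f at rate fs."""
    r = f % fs             # sampling cannot distinguish f from f + k*fs
    return min(r, fs - r)  # fold the result into the band [0, fs/2]

f = 1.0
for ratio in (2.5, 1.67):
    fs = ratio * f
    print(f"fs = {ratio}f -> apparent frequency {apparent_frequency(f, fs):.2f}f")

# Check for fs = 1.67f: the original and the aliased cosine agree at
# every sample instant t = n / fs, so the samples cannot tell them apart.
fs = 1.67 * f
fa = apparent_frequency(f, fs)
for n in range(5):
    t = n / fs
    original = math.cos(2 * math.pi * f * t)
    alias = math.cos(2 * math.pi * fa * t)
    assert math.isclose(original, alias, abs_tol=1e-9)
```

At fs = 2.5f the frequency is preserved, while at fs = 1.67f the samples are consistent with a lower frequency of about 0.67f.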
Image Resolution
• Sensor resolution: the size of the real-world scene area that maps onto a single image pixel.
• Image resolution: the number of pixels.
Digitization
• The samples are continuous-valued and can take infinitely many possible values.
• The digitization process approximates these values with a fixed number of discrete levels.
• To represent N levels, we need ceil(log2 N) bits.
• So, what determines the number of bits we need for an image?
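The relation between levels and bits can be checked with a short sketch. It assumes sample values normalized to the range [0, vmax]; the helper names `bits_needed` and `quantize` are illustrative, not from the slides.

```python
def bits_needed(levels):
    """Smallest number of bits that can index `levels` distinct values.

    For levels >= 2 this equals ceil(log2(levels)), computed with integers.
    """
    return max(1, (levels - 1).bit_length())

def quantize(value, levels, vmax=1.0):
    """Map a continuous value in [0, vmax] to an integer level 0..levels-1."""
    level = int(value / vmax * levels)
    return min(level, levels - 1)  # value == vmax falls into the top level

print(bits_needed(256))    # levels of a typical grayscale pixel
print(quantize(0.5, 256))  # a mid-gray sample
```

Uniform quantization is only one choice; the number of levels trades storage against approximation error.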
Image as Matrices
174 167 184
207 213 227
Types of Digital Images
• Grayscale image
  ◦ Usually we use 256 levels for each pixel, so we need 8 bits to represent a pixel (2^8 = 256).
  ◦ Some images use more bits per pixel; for example, MRI images may use 16 bits per pixel.
An 8-bit grayscale image.
• Binary image
  ◦ A binary image has only two values (0 or 1).
  ◦ Binary images are important in image analysis and object detection applications.
Grayscale Image as a Stack of Binary Images
[b7 b6 b5 b4 b3 b2 b1 b0], where b7 is the MSB and b0 is the LSB.
Each bit plane is a binary image.
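A minimal sketch of the bit-plane decomposition, assuming 8-bit pixels stored as nested lists; the function names are illustrative. It extracts bit k of every pixel as a binary image and confirms that stacking the eight planes recovers the original.

```python
def bit_plane(image, k):
    """Binary image made of bit k (0 = LSB, 7 = MSB) of each 8-bit pixel."""
    return [[(pixel >> k) & 1 for pixel in row] for row in image]

def from_bit_planes(planes):
    """Rebuild the grayscale image; planes[k] holds bit k of every pixel."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [
        [sum(planes[k][r][c] << k for k in range(len(planes))) for c in range(cols)]
        for r in range(rows)
    ]

image = [[174, 167, 184], [207, 213, 227]]  # sample values from the matrix slide
planes = [bit_plane(image, k) for k in range(8)]
assert from_bit_planes(planes) == image
```

The MSB plane carries most of the visible structure, while the LSB plane usually looks like noise.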
Dithering
• A technique to represent a grayscale image with a binary one.
Convert image to 4 levels: I’ = floor(I/64), giving the levels 0, 1, 2, 3.
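The sketch below shows the 4-level conversion I’ = floor(I/64) from the slide, followed by one common way to produce a binary image: ordered dithering with a 2x2 Bayer threshold matrix. The slides do not say which dithering scheme they use, so the ordered-dither part is an assumption chosen for brevity.

```python
BAYER_2X2 = [[0, 2], [3, 1]]  # standard 2x2 ordered-dither index matrix

def quantize_4_levels(image):
    """I' = floor(I / 64): map 8-bit intensities to the levels 0..3."""
    return [[p // 64 for p in row] for row in image]

def ordered_dither(image):
    """Binary image: compare each pixel with a position-dependent threshold."""
    out = []
    for y, row in enumerate(image):
        out_row = []
        for x, p in enumerate(row):
            # Thresholds cycle through 32, 160, 224, 96 over each 2x2 tile.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 64
            out_row.append(1 if p > threshold else 0)
        out.append(out_row)
    return out

flat_gray = [[128, 128], [128, 128]]
print(quantize_4_levels(flat_gray))
print(ordered_dither(flat_gray))
```

For a flat mid-gray patch, half of the output pixels are turned on, so the local average of the binary image approximates the original intensity.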
Color Image
Channels: r, g, b
24-bit image (8 bits per channel)
Color Table
Image with 256 colors
It is possible to use far fewer colors to represent a color image without much degradation.
Clusters of colors (plotted on r, g, b axes)
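A color table stores a small palette, and each pixel keeps only an index into it. The sketch below assumes the palette is already given (for example, as cluster centers) and maps each pixel to its nearest palette entry by squared RGB distance. The palette values and function names are illustrative.

```python
def nearest_index(color, palette):
    """Index of the palette entry closest to `color` in squared RGB distance."""
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(color, palette[i])),
    )

def to_indexed(image, palette):
    """Replace every (r, g, b) pixel with an index into the color table."""
    return [[nearest_index(px, palette) for px in row] for row in image]

palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]       # hypothetical table
image = [[(10, 10, 10), (250, 240, 245), (200, 30, 20)]]  # hypothetical pixels
print(to_indexed(image, palette))
```

With a 256-entry table, each pixel needs 8 bits instead of 24, plus the one-time cost of storing the table.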
Gamma Correction
• A display device’s brightness is not linearly related to its input:
I’ = I^g
• To compensate for this nonlinear distortion, we raise the value to the inverse power:
(I’)^(1/g) = I
g for a CRT is about 2.2.
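A minimal numeric check of the two formulas, assuming intensities normalized to [0, 1] and g = 2.2; the function names are illustrative. Pre-correcting a value with the exponent 1/g and then passing it through the display response returns the intended intensity.

```python
GAMMA = 2.2  # approximate value for a CRT, as stated on the slide

def display_response(i, gamma=GAMMA):
    """Brightness produced by the display for a normalized input i in [0, 1]."""
    return i ** gamma

def gamma_correct(i, gamma=GAMMA):
    """Pre-distort the intended intensity so the display output is linear."""
    return i ** (1.0 / gamma)

intended = 0.5
shown_uncorrected = display_response(intended)
shown_corrected = display_response(gamma_correct(intended))
print(round(shown_uncorrected, 3), round(shown_corrected, 3))
```

Without correction, a mid-gray input is displayed noticeably darker than intended; with correction, it is displayed at the intended level.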
Gamma Correction
Linearly increasing intensity
without gamma correction
Linearly increasing intensity
with gamma correction
Image File Formats
• An image in “ppm” format:
P6 (this is a ppm image)
Resolution: 512x512
Depth: 0-255 (8 bits per pixel in each channel)
An image file contains a header followed by a sequence of (integer) numbers.
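To make the header-plus-numbers layout concrete, the sketch below builds the bytes of a binary PPM (P6) file: an ASCII header with the magic number, width, height, and maximum value, followed by one byte per channel per pixel. The helper name `encode_ppm` and the two-pixel test image are illustrative.

```python
def encode_ppm(width, height, pixels, maxval=255):
    """Encode a flat, row-major list of (r, g, b) tuples as P6 bytes."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    header = f"P6\n{width} {height}\n{maxval}\n".encode("ascii")
    body = bytes(channel for pixel in pixels for channel in pixel)
    return header + body

data = encode_ppm(2, 1, [(255, 0, 0), (0, 0, 255)])  # one red, one blue pixel
print(len(data), "bytes")

# To save the image, write the bytes in binary mode, e.g.:
# with open("tiny.ppm", "wb") as fh:
#     fh.write(data)
```

For the 512x512 image described above, the body would contain 512 * 512 * 3 bytes after the header.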
Image Compression and Encoding
• Raw images take a lot of space. Compute the file size of a raw image that has resolution 512x512 in true color.
• BMP, PPM, TXT
• Images can be “compressed” losslessly or lossily.
• Lossy image format: JPEG
• Losslessly compressed image format: PNG
• Compression ratio and bit rate
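The file-size exercise and the two compression measures can be worked through in a few lines. The sketch assumes true color means 24 bits per pixel and ignores header bytes; the compressed size used below is a hypothetical number, not a measurement.

```python
def raw_size_bytes(width, height, bits_per_pixel=24):
    """Size of the uncompressed pixel data, ignoring any file header."""
    return width * height * bits_per_pixel // 8

def compression_ratio(raw_bytes, compressed_bytes):
    """How many times smaller the compressed file is than the raw data."""
    return raw_bytes / compressed_bytes

def bit_rate_bpp(compressed_bytes, width, height):
    """Average number of bits spent per pixel in the compressed file."""
    return compressed_bytes * 8 / (width * height)

raw = raw_size_bytes(512, 512)
compressed = 98304  # hypothetical compressed size in bytes
print(raw, "bytes raw")
print(compression_ratio(raw, compressed), "x ratio")
print(bit_rate_bpp(compressed, 512, 512), "bits per pixel")
```

The raw 512x512 true-color image occupies 786432 bytes, which is 768 KiB.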
Digital Video
Frames 0 to N-1 are ordered along the time axis.
Digital video is a digitized version of a 3D function f(x, y, t).
