
CIS 368
Introduction to Multimedia
Components of Multimedia

 Multimedia involves multiple modalities (senses) such as
– Text
– Audio
– Images
– Graphics
– Animation
– Video
Components of Multimedia

Note the dual nature of the three major
multimedia data types. Each has both a
natural and synthetic version.
– Image
– Video
– Audio

What are the differences between the
representations?
 How are they related?
Convergence in Multimedia
Multimedia and Hypermedia

A hypertext system: meant to be read
nonlinearly, by following links that point to
other parts of the document, or to other
documents
 HyperMedia: not constrained to be text-based; can include other media, e.g., graphics, images, and especially the continuous media: sound and video.
– The World Wide Web (WWW) is the best example
of a hypermedia application.
Multimedia and Hypermedia

Important events in the history of multimedia and hypermedia
– 1945: Vannevar Bush wrote a landmark article describing what amounts to a hypermedia system called Memex.
– 1965: Ted Nelson coined the term hypertext.
– 1976: The MIT Architecture Machine Group proposed a project entitled Multiple Media, which resulted in the Aspen Movie Map, the first hypermedia videodisk, in 1978.
– 1985: Negroponte and Wiesner co-founded the MIT Media Lab.
– 1989: Tim Berners-Lee proposed the World Wide Web.
Multimedia and Hypermedia
– 1991: MPEG-1 was approved as an international standard for digital video.
– 1991: The introduction of PDAs.
– 1992: JPEG was accepted as the international standard for digital image compression.
– 1993: The University of Illinois National Center for Supercomputing Applications produced NCSA Mosaic.
– 1996: DVD video was introduced.
– 1998: XML 1.0 was announced as a W3C Recommendation.
– 1998: Hand-held MP3 players first made inroads into the consumer market in the fall of 1998.
Multimedia and Hypermedia
– 1999: Napster P2P file sharing network launched.
– 2000: Mobile phones take off.
– 2003: iTunes Store launches.
– 2003: Digital camera sales exceed film camera sales for the first time.
– 2004: Flickr image sharing site launched
– 2005: YouTube video sharing site launched
– 2009: End of the transition to Digital Television
broadcasting in the USA
SMIL (Synchronized Multimedia Integration Language)

 Purpose of SMIL: to make it possible to publish multimedia presentations using a markup language (a sketch follows below).
 A multimedia markup language needs to enable scheduling and synchronization of different multimedia elements, and to define their interactivity with the user.
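As an illustration, here is a minimal SMIL sketch (element and attribute names as in SMIL 2.0/3.0; the media file names are hypothetical). It schedules a slide and its narration in parallel, followed by a video clip:

    <smil xmlns="http://www.w3.org/ns/SMIL">
      <body>
        <seq>
          <!-- show a slide while narration plays (scheduled in parallel) -->
          <par>
            <img src="slide1.jpg" dur="10s"/>
            <audio src="narration.mp3"/>
          </par>
          <!-- then play a video clip -->
          <video src="clip.mp4"/>
        </seq>
      </body>
    </smil>

The <seq> and <par> containers provide the scheduling and synchronization; attributes such as begin and dur refine the timing.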
Multimedia Systems

Key issues
– Performance
 Bandwidth
 Storage capacity
 Processing
– Quality
 Real time
 Error tolerance
 Synchronization
Multimedia Systems
Media Streams

Continuous media, especially in distributed systems, lead to the concept of media streams
 In general, communication can be
– Asynchronous
 Virtually no constraint on communication timing
– Synchronous
 Guaranteed bandwidth (bits/sec)
– Isochronous
 Guaranteed maximum jitter (the delay between two subsequent blocks varies only within a guaranteed interval; sketched below)
 The components of media streams are media units
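To make the isochronous guarantee concrete, here is a minimal Java sketch (the arrival timestamps are made up) that measures the worst-case deviation of inter-arrival gaps from the nominal gap:

    import java.util.List;

    public class JitterCheck {
        // Returns the maximum deviation (jitter) of inter-arrival gaps
        // from the nominal gap, in milliseconds.
        static long maxJitterMs(List<Long> arrivalsMs, long nominalGapMs) {
            long maxDev = 0;
            for (int i = 1; i < arrivalsMs.size(); i++) {
                long gap = arrivalsMs.get(i) - arrivalsMs.get(i - 1);
                maxDev = Math.max(maxDev, Math.abs(gap - nominalGapMs));
            }
            return maxDev;
        }

        public static void main(String[] args) {
            // Media units nominally every 40 ms (25 units/sec), with delay variation
            List<Long> arrivals = List.of(0L, 41L, 79L, 121L, 160L);
            long jitter = maxJitterMs(arrivals, 40);
            System.out.println("max jitter = " + jitter + " ms");
            System.out.println("isochronous within 5 ms? " + (jitter <= 5));
        }
    }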
Media Units
Multimedia Protocols

The main Internet communication protocols
are TCP/IP
– These were developed before streaming of
media was even a consideration

Newer protocols (on top of TCP/IP) were
developed for streaming
– RTP (and the associated protocol RTCP)
– RTSP
– RSVP
– Possibly using HTTP-based tunneling!
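For a sense of what RTP adds on top of UDP, here is a Java sketch that decodes the fields of the fixed 12-byte RTP header (field layout per RFC 3550); the packet bytes below are invented:

    public class RtpHeader {
        public static void main(String[] args) {
            // A made-up 12-byte RTP fixed header (RFC 3550 layout)
            byte[] p = { (byte) 0x80, 0x60, 0x1A, 0x2B,   // V/P/X/CC, M/PT, sequence no.
                         0x00, 0x00, 0x3E, (byte) 0x80,   // timestamp
                         0x12, 0x34, 0x56, 0x78 };        // SSRC (stream source id)
            int version = (p[0] >> 6) & 0x3;              // should be 2
            int payloadType = p[1] & 0x7F;                // e.g., 96+ for dynamic types
            int seq = ((p[2] & 0xFF) << 8) | (p[3] & 0xFF);
            long timestamp = ((p[4] & 0xFFL) << 24) | ((p[5] & 0xFFL) << 16)
                           | ((p[6] & 0xFFL) << 8) | (p[7] & 0xFFL);
            System.out.printf("v=%d pt=%d seq=%d ts=%d%n",
                              version, payloadType, seq, timestamp);
        }
    }

The sequence number lets a receiver detect loss and reordering, and the timestamp drives playback synchronization; RTCP carries the accompanying control and quality reports.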
Multimedia Protocols

The newest version of the IP protocol
(IPv6) supports multicast which is useful in
streaming applications
 Also provides extensibility for Quality of
Service (QoS) add-ons
– Bit rate
– Delay
– Jitter
– Etc.
Networks for Streaming

Video streaming services like YouTube and Netflix use Content Delivery Networks (CDNs)
 File sharing (Napster) and VoIP (Skype)
networks have used peer-to-peer (P2P)
architectures
Digital Video

Digital video is essentially a sequence of
digital images
– Processing of digital video has much in
common with digital image processing

Each image is called a frame and typical
frame rates are 24-30 frames/second
 Digital video is displayed in RGB but may
use a more complex color model for
transmission
Color Models in Video

Video Color Transforms
– Largely derive from older analog methods of coding color for TV.
Luminance is separated from color information.
– For example, a matrix transform method called YIQ is
used to transmit TV signals in North America and
Japan.
– This coding also makes its way into VHS video tape
coding in these countries since video tape technologies
also use YIQ.
– In Europe, video tape uses the PAL or SECAM
codings, which are based on TV that uses a matrix
transform called YUV.
– Finally, digital video mostly uses a matrix transform
called YCbCr that is closely related to YUV
YUV Color Model

YUV color model has one luminance
channel (Y) and two chrominance (color)
channels - U and V
 The chrominance channels represent color differences relative to the luminance: U is a scaled B − Y and V is a scaled R − Y
 Luminance represents the grayscale (black
and white) information
 For B/W television, the U and V can be
ignored
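As an illustration, a minimal Java sketch of the RGB-to-YUV transform, using the commonly quoted analog coefficients (R, G, B assumed in [0, 1]):

    public class RgbToYuv {
        public static void main(String[] args) {
            double r = 0.5, g = 0.2, b = 0.8;             // an arbitrary color
            double y = 0.299 * r + 0.587 * g + 0.114 * b; // luminance (grayscale)
            double u = 0.492 * (b - y);                   // chrominance: scaled B - Y
            double v = 0.877 * (r - y);                   // chrominance: scaled R - Y
            System.out.printf("Y=%.3f U=%.3f V=%.3f%n", y, u, v);
            // For B/W reception, only Y is used; U and V are ignored.
        }
    }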
YIQ Color Model

The YIQ color model is used in NTSC TV
 The Y is the same as in YUV
 I and Q are phase shifted from U and V to
allow for more efficient transmission
 Note that the chrominance information is
less perceptually important than the
luminance, and hence less bandwidth is
used for it
YCbCr Color Model

Finally, the YCbCr color model is used in
the Rec. 601 digital video standard.
 Cb and Cr are the chrominance components
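A minimal Java sketch of the Rec. 601 8-bit YCbCr conversion (Y in [16, 235], Cb and Cr centered at 128; R, G, B assumed in [0, 1]):

    public class RgbToYCbCr {
        public static void main(String[] args) {
            double r = 0.5, g = 0.2, b = 0.8;  // R, G, B in [0, 1]
            int y  = (int) Math.round( 16 +  65.481 * r + 128.553 * g +  24.966 * b);
            int cb = (int) Math.round(128 -  37.797 * r -  74.203 * g + 112.0   * b);
            int cr = (int) Math.round(128 + 112.0   * r -  93.786 * g -  18.214 * b);
            System.out.printf("Y=%d Cb=%d Cr=%d%n", y, cb, cr);
        }
    }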
Fundamentals of Audio Signals

 Any sound, no matter how complex, can be represented by a waveform.
 For complex sounds, the waveform is built up by the superposition of less complex waveforms.
 The component waveforms can be discovered by applying the Fourier Transform (sketched below)
– Converts the signal to the frequency domain
– Inverse Fourier Transform converts back to the time domain
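To illustrate, here is a small direct DFT in Java (O(N²), written for clarity rather than speed) that recovers the single component frequency of a sampled sine wave:

    public class Dft {
        public static void main(String[] args) {
            int n = 64;
            double[] x = new double[n];
            for (int i = 0; i < n; i++)                // a 5-cycle sine over the window
                x[i] = Math.sin(2 * Math.PI * 5 * i / n);
            for (int k = 0; k < n / 2; k++) {          // magnitude at each frequency bin
                double re = 0, im = 0;
                for (int i = 0; i < n; i++) {
                    re += x[i] * Math.cos(2 * Math.PI * k * i / n);
                    im -= x[i] * Math.sin(2 * Math.PI * k * i / n);
                }
                double mag = Math.hypot(re, im) / n;
                if (mag > 0.01)                        // only bin 5 survives
                    System.out.printf("bin %d: magnitude %.3f%n", k, mag);
            }
        }
    }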
Fundamentals of Audio Signals

 [Figure: two signals of different amplitudes]
 A greater amplitude represents a louder sound.
Fundamentals of Audio Signals

 [Figure: two signals of different frequencies]
 A greater frequency represents a higher-pitched sound.
Sampling

 Sounds can be thought of as functions of a single variable (t) which must be sampled and quantized
 The sampling rate is given in terms of samples per second, e.g., in kHz (sketched below)
– During the sampling process, an analog signal is sampled at discrete intervals
– At each interval, the signal is momentarily “held” and represents a measurable voltage level
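A minimal Java sketch of sampling a 440 Hz tone at 8 kHz (the values are computed rather than captured from hardware):

    public class Sampling {
        public static void main(String[] args) {
            double sampleRate = 8000.0;   // samples per second (8 kHz)
            double freq = 440.0;          // tone frequency in Hz
            // Take the first 5 samples at discrete intervals of 1/sampleRate seconds
            for (int i = 0; i < 5; i++) {
                double t = i / sampleRate;
                double value = Math.sin(2 * Math.PI * freq * t);
                System.out.printf("t=%.6f s  value=%+.4f%n", t, value);
            }
        }
    }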
Quantization

Audio is usually quantized at between 8 and
20 bits
– Voice data is usually quantized at 8 bits
– Professional audio uses 16 bits
– Digital signal processors will often use a 24 or
32 bit structure internally
Quantization

 The accuracy of the digital encoding can be approximated by considering the word length per sample
 This accuracy is known as the signal-to-error ratio (S/E) and is given by:
– S/E = 6n + 1.8 dB
– n is the number of bits per sample
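A minimal Java sketch of uniform (linear) quantization to n bits, together with the S/E estimate from the formula above:

    public class Quantize {
        public static void main(String[] args) {
            int n = 8;                              // bits per sample
            int levels = 1 << n;                    // 256 levels for 8 bits
            double sample = 0.3456;                 // a value in [-1, 1)
            // Map [-1, 1) onto integer codes 0..levels-1, then back
            int code = (int) Math.floor((sample + 1.0) / 2.0 * levels);
            double reconstructed = (code + 0.5) / levels * 2.0 - 1.0;
            System.out.printf("code=%d error=%.6f%n", code, sample - reconstructed);
            // S/E = 6n + 1.8 dB: about 49.8 dB at 8 bits, 97.8 dB at 16 bits
            System.out.printf("S/E at %d bits = %.1f dB%n", n, 6.0 * n + 1.8);
        }
    }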
Quantization

When a coarse quantization is used, it may be
useful to add a high-frequency signal (analog
white noise) to the signal before it is quantized
– This will make the coarse quantization less perceptible
when the signal is played back
– This technique is known as dithering
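A minimal Java sketch of the idea (the noise amplitude and distribution here are illustrative choices, not a prescribed standard):

    import java.util.Random;

    public class Dither {
        public static void main(String[] args) {
            Random rng = new Random(42);
            int n = 4;                                 // deliberately coarse quantizer
            int levels = 1 << n;
            double step = 2.0 / levels;                // quantization step for [-1, 1)
            double sample = 0.1234;
            // Add noise of roughly one step peak-to-peak before quantizing
            double dither = (rng.nextDouble() - 0.5) * step;
            int code = (int) Math.floor((sample + dither + 1.0) / 2.0 * levels);
            double out = (code + 0.5) / levels * 2.0 - 1.0;
            System.out.printf("in=%.4f out=%.4f%n", sample, out);
            // Averaged over many samples, the error now behaves like noise
            // rather than correlated distortion, which is less perceptible.
        }
    }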
Channels

 We may also have audio data coming from more than one channel
 Data from a multichannel source is usually interleaved (sketched below)
 Sampling rates are always measured per channel
– Stereo data recorded at 8000 samples/second will actually generate 16,000 samples every second
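A minimal Java sketch of interleaving, assuming equal-length left and right sample arrays:

    public class Interleave {
        // Interleave left/right samples as L0 R0 L1 R1 ...
        static short[] interleave(short[] left, short[] right) {
            short[] out = new short[left.length + right.length];
            for (int i = 0; i < left.length; i++) {
                out[2 * i] = left[i];
                out[2 * i + 1] = right[i];
            }
            return out;
        }

        public static void main(String[] args) {
            short[] l = { 10, 11, 12 }, r = { 20, 21, 22 };
            short[] mixed = interleave(l, r);
            // 8000 samples/sec per channel in stereo -> 16,000 stored samples/sec
            for (short s : mixed) System.out.print(s + " ");
            System.out.println();
        }
    }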
Digital Audio Data

 A complete description of digital audio data includes (at least):
– sampling rate;
– number of bits per sample;
– number of channels (1 for mono, 2 for stereo, etc.);
– type of quantization (linear, logarithmic, etc.)
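In Java, these parameters map directly onto the standard javax.sound.sampled.AudioFormat class; a sketch for 16-bit signed stereo at 44.1 kHz:

    import javax.sound.sampled.AudioFormat;

    public class FormatDemo {
        public static void main(String[] args) {
            // 44,100 samples/sec, 16 bits/sample, 2 channels, signed, little-endian
            AudioFormat fmt = new AudioFormat(44100f, 16, 2, true, false);
            System.out.println(fmt);
            // Frame size = channels * bytes per sample = 4 bytes here
            System.out.println("frame size: " + fmt.getFrameSize() + " bytes");
        }
    }

The type of quantization corresponds to AudioFormat.Encoding (e.g., PCM_SIGNED for linear, ULAW or ALAW for logarithmic companding).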
Some Video Formats Supported by JMF
H.261

 H.261: an early digital video compression standard; its principle of MC-based (motion-compensated) compression is retained in all later video compression standards.
– The standard was designed for videophone, video conferencing and other audiovisual services over ISDN.
– The video codec supports bit-rates of p × 64 kbps, where p ranges from 1 to 30 (hence also known as p ∗ 64).
– Requires that the delay of the video encoder be less than 150 msec so that the video can be used for real-time bidirectional video conferencing.
Video Formats Supported by H.261
H.263

 H.263 is an improved video coding standard for video conferencing and other audiovisual services transmitted on Public Switched Telephone Networks (PSTN).
 Aims at low bit-rate communications at bit-rates of less than 64 kbps.
 Uses predictive coding for inter-frames to reduce temporal redundancy and transform coding for the remaining signal to reduce spatial redundancy (for both intra-frames and inter-frame prediction).
Video Formats Supported by H.263
MPEG

 MPEG: Moving Pictures Experts Group, established in 1988 for the development of digital video.
 It is appropriately recognized that proprietary interests need to be maintained within the family of MPEG standards:
– Accomplished by defining only a compressed bitstream that implicitly defines the decoder.
– The compression algorithms, and thus the encoders, are completely up to the manufacturers.
MPEG-1

 MPEG-1 adopts SIF (Source Input Format), derived from the CCIR601 digital TV format.
 MPEG-1 supports only non-interlaced video. Normally, its picture resolution is:
– 352 × 240 for NTSC video at 30 fps
– 352 × 288 for PAL video at 25 fps
– It uses 4:2:0 chroma subsampling
 The MPEG-1 standard is also referred to as ISO/IEC 11172. It has five parts: 11172-1 Systems, 11172-2 Video, 11172-3 Audio, 11172-4 Conformance, and 11172-5 Software.
Some Audio Formats Supported by JMF
MP3 File Format

 MP3 files do not have a global file header (so you can start playing/processing anywhere in the file)
– They consist of a sequence of frames
– Each frame has a header followed by audio data
– JMF 2.1.1 supports MP3 audio only on Windows
MP3 File Format

 ID3 is a metadata container most often used in conjunction with the MP3 audio file format.
 Allows information such as the title, artist, album, track number, year, genre, and other information about the file to be stored in the file itself.
 ID3v1 tags occupy the last 128 bytes of the file (sketched below)
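A minimal Java sketch that reads an ID3v1 tag (the 128-byte trailer begins with the ASCII marker "TAG"; the field offsets below follow the ID3v1 layout, and the file name is hypothetical):

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.charset.StandardCharsets;

    public class Id3v1Reader {
        public static void main(String[] args) throws IOException {
            try (RandomAccessFile f = new RandomAccessFile("song.mp3", "r")) {
                if (f.length() < 128) { System.out.println("file too small"); return; }
                byte[] tag = new byte[128];
                f.seek(f.length() - 128);          // ID3v1 lives in the last 128 bytes
                f.readFully(tag);
                String marker = new String(tag, 0, 3, StandardCharsets.US_ASCII);
                if (!marker.equals("TAG")) { System.out.println("no ID3v1 tag"); return; }
                // Fixed-width fields: title(30) artist(30) album(30) year(4) ...
                System.out.println("title:  " + new String(tag,  3, 30, StandardCharsets.US_ASCII).trim());
                System.out.println("artist: " + new String(tag, 33, 30, StandardCharsets.US_ASCII).trim());
                System.out.println("album:  " + new String(tag, 63, 30, StandardCharsets.US_ASCII).trim());
                System.out.println("year:   " + new String(tag, 93,  4, StandardCharsets.US_ASCII).trim());
            }
        }
    }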
Supported on all platforms

 AIFF, AU, AVI, GSM, MIDI, MP2, QT, RMF, WAV