Media Types





Text
Image
Graphics
Audio
Video
Text
Representation
ASCII
ISO Character Sets
Marked-up Text
Structured Text
Hypertext
Operations
Character Operations
String Operations
Editing
Formatting
Pattern-matching & searching
Sorting
Compression
Encryption
Language-specific operations
Text - Representation

ASCII




7-bit code
128 values in ASCII character set
use of 8th bit in text editors/word processors creates
incompatibility
ISO character sets


extended ASCII to support non-English text
ISO Latin provides support for accented characters



à, ö, ø, etc.
ISO sets include Chinese, Japanese, Korean & Arabic
UNICODE


16 bit format
65,536 different symbols (2^16)
Text - Representation

Marked-up text
nroff, troff
LaTeX
SGML, and languages derived from it : HTML, HyTime, XML, XSL, XLL
Structured Text
structure of text represented in a data structure, usually tree-based
ODA, structure embedded in byte-stream with content
Hypertext



non-linear
graph or “web” structure : nodes and links
currently subject of intensive ISO standards activity
Text - Operations

Character operations
basic data type with assigned value
permits direct character comparison (a<b)
String operations
comparison
concatenation
substring extraction and manipulation
Editing



perhaps the most familiar set of operations on text
cut/copy/paste
strings v. blocks, dependent on document structure
Text - Operations

Formatting
interactive or non-interactive (WYSIWYG v. LaTeX)
formatted output : bitmap, page description language (PostScript, PDF)
font management : typeface, point size (1 point = 1/72 of an inch)
TrueType fonts : geometric description + kerning
Pattern-matching and Searching




search and replace
wildcards
regular expressions
for large bodies of text, or text databases, use of inverted
indices, hashing techniques and clustering.
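A minimal sketch of an inverted index, the structure mentioned above for searching large bodies of text. The function names and the sample documents are illustrative assumptions, and words are split on whitespace only.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each lowercase word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

def search(index, *words):
    """Return the ids of documents that contain every query word."""
    postings = [set(index.get(w.lower(), ())) for w in words]
    return sorted(set.intersection(*postings)) if postings else []

docs = ["text is a media type",
        "image is a media type",
        "hypertext links text nodes"]
index = build_inverted_index(docs)
```

A lookup such as search(index, "media", "image") only touches the postings for those two words rather than scanning every document.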
Text - Operations

Sorting



numerous varieties of sort, all of them extensively studied in
basic programming
sort complexity is a major factor in data handling
performance
Compression



ASCII uses 7 bits per character, though most word-processors
actually use the 8th bit to use up a byte per character
Information theory estimates 1-2 bits per character to be sufficient
for natural language text
This redundancy can be removed by encoding :
- Huffman : varies the number of bits used to represent
  characters, shortest codes for highest frequency characters
- Lempel-Ziv : identifies repeating strings and replaces them by
  pointers to a table
- Both techniques compress English text at a ratio of between 2:1
  and 3:1
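The Huffman idea can be sketched as follows: a small illustration that computes only the code length assigned to each character, which is enough to compare the encoded size with 7 bits per character. The function names are assumptions made for this example.

```python
import heapq
from collections import Counter

def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    freq = Counter(text)
    if len(freq) == 1:                      # degenerate case: one distinct symbol
        return {next(iter(freq)): 1}
    # Heap entries are (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)      # the two least frequent subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def encoded_bits(text):
    """Total number of bits needed to encode text with its own Huffman code."""
    lengths = huffman_code_lengths(text)
    return sum(lengths[ch] for ch in text)
```

For the string "aaaabbc" the most frequent character gets a 1-bit code and the other two get 2-bit codes, so the text needs 10 bits instead of 7 x 7 = 49.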
Text - Operations

Encryption


text encryption is widely used in electronic mail and
networked information systems
most widely-used techniques : DES, RSA public-key, PGP
subject of major controversy : key escrow systems, Clipper chip
“strong” encryption now being legally outlawed in a number of
countries
Language-specific operations



spell-checking
parsing and grammar checking
style analysis
Image
Representation
Colour Model
Alpha Channels
Number of Channels
Channel Depth
Interlacing
Indexing
Pixel Aspect Ratio
Compression
Operations
Editing
Point operations
Filtering
Compositing
Geometric transformations
Conversion
Image - Representation

Colour Model

2 main types



colour production on output device
theory of human colour perception
CIE colour space


international standard used to calibrate other
colour models
developed in 1931, as CIE XYZ, based on
tristimulus theory of colour specification
Image - Representation

RGB
numeric triple specifying red, green and blue intensities
convenient for video display drivers since numbers can be easily
mapped to voltages for RGB guns in colour CRTs
HSB
Hue - dominant colour of sample, angular value varying from red to
green to blue at 120° intervals
Saturation - the intensity of the colour
Brightness - the amount of gray in the colour
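As an illustration of the relationship between the two models, the sketch below converts an 8-bit RGB triple to hue, saturation and brightness using Python's standard colorsys module (whose HSV model corresponds to HSB). The wrapper name rgb_to_hsb is an assumption for this example.

```python
import colorsys

def rgb_to_hsb(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation 0..1, brightness 0..1)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360) % 360, round(s, 3), round(v, 3)
```

Pure red, green and blue come out at hues of 0, 120 and 240 degrees, matching the 120° spacing described above.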
CMYK
displays emit light, so produce colours by adding red, green and
blue intensities
paper reflects light, so to produce a colour on paper one uses inks
that subtract all colours other than the one desired
printers use inks corresponding to the subtractive primaries,
cyan, magenta and yellow (complements of RGB)
additionally, since inks are not pure, a special black ink is used
to give better blacks and grays
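A naive sketch of the complement relationship between RGB and CMYK. It ignores real ink and paper characteristics, so it is an idealised illustration rather than what a printer driver does; the function name is an assumption.

```python
def rgb_to_cmyk(r, g, b):
    """Idealised conversion of 8-bit RGB to CMYK fractions in the range 0..1."""
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255   # subtractive complements
    k = min(c, m, y)                                   # share taken over by black ink
    if k == 1:                                         # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    return tuple(round((x - k) / (1 - k), 3) for x in (c, m, y)) + (round(k, 3),)
```

Red, for example, needs no cyan but full magenta and yellow, since those two inks together absorb green and blue.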
YUV
- colour model used in the television industry
- also YIQ, YCbCr, and YPbPr
- Y represents luminance, effectively the black-and-white
  portion of a video signal
- UV are colour difference signals, form the colour portion of a
  video signal, and are called chrominance or chroma
- YUV makes efficient use of bandwidth as the human eye has
  greater sensitivity to changes in luminance than chrominance,
  so bandwidth can be better utilised by allocating more to
  luminance and less to chrominance
Alpha Channels
images may have one or more alpha channels defining regions
of full or partial transparency
can be used to store selections and to create masks and blends

Number of channels
- the number of pieces of information associated with each pixel
- usually the dimensionality of the colour model plus the number of
  alpha channels

Channel depth
- number of bits per pixel used to encode the values of one channel
- commonly 1, 2, 4 or 8 bits, less commonly 5, 6, 12 or 16 bits
- in a multiple channel image, different channels can have different
  depths

Interlacing
- storage layout of a multiple channel image could separate channel
  values (all R values, followed by all G, followed by all B) or could
  use interlacing (all RGB for pixel 1, all RGB for pixel 2, ...)
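The two storage layouts can be illustrated with a short sketch that converts between separate channel planes and the per-pixel layout. The function names and the flat-list representation are assumptions for this example.

```python
def interleave(planes):
    """Turn separate channel planes [[R...], [G...], [B...]] into per-pixel order."""
    return [value for pixel in zip(*planes) for value in pixel]

def deinterleave(samples, channels):
    """Split per-pixel samples back into one list per channel."""
    return [samples[c::channels] for c in range(channels)]

planes = [[1, 2], [3, 4], [5, 6]]      # two pixels: R plane, G plane, B plane
packed = interleave(planes)            # R1 G1 B1 R2 G2 B2
```

Both layouts hold the same values; only the order in which they are stored differs.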
Image - Representation

Indexing
- pixel colours can be represented by an index in a colour map or a
  colour lookup table (CLUT)

Pixel aspect ratio
- ratio of pixel width to height
- square pixels are simple to process, but some displays and scanners
  work with rectangular pixels
- if the pixel aspect ratios of an image and a display differ the image
  will appear stretched or squeezed

Compression
- a page-sized 24-bit colour image produced by a scanner at 300dpi
  takes up about 20 Mbytes
- many image formats compress pixel data, using run-length coding,
  LZW, predictive coding and transform coding
- many image formats : JPEG, GIF, TIFF, BMP most widely used
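A small sketch of the arithmetic behind the raw image size, followed by run-length coding, the simplest of the techniques listed. The 8 x 10 inch scan area is an assumption (the page size is not stated above), and the function names are illustrative.

```python
def raw_image_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a scanned area."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8

def run_length_encode(pixels):
    """Encode a sequence of pixel values as [(value, run length), ...]."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1] = (p, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((p, 1))               # start a new run
    return runs

def run_length_decode(runs):
    """Expand [(value, run length), ...] back into the pixel sequence."""
    return [value for value, count in runs for _ in range(count)]

size = raw_image_bytes(8, 10, 300, 24)        # assumed 8 x 10 inch scan area
```

Under that assumption the raw size is 21.6 million bytes, in line with the figure of about 20 Mbytes. Run-length coding pays off on images with long runs of identical pixels, such as scanned line art.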
Image - Operations


These operations can operate directly on pixel data or
on higher-level features such as edges, surfaces and
volumes
Operations on higher-level features fall into the
domain of image analysis and understanding and will
not be considered here

Editing



changing individual pixels for image touch-up, forms the basis
of airbrushing and texturing
cutting, copying and pasting are supported for groups of pixels,
from simple shape manipulation through to more complex
foreground and background masking and blending
Point operations
consists of applying a function to every pixel in an image
only uses the pixel's current value, neighbouring pixels cannot
be used
Thresholding
a pixel is set to 1 or 0 depending on whether it is above or
below a threshold value - creates binary images which are often
used as masks when compositing
Colour Correction
modifying the image to increase or reduce contrast, brightness,
gamma effects, or to strengthen or weaken particular colours
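A point operation can be sketched as a function applied to each pixel independently. The example below assumes a greyscale image stored as a list of rows; the helper names are illustrative.

```python
def point_operation(image, fn):
    """Apply fn to every pixel of a 2D list-of-lists image independently."""
    return [[fn(p) for p in row] for row in image]

def threshold(image, level):
    """Binary image: 1 where the pixel is above level, otherwise 0."""
    return point_operation(image, lambda p: 1 if p > level else 0)

def adjust_brightness(image, offset, max_value=255):
    """Add offset to every pixel, clamping the result to 0..max_value."""
    return point_operation(image, lambda p: min(max_value, max(0, p + offset)))
```

Because each output pixel depends only on the corresponding input pixel, the pixels can be processed in any order.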
Filtering


like point operations, operate on every pixel in an image, but
use values of neighbouring pixels as well
used to blur, sharpen or distort images, producing a variety
of special effects
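In contrast to a point operation, a filter reads the neighbours of each pixel. A minimal sketch of a 3x3 mean (blur) filter on a greyscale image stored as a list of rows; averaging only the neighbours that exist at the edges is one assumed edge policy among several.

```python
def box_blur(image):
    """3x3 mean filter; edge pixels average only the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(window) / len(window)
    return out

blurred = box_blur([[0, 0, 0],
                    [0, 9, 0],
                    [0, 0, 0]])
```

The single bright pixel is spread over its neighbourhood, which is the blurring effect described above.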
Image - Operations



Compositing
- the combining of two or more images to produce a new image
- generally done by specifying mathematical relationships between
  the images
Geometric Transformations
- basic transformations involve displacing, rotating, mirroring or
  scaling an image
- more advanced transformations involve skewing and warping
  images
Conversions
- conversions between image formats are commonplace and a
  number of public-domain, shareware and commercial tools exist to
  support these
- other forms of conversion include compression and decompression,
  changing colour models, and changing image depth and resolution
Graphics
Representation
Geometric Models
Solid Models
Physically-based Models
Empirical Models
Drawing Models
External formats for Models
Operations
Primitive Editing
Structural Editing
Shading
Mapping
Lighting
Viewing
Rendering
Graphics - Representation


The central notion of graphics, as opposed to image
data, is in the rendering of graphical data to produce
an image. A graphics type or model is therefore the
combination of a data type plus a rendering operation
Graphics Representation

Please note - object in graphics modelling usually refers to
an element of the scene being modelled, unless you are
using object-oriented graphics programming

Geometric Models



consist of 2D and/or 3D geometric primitives
2D primitives include lines, rectangles, ellipses plus more
general polygons and curves
3D primitives include the above plus surfaces of various forms.
Curves and curved surfaces described by parameterised
polynomials
Graphics - Representation




primitives are first described in local or object co-ordinates,
then arranged in groups in a common world co-ordinate
system by applying modelling transformations
transformations include rotation, translation and scaling
primitives can be used to build structural hierarchies, allowing
each structure thus created to be broken down into lower-level
structures and primitives (i.e. blueprinting)
Several standard device-independent graphics libraries are
based on geometric modelling
GKS (Graphical Kernel System (ISO))
PHIGS (Programmer's Hierarchical Interactive Graphics System
(ISO)) - see also PHIGS+ and PEX
OpenGL - portable version of Silicon Graphics library
Solid Models
Constructive Solid Geometry (CSG) : solid objects are combined
using the set operators union, intersection and difference.
Surfaces of revolution : a solid is formed by rotating a 2D curve
about an axis in 3D space - lathing
Extrusion : a 2D outline is extended in 3D space along an
arbitrary path
Using the above techniques will produce models much faster
than building them up from geometric primitives, but rendering
them will be expensive
Physically-based Models
realistic images can be produced by modelling the forces,
stresses and strains on objects
when one deformable object hits another, the resulting shape
change can be numerically determined from their physical
properties
Empirical Models
complex natural phenomena (clouds, waves, fire, etc.) are
difficult to describe realistically using geometric or solid
modelling
while physically based models are possible, they may be
computationally expensive or intractable
the alternative is to develop models based on observation rather
than physical laws; such models do not embody the underlying
physical processes that cause these phenomena but they do
produce realistic images
fractals, probabilistic graph grammars (used for branching plant
structures) and particle systems (used for fires and explosions)
are examples of empirical models
Drawing Models
describing an object in terms of drawing or painting actions
the description can be seen as a sequence of commands to an
imaginary drawing device - PostScript, LOGO turtle graphics
External formats for Models


need for export/import formats between graphics packages
CGM and CAD formats are suitable for interchange; PostScript and RIB are render-only
Graphics - Operations

Primitive editing
- specifying and modifying the parameters associated with the model
  primitives
- e.g. specify the type of a primitive and the vertex coordinates and
  surface normals

Structural editing
- creating and modifying collections of primitives
- establish spatial relationships between members of collections

Shading
- the modelling techniques described so far have provided the means
  to specify the shape of objects, but shading provides further
  information for the image in describing the interaction of light with
  the object. This interaction is described in terms of the colour of an
  object, how it reflects light and whether it transmits light
Graphics - Operations

several general-purpose methods exist to describe shading,
most initially describe the surface of the object using meshes of
small, polygonal surface patches






flat shading - each patch is given a constant colour
Gouraud shading - colour information is interpolated across a patch
Phong shading - surface normal information is interpolated across a
patch
Ray tracing & Radiosity - physical models of light behaviour are used
to calculate colour information for each patch, giving highly realistic
results
for photorealistic images extremely flexible shading is required,
tools such as RenderMan actually provide programmable shaders
which can be attached to objects, simulating different light
effects and surface normals.
Mapping

techniques for enhancing the visual appearance of objects
Graphics - Operations

Texture mapping
an image, the texture map, is applied to a surface
requires a mapping from 3D surface coordinates to 2D image
coordinates, so given a point on the surface the image is sampled
and the resulting value used to colour the surface at that point
shaders can also provide solid textures, where the texture is
obtained from 3D rather than 2D space, and procedural textures,
where the texture is calculated rather than sampled
Bump mapping
as texture mapping, but used to change the normal vector of the
surface rather than the colour
used to describe minor surface changes such as scratches or
scrapes
Displacement mapping


local modifications to the position of a surface
produces ridges or grooves
Graphics - Operations

Environment mapping
also known as reflection mapping, used to handle limited forms of
reflection
more primitive technique than ray-tracing
Shadow mapping
similar to environment mapping in that it provides a primitive
lighting effect without the expense of ray-tracing
produces shadows
Lighting
within a model, in addition to the graphics objects, there are
lights to illuminate the scene. There are various forms of light
source, each of which can be parametrically specified
ambient light - background lighting, comes from all directions with
equal intensity
point lights - come from specific points in space, intensity governed
by inverse square law
directional lights - located at infinity in some direction, intensity is
constant
spot lights - illuminating a cone-shaped volume
Viewing
to produce an image of a 3D model we require a transformation
which projects 3D world coordinates onto 2D image coordinates
transformation applied to viewing volume, that part of the
model that appears in the image
view specification consists of selecting the projection
transformation, usually from parallel or perspective projections
although camera attributes can be specified in some
renderers, and the view volume
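The two projection types can be sketched for a single point. The camera is assumed to sit at the origin looking along the positive z axis, which is a simplification chosen for this example; real renderers also clip against the view volume.

```python
def project_perspective(point, focal_length=1.0):
    """Project a 3D point (x, y, z) with z > 0 onto the image plane z = focal_length."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)

def project_parallel(point):
    """Orthographic projection along the z axis: drop the depth coordinate."""
    x, y, _ = point
    return (x, y)
```

Under perspective projection a point twice as far away lands half as far from the image centre; under parallel projection distance has no effect.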
Rendering


rendering converts a model, including shading, lighting and
viewing information, into an image
software allows selection and fine-tuning of control parameters
Graphics - Operations


output resolution - the width and height of the output
image in pixels, and the pixel depth
rendering time - quick and low-quality v. slow and high
resolution
Digital Video
Representation
Analog formats sampled
Sampling rate
Sample size and quantisation
Data rate
Frame rate
Compression
Support for interactivity
Scalability
Operations
Storage
Retrieval
Synchronisation
Editing
Mixing
Conversion
Digital Video - Representation

Analog formats sampled

Digital video frames can be obtained in two ways :
Synthesis - usually by a computer program
Sampling - of an analog video signal. Since analog video
comes in various different flavours, according to frame rate,
scan rate and composite v. component encoding, the sampling
rate and sample size vary.
Digital Video - Representation

Sampling rate




the value of the sampling rate determines the storage
requirement and data transfer rate
the lower limit for the frequency at which to sample in order to
faithfully reproduce the signal, the Nyquist rate, is twice the
highest frequency within the signal
video processing is simplified if each frame and each scan line
give rise to the same number of samples, requiring the sampling
frequency to be an integer multiple of the scan rate
Sample size and quantisation



sample size is the number of bits used to represent sample
values
quantisation refers to the mapping from the continuous range of
the analog signal to discrete sample values
choice of sample size is based on :
signal to noise ratio of sampled signal
sensitivity of medium used to display frames
sensitivity of the human eye
digital video commonly uses linear quantisation, where
quantisation levels are evenly distributed over the analog range
(as opposed to logarithmic quantisation)
Data rate
high data rate formats can be reduced to lower data rates by a
combination of : compression, reducing horizontal and vertical
resolution, and reducing the frame rate
for example :
start with broadcast quality digital video at 10 Mbytes/s
divide the horizontal and vertical resolutions by 2, giving VHS
quality resolution (a quarter of the pixels)
divide the frame rate by 2
compress at a rate of 10:1
data rate becomes 0.125 Mbytes/s, i.e. 1 Mbit/s, suitable for use
on LANs and on optical storage devices (e.g. CD-ROM)
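The arithmetic of this example can be checked with a short sketch. The function and parameter names are assumptions; the figures are those given above.

```python
def reduced_data_rate(bytes_per_s, resolution_divisor, frame_rate_divisor,
                      compression_ratio):
    """Data rate in bits per second after the three reductions."""
    pixel_factor = resolution_divisor ** 2      # width and height both shrink
    bytes_out = bytes_per_s / (pixel_factor * frame_rate_divisor * compression_ratio)
    return bytes_out * 8

rate = reduced_data_rate(10_000_000, 2, 2, 10)  # 10 Mbytes/s source
```

Halving both dimensions removes three quarters of the pixels, so the combined reduction is 4 x 2 x 10 = 80, taking 10 Mbytes/s down to 125 kbytes/s, which is 1 Mbit/s.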
Digital Video - Representation

Frame rate
25 or 30 fps equates to analog frame rate, or full-motion video
at 10-15 fps motion is less accurately depicted and the image
flickers, but the data rate is much reduced
Compression
we have already considered compression techniques, in digital
video we can compare methods by three factors :
Lossy v. lossless
Real-time compression - trade-off between symmetric models and
asymmetric models with real-time decompression
Interframe (relative) v. Intraframe (absolute) compression (e.g.
MPEG-1 v. Motion JPEG)
Support for interactivity



random access to frames
differential rate and reverse playback
cut and paste capability
Digital Video - Representation

Scalability

scalable video allows control over video quality, we can identify
2 forms :



Transmit scalability - encoded data rate is chosen at compression
time from a range of rates, governed by transmission and
processing constraints and/or storage capacity. Currently in use for
low rate digital video
Receive scalability - decoded data rate is chosen at
decompression time to match playback requirements. Attractive
concept but not yet available in current video coding standards
current approaches to low rate digital video include :


DVI (Digital Video Interactive) - two forms, Production Level
Video (PLV) and Real-Time Video (RTV). PLV only really intended
for playback, RTV produces poorer quality but is intended for
compression. Both use interframe compression to achieve rates of
1Mbit/s, but require costly hardware.
MPEG-1 - 1Mbit/s
Digital Video - Representation






MPEG-2 - broadcast quality video at rates between 2 and 15 Mbit/s
MPEG-4 - low data rate video
MPEG-7 - metadata standard for video representation
Motion JPEG
px64 (CCITT H.261) - intended for video applications using
ISDN (Integrated Services Digital Network). Known as px64
since it produces rates that are multiples of ISDN's 64 kbit/s
B channel rate. Uses similar techniques to MPEG but, since
compression and decompression must be real-time, quality
tends to be poorer.
H.263 - based on H.261, but offers 2.5 times greater
compression, uses MPEG-1 and MPEG-2 techniques.
Digital Video - Operations

Storage


to record or playback digital video in real-time, the storage system
must be capable of sustaining data transfer at the video data rate
4 main forms of storage for digital video are :
- Magnetic tape - at present only magnetic tape can provide the
  very high capacity storage required for digital video at practical
  costs (1 hour of CCIR 601 4:2:2 uses 72 Gbytes, while 1 hour
  of digital HDTV requires nearly 1 Tbyte)
- Special purpose magnetic storage systems - useful for short
  durations of high data rate digital video, can be connected
  direct to external equipment and are thus useful for capture
  and editing (see diagram)
- Video memory boards - specialist boards with large amounts of
  semiconductor memory (several hundred Mbytes or more),
  capable of storing short durations of uncompressed digital
  video, useful for capture and editing.
Digital Video - Operations


General purpose magnetic and optical storage systems - most low
data rate video representations (MPEG, etc.) were designed to
support the use of conventional storage media for real-time video
playback. Problem is size of storage, even using MPEG-1 13 minutes
of video will fill a 100Mbyte disk.
Retrieval

uses frame addressing, as in analog video, but there are some
problems :


low data rate formats result in variable sized frames, so an index
giving frame offsets needs to be maintained to support random
access
interframe compression techniques, e.g. MPEG, only code key frames
independently, other frames are derived from these key frames. So
random access must first find the nearest preceding key frame and
then decode forward from it to the desired frame, again using the
index but enhancing it with key frame locations
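A minimal sketch of such an index. It assumes every non-key frame depends only on earlier frames, which is a simplification (real MPEG streams also contain frames predicted from later ones). The class and method names are hypothetical.

```python
from bisect import bisect_right

class FrameIndex:
    """Byte offsets of every frame plus the sorted list of key frame numbers."""

    def __init__(self, frame_sizes, key_frames):
        self.offsets = []
        position = 0
        for size in frame_sizes:            # frames have variable sizes
            self.offsets.append(position)
            position += size
        self.key_frames = sorted(key_frames)

    def seek_plan(self, frame):
        """Return (key frame to start from, its byte offset, frames to decode)."""
        i = bisect_right(self.key_frames, frame) - 1
        if i < 0:
            raise ValueError("no key frame at or before the requested frame")
        key = self.key_frames[i]
        return key, self.offsets[key], frame - key + 1

idx = FrameIndex([900, 100, 120, 80, 950, 110], key_frames=[0, 4])
```

Seeking to frame 3 therefore means reading from the offset of key frame 0 and decoding four frames, while frame 5 only needs the two frames starting at key frame 4.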
Digital Video - Operations

Synchronisation



suffers same problems as analog video, so uses same
techniques
digital video also has some additional techniques not available in
analog video, such as changing resolution to maintain frame rate
Editing

2 types :



tape-based - same procedures as with analog video, except no
generation loss and the players are on the same machine
nonlinear - basically a clips-library, using cut and paste techniques
to build a video sequence
Mixing

real-time effects, such as tumbles, wipes and fades, are
calculated in the same way as for analog video, in fact for the
majority of such effects whether the original source is analog or
digital, the effects are digitised
Digital Video - Operations


non-real-time effects are only possible using digital video,
and obviate the need for specialist equipment, being only
dependent on the speed of the processor and the patience of
the user, storage considerations can be overcome with the
use of pointers and single frame editing
Conversion



the variety of formats demands conversion between formats
real-time conversion requires specialist hardware
compression/decompression within a single format also
requires specialist software/hardware
Digital Audio
Representation
Sampling frequency
Sample size and quantisation
Number of channels (tracks)
Interleaving
Negative samples
Encoding
Operations
Storage
Retrieval
Editing
Effects and filtering
Conversion
Digital Audio - Representation

Digital Audio Representation

2 main areas :




telecommunications
entertainment (audio CD)
Produced by sampling a continuous signal generated by a sound
source. An analog-to-digital converter (ADC) takes as input
an electrical signal corresponding to the sound and converts it
into a digital data stream. The reverse process, to generate the
sound through an amplifier and speakers, involves a
digital-to-analog converter (DAC)
Sampling frequency (rate)

sampling theory shows that a signal can be reproduced without
error from a set of samples, providing the sampling frequency is
at least twice the highest frequency present in the original signal
Digital Audio - Representation



telephone networks allocate a 3.4kHz bandwidth to voice-grade
lines, thus a sampling rate of 8kHz is used for digital
telecommunications
the human ear is sensitive to frequencies of up to about 20kHz,
so to digitise any perceivable sound a sampling rate of over
40kHz is required
Sample size and quantisation



during sampling, the continuously varying amplitude of the
analog signal is approximated by digital values, this introduces a
quantisation error, being the difference between the actual
amplitude and the digital approximation
quantisation error is apparent when the signal is reconverted to
analog form as distortion, a loss in audio quality
quantisation error can be reduced by increasing the sample size,
as allowing more bits per sample will improve the accuracy of
the approximation
Digital Audio - Representation

quantisation refers to breaking the continuous range of the
analog signal into a number of unique digital intervals, based on
one of a number of schemes :



linear quantisation - uses equally spaced intervals, so if the sample
size is 3 bits and the maximum signal variation is 5.0 then the
quantisation interval would be 0.625 units of signal amplitude
nonlinear quantisation (especially logarithmic quantisation) - uses
non-equally spaced intervals, lower amplitude intervals are more
closely spaced than higher amplitude, results in greater sensitivity
to lower amplitude sound where the human ear is most sensitive
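Both schemes can be sketched in a few lines. The linear part reproduces the 0.625 interval from the example above; for the logarithmic part, mu-law companding is used as one well-known instance, which is an assumption since no particular curve is named here.

```python
import math

def quantisation_interval(signal_range, bits):
    """Width of one interval when signal_range is split into 2**bits equal steps."""
    return signal_range / (2 ** bits)

def quantise_linear(x, lo, hi, bits):
    """Map x in [lo, hi] to an integer level 0 .. 2**bits - 1 using equal intervals."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    return min(levels - 1, int((x - lo) / step))   # x == hi falls in the top level

def compress_mu_law(x, mu=255):
    """mu-law companding of x in [-1, 1], applied before linear quantisation."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
```

Companding stretches small amplitudes (an input of 0.1 maps to roughly 0.59), so that after linear quantisation the quiet part of the signal occupies more of the available levels.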
Number of channels (tracks)




speech quality audio is mono (1 track)
stereo audio requires 2 tracks
some consumer audio equipment uses 4 tracks (quadraphonic)
professional audio equipment uses 16, 32 or more
Digital Audio - Representation

Interleaving




a multi-channel audio value can be encoded by interleaving
channel samples or by providing separate streams for each
channel
the advantage of interleaving is in synchronisation, and it also
offers some benefits in storage and transmission
the disadvantages of interleaving are that it can be wasteful of
space or bandwidth if not all channels are needed, it freezes the
synchronisation between channels thus preventing temporal
shifts, and it may not allow variation in the number of channels
Negative samples


the voltages found in analog audio signals alternate between
positive and negative values
negative values can be encoded successfully for processing in
twos complement, ones complement or sign-magnitude
representation
Digital Audio - Representation

Encoding


encoding audio data reduces storage and transmission costs,
and compressed audio also provides better quality when
compared to uncompressed audio at the same data rate
2 commonly-used methods :
- PCM (Pulse Code Modulation) - uses the fact that a
  digital signal can be formed from a series of pulses. PCM
  values are simply sequences of uncompressed samples,
  so they provide a reference format for comparison with
  more complex coding methods
- ADPCM (Adaptive Differential Pulse Code Modulation) -
  reduces PCM data rate by encoding the differences
  between samples. ADPCM is widely used and is
  associated with some encoding standards, such as CCITT
  G.721.
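The idea of encoding differences between samples can be illustrated with plain, non-adaptive delta coding. This is a deliberately simplified stand-in: real ADPCM also quantises the differences and adapts the step size, which is not shown here.

```python
def delta_encode(samples):
    """Store the first sample, then each difference from the previous sample."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

pcm = [100, 102, 105, 104, 104]
deltas = delta_encode(pcm)
```

Because neighbouring audio samples tend to be close in value, the differences are small numbers that can be stored in fewer bits than the samples themselves.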
Digital Audio - Operations

Storage




it is possible to record digital audio, even at the data rates of the
high quality formats, on general purpose magnetic storage
theoretically, a magnetic disk with a sustainable transfer rate of
5 Mbytes per second could playback 50 channels of CD-quality
digital audio. In practice this would not be possible without a
highly optimised layout, but one or two channels are easily
within the reach of small computer systems
since an hour of stereo digital audio, at the CD data rate,
requires over half a Gigabyte of storage, tertiary storage in the
form of DAT tapes, CD discs or optical disks is normally adopted,
with the information being mounted onto the system manually
or through a jukebox
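The storage figures can be checked with a short calculation, using the audio CD parameters of 44.1 kHz sampling and 16 bits per sample. The function name is an assumption for this example.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * bits_per_sample // 8 * channels

cd_stereo = pcm_bytes_per_second(44_100, 16, 2)        # bytes per second
hour_bytes = cd_stereo * 3600                          # one hour of stereo
mono_channels = 5_000_000 // pcm_bytes_per_second(44_100, 16, 1)
```

An hour of stereo comes to about 635 million bytes, and a 5 Mbytes/s disk could in theory feed 56 mono channels, consistent with the rough figure of 50 given above.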
Retrieval

need to support random access and ensure continuous flow of
data to DAC
Digital Audio - Operations




portions of audio sequences, segments, are identified by their
starting time and duration; these can be located by mapping
the starting time to a segment address, which the file system
then maps to a physical address on disk
where there is no direct mapping to enable segment location by
time code, an index of segments must be separately maintained
continuous flow of data is easy to maintain with a dedicated
storage system, but requires careful control where storage is
scheduled for a number of such tasks
Editing

as with digital video, 2 types :



tape-based
disk-based
to avoid audible clicks when inserting one sample into another,
cross-fades are used, where the amplitudes of the original
segment and the inserted segment are added and scaled about
the insertion point
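A minimal sketch of a linear cross-fade over the overlap region, assuming both segments are given as equal-length lists of sample values. Other fade curves are possible; linear weights are simply the easiest to show.

```python
def crossfade(outgoing, incoming):
    """Linearly fade from outgoing to incoming over equal-length sample lists."""
    if len(outgoing) != len(incoming):
        raise ValueError("overlap regions must have the same length")
    n = len(outgoing)
    if n == 1:
        return [(outgoing[0] + incoming[0]) / 2]
    # The weight on the incoming segment rises from 0 to 1 across the overlap.
    return [a * (1 - i / (n - 1)) + b * (i / (n - 1))
            for i, (a, b) in enumerate(zip(outgoing, incoming))]

mixed = crossfade([1.0] * 5, [0.0] * 5)
```

The scaled sum changes gradually across the overlap, so there is no abrupt jump in amplitude to be heard as a click.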
Digital Audio - Operations



digital audio also supports non-destructive editing, where the segments
of data are accessed through a data structure known as a play-list,
which essentially contains a set of pointers to the data and details on
ordering and other forms of edit to be performed on the data when it is
joined
Effects and filtering
- digital filtering techniques permit a number of effects on audio :
  - Delay
  - Equalisation & Normalisation
  - Noise reduction & Time compression and expansion
  - Pitch shifting
  - Stereoisation
  - Acoustic environments
Conversion
- one format to another (e.g. uncompressing ADPCM -> PCM)
- altering encoding parameters (e.g. resampling at lower frequency)
Music
Representation
Operational v. Symbolic
MIDI
SMDL
Operations
Playback & Synthesis
Timing
Editing & Composition
Music - Representation



The existence of powerful, low-cost, digital signal
processors means that many computers can now record,
generate and process music.
Music is also widely used in multimedia applications, so we
require a media type for music to focus on the computer's
musical capabilities.
Representation of Music

Operational v. Symbolic



operational representations specify exact timings for music
and physical descriptions of the sounds to be produced
symbolic representations use descriptive symbolism to
describe the form of the music and allow great freedom in
the interpretation
both types are described as structural representations,
since instead of representing music by audio samples there is
information about the internal structure of the music
Music - Representation

To illustrate the structural representations, we can consider two
examples :
MIDI - a widely used protocol allowing the connection of computers
and musical equipment, an operational representation
SMDL - a proposal for a standard structure for documents
containing musical information, having both operational and
symbolic aspects
MIDI


the Musical Instrument Digital Interface was developed in the
early ‘80s by musical equipment makers
Devices :
 electronic keyboards and synthesisers
 drum machines
 sequencers (to record and play back MIDI messages)
 music<->film and music<->video synchronisation equipment
Music - Representation

Connection ports :





MIDI OUT - allows a device to send MIDI messages it has produced to
other MIDI devices
MIDI IN - receives MIDI messages from other MIDI devices
MIDI THRU - repeats received messages, permitting daisy-chaining of
MIDI devices
MIDI devices process MIDI messages differently, according to
their function or to the sound palette used by the device, hence
different synthesisers can produce different sounds supplied with
the same MIDI messages
MIDI Concepts:



Channel - a MIDI connection has 16 message channels, devices can
be set to respond to all channels or only to specific channels
Key number - notes are identified by key number, MIDI provides 128
key numbers compared with the 88 keys of a standard keyboard
Controller - 128 different controllers are available under the MIDI
protocol, though not all are currently defined, changing the value of a
controller typically alters sound production
Music - Representation






Patch/program - an audio palette is called a program or patch, a
synthesiser capable of having a number of patches active at the same
time is called multi-timbral
Polyphony - the ability of a synthesiser to play many notes at a time
Song - a recorded or preprogrammed MIDI sequence
Timing clock - a MIDI sequencer timestamps messages using a
timebase measured in parts per quarter note (PPQ). Typical
timebase values are 24, 96 and 480 PPQ. To convert the timebase
into actual time you use the tempo, measured in beats per minute
(BPM), where we assume that one beat is equal to a quarter note.
Thus at a tempo of 180 BPM one beat lasts 1/3 s, so with a timebase
of 96 PPQ one tick lasts 1/3 x 1/96 s = 3.47ms
MIDI synchronisation - MIDI devices can be set to internal synch or
external synch, when set to internal synch a device is known as a
master and produces a timing clock message on its MIDI OUT at
24PPQ which slave devices use for external synch
MTC - MIDI Time Code is used to synchronise MIDI with film or
video, used to trigger sound effects or musical sequences
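The timebase arithmetic in the timing clock entry can be sketched in a few lines of Python. This is only an illustration, assuming one beat equals a quarter note as in the text; the function names `tick_duration_ms` and `ticks_to_ms` are hypothetical and not part of MIDI.

```python
def tick_duration_ms(tempo_bpm: float, ppq: int) -> float:
    """Duration of one sequencer tick in milliseconds.

    Assumes one beat is a quarter note, so a beat lasts 60 / tempo seconds
    and is divided into `ppq` ticks.
    """
    seconds_per_beat = 60.0 / tempo_bpm
    return seconds_per_beat / ppq * 1000.0


def ticks_to_ms(ticks: int, tempo_bpm: float, ppq: int) -> float:
    """Convert a timestamp in ticks to elapsed milliseconds at a fixed tempo."""
    return ticks * tick_duration_ms(tempo_bpm, ppq)


if __name__ == "__main__":
    # The example from the text: 180 BPM at 96 PPQ.
    print(round(tick_duration_ms(180, 96), 2))
```

The sketch assumes a constant tempo; a sequence with tempo changes would need the conversion applied piecewise.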
Music - Representation

MIDI Protocol :





based on 8-bit code for messages, each message consists of a
single command byte and possibly one or more data bytes (see
table)
Channel voice messages (8c-Ec) - determine the actual notes
played, speed of hit and release and the values of controllers
Channel mode messages (Bc, with controllers 121-127) - selects the
mode of a synthesiser, responding to one channel or all channels,
each channel separately voiced or all voices used for one channel
System messages (F0-FF) - general system functions, timing clock,
MIDI time code messages, system reset, start device, stop device,
etc.
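As a minimal sketch of the byte layout described above, the following Python builds one channel voice message and sorts command bytes into the groups listed. It assumes the usual MIDI convention that command bytes have the top bit set and data bytes do not, and that Note On is command 9c; the helper names `note_on` and `classify` are illustrative only.

```python
def note_on(channel: int, key: int, velocity: int) -> bytes:
    """Build a 3-byte Note On message (command 9c) for channel 0-15."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not (0 <= key <= 127 and 0 <= velocity <= 127):
        raise ValueError("data bytes are 7-bit values (0-127)")
    # High nibble 9 = Note On, low nibble = channel number.
    return bytes([0x90 | channel, key, velocity])


def classify(byte: int) -> str:
    """Sort a single byte into the message groups listed in the text."""
    if byte < 0x80:
        return "data byte"
    if byte >= 0xF0:
        return "system"
    # 8c-Ec: high nibble is the message type, low nibble is the channel.
    return f"channel message type {byte >> 4:X}, channel {byte & 0x0F}"
```

For example, `note_on(0, 60, 100)` yields the three bytes 90 3C 64 in hexadecimal. Telling channel mode messages apart from ordinary controller changes would additionally require inspecting the first data byte of a Bc message, which this sketch does not do.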
Limitations of MIDI :



operates at 31250bps, which allows roughly 500 notes per second and
may not be enough for complex pieces
limited number of channels, lack of device addressing and other
flaws make configuring large MIDI networks difficult
device dependence of MIDI data
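The notes-per-second figure can be checked with some back-of-the-envelope arithmetic. The sketch below assumes that each byte occupies 10 bits on the wire (8 data bits plus a start and a stop bit) and that a note costs a 3-byte Note On plus a 3-byte Note Off; both are assumptions added here, not figures from the slides.

```python
BAUD = 31250          # bits per second, from the text
BITS_PER_BYTE = 10    # assumed: 8 data bits plus start and stop bits
BYTES_PER_NOTE = 6    # assumed: Note On (3 bytes) + Note Off (3 bytes)

bytes_per_second = BAUD / BITS_PER_BYTE
notes_per_second = bytes_per_second / BYTES_PER_NOTE

if __name__ == "__main__":
    print(bytes_per_second, round(notes_per_second))
```

Under these assumptions the link carries 3125 bytes per second, or a little over 500 complete notes per second, which is consistent with the figure quoted above.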
Music - Representation

SMDL
 the Standard Music Description Language was developed by the
MIPS committee of ANSI
 SMDL encompasses representation of music for electronic
dissemination and production by software, the representation of
scores and musical examples in printed documents and the
representation of musical annotation and attributes used for musical
analysis or by music databases
 SMDL is a DTD of SGML, based on a document type called musical
works or works. Each work has 4 hierarchically structured
sections:
 core section - musical events, such as note sequences, which
form the work
 gestural section - performances of the core, which may differ in
interpretation
 visual section - displays the core in printed form, includes
formatting and lyrics
 analytical section - allows a number of theoretical analyses on
the core, its score and performances to be included in the work
Music - Operations

In considering music representation, we can
recognise several advantages over audio :





music representation will be more compact than audio
it is portable and can be synthesised with the fidelity and
complexity appropriate to the output devices used
while digital audio suffers from inherent noise, musical
representations are noise free
many operations can be performed on music that would be
infeasible or require extensive processing on audio
Playback & Synthesis

during audio playback, the listener has limited influence
over the musical aspects of the performance, beyond
changing the volume or processing the audio in some way.
Music - Operations

If music is produced by synthesis from a structural
representation the listener can independently change pitch and
tempo, increase or decrease individual instruments' volumes or
change the sounds they produce
musical representations offer greater potential for interactivity than
audio
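To illustrate why a structural representation makes pitch and tempo independent, here is a small Python sketch. The `Note` record, its fields and the two helper functions are hypothetical, chosen only for this example; real sequencers use richer event models.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    start_beats: float   # position in the piece, in beats
    length_beats: float  # duration, in beats
    key: int             # MIDI-style key number
    track: str           # instrument or part name


def transpose(notes, semitones, track=None):
    """Shift pitch without touching timing; optionally for one track only.

    No range checking is done on the resulting key numbers.
    """
    return [
        replace(n, key=n.key + semitones) if track in (None, n.track) else n
        for n in notes
    ]


def start_times_seconds(notes, tempo_bpm):
    """Map beat positions to seconds; pitch is unaffected by the tempo."""
    return [n.start_beats * 60.0 / tempo_bpm for n in notes]
```

Because timing is stored in beats and pitch as a key number, changing one never disturbs the other, which is not the case when simply speeding up or slowing down sampled audio.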

Timing
 structural representation makes timing of musical events explicit
 the ability to modify tempo makes it possible to alter the timing of
groups of musical events and adjust the synchronisation of those events
with other events (film, video, etc.)

Editing & Composition
 basic editing allows the user to modify primitive events and notes
 more complex editing operations operate on musical aggregates (chords,
bars, etc.) to permit phrase-repetition, melody replacement and other
such functions
 composition software simplifies the task of generating and combining or
rearranging tracks, and prints the score
Animation
Representation
Cel models
Scene-based models
Event-based models
Key frames
Articulated objects &
hierarchical models
Scripting & procedural models
Physically-based & empirical
models
Operations
Graphics operations
Motion & parameter control
Rendering
Playback
Animation - Representation




Separating animation and video follows the same track we took
in separating image and graphic, based on modelling.
Animation types provide models which are rendered to produce
video.
Animation is distinct from graphic in that it is time-dependent,
but as in the image<->video relationship, sampling an
animation model at a particular time will result in a graphics
model, which can be rendered to produce an image
Animation Representation

Cel models


early animators drew on transparent celluloid sheets or cels,
different sheets contained different parts of the scene, which was
assembled by overlaying the sheets
in animation, cels are digital images with a transparency channel
Animation - Representation



scenes are rendered by drawing the cels back to front, with
movement being added by changing the position of cels from
one frame to the next
a cel model is therefore a set of images, their back to front
order, and their relative position and orientation in each frame
Scene-based models

simply a sequence of graphics models, each representing a
complete scene
highly redundant and do not support continuity of activities
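The back-to-front rendering of cels described for the cel model can be sketched for a single pixel as follows. This assumes each cel contributes a colour sample with a transparency (alpha) value between 0 and 1 and uses the common "over" blend; the function name `composite_pixel` is illustrative.

```python
def composite_pixel(layers, background=(0.0, 0.0, 0.0)):
    """Blend (r, g, b, alpha) samples back to front over an opaque background.

    `layers[0]` is the rearmost cel and `layers[-1]` the frontmost.
    alpha = 1.0 is fully opaque, alpha = 0.0 fully transparent.
    """
    r, g, b = background
    for lr, lg, lb, alpha in layers:
        # Each nearer cel covers what is behind it in proportion to alpha.
        r = lr * alpha + r * (1.0 - alpha)
        g = lg * alpha + g * (1.0 - alpha)
        b = lb * alpha + b * (1.0 - alpha)
    return (r, g, b)
```

A full renderer would repeat this for every pixel after offsetting each cel by its position in the current frame.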
Event-based models


expresses the difference between successive scenes as events
that transform one scene to the next
still discrete rather than continuous, but permits the
management of scenes by input devices (i.e. mouse, tablet,
etc.) rather than each scene having to be entered manually
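As a rough sketch of the event-based idea, the Python below stores one initial scene plus a list of events and derives each later scene by applying the next event. The scene layout (object name mapped to an x, y position) and the move-event tuple are assumptions made for this example only.

```python
def apply_event(scene, event):
    """Return a new scene with one move event (name, dx, dy) applied."""
    name, dx, dy = event
    x, y = scene[name]
    new_scene = dict(scene)          # leave the previous scene untouched
    new_scene[name] = (x + dx, y + dy)
    return new_scene


def play(initial_scene, events):
    """Expand an initial scene and its events into the full scene sequence."""
    scenes = [initial_scene]
    for event in events:
        scenes.append(apply_event(scenes[-1], event))
    return scenes
```

Only the differences are stored, which avoids the redundancy of keeping every complete scene, but the result is still a discrete series of scenes.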
Animation - Representation

Key frames

in essence, the animator models the beginning and end frames of a
sequence and lets the computer calculate the others by interpolation
Articulated objects & hierarchical models

attempt to overcome the problems of key frames by developing
articulated objects, jointed assemblies where the configuration and
movement of sub-parts are constrained
ensures proper relative positioning and constraint maintenance
during interpolation (will not allow solid objects to pass through
other solid objects)
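Key-frame interpolation can be sketched as below, assuming the simplest case of linear interpolation between two poses given as dictionaries of numeric parameters. The names `lerp` and `inbetween` and the pose layout are illustrative; production systems typically offer splines and other interpolation curves.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def inbetween(start_pose, end_pose, n_frames):
    """Return n_frames poses from start_pose to end_pose, both inclusive."""
    if n_frames < 2:
        raise ValueError("need at least the two key frames")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)
        frames.append(
            {name: lerp(start_pose[name], end_pose[name], t) for name in start_pose}
        )
    return frames
```

Each parameter is interpolated independently here, so nothing prevents the in-between poses from violating physical constraints, which is the weakness that articulated and hierarchical models address.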
Scripting and procedural models


current state-of-the-art animation modelling systems have tools
allowing the animator to specify key frames, preview sequences in
real time and control the interpolation of model parameters
an additional feature in many such systems is a scripting language
Animation - Representation


scripting languages offer the animator the opportunity to
express sequences in concise form, particularly useful for
repetitive and structured motion, and also provide high-level
operations intended specifically for animation
Physically-based models & empirical models


this approach is used to produce sequences depicting
evolving physical systems
a mathematical model of the system is derived from
physical principles or empirical data and the model is then
solved, numerically or through simulation, at a sequence
of time points, each one resulting in a single frame for the
sequence
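A minimal sketch of the physically-based approach, assuming a body dropped from rest under constant gravity and solved numerically with forward Euler steps, one step per frame. The function name, the default frame rate and the ground at height zero are assumptions for illustration.

```python
def simulate_fall(height_m, fps=25, g=9.81, n_frames=10):
    """Return the height of a falling body at each frame time.

    Uses forward Euler integration with a time step of one frame.
    """
    dt = 1.0 / fps
    y, v = height_m, 0.0
    frames = []
    for _ in range(n_frames):
        frames.append(y)            # the state at this time point is one frame
        y = max(0.0, y + v * dt)    # advance position, stopping at the ground
        v = v - g * dt              # advance velocity under gravity
    return frames
```

Each returned height would drive the position of the object in one rendered frame; more accurate integrators or smaller time steps can be substituted without changing the overall scheme.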
Animation - Operations

Graphics operations
 since animation models are graphics models extended in time, all
the graphics operations we have already covered are applicable here

Motion and parameter control
 since the essential difference between graphics and animation
operations is the addition of the temporal dimension, graphics
objects become animations through the assignment of complex
trajectories or behaviours over time
 commercial 3D animation systems provide modelling tools and
animation tools, the modelling tools produce 3D graphic models
and the animation tools add temporal transformations to these
objects

Rendering
 2 basic forms :
 real-time - model is rendered as frames are displayed, 10+
frames per second are required to avoid jerkiness, so only
appropriate for simple models or with special hardware
 non-real-time - frames are pre-rendered, taking as long as
necessary to do so, provides higher visual quality and
consistency of frame-rate
Animation - Operations

Playback


non-real-time rendering offers the same operational
possibilities in playback as digital video, such as control over
rate and direction
real-time rendering is much more interactive and
modifiable, objects can be added and removed, lights
turned on and off, the viewpoint changed, and so on