Transcript Compression
Multimedia Information Retrieval
• Unlike alphanumeric data, multimedia data do not have
any semantic structure
• Achieving symmetry between annotation and query is
difficult
• Retrieval is based on similarity between query and stored
information instead of exact match
• Stored information is represented using indexing
IR Model
• Information is preprocessed to extract features and
semantic contents
• Items are indexed based on these features and semantics
• User’s query is processed and main features are extracted
• Query’s features are then compared with features or index
of each information item in the database
• Information items whose features are similar to those of the
query are retrieved and presented to the user
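The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference design: the feature extractor (lowercase word counts), the cosine similarity measure, and the names `extract_features`, `build_index`, and `retrieve` are all assumptions chosen for the example.

```python
import math
from collections import Counter

def extract_features(text):
    """Toy feature extractor: lowercase word counts (assumed for illustration)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_index(items):
    """Preprocess every stored item once and keep its features as the index."""
    return {item_id: extract_features(text) for item_id, text in items.items()}

def retrieve(query, index, top_k=3):
    """Extract query features, compare with each indexed item, return the best matches."""
    q = extract_features(query)
    scored = [(cosine_similarity(q, feats), item_id) for item_id, feats in index.items()]
    scored = [(score, item_id) for score, item_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored[:top_k]]

# Hypothetical text annotations attached to three media clips.
items = {
    "clip1": "dog barking in a park",
    "clip2": "piano concert recording",
    "clip3": "dog running on a beach",
}
index = build_index(items)
results = retrieve("dog in park", index)
print(results)
```

Items are ranked by similarity rather than filtered by exact match, so "clip3" is still returned even though it shares only one word with the query.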
Design Issues
• Indexing
– a mechanism that reduces the search space of an
operator without losing any relevant information
• Similarity Computation
– should be easy to compute and should conform to human
judgement
Performance Measures
• Retrieval speed, recall, precision
• Recall measures the ability to retrieve relevant
information items from the database
– defined as the ratio between the number of retrieved relevant items
and the total number of relevant items in the database
• Precision measures retrieval accuracy
– defined as the ratio between the number of retrieved relevant items
and the number of total retrieved items
• Recall and precision are usually considered together,
since one can often be raised at the expense of the other
– high recall and low precision
– high precision and low recall
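The two ratios defined above can be computed directly from the set of retrieved items and the set of relevant items. The sketch below is illustrative only; the function name `precision_recall` and the document identifiers are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = retrieved relevant items / total retrieved items
    recall    = retrieved relevant items / total relevant items in the database
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ground truth: four relevant items exist in the database.
relevant = {"d1", "d2", "d3", "d4"}

# A cautious system returns few items: high precision, low recall.
narrow = ["d1", "d2"]
# A liberal system returns many items: high recall, low precision.
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]

print(precision_recall(narrow, relevant))
print(precision_recall(broad, relevant))
```

The two result lists show why a single measure is not enough: each system looks perfect on one ratio and poor on the other.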
Text Retrieval
• Text may be used to annotate other media such as audio,
images, and video, so that conventional IR techniques can be
used to retrieve multimedia information
• Boolean IR systems or text-pattern search systems
• Substantial effort is spent in analyzing the contents of the
documents and in generating keywords and indices
• Boolean queries are keywords connected with logical
operators (AND, OR, NOT)
File Structures
• Flat files
• Inverted files
– for each term a separate index is constructed that stores the
document identifiers for all documents containing the term
– each term and the document IDs containing the term are organized
into one row
– searching and retrieval are fast because only rows containing the
query terms need to be retrieved and there is no need to search the
whole database
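As a rough sketch of the inverted file described above, the following Python builds one row per term and answers Boolean queries (AND, OR, NOT) with set operations on those rows. The document collection and the helper names `build_inverted_file` and `lookup` are assumptions made for illustration.

```python
from collections import defaultdict

def build_inverted_file(docs):
    """Map each term to the set of IDs of the documents that contain it."""
    inverted = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            inverted[term].add(doc_id)
    return inverted

# Hypothetical four-document collection.
docs = {
    "D1": "multimedia information retrieval",
    "D2": "information theory",
    "D3": "image retrieval systems",
    "D4": "audio compression",
}
inv = build_inverted_file(docs)
all_ids = set(docs)

def lookup(term):
    """Fetch only the row for one term; unknown terms give an empty set."""
    return inv.get(term, set())

# information AND retrieval
and_hits = lookup("information") & lookup("retrieval")
# information OR retrieval
or_hits = lookup("information") | lookup("retrieval")
# information AND NOT retrieval
and_not_hits = lookup("information") - lookup("retrieval")
# NOT retrieval (needs the full set of document IDs)
not_hits = all_ids - lookup("retrieval")

print(sorted(and_hits), sorted(or_hits), sorted(and_not_hits), sorted(not_hits))
```

Each query touches only the rows for its own terms, which is the reason the slide gives for fast retrieval.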
Extensions
• Nearness parameters used in query specification help
define the topic more precisely and therefore increase
probable relevance of the retrieved item
• Within Sentence and Adjacency specification in queries
• Term location information is included in the inverted file
– Term i : document id, paragraph no., sentence no., word no.
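• For example, if an inverted file has the following entries:
information: R99, 10, 8, 3; R155, 15, 3, 6; R166, 2, 3, 1
retrieval: R77, 9, 7, 2; R99, 10, 8, 4; R166, 10, 2, 5
then a query for “information” within the same sentence as
“retrieval” retrieves only R99 (paragraph 10, sentence 8), where
the two terms are also adjacent (words 3 and 4)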
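The entries above can be loaded into a small positional index to show how within-sentence and adjacency conditions might be checked. This is a sketch under assumptions: the tuple layout follows the slide's (document id, paragraph no., sentence no., word no.) order, and the function names `within_sentence` and `adjacent` are invented for the example.

```python
# Each posting is (document id, paragraph no., sentence no., word no.),
# copied from the example entries in the text.
positional = {
    "information": [("R99", 10, 8, 3), ("R155", 15, 3, 6), ("R166", 2, 3, 1)],
    "retrieval": [("R77", 9, 7, 2), ("R99", 10, 8, 4), ("R166", 10, 2, 5)],
}

def within_sentence(term_a, term_b, index):
    """Documents where both terms occur in the same paragraph and sentence."""
    hits = set()
    for doc_a, par_a, sent_a, _ in index.get(term_a, []):
        for doc_b, par_b, sent_b, _ in index.get(term_b, []):
            if (doc_a, par_a, sent_a) == (doc_b, par_b, sent_b):
                hits.add(doc_a)
    return sorted(hits)

def adjacent(term_a, term_b, index):
    """Documents where term_b immediately follows term_a in the same sentence."""
    hits = set()
    for doc_a, par_a, sent_a, word_a in index.get(term_a, []):
        for doc_b, par_b, sent_b, word_b in index.get(term_b, []):
            same_sentence = (doc_a, par_a, sent_a) == (doc_b, par_b, sent_b)
            if same_sentence and word_b == word_a + 1:
                hits.add(doc_a)
    return sorted(hits)

print(within_sentence("information", "retrieval", positional))
print(adjacent("information", "retrieval", positional))
print(adjacent("retrieval", "information", positional))
```

R166 contains both terms but in different paragraphs, so a plain Boolean AND would return it while the nearness conditions do not.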
Indexing
• Stop words -- grammatical functional words, such as “of,”
“the,” and “a.”
• Stemming -- reducing words to a common root form
• Thesaurus -- list of synonyms
• Weighting -- term significance derived from occurrence
frequency within a document and among different
documents
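A minimal sketch of three of these indexing steps follows. It rests on several assumptions: the stop list is a tiny illustrative set, `naive_stem` is a crude suffix stripper standing in for a real stemming algorithm, and the weighting shown is tf-idf, one common way to combine within-document frequency with frequency across documents. The thesaurus step is omitted.

```python
import math
from collections import Counter

# Tiny illustrative stop list; real systems use much longer lists.
STOP_WORDS = {"of", "the", "a", "and", "in", "is"}

def naive_stem(word):
    """Crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Drop stop words, then reduce the remaining words to a root form."""
    return [naive_stem(w) for w in text.lower().split() if w not in STOP_WORDS]

def tf_idf(docs):
    """Weight = term frequency in the document * log(N / document frequency)."""
    term_lists = {doc_id: index_terms(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for terms in term_lists.values():
        doc_freq.update(set(terms))
    weights = {}
    for doc_id, terms in term_lists.items():
        tf = Counter(terms)
        weights[doc_id] = {t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf}
    return weights

# Hypothetical three-document collection.
docs = {
    "D1": "retrieval of images",
    "D2": "retrieval of audio",
    "D3": "indexing the images",
}
weights = tf_idf(docs)
print(index_terms("the indexing of documents"))
print(weights["D2"])
```

In D2, "audio" receives a higher weight than "retrieval" because it occurs in fewer documents of the collection, so it discriminates better between them.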
Relevance Feedback
• Query modification
– terms occurring in documents previously identified as relevant are
added to the original query or the weight of such terms is increased
– terms occurring in documents previously identified as irrelevant
are deleted from the query or the weight of such terms is reduced
• Document modification
– terms in the query, but not in the user-judged relevant documents,
are added to the document index list with an initial weight
– weights of index terms in the query and also in relevant documents
are increased by a certain amount
– weights of index terms not in the query but in the relevant
documents are decreased by a certain amount
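Both feedback strategies can be sketched as simple weight updates. The code below is one possible reading of the rules above, not a standard algorithm: the function names, the `step` and `initial` amounts, and the example terms are all assumptions, and documents are reduced to sets or dictionaries of index terms.

```python
def modify_query(query_weights, relevant_docs, irrelevant_docs, step=0.5):
    """Query modification: reward terms from relevant documents,
    penalize query terms found in irrelevant documents."""
    updated = dict(query_weights)
    for doc_terms in relevant_docs:
        for term in doc_terms:
            # Adds new terms or raises the weight of existing ones.
            updated[term] = updated.get(term, 0.0) + step
    for doc_terms in irrelevant_docs:
        for term in doc_terms:
            if term in updated:
                updated[term] -= step
    # A term whose weight falls to zero or below is deleted from the query.
    return {t: w for t, w in updated.items() if w > 0}

def modify_document(doc_weights, query_terms, initial=0.5, step=0.25):
    """Document modification for a document the user judged relevant."""
    updated = dict(doc_weights)
    for term in query_terms:
        if term in updated:
            updated[term] += step      # in the query and in the document
        else:
            updated[term] = initial    # in the query but not in the document
    for term in doc_weights:
        if term not in query_terms:
            updated[term] -= step      # in the document but not in the query
    return updated

# Hypothetical query and user judgements.
query = {"jaguar": 1.0, "car": 0.5}
relevant = [{"jaguar", "cat", "habitat"}]
irrelevant = [{"jaguar", "car", "engine"}]
new_query = modify_query(query, relevant, irrelevant)

doc = {"jaguar": 1.0, "habitat": 1.0}
new_doc = modify_document(doc, {"jaguar", "cat"})
print(new_query)
print(new_doc)
```

In the example, "car" drops out of the query while "cat" and "habitat" are added, which moves the query toward the documents the user accepted.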
Audio Search and Retrieval
• Keywords can be highly subjective because annotators may have
different perspectives or even different taxonomies
• Audio is hard to browse directly since it must be heard in real
time (unlike video, which can be keyframed)
• Two categories : Speech and Non-speech
– with speech, indexing and retrieval are based on obtaining the
spoken words either manually or by speech recognition techniques
– with non-speech, indexing and retrieval may be based on text
annotation (but will that help with a query like “find the first
occurrence of the note G-sharp”?)