IR performance and Dialog 2


LIS618 lecture 2
Thomas Krichel
2004-02-08
Structure
• Theory: information retrieval performance
• Practice: more advanced Dialog.
retrieval performance evaluation
• "Recall" and "Precision" are two classic measures
to measure the performance of information retrieval
in a single query.
• Both assume that there is an answer set of
documents that contain the answer to the query.
• Performance is optimal if
– the database returns all the documents in the answer set
– the database returns only documents in the answer set
• Recall is the fraction of the relevant documents that
the query result has captured.
• Precision is the fraction of the retrieved documents that are relevant.
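• In symbols (a standard formulation, not on the original slide): with A the answer set and S the set of retrieved documents,
\[ \mathrm{recall} = \frac{|A \cap S|}{|A|}, \qquad \mathrm{precision} = \frac{|A \cap S|}{|S|}. \]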
recall and precision curves
• Assume that all the retrieved documents arrive at once and are examined one by one.
• During that process, the user discovers more and more relevant documents. Recall increases.
• During the same process, eventually at least, there will be fewer and fewer useful documents. Precision usually declines.
• This can be represented as a curve.
Example
• Let the answer set be {0,1,2,3,4,5,6,7,8,9}, and let non-relevant documents be represented by letters.
• A query returns the following result list: 7, a, 3, b, c, 9, n, j, l, 5, r, o, s, e, 4.
• For the first document, (recall, precision) is (10%, 100%); for the third, (20%, 66.7%); for the sixth, (30%, 50%); for the tenth, (40%, 40%); and for the last, (50%, 33.3%).
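• These figures follow from the rank-by-rank form of the definitions. Writing r_k for the number of relevant documents among the first k results (notation added here, not on the slide),
\[ \mathrm{recall}@k = \frac{r_k}{|A|}, \qquad \mathrm{precision}@k = \frac{r_k}{k}. \]
At k = 6, for instance, three relevant documents (7, 3 and 9) have been seen, so recall is 3/10 = 30% and precision is 3/6 = 50%.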
recall/precision curves
• Such curves can be formed for each
query.
• An average curve can be calculated over several queries, by averaging precision at each recall level.
• Recall and precision levels can also be
used to calculate two single-valued
summaries.
– average precision at seen relevant documents
– R-precision
R-precision
• This is a pretty ad hoc measure.
• Let R be the size of the answer set.
• Take the first R results of the query.
• Find the number of relevant documents among them.
• Divide by R.
• In our example, the R-precision is 40%.
• An average can be calculated for a number of queries.
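• In the notation introduced above (here R = |A| = 10),
\[ \text{R-precision} = \frac{r_R}{R}. \]
The first 10 results are 7, a, 3, b, c, 9, n, j, l, 5; four of these are relevant, so the R-precision is 4/10 = 40%.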
average precision at seen relevant documents
• To find it, sum the precision levels at each new relevant document discovered by the user and divide by the number of relevant documents seen.
• In our example, it is (100 + 66.7 + 50 + 40 + 33.3)/5 = 58%.
• This measure favors retrieval methods that
get the relevant documents to the top.
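• As a formula (again, notation added here): if k_i is the rank at which the i-th relevant document is seen and n is the number of relevant documents seen,
\[ \frac{1}{n} \sum_{i=1}^{n} \mathrm{precision}@k_i = \frac{1}{5}\left(\frac{1}{1} + \frac{2}{3} + \frac{3}{6} + \frac{4}{10} + \frac{5}{15}\right) = 58\%. \]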
critique of recall & precision
• Recall has to be estimated by an expert.
• Recall is very difficult to estimate in a large
collection.
• They focus on one query only. No serious
user works like this.
• There are some other measures, but that
is more for an advanced course in IR.
Looking at database structure
• Up until now, we have looked at
commands that take a full-text view of the
database.
• Such commands can be executed for
every database.
• If we want to make more precise queries,
we have to take account of database
structure.
bluesheet
• Each database name is linked to a blue-tinted pop-up window.
• This is called the bluesheet of the database.
• It contains the details of the database.
closer look at the bluesheet
• file description
• subject coverage (free vocabulary)
• format options: lists all formats
– by number (internal)
– by Dialog Web format (external, i.e. cross-database)
• search options
– basic index, i.e. subject contents
– additional index, i.e. non-subject
basic vs additional index
• the basic index
– has information that is relevant to the
substantive contents of the data
– usually is indexed by word, i.e. connectors are
required
• the additional index
– has data that is not relevant to the substantive
matter
– usually indexed by phrase, i.e. connectors are
not required
search options: basic index
• select without qualifiers searches in all
fields in the basic index
• bluesheet lists field indicators available for
a database
• also note whether a field is indexed by word or by phrase: proximity searching only works with word indices; when a field is phrase-indexed, you don't need connectors
search in basic index
• a field in the basic index is queried through
term/IN, where term is a search term and
IN is a field indicator
• Thomas calls this an appending indicator
• several field indicators can be ORed by giving a comma-separated list
• for example mate/ti,de searches for mate
in the title or descriptor fields
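• A small session sketch (the search terms are only for illustration; which field labels exist, and whether they are word- or phrase-indexed, must be checked on the bluesheet):
– b 139
begin in EconLit
– s mate/ti,de
search for mate in the title or descriptor field
– s interest(n)rate?/ti
a word-indexed field, so connectors are needed between words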
limiters and sorting
• Some databases allow the search to be restricted using limiters. For example
– /ABS requires that an abstract be present
– /ENG requires an English-language publication
• Some fields are sortable with the sort
command, i.e. records can be sorted by
the values in the fields. Example: sort
s1/all/ti.
• Such features are database specific.
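• A sketch, assuming a set s1 already exists and that the database supports these features:
– s s1/eng
limit set s1 to English-language records, creating a new set
– sort s2/all/ti
sort that new set by title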
additional indices
• The additional indices list those terms that can lead a query. Often, these are phrase-indexed.
• Such fields are queried by the prefix IN=term, where IN is the field abbreviation and term is the search term.
• Thomas calls this a prepending indicator.
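• A sketch (au= appears in the expand examples below; py=, a publication-year prefix, is common but, like all prefixes, database specific):
– s au=krichel, thomas
search for an author entry exactly as it appears in the index (a hypothetical value)
– s py=2004
restrict to records published in 2004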
expanding queries
• names have to be entered as they appear in the
database.
• The "expand" command can be used to see
varieties of spelling of a value
• It has to be used in conjunction with a field
identifier, example
– expand au=cruz, b?
– expand au=barrueco?
to search for misspellings of José Manuel
Barrueco Cruz
expanding queries II
• expand produces results of the form
Ref   Items   Index-term
– Ref is a reference number (E1, E2, …)
– Items is the number of items in which the index term appears
– Index-term is the index term
• "s Ref" searches for the referenced index term.
expand topics
• You can also expand a topic in a database
to see what index terms are available that
start with the term. Example “b 155 ; e
cold”
• If you expand an entry in the expansion list again, you can see a list of terms related to that term, if such a list is available.
Example
• How many domain names are currently
registered in Novosibirsk, Russia?
• Hint: use the domain name database, file 225.
• Note that this database also covers non-current domains.
ranking
• The rank command can be used to show the most frequent values of a phrase-indexed field in a search set.
• Example
– rank au s1
shows the most frequent authors
– rank de s1
shows most frequent descriptors
• read the screens following the rank command for instructions
example
• Who wrote on interest rates and growth rates? Use EconLit: "b 139"
• “s interest(n)rate? and growth(n)rate?”
• “rank au s1”
• You can then select some authors you are interested in, "1-5" for example
• “exit” to leave rank, confirm with “yes”.
• “exs” to search for those authors.
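• Put together, the session sketched above runs:
– b 139
– s interest(n)rate? and growth(n)rate?
– rank au s1
– 1-5
select the five most frequent authors on the rank screen
– exit
leave rank, confirming with "yes"
– exs
search for the selected authors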
topic searches
• Often we want to know what literature is
available on a certain topic.
• Often, authors do not use the obvious words that occur to the searcher.
• Using descriptors can be very helpful, as the sketch after this list shows:
– Conduct a search
– Look for descriptors
– Use those in other searches
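• A sketch of that loop (the search terms are hypothetical, and display format numbers vary by database):
– s nuclear(w)waste?/ti
a first-pass query in the searcher's own words
– t s1/5/1-3
display a few records in a fuller format and note their descriptors
– rank de s1
alternatively, rank the descriptors in the whole set
– s radioactive(w)waste?/de
re-search with a descriptor found this way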
Initial file selection
• On the main menu, go to the database
menu.
• After that menu, you get a search box
• There you can enter full-text queries for all
the databases
• You can then select the database you
want
• And get to the begin databases stage.
database categories
• In order to help people find databases (files), DIALOG has grouped databases into categories.
• categories are listed at
http://library.dialog.com/bluesheets/html/blo.html
• 'b category' will select the databases of category category at the start of a session.
• 'sf category' selects the files belonging to category category at other times.
add/repeat
• add number, number
adds the named files to the set of databases searched
• example: "add 297" to see what the Bible says about it
• repeat
repeats the previous query with the databases added
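• A sketch, assuming a query has already produced results in another file:
– add 297
add file 297, the Bible, to the files being searched
– repeat
rerun the previous query against the added file as well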
to find publications
• Sometimes, you want to find out if a
certain publication, say, a serial, is
available on Dialog
• http://library.dialog.com/bluesheets/
has a search box specifically for journal
data.
http://openlib.org/home/krichel
Thank you for your attention!