Transcript dialog 3
LIS618 lecture 6
Thomas Krichel
2003-03-05
structure
• DIALOG
– basic vs additional index
– initial database file selection (files)
• Lexis/Nexis
basic vs additional index
• the basic index
– has information that is relevant to the
substantive contents of the data
– usually is indexed by word, i.e. connectors are
required
• the additional index
– has data that is not relevant to the substantive
matter
– usually indexed by phrase, i.e. connectors are
not required
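A minimal sketch of the word versus phrase distinction, assuming a toy in-memory index. The records and both helper functions are made up for illustration and say nothing about how DIALOG actually stores its indices.

```python
from collections import defaultdict

def build_word_index(records):
    """Index every word of a field value separately (word indexing)."""
    index = defaultdict(set)
    for rec_id, value in records.items():
        for word in value.lower().split():
            index[word].add(rec_id)
    return index

def build_phrase_index(records):
    """Index the whole field value as one entry (phrase indexing)."""
    index = defaultdict(set)
    for rec_id, value in records.items():
        index[value.lower()].add(rec_id)
    return index

# Hypothetical records: record id -> field value.
records = {1: "information retrieval", 2: "retrieval of information"}
words = build_word_index(records)
phrases = build_phrase_index(records)

# Word index: both records contain both words, so word order is lost
# unless a connector states how the words must relate.
print(sorted(words["information"] & words["retrieval"]))  # [1, 2]

# Phrase index: the full value is the key, no connector is needed.
print(sorted(phrases["information retrieval"]))  # [1]
```

With the word index, the two words have to be combined explicitly, which is the job connectors do. With the phrase index, the whole value is looked up as one unit.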
search options: basic index
• select without qualifiers searches in all
fields in the basic index
• bluesheet lists field indicators available for
a database
• also note if field is indexed by word or
phrase. proximity searching only works
with word indices. when phrases are
indexed you don't need proximity
indicators
search in basic index
• a field in the basic index is queried through
term/IN, where term is a search term and
IN is a field indicator
• Thomas calls this an appending indicator
• several field indicators can be ORed by
giving a comma separated list
• for example mate/ti,de searches for mate
in the title or descriptor fields
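The term/IN form can be sketched as a small string builder. The name build_suffix_query is hypothetical and not part of DIALOG. It only assembles the text one would type after the select command.

```python
def build_suffix_query(term, fields):
    """Build a basic-index search term of the form term/IN1,IN2.

    Hypothetical helper for illustration only.
    """
    if not fields:
        # Without qualifiers, the term is searched in all basic-index fields.
        return term
    # A comma separated list of field indicators ORs the fields together.
    return term + "/" + ",".join(code.lower() for code in fields)

print(build_suffix_query("mate", ["ti", "de"]))  # mate/ti,de
print(build_suffix_query("mate", []))            # mate
```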
limiters and sorting
• Some databases allow you to restrict the
search using limiters. For example
– /ABS requires an abstract to be present
– /ENG English-language publication
• Some fields are sortable with the sort
command, i.e. records can be sorted by
the values in the fields. Example: “sort /ti”
Such features are database specific.
additional indices
• The additional indices list those terms that
can lead a query. Often, these are phrase
indexed.
• Such fields are queried by the prefix IN=term,
where IN is the field abbreviation and term
is the search term
• Thomas calls this a pre-pending indicator
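To contrast the two notations, here is a sketch that looks at a single search term and reports which kind of index it addresses. The function classify_query is hypothetical. It assumes that field codes are purely alphabetic and that a term contains at most one "=" prefix or one "/" suffix, which real queries need not respect.

```python
def classify_query(query):
    """Return (index_kind, term, field_codes) for one search term.

    Hypothetical helper: IN=term is read as an additional-index query,
    term/IN1,IN2 as a basic-index query.
    """
    head, sep, rest = query.partition("=")
    if sep and head.isalpha():
        # Pre-pending indicator: additional index.
        return ("additional", rest, [head.lower()])
    term, sep, codes = query.rpartition("/")
    if sep:
        # Appending indicator(s): basic index, fields ORed together.
        return ("basic", term, [c.strip().lower() for c in codes.split(",")])
    # No indicator at all: all fields of the basic index.
    return ("basic", query, [])

print(classify_query("mate/ti,de"))   # ('basic', 'mate', ['ti', 'de'])
print(classify_query("au=cruz, b?"))  # ('additional', 'cruz, b?', ['au'])
```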
expanding queries
• names have to be entered as they appear
in the database.
• The "expand" command can be used to
see variant spellings of a value
• It has to be used in conjunction with a field
identifier, for example
expand au=cruz, b?
to search for misspellings of José Manuel
Barrueco Cruz
expanding queries II
• search produces results of the form
Ref Items Index-term
– Ref is a reference number
– Items is the number of items where the
index term appears
– Index-term is the index term
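The three columns can be read into a small record type, sketched below. The sample rows are made up for illustration and are not real DIALOG output. The sketch assumes the columns are separated by whitespace and that only the index term may itself contain spaces.

```python
from typing import NamedTuple

class ExpandRow(NamedTuple):
    ref: str         # reference number of the row
    items: int       # number of items where the index term appears
    index_term: str  # the index term itself

def parse_expand_row(line):
    """Parse one 'Ref Items Index-term' row (hypothetical helper)."""
    # Split on the first two runs of whitespace only, so that spaces
    # inside the index term are preserved.
    ref, items, index_term = line.split(None, 2)
    return ExpandRow(ref, int(items), index_term.strip())

# Made-up sample rows, NOT real DIALOG output.
sample = [
    "E1  1  AU=CRUZ, B.",
    "E2  4  AU=CRUZ, BARRUECO",
]
rows = [parse_expand_row(line) for line in sample]

# Pick the spelling variant that covers the most items.
best = max(rows, key=lambda row: row.items)
print(best.ref, best.index_term)
```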
• "s Ref" searches for the reference term.
add/repeat
• "add number, number"
adds databases, by file number, to the last query
• example "add 297" to see what the bible
says about it
• "repeat"
repeats the previous query with the added
databases
Initial file selection
• On the main menu, go to the database
menu.
• After the principal menu, you get a search
box
• There you can enter full-text queries for all
the databases
• You can then select the database you
want
• And get to the begin databases stage.
database categories
• To help people find databases (files),
DIALOG has grouped databases into
categories.
• categories are listed at
http://library.dialog.com/bluesheets/html/blo.html
• 'b category' selects the databases in the
given category at the start of a session.
• 'sf category' selects the files belonging to the
given category at other times.
Nexis
Lexis/Nexis
• Lexis is a specialized legal research
service
• Nexis is primarily a news service
• adds an important temporal component to
all its contents
• restricts contents as compared to Dialog
• potentially bad competition from Google
compilation of Nexis
• Uses a number of news sources such as
newspapers.
• Uses company reports databases
• Uses web sites, the URLs of which are
found in the news sources
• There is a problem with quality control of
web sites, some pornographic sites are
included
Smart indexing
• Nexis keeps a list of terms that are used for
indexing.
• A computer program will relate synonyms
to an official term.
– Example: replace “LIU” with “Long Island
University”
• Queries are not pre-processed.
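The synonym step can be sketched as a lookup table applied to the terms extracted from a document. This is only an illustration, not how Nexis implements it. The table and the function name are hypothetical, and only the LIU entry comes from the lecture.

```python
# Hypothetical synonym table: lower-cased synonym -> official index term.
# Only the LIU entry is taken from the lecture.
OFFICIAL_TERMS = {
    "liu": "Long Island University",
}

def normalize_index_terms(terms):
    """Replace known synonyms with the official term, keep the rest as is."""
    return [OFFICIAL_TERMS.get(term.lower(), term) for term in terms]

# Applied to terms extracted from documents at indexing time.
# Queries are left untouched, since they are not pre-processed.
print(normalize_index_terms(["LIU", "library"]))
# ['Long Island University', 'library']
```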
Nexis basic search
• implicit Boolean "or" between terms
• Otherwise double quotes for
• in fact searches
– Smart index keywords extracted
– HLEAD for news
– TITLE for legal documents
– WEB-SEARCH-TEXT for web pages
relevance ranking
• Lexis is based on the vector model
• The precise relevance ranking seems to be a
company secret. Ranking depends on
– where terms appear within the document
– how many occurrences of the search terms
appear in the document
– how often those search terms appear
throughout the document
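Since the actual ranking is not public, the following is only a textbook sketch of the vector model: documents and query become term-frequency vectors, and documents are ordered by cosine similarity to the query. It ignores where terms appear in the document, and all function names are made up for illustration.

```python
import math
from collections import Counter

def term_vector(text):
    """Term-frequency vector of a text, as a Counter of lower-cased words."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(query, documents):
    """Return documents ordered by decreasing similarity to the query."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(d)), d) for d in documents]
    return [d for _, d in sorted(scored, key=lambda pair: pair[0], reverse=True)]

docs = ["sports news", "library news library"]
print(rank("library", docs))
# ['library news library', 'sports news']
```

A document that repeats the search term scores higher, which matches the lecture's point that the number of occurrences matters.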