Transcript Dialog

LIS618 lecture 3
Thomas Krichel
2002-09-23
Structure of talk
• The blue sheet
• Working with Dialog
• Nexis.com
using dialog
• go to command search
• pass warning screen
• you get to "dialog web command search"
• http://www.dialogweb.com/cgi/logoff?mode=guided&url=/cgi/dwframe?href=search.html
• searches there do not work well at this
level
blue sheet
• each database name is linked to a blueish
pop-up window called the blue sheet for
the database
• The contents of the bluesheet are covered later
• at this stage we choose a database and hit
"begin". We see that a command is
selected: "be numbers", where numbers
are the numbers of the selected databases,
separated by commas.
database types
• full-text database
• bibliographic databases
• directory databases
• numeric databases
– but they are not classified as such
finding a database
• file 411 contains the database of
databases
• 'sf category' selects the files belonging to
the category named category
• categories are listed at
http://library.dialog.com/bluesheets
• 'b ref,ref' will select databases
closer look at the bluesheet
• file description
• subject coverage (free vocabulary)
• format options, lists all formats
– by number (internal)
– by dialog web format (external, i.e. cross-database)
• search options
– basic index, i.e. subject contents
– additional index, i.e. non-subject
search options: basic index
• select without qualifiers searches in all
fields in the basic index
• bluesheet lists field indicators available for
a database
• also note if field is indexed by word or
phrase. proximity searching only works
with word indices. when phrases are
indexed you don't need proximity
indicators
search in basic index
• basic index is queried through /IN, where
IN is a field indicator
• Thomas calls this an appending indicator
• several field indicators can be ORed by
giving a comma-separated list, for example
mate/ti,de
additional features
• Some databases allow restricting the
search with unary expressions
– /ABS require abstract present
– /ENG English language publication
• Some fields are sortable with the sort
command, i.e. records can be sorted by
the values in the fields
Such features are database specific.
additional indices
• the additional indices section lists those
terms that can lead a query. Often, these
are phrase indexed.
• Such fields are queried by the prefix IN=term,
where IN is the field indicator and term
is the search term
• Thomas calls this a pre-pending indicator
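The two ways of qualifying a term (appending /IN for the basic index, pre-pending IN= for an additional index) can be sketched as plain string building. This is only an illustration: the function names are hypothetical helpers, not part of Dialog, and nothing here talks to the service.

```python
def suffix_qualified(term, fields):
    """Basic-index form: term/f1,f2 (the listed fields are ORed)."""
    return f"{term}/{','.join(fields)}"


def prefix_qualified(field, term):
    """Additional-index form: FIELD=term."""
    return f"{field}={term}"


if __name__ == "__main__":
    print(suffix_qualified("mate", ["ti", "de"]))  # mate/ti,de
    print(prefix_qualified("au", "cruz, b?"))      # au=cruz, b?
```

The resulting strings would still have to be typed after an "s" (select) or "expand" command.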
the 's' (select) command
• Once we have issued the "be" command to select a
database, we issue the "s" command:
• "s keywords" where keywords is a
Boolean expression.
• This will search the selected database in
full-text view for the Boolean query issued
• probably just searches the main index
• keywords can be added
display
• you are allowed to select a format and a
number of items to be displayed.
• formats vary from database to database,
some databases cannot display certain
formats
Setting additional terms
• It appears that "drinking and mate" is a
better search term…
• What other terms could be used?
– matear (suck mate)
– matero (mate sucker)
– cebar (prepare mate)
– cebador (mate preparer)
• prefix queries can be formed by appending
a '?' to the query term.
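To see which words a truncated term would pick up, the '?' can be imitated locally with a regular expression. This is a rough sketch, assuming a trailing '?' stands for any run of further word characters (including none); the function names are hypothetical and the real matching is done by Dialog's index, not by this code.

```python
import re


def truncation_to_regex(term):
    """Compile a term such as 'mate?' into a case-insensitive regex.

    Assumption: a trailing '?' allows any number of extra word
    characters; without it the term has to match the word exactly.
    """
    if term.endswith("?"):
        stem = re.escape(term[:-1])
        return re.compile(stem + r"\w*", re.IGNORECASE)
    return re.compile(re.escape(term), re.IGNORECASE)


def matches(term, word):
    """True if the whole word is covered by the (possibly truncated) term."""
    return truncation_to_regex(term).fullmatch(word) is not None


if __name__ == "__main__":
    for word in ["mate", "matear", "matero", "cebar", "cebador"]:
        print(word, matches("mate?", word), matches("ceba?", word))
```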
connectors I
• '(W)' requires the terms to appear next to
each other in the given order, e.g.
'yerba(W)mate?' matches "yerba mate".
• '(iW)', where i is an integer, means
followed within at most i intervening words, e.g.
'ceba?(2W)mate?' matches "cebar un
maravilloso mate" but not "cebador guapo
mirando un mate"
connectors II
• '(N)' requires terms to be next to each
other e.g. 'yerba(N)mate?' matches "yerba
mate" or "mate yerba".
• '(iN)', where i is an integer, means
proximity within at most i intervening words,
in either order, e.g.
'ceba?(3N)mate?' matches "cebar mate"
or "matear con la cebadora".
• '(S)' searches for the occurrence of
connected terms in the same paragraph.
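The (W) and (N) connectors can be tried out on short phrases with a small local matcher. This is a sketch under stated assumptions: the integer in the connector is taken to be the maximum number of words between the two terms, a trailing '?' is treated as truncation, and the function names are hypothetical. It does not model (S), which needs paragraph boundaries.

```python
import re


def term_matches(term, word):
    """Match one query term against one word; a trailing '?' truncates."""
    term, word = term.lower(), word.lower()
    if term.endswith("?"):
        return word.startswith(term[:-1])
    return word == term


def proximity(first, second, text, gap=0, ordered=True):
    """True if both terms occur with at most `gap` words between them.

    ordered=True models (W) and (iW): `first` must come before `second`.
    ordered=False models (N) and (iN): either order is accepted.
    """
    words = re.findall(r"\w+", text)
    pos_first = [i for i, w in enumerate(words) if term_matches(first, w)]
    pos_second = [i for i, w in enumerate(words) if term_matches(second, w)]
    for i in pos_first:
        for j in pos_second:
            distance = j - i if ordered else abs(j - i)
            # distance 1 means adjacent, so `gap` words fit in gap + 1
            if 1 <= distance <= gap + 1:
                return True
    return False


if __name__ == "__main__":
    print(proximity("yerba", "mate?", "yerba mate"))                 # (W)
    print(proximity("yerba", "mate?", "mate yerba", ordered=False))  # (N)
    print(proximity("ceba?", "mate?", "cebar un maravilloso mate", gap=2))
```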
connectors III
• (F) words in the same field, no order
• (L) words in the same descriptor field,
used to link headings and sub-headings.
This is a hierarchical connector.
• Note: connectors are processed left-to-right. Use parentheses whenever in doubt.
Boolean operators
• when using Booleans, be aware that "and"
has higher precedence than "or".
• Thus:
a or b and c
is not the same as
(a or b) and c
but it is
a or (b and c)
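The precedence rule can be checked by brute force over all truth values. Python's own "and" also binds more tightly than "or", so the unparenthesized expression below behaves like the one on the slide. This is only an illustration of the logic, not of how Dialog evaluates a query.

```python
from itertools import product

# Truth assignments (a, b, c) where the two groupings disagree.
differing = []
for a, b, c in product([False, True], repeat=3):
    unparenthesized = a or b and c
    and_first = a or (b and c)
    or_first = (a or b) and c
    # "and" binds first, so these two always agree.
    assert unparenthesized == and_first
    if unparenthesized != or_first:
        differing.append((a, b, c))

print(differing)
```

The two groupings differ exactly when a is true and c is false.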
executing several searches
• several searches can be done
sequentially, and the result sets are
saved by the system.
• Each time the system assigns a set
number.
• These can be combined in Boolean
expressions, e.g. 's S1 or S2 and S3'
• Remember that Boolean operations are
set-theoretic!
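Because the operations are set-theoretic, combining saved sets can be pictured with Python sets, where "&" (intersection) binds more tightly than "|" (union), just as "and" does relative to "or". The record numbers below are made up purely for illustration.

```python
# Hypothetical record numbers standing in for three saved result sets.
s1 = {1, 2, 3}
s2 = {3, 4, 5}
s3 = {5, 6}

combined = s1 | s2 & s3   # like 's S1 or S2 and S3'
grouped = (s1 | s2) & s3  # like 's (S1 or S2) and S3'

print(sorted(combined))   # [1, 2, 3, 5]
print(sorted(grouped))    # [5]
```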
Reminder: fielded searches
• search terms can be limited to fields by
appending '/field_identifier' to the query
term, where field_identifier is the identifier
of a field.
• identifiers of fields are also important in
the "expand" command
common field identifiers
• 'co' company name
• 'de' descriptor
• 'au' author name
• 'df' one-word descriptor
• 'ti' title
• 'cc' classification value
• 'pn' product name
• 'pc' product code
• 'px' company type
narrowing by date
• 'PY=yyyy', where 'yyyy' is the four-digit
identifier for a year, limits the publication year
• 'PD=yyyymmdd', where 'yyyy' is the four-digit
identifier for a year, 'mm' is a two-digit
identifier for the month and 'dd' is a two-digit
identifier for the day, limits the publication date
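The PY= and PD= limiters are fixed-width strings, so they can be generated from a date value. A minimal sketch with hypothetical helper names; the date used is simply the date of this lecture.

```python
from datetime import date


def py_limit(d):
    """Year limiter in the form PY=yyyy."""
    return f"PY={d.year:04d}"


def pd_limit(d):
    """Date limiter in the form PD=yyyymmdd, zero-padded."""
    return f"PD={d.year:04d}{d.month:02d}{d.day:02d}"


if __name__ == "__main__":
    lecture_day = date(2002, 9, 23)
    print(py_limit(lecture_day))  # PY=2002
    print(pd_limit(lecture_day))  # PD=20020923
```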
expanding queries
• names have to be entered as they appear
in the database.
• The "expand" command can be used to
see varieties of spelling of a name.
• It has to be used in conjunction with a field
identifier, for example
expand au=cruz, b?
to search for misspellings of José Manuel
Barrueco Cruz
expanding queries
• search produces results of the form
Ref Items Index-term
– Ref is a reference number
– Items is the number of items where the
index term appears
– Index-term is the index term
• "s Ref" searches for the reference term.
DS (display sets)
• This command can be executed any time
to review the sets that have been formed
since the last B (begin) command.
the stop words
• an and by for from of the to with
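Before typing a phrase it can help to see which of its words are on the stop list above. A small sketch, with a hypothetical helper name and an invented example phrase.

```python
# The stop words listed on the slide.
STOP_WORDS = {"an", "and", "by", "for", "from", "of", "the", "to", "with"}


def split_stop_words(phrase):
    """Return (searchable words, stop words) of a phrase, in order."""
    kept, dropped = [], []
    for word in phrase.lower().split():
        if word in STOP_WORDS:
            dropped.append(word)
        else:
            kept.append(word)
    return kept, dropped


if __name__ == "__main__":
    kept, dropped = split_stop_words("the history of yerba mate")
    print(kept)     # ['history', 'yerba', 'mate']
    print(dropped)  # ['the', 'of']
```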
add/repeat
• add number, number
adds databases, by file number, to the last query
example: add 297 to see what the Bible
says about it
• repeat
repeats the previous query with the added
databases
the target command
• "target set", where set is a search result set,
fixes a subset of the "statistically most
relevant results"
• a new result set is formed.