preparing to search and Dialog 1

LIS618 lecture 1
Thomas Krichel
2003-09-20
Structure of talk
• Recap on Boolean
• Before online searching
• Working with DIALOG
– Overview
– Search command
– Bluesheets
– Basic and additional index
before a search I
• What is the purpose?
– brief overview
– comprehensive search
• What perspective on the topic?
– scholarly
– technical
– business
– popular
before a search II
• What type of information?
– Fulltext
– Bibliographic
– Directory
– Numeric
• Are there any known sources?
– Authors
– Journals
– Papers
– Conferences
before a search III
• What are the language restrictions?
• What, if any, are the cost restrictions?
• How current need the data to be?
• How much of each record is required?
two steps in DIALOG
• step one: select databases (aka files) to
look at
• step two: perform searches on the
selected databases
• You may wonder why one does not have
one single step like in a search engine.
Discuss.
• today we concentrate on the second step
working on selected files
• We assume that we have selected a
database that we know, and we look at the
search interface on the selected database.
• The database selection process is a bit
more complicated, covered next week.
• First, let us log in and look at the command
prompt.
• Then we select the first database (file) with
the begin command
The begin command
• As its name suggests, usually the first
command.
• begin number, number,…
• selects the files with the given numbers
• Once they are selected they can be
searched.
• Now select the ERIC database with "begin 1"
• "Begin 1" can be abbreviated as "b 1"
Substeps in the second step
• Identify search terms
• Use Dialog basic commands to conduct a
search
• View records online or print the results
the 's' (select) command
• Once we have issued the "begin" command to select a
database, we issue the "s" command to search that
database.
• "s query_terms", where query_terms are the
terms to search for
• This searches the index of the selected database,
in full-text view, for the query issued
• It will not find any of the following words: "an and by for
from of the to with". They are stop words.
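As a rough illustration of the stop-word rule, the Python sketch below drops the listed stop words from a query string. The helper name strip_stop_words is made up for this example and is not part of DIALOG; how DIALOG itself treats stop words internally may differ.

```python
# Stop words as listed on the slide. This is an illustration only;
# DIALOG itself is not involved.
STOP_WORDS = {"an", "and", "by", "for", "from", "of", "the", "to", "with"}


def strip_stop_words(query):
    """Return the query terms that are not stop words (hypothetical helper)."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


print(strip_stop_words("history of the library"))  # ['history', 'library']
```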
connectors
• If you want to use several keywords there
are three ways
– you can truncate search terms
– you can build an expression by putting
several keywords together. This is achieved
by DIALOG's connectors.
– you can combine several expressions with the
use of Boolean operators
• we will cover these in turn now
truncation of terms
• Open Truncation
– "select path?" retrieves all words that begin
with path: paths, pathos, pathway, pathology
• Controlled-Length Truncation
– "select path??" retrieves the root and up to
two additional characters: paths, pathos
truncation of terms II
• Embedded Character truncation can be used
for variant spellings:
– "select organi?ation" -> organization organisation
– "select fib??board" -> fiberboard fibreboard
• This truncation feature is also useful for
searching for unusual plural forms:
– "select wom?n" -> woman women
• You can also match prefixes by putting the ? at the
beginning.
– "?mobile" -> automobile metamobile
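The truncation rules on these two slides can be mimicked with regular expressions. The Python sketch below is only an approximation built from the slide examples, not DIALOG's own matching code: a single trailing ? is taken as open truncation, a trailing run of two or more ? as "up to that many extra characters", an embedded ? as exactly one character, and a leading ? as any prefix. The function names are invented for this example.

```python
import re


def truncation_to_regex(term):
    """Translate a DIALOG-style truncated term into a regex (approximation)."""
    prefix = ""
    if term.startswith("?"):
        # Leading '?': any prefix, as in "?mobile".
        prefix = r"\w*"
        term = term[1:]
    stem = term.rstrip("?")
    trailing = len(term) - len(stem)
    if trailing == 0:
        suffix = ""
    elif trailing == 1:
        # Open truncation: any number of further characters.
        suffix = r"\w*"
    else:
        # Controlled-length truncation: up to `trailing` further characters.
        suffix = r"\w{0,%d}" % trailing
    # Each embedded '?' stands for exactly one character.
    body = "".join(r"\w" if ch == "?" else re.escape(ch) for ch in stem)
    return re.compile(prefix + body + suffix, re.IGNORECASE)


def matches(term, word):
    """True if the whole word is matched by the truncated term."""
    return truncation_to_regex(term).fullmatch(word) is not None


print(matches("path?", "pathology"))   # True
print(matches("path??", "pathway"))    # False
print(matches("fib??board", "fibreboard"))  # True
```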
Use of connectors
• Connectors are used to put several words
together.
• One instance where this is useful is when
you have words that on their own mean
different things.
• For example "mate" is a herbal beverage
consumed in South America. Looking for
mate on the Internet retrieves a lot of
singles' pages.
example: terms related to "mate"
What other terms could be used?
– matear (drink mate)
– matero (mate drinker)
– cebar (prepare mate)
– cebador (mate preparer)
– yerba (mate herb)
– bombilla (mate straw)
connectors I
• '(W)' requires the terms to appear next to
each other in the given order, e.g.
'yerba(W)mate?' matches "yerba mate".
• '(iW)', where i is an integer, means the second
term follows the first with at most i words in
between, e.g. 'ceba?(3W)mate?' matches "cebar un
maravilloso mate" but not "cebador guapo
mirando un buen mate"
connectors II
• '(N)' requires terms to be next to each
other in either order, e.g. 'yerba(N)mate?'
matches "yerba mate" or "mate yerba".
• '(iN)', where i is an integer, means the terms
are at most i words apart, in either order, e.g.
'ceba?(3N)mate?' matches "cebar mate"
or "matear con la cebadora".
• '(S)' searches for the occurrence of
connected terms in the same paragraph.
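To make the difference between the (W) and (N) connectors concrete, here is a small Python sketch that checks word proximity on a whitespace-split string. It is a toy model under stated assumptions: "at most i words" is read as at most i words in between, open truncation such as ceba? is imitated by a simple prefix test, and the function name proximity_match is hypothetical. DIALOG's real index matching is not shown in the slides.

```python
def proximity_match(text, first, second, max_between=0, ordered=True):
    """Check whether a word starting with `first` and a word starting with
    `second` occur with at most `max_between` words between them.

    ordered=True  imitates (W)/(iW): `first` must come before `second`.
    ordered=False imitates (N)/(iN): either order is accepted.
    """
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w.startswith(first)]
    second_positions = [i for i, w in enumerate(words) if w.startswith(second)]
    for i in first_positions:
        for j in second_positions:
            if i == j:
                continue
            if ordered and j < i:
                continue
            if abs(j - i) - 1 <= max_between:
                return True
    return False


# 'ceba?(3W)mate?'
print(proximity_match("cebar un maravilloso mate", "ceba", "mate", 3))  # True
print(proximity_match("cebador guapo mirando un buen mate", "ceba", "mate", 3))  # False
# 'yerba(N)mate?'
print(proximity_match("mate yerba", "yerba", "mate", 0, ordered=False))  # True
```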
using Boolean operators
• In your query, you can combine several
expressions with Boolean operators
• Example: "S LIBRARY(W)SCHOOL? AND DISTANCE(W)EDUCATION"
• But I usually do not issue such fancy
queries.
executing several searches
• Several searches can be done
sequentially, and the result sets are
saved by the system.
• Each time, the system assigns a set
number, Si.
• These can be combined in Boolean
expressions, e.g. 's S1 or S2 and S3'
• Remember that Boolean operations are
set-theoretic!
Boolean operators on sets
• when using Booleans, be aware that "and"
has higher precedence than "or".
• Thus:
a or b and c
is not the same as
(a or b) and c
but it is the same as
a or (b and c)
• use parentheses when in doubt
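The set-theoretic reading of Boolean operators, and the precedence rule, can be tried out with Python sets, where & (intersection, "and") also binds more tightly than | (union, "or"). The record numbers below are made up for illustration; they are not DIALOG output.

```python
# Three hypothetical result sets, each a set of record numbers.
s1 = {1, 2, 3}
s2 = {3, 4, 5}
s3 = {5, 6}

implicit = s1 | s2 & s3        # like 's S1 or S2 and S3'
and_first = s1 | (s2 & s3)     # what the precedence rule gives
or_first = (s1 | s2) & s3      # a different query

print(sorted(implicit))   # [1, 2, 3, 5]
print(sorted(and_first))  # [1, 2, 3, 5]
print(sorted(or_first))   # [5]
```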
DS (display sets)
• This command can be executed any time
to review the sets that have been formed
since the last B (begin) command.
• This can be useful to review your search
history.
the target command
• "target set" where set is a search result
set creates a subset of the "statistically
most relevant results" in the original set.
• I have not seen details about how this
subset is computed.
• A new result set is formed.
display: the type command
type set/format/range
• set is a result set
• format is a delivery format
• range can be
– start-end
• start is the record number to start at
• end is the record number to end at
– all
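The shape of the type command can be spelled out with a small Python helper that assembles the command string from its three parts. This is only a sketch of the set/format/range pattern given on the slide: the function name type_command is invented, and the exact spelling DIALOG accepts for set names and ranges should be checked against its documentation.

```python
def type_command(result_set, fmt, start=None, end=None):
    """Build a 'type set/format/range' command string (hypothetical helper).

    If no start and end are given, the range is 'all'.
    """
    if (start is None) != (end is None):
        raise ValueError("give both start and end, or neither")
    record_range = "all" if start is None else f"{start}-{end}"
    return f"type {result_set}/{fmt}/{record_range}"


print(type_command("S1", "5", 1, 10))  # type S1/5/1-10
print(type_command("S2", "KWIC"))      # type S2/KWIC/all
```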
standard delivery formats
• 2 – full record except abstract
• 3 or medium – citation
• 5 or long – full except full text
• 6 or free – title and dialog number
• 8 or short – title plus indexing terms
– useful to find other indexing terms
• 9 or full – everything
• KWIC or K – keywords in context
options for delivery
• I once tried to email results to myself, to no
avail
• You can save the html of the search
results in the browser.
• You can print the results within the
browser.
http://openlib.org/home/krichel
Thank you for your attention!