Search Tips - Freedom Forum
Search Tips
or, competing with search robots
Inspired by
Mary Ellen Bates’ workshop
“Tips From a Super Searcher: Getting the Most From
the Web and Online Sources”, Prague, 2003.
Toshka Borisova
AUBG Freedom Forum Journalism Library Coordinator
Search Tips
The World Wide Web contains more
information than any other single resource in
existence today. Finding the information you
are looking for among billions of web pages
can be tough. This guide to search tips will
put you on the road to finding information
quickly and effectively.
Web search tips
The invisible web
19 and 26 June 2003
Toshka Borisova
Online Search Strategies
What are you looking for:
Full text or abstracts?
Current material or 10 years back?
Basic or advanced material?
Short or in-depth articles?
Any "validating" sources?
Exact match or something close?
Leads to identify experts to call?
White papers (official sets of proposals in specific policy
areas), statistics and other info more likely to be on
web sites?
Online Search Tips
Use "advanced search" option
http://www.aubg.bg/library/text.php?i=680
Google: Well known as the "king of search," this engine
has one of the largest databases of web pages in the
world. Fast, accurate results are common here, and
chances are good that if you can't find it in Google, it's
not meant to be found.
Online Search Tips
Plan on two separate search sessions
Be sure to value your time
White paper on the true cost of searching
the open web vs. the professional online
services: www.factiva.com/infopro/BusIntellletter.pdf
Assume you will find something
We have higher relevance expectations
than our patrons
Watch for what's not online
Online Search Tips
Watch for references to "grey literature"
"That which is produced on all levels of government, academics,
business and industry in print and electronic formats, but which is
not controlled by commercial publishers."
Include www or http in your search strategies
to find mentions of web sites
Always use several tools for the same search
Watch for alternate spellings and
phrasings
Use same words in different order
Web Search Tips
Use tools, not search engines. There is
absolutely no pattern
Wayback Machine
http://www.archive.org/
Purge your "assumptions cache" regularly
Keep a trail of where you have been
Be sure to value your time
Web Search Tips
When exploring a site, use the Site Map or Site Index
Use the [Search This Site] feature to find hidden
pages
Know the "power tools" of each search engine
Field searches
File-type searches
Limits by date, language, site
Truncation
Boolean
Search Tips
Keyword Search
Many search engines by default offer a keyword search
Phrase Search.
Boolean Operators
Named after mathematician George Boole, Boolean logic
involves the operators AND, OR, NOT, and occasionally NEAR
Online Search Tips
Keyword Search
Use KWIC (Key Word In Context)
Try to find synonyms, acronyms
http://www.keyworddensity.com/
http://www.wordtracker.com/
Search for key words in title
Use the "at least X times" feature
(DJI/Factiva, LexisNexis, Dialog)
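The KWIC and "at least X times" ideas above can be sketched in a few lines of Python. This is only an illustration on a made-up sample sentence; the function names and the context width are assumptions, not features of any of the services named on the slide.

```python
import re

def kwic(text, keyword, width=30):
    """Return every occurrence of keyword with up to `width` characters
    of context on each side (case-insensitive substring match)."""
    lines = []
    for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(m.start() - width, 0)
        end = min(m.end() + width, len(text))
        lines.append(text[start:end])
    return lines

def appears_at_least(text, keyword, times):
    """True if keyword occurs at least `times` times in text."""
    found = re.findall(re.escape(keyword), text, flags=re.IGNORECASE)
    return len(found) >= times

# Made-up sample text, used only for the demonstration.
sample = ("Grey literature is hard to find. "
          "Libraries collect grey literature from agencies.")

for line in kwic(sample, "grey literature", width=12):
    print(line)
print(appears_at_least(sample, "grey", 2))
```

Seeing the keyword in its context is a quick way to judge whether a hit is really about your topic before opening the full document.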
Web Search Tips
Phrase Searching
Requires the terms to appear in the exact order that
they are typed. Most systems that allow phrase
searching have the user enter the phrase in quotes.
"national endowment for the arts"
Phrase searching is supported by all engines
Google - Phrases may not be on page
Teoma - "Not always exact matches" (FIXED)
Openfind - Debuted in beta form on July 5, 2002.
Openfind is a new, large, independently built search engine, initially
claiming 3.5 billion pages. It is based on research in Taiwan and has a
Chinese version as well. None available now
Web Search Tips
Boolean operators
Just use them wisely
– Simple ANDs, ORs
– Narrows results
Boolean NOT ( - )
– Exclude meaning
– Exclude domains
Boolean OR
– Crucial synonyms
– Need more pages
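One way to picture the three operators is as set operations on the pages that contain each term. The sketch below uses a tiny made-up index (the page ids and terms are hypothetical); real search engines are far more elaborate, but the narrowing and broadening effects are the same.

```python
# Toy index: which (hypothetical) page ids contain each term.
index = {
    "yellowstone": {1, 2, 3, 4},
    "bison": {1, 2},
    "buffalo": {2, 3, 5},
    "york": {5},  # e.g. a page about Buffalo, New York
}

# AND keeps only pages that contain both terms: it narrows results.
both = index["yellowstone"] & index["bison"]

# OR keeps pages that contain either synonym: it gives more pages.
either = index["bison"] | index["buffalo"]

# NOT removes pages with an unwanted meaning.
without = index["buffalo"] - index["york"]

print(sorted(both), sorted(either), sorted(without))
```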
Web Search Tips
To OR or not to OR
Google: OR in CAPS, advanced
– Does not always work right
– yellowstone bison OR buffalo
AlltheWeb: use ( ) or Advanced Boolean Box
– yellowstone (bison buffalo)
AltaVista: normal
– yellowstone AND (bison OR buffalo)
Gigablast: Use + (but not the same)
– +yellowstone bison buffalo
Teoma
– yellowstone bison OR buffalo
– Becomes (yellowstone AND bison) OR buffalo
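Because each engine writes OR differently, a small helper can produce the right string for each one. The sketch below simply reproduces the example queries from this slide; the function name and the engine keys are assumptions made for illustration, and the syntax reflects the engines as described here in 2003, not how any of them behave today.

```python
def or_query(engine, required, alternatives):
    """Build 'required AND (alt1 OR alt2 ...)' in the syntax this slide
    lists for each engine. `engine` is a lowercase key chosen for this sketch."""
    alts = list(alternatives)
    if engine == "google":
        # OR must be in capitals.
        return f"{required} " + " OR ".join(alts)
    if engine == "alltheweb":
        # Parentheses mean "any of these".
        return f"{required} (" + " ".join(alts) + ")"
    if engine == "altavista":
        # Ordinary Boolean syntax.
        return f"{required} AND (" + " OR ".join(alts) + ")"
    if engine == "gigablast":
        # + marks the required word (similar, but not the same as OR).
        return f"+{required} " + " ".join(alts)
    raise ValueError(f"no OR syntax recorded for {engine!r}")

for name in ("google", "alltheweb", "altavista", "gigablast"):
    print(name, "->", or_query(name, "yellowstone", ["bison", "buffalo"]))
```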
Web Search Tips
Proximity
– Text matching
– citation hunt
– plagiarism check
– Q&A
NEAR and Other Proximity
– AltaVista only
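A proximity search asks whether two words occur close to each other, not just on the same page. Below is a minimal sketch of that idea; the function name and the default window of 10 words are assumptions for illustration and are not taken from AltaVista's documentation.

```python
import re

def near(text, a, b, max_gap=10):
    """True if words a and b occur within max_gap words of each other."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)

sentence = "The bison herd grazed in Yellowstone all winter"
print(near(sentence, "bison", "yellowstone", max_gap=5))
print(near(sentence, "bison", "yellowstone", max_gap=2))
```

The same check is what makes proximity useful for citation hunts and plagiarism checks: words that sit together in the source should sit together in the copy.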
Web Search Tips
Truncation
Searches for variants of a word by using a symbol to
represent one or more characters. The most common
symbols are * (asterisks), ? (question marks), and !
(exclamation marks). If truncation is not supported by
the search engine use the Boolean operator OR to
combine like terms.
– AltaVista: truncation
– HotBot & MSN: truncation
– A related term is "stemming": MSN (e.g., finds "movies" if your
search word is "movie")
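When an engine does not support truncation, the fallback described above is to list the word forms yourself and join them with OR. The sketch below shows both sides: a helper that builds the OR group and a helper that mimics right-hand truncation with `*`. Both function names are hypothetical.

```python
def or_group(variants):
    """Join word variants into one parenthesised OR group."""
    return "(" + " OR ".join(variants) + ")"

def matches_truncated(pattern, word):
    """True if word matches a right-truncated pattern such as 'librar*'."""
    if pattern.endswith("*"):
        return word.lower().startswith(pattern[:-1].lower())
    return word.lower() == pattern.lower()

forms = ["library", "libraries", "librarian"]
print(or_group(forms))
print([w for w in forms + ["liberty"] if matches_truncated("librar*", w)])
```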
Web Search Tips
Case Sensitive (e.g., alaskan pipeline, with the incorrect lowercase "a")
– AltaVista Advanced or Quoted Simple
– MIT vs. mit or IT vs. it
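The difference between a case-sensitive and a case-insensitive match is easy to see with a regular expression. The sample sentence is made up for the demonstration.

```python
import re

text = "MIT offers IT courses; it is near a shop selling one mit."

# Case-sensitive: only the exact capitalisation matches.
exact_mit = re.findall(r"\bMIT\b", text)
exact_it = re.findall(r"\bIT\b", text)

# Case-insensitive: acronym and ordinary word are mixed together.
loose_mit = re.findall(r"\bMIT\b", text, flags=re.IGNORECASE)
loose_it = re.findall(r"\bIT\b", text, flags=re.IGNORECASE)

print(exact_mit, loose_mit)
print(exact_it, loose_it)
```

This is why an engine that ignores case returns so much noise for acronyms that are also common words.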
Web Search Tips
Wild Card Word in Phrase
Wild Card characters represent undefined letters or numerals in a
search term. Wild Card characters allow for retrieval of:
- Singular and plural word forms
- Spelling variations (e.g., British/American spellings)
- Word stems with prefixes and suffixes
* - Represents zero to any number of characters at the
beginning or end of a term.
  *GROW* - Possible retrievals: GROW, GROWS, OUTGROWTH
? - Represents exactly one character within a term...
  T??TH - TEETH, TOOTH, TRUTH
...or one character at the end of a term
  AMIN? - AMINE, AMINO
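The two wild card characters described above map directly onto regular expressions. The following sketch translates `*` and `?` and checks the slide's own examples; the helper names are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * (zero or more characters) and ? (exactly one
    character) into a compiled, case-insensitive regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(pattern, word):
    """True if the whole word fits the wild card pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None

print([w for w in ["GROW", "GROWS", "OUTGROWTH", "GREW"] if matches("*GROW*", w)])
print([w for w in ["TEETH", "TOOTH", "TRUTH", "TEETHE"] if matches("T??TH", w)])
print([w for w in ["AMINE", "AMINO", "AMIN"] if matches("AMIN?", w)])
```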
Web Search Tips
Field Searching
Field searching allows the searcher to designate
where a specific search term will appear. Rather than
searching for words anywhere on a Web page, fields
define specific structural units of a document. The title,
the URL, an image tag, or a hypertext link are
common fields on a Web page.
How search engines work
Spidering program - Collect links
Indexing program - Include metatags
Search/retrieval program - Sort results
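The three programs listed above can be imitated in miniature. In this sketch a small dictionary of made-up pages stands in for what the spider collected, an inverted index plays the role of the indexing program, and a simple AND search stands in for retrieval. The URLs and texts are hypothetical, and alphabetical sorting is only a placeholder for real relevance ranking.

```python
import re
from collections import defaultdict

# Stand-in for the spider's output: url -> page text (all made up).
pages = {
    "a.example/home": "Search tips for the web",
    "a.example/deep": "Invisible web search strategies",
    "b.example/": "Tips on grey literature",
}

def build_index(pages):
    """Indexing step: map every word to the set of pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Retrieval step: pages containing all query words, sorted by URL."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)

index = build_index(pages)
print(search(index, "web search"))
print(search(index, "tips"))
```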
Web Search Tips
Link Searching
Pages include a link to the specified URL.
Link Updates, Impact Analysis
- Best at AltaVista, AlltheWeb
– Can have different results for
http://www.name.org/
Example: http://www.freedomforum.org/ - finds pages with links to
this site
Title: title:searching will look for the word 'searching' in the
title of a Web page. Hits have the term(s) in the HTML
title element.
title:"search engines"
Web Search Tips
Field Searching
IP: Page is in the specified IP range. Incomplete numbers
are truncated.
ip:216.32.120 finds any computer in 216.32.120.*
Site: Results are only from the specified site.
site:nasa.gov - finds pages at NASA's Web site
Suburl: Pages have the term(s) somewhere in the URL
(host name, path, or filename).
suburl:searchenginewatch
URL: Result must be exactly this URL and nothing
else.
url: www.slashdot.com/index.html
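To make the field definitions above concrete, here is a rough sketch of the test each field implies for a single page. It is an illustration of the idea only; the function names are hypothetical and no search engine is claimed to work exactly this way.

```python
from urllib.parse import urlparse

def match_site(url, site):
    """site: the page's host is the site itself or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == site or host.endswith("." + site)

def match_suburl(url, term):
    """suburl: the term appears anywhere in the URL."""
    return term.lower() in url.lower()

def match_ip(address, prefix):
    """ip: the address starts with the (possibly incomplete) number."""
    return address == prefix or address.startswith(prefix.rstrip(".") + ".")

print(match_site("http://www.nasa.gov/missions", "nasa.gov"))
print(match_suburl("http://searchenginewatch.com/facts/", "searchenginewatch"))
print(match_ip("216.32.120.7", "216.32.120"))
```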
Web Search Tips
– Field Searching
Field     Supported by
title:    AltaVista, AlltheWeb, HotBot, Lycos, Gigablast
intitle:  Google, Teoma
url:      AltaVista, AlltheWeb, Lycos, Gigablast
inurl:    Google, Teoma
site:     AlltheWeb, Gigablast, Google, Teoma
link:     AltaVista, Google, AlltheWeb, HotBot, Gigablast
anchor:   AltaVista
image:    AltaVista
Web Search Tips
Selected Limits
Usually on advanced search form
Language: available at most engines; the languages offered vary
Date: AlltheWeb, AltaVista, Google, Inktomi
– Cut out old material, focus search
standard
– Or to find old information
File Type: AlltheWeb, AltaVista, Google, Inktomi
– PDFs at all of them, Flash at AlltheWeb
Media Type: HotBot, MSN, AlltheWeb
Page Size: AlltheWeb
IP Range: AlltheWeb
Web Search Tips
Diacritics: é
Does e find é? - Sometimes
Not at Google
– Exact match on diacritics only
At other search engines
– e usually finds e OR é
– é usually finds only é
Use English equivalents for special letters and
omit diacritics
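If you need to strip diacritics from many search terms, Python's standard `unicodedata` module can do it. This is a minimal sketch: it handles accented letters that decompose into a base letter plus an accent, but letters such as ø or ß have no such decomposition and are left unchanged.

```python
import unicodedata

def strip_diacritics(text):
    """Remove accents by decomposing each character (NFD) and
    dropping the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_diacritics("café"))
print(strip_diacritics("Dvořák"))
```

Searching for both the original and the stripped form covers engines that match diacritics exactly as well as those that do not.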
Web Search Tips
Counting Complexities
Search Engines Can’t Count
Only the big search engines count (the top 10 search engines)
Numbers constantly change
– From one page of results to the next
– From one minute to the next
Try reloading for more
Web Search Tips
Feature Inconsistencies
Database Changes
– Constant
– If they don’t . . .
• They get old, out-of-date, dead links
– Size Changes Often Sudden
– Database Reversions
– Searching Failures And Other Unexpected Results
On the Fly Analysis
Always Question Results
Evaluate and Compare
Find one unique, low-posted term
– Use for search engine comparisons
– Evaluate change over time
“On-the-Fly Search Engine Analysis.” ONLINE 23(5):63-66, Sept.
1999. onlinemag.net/OL1999/net9.html
Web Search Tips
SEO - Search Engine Optimization
SearchEngineShowdown.com
More on Advanced Features
Feature Chart
Detailed Reviews
Search Engine Watch
http://www.searchenginewatch.com/facts/ataglance.html
Inconsistencies
Low Recall or "I am not finding any sites on my
topic!!"
Have I chosen the correct database?
Have I been too specific in formulating the search?
Have I included all possible terms and word forms? Should I
use truncation?
Was Boolean logic used correctly?
Did I make a technical error, e.g., spelling, or command
syntax?
Low Precision or "I found hundreds of citations and
many are not on my topic!!"
Delete less specific synonyms and ambiguous terms
Search fewer fields e.g., just the title field or URL
Add additional facets with AND or NOT
Add restrictions, e.g., date of publication
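Recall and precision have simple standard definitions: recall is the share of all relevant documents that the search found, and precision is the share of the found documents that are relevant. The sketch below computes both for a made-up result set; the document ids are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved documents,
    given the set of documents that are actually relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# 4 results came back, 2 of them relevant; 5 relevant documents exist.
p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6, 8, 10])
print(p, r)
```

Broadening a search (more synonyms, truncation, OR) tends to raise recall at the cost of precision; narrowing it (AND, NOT, field limits) does the reverse.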
The Invisible Web
What is it?
It consists of searchable information resources whose
contents cannot be indexed by traditional search
engines.
Content in databases
Professional online services
Non-ASCII files
Sites that require log-in or registration
Real-time information
Dynamically-created web pages
Discussion forums and BBSs
Searching the Invisible Web
Much "invisible" content has a
"visible web" front
Some databases are opening up
Google searches PDF, XLS, RTF, DOC files
Searching the Invisible Web
Use directories and portals
-Open Directory Project http://www.dmoz.org is the
largest, most comprehensive human-edited directory of
the Web. It is constructed and maintained by a vast,
global community of volunteer editors.
-Librarian’s Index to the Internet http://www.lii.org
-Subject-specific directories http://www.econ.bg
Experts and info pros watch for this material
Experts.com www.experts.com
A reliable and diverse source of experts, many of whom
are outside the academic arena.
Yahoo - http://groups.yahoo.com/
Search for database or forum along with subject terms
Searching the Invisible Web
Use meta-search engines
DogPile.com
MetaCrawler.com
Use Teoma.com's "Experts' Links"
Scan the libraries of relevant
discussion groups
Lurk on lists
Searching the Invisible Web
Use reverse link look-up to find "more
like this"
Google and AltaVista:
link:www.BatesInfo.com
HotBot: http://www.hotbot.com/
link:www.aubg.bg/fforum - use [Links to this URL]
The Invisible Web
Invisible Web Directories
http://www.invisibleweb.com/
The InvisibleWeb Catalog™ contains over 10,000 databases and
searchable sources that have been frequently overlooked by traditional
searching.
CompletePlanet.com - Contains 103 searchable databases
DirectSearch - Difficult to use but extensive
http://www.internets.com/ - They have assembled the
largest filtered collection of useful search engines and newswires
anywhere on the World Wide Web.
There are 1-2 billion documents on the "surface web". The deep web is
estimated to be approximately 500 billion documents.
Good hierarchy of databases
Web Search Tips
Set aside one afternoon every two
weeks for your web reading!
More info
http://www.BatesInfo.com