Size of Internet/World Wide Web
Intermediate Internet
Searching
Or
How to really find information on
the internet
Shayna Keces
Reference Librarian
Agenda
Size of Internet
Types of search engines
Search strategies
Choosing a search engine
Interpretation of search results
Size of Internet/World Wide
Web
July 2000: 2.1 billion web pages; an estimated 4 billion
pages by early 2001 (some estimates are much
higher if the invisible or deep web is counted)
Size of search engine databases
Google: 2 billion
Fast (alltheweb): 625 million
AltaVista: 550 million
Yahoo: 2 million catalogued (uses Google for pages not
catalogued)
Search strategies
Do nots
use search button
use a string of keywords without specifying
Boolean properties
use upper case unless it is part of your strategy
use NOT or - unless absolutely sure it is
necessary:
it can eliminate pages you did not anticipate
the format is not standardized
Search Strategies
Do
Consider what type of resource will best
answer your question and search for that
resource (e.g. a dictionary or a certain type of web
page)
Think of a list of keywords that will narrow or
broaden your search, keeping in mind that on
the internet, narrowing your search is usually
better
Stick to a small list of search engines and learn
the search syntax for the search engine you’re
using
Boolean Search
Developed by mathematician George Boole
OR widens a search
AND and AND NOT narrow a search
Parentheses are used to group operations that
have to be done together
(Public libraries OR bookstores) AND
(Ottawa OR Nepean OR Gloucester OR
Goulbourn OR Carp)
[Diagram: Group A (Public libraries OR bookstores) AND Group B (Ottawa OR Nepean OR Gloucester OR Kanata OR Goulbourn OR Carp)]
[Venn diagram of Senators and hockey: orange area = Senators AND NOT hockey]
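The Boolean operators above map directly onto set operations, which a short Python sketch can show. The five pages below are invented for illustration and `matching` is a hypothetical helper, not how any real search engine stores or searches its database.

```python
# Toy "database": page id -> page text. All pages are made up for this example.
PAGES = {
    "page1": "public libraries in ottawa",
    "page2": "bookstores in kanata",
    "page3": "public libraries in toronto",
    "page4": "ottawa senators hockey schedule",
    "page5": "canadian senators and the senate",
}

def matching(phrase):
    """Return the set of page ids whose text contains the phrase (ignoring case)."""
    phrase = phrase.lower()
    return {pid for pid, text in PAGES.items() if phrase in text.lower()}

# (Public libraries OR bookstores) AND (Ottawa OR Kanata)
group_a = matching("public libraries") | matching("bookstores")  # OR widens
group_b = matching("ottawa") | matching("kanata")                # OR widens
both = group_a & group_b                                         # AND narrows

# Senators AND NOT hockey
senators_not_hockey = matching("senators") - matching("hockey")  # AND NOT narrows

print(sorted(both))                 # ['page1', 'page2']
print(sorted(senators_not_hockey))  # ['page5']
```

Each parenthesized group is evaluated first, and the groups are then combined, which is why the parentheses matter.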
Types of search engines
Keyword or robot based (builds a database)
Directory based (categories indexed by
people rather than computer)
Annotated directory-based search engines
Meta indexes (can combine searches or allow
you to search a variety of engines
individually)
Specialized search engines
Keyword or robot based
Search Engines
Large database of web pages
No human involvement and no quality control
You can submit a website, or the robot will find some on its own
Searches full text to a certain level; does not
search the deep or invisible web
Google (www.google.com)
Alta Vista (www.altavista.com)
Hotbot (www.hotbot.com)
Fast (www.alltheweb.com)
Google (www.google.com)
Presently the largest database (1.5-2 billion pages)
Very sophisticated placement of results;
particularly good for popular sites and company
sites
Advanced search can limit a search to the title of
a page or to its URL
Implied AND between keywords
+ forces stop words to be included
Google (www.google.com)
cont.
If you want OR, it needs to be expressed in
caps
not case sensitive
no stemming
description shows keywords in context
cached pages
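To make "implied AND", "not case sensitive" and "no stemming" concrete, here is a rough Python sketch. The function `implied_and_match` and the sample page text are assumptions made for illustration. This is not Google's actual matching or ranking.

```python
def implied_and_match(query, page_text):
    """True only if every query word appears as a whole word in the page.

    Implied AND: all words are required.
    Not case sensitive: both sides are lowercased.
    No stemming: 'libraries' does not match 'library'.
    """
    page_words = set(page_text.lower().split())
    return all(term.lower() in page_words for term in query.split())

page = "Ottawa Public Library hours and branches"  # invented sample page

print(implied_and_match("ottawa LIBRARY", page))    # True
print(implied_and_match("ottawa libraries", page))  # False (no stemming)
print(implied_and_match("ottawa bookstore", page))  # False (all words required)
```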
AltaVista (www.altavista.com)
One of larger search engines
Particularly good for finding less popular sites
Probably implied OR, but this often changes
Case sensitive when word is in quotations
Stemming with * at end or in middle of words
Can search within results
Sophisticated search of page elements (URL, text,
etc.)
http://help.altavista.com/adv_search/syntax
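The effect of stemming with * can be tried out with Python's standard `fnmatch` module. The word list and the `expand` helper are hypothetical, and AltaVista's own wildcard rules may differ in detail (for example, `fnmatch` also gives ? and [ ] a special meaning).

```python
from fnmatch import fnmatchcase

# Invented word list standing in for words found on web pages.
WORDS = ["library", "libraries", "librarian", "color", "colour", "colony"]

def expand(pattern, words=WORDS):
    """Return the words a pattern matches, where * stands for any run of letters."""
    return [w for w in words if fnmatchcase(w, pattern)]

print(expand("librar*"))  # ['library', 'libraries', 'librarian']
print(expand("col*r"))    # ['color', 'colour']
```

The first pattern shows * at the end of a word, the second shows * in the middle, which is handy for spelling variants.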
AltaVista Advanced Search
Has guided search as well as a blank space for
true Boolean search using Boolean terms and
parentheses
Must use Boolean operators or the equivalent
symbols (not + and -)
No operators implies a phrase
Has a sort-by feature which can be used to
determine how results are ordered
Can specify dates of last modification
Directory-based Search
Engines
Indexed by individuals so subject searches
will be more accurate
Smaller database than Robot engines
Used mainly for finding a good site on a general
topic
Yahoo (www.yahoo.com or ca.yahoo.com)
About (about.com or
home.about.com/aboutcanada)
Looksmart (www.looksmart.com)
Yahoo (ca.yahoo.com)
Most popular of directory based search
engines
Many different versions (international versions have the
same pages as the others, but local options are
listed first)
Uses Google as its search engine
Can search by categories and move up and
down the category structure by clicking on
category and looking at hierarchy
About (about.com or
home.about.com/aboutcanada)
Another popular directory-based search
engine
Volunteer guides responsible for finding
good websites on appropriate subjects
Some guides exist on all versions of
About, but geographic versions have
items specific to the country
Annotated directory-based
search engines
Because it is annotated, the database is even
smaller than that of a directory-based engine
Quality of web pages is better
Web pages often rated
Librarian’s Index to the Internet (lii.org)
Argus Clearinghouse
(www.clearinghouse.net)
Argus Clearinghouse
(www.clearinghouse.net)
Topical list of fairly scholarly guides submitted
to Argus on a variety of subjects.
Can have more than one guide or page on
the same subject.
Not all are accepted, and all accepted guides are objectively
rated by Argus staff; the detailed rating is
available.
Because Argus does not solicit web pages,
coverage is uneven
Date of rating is also provided
Meta indexes
One site searches more than one
search engine
Results can be separated or combined
Sometimes a problem in interpreting
question for all search engines
Used if you are not sure which search engine
will give you the best results, and/or for obscure
topics
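The difference between separated and combined results can be sketched in Python. The two engines here are stand-in functions returning made-up example.com-style addresses. A real meta index would also have to translate the question into each engine's own syntax, which is the problem noted above.

```python
def engine_one(query):
    # Hypothetical engine: returns a fixed, ranked list of URLs.
    return ["http://a.example/", "http://b.example/", "http://c.example/"]

def engine_two(query):
    # Second hypothetical engine with partly overlapping results.
    return ["http://b.example/", "http://d.example/"]

ENGINES = {"one": engine_one, "two": engine_two}

def search_separated(query):
    """Return results grouped under the name of each engine."""
    return {name: engine(query) for name, engine in ENGINES.items()}

def search_combined(query):
    """Return one list in first-seen order, with duplicate URLs removed."""
    seen, merged = set(), []
    for results in search_separated(query).values():
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

print(search_separated("public libraries"))
print(search_combined("public libraries"))
```

The combined list is shorter than the two lists put together because the page found by both engines appears only once.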
Meta indexes examples
Dogpile (www.dogpile.com)
Metacrawler
(www.metacrawler.com/index.html)
Surfwax (www.surfwax.com)
All4one Search machine
(www.all4one.com)
Specialized Search Engines
Geographic based
(www.altavistacanada.com,
http://www.ottawastart.com/)
Phone directories (canada411.sympatico.ca/,
home.infospace.com/)
Newsgroup searching (groups.google.com)
Women’s information (wwwomen.com)
Specialized sites
Ottawa Public Library
(www.library.ottawa.on.ca)
Reference tools (see library reference sites,
eg. lii.org, www.ipl.org/ref)
Encyclopedias (www.britannica.com,
Columbia encyclopedia www.bartleby.com/65/)
Canadian information (vrl.tpl.toronto.on.ca/,
Canadian information by subject www.nlcbnc.ca/caninfo/ecaninfo.htm, Canadian
encyclopedia online
www.thecanadianencyclopedia.com/)
Some hints on selecting
search strategies
For any page on a general topic for which you need an
introduction, try a directory-based search
engine. If you do not need a specific level of quality, you can
use address bar search
For the web page of a major company or
organization, try Google, or Alta Vista if it is more
obscure
For a specific web page that would not
necessarily be popular, try Alta Vista.
Some hints on selecting
search strategies cont.
For health topics, try a health website
engine like www.medbroadcast.com or the
health links on the OPL web page.
For a very obscure topic, try Google
or Alta Vista or one of the meta indexes
Interpretation of search results
Look at the results and reformulate the search using
things like searching within results and
adding new keywords
Analytically choose which sites to look at in the
result list
Anatomy of a URL: domain + type of name
Do not look through pages and pages of
results. If the first three pages are not promising,
redo the search
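Reading the anatomy of a URL can be sketched with Python's standard `urllib.parse` module. The `describe` helper and the short `SUFFIX_HINTS` table are assumptions for illustration. The ending of a domain is only a hint about who is behind a site, not proof.

```python
from urllib.parse import urlsplit

# Illustrative hints only; this table is far from complete.
SUFFIX_HINTS = {
    "com": "commercial",
    "org": "organization",
    "edu": "educational institution",
    "ca": "Canada",
}

def describe(url):
    """Return (domain, ending, hint) for a web address."""
    if "://" not in url:
        url = "http://" + url  # addresses are often written without http://
    domain = urlsplit(url).hostname
    ending = domain.rsplit(".", 1)[-1]
    return domain, ending, SUFFIX_HINTS.get(ending, "unknown")

print(describe("www.library.ottawa.on.ca"))
# ('www.library.ottawa.on.ca', 'ca', 'Canada')
print(describe("http://www.ottawastart.com/"))
# ('www.ottawastart.com', 'com', 'commercial')
```

Checking the domain this way before clicking helps in choosing which sites in a result list are worth a look.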
Some useful tutorials for
searching
See “Learning to search” section of Collection
of special search engines
www.leidenuniv.nl/ub/biv/specials.htm
Web searching tips
www.searchenginewatch.com/facts/index.html
Net tutor (gateway.lib.ohiostate.edu/tutor/les5/)
Check links under Internet, General in OPL
adult links (www.library.ottawa.on.ca)
To find more info on search
engines
Searchenginewatch
(www.searchenginewatch.com)
Searchengineshowdown
(www.searchengineshowdown.com)