Basic Internet Search
Techniques
Or
How to really find information on
the internet
Shayna Keces
Reference Librarian
236-0301 ext. 441
August 2004
Agenda
Size of Internet
Types of search engines
Search strategies
Some hints on selecting search
strategies
Interpretation of search results
Tutorials on searching and search
engines
Size of Internet/World Wide
Web
July 2000: 2.1 billion web pages; an estimated
4 billion pages by early 2001 (some estimates are
much higher if the invisible or deep web is counted)
Size of search engine databases
Google 4.28 billion
Fast (alltheweb) 2.1 billion
AltaVista 1.1 billion
Yahoo 2 million catalogued
Search strategies
Do not
use search button
use a string of keywords without specifying
Boolean properties
use upper case unless it is part of the strategy
use NOT or - unless absolutely sure it is
necessary
it may eliminate pages you did not anticipate
the format is not standardized across search engines
Search Strategies
Do
Consider what type of resource will best
answer your question and search for that
resource (e.g. a dictionary or a certain type of web
page)
Think of a list of keywords that will narrow or
broaden your search, keeping in mind that on
the internet, narrowing your search is usually
better
Stick to a small list of search engines and learn
the search syntax of the search engine you’re
using
Types of search engines
Keyword or robot based (builds a database)
Directory based (categories indexed by
people rather than computer)
Annotated directory-based search engines
Meta indexes (can combine searches or allow
you to search a variety of engines
individually)
Specialized search engines
Keyword or robot based
Search Engines
Large database of web pages
No human involvement and no quality control
Websites can be submitted, or the robot will find some on its own
Searches full text to a certain level; does not
search the deep or invisible web
Google (www.google.com)
Alta Vista (www.altavista.com)
Fast (www.alltheweb.com)
Wisenut (www.wisenut.com)
Google (www.google.com)
Presently largest database (ca. 4 billion)
Very sophisticated placement of results;
particularly good for popular sites and company
sites
Advanced search can limit search to title of
page or to URL
Implied AND between keywords
Use + to include stop words
If you want OR, it needs to be typed in caps
Not case sensitive
Google (www.google.com)
cont.
No stemming or truncation (except on an ad hoc
basis controlled by Google)
description shows keywords in context
cached pages helpful for sites not working
Searches some formats not found in other
search engines (e.g. Adobe Acrobat and
PostScript files; Excel, PowerPoint, and Word
files; and rich text files)
Innovative in new features (e.g. the ability to
convert measurements, such as 4 miles in km)
See www.google.ca/help/features.html for a
description of features.
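The Google query conventions listed on the two slides above (implied AND, + to force a stop word, upper-case OR, no case sensitivity) can be sketched in a few lines of Python. This is only an illustration of the syntax as these slides describe it: the function name build_query and the STOP_WORDS list are hypothetical, and the list is not Google's actual stop-word list.

```python
# Illustrative stop words only; the real list used by a search engine is not given in the slides.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}


def build_query(required, alternatives=()):
    """Build a query string following the conventions on the slides.

    required      -- terms that must all appear (joined by spaces: implied AND)
    alternatives  -- terms of which any one may appear (joined by upper-case OR)
    """
    parts = []
    for term in required:
        word = term.lower()  # case does not matter, so normalise to lower case
        # A leading + asks the engine not to drop a stop word.
        parts.append("+" + word if word in STOP_WORDS else word)
    if alternatives:
        # OR must be written in capitals to be treated as an operator.
        parts.append(" OR ".join(a.lower() for a in alternatives))
    return " ".join(parts)


print(build_query(["The", "Who", "concert"], ["Ottawa", "Toronto"]))
```

Writing the query this way makes the Boolean intent explicit instead of relying on a bare string of keywords.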
AltaVista (www.altavista.com)
One of the larger search engines (1.3 billion
pages/objects or more)
Particularly good for finding less popular sites
Implied “and”, but noted for changing this default
Case sensitive when the word is in quotation marks
Stemming with * at the end or in the middle of words
Has related terms, which help you focus your
search
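To see what stemming with * means in practice, the sketch below expands a wildcard pattern against a small word list. It uses Python's fnmatch module as a stand-in for the search engine's own matching, which the slides do not describe in detail; the helper name expand and the word list are made up for the example.

```python
from fnmatch import fnmatchcase


def expand(pattern, vocabulary):
    """Return the words in vocabulary that match a pattern containing *."""
    # In fnmatch, * stands for any run of characters, including none.
    return [word for word in vocabulary if fnmatchcase(word, pattern)]


words = ["library", "libraries", "librarian", "liberty", "color", "colour"]

print(expand("librar*", words))  # * at the end of the word
print(expand("colo*r", words))   # * in the middle of the word
```

A trailing * picks up plurals and related forms, while a * in the middle covers spelling variants such as color and colour.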
AltaVista Advanced Search
Has a “build a Boolean search” facility, or
you can create your own
Can specify that pages come from a certain
country based on country codes, so results
will not include .com etc.
Can specify dates of last modification
Directory-based Search
Engines
Indexed by individuals, so subject searches
will be more accurate
Smaller database than robot engines
Used mainly for finding a good site on a general
topic
Yahoo (www.yahoo.com or ca.yahoo.com)
About (about.com )
Looksmart (www.looksmart.com)
Yahoo (ca.yahoo.com)
Most popular of the directory-based search
engines
Many different versions (international versions have
the same pages as the others, but local options are
supplied first)
Now has its own web search, which is competing
with Google’s
Can search by categories and sub-categories
Annotated directory-based
search engines
Because entries are annotated, the database is even
smaller than that of a directory-based engine
Quality of web pages is better
Web pages often rated
Librarian’s Index to the Internet (lii.org)
The Internet Public Library
(www.ipl.org/)
Librarian’s Index to the
Internet (lii.org)
Topical list of high quality websites with
abstracts and qualitative analysis
Can winnow down by topic or use the search
capability
Only websites which meet the standards of
the editors are included
Provides date site was added to index as well
as date the lii entry was last updated
Meta indexes
One site searches more than one search
engine
Results can be separated or combined
Sometimes has trouble interpreting a question
equally effectively for all search engines
Used if you are not sure which search engine will give
you the best results and/or for obscure topics
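The idea of combining results from several engines can be sketched as follows. This is a simplified illustration, not how any of the meta indexes named in these slides actually works: the engine names and result lists are invented, and no real search engine is contacted.

```python
def merge_results(results_by_engine):
    """Combine result lists from several engines into one de-duplicated listing.

    results_by_engine maps an engine name to its list of result URLs.
    Returns a dict mapping each distinct URL to the engines that returned it,
    in order of first appearance.
    """
    merged = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            # Simplification: treat URLs that differ only by case or a
            # trailing slash as the same page.
            key = url.rstrip("/").lower()
            merged.setdefault(key, []).append(engine)
    return merged


# Hypothetical results from two made-up engines.
sample = {
    "engine_a": ["example.org/a", "example.org/b"],
    "engine_b": ["example.org/b/", "example.org/c"],
}

for url, engines in merge_results(sample).items():
    print(url, "found by", ", ".join(engines))
```

Keeping track of which engine returned each page corresponds to the choice between separated and combined results: the same data can be shown per engine or as one merged list.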
Meta indexes examples
Dogpile (www.dogpile.com)
Metacrawler
(www.metacrawler.com/index.html)
Surfwax (www.surfwax.com)
Hotbot (www.hotbot.com)
Specialized Search Engines
Geographic based
(www.altavistacanada.com,
http://www.ottawastart.com/)
Phone directories (canada411.sympatico.ca/,
www.infospace.com/canada/index.htm)
Newsgroup searching (groups.google.com)
News searching (news.google.ca)
Women’s information (wwwomen.com)
Different formats (www.gimpsy.com/,
www.kartoo.com/)
Specialized sites
Ottawa Public Library
(www.library.ottawa.on.ca)
Reference tools (see library reference sites,
e.g. lii.org, www.ipl.org/ref)
Encyclopedias (www.britannica.com,
Columbia encyclopedia www.bartleby.com/65/)
Canadian information (vrl.tpl.toronto.on.ca/,
Canadian information by subject
www.nlc-bnc.ca/caninfo/ecaninfo.htm, Canadian
encyclopedia online,
www.thecanadianencyclopedia.com/)
Some hints on selecting
search strategies
For any page on a general topic to which you
need an introduction, try a directory-based
search engine. If you do not need a specific level
of quality, you can use the address bar search
For the web page of a major company or
organization, try Google or Alta Vista
For a specific web page that would not
necessarily be popular, try Alta Vista or
Google
Some hints on selecting
search strategies cont.
For health topics try a health website engine
like www.medbroadcast.com or the Canadian
Health Network
(www.canadian-health-network.ca/customtools/homee.html), or the
library’s health database, Health Source
(www.library.ottawa.on.ca/electronic/index.htm),
or the health links on the library’s web
page
(www.library.ottawa.on.ca/english/links/PublicAdults/index.htm).
Some hints on selecting
search strategies cont.
For a very obscure topic try Google
or Alta Vista or one of the meta indexes
For items in databases, try to find the
correct host or search a special site for
invisible websites (e.g. www.invisibleweb.net/)
Interpretation of search results
Look at the results and reformulate the search using
things like searching within results, Prisma,
and adding new keywords.
Analytically choose which sites to look at in
the result list
Anatomy of a URL: domain name + type, i.e.
the name of the organization followed by the type
of organization. Some popular suffixes are:
.com for commercial sites, .edu for university
sites (mainly American), .org for non-profit
organizations, .gov for U.S. government sites,
and .gc.ca for Canadian government sites.
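The suffix list above can be turned into a small lookup to show how the type of organization is read off a URL. This is a minimal sketch covering only the suffixes named on the slide; the function name site_type is hypothetical, and www.example.gc.ca is a placeholder address, not a real site.

```python
from urllib.parse import urlparse

# Suffixes and meanings exactly as listed on the slide.
SUFFIX_TYPES = [
    (".gc.ca", "Canadian government"),
    (".com", "commercial"),
    (".edu", "university (mainly American)"),
    (".org", "non-profit organization"),
    (".gov", "U.S. government"),
]


def site_type(url):
    """Guess the type of organization behind a URL from its domain suffix."""
    if "//" not in url:
        # Addresses written without http:// need a leading // so that
        # urlparse treats the first part as the host name.
        url = "//" + url
    host = (urlparse(url).hostname or "").lower()
    for suffix, label in SUFFIX_TYPES:
        if host.endswith(suffix):
            return label
    return "unknown"


print(site_type("www.britannica.com"))
print(site_type("http://www.ipl.org/ref"))
print(site_type("www.example.gc.ca"))
```

Suffixes not on the slide, such as the .on.ca in the library's own address, come back as "unknown" here. The suffix is only a first hint about who is behind a page.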
Interpretation of search results
cont.
Consider things like the authority of the
author, the currency of the information, and
the reason for creating the website
(implications for bias)
Do not look through pages and pages of
results. If the first three pages are not
promising refine the search (see the first
point on interpreting the results).
Some useful tutorials for
searching
See “Learning to search” section of Collection
of special search engines (appears under
contents on left-hand side of the page)
www.leidenuniv.nl/ub/biv/specials.htm
Web searching tips
www.searchenginewatch.com/facts/index.html
Net tutor (gateway.lib.ohio-state.edu/tutor/les5/)
Some useful tutorials for
searching cont.
In the links section of the Ottawa Public
Library’s web site
(www.library.ottawa.on.ca/english/links/PublicAdults/index.htm),
look under the category WWW under the subcategory
Internet
To find more info on search
engines
Searchenginewatch
(www.searchenginewatch.com)
Searchengineshowdown
(www.searchengineshowdown.com)
For More Help on Searching
Contact the Reference Dept. of the
Main Branch of OPL by phoning 236-0302,
ext. 233, or email
[email protected]
Consult this web page or other
specialized web presentations on the
library’s web page at
http://www.library.ottawa.on.ca/english/services/reference/index.htm