Searching Intelligently

How to do better research using
your favorite search engine.
Bill G. Kelm - Spring 2007
Today’s Goals – To Learn
• How is the web indexed?
– Google in particular.
• Which tool to use when searching the web?
– Search engines, directories, hidden web,
listservs and online discussion groups.
• Drawbacks and advantages of the web.
• Browser tips and research power tools.
• Horizontal searching.
How Search Engines Work
1. Discovery and Database
2. User Search
3. Presentation and Ranking
Source: http://www.webreference.com/content/search/
Google Background
• “Google's mission is to organize the world's
information and make it universally accessible and
useful.”
• Google's founders, Larry Page and Sergey Brin,
developed Google in a Stanford University dorm
room, and it is currently the world's largest
search engine.
Source: http://www.google.com/corporate/
Google’s Discovery and
Database
• Google has programs called spiders (a.k.a. Google
bots) constantly searching the web for new or
updated web pages
• When a spider finds a new or updated page, it
reads that entire page, reports back to Google,
and then visits all of the other pages to which
that new page links
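The discovery step above can be sketched as a breadth-first walk over links. This is only an illustration, not Google's actual crawler: the `TOY_WEB` dictionary and its URLs are hypothetical stand-ins for real pages, and no network access takes place.

```python
from collections import deque

# Hypothetical toy web: each URL maps to the URLs that page links to.
TOY_WEB = {
    "a.example/home": ["a.example/news", "b.example/index"],
    "a.example/news": ["a.example/home", "c.example/story"],
    "b.example/index": ["c.example/story"],
    "c.example/story": [],
}


def crawl(start_url, web):
    """Read start_url, then every page reachable through its links."""
    visited = []          # pages in the order the spider read them
    seen = {start_url}    # guards against reading the same page twice
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        visited.append(url)            # "read the page and report back"
        for link in web.get(url, []):  # follow the page's outgoing links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


print(crawl("a.example/home", TOY_WEB))
```

A page that nothing links to would never be reached this way, which is one reason some content stays invisible to search engines.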
Google’s Cache
• When the spider reports back to Google, it
doesn’t just tell Google the new or updated page’s
URL.
• The spider also sends Google a complete copy of
the entire Web page – HTML, text, images, etc.
• Google then adds that page and all of its content
to Google’s cache.
How Google Works
• When you search for multiple keywords, Google
first searches for all of your keywords as a
phrase.
• So, if your keywords are baseball spring
training, any pages on which those words appear
as a phrase receive a score of X.
Google – Adjacency
• Google then measures the adjacency between
your keywords and gives those pages a score of Y.
• A page with the words “baseball spring training”
next to each other gets a higher score than one with
“baseball” and then “spring training” farther
down the page.
Google – Weights
• Then, Google measures the number of
times your keywords appear on the
page (the keywords’ “weights”) and
gives those pages a score of Z.
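The three scores described on these slides (phrase, adjacency, weight) can be sketched as follows. The formulas are assumptions chosen for illustration, since the slides do not give Google's actual ones: X counts exact phrase matches, Y is the number of keywords divided by the smallest window of words containing all of them, and Z counts keyword occurrences. The function names are hypothetical, and the keywords are assumed to be distinct.

```python
from itertools import product


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def phrase_hits(words, query):
    """X: how many times the keywords appear together as an exact phrase."""
    n = len(query)
    return sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == query)


def adjacency_score(words, query):
    """Y: 1.0 when the keywords sit side by side, lower as they spread out."""
    positions = [[i for i, w in enumerate(words) if w == q] for q in query]
    if any(not p for p in positions):
        return 0.0  # at least one keyword is missing from the page
    # Smallest window of words that contains one occurrence of each keyword.
    span = min(max(combo) - min(combo) + 1 for combo in product(*positions))
    return len(query) / span


def keyword_weight(words, query):
    """Z: total number of times the keywords appear on the page."""
    return sum(words.count(q) for q in query)


query = tokenize("baseball spring training")
page_a = tokenize("Baseball spring training starts in February")
page_b = tokenize("Baseball fans love the spring because training camps open")

for name, page in (("page_a", page_a), ("page_b", page_b)):
    print(name, phrase_hits(page, query), adjacency_score(page, query),
          keyword_weight(page, query))
```

Both toy pages contain each keyword once, so they tie on Z. Only the page with the words side by side scores on X and gets the full Y.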
Presentation & Ranking
• Google takes
– The phrase hits (the Xs),
– The adjacency hits (the Ys),
– The weights hits (the Zs), and
– About 100 other secret variables
• Throws out everything but the top 2,000
• Multiplies each remaining page’s individual score
by its “PageRank”
• And, finally, displays the top 1,000 in order.
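The ranking steps on this slide can be sketched as a small pipeline. The cut-offs of 2,000 and 1,000 come from the slide. Everything else is an assumption made for illustration: the scores are combined with a plain sum, the roughly 100 other variables are left out, and the dictionary keys `url`, `x`, `y`, `z` and `pagerank` are hypothetical.

```python
def rank_pages(pages, shortlist_size=2000, display_size=1000):
    """Return URLs in display order for a list of scored page dicts."""
    # Step 1: combine the content scores (a plain sum is an assumption).
    scored = [(p["x"] + p["y"] + p["z"], p) for p in pages]
    # Step 2: throw out everything but the best-scoring shortlist.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    shortlist = scored[:shortlist_size]
    # Step 3: multiply each remaining score by the page's PageRank.
    final = [(score * p["pagerank"], p["url"]) for score, p in shortlist]
    # Step 4: display the top results in order.
    final.sort(reverse=True)
    return [url for _, url in final[:display_size]]


# Made-up pages and scores, with tiny cut-offs so the effect is visible.
pages = [
    {"url": "a", "x": 1, "y": 1.0, "z": 3, "pagerank": 0.2},
    {"url": "b", "x": 0, "y": 0.4, "z": 3, "pagerank": 0.5},
    {"url": "c", "x": 0, "y": 0.0, "z": 1, "pagerank": 0.9},
]
print(rank_pages(pages, shortlist_size=2, display_size=2))
```

In this toy run page "c" has the highest PageRank but never reaches the multiplication step, because its content score does not make the shortlist.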
Google – PageRank?
• There is a premise in higher education that the
importance of a research article can be judged by
the number of citations to it from subsequent
articles in the same field.
• Google applies this premise to the Web: the
importance of a Web page can be judged by the
number of hyperlinks pointing to it from other
pages.
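The slide describes PageRank as counting links. The published algorithm goes a step further and weights each link by the rank of the page it comes from. A minimal sketch of that idea, using power iteration on a hypothetical four-page graph, is shown here. The damping factor of 0.85 is the commonly cited value, and `TOY_LINKS` is invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over every page.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: every link target also appears as a key.
TOY_LINKS = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "orphan": ["home"],
}

ranks = pagerank(TOY_LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph "home" receives the most links and ends up with the highest rank, while "orphan", which nothing links to, ends up with the lowest.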
Google Advanced &
Tricks
• Calculator
• Define
• ~, +, - operators
• Advanced Searching
• Finding Information on the Internet: A Tutorial
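The operators on this slide are typed directly into the search box, so a query is just a string. The sketch below assembles such strings, assuming the operator meanings documented around the time of this talk: `~word` for synonyms, `+word` to require a word exactly, `-word` to exclude a word, and `define:term` for definitions. Some of these may no longer be supported. The helper names are hypothetical.

```python
def build_query(keywords, synonyms=(), required=(), excluded=()):
    """Join plain keywords with ~, + and - prefixed terms into one query."""
    parts = list(keywords)
    parts += ["~" + word for word in synonyms]   # also match similar words
    parts += ["+" + word for word in required]   # word must appear as typed
    parts += ["-" + word for word in excluded]   # word must not appear
    return " ".join(parts)


def define_query(term):
    """Build a definition lookup for a single term."""
    return "define:" + term


print(build_query(["architecture"], synonyms=["guide"], excluded=["software"]))
print(define_query("cantilever"))
```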
My Favorite Quote:
• “Focus on users and their tasks, not the
technology.” – Jeff Johnson
When Searching the
Web:
• “Focus on your query, not the technology.”
Four tools:
1. Search Engines
2. Directories
3. Invisible Web (Deep Web)
4. Listservs and Online Discussion Groups
Which Tool to Use?
• “It all depends.”
When to use a
Search Engine:
• You are looking for the “Society of
American Registered Architects.”
• You have a specific phrase or unique
keyword
Which Search Engines
are Used?
Source: http://searchenginewatch.com/showPage.html?page=2156431
Rating Search Engines
• Search Engine Watch
• Search Engine Showdown
Problems With Search
Engines:
• Speed of response eliminates some
documents
• Bias toward text
• User expectation and skills
• Costs of crawling
• Metasearch engine: jux2
When to Use a
Directory:
• “I’m looking for sites on American
Architecture.”
• Broad category
• Early in your research
• Opposing viewpoints
Sample Directories
• Google Directory
• Internet Scout Project
• Internet Resources Columns
• Targeted Directories:
Classics Resources
Problems With
Directories:
• Small
• Editorial policies
• Timeliness
• Charging for listing
Hidden/Invisible Web
• Searchable databases
• Excluded pages
When to use the
Invisible Web:
• “I’m looking for a list of architects in
Baltimore.”
• “I need a specific statistic on the death
rate of women with heart disease in 2002.”
• “I’m looking for information on a plane
crash in Salem, OR in 1979.”
How to Find
the Hidden Web
• Google:
– Databases + your topic
• Searching general web directories
– Librarians Index
– Infomine
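The "Databases + your topic" tip amounts to adding the word "databases" to an ordinary query. A minimal sketch of building such a search link follows. The function name is hypothetical, and the `https://www.google.com/search?q=` URL form is assumed here as the usual shape of a Google results link.

```python
from urllib.parse import urlencode


def database_search_url(topic):
    """Return a search URL that pairs the word 'databases' with the topic."""
    query = f"databases {topic}"
    # urlencode escapes the query so it is safe to place in a URL.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(database_search_url("heart disease statistics"))
```

The results should point to searchable databases on the topic. The specific statistic then comes from searching inside one of those databases, not from the search engine itself.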
When to use a Listserv?
• “If I’m looking for an opinion on a
particular topic.”
How to Find a Listserv:
• Tile.net
• Google: “topic” and listserv
• Google Groups
Browser Tips & Tools
• Bookmarks (personal toolbars) – Del.icio.us
• History
• ConQuery (search plugins)
– Journal Title List
– Creative Commons
– Open WorldCat via Google
• Bookmarklets
• Tabs, Tabs, Tabs
Horizontal Searching
• Use the web in conjunction with library catalogs
and databases
• Search the Web for titles of articles
• Locate more bibliographies that can be
incorporated into new searches for books, journal
articles, etc.
• Search for authors from books and articles
Horizontal Searching:
Search a Library Database
Horizontal Searching:
Search title of article on the Web
Horizontal Searching:
Follow citations from Web site
Horizontal Searching:
Search Book Title
in the Library Catalog
Horizontal Searching:
Follow subject headings from article
Horizontal Searching:
Follow cited references and search
Horizontal Searching:
Organization Web sites
and Official Reports
Horizontal Searching:
Contact actual researchers on the topic
Wrap Up:
• Know how the web is indexed and collected.
• Choose the correct tool for your question.
• Realize more than one tool may be needed.
• Carefully evaluate whatever you find on the Web.
• Think horizontally in searching: library databases,
Web, bibliography, Web, library catalog, Web,
reference book, Web…
Bibliography
• Cohen, L. (2001). 10 tips for teaching how to surf the
Web. American Libraries, 32, 44-46.
• Sherman, C., & Price, G. (2001). The Invisible Web: Uncovering
Information Sources Search Engines Can't See. Medford,
N.J.: Information Today, Inc.
• Dale Vidmar’s: Horizontal Searching
• Linda Goff’s: Googling to the Max