SSD - Willamette University

Searching Intelligently:
It’s No Longer a Nightmare
Oregon Library Support Staff Division
Gateways 2006 Conference
Bill G. Kelm - July 21, 2006
Today’s Goals – To Learn

How is the web indexed?
– Google in particular.
Which tool to use?
– Search engines, directories, hidden web, listservs, and online discussion groups.
Drawbacks and advantages of the Web.
Browser tips and research power tools.
Horizontal searching.

How Search Engines Work
1. Discovery and Database
2. User Search
3. Presentation and Ranking
Source: http://www.webreference.com/content/search/

Google Background
“Google's mission is to organize the world's information and make it universally accessible and useful.”
Google's founders, Larry Page and Sergey Brin, developed Google in a Stanford University dorm room, and it is currently the world's largest search engine.

Source: http://www.google.com/corporate/

Google’s Discovery and Database

Google has programs called spiders (a.k.a. Google bots) constantly searching the web for new or updated web pages.
When a spider finds a new or updated page, it reads that entire page, reports back to Google, and then visits all of the other pages to which that new page links.

Google’s Cache

When the spider reports back to Google, it doesn’t just tell Google the new or updated page’s URL.
The spider also sends Google a complete copy of the entire Web page – HTML, text, images, etc.
Google then adds that page and all of its content to Google’s cache.
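The discovery and cache steps described on the two slides above can be sketched roughly as follows. This is only an illustration, not Google's actual spider: the `TOY_WEB` dictionary, its URLs, and the `crawl` function are hypothetical stand-ins for the real Web, and a real crawler would fetch pages over the network and revisit them to catch updates.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page content, outgoing links).
TOY_WEB = {
    "a.example/home": ("baseball spring training schedule",
                       ["a.example/news", "b.example/"]),
    "a.example/news": ("spring news", ["a.example/home"]),
    "b.example/": ("training tips", []),
}


def crawl(seed, web):
    """Breadth-first crawl from seed; returns a cache of URL -> full page copy."""
    cache = {}
    frontier = deque([seed])
    while frontier:
        url = frontier.popleft()
        if url in cache or url not in web:
            continue  # already stored, or a dead link
        content, links = web[url]
        cache[url] = content    # keep a complete copy of the page
        frontier.extend(links)  # then queue every page it links to
    return cache


cache = crawl("a.example/home", TOY_WEB)
print(sorted(cache))
```

Starting from a single seed page, the sketch reaches every page that is linked, directly or indirectly, from that seed. Pages that nothing links to are never found, which is one reason parts of the Web stay invisible to search engines.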
How Google Works

When you search for multiple keywords, Google first searches for all of your keywords as a phrase.
So, if your keywords are baseball spring training, any pages on which those words appear as a phrase receive a score of X.
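As a rough illustration of the phrase step, the sketch below counts how often the keywords appear on a page as an exact phrase. The function names and the idea of using the raw count as the score X are assumptions made for this example; the slide does not say how Google computes X.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_hits(page_text, query):
    """Count places where all query words appear consecutively, in order."""
    words, phrase = tokenize(page_text), tokenize(query)
    n = len(phrase)
    if n == 0:
        return 0
    return sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)


page = "Baseball spring training starts soon. Spring training for baseball is fun."
print(phrase_hits(page, "baseball spring training"))
```

The second sentence of the sample page contains all three words, but not as a phrase, so it does not add a phrase hit.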
Google – Adjacency

Google then measures the adjacency between your keywords and gives those pages a score of Y.
A page with “baseball spring training” next to each other gets a higher score than one with “baseball” and then “spring training” farther down the page.
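One way to picture an adjacency measure is to find the shortest stretch of the page that contains every keyword. The sketch below does that and turns the result into a hypothetical score Y between 0 and 1. The formula (number of keywords divided by the window length) is an assumption for illustration only, not Google's method.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def min_span(page_text, query):
    """Length in words of the shortest window holding every query word, or None."""
    words = tokenize(page_text)
    needed = set(tokenize(query))
    if not needed:
        return None
    best = None
    counts = {}  # query words currently inside the window
    left = 0
    for right, word in enumerate(words):
        if word in needed:
            counts[word] = counts.get(word, 0) + 1
        # Shrink from the left while the window still covers every query word.
        while len(counts) == len(needed):
            width = right - left + 1
            if best is None or width < best:
                best = width
            leaving = words[left]
            if leaving in counts:
                counts[leaving] -= 1
                if counts[leaving] == 0:
                    del counts[leaving]
            left += 1
    return best


def adjacency_score(page_text, query):
    """Hypothetical score Y: 1.0 when the words are adjacent, lower as they spread."""
    span = min_span(page_text, query)
    if span is None:
        return 0.0
    return len(set(tokenize(query))) / span


close = "baseball spring training opens today"
far = "baseball is back and fans cannot wait for spring training"
print(adjacency_score(close, "baseball spring training"))
print(adjacency_score(far, "baseball spring training"))
```

The first page, where the three words sit side by side, scores higher than the second, where “baseball” is separated from “spring training” by several words.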
Google - Weights

Then, Google measures the number of times your keywords appear on the page (the keywords’ “weights”) and gives those pages a score of Z.
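The weight step amounts to counting keyword occurrences. A minimal sketch, assuming the score Z is simply the total count (the slide does not give the real formula, and the function name is made up for this example):

```python
import re
from collections import Counter


def keyword_weights(page_text, query):
    """Map each query word to the number of times it appears on the page."""
    counts = Counter(re.findall(r"[a-z0-9']+", page_text.lower()))
    return {word: counts[word] for word in re.findall(r"[a-z0-9']+", query.lower())}


page = "Spring training tickets: baseball fans love spring training."
weights = keyword_weights(page, "baseball spring training")
z = sum(weights.values())  # assumed score Z: total keyword occurrences
print(weights, z)
```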
Presentation & Ranking

Google takes
– The phrase hits (the Xs),
– The adjacency hits (the Ys),
– The weights hits (the Zs), and
– About 100 other secret variables
Throws out everything but the top 2,000
Multiplies each remaining page’s individual score by its “PageRank”
And, finally, displays the top 1,000 in order.
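The ranking steps on this slide can be sketched as a small pipeline. Two things here are assumptions rather than anything stated in the slide: the individual score is taken to be a plain sum of X, Y, Z, and the other variables, and the page records (`url`, `x`, `y`, `z`, `other`, `pagerank`) are hypothetical sample data.

```python
def rank(pages, keep=2000, show=1000):
    """Order pages following the steps on the slide.

    pages: list of dicts with keys 'url', 'x', 'y', 'z', 'other', 'pagerank'.
    """
    # Assumption: the individual score is a plain sum of the components.
    scored = [(p["x"] + p["y"] + p["z"] + p["other"], p) for p in pages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    shortlist = scored[:keep]  # throw out everything but the top `keep`
    # Multiply each remaining page's score by its PageRank.
    final = [(score * p["pagerank"], p["url"]) for score, p in shortlist]
    final.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in final[:show]]  # display the top `show`, in order


pages = [
    {"url": "a.example", "x": 3, "y": 1, "z": 2, "other": 0, "pagerank": 1},
    {"url": "b.example", "x": 1, "y": 1, "z": 1, "other": 0, "pagerank": 10},
    {"url": "c.example", "x": 0, "y": 0, "z": 1, "other": 0, "pagerank": 100},
]
print(rank(pages, keep=2, show=2))
```

With these small limits, `c.example` is discarded before PageRank is applied, even though its PageRank is the highest. The order of the steps matters: a page has to match the query well enough to survive the first cut before its PageRank can help it.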
Google – PageRank?

There is a premise in higher education that the importance of a research article can be judged by the number of citations to it from subsequent articles in the same field.
Google applies this premise to the Web: the importance of a Web page can be judged by the number of hyperlinks pointing to it from other pages.
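The citation idea can be tried out on a toy link graph. The sketch below first counts incoming links, as the slide describes, and then runs a simplified version of the iterative PageRank calculation, in which a link from an important page counts for more than a link from an obscure one. The graph, the damping value of 0.85, and the iteration count are assumptions chosen for illustration, and the code expects every link target to appear as a key in the graph.

```python
def inlink_counts(links):
    """Count how many other pages link to each page."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for p in pages:
                    new[p] += damping * rank[source] / n
        rank = new
    return rank


# Hypothetical link graph: page -> pages it links to.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(inlink_counts(links))
ranks = pagerank(links)
print(max(ranks, key=ranks.get))
```

Page `c` has the most incoming links and ends up with the highest rank. Page `a` has only one incoming link, but because that link comes from `c`, it outranks `b` and `d`, which nobody links to.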
Google Advanced & Tricks

Calculator
Define
~, +, -
Advanced Searching
Finding Information on the Internet: A Tutorial
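To show how these operators fit into a query, here is a small sketch that assembles a search string and the matching Google search URL. The helper names are made up for this example, and the operators are used as they were understood at the time of this presentation (`+` to require a word, `-` to exclude one, `~` to include synonyms, `define:` for definitions); support for individual operators may differ today.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_query(terms, required=(), excluded=(), synonyms=()):
    """Join plain terms with +required, -excluded, and ~synonym operators."""
    parts = list(terms)
    parts += ["+" + word for word in required]
    parts += ["-" + word for word in excluded]
    parts += ["~" + word for word in synonyms]
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL with the query safely percent-encoded."""
    return "http://www.google.com/search?" + urlencode({"q": query})


q = build_query(["architecture"], required=["the"],
                excluded=["software"], synonyms=["guide"])
print(q)
print(search_url("define:spider"))
# Decoding the URL gives back the original query text.
print(parse_qs(urlparse(search_url(q)).query)["q"][0])
```

Encoding matters because `+` has a special meaning inside a URL; `urlencode` escapes it so that the search engine receives the operator rather than a space.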
My Favorite Quote:

“Focus on users and their tasks, not the technology.” – Jeff Johnson

When Searching the Web:

“Focus on your query, not the technology.”

Four tools:
1. Search Engines
2. Directories
3. Invisible Web (Deep Web)
4. Listservs and Online Discussion Groups

Which Tool to Use?

“It all depends.”

When to use a Search Engine:

You are looking for the “Society of American Registered Architects.”
You have a specific phrase or unique keyword.

Which Search Engines are Used

Can you guess what percentage of people use the various search engines available?

Share of Searches 2005
Source: http://searchenginewatch.com/reports/article.php/2156451

Rating Search Engines

Search Engine Watch
Search Engine Showdown

Problems With Search Engines:

Speed response eliminates some documents
Bias toward text
User expectation and skills
Costs of crawling
Metasearch engine: jux2

When to Use a Directory:
“I’m looking for sites on American Architecture.”
Broad category
Early in your research
Opposing viewpoints

Sample Directories

Google Directory
Internet Scout Project
Internet Resources Columns

Targeted Directories:
Classics Resources

Problems With Directories:

Small
Editorial policies
Timeliness
Charging for listing

Hidden/Invisible Web

Searchable databases
Excluded pages

When to use the Invisible Web:
“I’m looking for a list of architects in Baltimore.”
“I need a specific statistic on the death rate of women with heart disease in 2002.”
“I’m looking for information on a plane crash in Salem, OR in 1979.”

How to Find the Hidden Web

Google:
– Databases + your topic
Searching general web directories
– Librarians Index
– Infomine
When to use a Listserv?

“If I’m looking for an opinion on a particular topic.”

How to Find a Listserv:

Tile.net
Google: “topic” and listserv
Google Groups

Browser Tips & Tools

Bookmarks (personal toolbars)
History
ConQuery (search plugins)
– Journal Title List
– Creative Commons
– Open WorldCat via Google
Bookmarklets
Tabs, Tabs, Tabs

Horizontal Searching

Use the web in conjunction with library catalogs and databases.
Search the Web for titles of articles.
Locate more bibliographies that can be incorporated into new searches for books, journal articles, etc.
Search for authors from books and articles.

Horizontal Searching: Search a Library Database
Horizontal Searching: Search title of article on the Web
Horizontal Searching: Follow citations from Web site
Horizontal Searching: Search Book Title in the Library Catalog
Horizontal Searching: Follow subject headings from article
Horizontal Searching: Follow cited references / and search
Horizontal Searching: Organization Web sites and Official Reports
Horizontal Searching: Contact actual researchers on the topic

Wrap Up:

Know how the web is indexed and collected.
Choose the correct tool for your question.
Realize more than one tool may be needed.
Carefully evaluate whatever you find on the Web.
Think horizontally in searching: library databases, Web, bibliography, Web, library catalog, Web, reference book, Web…

Bibliography

Cohen, L. (2001). 10 tips for teaching how to surf the Web. American Libraries, 32, 44-46.
Sherman, C., & Price, G. (2001). The Invisible Web: Uncovering Information Sources Search Engines Can't See. Medford, N.J.: Information Today, Inc.
Dale Vidmar’s: Horizontal Searching
Linda Goff’s: Googling to the Max