SSD - Willamette University
Searching Intelligently:
It’s No Longer a Nightmare
Oregon Library Support Staff Division
Gateways 2006 Conference
Bill G. Kelm - July 21, 2006
Today’s Goals – To Learn
How is the web indexed?
– Google in particular.
Which tool to use?
– Search engines, directories, hidden web,
listservs and online discussion groups.
Drawbacks and advantages of the Web.
Browser tips and research power tools.
Horizontal searching.
How Search Engines Work
1. Discovery and Database
2. User Search
3. Presentation and Ranking
Source: http://www.webreference.com/content/search/
Google Background
“Google's mission is to organize the world's
information and make it universally
accessible and useful.”
Google's founders Larry Page and Sergey
Brin developed Google in a Stanford
University dorm room and it is currently the
world's largest search engine.
Source: http://www.google.com/corporate/
Google’s Discovery and
Database
Google has programs called spiders (a.k.a.
Google bots) constantly searching the web
for new or updated web pages.
When a spider finds a new or updated page, it
reads that entire page, reports back to
Google, and then visits all of the other pages
to which that new page links.
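The read-then-follow-links behavior described above can be sketched as a breadth-first crawl. This is only an illustration: the tiny in-memory TOY_WEB and the crawl function are hypothetical stand-ins, not Google's actual spider.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the URLs it links to.
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example", "d.example"],
    "d.example": [],
}

def crawl(start, web):
    """Visit pages breadth-first, following every link from each page once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)              # "read" the page and report it back
        for link in web.get(url, []):
            if link not in seen:       # skip pages already scheduled
                seen.add(link)
                queue.append(link)
    return order

print(crawl("a.example", TOY_WEB))
# ['a.example', 'b.example', 'c.example', 'd.example']
```

The seen set is what keeps the spider from looping forever when pages link back to each other.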
Google’s Cache
When the spider reports back to Google, it
doesn’t just tell Google the new or updated
page’s URL.
The spider also sends Google a complete
copy of the entire Web page – HTML, text,
images, etc.
Google then adds that page and all of its
content to Google’s cache.
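As a rough sketch, a cache of this kind can be pictured as a lookup table from URL to the stored copy of the page. The store_copy function and the example URL are hypothetical; a real cache is far larger and more elaborate.

```python
def store_copy(cache, url, content):
    """Save a full copy of the page; return True if the URL was not cached before."""
    is_new = url not in cache
    cache[url] = content   # a re-crawl overwrites the older copy
    return is_new

cache = {}
store_copy(cache, "http://example.org/", "<html>first version</html>")
store_copy(cache, "http://example.org/", "<html>updated version</html>")
print(cache["http://example.org/"])
# <html>updated version</html>
```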
How Google Works
When you search for multiple keywords,
Google first searches for all of your
keywords as a phrase.
So, if your keywords are baseball
spring training, any pages on which
those words appear as a phrase receive a
score of X.
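A minimal sketch of the phrase test, assuming pages are plain text split on whitespace (no punctuation handling). The has_phrase function is a hypothetical helper for illustration, not Google's method.

```python
def has_phrase(page_text, keywords):
    """Return True if the keywords appear in the page as one consecutive phrase."""
    words = page_text.lower().split()
    phrase = keywords.lower().split()
    n = len(phrase)
    if n == 0:
        return False
    # Slide a window of n words across the page and compare it to the phrase.
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))

print(has_phrase("Baseball spring training starts soon", "baseball spring training"))   # True
print(has_phrase("baseball fans love spring training", "baseball spring training"))     # False
```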
Google – Adjacency
Google then measures the adjacency
between your keywords and gives those
pages a score of Y.
A page with “baseball spring training”
next to each other gets a higher score than
one with “baseball” and then “spring
training” farther down the page.
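One way to picture adjacency is the smallest stretch of words that contains every keyword at least once: the shorter the stretch, the closer the keywords sit. The keyword_span function below is a hypothetical sketch under that assumption; how Google actually measures adjacency is not stated in the source.

```python
from itertools import product

def keyword_span(page_text, keywords):
    """Smallest number of words spanning one occurrence of every keyword, or None."""
    words = page_text.lower().split()
    positions = []
    for kw in keywords.lower().split():
        hits = [i for i, w in enumerate(words) if w == kw]
        if not hits:
            return None                 # a keyword is missing from the page
        positions.append(hits)
    # Try every combination of one position per keyword; keep the tightest.
    return min(max(combo) - min(combo) + 1 for combo in product(*positions))

print(keyword_span("baseball spring training", "baseball spring training"))                    # 3
print(keyword_span("baseball is fun and spring training is near", "baseball spring training"))  # 6
```

A smaller span would earn the page a higher adjacency score.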
Google - Weights
Then, Google measures the number of times
your keywords appear on the page (the
keywords’ “weights”) and gives those pages
a score of Z.
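Counting keyword occurrences can be sketched in a few lines. The keyword_weight function is hypothetical, and it assumes the weight is a plain total of exact word matches.

```python
from collections import Counter

def keyword_weight(page_text, keywords):
    """Total number of times the keywords appear on the page."""
    counts = Counter(page_text.lower().split())
    return sum(counts[kw] for kw in keywords.lower().split())

print(keyword_weight("baseball baseball spring training", "baseball spring training"))  # 4
```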
Presentation & Ranking
Google takes
– The phrase hits (the Xs),
– The adjacency hits (the Ys),
– The weights hits (the Zs), and
– About 100 other secret variables
Throws out everything but the top 2,000
Multiplies each remaining page’s individual score
by its “PageRank”
And, finally, displays the top 1,000 in order.
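The steps above can be sketched as a small pipeline. Two things here are assumptions rather than anything the source states: the X, Y, and Z scores are simply added together, and the roughly 100 other variables are left out. The rank_pages function and the sample pages are hypothetical.

```python
def rank_pages(pages, keep=2000, show=1000):
    """Score pages, keep the best `keep`, weight by PageRank, return the top `show` URLs."""
    # Assumption: the individual score is the plain sum of X, Y, and Z.
    scored = [(p["x"] + p["y"] + p["z"], p) for p in pages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    shortlist = scored[:keep]                      # throw out everything else
    final = sorted(shortlist,
                   key=lambda pair: pair[0] * pair[1]["pagerank"],
                   reverse=True)
    return [p["url"] for _, p in final[:show]]

pages = [
    {"url": "a.example", "x": 1, "y": 1, "z": 1, "pagerank": 1.0},
    {"url": "b.example", "x": 2, "y": 2, "z": 1, "pagerank": 0.5},
    {"url": "c.example", "x": 0, "y": 0, "z": 1, "pagerank": 10.0},
]
print(rank_pages(pages, keep=2, show=2))
# ['a.example', 'b.example']
```

In this toy run, c.example has the highest PageRank but never benefits from it, because it was cut before PageRank was applied.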
Google – PageRank?
There is a premise in higher education that
the importance of a research article can be
judged by the number of citations to it from
subsequent articles in the same field.
Google applies this premise to the Web: the
importance of a Web page can be judged by
the number of hyperlinks pointing to it from
other pages.
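The citation-counting premise can be sketched by tallying inbound links. This follows the simplified description above only: the actual PageRank algorithm also weights each link by the importance of the page it comes from. The inbound_link_counts function and the sample link table are hypothetical.

```python
def inbound_link_counts(web):
    """Count, for each page, how many other pages link to it."""
    counts = {url: 0 for url in web}
    for source, targets in web.items():
        for target in set(targets):        # count each linking page once
            if target != source:           # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts

links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}
print(inbound_link_counts(links))
# {'a.example': 1, 'b.example': 1, 'c.example': 2}
```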
Google Advanced & Tricks
Calculator
Define
~, +, Advanced Searching
Finding Information on the Internet: A Tutorial
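Operators such as define: are typed straight into the search box, so a query that uses them can also be written as a search URL. The sketch below assumes the base address https://www.google.com/search and its q parameter; the google_url helper is hypothetical, and operators listed in 2006 may behave differently or no longer be supported today.

```python
from urllib.parse import urlencode

def google_url(query):
    """Build a search URL for a query, escaping operators and spaces safely."""
    return "https://www.google.com/search?" + urlencode({"q": query})

print(google_url("define:cache"))
# https://www.google.com/search?q=define%3Acache
```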
My Favorite Quote:
“Focus on users and their tasks, not the
technology.” – Jeff Johnson
When Searching the Web:
“Focus on your query, not the technology.”
Four tools:
1. Search Engines
2. Directories
3. Invisible Web (Deep Web)
4. Listservs and Online Discussion Groups
Which Tool to Use?
“It all depends.”
When to use a Search Engine:
You are looking for the “Society of
American Registered Architects.”
You have a specific phrase or unique
keyword.
Which Search Engines are Used
Can you guess what percentage of people
use the various search engines available?
Share of Searches 2005
Source: http://searchenginewatch.com/reports/article.php/2156451
Rating Search Engines
Search Engine Watch
Search Engine Showdown
Problems With Search Engines:
The need for fast responses leaves some documents out
Bias toward text
User expectation and skills
Costs of crawling
Metasearch engine: jux2
When to Use a Directory:
“I’m looking for sites on American
Architecture.”
Broad category
Early in your research
Opposing viewpoints
Sample Directories
Google Directory
Internet Scout Project
Internet Resources Columns
Targeted Directories:
Classics Resources
Problems With Directories:
Small
Editorial policies
Timeliness
Charging for listing
Hidden/Invisible Web
Searchable databases
Excluded pages
When to use the Invisible Web:
“I’m looking for a list of architects in
Baltimore.”
“I need a specific statistic on the death rate
of women with heart disease in 2002.”
“I’m looking for information on a plane
crash in Salem, OR in 1979.”
How to Find the Hidden Web
Google:
– Databases + your topic
Searching general web directories
– Librarians Index
– Infomine
When to use a Listserv?
“If I’m looking for an opinion on a
particular topic.”
How to Find a Listserv:
Tile.net
Google: “topic” and listserv
Google Groups
Browser Tips & Tools
Bookmarks (personal toolbars)
History
ConQuery (search plugins)
– Journal Title List
– Creative Commons
– Open WorldCat via Google
Bookmarklets
Tabs, Tabs, Tabs
Horizontal Searching
Use the web in conjunction with library
catalogs and databases.
Search the Web for titles of articles.
Locate more bibliographies that can be
incorporated into new searches for books,
journal articles, etc.
Search for authors from books and articles.
Horizontal Searching: Search a Library Database
Horizontal Searching: Search title of article on the Web
Horizontal Searching: Follow citations from Web site
Horizontal Searching: Search Book Title in the Library Catalog
Horizontal Searching: Follow subject headings from article
Horizontal Searching: Follow cited references and search
Horizontal Searching: Organization Web sites and Official Reports
Horizontal Searching: Contact actual researchers on the topic
Wrap Up:
Know how the web is indexed and collected.
Choose the correct tool for your question.
Realize more than one tool may be needed.
Carefully evaluate whatever you find on the Web.
Think horizontally in searching: library databases,
Web, bibliography, Web, library catalog, Web,
reference book, Web…
Bibliography
Cohen, L. (2001). 10 tips for teaching how to
surf the Web. American Libraries, 32, 44-46.
Sherman, C., & Price, G. (2001). The Invisible Web:
Uncovering Information Sources Search Engines
Can't See. Medford, N.J.: Information Today, Inc.
Dale Vidmar’s Horizontal Searching
Linda Goff’s Googling to the Max