NOT - Santa Rosa Junior College
LIR 10 Week 5
Searching and Evaluating Information
on the Internet
This week’s class
Class announcements
Internet directories
Search engines
Evaluating Internet Sources
The “hidden” Internet?
Finding Information on the Internet
What is the Internet?
Not just World Wide Web
Communication from user to user: telnet, ftp,
Usenet, MUDs, e-mail
“Web” and “Internet”
interchangeable?
WWW only one
way to communicate
HTTP
How is the World Wide Web Organized?
It’s not!!!
The Internet is not organized!
Not designed to be
organized,
searched
Tools developed:
Subject Directories
Search Engines
Subscription Databases vs. Internet
Subject Directories, Search Engines
Subscription databases created for searching:
Many searchable fields
Browsable subject
Free text searches
Sophisticated search strings
…not found in Internet IFTs
Internet Directories and Search Engines
A “Quick” Guide
WWW Subject Directories
Organized collections of web sites, sources
Browsable, searchable
Resemble indexes (somewhat)
Hierarchy of categories
Definition/scope note
Selected by humans (usually)
Directory Elements
directory.google.com
Search box
Help Link
Browsable categories, subcategories
Advertising (possibly)
WWW directories are handy…
Browse selective lists, review high quality
sources
Narrow broad topics, investigate subtopics
Timesaving
Subject Directories tour…
Research Directories
Noncommercial
Reliable sites, well
organized
Focus on topics for
research
No fun and games!
Research Directory Examples
The Librarian’s Index to the Internet (LII):
searchable, annotated subject directory
Developed, organized, maintained by librarians
http://www.lii.org
Internet Public Library:
www.ipl.org
Intute:
http://www.intute.ac.uk
GDN:
http://gdnet.org/index.php
Academic WWW Directories
Focus on research areas
Institutionally supported
Created by librarians, subject specialists
May have site annotations, scope notes
Examples of Academic Directories
SRJC:
http://www.santarosa.edu/library/Refs/index.shtml
UC System and beyond:
infomine.ucr.edu
Commercial WWW Directories
Broad subject areas
Popular categories
Semi-selective sites
Sites based on producer information, $$$
Unknown criteria
Commercial WWW Directories
When to use:
Scan broad subjects
Need current information
Caveats:
Can be overwhelming
Sites not filtered, evaluated
Mystifying results?
Advertisers may influence ranking
Examples of Commercial Directories
dir.yahoo.com
directory.google.com
www.about.com
For kids: www.yahooligans.com
Yahooligans, before & after:
Governmental Directories
Good sources, wide range of topics
Library of Congress International Portal:
http://www.loc.gov/rr/international/portals.html
National Network of Libraries of Medicine:
http://www.nlm.nih.gov/hinfo.html
Kids.gov:
http://www.kids.gov/
Cooperative Directories
Volunteers create, edit topic areas
Information without promoting/ranking
individual websites
Updated constantly?
Examples of Cooperative WWW Directory
Open Directory Project:
http://www.dmoz.org
Wikipedia (kinda sorta):
http://en.wikipedia.org/wiki/Main_Page
Don’t use Wikipedia for your Internet
source
Wikipedia assignment notes
Wikipedia Notes
What is it?
Acceptable as a source?
Strengths & weaknesses
http://www.youtube.com/watch?v=s8O-hv3w-MU
Using Search Engines
Handy Search Engine Sources
SRJC’s Search Engine page:
http://www.santarosa.edu/library/Refs/engines.shtml
Infopeople’s Best Search Tools Chart
http://www.infopeople.org/search/chart.html
Search Engine Watch, for the completely
obsessed:
http://searchenginewatch.com/
Search Engines
The Ugly Truth
How not to use Google:
“A quick Google search of “liberalism on college
campuses” brings a wealth of good evidence that what
is being taught on many of them is anti-American, antireligious, anti-Israel, pro-gay rights and pro-abortion,
often to the exclusion and ridicule of opposing views”
-Cal Thomas in his syndicated newspaper column, 4/2/2005
“Don’t make me Google it!”
-Jessica Simpson, after Nick Lachey refused to tell her how to spell the word “wounded”
A Google search is not research!
Research
Finding, evaluating, understanding variety
of reliable sources from number of
viewpoints
Good sources reveal where, how
information was gathered
A Google search is not research!
You are smarter than an algorithm
How you phrase search determines results
“A site's ranking in Google's search results is
automatically determined by computer
algorithms using thousands of factors...
Sometimes subtleties of language cause
anomalies to appear that cannot be predicted.”
Explanation of Google's Search Results
A Google search is not research!
Thomas begins column by referring to academic
study, published in conservative online journal
Study follows guidelines for academic research
Google search does not!
Also…
“The Internet” can’t spell!
Google can provide definitions:
Google Search: define:wounded
Searching with Search Engines
What’s a Search Engine?
Search for web pages, files, documents
Through specific set of sites (not the entire
WWW)
Updated by crawlers (spiders, robots)
Search for new content, “report” findings
Limitations of Popular Search Engines
In general…
Link to the linked
Ignore disconnected URLs
“Popularity” a factor
Won’t find dynamic pages
However… “we need the eggs”
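The limitations above follow from how a crawler discovers pages. A minimal sketch, assuming a tiny made-up link graph (the page names are hypothetical and nothing is fetched from the real web): the crawler only reaches pages that an already-crawled page links to, so a page with no incoming links is never found.

```python
from collections import deque

# Hypothetical mini-web: each page maps to the pages it links to.
LINKS = {
    "home": ["news", "about"],
    "news": ["home", "archive"],
    "about": [],
    "archive": [],
    "orphan": ["home"],  # links out, but no page links to it
}

def crawl(start, links):
    """Breadth-first crawl: follow links from start, return pages in the order found."""
    found = [start]
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in found:
                found.append(target)
                queue.append(target)
    return found

reached = crawl("home", LINKS)
# Pages that exist but were never reached by following links.
missed = sorted(set(LINKS) - set(reached))
print(reached)
print(missed)
```

Starting from "home", the crawl never reaches "orphan", which is the "ignore disconnected URLs" limitation in miniature.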
When to use a search engine…
Time to review how best to search
Survey a lengthy results list
Examine many sites
Evaluate quality
Some knowledge about topic
Common Features of Search Engines
Search boxes
Options to refine searches
Advanced search techniques
Help
…use them!
How to Outsmart Google
“Mall map,” Help or Advanced Search
AND:
Japanese AND camouflage AND skirt
+Japanese +camouflage +skirt
NOT:
“hip hop” NOT bunnies
“hip hop” -bunnies
How to Outsmart Google
OR: (north bay OR Sonoma county)
+conservation
…simple synonyms?
(women OR females) AND marketing
(women OR females) +marketing
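The Boolean patterns on these two slides can be sketched as a small query builder. This is only an illustration of the +/- notation shown above; the function name and parameters are hypothetical, and operator support varies by search engine and changes over time.

```python
def build_query(required=(), excluded=(), any_of=(), phrases=()):
    """Assemble a search string in the +/- style shown on the slides."""
    parts = []
    if any_of:
        # OR: synonyms or alternatives grouped in parentheses
        parts.append("(" + " OR ".join(any_of) + ")")
    # Phrase searching: exact phrases in quotation marks
    parts.extend(f'"{p}"' for p in phrases)
    # AND: each required word gets a leading +
    parts.extend("+" + word for word in required)
    # NOT: each unwanted word gets a leading -
    parts.extend("-" + word for word in excluded)
    return " ".join(parts)

print(build_query(required=["Japanese", "camouflage", "skirt"]))
print(build_query(phrases=["hip hop"], excluded=["bunnies"]))
print(build_query(any_of=["women", "females"], required=["marketing"]))
```

Each call rebuilds one of the slide examples, which makes it easy to see which part of the string does the narrowing (+, -) and which does the broadening (OR).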
More Boolean-esque Options
Phrase searching:
“vampire poodles”
“night terrors” +sleep disorders
Additional limiters:
Domain extension
Date range
Language
Truncation?
http://www.google.com/advanced_search
Best Bets
Phrase search (but be careful)
“oldest profession” example
Limit to .edu or .org or .gov
Use +
http://www.google.com/help/refinesearch.html
Google toolbar options (not on Lecture Notes)
Highlight search terms
Find word in site
term site:santarosa.edu
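The site: limiter can also be typed straight into a search link. A rough sketch, assuming the common "search?q=" URL pattern (the base address and the helper name are assumptions, not something the slides specify):

```python
from urllib.parse import urlencode

def search_url(terms, site=None, base="https://www.google.com/search"):
    """Build a search link, optionally limited to one site with the site: operator."""
    query = terms if site is None else f"{terms} site:{site}"
    # urlencode escapes spaces and the colon so the link is safe to paste
    return base + "?" + urlencode({"q": query})

url = search_url("library hours", site="santarosa.edu")
print(url)
```

The same helper works for a domain extension limit, for example site="edu", if the engine accepts that form.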
Examples of Popular, Quality Search
Engines
www.google.com
www.altavista.com
www.ask.com
Evaluating Internet Sources
Internet Sources
Evaluate!
No standards
Compare to peer-reviewed journals,
academic journals
Excellent Internet evaluation source:
http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/Evaluate.html
Applying START
Scope/Coverage, Treatment/Reliability, Authority, Relevancy, Timeliness
to Internet Sources
Evaluating Internet Sources: Scope/Coverage
Content of information
What’s covered?
Overview?
Detailed information?
Subtopic?
Scope/Content of Website
Example:
http://memory.loc.gov/ammem/collections/robinson/index.html
Evaluating Internet Sources: Scope/Coverage
Depth of information
Complete or “web bites”?
Edited/abridged information with info/links to original documents?
Brief information sources warning
Depth of Information Examples
Deep:
http://www.pbs.org/wgbh/amex/zoot/
Not-so-deep:
http://law.jrank.org/pages/2971/Sleepy-Lagoon-Trials-1942-43.html
Evaluating Internet Sources:
Treatment/Reliability
Treatment: how topic is
treated/presented
Information supported by
evidence?
Non-inflammatory
language, reasonable
arguments?
Bias, point of view
shown?
(Wikipedia additions to
the class Reader)
Treatment Examples
Example 1:
The Smoking Section: Website about second-hand smoke
Since 1979, the number of smokers has declined significantly, from about 33% of adults,
or higher, to a proportion varyingly reported as being from 20% to 25%. During the same
period, a host of anti-smoking laws have dramatically curtailed smoking in public places.
Today, exposure to ETS is not one tenth of what it was in 1979.
Yet, according to an article in the San Jose Mercury News (October 12, 1993),
fatal asthma attacks have nearly doubled in that time.
More than 5,100 Americans suffered fatal asthma attacks in 1991, up from about
2,600 in 1979.
Example 2:
The Cigarette Papers
Evaluating Internet Sources:
Treatment/Reliability
Purpose, perspective, orientation of information
Purpose of Web Site:
Provide information?
Selling a product?
Arguing a position?
Advertising clearly distinguishable from content?
Purpose of site example
Example:
http://www.addictionca.com/FAQ-ecstasy.htm
Can you tell the purpose of the site?
Google search for this site
Another Google search
Should you use this site as a source?
Evaluating Internet Sources:
Treatment/Reliability
Sources cited?
Enough information to
follow up?
Statistics sources?
“Experts” identified?
Reliability: trustworthiness of the
information in the source
Reliability example
http://www.factcheck.org/richardson_flunks_two_subjects.html
Evaluating Internet Sources: Authority
Discovering information
about the source, author,
organization, etc.
Author(s) or source
listed? (Clearly?)
Author’s occupation,
education, experience?
Affiliated with known
organization,
institution?
Experts w/ subject
knowledge?
Qualifications to
address topic?
Authority Example
http://www.mayoclinic.com/health/AboutThisSite/aboutthissite
How to Find WWW Page Sources to
Evaluate Authority
Check page title, headings, menu, opening
paragraphs
Look near top, bottom, navigational bar
Look for description links:
“About the ______ Association”
“About Us”
“Mission Statement”
Evaluating Authority? Check…
Links to author’s faculty/professional pages
Articles, publications
Library catalogs, Internet search engines,
online databases
No source or author information? Be wary.
Webmaster not responsible for
page content
Examine URL…
Evaluating Internet Sources: Authority
URL: Clues to the origin of
information
“domain extensions”
Is it an organization’s
web site (.org)?
Is it a governmental
web site (.gov)?
Is it a military web site
(.mil)?
Is it a commercial
web site (.com)?
Is it an educational
web site (.edu)?
Deconstructing URLs
URL = Uniform Resource Locator: the “address” of web document
“Top Level” Domain = main subdivision of internet addresses
Check last two or three letters after the final dot in the first section of the URL
Backspace to first section of URL to find main page of site
Restricted: .edu, .gov, .mil, .ca, .us
Unrestricted: .net, .org, .com
See ICANN | Top-Level Domains (gTLDs) for full listings
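The two checks on this slide (read the domain extension, then back up to the main page) can be sketched with Python's standard URL parser. The helper name is hypothetical, and it assumes a full URL that starts with http:// or https://.

```python
from urllib.parse import urlsplit

def deconstruct(url):
    """Split a full URL into host, top-level domain, and main page of the site."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Top-level domain: the letters after the final dot of the host name
    tld = "." + host.rsplit(".", 1)[-1] if "." in host else ""
    # "Backspace" to the first section of the URL to get the site's main page
    main_page = f"{parts.scheme}://{host}/"
    return {"host": host, "tld": tld, "main_page": main_page}

info = deconstruct("http://memory.loc.gov/ammem/collections/robinson/index.html")
print(info)
```

For the Library of Congress example used earlier in these notes, the extension is .gov, which points to a governmental source.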
Evaluating Internet Sources: Relevancy
How is information
relevant to your
topic?
What will you use for
your project?
Usefulness of information
…to you!
Statistics?
Data?
Facts?
Opinions?
HOW does the source
support thesis?
Relevancy example
Example:
http://www.thereminworld.com/article.asp?id=17
Evaluating Internet Sources: Timeliness
Information Currency:
Date site was created?
Last updated, revised?
Information cited still valid?
Currency of information
Current References:
Links to other web pages
current/active?
Navigating Web Pages
Closed for Renovation!
Cite that Site!
Citing a Site
“Baseball, the Color Line and Jackie Robinson.”
Baseball and Jackie Robinson. Library of
Congress American Memory. 2 Oct. 2007.
<http://memory.loc.gov/ammem/collections/robinson/jr1940.html>
Formatting notes for URLs
Remove hyperlinks (disable links)
Other Internet Sources
When you search with a search engine,
you are not
searching the entire Internet
Estimated over 80% of information
available is invisible to search engines
The Hidden Web
Or, the Invisible Web or Deep Web or
Deep Matter
No matter what you call it…
It’s not a secret!
What’s not retrieved
Thousands of
specialized
databases, dynamic
pages, files… millions
of records
The Hidden Internet, cont.
“Hidden” databases
produced by…
Universities
Libraries
Associations
Businesses
Government agencies
Great article on the Invisible Web
Chris Sherman and Gary Price
“The Invisible Web: Uncovering Sources
Search Engines Can’t See”
From Library Trends
Let’s find it!
Invisible Web Search Tools
http://www.lii.org
infomine.ucr.edu
http://www.completeplanet.com
http://www.intute.ac.uk
http://gdnet.org/index.php
(Caveat: searching the last site is a challenge!)
Personal WWW Directories
Specific topic areas
Special interest
Quirky
Unlovely (design issues)
Examples of Personal WWW Directories
Matt Drudge’s Dad (it’s okay, he’s a
librarian):
www.refdesk.com
Gary Price, Hidden Web Guru:
http://www.freepint.com/gary/direct.htm
Other Sources of Information on the
Internet
1. Metasearch Engines
2. Newsgroups and Listservs
3. Blogs
4. News search engines
1. Metasearch Engines
Send your keywords to several search
engines at the same time
Return a single list of results from multiple
sources
“Source” engine identified (most of the
time)
Standard Search Engines
Search for keywords, number of times
within document
Keywords from a single (updated)
database of websites
Each search engine searches unique
selection of web pages
Results ranked and sorted
Robots or spiders find new websites
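The "keywords, number of times within document" idea can be sketched as a toy ranking. The page names and text below are made up for illustration, and real engines weigh far more factors than a simple count.

```python
import re
from collections import Counter

# Hypothetical index: a few "pages" and their text.
PAGES = {
    "page-a": "Zoot suit riots: the zoot suit and wartime Los Angeles",
    "page-b": "Sleepy Lagoon trial summary",
    "page-c": "A history of the suit",
}

def rank(keywords, pages):
    """Score each page by how often the keywords occur, highest score first."""
    scores = {}
    for name, text in pages.items():
        counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        score = sum(counts[k.lower()] for k in keywords)
        if score > 0:  # pages with no keyword match are not returned
            scores[name] = score
    # Sort by score (descending), then by name so ties are stable
    return sorted(scores, key=lambda n: (-scores[n], n))

results = rank(["zoot", "suit"], PAGES)
print(results)
```

Note that "page-b" never appears in the results even though it is on a related topic, which is why how you phrase the search determines what you get back.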
Metasearch Engines
Transmit searches simultaneously to
several search engines
Results gathered from engines queried
Search terms sent to indexes maintained
by traditional search engines
Pros and Cons?
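As a rough sketch of what a metasearch engine does with the results it gathers: the two "engines" below are stand-in functions returning made-up addresses (no real search engine is contacted), and the merge step keeps track of which engine supplied each result.

```python
# Stand-in search engines; the names and results are hypothetical.
def engine_one(query):
    return ["http://a.example/", "http://b.example/"]

def engine_two(query):
    return ["http://b.example/", "http://c.example/"]

def metasearch(query, engines):
    """Send one query to every engine and merge the results into a single list.

    Returns a dict mapping each result to the engines that returned it.
    """
    merged = {}
    for name, search in engines.items():
        for url in search(query):
            merged.setdefault(url, []).append(name)
    return merged

merged = metasearch("gardening", {"EngineOne": engine_one, "EngineTwo": engine_two})
for url, sources in merged.items():
    print(url, "found by", ", ".join(sources))
```

Duplicates collapse into one entry with both source engines listed, which matches the "source engine identified" behavior described earlier.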
Metasearch Engines Examples
Directory:
http://www.santarosa.edu/library/Refs/engines.shtml
http://www.metacrawler.com/
http://www.dogpile.com/
http://ixquick.com/
http://www.webcrawler.com/
2. Newsgroups and Mailing Lists
Newsgroups
Worldwide bulletin
board systems
Tens of thousands of
forums (newsgroups)
Groups focus on wide
range of topics,
accessible to anyone
& everyone
“Bulletin board” style:
original and follow-up
postings
Not subscription based
Newsgroups and Mailing Lists
Mailing Lists
Automated mailing list distribution systems
Discussion groups
Wide-ranging
Subscription-based
“Subscribe” to discussion lists
Distributed to the entire subscriber base via e-mail
Newsgroups and Mailing Lists
Pros
Good for locating
professional
discussions
Interactive, questions
welcome
Collaborative
Alternative voices
Cyber-networking
Cons
Unmoderated or
lightly moderated
forums
“On the Internet, no
one knows you’re a
dog.”
Can be difficult to
search
Newsgroups and Mailing Lists
Accessible mailing lists by Catalist:
http://www.lsoft.com/lists/listref.html
Google groups (the Usenet):
http://groups.google.com
Usenet history: http://www.ibiblio.org/usenet-i/
Yahoo groups:
http://groups.yahoo.com
Topica lists: http://lists.topica.com/
About.com forums: http://www.about.com
Other Ways to Find Collaborative Sources
Search Google or other search engines for
forums. For example:
PDF forums
Search Google or other search engines for
discussion groups. For example:
Gardening discussion groups
Newsgroup and Mailing List Search Examples
Google Groups (Usenet)
Catalist Listserv search
http://www.lsoft.com/lists/listref.html
3. Weblogs or Blogs
Online journals,
updated often
News-oriented
Contain commentary
and links
Personal or
professional focus
Pros:
Alternative sources
Constantly updated
Wide range of topics
Cons:
Difficult to search
Too much information
Reliability of authors
“Feeding Frenzy”
Weblog Sources
Google Directory:
http://directory.google.com/Top/Computers/Internet/On_the_Web/Weblogs/
DMOZ Weblog listing:
http://dmoz.org/Computers/Internet/
IPL Blog Page:
http://www.ipl.org/div/blogs
Search Google or other search engines for
weblogs or blogs. For example:
Gardening weblogs
Homework for Next Week
Read through UC Berkeley’s Internet
Evaluation Site
Listen to Talk of the Nation Wikipedia
show
(Click on the “Listen” icon at the top of the
page; story is about 30 minutes long)
…and/or read the transcript of the show
Complete Internet Quiz
Homework for Next Week
Internet Source Assignment
May e-mail before 3/12
Final Project Due!
See you next week!