NOT - Santa Rosa Junior College

LIR 10 Week 5
Searching and Evaluating Information
on the Internet
This week’s class
Class announcements
Internet directories
Search engines
Evaluating Internet Sources
The “hidden” Internet?
Finding Information on the Internet
What is the Internet?
Not just World Wide Web
Communication from user to user: telnet, ftp, Usenet, MUDs, e-mail
“Web” and “Internet” interchangeable?
WWW only one way to communicate
HTTP
How is the World Wide Web Organized?
It’s not!!!
The Internet is not organized!
Not designed to be organized, searched
Tools developed:
Subject Directories
Search Engines
Subscription Databases vs. Internet Subject Directories, Search Engines
Subscription databases created for searching:
 Many searchable fields
 Browsable subject
 Free text searches
 Sophisticated search strings
…not found in Internet IFTs
Internet Directories and Search Engines
A “Quick” Guide
WWW Subject Directories
Organized collections of web sites, sources
Browsable, searchable
Resemble indexes (somewhat)
Hierarchy of categories
Definition/scope note
Selected by humans (usually)
Directory Elements
directory.google.com
Search box
Help Link
Browsable categories, subcategories
Advertising (possibly)
WWW directories are handy…
Browse selective lists, review high quality
sources
Narrow broad topics, investigate subtopics
Timesaving
Subject Directories tour…
Research Directories
Noncommercial
Reliable sites, well
organized
Focus on topics for
research
No fun and games!
Research Directory Examples
The Librarian’s Index to the Internet (LII): searchable, annotated subject directory
 Developed, organized, maintained by librarians
http://www.lii.org
 Internet Public Library:
www.ipl.org
 Intute
http://www.intute.ac.uk
 GDN
http://gdnet.org/index.php
Academic WWW Directories
Focus on research areas
Institutionally supported
Created by librarians, subject specialists
May have site annotations, scope notes
Examples of Academic Directories
SRJC:
http://www.santarosa.edu/library/Refs/index.shtml
UC System and beyond:
infomine.ucr.edu
Commercial WWW Directories
Broad subject areas
Popular categories
Semi-selective sites
Sites based on producer information, $$$
Unknown criteria
Commercial WWW Directories
When to use:
 Scan broad subjects
 Need current information
Caveats:
 Can be overwhelming
 Sites not filtered, evaluated
 Mystifying results?
 Advertisers may influence ranking
Examples of Commercial Directories
dir.yahoo.com
directory.google.com
www.about.com
For kids: www.yahooligans.com
Yahooligans, before & after:
Governmental Directories
Good sources, wide range of topics
Library of Congress International Portal:
http://www.loc.gov/rr/international/portals.html
National Network of Libraries of Medicine:
http://www.nlm.nih.gov/hinfo.html
Kids.gov:
http://www.kids.gov/
Cooperative Directories
Volunteers create, edit topic areas
Information without promoting/ranking
individual websites
Updated constantly?
Examples of Cooperative WWW Directory
Open Directory Project:
http://www.dmoz.org
Wikipedia (kinda sorta):
http://en.wikipedia.org/wiki/Main_Page
Don’t use Wikipedia for your Internet
source
Wikipedia assignment notes
Wikipedia Notes
What is it?
Acceptable as a source?
Strengths & weaknesses
http://www.youtube.com/watch?v=s8O-hv3w-MU
Using Search Engines
Handy Search Engine Sources
SRJC’s Search Engine page:
http://www.santarosa.edu/library/Refs/engines.shtml
Infopeople’s Best Search Tools Chart
http://www.infopeople.org/search/chart.html
Search Engine Watch, for the completely
obsessed:
http://searchenginewatch.com/
Search Engines
The Ugly Truth
How not to use Google:
“A quick Google search of ‘liberalism on college campuses’ brings a wealth of good evidence that what is being taught on many of them is anti-American, anti-religious, anti-Israel, pro-gay rights and pro-abortion, often to the exclusion and ridicule of opposing views”
-Cal Thomas in his syndicated newspaper column, 4/2/2005
“Don’t make me Google it!”
-Jessica Simpson, after Nick Lachey refused to tell her how to spell the word “wounded”
A Google search is not research!
Research
Finding, evaluating, understanding variety
of reliable sources from number of
viewpoints
Good sources reveal where, how
information was gathered
A Google search is not research!
 You are smarter than an algorithm
 How you phrase search determines results
 “A site's ranking in Google's search results is
automatically determined by computer
algorithms using thousands of factors...
Sometimes subtleties of language cause
anomalies to appear that cannot be predicted.”
Explanation of Google's Search Results
A Google search is not research!
 Thomas begins column by referring to academic
study, published in conservative online journal
 Study follows guidelines for academic research
 Google search does not!
Also…
“The Internet” can’t spell!
Google can provide definitions:
Google Search: define:wounded
Searching with Search Engines
What’s a Search Engine?
Search for web pages, files, documents
Through specific set of sites (not the entire
WWW)
Updated by crawlers (spiders, robots)
Search for new content, “report” findings
Limitations of Popular Search Engines
In general…
Link to the linked
Ignore disconnected URLs
“Popularity” a factor
Won’t find dynamic pages
However… “we need the eggs”
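The “link to the linked” limitation above can be illustrated with a toy crawler. This is only a minimal sketch, not how any real search engine works: the page names and link structure are invented, and a real crawler fetches pages over the network rather than reading a dictionary. It shows that a page nothing links to is never discovered.

```python
from collections import deque

# Hypothetical mini-web: each page maps to the pages it links to.
# "orphan.html" exists, but no other page links to it.
TOY_WEB = {
    "home.html": ["about.html", "news.html"],
    "about.html": ["home.html"],
    "news.html": ["archive.html"],
    "archive.html": [],
    "orphan.html": ["home.html"],
}

def crawl(start_page, web):
    """Follow links breadth-first from start_page; return pages in discovery order."""
    found = [start_page]
    queue = deque([start_page])
    while queue:
        page = queue.popleft()
        for link in web.get(page, []):
            if link not in found:
                found.append(link)
                queue.append(link)
    return found

indexed = crawl("home.html", TOY_WEB)
missed = sorted(set(TOY_WEB) - set(indexed))
print("Indexed:", indexed)
print("Never found:", missed)
```

The crawler only reaches pages that some already-known page links to, so the disconnected page stays out of the index even though it is on the web.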
When to use a search engine…
Time to review how best to search
Survey a lengthy results list
Examine many sites
Evaluate quality
Some knowledge about topic
Common Features of Search Engines
Search boxes
Options to refine searches
Advanced search techniques
Help
…use them!
How to Outsmart Google
 “Mall map,” Help or Advanced Search
AND:
Japanese AND camouflage AND skirt
+Japanese +camouflage +skirt
NOT:
“hip hop” NOT bunnies
“hip hop” -bunnies
How to Outsmart Google
OR: (north bay OR Sonoma county) +conservation
…simple synonyms?
(women OR females) AND marketing
(women OR females) +marketing
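The AND, NOT, and OR examples above behave like set operations. The sketch below is an illustration only: the page texts are invented, and matching is a simple substring test rather than anything a real search engine does.

```python
# Hypothetical page texts, invented for illustration.
PAGES = {
    1: "Japanese street fashion: the camouflage skirt trend",
    2: "Camouflage patterns in Japanese military history",
    3: "Hip hop dance classes for beginners",
    4: "Hip hop bunnies: a children's picture book",
    5: "Marketing to women in the North Bay",
    6: "Why females respond differently to marketing",
    7: "Women in science",
}

def pages_with(term):
    """Return the set of page numbers whose text contains the term (case-insensitive)."""
    return {n for n, text in PAGES.items() if term.lower() in text.lower()}

# AND narrows: every term must appear.
and_hits = pages_with("Japanese") & pages_with("camouflage") & pages_with("skirt")
# NOT excludes pages that contain the unwanted term.
not_hits = pages_with("hip hop") - pages_with("bunnies")
# OR broadens with synonyms, then AND narrows again.
or_hits = (pages_with("women") | pages_with("females")) & pages_with("marketing")

print("AND:", sorted(and_hits))
print("NOT:", sorted(not_hits))
print("OR + AND:", sorted(or_hits))
```

AND and NOT shrink the result list, while OR grows it, which is why OR is usually paired with another required term.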
More Boolean-esque Options
Phrase searching:
“vampire poodles”
“night terrors” +sleep disorders
Additional limiters:
Domain extension
Date range
Language
Truncation?
http://www.google.com/advanced_search
Best Bets
Phrase search (but be careful)
“oldest profession” example
Limit to .edu or .org or .gov
Use +
 http://www.google.com/help/refinesearch.html
Google toolbar options (not on Lecture Notes)
Highlight search terms
Find word in site
term site:santarosa.edu
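The phrase search and site: limiter above can be combined into a single query. The following is a rough sketch that assumes the search engine accepts the query in a `q` URL parameter; the helper name `search_url` is made up for this example.

```python
from urllib.parse import urlencode

def search_url(terms, site=None):
    """Build a search URL; `site` (optional) adds a site: limiter to the query."""
    query = terms if site is None else f"{terms} site:{site}"
    # urlencode escapes quotes, spaces, and colons so the URL stays valid.
    return "https://www.google.com/search?" + urlencode({"q": query})

url = search_url('"night terrors"', site="santarosa.edu")
print(url)
```

Typing `"night terrors" site:santarosa.edu` into the search box does the same thing; the code only shows how the limiter travels inside the query text.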
Examples of Popular, Quality Search
Engines
www.google.com
www.altavista.com
www.ask.com
Evaluating Internet Sources
Internet Sources
Evaluate!
No standards
Compare to peer-reviewed journals,
academic journals
Excellent Internet evaluation source:
http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/Evaluate.html
Applying START
Scope/Coverage, Treatment/Reliability, Authority, Relevancy, Timeliness
to Internet Sources
Evaluating Internet Sources: Scope/Coverage
Content of information
 What’s covered?
 Overview?
 Detailed information?
 Subtopic?
Scope/Content of Website
Example:
http://memory.loc.gov/ammem/collections/robinson/index.html
Evaluating Internet Sources: Scope/Coverage
Depth of information
 Complete or “web bites”?
 Edited/abridged information with info/links to original documents?
 Brief information sources warning
Depth of Information Examples
Deep:
http://www.pbs.org/wgbh/amex/zoot/
Not-so-deep:
http://law.jrank.org/pages/2971/Sleepy-Lagoon-Trials-1942-43.html
Evaluating Internet Sources:
Treatment/Reliability
Treatment: how topic is
treated/presented
 Information supported by
evidence?
 Non-inflammatory
language, reasonable
arguments?
 Bias, point of view
shown?
 (Wikipedia additions to
the class Reader)
Treatment Examples
Example 1:
The Smoking Section: Website about second-hand smoke
Since 1979, the number of smokers has declined significantly, from about 33% of adults, or higher, to a proportion varyingly reported as being from 20% to 25%. During the same period, a host of anti-smoking laws have dramatically curtailed smoking in public places. Today, exposure to ETS is not one tenth of what it was in 1979.
Yet, according to an article in the San Jose Mercury News (October 12, 1993), fatal asthma attacks have nearly doubled in that time. More than 5,100 Americans suffered fatal asthma attacks in 1991, up from about 2,600 in 1979.
Example 2:
The Cigarette Papers
Evaluating Internet Sources:
Treatment/Reliability
Purpose, perspective, orientation of information
Purpose of Web Site:
 Provide information?
 Selling a product?
 Arguing a position?
 Advertising clearly distinguishable from content?
Purpose of site example
Example:
http://www.addictionca.com/FAQ-ecstasy.htm
Can you tell the purpose of the site?
Google search for this site
Another Google search
Should you use this site as a source?
Evaluating Internet Sources:
Treatment/Reliability
 Sources cited?
 Enough information to
follow up?
 Statistics sources?
 “Experts” identified?
Reliability: trustworthiness of the
information in the source
Reliability example
http://www.factcheck.org/richardson_flunks_two_subjects.html
Evaluating Internet Sources: Authority
Discovering information
about the source, author,
organization, etc.
 Author(s) or source
listed? (Clearly?)
 Author’s occupation,
education, experience?
 Affiliated with known
organization,
institution?
 Experts w/ subject
knowledge?
 Qualifications to
address topic?
Authority Example
http://www.mayoclinic.com/health/AboutThisSite/aboutthissite
How to Find WWW Page Sources to
Evaluate Authority
Check page title, headings, menu, opening
paragraphs
Look near top, bottom, navigational bar
Look for description links:
“About the ______ Association”
“About Us”
“Mission Statement”
Evaluating Authority? Check…
Links to author’s faculty/professional pages
Articles, publications
Library catalogs, Internet search engines,
online databases
No source or author information? Be wary.
Webmaster not responsible for
page content
Examine URL…
Evaluating Internet Sources: Authority
URL: Clues to the origin of
information
“domain extensions”
 Is it an organization’s
web site (.org)?
 Is it a governmental
web site (.gov)?
 Is it a military web site
(.mil)?
 Is it a commercial
web site (.com)?
 Is it an educational
web site (.edu)?
Deconstructing URLs
 URL = Uniform Resource Locator: the “address” of a web document
 “Top Level” Domain = main subdivision of Internet addresses
 Check the last two or three letters after the final dot in the first section (host name) of the URL
 Backspace to first section of URL to find main page of site
 Restricted: .edu, .gov, .mil, .ca, .us
 Unrestricted: .net, .org, .com
 See ICANN | Top-Level Domains (gTLDs) for full listings
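The steps above (find the host name, read the letters after its final dot, back up to the main page) can be sketched in a few lines. This is an illustration under assumptions: the `DOMAIN_HINTS` table simply restates the domain extensions named in these slides and is not an official registry, and the function name is invented.

```python
from urllib.parse import urlsplit

# Rough meanings restated from the slides; not an official registry.
DOMAIN_HINTS = {
    "edu": "educational",
    "gov": "governmental",
    "mil": "military",
    "org": "organization",
    "com": "commercial",
}

def deconstruct(url):
    """Split a URL into host name, top-level domain, and the site's main page."""
    parts = urlsplit(url)
    host = parts.hostname
    tld = host.rsplit(".", 1)[-1]  # letters after the final dot of the host name
    return {
        "host": host,
        "tld": "." + tld,
        "kind": DOMAIN_HINTS.get(tld, "unknown"),
        "main_page": f"{parts.scheme}://{host}/",
    }

info = deconstruct("http://memory.loc.gov/ammem/collections/robinson/index.html")
for label, value in info.items():
    print(f"{label}: {value}")
```

The domain extension is only a clue to where information comes from; it does not replace checking the author and organization.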
Evaluating Internet Sources: Relevancy
Usefulness of information …to you!
 How is information relevant to your topic?
 What will you use for your project? Statistics? Data? Facts? Opinions?
 HOW does the source support thesis?
Relevancy example
Example:
http://www.thereminworld.com/article.asp?id=17
Evaluating Internet Sources: Timeliness
Currency of information
Information Currency:
 Date site was created?
 Last updated, revised?
 Information cited still valid?
Current References:
 Links to other web pages current/active?
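One possible clue to “last updated” is the Last-Modified date that many web servers send along with a page. The sketch below only parses such a date and counts the days to a reference date. The header value and the reference date are hypothetical, not every server sends this header, and it may reflect a file change rather than a real revision of the content.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def days_since_update(last_modified, today):
    """Days between a Last-Modified header value and a reference date."""
    updated = parsedate_to_datetime(last_modified)
    return (today - updated).days

# Hypothetical header value and reference date, chosen only for illustration.
header = "Tue, 02 Oct 2007 00:00:00 GMT"
today = datetime(2008, 3, 5, tzinfo=timezone.utc)
age = days_since_update(header, today)
print(f"Last modified {age} days before the reference date")
```

A date printed on the page itself (“Last revised…”) is still worth looking for, since it is what the author chose to state.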
Navigating Web Pages
Closed for Renovation!
Cite that Site!
Citing a Site
“Baseball, the Color Line and Jackie Robinson.”
Baseball and Jackie Robinson. Library of
Congress American Memory. 2 Oct. 2007.
<http://memory.loc.gov/ammem/collections/robinson/jr1940.html>
Formatting notes for URLs
Remove hyperlinks (disable links)
Other Internet Sources
When you search with a search engine,
you are not
searching the entire Internet
Estimated over 80% of information
available is invisible to search engines
The Hidden Web
Or, the Invisible Web or Deep Web or
Deep Matter
No matter what you call it…
It’s not a secret!
What’s not retrieved
Thousands of
specialized
databases, dynamic
pages, files… millions
of records
The Hidden Internet, cont.
 “Hidden” databases
produced by…
 Universities
 Libraries
 Associations
 Businesses
 Government agencies
Great article on the Invisible Web
Chris Sherman and Gary Price
“The Invisible Web: Uncovering Sources
Search Engines Can’t See”
From Library Trends
Let’s find it!
Invisible Web Search Tools
http://www.lii.org
infomine.ucr.edu
http://www.completeplanet.com
http://www.intute.ac.uk
http://gdnet.org/index.php
(Caveat: searching the last site is a challenge!)
Personal WWW Directories
 Specific topic areas
 Special interest
 Quirky
 Unlovely (design issues)
Examples of Personal WWW Directories
Matt Drudge’s Dad (it’s okay, he’s a
librarian):
www.refdesk.com
Gary Price, Hidden Web Guru:
http://www.freepint.com/gary/direct.htm
Other Sources of Information on the
Internet
1. Metasearch Engines
2. Newsgroups and Listservs
3. Blogs
4. News search engines
1. Metasearch Engines
Sends your keywords to several search engines at the same time
Returns a single list of results from multiple sources
“Source” engine identified (most of the time)
Standard Search Engines
Search for keywords, number of times
within document
Keywords from a single (updated)
database of websites
Each search engine searches unique
selection of web pages
Results ranked and sorted
Robots or spiders find new websites
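The idea of ranking by how many times a keyword appears in a document can be sketched as follows. This is a deliberately simplified illustration with invented page texts; real search engines weigh many more factors than a raw word count.

```python
import re
from collections import Counter

# Hypothetical indexed pages, invented for illustration.
INDEX = {
    "a.html": "Sleep disorders: night terrors and other sleep problems",
    "b.html": "Night terrors in children",
    "c.html": "Gardening at night",
}

def rank(keyword, index):
    """Rank pages by how often the keyword occurs; pages with no match are dropped."""
    scores = Counter()
    for page, text in index.items():
        words = re.findall(r"[a-z]+", text.lower())
        count = words.count(keyword.lower())
        if count:
            scores[page] = count
    return scores.most_common()

sleep_results = rank("sleep", INDEX)
night_results = rank("night", INDEX)
print(sleep_results)
print(night_results)
```

Because each engine keeps its own database of pages, the same keyword can produce different lists on different engines.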
Metasearch Engines
Transmits searches simultaneously to
several search engines
Results gathered from engines queried
Search terms sent to indexes maintained
by traditional search engines
Pros and Cons?
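The metasearch steps above (send one search to several engines, gather the results, show which engine each came from) can be sketched like this. The engine names and result lists are hypothetical; a real metasearch tool queries live search engines instead of reading a dictionary.

```python
# Hypothetical engines and results, invented for illustration.
ENGINE_RESULTS = {
    "EngineA": ["http://example.org/ferns", "http://example.com/garden"],
    "EngineB": ["http://example.com/garden", "http://example.net/soil"],
}

def metasearch(engine_results):
    """Merge per-engine result lists into one list, noting each URL's source engines."""
    merged = {}
    for engine, urls in engine_results.items():
        for url in urls:
            # A URL returned by several engines is listed once, with all its sources.
            merged.setdefault(url, []).append(engine)
    return list(merged.items())

combined = metasearch(ENGINE_RESULTS)
for url, engines in combined:
    print(url, "<-", ", ".join(engines))
```

One trade-off shows up even in this toy version: the merged list is broader, but any advanced search options that a single engine offers are not used.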
Metasearch Engines Examples
Directory:
http://www.santarosa.edu/library/Refs/engines.shtml
http://www.metacrawler.com/
http://www.dogpile.com/
http://ixquick.com/
http://www.webcrawler.com/
2. Newsgroups and Mailing Lists
Newsgroups
 Worldwide bulletin
board systems
 Tens of thousands of
forums (newsgroups)
 Groups focus on wide
range of topics,
accessible to anyone
& everyone
 “Bulletin board” style:
original and follow-up
postings
 Not subscription based
Newsgroups and Mailing Lists
Mailing Lists
 Automated mailing list distribution systems
 “Subscribe” to discussion lists
 Distributed to the entire subscriber base via e-mail
 Discussion groups
 Wide-ranging
 Subscription-based
Newsgroups and Mailing Lists
Pros
 Good for locating
professional
discussions
 Interactive, questions
welcome
 Collaborative
 Alternative voices
 Cyber-networking
Cons
 Unmoderated or
lightly moderated
forums
 “On the Internet, no
one knows you’re a
dog.”
 Can be difficult to
search
Newsgroups and Mailing Lists
 Accessible mailing lists by Catalist:
http://www.lsoft.com/lists/listref.html
 Google groups (the Usenet):
http://groups.google.com
 Usenet history: http://www.ibiblio.org/usenet-i/
 Yahoo groups:
http://groups.yahoo.com
 Topica lists: http://lists.topica.com/
 About.com forums: http://www.about.com
Other Ways to Find Collaborative Sources
Search Google or other search engines for
forums. For example:
PDF forums
Search Google or other search engines for
discussion groups. For example:
Gardening discussion groups
Newsgroup and Mailing List Search Examples
Google Groups (Usenet)
Catalist Listserv search
http://www.lsoft.com/lists/listref.html
3. Weblogs or Blogs
 Online journals,
updated often
 News-oriented
 Contain commentary
and links
 Personal or
professional focus
Pros:
 Alternative sources
 Constantly updated
 Wide range of topics
Cons:
 Difficult to search
 Too much information
 Reliability of authors
 “Feeding Frenzy”
Weblog Sources
 Google Directory:
http://directory.google.com/Top/Computers/Internet/On_the_Web/Weblogs/
 DMOZ Weblog listing:
http://dmoz.org/Computers/Internet/
 IPL Blog Page:
http://www.ipl.org/div/blogs
 Search Google or other search engines for
weblogs or blogs. For example:
Gardening weblogs
Homework for Next Week
Read through UC Berkeley’s Internet
Evaluation Site
Listen to Talk of the Nation Wikipedia
show
(Click on the “Listen” icon at the top of the
page; story is about 30 minutes long)
…and/or read the transcript of the show
Complete Internet Quiz
Homework for Next Week
Internet Source Assignment
May e-mail before 3/12
Final Project Due!
See you next week!