ITIS 1210
Introduction to Web-Based
Information Systems
Internet Research One
Introduction
 Internet
 Global network of interconnected networks
 Infrastructure supporting the WWW
 World Wide Web
 Repository of information
 Typically in an HTML format
 Accessible via browsers
Introduction
 Background:
 The Internet literally spans the planet
 The WWW contains information stored on millions of computers
 Problem:
 How do you find the right information in such
a vast storehouse?
Introduction
 What do you need to know?
 What search tools are available
 How to create an Internet search strategy
 How to select the proper search terms
 Simple
 Complex
 How to perform a basic search
 How to analyze the results
 How to cite online resources properly
Search Tools
 Four types
 Search engines
 Metasearch engines
 Subject guides
 Specialized search tools
 No tool searches the entire Internet!
 Some are better than others for certain
uses
Search Tools
[Figure: search engines, metasearch engines, subject guides, and specialty search tools shown in relation to the WWW and the Internet]
Search Tools
[Figure: the “surface” Web and the “deep” Web, with search engines, subject guides, and specialty search tools]
 The unsearched “deep” Web is 500 times larger than the “surface” Web
Search Engines
 Locate Web pages based on keywords
 Keyword
 Nouns, verbs
 Describe page in terms of major concepts
 Spider programs
 “Crawl” the Web
 Return results
 Results indexed
Search Engines
 Index consists of
 Keyword
 Link to Web page containing that keyword
 Precise process but
 Narrow view of Web
 Can become outdated
 Only works for specific content
 Slow
 Remember: no engine searches the entire Web
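The keyword-to-link index described above can be sketched in a few lines. This is a minimal illustration only: the page texts and example.com addresses are made up, and real engines index far more than whole words.

```python
from collections import defaultdict

# Hypothetical pages standing in for what a spider program might bring back.
PAGES = {
    "http://example.com/solar": "solar panels convert sunlight into energy",
    "http://example.com/wind": "wind turbines convert wind into energy",
}

def build_index(pages):
    """Map each keyword to the set of page links that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, keyword):
    """Return the links recorded for one keyword, in sorted order."""
    return sorted(index.get(keyword.lower(), set()))

index = build_index(PAGES)
print(search(index, "energy"))  # both pages
print(search(index, "solar"))   # only the solar page
```

A keyword that no crawled page contains returns an empty list, which is one way to picture the "narrow view of Web" limitation.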
Metasearch Engines
 Enter keyword(s)
 Results are links to Web pages
 Source is not Web itself but other search
engines
 Useful for finding highest ranked results
from multiple search engines at once
 Duplicates can be eliminated
 Ranked by relevancy
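A rough sketch of how results from several engines might be merged, with duplicates removed and a combined ranking. The scoring rule (adding 1/rank per engine) and the engine names are assumptions chosen for illustration, not the method of any particular metasearch engine.

```python
def metasearch(results_by_engine, top_n=10):
    """Merge ranked result lists from several engines into one list.

    Each URL is kept once; its score adds 1/rank for every engine
    that returned it, so pages ranked highly by several engines rise.
    """
    scores = {}
    for ranked_urls in results_by_engine.values():
        # Only the top hits from each engine are considered.
        for rank, url in enumerate(ranked_urls[:top_n], start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical result lists from two made-up engines.
results = {
    "engine_a": ["a.example/solar", "b.example/energy", "c.example/panels"],
    "engine_b": ["b.example/energy", "d.example/wind", "a.example/solar"],
}
merged = metasearch(results)
print(merged)
```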
Metasearch Engines
 Not an optimum source
 Subject to timeouts
 Retrieve only top 10-50 hits from each site
 May not have advanced search features
 May exclude major databases
 Then there’s the “haystack problem”
 “Just what are you looking for, anyway?”
Metasearch Engines
 A known needle in a known haystack
 A known needle in an unknown haystack
 An unknown needle in an unknown haystack
 Any needle in a haystack
 The sharpest needle in a haystack
 Most of the sharpest needles in a haystack
 All the needles in a haystack
 Affirmation of no needles in the haystack
 Things like needles in any haystack
 Let me know whenever a new needle shows up
 Where are the haystacks?
 Needles, haystacks -- whatever
Subject Guides
 Hierarchically organized directories
 User navigates through hierarchy to find
relevant links
 Good for a broad view of a topic
 Can follow hierarchy to narrow search results
 Typically prepared by hand
 May be academic, professional, commercial
 May provide keyword searches of their
database
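One way to picture a subject guide is as nested categories with links at the bottom. The sketch below is an assumption-laden toy: the category names and example.org links are invented, and real guides are edited by hand rather than generated.

```python
# Hypothetical directory: categories nest, and leaves hold lists of links.
DIRECTORY = {
    "Science": {
        "Energy": {
            "Alternative Energy": [
                "http://example.org/solar",
                "http://example.org/wind",
            ],
            "Fossil Fuels": ["http://example.org/coal"],
        },
    },
    "Arts": {"Music": ["http://example.org/jazz"]},
}

def browse(directory, path):
    """Follow a list of category names down the hierarchy."""
    node = directory
    for category in path:
        node = node[category]
    return node

def all_links(node):
    """Collect every link at or below a node (the broad view of a topic)."""
    if isinstance(node, list):
        return list(node)
    links = []
    for child in node.values():
        links.extend(all_links(child))
    return links

# Broad view: everything under Science > Energy.
print(all_links(browse(DIRECTORY, ["Science", "Energy"])))
# Narrowed view: one more step down the hierarchy.
print(browse(DIRECTORY, ["Science", "Energy", "Alternative Energy"]))
```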
Specialized Search Tools
 Useful in finding information “invisible” to
traditional search engines/subject guides
 Data in “deep” Web usually stored in
 Proprietary databases
 Specialty directories
 Newsgroups
 Reference sites
Specialized Search Tools
 Access to these resources may be
restricted to
 Authorized users
 Subscribers
 Intelligent search agents
 Specialized software
 Queries these sites
Googlewhacks
 Two words that yield exactly one hit
 Example:
 Antimatter easterness
Internet Search Strategy
 Effective searching is planned searching
 Most users do what?
 Enter a single keyword
 Get back thousands (millions!) of hits
 Too many hits is just as bad as too few
 What’s needed?
 Effective
 Efficient
Internet Search Strategy
 Seven steps to
 Finding the information you need
 In a timely manner
[Flowchart]
 1. Define topic and initial keywords
 2. Locate background information and identify additional keywords
 3. Choose proper search tool
 4. Translate questions into effective search query
 5. Perform search
 6. Evaluate results
 Satisfied? If NO, start over; if YES, continue to step 7
 7. Note information for citation
Internet Search Strategy
 Define your topic, note keywords
 What do you want to end up with?
 Write down your topic
 What are the keywords?
 Might be phrases too
 Try to identify all the necessary concepts
Internet Search Strategy
 Background information & additional
keywords
 Look up the topic somewhere else
 Encyclopedia, periodical, reference source
 Add important keywords to your list
 Ask someone who might know
 Check spelling!
Internet Search Strategy
 Choose search tool
 Web is a big place
 Lots of tools available
 Not all tools are good for every query
 Some rules:
   For specific content use search engines or metasearch engines
   For broad concepts use subject guides
   If all else fails use specialized search tools
Internet Search Strategy
TOOL | BEST FOR SEARCHES | WHERE? | HOW TO SEARCH | EXAMPLE TOOLS
Search Engines | General or specific | Own indexes compiled from data gathered from Web | Enter keywords, phrases, complex searches | GOOGLE, ALTAVISTA
Metasearch Engines | General or specific | Indexes of multiple search engines simultaneously | Enter keywords, phrases, complex searches | IXQUICK, VIVISIMO
Subject Guides | More general | Own files or database | Click through subject categories (may allow keyword searches) | LII.ORG, VLIB.ORG
Specialized Search Tools | More specific | “Invisible” databases, directories, reference sites, newsgroups | Enter keywords, phrases, complex searches |
Internet Search Strategy
 Create effective search query
 Query
 Word(s)/phrases/symbols that a search engine can
interpret
 Effective query
 Keywords that best describe the topic
Internet Search Strategy
 Perform search
 Engines use search forms
 Fields where you enter information about your
search
 (Subject guides usually require actively
clicking on links to navigate the hierarchy)
Internet Search Strategy
 Evaluate search results
 Results are typically a list of links that match
your query
 Quantity, quality, format vary from engine to
engine
 Evaluation based on some criteria you select
 Source of Web page
 Currency of information
 Past experience
Internet Search Strategy
 Refine search
 Results may be too broad
 Quality/quantity may not be optimum
 Possible steps
 Refine query
 Different keywords/phrases
 Use a different tool
 If still not satisfied re-evaluate keywords
 Too specific?
 Too obscure?
Identifying Keywords
 A topic is usually too broad to be useful
 Must identify the major elements of your
topic
 What best describes the topic?
 What makes this topic different from similar
ones?
 These are the keywords
 How should you create a list of keywords?
Identifying Keywords
 Write a sentence or two that summarizes
your topic
I want to find Web sites
about alternative energy
Identifying Keywords
 Look for uncommon words that are unique
to this topic
I want to find Web sites
about alternative energy
 You’re looking for words that a search tool
can use effectively
Identifying Keywords
 Identify words that
 You expect will appear on a Web page
 That will be useful to your research
 Not all words will be useful
 Some words will appear in all sites
 And, or, the, etc.
 Look for unique words closely related to your
topic
 That won’t be found on sites you aren’t interested in
Identifying Keywords
Parts of speech | Examples
Articles | a, an, the
Conjunctions & Prepositions | and, or, but, in, of, for, on, into, from, than, at, to
Adjectives & Adverbs | quick, fine, happy, as, also, probably, however, very
Pronouns & Verbs | this, that, these, those, is, be, see, do
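As a small sketch of the idea, the common words from the parts-of-speech table can be filtered out of the topic sentence to leave candidate keywords. The word list here is a partial, illustrative one, and the extra words added for the example sentence are an assumption, not part of the table.

```python
# Words from the parts-of-speech table that rarely help a search.
# This is a small illustrative subset, not a complete stop-word list.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "but", "in", "of", "for", "on",
    "into", "from", "than", "at", "to", "this", "that", "these",
    "those", "is", "be", "see", "do",
    # Assumed additions needed for the example sentence below.
    "i", "want", "find", "about", "web", "sites",
}

def candidate_keywords(sentence):
    """Return the words of a topic sentence that are not stop words."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return [w for w in words if w and w not in STOP_WORDS]

keywords = candidate_keywords("I want to find Web sites about alternative energy")
print(keywords)  # ['alternative', 'energy']
```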
Identifying Keywords
 If necessary:
 Define keywords
 Find background information on your topic
 Useful if this topic is new to you
 Use:
 Encyclopedias
 Dictionaries
Identifying Keywords
 Example:
 Dictionary says alternative energy is from
non-fossil fuels
 Solar and wind are examples
 Encyclopedia says water and geothermal
energy can be used as power sources
 So can something called biomass
Identifying Keywords
 Updated list of keywords is now:
Keywords
alternative
energy
solar
wind
water
biomass
geothermal
Identifying Keywords
 Identify synonyms
 Words that have same or nearly same
meaning
 Why?
 Web pages are created by individuals
 They won’t all use the same words to
describe every topic
 An expanded list should be broad enough to
include Web pages indexed under a variety of
similar terms
Identifying Keywords
Keywords | Synonyms & Related Terms
alternative | renewable, sustainable
energy | power
solar | panels, photovoltaic
wind | turbines, windmills
water | hydropower, hydroelectric
biomass | waste-to-energy, bioenergy
geothermal | heat, pumps
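The keyword and synonym table can be treated as a simple lookup for building an expanded term list. The function name and the choice to list each keyword before its synonyms are illustrative assumptions.

```python
# Keyword-to-synonym table taken from the slide above.
SYNONYMS = {
    "alternative": ["renewable", "sustainable"],
    "energy": ["power"],
    "solar": ["panels", "photovoltaic"],
    "wind": ["turbines", "windmills"],
    "water": ["hydropower", "hydroelectric"],
    "biomass": ["waste-to-energy", "bioenergy"],
    "geothermal": ["heat", "pumps"],
}

def expanded_terms(keywords, synonyms):
    """Return each keyword followed by its synonyms, without duplicates."""
    seen = []
    for keyword in keywords:
        for term in [keyword] + synonyms.get(keyword, []):
            if term not in seen:
                seen.append(term)
    return seen

terms = expanded_terms(["solar", "energy"], SYNONYMS)
print(terms)  # ['solar', 'panels', 'photovoltaic', 'energy', 'power']
```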
Basic Search
 Search engines go about searches
differently
 Spiders don’t always crawl the same parts of
the Web
 Therefore they return different results
 Using one engine restricts you to only those
parts of the Web indexed by that engine
 Also, different ranking algorithms are used
 So relevancy may be different for different engines
Basic Search
 Check out the engine’s Help section
before starting
 An optimum search with one engine might
not be optimum for another
 Comparing results from different engines
can be useful
Basic Search
 Some engines accept sponsored
payments to rank results
 Sometimes they indicate this, sometimes
not
 Check out
http://searchenginewatch.com/
Basic Search
 Try it: search for “solar energy”
http://search.aol.com/
www.google.com
www.ask.com
Using Phrases
 Sometimes a single word is too broad
 Sometimes word order is important
 Phrase searching
 Usually accomplished by placing keywords
within quotation marks:
“solar energy”
Using Phrases
 Search for: solar energy
 Search for: energy solar
 Two-word searches have identical results
[Figure: Web pages with both words and Web pages without both words]
Using Phrases
 Search for: “solar energy”
 Phrase search has different results
[Figure: Web pages without both words, Web pages with both words, and Web pages with the phrase]
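The difference between a two-word search and a phrase search can be shown with two small matching functions. This is a simplified sketch on made-up page text; real engines match against an index rather than raw text.

```python
def has_all_words(text, query):
    """True if every query word appears somewhere in the text, in any order."""
    words = text.lower().split()
    return all(term in words for term in query.lower().split())

def has_phrase(text, phrase):
    """True if the query words appear next to each other, in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    return any(
        words[i:i + len(target)] == target
        for i in range(len(words) - len(target) + 1)
    )

# Hypothetical page texts.
page_a = "solar energy is a renewable resource"
page_b = "energy from solar panels is renewable"

# Word order does not matter for a two-word search: both pages match.
print(has_all_words(page_a, "solar energy"), has_all_words(page_b, "energy solar"))
# The phrase search matches only the page with the exact phrase.
print(has_phrase(page_a, "solar energy"), has_phrase(page_b, "solar energy"))
```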
Using Phrases
 How do you specify phrase searching?
 Usually with quotation marks
 Sometimes not necessary
 Drop-down menu
 Check box option
 Look at Advanced link if there is one
 For example, Yahoo
Analyzing Results
 A word about domains
 Academic sites: .edu
 Commercial sites (sell or advocate): .com
 Professional or organization: .org
 Government: .gov
 Other countries: .uk or .au
 As you scan results pay attention to
domain names in the Web pages returned
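Scanning domain names can be automated in a rough way by looking at the ending of each host name. The labels below follow the slide's list; the example URLs are invented, and treating every unlisted ending as "other" is an assumption.

```python
from urllib.parse import urlparse

# Site types as listed on the slide; any other ending is reported as "other".
SITE_TYPES = {
    "edu": "academic",
    "com": "commercial",
    "org": "professional or organization",
    "gov": "government",
    "uk": "other country",
    "au": "other country",
}

def site_type(url):
    """Classify a URL by the last label of its host name."""
    host = urlparse(url).hostname or ""
    ending = host.rsplit(".", 1)[-1]
    return SITE_TYPES.get(ending, "other")

print(site_type("http://www.example.edu/solar"))    # academic
print(site_type("https://www.example.co.uk/wind"))  # other country
```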
Analyzing Results
 Look for your search terms in the results
 Number of times a keyword shows up may
indicate its relevance
 Google displays keywords in bold
 Proximity may indicate relevance
 Keyword in URL
 Decipher the URL
 Mnemonic URLs contain the keyword
 Easy to remember the URL
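A minimal sketch of two of these checks: counting how often a keyword shows up in a result and testing whether it appears in the URL. The result fields and the example.org address are hypothetical, and the count is a plain substring count.

```python
def keyword_report(keyword, title, snippet, url):
    """Count keyword occurrences in a result and check the URL for it."""
    keyword = keyword.lower()
    text = f"{title} {snippet}".lower()
    return {
        "count": text.count(keyword),       # substring matches in title + snippet
        "in_url": keyword in url.lower(),   # mnemonic URLs contain the keyword
    }

report = keyword_report(
    "solar",
    "Solar Energy Basics",
    "How solar panels turn sunlight into power.",
    "http://example.org/solar-energy",
)
print(report)  # {'count': 2, 'in_url': True}
```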
Analyzing Results
 Note result ranking
 Many engines rank-order results
 Mathematical formula or algorithm
 Different engines use different algorithms
 Most relevant sites displayed first
 Generally, the best results are within three or four pages of the top
 Google’s I’m Feeling Lucky button
Analyzing Results
 Did results return a directory or subject
guide?
 If so, it’s probably relevant
 Does the engine use cached pages?
 Links break, pages go down
 Previous versions of pages usually still
maintained in the index
 Useful in finding newer version
Analyzing Results
 Navigate within results
 How many hits did you get?
 Search within to narrow results
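Searching within results amounts to filtering the hits you already have by an additional term. A toy sketch, assuming each result is just a line of text (the result strings are invented):

```python
def search_within(results, extra_term):
    """Keep only the results whose text mentions an additional term."""
    term = extra_term.lower()
    return [r for r in results if term in r.lower()]

# Hypothetical results from an earlier search for "solar energy".
results = [
    "Solar energy for homes: panels and costs",
    "History of solar energy research",
    "Solar energy panels compared",
]
narrowed = search_within(results, "panels")
print(len(results), "hits narrowed to", len(narrowed))
```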
Citing Online Resources
 Two reasons to cite resources:
 So you can find them again if you need to
 To avoid plagiarism
 Assume everything on the Web is copyrighted
 Must have the author’s permission to use material for profit
Citing Online Resources
 Fair Use exemption to copyright law
 Students & researchers
 Use small amounts
 Educational purposes
 No permission required
 Must always give credit for work not your
own
Citing Online Resources
 Some schools/professors require specific
citation formats
 Find out which is needed for your situation
 Citations have a format
 Maintains consistency
 Recognizable and easy to understand parts
 Two common styles
 MLA – Modern Language Association
 APA – American Psychological Association
Citing Online Resources
 MLA citation elements
 Author
 Web page title
 Web site title
 Date Web page was created or last revised
 Internet address
 Date you visited the Web page
Citing Online Resources
 MLA citation format
Author last name, author first name.
“Web PAGE title.”
Web SITE title.
Date created or revised.
<Full Internet address>
(Date you viewed the Web page)
Citing Online Resources
 Omit items that cannot be located on the
Web page
 Author might be a corporation or
organization
 Skip the Web page title if you’re citing the
entire site
 Don’t underline the URL, remove the
underline if the word processor
automatically includes one
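The citation layout and the rule about omitting missing items can be sketched as a small formatting function. It follows the order shown on the slides rather than any particular edition of the MLA handbook; the function name, parameter names, and sample values are all hypothetical.

```python
def mla_citation(author=None, page_title=None, site_title=None,
                 revised=None, url=None, viewed=None):
    """Assemble citation parts in the slide's order, omitting any
    item that could not be located on the Web page."""
    parts = []
    if author:
        parts.append(f"{author}.")
    if page_title:
        parts.append(f'"{page_title}."')
    if site_title:
        parts.append(f"{site_title}.")
    if revised:
        parts.append(f"{revised}.")
    if url:
        parts.append(f"<{url}>")
    if viewed:
        parts.append(f"({viewed})")
    return " ".join(parts)

# Made-up values for illustration.
citation = mla_citation(
    author="Doe, Jane",
    page_title="Solar Basics",
    site_title="Example Energy Site",
    revised="4 May 2006",
    url="http://www.example.org/solar",
    viewed="12 Sept. 2006",
)
print(citation)
```

Leaving out `page_title` gives the form used when citing an entire site, and leaving out `revised` covers pages with no visible creation or revision date.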
Citing Online Resources
 Creation/revision date may be anywhere
on page
 Omit this if you can’t locate one
 Most browser print functions include the
URL and current date on the hardcopy
 Make a note of it in case you don’t print the
page yourself