Internet Tutorial.04.. - Computer and Information Science

Tutorial 4
Searching the Web
Using Search Engines and
Directories Effectively
Objectives
• Determine whether a research question is specific or
exploratory.
• Learn how to formulate an effective Web search
strategy to answer research questions.
• Learn how to use Web search engines, Web
directories, and Web meta-search engines effectively.
Objectives
• Use Boolean logic and filtering techniques to improve
your Web searches.
• Use advanced search options in Web search
engines.
• Assess the validity and quality of Web research
resources.
• Learn about the future of Web search tools.
Types of Search Questions
• A specific question is a question that you can phrase
easily and one for which you will recognize the answer
when you find it.
• An exploratory question is an open-ended question that
can be harder to phrase; it is also harder to tell when
you have found a good answer.
Specific Question
New Perspectives on the Internet, 5e
Tutorial 4
Exploratory Question
Web Search Process
Web Search Strategy
• You may need to reformulate, or more clearly state,
your question.
• Try to think of synonyms for each word.
• Identify unique phrases that relate to your topic or
question.
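The strategy above (list synonyms for each concept, then combine them) can be sketched in a few lines of Python. The function name `build_expression`, the example concepts, and the exact operator syntax (uppercase AND/OR, parentheses, quoted phrases) are assumptions for illustration; each search engine documents its own syntax.

```python
def build_expression(concepts):
    """Join synonyms with OR inside parentheses, then join the concepts with AND."""
    groups = []
    for synonyms in concepts:
        # Quote multi-word phrases so they stay together (assumed syntax).
        quoted = [f'"{s}"' if " " in s else s for s in synonyms]
        group = " OR ".join(quoted)
        groups.append(f"({group})" if len(quoted) > 1 else group)
    return " AND ".join(groups)

# Each inner list holds the synonyms for one concept in the question.
expression = build_expression([
    ["car", "automobile"],
    ["fuel economy", "gas mileage"],
])
print(expression)
```

Keeping one list per concept makes it easy to add or drop a synonym when the first results are too narrow or too broad.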
Using Search Engines
Four Broad Categories Of Search Tools:
1. Search engines
2. Directories
3. Meta-search engines
4. Other Web resources such as Web
bibliographies
Understanding Search Engines
• A web search engine is a Web site (or part of a Web site) that
finds other Web pages that match a word or phrase you enter.
• The word or phrase you enter in a search engine is called a
search expression or a query.
• A search expression or query might also include instructions that
tell the search engine how to search.
• A search engine does not search the Web to find a match; it
searches only its own database of information about Web pages
that it has collected, indexed, and stored.
Understanding Search Engines
• A hit is a Web page that is indexed in the search engine’s
database and that contains text that matches your search
expression.
• Most search engines report the number of hits they find.
• All search engines provide a series of results pages,
which are Web pages that contain hyperlinks to the Web
pages that contain text that matches your search
expression.
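A minimal sketch of the idea that a search engine looks up its own stored database rather than the live Web, and reports the matching pages as hits. The page URLs, their text, and the names `build_index` and `search` are hypothetical; real engines store far richer information.

```python
from collections import defaultdict

# Hypothetical pages the engine has already collected; URLs and text are made up.
PAGES = {
    "http://example.com/a": "used car prices and reviews",
    "http://example.com/b": "train schedules for commuters",
    "http://example.com/c": "car rental deals",
}

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, term):
    """Look the term up in the stored index; the live Web is never contacted."""
    return sorted(index.get(term.lower(), set()))

INDEX = build_index(PAGES)
hits = search(INDEX, "car")
print(len(hits), "hits:", hits)
```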
Understanding Search Engines
• A Web robot, also called a bot or a spider, is a program
that automatically searches the Web to find new Web sites
and update information about old Web sites that already
are in the database.
• Most search engines allow Web page creators to submit
the URLs of their pages to search engine databases.
• Search engine operators often sell advertising space on
the search engine Web page and on the results pages.
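The Web robot described above can be approximated by a breadth-first walk over links. This sketch uses a small in-memory link graph (`LINKS`, entirely hypothetical) in place of real network requests, so it shows only the traversal logic, not fetching or indexing.

```python
from collections import deque

# Hypothetical link graph standing in for the Web: page URL -> URLs it links to.
LINKS = {
    "http://example.com/": ["http://example.com/news", "http://example.com/shop"],
    "http://example.com/news": ["http://example.com/", "http://example.org/"],
    "http://example.com/shop": [],
    "http://example.org/": ["http://example.com/shop"],
}

def crawl(start_url, links):
    """Visit pages breadth-first by following links; return URLs in visit order."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)  # a real robot would fetch and index the page here
        for target in links.get(url, []):
            if target not in seen:  # never queue the same page twice
                seen.add(target)
                queue.append(target)
    return visited

order = crawl("http://example.com/", LINKS)
print(order)
```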
Understanding Search Engines
• Some search engine operators sell paid placement rights on
results pages. These paid placement links are often labeled as
“sponsored,” and they are usually called sponsored links.
• If the advertising appears in a box on the page (usually at the
top, but sometimes along the side or bottom of the page), it is
usually called a banner ad.
• Revenue from sponsored links and banner ads pays for the
computer hardware and software required to search the Web
and to create and search the database; any revenue left over
is the operator’s profit.
Understanding Search Engines
Figure: HotBot search results for the search term “car”
Using More Than One
Search Engine
• Each search engine includes different Web pages in its
database.
• Different search engines use different rules to evaluate search
expressions.
• The best way to determine how a specific search engine
interprets search expressions is to read the Help pages on the
search engine Web site.
• Search engines change the way they interpret search
expressions from time to time, so you should read the Help
pages regularly.
Understanding Search Engine
Databases
• Search engine databases store different collections
of information about the pages that exist on the Web
at any given time.
• Each search engine database indexes the
information it has collected from the Web differently.
• Search engine robots may collect information from a
Web page’s title, description, keywords, or HTML tags,
or they may read a certain number of words from each
Web page.
Understanding Search Engine
Databases
• A META tag is HTML code that a Web page creator places
in the page header for the specific purpose of informing
Web robots about the content of the page.
META tags in a Web page:
<HEAD>
<TITLE>Current Developments in Electronic Commerce</TITLE>
<META NAME="description" CONTENT="Current news and reports about electronic commerce developments.">
<META NAME="keywords" CONTENT="electronic commerce, electronic data interchange, value added reseller, EDI, VAR, secure socket layer, business on the internet">
</HEAD>
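As a rough illustration of how a robot might read META tags like these, the sketch below uses Python's standard `html.parser` module. The class name `MetaTagReader` is hypothetical, and the keyword list in `PAGE` is shortened from the slide's example; this is not how any particular search engine is known to work.

```python
from html.parser import HTMLParser

class MetaTagReader(HTMLParser):
    """Collect the NAME/CONTENT pairs of META tags, as a robot might."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this method.
        if tag == "meta":
            fields = dict(attrs)
            if fields.get("name") and fields.get("content"):
                self.meta[fields["name"].lower()] = fields["content"]

PAGE = """<HEAD>
<TITLE>Current Developments in Electronic Commerce</TITLE>
<META NAME="description" CONTENT="Current news and reports about electronic commerce developments.">
<META NAME="keywords" CONTENT="electronic commerce, EDI, VAR">
</HEAD>"""

reader = MetaTagReader()
reader.feed(PAGE)
# Split the comma-separated keyword list into individual keywords.
keywords = [k.strip() for k in reader.meta["keywords"].split(",")]
print(keywords)
```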
Understanding Search Engine
Databases
• Full text indexing: storing the entire content of every
Web page that the search engine indexes.
• Stop words: common words, such as and, the, it,
and by, that many search engines omit from their
databases.
• Many search engines include information about their
search engines, robots, and databases on their Help
or About pages.
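The effect of stop words on what gets stored can be sketched as follows. The stop word set contains only the four words named above, and the function name `index_terms` is an assumption; real engines use longer lists of their own.

```python
# Stop words named on the slide; real engines use longer, engine-specific lists.
STOP_WORDS = {"and", "the", "it", "by"}

def index_terms(text):
    """Return the words an engine that drops stop words would keep, in order."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    return [w for w in words if w and w not in STOP_WORDS]

terms = index_terms("The report was written by the committee, and it passed.")
print(terms)
```

An engine that does full text indexing would keep every word, so a search for a phrase made up mostly of stop words can behave very differently from one engine to the next.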
Search Engine Features
• Page ranking is a way of grading Web pages by the
number of other Web pages that link to them. The URLs of
Web pages with high rankings are presented first on the
search results page.
• A natural language query interface allows users to enter
a question exactly as they would ask a person that
question.
• The procedure of converting a natural language question
into a search expression is sometimes called parsing.
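Page ranking as defined above (grading pages by the number of other pages that link to them) can be sketched with a simple inbound-link count. The link graph `OUTLINKS` and the function name are hypothetical, and this plain count is a simplification: actual engines are not described in the source and may weight links quite differently.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
OUTLINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example", "b.example"],
}

def rank_by_inbound_links(outlinks):
    """Order pages by how many other pages link to them, most-linked first."""
    inbound = Counter({page: 0 for page in outlinks})
    for source, targets in outlinks.items():
        for target in set(targets):      # count each linking page only once
            if target != source:         # ignore self-links
                inbound[target] += 1
    # Sort by descending count, then by name so ties are deterministic.
    return sorted(inbound, key=lambda page: (-inbound[page], page))

ranking = rank_by_inbound_links(OUTLINKS)
print(ranking)
```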
Using Directories and Hybrid
Search Engine Directories
• A Web directory is a listing of hyperlinks to Web pages that is
organized into hierarchical categories.
• The difference between a search engine and a Web directory is
that people select the Web pages to include in a Web directory.
• Many directories allow a Web page to be indexed in several
different categories.
• The main weakness of a directory is that you must know which
category is likely to yield the information you desire.
• Yahoo! is one of the oldest and most respected directories on the
Web.
Using Directories and Hybrid
Search Engine Directories
• The combination of search engine and directory is
sometimes called a hybrid search engine directory.
• Using a hybrid search engine directory can help you
identify which category in the directory is likely to contain
the information you need.
• After you enter a category, the search engine is useful for
narrowing a search even further. You can enter a search
expression and limit the search to that category.
Using Meta-Search Engines
• A meta-search engine is a tool that combines the power of
multiple search engines.
• Some meta-search tools also include directories.
• Because each search engine on the Web has different strengths
and weaknesses, you might need to use several individual
search engines to perform a complete search for a particular
question.
• Using a meta-search engine lets you search several engines at
the same time.
• Profusion, a popular meta-search engine, routes search terms
to more than ten search engines and Web directories.
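A minimal sketch of what a meta-search engine does: send one query to several engines and merge the answers. The three engines and their results are hypothetical in-memory stand-ins, and the merge rule (URLs returned by more engines come first) is an assumption for illustration, not a description of how Profusion or any other tool orders results.

```python
# Hypothetical engines: each maps a query to its own ranked list of URLs.
ENGINE_RESULTS = {
    "engine_one": {"solar power": ["u1.example", "u2.example", "u3.example"]},
    "engine_two": {"solar power": ["u2.example", "u4.example"]},
    "engine_three": {"solar power": ["u3.example", "u2.example"]},
}

def meta_search(query, engines):
    """Send one query to every engine and merge the results without duplicates.

    URLs returned by more engines come first; ties keep first-seen order.
    """
    votes = {}
    for results_by_query in engines.values():
        for url in results_by_query.get(query, []):
            votes[url] = votes.get(url, 0) + 1
    # sorted() is stable, so equal vote counts preserve insertion order.
    return sorted(votes, key=lambda url: -votes[url])

merged = meta_search("solar power", ENGINE_RESULTS)
print(merged)
```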
Using Other Web Resources
• Other Web resources are similar to bibliographies in that they
contain lists of hyperlinks to Web pages.
• Many of these resources include summaries or reviews of Web
pages.
• They are often called Web bibliographies, but many other
names are used for them:
• Resource lists
• Subject guides
• Clearinghouses
• Virtual libraries
Using Other Web Resources
• Other Web resources are sometimes confusingly called Web
directories.
• Web bibliographies are usually more focused on specific
subjects than Web directories, and Web bibliographies usually
do not include a tool for searching within their categories.
• These other resources can be very useful when you want to
obtain a broad overview or a basic understanding of a complex
subject area.
• Some Web bibliographies are general references. Most are
more focused. Many are created by librarians at university and
public libraries.
Boolean Logic and
Filtering Techniques
• The most important factor in obtaining good results in a
Web search is careful selection of the search terms you
use.
• You can usually choose one or two words that will work
well when the object of your search is straightforward.
• More complex search questions require more complex
queries, which you can build with Boolean logic, search
expression operators, or filtering techniques to broaden
or narrow your search.
Boolean Operators
• Boolean algebra was developed by George Boole, a
nineteenth century British mathematician.
• Boolean operators, or logical operators, specify the
logical relationship between the elements they join.
• Three basic Boolean operators—AND, OR, and NOT—
are recognized by most search engines.
• You can use these operators in many search engines by
including them with search terms.
Boolean Operators
Search Expression                 Search Returns Pages that Include
exports AND France AND Japan      All of the three search terms
exports OR France OR Japan        Any of the three search terms
exports NOT France NOT Japan      Exports, but not if the page also includes the terms France or Japan
exports AND France NOT Japan      Exports and France, but not Japan
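The four expressions above map directly onto set operations. In this sketch each search term is associated with the set of pages that contain it; the page numbers in `INDEX` are invented for illustration.

```python
# Hypothetical index: search term -> set of page IDs containing that term.
INDEX = {
    "exports": {1, 2, 3, 4},
    "france": {2, 3, 5},
    "japan": {3, 4, 6},
}

exports, france, japan = INDEX["exports"], INDEX["france"], INDEX["japan"]

all_three = exports & france & japan          # exports AND France AND Japan
any_of_three = exports | france | japan       # exports OR France OR Japan
exports_only = exports - france - japan       # exports NOT France NOT Japan
exports_france = (exports & france) - japan   # exports AND France NOT Japan

print(all_three, any_of_three, exports_only, exports_france)
```

AND narrows a search (intersection), OR broadens it (union), and NOT removes pages (difference).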
Other Search
Expression Operators
• A precedence operator, also called an inclusion
operator or a grouping operator, clarifies the
grouping within a complex expression and is usually
indicated by parentheses.
• A location operator, or proximity operator, lets you
search for terms that appear close to each other in
the text of a Web page. The most common location
operator offered in Web search engines is the NEAR
operator.
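Both operators can be sketched briefly. The first part shows how parentheses change the result of a mixed AND/OR expression, using invented page-number sets; the second is a toy NEAR test. The function name `near` and the default window of 10 words are assumptions, since each engine that offers NEAR defines its own distance.

```python
# Precedence: the same three terms grouped two different ways.
# Hypothetical sets of page IDs containing each term.
wine, france, exports = {1}, {2}, {2, 3}

grouped_left = (wine | france) & exports    # (wine OR France) AND exports
grouped_right = wine | (france & exports)   # wine OR (France AND exports)

def near(text, first, second, max_distance=10):
    """Return True if the two words occur within max_distance words of each other."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_distance
               for i in first_positions for j in second_positions)

TEXT = "Exports from France rose sharply while imports from Japan fell."
print(grouped_left, grouped_right)
print(near(TEXT, "exports", "France", max_distance=2))
print(near(TEXT, "exports", "Japan", max_distance=3))
```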
Wildcard Characters
• Most search engines support some use of a wildcard
character in their search expressions.
• A wildcard character allows you to omit part of a
search term.
• Many search engines recognize the asterisk (*) as
the wildcard character.
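A wildcard search can be imitated with Python's standard `fnmatch` module, in which `*` also stands for any run of characters. The word list and the function name `expand_wildcard` are hypothetical. Note that `fnmatch` additionally treats `?` and `[...]` as special, which search engines may not.

```python
from fnmatch import fnmatchcase

# Hypothetical list of words found in an engine's database.
INDEXED_WORDS = ["export", "exports", "exporter", "exporting", "import", "expert"]

def expand_wildcard(pattern, words):
    """Return the indexed words matching a pattern where * stands for any characters."""
    return [w for w in words if fnmatchcase(w, pattern)]

matches = expand_wildcard("export*", INDEXED_WORDS)
print(matches)
```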
Search Filters
• Many search engines allow you to restrict your
search by using search filters.
• A search filter eliminates Web pages from a search.
• The filter criteria can include such Web page
attributes as language, date, domain, host, or page
component.
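The sketch below shows one way a search filter could eliminate pages from a result list by language, domain, and modification date. The `Page` record, the sample results, and the parameter names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse

@dataclass
class Page:
    url: str
    language: str
    modified: date

# Hypothetical search results before filtering.
RESULTS = [
    Page("http://www.example.edu/trade", "en", date(2004, 3, 1)),
    Page("http://www.example.com/trade", "en", date(2001, 6, 15)),
    Page("http://www.example.fr/commerce", "fr", date(2004, 5, 20)),
]

def apply_filters(pages, language=None, domain_suffix=None, modified_after=None):
    """Keep only the pages that satisfy every filter that was supplied."""
    kept = []
    for page in pages:
        host = urlparse(page.url).hostname or ""
        if language and page.language != language:
            continue
        if domain_suffix and not host.endswith(domain_suffix):
            continue
        if modified_after and page.modified <= modified_after:
            continue
        kept.append(page)
    return kept

filtered = apply_filters(RESULTS, language="en", modified_after=date(2003, 1, 1))
print([page.url for page in filtered])
```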
Complex Searches
• Most search engines implement many of the
operators and filtering techniques you have learned
about.
• Some search engines provide separate advanced
search pages for these techniques.
• Some search engines allow you to use advanced
techniques such as Boolean operators on their
simple search pages.
Using AltaVista
Advanced Search
• Open the AltaVista search engine in your Web browser.
• Select the Advanced Search option.
• Formulate and enter a suitable search expression.
• Click the Find button.
• Evaluate the results and, if necessary, revise your search
expression.
Filtered Search in HotBot
• Open the HotBot search engine page in your Web browser.
• Select the HotBot Advanced Search link.
• Formulate and enter a suitable search expression.
• Set any filters you want to use for the search.
• Click the SEARCH button.
• Evaluate the results and, if necessary, revise your search
expression.
Filtered Search in Google
• Open the Google search engine page in your Web browser.
• Click the Advanced Search link.
• Formulate and enter suitable search expression elements.
• Formulate and set appropriate search filters.
• Click the Google Search button.
• Evaluate the results and, if necessary, revise your search
expression.
Search Engines with
Clustering Features
• Vivísimo is a search engine that uses advanced
technology to group its results into clusters.
• The clustering of results provides a filtering effect.
• The filtering is done automatically by the search
engine after it runs the search.
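To make the filtering effect of clustering concrete, here is a deliberately simple toy: it groups result titles under any word (other than the query term) that two or more titles share. This is not Vivísimo's technology, which the source does not describe; the titles, stop word list, and function name are all hypothetical.

```python
from collections import defaultdict

# Small assumed stop word list so that common words do not form clusters.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}

def cluster_by_shared_word(titles, query):
    """Group result titles under each non-query word that two or more titles share."""
    groups = defaultdict(list)
    for title in titles:
        for word in sorted(set(title.lower().split())):
            if word != query.lower() and word not in STOP_WORDS:
                groups[word].append(title)
    # Keep only words shared by at least two titles.
    return {word: members for word, members in groups.items() if len(members) > 1}

TITLES = [
    "Jaguar car dealers",
    "Jaguar habitat in the rainforest",
    "Used Jaguar car prices",
    "Rainforest habitat of the jaguar",
]
clusters = cluster_by_shared_word(TITLES, "jaguar")
for label, members in clusters.items():
    print(label, "->", members)
```

Even this toy separates results about the animal from results about the automobile, which is the kind of after-the-search filtering the slide describes.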
Obtaining Clustered Search
Results Using Vivísimo
• Open the Vivísimo search engine page in your browser.
• Formulate and enter a suitable search expression.
• Click the Search button.
• Evaluate the results and, if necessary, revise your search
expression.
Future of Web Search Tools
• A number of different companies and organizations are
working on ways to make searching the Web easier.
• Work on natural language interfaces continues as search
engine sites strive to make the job of searching even
easier for users.
• An increasing number of search engines offer natural
language querying as an option for entering search
expressions.
Using People to Enhance
Web Directories
• One company, About.com, hires people with expertise in
specific subject areas to create and manage its Web
directory entries in those areas.
• The Open Directory Project uses the services of more
than 40,000 volunteer editors who maintain listings in their
individual areas of interest.
• The Open Directory Project offers the information in its
Web directory to other Web directories and search
engines at no charge.
Evaluating the Validity and Quality
of Web Research Resources
• Information on the Web is seldom subjected to the review and
editing processes that have become a standard practice in print
publishing.
• The risks of obtaining and relying on inaccurate or unreliable
information can be significant.
• Reduce your risk by carefully evaluating the quality of any Web
resource on which you plan to rely for information related to an
important judgment or decision.
• Evaluate the Web page’s authorship, content, and
appearance.
Author Identity and Objectivity
• The Web page should identify the author and present the author’s
background information and credentials.
• Check secondary sources for corroborating information.
• Author contact information should be provided.
• Examine the domain identifier in the URL.
• Consider whether the qualifications presented by the author
pertain to the material that appears on the Web site.
• Information about the author’s affiliations should be provided.
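Examining the domain identifier in a URL can be done by hand, but a short sketch with Python's standard `urllib.parse` shows the idea. The function name `domain_identifier` and the example URLs are hypothetical, and the sketch does not handle numeric IP addresses.

```python
from urllib.parse import urlparse

def domain_identifier(url):
    """Return the last label of the URL's host name, such as 'edu' or 'com'."""
    host = urlparse(url).hostname
    if not host or "." not in host:
        return None  # no recognizable host name in the URL
    return host.rsplit(".", 1)[1].lower()

print(domain_identifier("http://www.example.edu/research/paper.html"))
print(domain_identifier("https://example.gov"))
```

The identifier is only one clue about who publishes a page, so it should be weighed together with the other checks in the list above.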
Content
• Determine timeliness of the content by checking the
publication date.
• Read the content critically and evaluate whether the
included topics are relevant to the research question at
hand.
• Determine whether important topics or considerations
were omitted.
• Assess the depth of treatment the author gives to the subject.
Form and Appearance
• Many pages that contain low-quality or incorrect
information are poorly designed and not well edited.
• Spelling errors on a Web page often indicate a
low-quality resource.
• Loud colors, graphics that serve no purpose, and
flashing text are all Web page design elements that
often suggest a low-quality resource.
Summary
• You learned how to formulate specific and
exploratory research questions.
• You learned how to use a structured Web search
process to find information on the Web.
• You learned how to develop search expressions and
use them in search engines, Web directories, and
meta-search engines.
Summary
• You learned what Boolean operators, precedence
operators, and location operators are and how they
work in several major search engines.
• You learned how to use wildcards in search
expressions.
• You learned how to use several types of filtering
techniques to narrow your search results.
Summary
• You learned how to evaluate the validity and reliability
of a Web page by using information about author
identity and objectivity.
• You learned how to evaluate the validity and reliability
of a Web page by evaluating content, form and
appearance.