Module 3 Powerpoint slides - Initial Set Up
Module 3
Internet
Search Engines
Need for them?
Millions of pages on web.
How to locate the most relevant?
Allow you to enter a search request (what
you want to look for) and return a list of
webpages matching your query
Anatomy of a Search Engine
Spider (a.k.a robots, webbots)
Traverses the web and stores the contents of all searchable web
pages.
(this happens *before* you ever enter your query)
Websites can decide to deny access to some resources
• Using a robots.txt file eg. http://www.usask.ca/robots.txt
Indexing Software
Indexes the web pages into searchable database
(this happens *before* you ever enter your query)
Query Interface
Allows users to enter keywords and other combinations.
Searches are performed within the indexed database
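The indexing and query steps above can be sketched as a tiny inverted index. This is only an illustration: the page URLs and text are invented, and a real search engine tokenizes, stores and ranks pages in far more elaborate ways.

```python
from collections import defaultdict

# Hypothetical pages that a spider might have stored (invented for illustration).
PAGES = {
    "http://example.org/a": "bass fishing tips for lakes",
    "http://example.org/b": "bass guitar music lessons",
    "http://example.org/c": "fishing boats and lakes",
}

def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the sorted URLs that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)

INDEX = build_index(PAGES)            # built before any query is entered
print(search(INDEX, "bass fishing"))  # searches only the index, not the live web
```

The index is built once, ahead of time, which is why a query can be answered without visiting any web page.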
Different Search Engines
www.yahoo.com
“Yet Another Hierarchical Officious Oracle”
Directory listing organised into various categories
• Like the Yellow Pages in our phone book.
• All pages are hand linked
www.altavista.com
“a view from above”
First truly huge indexed database of web pages
www.google.com
“googol”: 1 followed by 100 zeros
Top search engine today - over 100 million queries a
day.
Why Google?
Results ranked by relevance
Important because a typical user rarely goes beyond
the first page
How is relevance measured?
# of links that point to the same page.
Not just # of times a keyword is repeated.
Careful here: If enough people say a lie to be true,
it becomes the truth.
• Googlebomb: “talentless hack”
Googlewhack: ‘the search for the one’!
• Eg. dogmatism unicyclist
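Ranking by the number of links pointing at a page can be sketched as a simple count of inbound links. The link graph below is invented, and this is not Google's actual algorithm, which weighs links in more sophisticated ways; it only illustrates the idea of counting links.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "a.html": ["c.html", "b.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html", "a.html"],
}

def rank_by_inlinks(links):
    """Return (page, inbound link count) pairs, most linked-to first."""
    counts = Counter()
    for source, targets in links.items():
        counts[source] += 0              # make sure pages nobody links to still appear
        for target in set(targets):      # count each linking page only once
            if target != source:         # ignore links from a page to itself
                counts[target] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

for page, inbound in rank_by_inlinks(LINKS):
    print(page, inbound)
```

A Googlebomb works against exactly this kind of measure: many pages agree to link to the same target.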
Effective Searching
Select the right keywords
• Saves time and frustration
AND OR NOT
AND: combines two keywords
• specifies that both keywords should be found on the resulting
web page
OR: combines two keywords
• Specifies one or both keywords to be found on the web page
NOT: operates on a single keyword
• Ensures that this keyword should not be found in any page
returned.
Examples:
• vacation london OR paris
• plane AND geometric NOT air
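The two example queries can be mimicked with set operations, where each keyword stands for the set of pages containing it. The page names are invented, and the first query is read here as vacation AND (london OR paris), which is an assumption about how a given search engine groups the operators.

```python
# Hypothetical index: keyword -> set of page names that contain it.
INDEX = {
    "vacation": {"p1", "p2", "p3"},
    "london": {"p1"},
    "paris": {"p2", "p4"},
    "plane": {"p5", "p6"},
    "geometric": {"p5", "p6"},
    "air": {"p6"},
}

def pages(word):
    """Pages containing the keyword (empty set if the word is unknown)."""
    return INDEX.get(word, set())

# vacation london OR paris  ->  vacation AND (london OR paris)
vacation_hits = pages("vacation") & (pages("london") | pages("paris"))

# plane AND geometric NOT air
plane_hits = (pages("plane") & pages("geometric")) - pages("air")

print(sorted(vacation_hits))
print(sorted(plane_hits))
```

AND maps to set intersection, OR to union, and NOT to set difference.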
Effective Searching
+/- signs
+ indicates a keyword must be present in the result
- indicates a keyword must not be present
The signs are usually stuck to the keyword
Examples:
• +bass +fish -music
• star wars episode +1
Quotation marks “ ”
Groups a set of keywords so that the resulting page
should have these in the exact same order
Can be used in combination with other methods
Examples: “star wars episode 1”
• “to kill a mockingbird” -movie
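How the +/- signs and quotation marks filter a single page can be sketched as follows. This is a simplified illustration, not how any particular search engine works: unsigned keywords are treated as required, punctuation in the page text is not handled, and queries use straight double quotes.

```python
import shlex

def matches(query, text):
    """Check one page's text against a query with +word, -word and "exact phrases"."""
    text = " ".join(text.lower().split())      # normalise case and spacing
    padded = f" {text} "                       # so whole words can be matched
    for term in shlex.split(query.lower()):    # shlex keeps quoted phrases together
        exclude = term.startswith("-")
        term = term.lstrip("+-")
        found = f" {term} " in padded
        if found == exclude:                   # required but absent, or excluded but present
            return False
    return True

print(matches('+bass +fish -music', "Bass fish recipes"))
print(matches('"to kill a mockingbird" -movie', "To Kill a Mockingbird the movie"))
```

Because a quoted phrase is matched as one string, the words must appear together and in the same order.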
Networking and
Telecommunication
Linking Up: Network Basics
A computer network is any system of two or more computers that are linked together.
Advantages to networks:
People share computer hardware, thus reducing costs
People share data and software programs, thus increasing efficiency and production
The Internet
Internet is a network of networks
Globally connected network that links
various organisations and individuals.
The Web is not the Internet.
• WWW is one particular usage of
internet.
• Email, FTP (File Transfer Protocol) are
other such uses.
Connecting to the Internet
Bandwidth:
The amount of information
that can be transmitted in a given amount
of time
Impacted by:
• Physical media that make up the network
• Amount of network traffic
• Software protocols of the network
Connecting to the Internet
Dial up connections
• Modems
Broadband connections
• DSL (Digital Subscriber Line): 300 Kbps to 1.3 Mbps
• Cable modems: 10 Mbps
Direct connections using T1 or T3 lines
• 1.5 Mbps to 45 Mbps
Communication à la Modem
A modem is a hardware device
that connects a computer to a
telephone line (for remote
access).
Modulator-demodulator
May be internal on the system
board or external modem sitting in
a box linked to a serial port.
Modem transmission speed is measured in bits per second (bps); modems generally transmit at 28.8 Kbps to 56 Kbps.
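To get a feel for what these bandwidth figures mean, the ideal transfer time of a file is its size in bits divided by the link speed. The sketch below uses nominal speeds close to those quoted for each connection type and a hypothetical 5 MB file; real transfers are slower because of protocol overhead and network traffic.

```python
# Nominal speeds in bits per second, close to the figures quoted for each connection type.
SPEEDS_BPS = {
    "56K modem": 56_000,
    "DSL (top of range)": 1_300_000,
    "T1": 1_500_000,
    "Cable modem": 10_000_000,
    "T3": 45_000_000,
}

def transfer_seconds(size_bytes, bits_per_second):
    """Ideal time to move size_bytes over a link, ignoring overhead and traffic."""
    return size_bytes * 8 / bits_per_second

FILE_SIZE = 5_000_000  # a hypothetical 5 MB file (decimal megabytes)
for name, bps in SPEEDS_BPS.items():
    print(f"{name:20s} {transfer_seconds(FILE_SIZE, bps):8.1f} s")
```

Note the factor of 8: file sizes are usually given in bytes, while connection speeds are given in bits per second.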
Networks: Near and Far…
Networks Near and Far
Local-area network (LAN)
Computers are linked within a
building or cluster of
buildings.
Each computer and
peripheral is an individual
node on the network.
Nodes are connected by
cables
Wide-Area Networks
WAN is a network that
extends over a long
distance.
Each network site is a node
on the WAN network
Made up of LANs linked by
phone lines, microwave
towers, and communication
satellites.
Data is transmitted over common pathways called
a backbone.
• CANet4: one of the world’s fastest fiber-optic networks
Intranets are self-contained intra-organizational
networks that offer email, newsgroups, file
transfer, Web publishing and other Internet-like
services.
Could include LANs and WANs
Firewalls prevent unauthorized communication
with the outside world and secure sensitive
internal data
Gateways act as the gate keepers, letting some
things through the firewall and stopping others.
Communication Protocols
Communication Software
Protocol
- set of rules for the exchange of
data between computers
TCP/IP Transmission Control Protocol /
Internet Protocol
• Messages are broken into Packets - 1500 bytes
• Packets are numbered and sent over the network
Communication Software
IP defines the addressing system
128.236.24.161 - “dotted quad”, 0 to 255
Each computer on the internet has an IP address
Every packet includes the source IP, destination IP
and the packet number (7 of 13)
TCP is an end-to-end communication protocol.
packets are reliably transmitted from one computer
to another.
Lost packets are re-transmitted.
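The packet idea can be sketched as below: a message is cut into numbered pieces, each labelled with source and destination addresses, and the receiver puts them back in order. This is a simplified model under stated assumptions, not the real TCP/IP packet format: the Packet class and function names are hypothetical, and real TCP numbers bytes rather than labelling packets "7 of 13".

```python
import random
from dataclasses import dataclass
from ipaddress import IPv4Address

@dataclass
class Packet:
    source: IPv4Address
    destination: IPv4Address
    number: int      # position of this packet in the message, starting at 1
    total: int       # how many packets the message was split into
    payload: bytes

def split_message(message, source, destination, size=1500):
    """Break a message into numbered packets of at most `size` payload bytes."""
    src, dst = IPv4Address(source), IPv4Address(destination)  # rejects malformed addresses
    chunks = [message[i:i + size] for i in range(0, len(message), size)] or [b""]
    return [Packet(src, dst, n, len(chunks), chunk)
            for n, chunk in enumerate(chunks, start=1)]

def reassemble(packets):
    """Put packets back in order; raise if any packet is missing."""
    ordered = sorted(packets, key=lambda p: p.number)
    expected = ordered[0].total
    if [p.number for p in ordered] != list(range(1, expected + 1)):
        raise ValueError("missing packets: ask the sender to re-transmit")
    return b"".join(p.payload for p in ordered)

message = b"x" * 4000
packets = split_message(message, "128.236.24.161", "128.233.130.63")
random.shuffle(packets)                      # packets may arrive out of order
print(len(packets), reassemble(packets) == message)
```

The packet numbers are what let the receiver restore the order and notice that something was lost.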
Communication Software
Communication software establishes a
protocol that is followed by the computer’s
hardware
Different forms:
Client/server model - one or more computers act as dedicated
servers and all the remaining computers act as clients
• Web server and client browsers
Peer-to-peer model - every computer on the network is both
client and server
• Napster, Gnutella
Many networks are hybrids, using features of the client/server
and peer-to-peer models
Client/Server Model
Server software responds to
client requests by providing
data
Client software sends
requests from the user to the
server
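The request/response pattern can be tried on a single machine with Python's standard socket module. This is a minimal sketch, not a real web server: the function names and the reply text are invented, and both ends run on the local machine.

```python
import socket
import threading

def serve_once(listener):
    """Server side: accept one client, read its request, send back some data."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        conn.sendall(f"You asked for: {request}".encode("utf-8"))

def run_demo():
    """Start a server on the local machine, then act as a client talking to it."""
    with socket.create_server(("127.0.0.1", 0)) as listener:   # port 0 = any free port
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(b"index.html")                      # the client's request
            reply = client.recv(1024).decode("utf-8")          # the server's response
        server.join()
    return reply

print(run_demo())
```

The server only ever reacts; it is always the client that starts the conversation.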
Internet Addresses…
Addressing Computers
Unique IP numbers
Need for it? – similar to the house address
Gateways
Take care of routing packets in and out of a LAN
Routers
Take care of routing packets across multiple network nodes
Internet Addresses
DNS (Domain Name System) translates between numeric IP addresses and human-readable names
Address: 128.233.130.63 is www.cs.usask.ca
Address: 216.239.51.101 is www.google.com
Easier to remember strings of alphabets than
numbers.
DNS servers
Arranged in a hierarchy, with 13 named root servers (A through M) at the top
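A name lookup can be pictured as a table from names to addresses, consulted by walking the hierarchy from the root down. The sketch below is a toy: it uses only the two name/address pairs quoted above (which may no longer be current), and the function names are hypothetical. In a real program, Python's socket.gethostbyname performs an actual lookup.

```python
# Toy name table; real DNS data is spread over many servers.
RECORDS = {
    "www.cs.usask.ca": "128.233.130.63",
    "www.google.com": "216.239.51.101",
}

def resolve(name):
    """Look up the IP address for a host name (case-insensitive), or None if unknown."""
    return RECORDS.get(name.lower().rstrip("."))

def lookup_path(name):
    """List the zones a hierarchical lookup would walk, from the root downwards."""
    labels = name.lower().rstrip(".").split(".")
    return ["."] + [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]

print(resolve("www.cs.usask.ca"))
print(lookup_path("www.cs.usask.ca"))
```

Reading the name from right to left gives the order in which the hierarchy is consulted: root, then .ca, then usask.ca, and so on.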
Internet Domains
Internet addresses are classified by Domains
Top level domains include:
.edu - educational sites
.com - commercial sites
.gov - government sites
.mil - military sites
.org - nonprofit organizations
.ca - Canada
Multiple computers can be mapped on to the same domain name
Eg. www.yahoo.com
Web Addresses
Dissecting a Web page address:
http://www.vote-smart.org/help/database.html
• http:// is the protocol for Web pages
• www.vote-smart.org is the path to the host
• help/database.html is the resource page
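The same dissection can be done in code with Python's standard urllib.parse module, shown here on the address dissected above.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.vote-smart.org/help/database.html")

print(parts.scheme)   # the protocol for Web pages
print(parts.netloc)   # the host
print(parts.path)     # the resource page on that host
```

The library reports the path with its leading slash, so the resource comes back as /help/database.html.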
Addressing Resources
URL: Uniform Resource Locator
A web address like:
• http://www.cs.usask.ca/index.html
A Web server stores webpages and sends pages to client web browsers on demand.
FTP: File Transfer Protocol
ftp://ftp.cs.usask.ca
Allows users to download and upload files between remote servers and their computers
Telnet:
telnet://scrooge.usask.ca
Allows users to log in to remote computers.
WWW
World Wide Web not the same
as the internet?
WWW – a definition
The World Wide Web is part of the internet. It
is a collection of multimedia documents created
by organizations and users worldwide.
Documents are linked in a hypertext web that
allows users to explore them with simple mouse
clicks
Surfing the Web
Browser
lets you look at and navigate info on the WWW
Uses HTTP to communicate with web servers
E.g. Netscape Navigator, Internet Explorer,
Mozilla, Opera
HTTP
HyperText Transfer Protocol
A set of rules for exchanging files on the WWW
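The "set of rules" is mostly plain text. The sketch below builds the text a browser might send to ask for a page and splits the first line of a server's reply. It is a minimal illustration with hypothetical helper names; real browsers send many more header lines.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines)        # HTTP lines end with carriage return + line feed

def parse_status_line(line):
    """Split a response status line such as 'HTTP/1.1 200 OK' into its parts."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason

print(build_get_request("www.cs.usask.ca", "/index.html"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```

The blank line at the end of the request tells the server that the headers are finished.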
Cookies
Cookies are files created on your computer by
a website to store information about you.
To accept or not ?
Benefits:
stores some of the personal information (repeat info)
allows pages to be customised to your preferences
Eg. Layouts, advertisements…
Privacy issues.
Do you want your browsing patterns to be used by a
company / organisation?
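What a cookie looks like on the wire can be shown with Python's standard http.cookies module. The cookie names and values here (a layout preference and a visit counter) are invented for illustration.

```python
from http.cookies import SimpleCookie

# What a website might send: a cookie remembering a layout preference.
outgoing = SimpleCookie()
outgoing["layout"] = "compact"           # hypothetical preference name and value
outgoing["layout"]["path"] = "/"         # the cookie applies to the whole site
header = outgoing.output(header="Set-Cookie:")
print(header)

# What the browser sends back on a later visit, and how the site reads it.
incoming = SimpleCookie()
incoming.load("layout=compact; visits=3")
preferences = {name: morsel.value for name, morsel in incoming.items()}
print(preferences)
```

The visit counter hints at the privacy question: the same mechanism that remembers a layout can also record browsing patterns.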
Email
Your computer
• Write email to Bob
• Press “Send”
Your mail server
• Gets email
• Sends to Bob’s mail server, passing through other mail servers if needed
Bob’s mail server
• Receives email
• Waits until Bob logs on, sends email
Bob’s computer
• Bob logs on
• Checks email
• Receives your message
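The "wait until Bob logs on" step is the heart of email delivery: mail servers store messages and hand them over later. A toy model, with a hypothetical MailServer class that stands in for real mail software:

```python
from collections import defaultdict, deque

class MailServer:
    """A toy mail server that holds messages until the recipient logs on."""

    def __init__(self):
        self.mailboxes = defaultdict(deque)

    def receive(self, recipient, message):
        """Store an incoming message in the recipient's mailbox."""
        self.mailboxes[recipient].append(message)

    def relay(self, other_server, recipient, message):
        """Pass a message on to the recipient's own mail server."""
        other_server.receive(recipient, message)

    def fetch(self, recipient):
        """Hand over all waiting messages, oldest first, and empty the mailbox."""
        waiting = list(self.mailboxes[recipient])
        self.mailboxes[recipient].clear()
        return waiting

your_server, bobs_server = MailServer(), MailServer()
your_server.relay(bobs_server, "bob", "Hi Bob")   # you press "Send"
print(bobs_server.fetch("bob"))                   # later, Bob logs on and checks email
```

Because the message waits on Bob's server, Bob's computer does not need to be switched on when you send it.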
Email on the Internet
What appears on the
screen depends on the
type of Internet
connection you have
and the mail program
you use.
Popular graphical
email programs
include Eudora,
Outlook and Netscape
Communicator.
Addressing Persons
Examples:
[email protected]
User President, whose mail is stored on the host whitehouse in the government domain of the USA
[email protected]
User abc123 at the server for Computer Science, University of Saskatchewan, Canada.
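An email address reads as user, then @, then the host where that user's mail is stored. The sketch below splits an address into those parts; the address used is a made-up example, and the check is deliberately loose rather than a full validation of email syntax.

```python
def split_address(address):
    """Split 'user@host.domain' into (user, host name, top-level domain)."""
    user, at, host = address.strip().rpartition("@")
    if not at or not user or "." not in host:
        raise ValueError(f"not a valid-looking address: {address!r}")
    return user, host, host.rsplit(".", 1)[1]

print(split_address("abc123@mail.example.org"))   # hypothetical address
```

The last part of the host name is the top level domain, the same classification used for web addresses.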
Email Protocols
POP vs. IMAP
Online/offline access
• POP: messages are downloaded locally
• IMAP: messages remain on the server
Receiving messages
• POP: queued on server, transferred in same order
• IMAP: headers downloaded, choose messages to transfer
Sending messages
• POP: uses SMTP, a different protocol
• IMAP: also uses SMTP; IMAP itself only handles reading mail
Size of mailbox
• POP: limited only by local HD
• IMAP: determined by server
Asynchronous Communication
Mailing lists
allow you to participate in email discussion groups on special-interest topics.
emails are sent to the whole group
Message boards and Newsgroups
public discussion on a particular subject consisting of notes written to a central Internet site
I-help is a message board
Real-Time Communication
Instant Messaging
exchanging instant messages with online friends and co-workers
Chat Rooms
for conversing with multiple people in real-time
Internet telephony
used for local and long-distance toll-free telephone service
Videoconferencing
for remote face-to-face meetings
Netiquette, Viruses, and
Internet Issues
Email Netiquette
Never say in email something you wouldn’t want quoted
in the news
Cool down, don’t send spur-of-the-moment messages you’ll
soon regret
Give context with Subject lines
Use > to quote messages in responses
Forward messages sparingly
Do not Spam
Actually violation of copyright law
Proof-read!
Email Netiquette
URLs:
Place URLs in < > if they are long
Do not place punctuation marks immediately after a
URL
www.cs.usask.ca.
Attachments:
Send sparingly
Consider whether recipient will be able to read
Don’t overload inboxes
Netiquette
Learn the non-verbal language of the Net :)
Emotions
Surround words by * to *emphasize*
Use CAPITAL letters sparingly
• It is considered SHOUTING and RUDE to write in ALL CAPS
Emoticons
• :) :-) :( ;-) :)- :P
Acronyms
• BTW, LOL, FYI, ROTFL, IMHO, <g>, TTYL
Netiquette - forwards
“forward this to 100 people and Bill Gates will give you $1000”
“Deodorant causes breast cancer”
“This kid is dying – forward to everyone you know as her last wish”
“Sign this petition to save the people of so-and-so”
“Internet cleaning day – please unplug your computer”
THESE ARE HOAXES –
DO NOT FORWARD OR BELIEVE!!!!
If you are still unsure, CHECK FIRST!
http://hoaxbusters.ciac.org
http://www.symantec.com/avcenter/hoax.html
http://www.urbanlegends.com/
Online Community terms…
Lurking:
Reading, “listening” to the conversation without taking part
Spamming:
Posting a message numerous times, taking up room and annoying other users
Newbie:
Someone who is new to being online (or email/internet/etc)
Cancelbot:
A program running in the background that detects spam and deletes it.
Flaming:
An angry, nasty, typically vulgar response to someone
Viruses
Viruses
Programs that could damage your data and hinder a computer’s normal functioning.
Can:
• Activate itself: executable files, boot sector, macros
• Replicate itself: through e-mail attachments
• Do “something”: destroy contents
Trojan horses
Malicious programs disguised as useful software.
Worms
Travel across the network and replicate themselves
Antivirus and security programs check for known viruses and protect against attacks
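One simple way to "check for known viruses" is to compare a file's fingerprint with a list of fingerprints of files already identified as malicious. The sketch below uses SHA-256 hashes from Python's standard hashlib module; the known-bad list is invented, and real antivirus programs use many techniques beyond whole-file fingerprints.

```python
import hashlib

def fingerprint(data):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical "known virus" list: fingerprints of files already identified as malicious.
KNOWN_BAD = {fingerprint(b"pretend this is a malicious attachment")}

def is_known_virus(data):
    """True if the file's fingerprint matches an entry in the known-bad list."""
    return fingerprint(data) in KNOWN_BAD

print(is_known_virus(b"pretend this is a malicious attachment"))
print(is_known_virus(b"holiday photos"))
```

This also shows why such a list must be kept up to date: a virus that is not yet on it goes undetected.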
Internet Issues: Ethical and
Political Dilemmas
Freedom of Speech?
Should what is said on the internet be subject to any laws?
Parental controls?
Privacy
Should there be limits on what info can be gathered about you online?
Digital divide
Tech haves vs. have-nots
Intellectual Property:
How do copyright laws apply to online content? Across international boundaries?
Does the end-use of the copied material make a difference?
Corresponding Textbook Readings
P. 14 – 18
P. 36 – 38
P. 250 – 256
P. 265 – 277
P. 288 – 295
P. 299 – 300
P. 316 – 318
To Know – Module 3
Keywords
How do the internet and web work?
How do search engines work?
How does email work?
Netiquette
How do viruses work?
What are some ethical/political/social issues with respect to the internet?