CIS 228
The Internet
Day 8, 9/22
Review HFHTML ch 1 - 7
Alphabet Soup
HTML (delineates document structure)
HyperText Markup Language
CSS (specifies document presentation)
Cascading Style Sheets
XHTML (HTML as an XML sub-language)
eXtensible HyperText Markup Language
XML
eXtensible Markup Language
An HTML Document
<html>
<head>
<title>A minimal web page</title>
</head>
<body>
<h1>Hello world!</h1>
</body>
</html>
HTML Vocabulary
Tag – markup (enclosed in angle brackets)
Opening tags: <html>, <div id="end">, <h1>, <p>, <q>, <em>
Closing tags: </html>, </div>, </h1>, </p>, </q>, </em>
Empty tags: <br>, <hr>, <img src="photo.jpg" alt="my pic">
Element – a component of a document
An empty tag, or
Opening tag, matching closing tag, everything in between
Attribute – a name-value pair in an opening or empty tag
id="end", src="photo.jpg", alt="my pic", class="address"
Character Entity – special characters
&amp; (“&”), &lt; (“<”), &gt; (“>”), &copy; (“©”)
HTML Character Entities
Name      Number    Character   Description
&lt;      &#60;     <           less than
&gt;      &#62;     >           greater than
&amp;     &#38;     &           ampersand
&quot;    &#34;     "           double quote
&apos;    &#39;     '           apostrophe
&nbsp;    &#160;                non-breaking space
Required in content, meaningless in markup (tags)
More: http://www.w3schools.com/tags/ref_entities.asp
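To see the entity table in action, here is a minimal sketch using Python's standard html module. The helper names to_entities and from_entities are made up for this illustration; only html.escape and html.unescape come from the library.

```python
import html

def to_entities(text):
    """Replace the characters that are special in HTML with character entities."""
    # quote=True also escapes double and single quotes.
    return html.escape(text, quote=True)

def from_entities(markup):
    """Turn named and numeric character entities back into plain characters."""
    return html.unescape(markup)

print(to_entities('if a < b & b > c then "stop"'))
print(from_entities("&copy; &amp; &#60;"))
```

Note that html.escape writes the apostrophe as a numeric entity rather than as a named one.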
HTML Elements
Opening tag, content, closing tag (or empty tag)
Elements Nest (form a document tree)
The <html> element is the root
Each element is fully contained in a unique parent
Two kinds:
Block elements (large sections of a document)
Inline elements (mostly text)
Raw text (e.g. “a mule”)
Multiple whitespace chars (“ ”, “\t”, “\n”) collapse to a single space
Short sections of text (e.g. “a <em>very stupid</em> mule”)
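The nesting and whitespace rules above can be sketched with Python's standard html.parser module. This is only an illustration: the class name OutlineParser, the function outline, and the short EMPTY_TAGS list are assumptions made for the example, and the sketch drops leading and trailing whitespace from each text run so the outline is easier to read.

```python
from html.parser import HTMLParser

# Empty elements never get a closing tag, so they cannot have children.
EMPTY_TAGS = {"br", "hr", "img", "meta", "link"}

class OutlineParser(HTMLParser):
    """Collects one indented line per element or text run."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + "<" + tag + ">")
        if tag not in EMPTY_TAGS:
            self.depth += 1  # children of this element are indented further

    def handle_endtag(self, tag):
        if tag not in EMPTY_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        # Runs of spaces, tabs, and newlines collapse to a single space.
        text = " ".join(data.split())
        if text:
            self.lines.append("  " * self.depth + '"' + text + '"')

def outline(markup):
    """Return the document tree of the markup as a list of indented lines."""
    parser = OutlineParser()
    parser.feed(markup)
    parser.close()
    return parser.lines

for line in outline("<p>a <em>very\n\t stupid</em>   mule</p>"):
    print(line)
```

Each level of indentation in the output corresponds to one step down the document tree, so every element appears directly under its unique parent.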
Kinds of HTML Elements
Block elements
Containing other block elements:
<html>, <head>, <body>, <div>, <blockquote>
Containing only inline elements:
Titles (in <head> element): <title> (plain text only, no nested elements)
Headings: <h1>, <h2>, <h3>, <h4>, <h5>, <h6>
Paragraphs: <p>
Inline elements:
<q> quote
<em> emphasis (often italic)
<strong> strong emphasis (often bold)
Some HTML Elements
<html> contains <head> and <body> elements
<head> contains information about the page
<body> contains the page content
<h1> contains inline elements that make up a heading
<h2> a slightly less dramatic heading
<p> contains inline elements that make up a paragraph
<q> an inline quotation
<em> inline element indicating emphasis
<strong> inline element also indicating emphasis
<img> an empty element indicating a picture
<br> an empty element indicating a line break
Style Element
<style> element helps determine page presentation
Parent is the <head> element
Attribute type="text/css"
Content consists of CSS rules
<style type="text/css">
body {
  background-color: #db8;
  margin: 10%;
  font-family: sans-serif;
}
</style>
Style information will be stored in separate files
Web Vocabulary
Web page – the unit of hypertext content stored on a
server and displayed by a browser
Server – a repository for web pages, which are
delivered to browsers upon request
Browser – obtains web pages specified (explicitly or
implicitly via a hyperlink) by a user and displays
their contents to the user
Hyperlink – clickable HTML element that indicates a
transition to a web page specified by an attribute
in the opening tag of the element
Hypertext – text containing one or more hyperlinks
Hypertext
<a> element specifies a hyperlink
Content (the link label) is clickable
CSS specifies how this content is displayed
Usually underlined and in a distinctive color
href attribute specifies a new web page
As a path to a file on the same computer, or
As URL (Uniform Resource Locator)
title attribute is a textual description of the page
Suggestion: title attribute should match page's title element
id attribute provides a destination for hyperlinks
target attribute specifies a different window (or tab)
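As a rough illustration of how the parts of an <a> element fit together, the sketch below pulls the href attribute, the title attribute, and the clickable label out of some markup. It uses Python's standard html.parser; the names LinkCollector and find_links are invented for this example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Records the href, title, and label of every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the link being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            self._current = {
                "href": attributes.get("href"),
                "title": attributes.get("title"),
                "label": "",
            }

    def handle_data(self, data):
        # Text between <a> and </a> is the clickable link label.
        if self._current is not None:
            self._current["label"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse whitespace the same way a browser would.
            self._current["label"] = " ".join(self._current["label"].split())
            self.links.append(self._current)
            self._current = None

def find_links(markup):
    """Return a list of dictionaries, one per hyperlink in the markup."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links

print(find_links('<a href="todo.html" title="todo">todo</a>'))
```

A link with no title attribute comes back with title set to None, which makes it easy to spot links that lack a description.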
URLs
Uniform Resource Locator
protocol://domain:port/path#fragment
Protocol – a scheme for exchanging information
http (hypertext transfer protocol), ftp, etc.
Domain – identifies a server
Port – optional number for the protocol to use
Path – specifies a file on the server
Fragment – specifies a location within the file
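The pieces of a URL can be separated with Python's standard urllib.parse module. In this minimal sketch the function name url_parts and the example.com address are hypothetical; the dictionary keys simply reuse the names from the slide.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Split a URL into protocol, domain, port, path, and fragment."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "port": parts.port,  # None when the URL does not name a port
        "path": parts.path,
        "fragment": parts.fragment,
    }

print(url_parts("http://www.example.com:8080/library/books.html#catch22"))
```

When the port is left out, the protocol's default is used (80 for http), which is why most URLs never show one.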
Domain Names
Top level: com, org, net, edu, mil, …
ICANN decides these
Second level: google.com, cuny.edu, …
You can acquire these
Provided by Domain Name Registrars ($10/year)
Go Daddy, eNom, Tucows, Melbourne IT, Key-Systems
Deeper level: www.abc.com, lehman.cuny.edu
Administered by the second level name owner
Typically, the first name identifies a machine
media.lehman.cuny.edu
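The levels of a domain name can be read off from right to left. The sketch below is a simplification that treats every dot as a level boundary; the function name domain_levels is made up for the example.

```python
def domain_levels(name):
    """List a domain name's levels, starting from the top-level domain."""
    labels = name.split(".")
    # Build "edu", "cuny.edu", "lehman.cuny.edu", ... by adding one label at a time.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]

for level in domain_levels("media.lehman.cuny.edu"):
    print(level)
```

The first entry is the top level, the second is the name acquired from a registrar, and the last is the full name, whose first label typically identifies a machine.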
Paths
Path – sequence of names separated by “/”s
The final name in a path specifies a file
Fragments “#loc” specify locations within a file
Other names specify directories (folders)
To go down, specify the name of the child directory
To go up, use “..”
Examples:
trucks.html
Second Kings/22/20.html
../../../second/cousin/once/removed.html
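A browser resolves a relative path against the address of the page that contains it. The sketch below imitates that with urljoin from Python's standard urllib.parse module; the BASE address and the function name resolve are hypothetical values chosen for the example.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the links.
BASE = "http://www.example.com/vehicles/red/index.html"

def resolve(relative_path, base=BASE):
    """Work out the absolute URL that a relative path points to."""
    return urljoin(base, relative_path)

print(resolve("trucks.html"))        # same directory as the base page
print(resolve("fire/trucks.html"))   # down into a child directory
print(resolve("../blue/cars.html"))  # up one level, then down
```

The final name in the base address (index.html) is a file, so it is dropped before the relative path is applied.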
Hyperlink Examples
<a href="todo.html" title="todo">todo</a>
<a href="http://www.abc.com/directions.html">
Directions
</a>
<a href="../fire/trucks.html" title="trucks">
My firetruck page
</a>
<a href="http://media.lehman.cuny.edu/~bowen/hour.html"
title="office hour">My office hour
</a>
<a href="../library/books#catch22" title="citation">
Catch 22
</a>
The Image Element
An inline element that identifies an image to display
Tag: <img> (an empty element, no closing tag)
src attribute, where to find the image
Relative path to a local file, or
Uniform Resource Locator
alt attribute, textual indication of what the image is
width attribute, provides browser with size info
height attribute, provides browser with size info
Use width (and height) to inform the browser
Not to resize a large image (why?)
Common Image Formats
jpg
Variable, lossy data compression
Good for photos (lots of colors)
gif (deprecated ??)
Good for logos (small number of colors)
Transparency
png
Newer format with transparency (replacing gif ?)
psd
Proprietary, Adobe Photoshop format
“Quirks” Mode
Today, all browsers support standards
Compliant pages are displayed similarly
There are multiple standards
HTML 4, HTML 4.01, XHTML 1.0, XHTML 1.1, …
Browsers need to know which standard a page adheres to
Browsers still need to support old web pages
Each browser does this differently (and slowly)
To avoid “quirks” mode
DOCTYPE announces the standard your page uses
Make sure your page obeys that standard.
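As a rough illustration of how a DOCTYPE announces a standard, the sketch below looks at the first tag of a page and reports which standard it names. It assumes the declaration is the very first thing in the file and compares it exactly, capitalization included. The names KNOWN_DOCTYPES and declared_standard are invented for the example, and the table holds only the three declarations covered in these slides.

```python
# The three declarations covered in these slides, written on one line each.
KNOWN_DOCTYPES = {
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">': "HTML 4.01 (transitional)",
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">': "HTML 4.01 (strict)",
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">': "XHTML 1.0 (strict)",
}

def declared_standard(page):
    """Return the standard named by the page's DOCTYPE, or None if unrecognized."""
    end = page.find(">")
    if end == -1:
        return None
    # A declaration split over two lines is joined with single spaces.
    first_tag = " ".join(page[: end + 1].split())
    return KNOWN_DOCTYPES.get(first_tag)

page = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    '"http://www.w3.org/TR/html4/strict.dtd">\n'
    "<html></html>"
)
print(declared_standard(page))
```

A result of None stands for the situation the slide warns about: no recognizable declaration. Real browsers apply their own, more lenient matching rules, so this is a study aid rather than a model of any browser.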
DOCTYPE
On the top line of your html file
Only a handful to choose from
Spelling (including capitalization) must be identical
HTML 4.01 (transitional)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
HTML 4.01 (strict)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
XHTML 1.0 (strict)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
HTML 4.01 Compliance Issues
Images need an alt attribute
Specify a character encoding
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Don't leave off end (or start) tags
html element required
Containing head and body elements (and nothing else)
title element required in head element
Only block elements nest directly in body or blockquote
Block elements cannot be in p or inline elements
a elements cannot contain other a elements
List elements (ol and ul) only contain list items (li)
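Two of the compliance issues above are easy to check mechanically. The sketch below, built on Python's standard html.parser, reports images without an alt attribute and pages without a title element. It is not a validator: the names ComplianceChecker and check are made up for the example, it does not confirm that the title sits inside the head element, and the remaining rules are left to a real validation service.

```python
from html.parser import HTMLParser

class ComplianceChecker(HTMLParser):
    """Flags missing alt attributes and notes whether a title element exists."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self.has_title = False

    def handle_starttag(self, tag, attrs):
        names = [name for name, _ in attrs]
        if tag == "img" and "alt" not in names:
            src = dict(attrs).get("src", "?")
            self.problems.append("img without alt: " + str(src))
        if tag == "title":
            # Seen anywhere in the page; placement inside <head> is not checked.
            self.has_title = True

def check(markup):
    """Return a list of problem descriptions (empty when none were found)."""
    checker = ComplianceChecker()
    checker.feed(markup)
    checker.close()
    if not checker.has_title:
        checker.problems.append("no title element")
    return checker.problems

print(check('<html><head></head><body><img src="photo.jpg"></body></html>'))
```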
XHTML 1.0 Strict
Well-formed XML (empty elements end “ />”)
XML declaration (optional)
<?xml version="1.0" encoding="UTF-8" ?>
Document Type Declaration
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" >
Root element
<html xmlns="http://www.w3.org/1999/xhtml"
xml:lang="en" lang="en" >
Well-formed XML
Document needs to contain at least 1 element
Unique root element contains whole document
Tags must nest properly
Empty tags end “/>”
Tag names are case sensitive
Attribute values must be quoted
Characters “<”, “>”, “&”, “'”, and “"”
Should not appear literally in content (“<” and “&” never can)
Use character entities (&lt;, &gt;, …) instead
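Several of the well-formedness rules above can be tried out with the XML parser in Python's standard library. In this minimal sketch the function name is_well_formed is invented for the example; it only reports whether the parser accepts the markup, not which rule was broken.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True when the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<p>line one<br />line two</p>"))  # empty tag ends with />
print(is_well_formed("<p>line one<br>line two</p>"))    # <br> is never closed
print(is_well_formed("<p class=address>text</p>"))      # attribute value not quoted
```

Passing this check only means the page is well-formed XML; it does not show that the page obeys the XHTML 1.0 Strict rules.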
Study Suggestions
Read the text (if you haven't already)
Repeat any labs you aren't confident about
Take the practice exam (if you haven't already)
Look over the answer key for the practice exam
Review the “bullet points” at the end of each chapter
Review these slides
Ignore: internet history, FTP, XML, image processing
Look over the “There Are No Dumb Questions”
sections of the text