introduction to the course

LIS650 lecture 0
Introductory lecture
Thomas Krichel
2004-01-23
administrative matters
• Course home page is at
http://wotan.liu.edu/home/krichel/lis650p04s
• First quiz next lecture!
• Deadline to finish web site: one week after the end of
the last lecture.
• You will not be able to change your web site between
the deadline and the time that the grade is issued!
• Subscribe to class mailing list
https://lists.liu.edu/mailman/listinfo/cwp-lis650-krichel
today
• introduction to the course
• talk about you
• the basic ingredients of the web, without html
• introduction to our basic technical set up
• introduction to html
Course history
• Course was first run as an institute 2002-05-13
to 2002-05-17
• Title was “Webmastering I: the static web site”.
• To the curriculum committee, this title did not
sound academic enough.
• Since “Web Site Architecture and Design” is now
the full title, WeSaD (pronounced like “wizard”) is
the official abbreviation.
• Webmastering is still what we want to learn.
teaching WeSaD
• WeSaD combines many aspects:
– Authoring pages
– Work on the organization of data to fit onto pages
– Set display style of different pages
– Organize the contribution of data
– Maintain a technical web installation
• Some of them can be learned in a course, but
others can not.
• Emphasis has to be on learnable elements.
teaching philosophy
• Pointing and clicking in a piece of software is not
enough
• Explain underlying principles
• Promote standards
– HTML 4.01
– CSS level 2.1
• Avoid proprietary software
WeSaD contents
• Deals with the maintenance of a static web site.
Such a web site remains the same whatever the
user does with it.
• Topics include
– html
– css
– site usability and information architecture, as far as
relevant for static web sites
– http, uri, web server
things this course does not do
• Forms: allow you to design forms that users fill
in. But you do not have the programming skills to
do something with the form.
• Any HTML elements that require executable
contents are not covered.
• Frames: allow you to put several documents into
one physical document. Most experts advise
against them.
• We do not cover image maps.
• We don’t do some advanced CSS properties.
Other courses: webmastering II
• Deals with building dynamic web sites.
– Users fill in a form
– Users submit the form
– Web server returns a page that is specific to the
request of the user.
• Teaches a language called PHP, which is widely
used to generate such web sites.
– Introduces you to computer programming
– Trains your analytical thinking.
other courses: webmastering III
• Deals with XML
– XML is a syntax to encode any kind of data.
– XML can be constrained to only allow certain types of
data (XML Schema)
– XML can be transformed to render the data in various
ways (XSLT)
• Achieve a separation of contents and
presentation of a web page.
• advanced course, has both Schema and
Transformation
The world wide web
The World Wide Web (Web) is a network of
information resources. The Web relies on three
mechanisms to make these resources readily
available to the widest possible audience:
– A uniform naming scheme for locating resources on
the Web (e.g., URIs).
– Protocols, for access to named resources over the
Web (e.g., HTTP).
– Hypertext, for easy navigation among resources (e.g.,
HTML).
URI introduction
• Every resource available on the Web -- HTML
document, image, video clip, program, etc. -- has
an address that may be encoded by a
Uniform Resource Identifier, or "URI".
• URIs typically consist of three pieces:
– The naming scheme of the mechanism used to
access the resource.
– The name of the machine hosting the resource.
– The name of the resource itself, given as a path.
example URI
• http://openlib.org/home/krichel
This URI may be read as follows: There is a
document available via the HTTP protocol,
residing on the site openlib.org, accessible via
the path "/home/krichel".
• mailto:krichel@openlib.org
This URI may be read as follows: There is email
user krichel in a domain openlib.org to whom
email may be sent.
client / server protocol
• The web operates mostly on http.
• This is a client-server protocol.
• The client software is run on the local PC that
you are using.
– It is called a web browser or user agent.
• Our server is a piece of hardware called
wotan.liu.edu
– It runs the Debian GNU/Linux operating system on an
Intel architecture.
– It provides http daemon software that serves http
requests. The particular software is called Apache.
communication with the server
• The protocol for communicating with the server
is the secure shell, ssh for short. It is based on
public-key cryptography.
• We use two ssh clients
– For file editing and manipulation, we use putty.
– For file transfer, we use winscp.
– Both are available on the web.
• Telnet and ftp servers are not available on
wotan.liu.edu. Telnet and ftp do not encrypt the
communication stream; therefore they are not
secure.
registration time
• As part of the course, you are being provided
with web space on the server wotan.liu.edu, at
the URL
http://wotan.liu.edu/~username
where username is a user name that you will
choose now.
• It is my intention to maintain this web space for
you into the foreseeable future.
• You should also choose a password, now.
• I will now register you.
login time
• Use putty, port 22 to wotan.liu.edu
• set other attributes of the session as you like,
using the menu on the left, for example
– colors
– font shapes and sizes
– bell
• Save the session as “wotan” (in the first screen)
to save all the customization.
• You do not normally need to log in to the
machine, unless you want to work on it directly.
free software
• I maintain the wotan.liu.edu server, but you can build
your own server if
– you have Internet access
– you have an old PC to spare
• All the server software, as well as putty and
winscp, is free and open-source.
• It is one of my fundamental beliefs that free
information should run on free software.
• The library community can learn a hell of a lot
from the free software community.
• See my talk at http://openlib.org/home/krichel/
presentations/new_york_2003-11-07.ppt
installing software at home
• Go to your favorite search engine to search for
– putty
– winscp
• Download and run windows-style installer software
to install both pieces of software.
• Download and install a recent version of at least
two browsers. I suggest
– Netscape Navigator at
http://channels.netscape.com/ns/browsers/download.jsp
– Opera at http://www.opera.com
putty and winscp
• You can either maintain files on wotan.liu.edu
– by logging into wotan.liu.edu
– using a file editor there, for example nano
– past experience has shown that this is hard for
students with no UNIX experience.
• You can also maintain text files locally
– each time you make a change, you save the file and
upload to wotan.liu.edu using winscp.
– you can use Notepad locally to maintain text files
– I do not recommend using WordPad and Word.
create a web page in MS notepad
• Open Microsoft notepad. Type the text
<!DOCTYPE HTML PUBLIC
"-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head><meta http-equiv="Content-Type"
content="text/html; charset=UTF-8">
<title></title></head><body>
<div></div></body>
</html>
Saving the web page
• save as “empty.html”.
• If you want to open it again in notepad
– open notepad
– select file/open
– list all files
– empty.html
• Don't click on the file.
• Don't choose edit in the context menu.
upload and view file
• Once you have your file “empty.html”, use the
menus of winscp to upload it to the
public_html directory of your home directory on
wotan.liu.edu.
• It has to be in public_html !
• Once it is there, use a web browser to view it at
http://wotan.liu.edu/~user/empty.html, where
user is your user id.
• Then validate it at http://validator.w3.org.
– enter the URL of the page that you want to validate
– hit the validate button
• It has to be in public_html !
public_html
• Is your web directory. It is automagically created
for you when Thomas registers you.
• The web server will map requests to
http://wotan.liu.edu/~user/file to show the file
/home/user/public_html/file.
• Here user stands for your user id, and file is the
file name.
• If file ends with “.html” or “.htm”, the web browser
will be told that the file is an html file. It will be
rendered accordingly by the browser.
index.html
• The web server on wotan will map requests to
http://wotan.liu.edu/~user to show the file
~user/public_html/index.html
• If this file is not there, the server will prepare an
html document from the list of files that it finds in
the directory and send it to the user agent.
• Once you have a file index.html, the web user
can no longer see the individual files in your
directory.
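As a rough illustration of the last point, a hand-written index.html can link to the files that visitors should still be able to reach. The sketch below assumes that empty.html sits in the same public_html directory; notes.html, the title and the link texts are made up for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<!-- The title text is a placeholder. -->
<title>My web space</title>
</head>
<body>
<!-- A list of links stands in for the directory listing that the
     server no longer generates once index.html exists. -->
<ul>
<li><a href="empty.html">the empty page</a></li>
<li><a href="notes.html">my notes (hypothetical file)</a></li>
</ul>
</body>
</html>
```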
HTML and XHTML
• HTML is the hypertext markup language
• HTML is a markup language that is widely used
on the World Wide Web (WWW)
• The latest, and probably last, version of HTML is
at http://www.w3.org/TR/html4/
• The W3C, the standard making body for the
WWW, has issued XHTML, a replacement of
HTML that is compatible with XML.
• We will ignore XHTML for the rest of the course.
what is markup?
• Everything in a document that is not content.
It can be given in two ways
• 1: Procedural
– Codes identify point size, style, font, etc.
– Usually only understood by defining tool
– Example: Microsoft Word
• 2: Descriptive
– Describes purpose of text within the document
– Chapter head, Paragraph, Section Head, TOC
– Structure and Style are kept separate
– Example: LaTeX, SGML
SGML
• Standard Generalized Markup Language
• Descriptive approach with three separate layers
– structure: types of information in document
– content: the information itself
– style: matches typesetting with structure
• Developed for the publishing industry by a group
around Goldfarb.
• So complicated that no software implements it fully
• Document Type Definition (DTD)
– Defines the structure
Document Type Definition (DTD)
• Describes information the document handles
– e.g. Title, TOC, Chapter, Section
• Relationships between fields
– e.g. A Chapter contains Sections
• Consistency
• Logical structure
• Information defined by tags
HTML
• HyperText Markup Language
• Defines an SGML DTD
– Head, Title, Body, Paragraph, etc.
– Headings, Bold, Italic, etc.
– Table, List, Image, etc.
– Links to other documents
– Forms
• Style applied by Web Browser
– User has some control
HTML history
• HTML was a very bare-bones language when
first invented by Tim Berners-Lee. It did not
describe pages with much of a visual appeal.
• In the 90s, successful browsers invented
“extensions” that aimed to stretch the visual
boundaries of HTML.
• Some of these extensions found their way into the
official HTML spec issued by the W3C.
“my HTML”
• I will teach HTML 4.01. This version has two
different DTDs:
– the loose DTD
– the strict DTD
• I will only do the tags of the strict DTD
• The loose DTD has more tags, but all the
functionality of these tags is best done with style
sheets.
• Thus, the pages created with HTML only will
look rather boring.
• But we do cover style sheets later.
HTML tags
• HTML markup is written as tags. Tags are typically
written as pairs
– begin with <tag> ("tag start")
– end with </tag> ("tag end")
– tag is the tag name
• Can be nested
• Can contain non-markup data
• Tag names are case-insensitive, but it is best to
use the same case, consistently, for human
readability.
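A small sketch of these points, using the <p>, <em> and <strong> tags from the strict DTD. The sentence itself is made up.

```html
<!-- <p> is the outer tag; <em> is nested inside it, and <strong>
     is nested inside <em>. The words are non-markup data. -->
<p>Please read the <em>course <strong>home page</strong></em> today.</p>

<!-- Tag names are case-insensitive, so this means the same,
     but mixing cases is harder to read. -->
<P>Please read the <EM>course <strong>home page</strong></EM> today.</P>
```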
attributes to tags
• <atag attribute_name_one="value_one"
attribute_name_two="value_two">
• Here attribute_name_one and
attribute_name_two are attribute names and
value_one and value_two are attribute values.
• I will say: tag <tag> “requires” attribute
"attribute".
• I will say tag <tag> “takes” attribute "attribute" if
the attribute is optional.
Example
<a href="http://openlib.org/home/krichel"
title="homepage of Thomas Krichel">Thomas
Krichel</a>
– the whole thing is an <a> tag.
(I surround tag names with <>)
– “href” is an attribute name
– “http://openlib.org/home/krichel” is the value of the
"href" attribute
(I surround attribute names with straight quotes)
– “Thomas Krichel” is character data.
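To illustrate the difference between "requires" and "takes", here is a sketch with the <img> tag. In HTML 4.01, <img> requires the "src" and "alt" attributes, while it merely takes "title". The file name photo.jpg and the texts are placeholders, not files that exist on wotan.

```html
<!-- "src" and "alt" are required by <img>; "title" is taken
     but optional. <img> has no closing tag. -->
<p><img src="photo.jpg" alt="Portrait of the site owner"
title="hypothetical example image"></p>
```

Leaving out "alt" would make the validator complain; leaving out "title" would not.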
Characters: concept
• A character set combines two things
– Character repertoire: a set of characters, e.g. "A", "ﺾ",
"‼", "₣"
– Character code positions: defines a number for each
character in the repertoire.
• Character encoding is a way to encode the code
positions in bytes
• To correctly display a document, the user agent
needs to know both!
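A sketch of why both pieces of information matter, using the character "é" as an example. Its code position in Unicode is 233 (hexadecimal E9). Encoded as UTF-8 it is stored as the two bytes C3 A9; encoded as ISO-8859-1 it is the single byte E9. The <meta> line from empty.html is one way to tell the user agent which encoding was used when the file was saved.

```html
<!-- This declaration must match the encoding chosen in the
     editor's "save as" dialog. If the file is saved as UTF-8 but
     declared as ISO-8859-1, the bytes C3 A9 show up as two
     wrong characters instead of one accented e. -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
```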
playing safe with characters
• Only use the characters on the US keyboard,
don't insert symbols.
• Save as ascii or utf-8.
• Never save as "Unicode" within MS Notepad.
• If you encounter a character that is not on your
keyboard, use an SGML entity.
Special Characters
• Inserted as an entity reference
– Format can be &code;
– Ex. &amp; inserts an ampersand
– Codes are often abbreviations of the character names
– Codes can also be numbers, in decimal or hex form
– Ex. &#38; (decimal) or &#x26; (hex) to insert an ampersand
http://www.w3.org/TR/REC-html40/sgml/entities.html
has the list
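A short sketch of entity references inside a paragraph. The sentences are invented; the entity names come from the HTML 4 list at the URL above.

```html
<!-- &amp; is an ampersand, &lt; and &gt; are the angle brackets,
     &eacute; is an e with acute accent. -->
<p>Caf&eacute; &amp; bar: write &lt;p&gt; to start a paragraph.</p>

<!-- The same ampersand as a numeric reference, first in decimal,
     then in hex. -->
<p>Tom &#38; Jerry, Tom &#x26; Jerry</p>
```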
classifying tags
• There is a whole bunch of different tags.
• We can group tags together in different ways.
• In the following, I will explain some of the ways.
– block-level vs text-level tags
– tags that require closing vs those that do not.
block-level vs text-level tags
• Block-level tags contain data that is aligned
vertically by visual user agents.
• Text-level tags are aligned horizontally by visual
user agents.
• There are a number of reasons behind this
distinction
– Block level can contain other block level tags and
text-level tags.
– Text-level tags can not contain block-level tags.
– Visual user agents start a new line at the beginning of
block-level tags.
– Multidirectional text would be impossible without it.
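A sketch of the nesting rules, using <div> and <p> as block-level tags and <em> as a text-level tag. The wording inside the tags is made up.

```html
<!-- Allowed: a block-level tag containing another block-level tag,
     which in turn contains a text-level tag. -->
<div>
<p>This paragraph starts on a new line and contains
<em>emphasized text</em> that stays on the same line.</p>
<p>This second paragraph is placed below the first one.</p>
</div>

<!-- Not allowed, and flagged by the validator: a text-level tag
     wrapped around a block-level tag, as in
     <em><p>some text</p></em> -->
```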
common frame for pages
• We look at empty.html again. Here is the start
again
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
• This is an SGML document type declaration.
• It says which kind of HTML it is.
• Use empty.html as a start to compose all your
pages.
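As a sketch, a first real page built from empty.html only needs a title and some content inside the body. The title and paragraph text are placeholders.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>My first page</title>
</head>
<body>
<div>
<p>Hello, this is my first page.</p>
</div>
</body>
</html>
```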
special topic: images
• The appeal of the web to the masses has a lot to
do with its capability to transport images.
• Image formats are independent of the web, but
there are two classic formats that are widely
supported by user agents.
– GIF
– JPEG
GIF
• stands for graphics interchange format.
• developed by CompuServe.
• unresolved patent issues around its LZW compression
make the format abhorred by the free software community.
• 256 colors maximum
• uses a loss-less compression technique
GIF has three tricks
• interlacing:
– when downloading the file, the browser can show
every fourth row first
– the user gets an idea of the picture before it is sharp
• transparency
– some GIFs are transparent, so you can see them on
top of already existing content
– technically, the GIF has one color as the background
color, and pixels of that color are ignored by the user
agent
• animation
– some GIFs are in fact sequences of GIFs that can be
rendered one after the other.
JPEG
• The Joint Photographic Experts Group is a
standard-making body for images
• JPEG images can support millions of colors.
• The compression is lossy, i.e. the JPEG file will
look like the original image, but not be the same.
• The compression does not work well with
drawings.
• There are no copyright and patent problems with
JPEG
working with wotan
• You can work with wotan directly if you like. Use
putty to connect to wotan.liu.edu, then type
cd public_html
• You can start from empty.html, the file that
validates, and copy it to test.html
cp empty.html test.html
nano test.html
• Then you can change test.html to try out the
tags as I discuss them here.
working on the local machine
• Open empty.html on your web site and save as
test.html
• edit it with notepad to be safe
• open with Internet Explorer to see the rendered
html
• to validate
– you have to upload the file first to your public_html
directory on wotan.liu.edu
– Then use the W3C validator at http://validator.w3.org
literature
• I work from the text of the official standard at
http://www.w3.org/TR/html4/
• To work with it faster, I made a copy at
http://wotan.liu.edu/~krichel/html4/
• You can work from any HTML book.
Homework
• Look at course home page
http://wotan.liu.edu/home/krichel/lis650p04s
• Send [email protected] your secret word for
course result delivery.
• Prepare a summary, one page at most, of the type of
website that you want to build, and bring a printed copy
with you next week.
• Prepare for quiz at the beginning of next lecture.
http://openlib.org/home/krichel
Thank you for your attention!