Chapter 1 Intro and Overview
Functionality of a web server
What does the web server do?
Let a user request a resource
Find the resource
Return something to the user
The resource can be different things, such as
An HTML page
A picture
A PDF document
Data (xml, json, plain/text)
Functionality of a web server
If the requested resource is not there, you will get a
“404 Not Found” error in the browser
In the context of this book, when we say “server”, we
mean either the physical machine (hardware) or the
web server application (software)
Functionality of a web client
When we talk about clients, we usually mean both (or
either) the human user and browser application
The browser is the piece of software that knows how to
render HTML pages
How do the clients and servers talk
to each other?
The clients and servers speak HTTP
HTTP stands for Hypertext Transfer Protocol
HTTP is the protocol clients and servers use on the web
to communicate
The client sends an HTTP request, and the server
answers with an HTTP response.
The browsers must know HTML
HTML stands for HyperText Markup Language
HTML tells the browser how to display the content to
the user
HTML
When you develop a web page, you use HTML to
describe what the page should look like and how it
should behave
The goal of HTML is to take a text document and add
tags that tell the browser how to format the text.
What is HTTP
HTTP runs on top of TCP/IP.
TCP stands for Transmission Control Protocol
It is a connection-oriented, end-to-end reliable protocol
for message transmission on the network
TCP is responsible for making sure that a file sent from
one network node to another ends up as a complete file
at the destination
IP stands for Internet Protocol (protocol used for
communicating data across a packet-switched
internetwork)
HTML can be part of the HTTP
response
An HTTP response can contain HTML
HTTP adds header information to the top of whatever
content is in the response
An HTML browser uses that header info to help
process the HTML page
HTML can be part of the HTTP
response
HTTP Header info
<html>
<head>
…
</head>
<body>
<img src=…>
</body>
</html>
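The layout above (header info on top, HTML underneath) can be sketched in plain Java. This is a minimal illustration, not how a real web server is implemented: the class name ResponseSketch, the method buildResponse, and the choice of headers are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;

class ResponseSketch {
    // Builds the text of a minimal HTTP response:
    // status line, headers, a blank line, then the HTML body.
    static String buildResponse(String html) {
        int length = html.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 200 OK\r\n"
             + "Content-Type: text/html\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"            // blank line separates header info from content
             + html;
    }

    public static void main(String[] args) {
        System.out.println(buildResponse("<html><body><h1>Hello</h1></body></html>"));
    }
}
```

The browser reads the header lines first and then treats everything after the blank line as the page to render.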
What is in the HTTP request
The first thing you’ll find is an HTTP method name
The HTTP protocol has several methods; the ones you’ll
use most often are
GET
POST
HTTP GET
[Figure: User, Browser, Server]
User clicks a link to a new page
Browser sends out an HTTP GET to the server, asking the
server to get the page
HTTP POST
[Figure: User, Browser, Server]
User types in a form and hits the Submit button
Browser sends out an HTTP POST to the server, giving the
server what the user typed into the form
What is in the HTTP request
The main job of HTTP GET is to ask the server to get a
resource such as an HTML page, a JPEG, a PDF, etc.
The main job of HTTP POST is to request something
and at the same time send form data to the server
What is in the HTTP request
That does not mean HTTP GET cannot be used to
send data
The data you send with HTTP GET is appended to the
URL up in the browser bar
So whatever you send is exposed
You can even use HTTP GET to send form data to the
web server
Doing this causes any form data to be exposed in the
browser bar, which is why people usually do not do this
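Anatomy of an HTTP GET request
[Figure: an HTTP GET request. The request line contains the HTTP method, the path of the resource, and the protocol version; the request headers follow the request line]
Anatomy of an HTTP GET request
Another example with request parameters
[Figure: an HTTP GET request with the HTTP GET request parameters labeled]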
Anatomy of an HTTP GET request
In a GET request
Parameters (if there are any) are appended to the first
part of the request URL, starting with a “?”.
Parameters are separated with an ampersand “&”
E.g., GET /select/selectBeerTaste.jsp?color=dark&taste=malty HTTP/1.1
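To make the pieces of that request line concrete, here is a small Java sketch that splits it into the method, the path, the parameters, and the protocol version. The class name RequestLineSketch is made up for this example, and the parsing is deliberately simplified (for instance, it does not decode percent-encoded characters the way a real server would).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RequestLineSketch {
    final String method;
    final String path;
    final String version;
    final Map<String, String> params = new LinkedHashMap<>();

    RequestLineSketch(String requestLine) {
        // A request line has three space-separated parts: method, target, version.
        String[] parts = requestLine.trim().split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected: METHOD target VERSION");
        }
        method = parts[0];
        version = parts[2];
        String target = parts[1];
        int q = target.indexOf('?');
        if (q < 0) {
            path = target;                       // no parameters
        } else {
            path = target.substring(0, q);       // everything before "?"
            for (String pair : target.substring(q + 1).split("&")) {
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    params.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
    }

    public static void main(String[] args) {
        RequestLineSketch r = new RequestLineSketch(
            "GET /select/selectBeerTaste.jsp?color=dark&taste=malty HTTP/1.1");
        System.out.println(r.method + " " + r.path + " " + r.params + " " + r.version);
    }
}
```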
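Anatomy of an HTTP POST request
[Figure: an HTTP POST request. The request line contains the HTTP method, the path of the resource, and the protocol version; it is followed by the request headers and then the message body]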
Anatomy of an HTTP POST request
HTTP POST requests are designed to be used by the
browser to make complex requests on the server
For example, it can be used to send all of the form data
the user completed to the web server, which can then add the
data to a database
The data sent back to the server is known as the
“message body” or “payload”
The data can be quite large
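The following Java sketch shows where form data ends up in a POST request: in the message body, after the headers, rather than in the URL. It is an illustration only. The class name PostRequestSketch, the helper names, and the choice of headers are assumptions for this example; the path and parameter names are borrowed from the GET example above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class PostRequestSketch {
    // Encodes form fields as name=value pairs joined with "&".
    static String encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // Builds the request text: request line, headers, blank line, message body.
    static String buildPost(String path, String host, Map<String, String> fields) {
        String body = encodeForm(fields);
        return "POST " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Content-Type: application/x-www-form-urlencoded\r\n"
             + "Content-Length: " + body.getBytes(StandardCharsets.UTF_8).length + "\r\n"
             + "\r\n"
             + body;   // the "payload" travels here, not in the URL
    }

    public static void main(String[] args) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("color", "dark");
        form.put("taste", "malty");
        System.out.println(buildPost("/select/selectBeerTaste.jsp", "www.wickedlysmart.com", form));
    }
}
```

Note that the request line contains no “?”: the same two parameters that a GET would expose in the browser bar are carried in the body instead.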
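Anatomy of an HTTP response
[Figure: an HTTP response. The first line contains the protocol version, the HTTP status code, and the text version of the status code; it is followed by the response headers and then the response body]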
Anatomy of an HTTP response
An HTTP response has both a header and a body.
The header info tells the browser about the protocol
being used
Whether the request was successful
What kind of content is included in the body
The body contains the contents (e.g., HTML) for the
browser to display
Anatomy of an HTTP response
The Content-Type response header’s value is known as
a MIME type.
The MIME type tells the browser what kind of data the
browser is about to receive so that the browser will know
how to render it
Notice that the MIME type value relates to the value
listed in the HTTP request’s “Accept” header
MIME stands for Multipurpose Internet Mail Extensions
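As a rough illustration of what a browser has to pull out of a response, the Java sketch below splits raw response text into the status line, the headers (including Content-Type), and the body. The class name ResponseParseSketch and the sample response are invented for this example, and the parsing is much simpler than what a real browser does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ResponseParseSketch {
    final String version;
    final int statusCode;
    final String reason;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    ResponseParseSketch(String raw) {
        // A blank line separates the header section from the body.
        int split = raw.indexOf("\r\n\r\n");
        if (split < 0) {
            throw new IllegalArgumentException("no blank line between headers and body");
        }
        String[] headLines = raw.substring(0, split).split("\r\n");
        body = raw.substring(split + 4);

        // Status line: protocol version, status code, text version of the code.
        String[] status = headLines[0].split(" ", 3);
        version = status[0];
        statusCode = Integer.parseInt(status[1]);
        reason = status.length > 2 ? status[2] : "";

        // Remaining lines are "Name: value" headers.
        for (int i = 1; i < headLines.length; i++) {
            int colon = headLines[i].indexOf(':');
            if (colon > 0) {
                headers.put(headLines[i].substring(0, colon).trim(),
                            headLines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        ResponseParseSketch r = new ResponseParseSketch(
            "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>");
        System.out.println(r.statusCode + " " + r.reason + ", MIME type: "
            + r.headers.get("Content-Type"));
    }
}
```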
URL
URL stands for Uniform Resource Locator
Every resource on the web has its own unique address, in the
URL format
http://www.wickedlysmart.com:80/beeradvice/select/beer1.html
Protocol: http
Server name: www.wickedlysmart.com
Port: 80 (if not specified, then port 80 is the default)
Path: /beeradvice/select/
Resource: beer1.html (if not specified, defaults to index.html)
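The same breakdown can be tried out with the standard java.net.URI class. This is a small sketch: the class name UrlPartsSketch and its two helper methods are made up for the example, and the defaults (port 80, index.html) simply mirror the ones stated on this slide for plain http URLs.

```java
import java.net.URI;

class UrlPartsSketch {
    // URI.getPort() returns -1 when the URL does not name a port;
    // this sketch then falls back to the HTTP default of 80.
    static int effectivePort(URI uri) {
        return uri.getPort() == -1 ? 80 : uri.getPort();
    }

    // The resource is whatever follows the last "/" in the path;
    // if nothing is named, fall back to index.html as the slide describes.
    static String resource(URI uri) {
        String path = uri.getPath();
        String last = path.substring(path.lastIndexOf('/') + 1);
        return last.isEmpty() ? "index.html" : last;
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://www.wickedlysmart.com:80/beeradvice/select/beer1.html");
        System.out.println("Protocol:    " + uri.getScheme());
        System.out.println("Server name: " + uri.getHost());
        System.out.println("Port:        " + effectivePort(uri));
        System.out.println("Full path:   " + uri.getPath());
        System.out.println("Resource:    " + resource(uri));
    }
}
```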
TCP port
A TCP port is just a number
A port represents a logical connection to a particular
piece of software running on the server hardware
A TCP port can be any number from 0 to 65535
A port does not represent a place to plug in some
physical device; it is just a number representing a server
application
The TCP port numbers from 0 to 1023 are reserved for
well-known services
Well-known TCP port numbers
FTP: 21
Telnet: 23
SMTP: 25
HTTPS: 443
POP3: 110
HTTP: 80
Time: 37
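Since a port is just a number, the list above fits naturally into a lookup table. The Java sketch below is only an illustration: the class name WellKnownPorts and its helper methods are invented, and the ranges are the ones stated on the previous slide.

```java
import java.util.Map;

class WellKnownPorts {
    // Service name -> TCP port number, as listed on the slide.
    static final Map<String, Integer> PORTS = Map.of(
        "FTP", 21,
        "Telnet", 23,
        "SMTP", 25,
        "Time", 37,
        "HTTP", 80,
        "POP3", 110,
        "HTTPS", 443);

    // Any TCP port must fall in the range 0 to 65535.
    static boolean isValidPort(int port) {
        return port >= 0 && port <= 65535;
    }

    // Ports 0 to 1023 are reserved for well-known services.
    static boolean isWellKnown(int port) {
        return port >= 0 && port <= 1023;
    }

    public static void main(String[] args) {
        System.out.println("HTTP uses port " + PORTS.get("HTTP"));
        System.out.println("Is 8080 in the well-known range? " + isWellKnown(8080));
    }
}
```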
Directory structure for a simple
Apache web site
Apache is a popular open source web server
Suppose we have a web site www.wickedlysmart.com
running on Apache
It hosts two applications
One giving skiing advice
One giving beer-related advice
What would the directory structure look like for this
web site?
Directory structure for a simple Apache web site
[Figure: directory tree under the Apache home directory, with these labels]
htdocs: the dir that is the root for all of the web applications
A: index.html directly under htdocs, the default page that will be returned to a user who keys in www.wickedlysmart.com
skiingAdvice: the root folder for the skiingAdvice application
B: index.html under skiingAdvice, the default page for the skiingAdvice application
beerAdvice: the root folder for the beerAdvice application
C: index.html under beerAdvice, the default page for the beerAdvice application
D: selectBeer.html, an HTML page that gives the user some advice
The figure also shows subdirectories named select (twice) and checkout inside the application folders
Mapping URLs to content
http://www.wickedlysmart.com will cause the server to return index.html
at location A
Mapping URLs to content
What URL will cause the server to return index.html at location B?
Mapping URLs to content
What URL will cause the server to return index.html at location C?
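The mapping from a URL path to a file under htdocs can be sketched in a few lines of Java. This is a simplified illustration and not Apache's actual algorithm: the class name StaticMappingSketch and the method resolve are made up, and the only rule implemented is "a path that ends in a slash (or is empty) gets index.html".

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class StaticMappingSketch {
    // Maps the path part of a URL onto a file below the document root.
    static Path resolve(Path docRoot, String urlPath) {
        String relative = urlPath.startsWith("/") ? urlPath.substring(1) : urlPath;
        Path target = docRoot.resolve(relative).normalize();
        // Refuse paths such as "/../secret" that would leave the document root.
        if (!target.startsWith(docRoot)) {
            throw new IllegalArgumentException("path escapes the document root");
        }
        // A request for a directory is answered with its default page.
        if (relative.isEmpty() || relative.endsWith("/")) {
            target = target.resolve("index.html");
        }
        return target;
    }

    public static void main(String[] args) {
        Path htdocs = Paths.get("htdocs");
        System.out.println(resolve(htdocs, "/"));               // site default page
        System.out.println(resolve(htdocs, "/skiingAdvice/"));  // an application's default page
    }
}
```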
Web server loves serving static
web pages
The web server sends back the page the client asks for, with added HTTP
header info, without making any change to or doing any computation on the page
If the client wants a dynamic web page, such as one showing the time on the web
server, the web server alone cannot do that
<html>
<body>
The current time is [insertTimeOnServer]
</body>
</html>
The web server cannot
insert the time into the
HTML page directly
Helper application
So a helper application is needed to generate the dynamic content.
The web server sends the request to
the helper application (to generate
dynamic content), then takes the
app’s response and sends it back
to the client.
[Figure: Client, Web server, Another application on server]
In fact, the client never needs to know that someone else did some of
the work
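The division of labor described above can be mimicked with two plain Java methods: one standing in for the helper application, one for the web server that delegates to it. This is a toy model under stated assumptions: the class and method names are invented, and the time is passed in as a string so the example stays deterministic.

```java
import java.time.LocalDateTime;

class HelperAppSketch {
    // The static page with a placeholder, as on the earlier slide.
    static final String TEMPLATE =
        "<html>\n<body>\nThe current time is [insertTimeOnServer]\n</body>\n</html>";

    // Stand-in for the helper application: generates the dynamic content.
    static String helperApp(String template, String timeOnServer) {
        return template.replace("[insertTimeOnServer]", timeOnServer);
    }

    // Stand-in for the web server: hands the work to the helper
    // and passes the helper's result back to the client unchanged.
    static String webServer(String timeOnServer) {
        return helperApp(TEMPLATE, timeOnServer);
    }

    public static void main(String[] args) {
        System.out.println(webServer(LocalDateTime.now().toString()));
    }
}
```

From the caller's point of view there is only webServer; nothing in the returned page reveals that another method produced it.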
Two things the web server alone
won’t do
Dynamic Content
A separate “helper” application that the web server can
communicate with can build non-static, just-in-time pages
Saving data on the server
When the user submits data in a form, you need a helper application
in order to process the form data
The helper application can either save data to a database or use the
data to generate the response page
CGI
The non-Java term for a web server helper application is a
“CGI” program (CGI stands for Common Gateway Interface)
Most CGI programs are written as Perl scripts, but
many other languages can be used including
C, Python, and PHP
Differences between Servlets and
CGI
Servlets have better performance in serving client
requests
Client requests for a Servlet resource are handled as
separate threads of a single running Servlet
With CGI, the server has to launch a heavy-weight
process for each and every request for that resource
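The "separate threads of a single running Servlet" idea can be imitated in plain Java with one handler object shared by a pool of threads. This is a rough analogy, not how any particular servlet container is implemented: the class names, the pool size of 4, and the request paths are all assumptions for this sketch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadPerRequestSketch {
    // Plays the role of the single servlet instance.
    static class Handler {
        private final AtomicInteger served = new AtomicInteger();

        String handle(String path) {
            int n = served.incrementAndGet();   // thread-safe counter
            return "response #" + n + " for " + path;
        }

        int served() {
            return served.get();
        }
    }

    // Sends the given number of requests through one shared Handler
    // and returns how many requests that single instance handled.
    static int serve(int requests) {
        Handler handler = new Handler();                         // created once
        ExecutorService pool = Executors.newFixedThreadPool(4);  // threads, not processes
        for (int i = 0; i < requests; i++) {
            final String path = "/Serv1?request=" + i;
            pool.execute(() -> handler.handle(path));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return handler.served();
    }

    public static void main(String[] args) {
        System.out.println("Requests handled by one instance: " + serve(100));
    }
}
```

With CGI, by contrast, each of those requests would start a new operating-system process, which is the cost the slide refers to.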
Servlet Demystified
Let us use a simple example to show how to write,
deploy, and run a servlet that generates an HTML page
that displays the current date and time of the server
Build this directory tree
Servlet Demystified
Write a servlet named Ch1Servlet.java and put it in the
src directory. Alternatively, you may download the
servlet code from the blackboard (lab assignment
section for this class)
Servlet Demystified
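import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;

public class Ch1Servlet extends HttpServlet {
    // Handles HTTP GET requests by writing a small HTML page
    // that shows the current date and time on the server.
    @Override
    public void doGet(HttpServletRequest request,
                      HttpServletResponse response)
            throws IOException {
        // Tell the browser that the body is HTML (sets the Content-Type header).
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        java.util.Date today = new java.util.Date();
        out.println("<html> " +
                "<body>" +
                "<h1 align=center>HF's Chapter1 Servlet</h1>" +
                "<br>" + today + "</body>" + "</html>");
    }
}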
Create a deployment descriptor
(DD)
<?xml version="1.0" encoding="ISO-8859-1"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                             http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         version="2.5">
    <servlet>
        <servlet-name>Chapter_1_Servlet</servlet-name>
        <servlet-class>Ch1Servlet</servlet-class>
    </servlet>
    <servlet-mapping>
        <servlet-name>Chapter_1_Servlet</servlet-name>
        <url-pattern>/Serv1</url-pattern>
    </servlet-mapping>
</web-app>
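The deployment descriptor ties a URL pattern to a class in two steps: the url-pattern points to a servlet-name, and the servlet-name points to a servlet-class. The Java sketch below models that two-step lookup with two maps. It is only an illustration of the idea: the class DeploymentDescriptorSketch and its methods are invented, it matches patterns exactly, and it ignores the wildcard patterns a real container supports.

```java
import java.util.HashMap;
import java.util.Map;

class DeploymentDescriptorSketch {
    // From <servlet>: servlet-name -> servlet-class
    private final Map<String, String> classByName = new HashMap<>();
    // From <servlet-mapping>: url-pattern -> servlet-name
    private final Map<String, String> nameByPattern = new HashMap<>();

    void addServlet(String servletName, String servletClass) {
        classByName.put(servletName, servletClass);
    }

    void addMapping(String servletName, String urlPattern) {
        nameByPattern.put(urlPattern, servletName);
    }

    // Follows url-pattern -> servlet-name -> servlet-class.
    // Returns null when no mapping exists for the pattern.
    String classFor(String urlPattern) {
        String name = nameByPattern.get(urlPattern);
        return name == null ? null : classByName.get(name);
    }

    public static void main(String[] args) {
        DeploymentDescriptorSketch dd = new DeploymentDescriptorSketch();
        dd.addServlet("Chapter_1_Servlet", "Ch1Servlet");
        dd.addMapping("Chapter_1_Servlet", "/Serv1");
        System.out.println("/Serv1 is served by " + dd.classFor("/Serv1"));
    }
}
```

The servlet-name is purely internal glue: the client only ever sees the URL pattern, and the class name never appears in the URL.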
Servlet Demystified
Build this directory under the existing tomcat
directory…
Servlet Demystified
From the project1 directory, compile the servlet
javac -classpath \tomcat_dir\lib\servlet-api.jar -d classes src\Ch1Servlet.java
Note that the book uses a different version of Tomcat, and therefore
a different classpath:
javac -classpath /your path/tomcat/common/lib/servlet-api.jar -d classes src/Ch1Servlet.java
(do not use this command line)
In our installation of Tomcat version 6.0.18, there is no common
sub-directory under the Tomcat directory. Thus, if you use the command from
the book as is, it will produce a compilation error.
Servlet Demystified
Copy the Ch1Servlet.class file to WEB-INF/classes, and
copy the web.xml file to WEB-INF.
From Tomcat’s bin directory, start Tomcat
C:\apache-tomcat-6.0.18\bin>startup
or # ./startup.sh (for linux)
Servlet Demystified
Launch your browser and type in
http://localhost:8080/ch1/Serv1
Servlet Demystified
From now on, every time you update either a servlet class
or the deployment descriptor, shut down Tomcat and
then restart it.
C:\apache-tomcat-6.0.18\bin>shutdown
or # ./shutdown.sh (for linux)
Disadvantage of using servlet
Question:
Why can’t I just copy a whole page of HTML from my
web page editor, like Microsoft Front Page, Dreamweaver,
and paste it into the println()?
Answer:
You cannot have a carriage return (a real one) inside a
String literal. Simply copying a whole page of HTML into
println() will cause compilation errors.
Quotes in the HTML page can be a problem too
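To see what that means in practice, here is a short Java sketch of how a few lines of HTML have to be rewritten before they can live inside a string: each line becomes its own literal joined with +, line breaks become \n, and every double quote in the markup is escaped as \". The class name HtmlLiteralSketch and the sample markup are made up for this illustration.

```java
class HtmlLiteralSketch {
    // The HTML an editor would produce looks like this:
    //   <html>
    //   <body>
    //   <a href="index.html">Home</a>
    //   </body>
    //   </html>
    // Pasted as-is between two quotes it would not compile,
    // so it has to be split up and escaped by hand.
    static String page() {
        return "<html>\n"
             + "<body>\n"
             + "<a href=\"index.html\">Home</a>\n"   // inner quotes escaped
             + "</body>\n"
             + "</html>";
    }

    public static void main(String[] args) {
        System.out.println(page());
    }
}
```

Doing this by hand for a full page is tedious and error-prone, which is the disadvantage the slide points at.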
Overview of JSP
A JSP page looks like an HTML page, except you can
put Java and Java-related things inside the page
So it is really like inserting a variable into your HTML
Putting Java into HTML is a solution for two issues:
1. Not all HTML page designers know Java
2. It is difficult to format HTML into a string literal in
a servlet
With JSP, Java Developers can do Java, and HTML
developers can do web pages