Transcript CGI

Web Server
• Web servers enable HTTP access to a ‘Web site,’
which is simply a collection of documents and other
information organized into a tree structure, much
like a computer’s file system.
• Dynamic content can come from a variety of sources.
Search engines and databases can be queried to
retrieve and present data that satisfies the selection
criteria specified by a user.
BASIC OPERATION
• Web servers, browsers, and proxies communicate by
exchanging HTTP messages.
• The server receives and interprets HTTP requests,
locates and accesses requested resources, and
generates responses, which it sends back to the
originators of the requests.
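The receive, interpret, locate, and respond cycle described above can be sketched in a few lines of Python. This is only an illustration, not a real server: the names `parse_request`, `build_response`, and `handle`, and the in-memory `DOCUMENTS` table, are assumptions introduced for the example.

```python
# Hypothetical in-memory "document tree"; a real server would read files.
DOCUMENTS = {"/pages/simple-page.html": b"<html><body>Hello</body></html>"}


def parse_request(raw: bytes):
    """Split a raw HTTP request into (method, path, version, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


def build_response(status: str, body: bytes) -> bytes:
    """Construct a response: status line, headers, blank line, body."""
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def handle(raw: bytes) -> bytes:
    """Interpret the request, locate the resource, generate the response."""
    method, path, _version, _headers = parse_request(raw)
    if method != "GET":
        return build_response("501 Not Implemented", b"<p>Not implemented</p>")
    body = DOCUMENTS.get(path)
    if body is None:
        return build_response("404 Not Found", b"<p>Not found</p>")
    return build_response("200 OK", body)
```

Calling `handle` with the bytes of a `GET /pages/simple-page.html HTTP/1.1` request returns a complete `200 OK` response for the stored page.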
HTTP request processing
1. http://mysite.org/.
2. The process begins when the end user tells the browser to
access the page found at the URL
http://mysite.org/pages/simple-page.html.
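As a rough sketch of what the browser does with such a URL, the Python function below turns it into the text of an HTTP request. The function name `request_for` and the choice to send `Connection: close` are assumptions made for the example; real browsers send many more headers.

```python
from urllib.parse import urlsplit


def request_for(url: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    message = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"   # names the site on the target server
        "Connection: close\r\n"
        "\r\n"
    )
    return message.encode("ascii")
```

For the URL above, the request line becomes `GET /pages/simple-page.html HTTP/1.1` and the `Host` header carries `mysite.org`.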
How to get request information in PHP
• $_SERVER['HTTP_HOST'];
• $_SERVER['HTTP_REFERER'];
• $_SERVER['HTTP_USER_AGENT'];
• $_SERVER['SCRIPT_NAME'];
Delivery of static content
• Web servers present both static content and dynamic content.
Static content falls into two categories:
1. Static content pages: static files containing HTML pages,
XML pages, plain text, images, etc., for which HTTP responses
must be constructed (including headers);
2. As-is pages: pages for which complete HTTP responses
(including headers) already exist and can be presented ‘as is’.
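The difference between the two categories can be illustrated with a small Python sketch. The function names `static_response` and `as_is_response` are hypothetical, and the header set is deliberately minimal.

```python
import mimetypes


def static_response(filename: str, content: bytes) -> bytes:
    """Static content page: the server constructs the status line and headers."""
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(content)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + content


def as_is_response(stored: bytes) -> bytes:
    """As-is page: the stored bytes already form a complete HTTP response."""
    return stored
```

In the first case the server derives headers such as `Content-Type` from the file; in the second it sends the stored bytes unchanged.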
Dynamic content
• For dynamic content, the server must take an explicit
programmatic action to generate a response, such as the
execution of an application program, the inclusion of
information from a secondary file, or the interpretation of a
template. This mode of processing includes Common Gateway
Interface (CGI) programs, Server-Side Include (SSI) pages, Java
Server Pages (JSP), Active Server Pages (ASP), and Java
Servlets, among others.
• Web servers use a combination of filename suffixes/extensions and URL
prefixes to determine which processing mechanism should be used to
generate a response. By default, it is assumed that a URL should be
processed as a request for a static content page. However, this is only one
of a number of possibilities.
• A URL path beginning with /servlet/ might indicate that the target is a Java
servlet. A URL path beginning with /cgi-bin/ might indicate that the target
is a CGI script. A URL where the target filename ends in .cgi might indicate
this as well. URLs where the target filename ends in .php or .cfm might
indicate that a template processing mechanism (e.g. PHP or Cold Fusion)
should be invoked.
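The prefix and suffix rules above can be sketched as a simple dispatch function in Python. The function name `choose_mechanism` and the returned labels are assumptions for illustration; real servers take these mappings from their configuration.

```python
def choose_mechanism(path: str) -> str:
    """Pick a processing mechanism from the URL path, defaulting to static."""
    if path.startswith("/servlet/"):
        return "servlet"
    if path.startswith("/cgi-bin/") or path.endswith(".cgi"):
        return "cgi"
    if path.endswith((".php", ".cfm")):
        return "template"
    return "static"  # default: treat as a request for a static content page
```

For example, `choose_mechanism("/cgi-bin/search")` selects CGI processing, while `/pages/simple-page.html` falls through to static delivery.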
CGI
The original mechanisms for serving up dynamic content are CGI
(Common Gateway Interface) and SSI (Server Side Includes).
Today’s Web servers use more sophisticated and more
efficient mechanisms for serving up dynamic content.
CGI was the first consistent server-independent mechanism,
dating back to the very early days of the World Wide Web.
• CGI is the part of the Web server that can communicate with
other programs running on the server. With CGI, the Web
server can call up a program, while passing user-specific data
to the program (such as what host the user is connecting
from, or input the user has supplied using HTML form syntax).
The program then processes that data and the server passes
the program's response back to the Web browser.
• The heart of the CGI specification is the
designation of a fixed set of environment
variables that all CGI applications know about
and have access to. The server is supposed to
populate these variables from request
information, including information other than
HTTP headers.
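A minimal sketch of a CGI program in Python shows how such environment variables are used. `REQUEST_METHOD`, `QUERY_STRING`, `REMOTE_HOST`, and `REMOTE_ADDR` are standard CGI variables; the function name `cgi_response` and the `name` form field are hypothetical.

```python
import os
import sys
from urllib.parse import parse_qs


def cgi_response(environ) -> str:
    """Build the CGI output (headers, blank line, body) from the environment."""
    method = environ.get("REQUEST_METHOD", "GET")
    remote = environ.get("REMOTE_HOST") or environ.get("REMOTE_ADDR", "unknown")
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]   # hypothetical form field
    body = f"Hello, {name}! ({method} request from {remote})\n"
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a Web server as a CGI script, the variables arrive in
    # os.environ and the response is written to standard output.
    sys.stdout.write(cgi_response(os.environ))
```

The server reads this output, completes the HTTP response, and passes it back to the browser.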
SSI mechanism
• The Server Side Includes specification (SSI)
dates back almost as far as the CGI
specification. It provides mechanisms for
including auxiliary files (or the results of the
execution of CGI scripts) into an HTML page.
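The file-inclusion part of SSI can be sketched in Python as a textual substitution. The example assumes the common `<!--#include file="..." -->` directive form; the function name `expand_includes`, the in-memory `files` mapping, and the placeholder error text are hypothetical, and real servers support more directives than this.

```python
import re

# Matches directives of the form <!--#include file="name" -->
INCLUDE = re.compile(r'<!--#include\s+file="([^"]+)"\s*-->')


def expand_includes(page: str, files: dict) -> str:
    """Replace each include directive with the named file's content."""
    def replace(match):
        return files.get(match.group(1), "[include not found]")
    return INCLUDE.sub(replace, page)
```

Here `files` stands in for the file system: a page containing `<!--#include file="footer.html" -->` is sent with the footer text in place of the directive.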
ADVANCED MECHANISMS FOR DYNAMIC CONTENT DELIVERY
• CGI is a simple mechanism for implementing portable server-side
applications. It is employed ubiquitously throughout the Web.
However, there are a number of problems associated with CGI
processing. Its main deficiency is performance. Processing a request
that invokes a CGI script requires the spawning of a child process to
execute that script (plus another process if the script is written in an
interpreted language such as Perl).
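The per-request cost can be seen in a Python sketch that, like a CGI-enabled server, starts a fresh child process for every request. The inline `SCRIPT` and the function name `run_cgi` are assumptions for the example; the child here is itself an interpreter process.

```python
import os
import subprocess
import sys

# A tiny stand-in for a CGI script, run by a new interpreter each time.
SCRIPT = (
    "import os\n"
    "print('Content-Type: text/plain')\n"
    "print()\n"
    "print('query=' + os.environ['QUERY_STRING'])\n"
)


def run_cgi(query_string: str) -> str:
    """Spawn a child process for one request and capture its output."""
    env = dict(os.environ, REQUEST_METHOD="GET", QUERY_STRING=query_string)
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Every call to `run_cgi` pays for process creation and interpreter start-up, which is the overhead the mechanisms below try to avoid.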
• SSI has similar deficiencies when its command processing employs
CGI under the hood. It adds a further performance penalty by
requiring servers to parse SSI pages. Most importantly, SSI may
represent a serious security risk, especially when not configured
carefully by the server administrator.
Native APIs (ISAPI and NSAPI)
• Efficiency concerns may be addressed by using
native server APIs. A native API is simply a
mechanism providing direct ‘hooks’ into the Web
server’s application programming interface. Use
of a native API implies the use of compiled code
that is optimized for use within the context of a
specific Web server environment. NSAPI and
ISAPI are two approaches employed by
Netscape’s Web server software and Microsoft’s
IIS, respectively.
FastCGI
• FastCGI is an attempt to combine the portability of CGI
applications with the efficiency of non-portable applications
based on server APIs. The idea is simple:
instead of requiring the spawning of a new process every time
a CGI script is to be executed, FastCGI allows processes
associated with CGI scripts to ‘stay alive’ after a request has
been satisfied. This means that new processes do not have to
be spawned again and again, since the same process can be
reused by multiple requests. These processes may be
initialized once without endlessly re-executing initialization
code.
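The 'stay alive' idea can be illustrated in Python without the actual FastCGI protocol. The class name `PersistentWorker` and its greeting table are hypothetical; the point is only that initialization runs once while many requests are handled.

```python
class PersistentWorker:
    """A long-lived handler: set up once, then reused for many requests."""

    init_count = 0  # counts how many times initialization code has run

    def __init__(self):
        PersistentWorker.init_count += 1
        # Stand-in for costly setup (loading configuration, opening connections).
        self.greetings = {"en": "Hello", "fr": "Bonjour"}
        self.handled = 0

    def handle(self, lang: str) -> str:
        """Serve one request using the already-initialized state."""
        self.handled += 1
        greeting = self.greetings.get(lang, "Hello")
        return f"{greeting} (request {self.handled})"
```

With plain CGI, the equivalent of `__init__` would run again for every request; here a single worker instance serves request after request.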
Template processing
Another approach used to serve dynamic content involves the
use of template processors.
In this approach, templates are essentially HTML files with
additional ‘tags’ that prescribe methods for inserting
dynamically generated content from external sources. The
template file contains HTML that provides general page layout
parameters, with the additional tags discretely placed so that
content is placed appropriately on the rendered page. Among
the most popular template approaches are PHP (an open
source product), Cold Fusion (from Allaire/Macromedia), and
Active Server Pages or ASP (from Microsoft).
• The additional tags provide functionality found in many programming and
scripting languages, including:
• submitting database queries,
• iterative processing (analogous to repetitive ‘for-each’
looping), and
• conditional processing (analogous to ‘if’
statements).
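The three capabilities (a database query, iteration, and a conditional) can be sketched in Python with the standard `string.Template` class and an SQLite connection. In a real template processor this logic is written as tags inside the template file; here it sits in the `render` function for brevity. The `products` table and all names are assumptions for the example.

```python
import sqlite3
from string import Template

# HTML layout with placeholders where dynamic content is inserted.
PAGE = Template("<html><body><h1>$title</h1>$content</body></html>")
ROW = Template("<li>$name</li>")


def render(conn: sqlite3.Connection) -> str:
    """Fill the page layout with content generated from a database query."""
    rows = conn.execute("SELECT name FROM products ORDER BY name").fetchall()
    if rows:  # conditional processing
        # iterative processing: one list item per row
        items = "".join(ROW.substitute(name=name) for (name,) in rows)
        content = f"<ul>{items}</ul>"
    else:
        content = "<p>No products.</p>"
    return PAGE.substitute(title="Products", content=content)
```

The layout stays in the template, and only the placeholders change from request to request.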
Servlets
A better approach to serving dynamic content is the Servlet
API—Java technology for implementing applications that are
portable not only across different servers but also across
different hardware platforms and operating systems. Like
FastCGI, the servlet API uses server application modules that
remain resident and reusable, rather than requiring the
spawning of a new process for every request. Unlike FastCGI,
the servlet API is portable across servers, operating systems,
and hardware platforms. Servlets execute the same way in
any environment that provides a compliant servlet runner. The
servlet API has generated a very strong following; it is widely used in
a variety of Web server environments.
Java Server Pages
• The Java Server Pages (JSP) mechanism came about as Sun’s
response to Microsoft’s own template processing approach, Active
Server Pages. JSP was originally intended to relieve servlet
programmers from the tedium of having to generate static HTML or
XML markup through Java code. Today’s JSP processors take static
markup pages with embedded JSP instructions and translate them
into servlets, which then get compiled into Java byte code. More
precisely, JSP 1.1-compliant processors generate Java classes, which
extend the HttpJspBase class that implements the Servlet interface.
• What this means is that JSP serves as a pre-processor for servlet
programmers. The resulting classes are compiled modules that
execute faster than a processor that interprets templates at request
time.
ADVANCED FEATURES
• Virtual hosting
• Chunked transfers
• Caching support
• Extensibility
Virtual Hosting
• Virtual hosting is the ability to map multiple
server and domain names to a single IP
address. The lack of support for such a feature
in HTTP/1.0 was a glaring problem for Internet
Service Providers (ISPs). After all, it is needed
when you register a new domain name and
want your ISP to support it.
• To support virtual hosting, an HTTP/1.1 server should:
1. Use information in the required Host header to identify the virtual
host.
2. Generate error responses with the proper 400 Bad Request status
code in the absence of Host.
3. Support absolute URLs in requests, even though there is no
requirement that the server identified in the absolute URL matches
the Host header.
4. Support isolation and independent configuration of document trees
and server side applications between different virtual hosts that are
supported by the same server installation.
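Items 1, 2, and 4 of this list can be sketched in Python as a lookup from the Host header to a per-host document root. The host names, directory paths, and the function name `resolve_root` are hypothetical, and the sketch ignores absolute URLs (item 3) and IPv6 literals.

```python
# Hypothetical mapping: each virtual host gets its own document tree.
VIRTUAL_HOSTS = {
    "mysite.org": "/var/www/mysite",
    "example.org": "/var/www/example",
}
DEFAULT_ROOT = "/var/www/default"


def resolve_root(headers: dict):
    """Return (status, document_root) for an HTTP/1.1 request's headers."""
    host = headers.get("host", "").strip()
    if not host:
        return 400, None                     # Host is required: 400 Bad Request
    name = host.split(":", 1)[0].lower()     # drop an optional port
    return 200, VIRTUAL_HOSTS.get(name, DEFAULT_ROOT)
```

Two requests arriving at the same IP address are thus served from different document trees, depending only on the Host header.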
Chunked transfers
Chances are that you have, on a number of occasions, spent long
minutes sitting in front of your browser waiting for a particularly slow
page. It could be because of a slow connection or because the
server application is slow. Either way you have to wait, even though all you
may need is a quick look at the page before you move on.
The HTTP/1.1 specification introduced the notion of transfer encoding, as well
as the first kind of transfer encoding, chunked, which is designed to
enable processing of partially transmitted messages. According to the
specification, the server is obligated to decode HTTP requests containing
the Transfer-Encoding: chunked header prior to passing them to
server applications.
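The chunked format itself is simple: each chunk is sent as its size in hexadecimal, a line break, the chunk data, and another line break, and a zero-size chunk ends the body. A Python sketch of both directions follows; the function names are assumptions, and optional trailers after the last chunk are ignored.

```python
def encode_chunked(chunks) -> bytes:
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would terminate the body early
            out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    return out + b"0\r\n\r\n"  # last chunk (size 0) and final blank line


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the original body from a chunked message body."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line may carry extensions after ';', which are skipped.
        size_field = data[pos:line_end].split(b";", 1)[0]
        size = int(size_field.decode("ascii").strip(), 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF
```

Because every chunk announces its own size, a receiver can start processing the first chunks while later ones are still being generated.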
Caching support
• Caching is one of the most important
mechanisms in building scalable applications.
Server applications may cache intermediate
results to increase efficiency when serving
dynamic content, but such functionality is
beyond the responsibility of HTTP servers.
Extensibility
• Real HTTP servers vary in the availability of
optional built-in components that support the
execution of server-side applications. They
also differ in the implementation of optional
HTTP methods, which are all methods except
GET and HEAD. Fortunately, they provide
server administrators with ways to extend the
default functionality.