Transcript Slide 1
Announcements
Reading for next week: 2 papers available on
Blackboard
Background Reading: 1.11, 12.1-12.8 in text
About homework assignment 1 ...
Plan for Today
Review of Database Design, Functional
Dependency, and Normal Forms
Choices for Application Design
XML (briefly)
Database Design
What’s the problem here?
Entity-Relationship Model (and Diagrams)
Functional Dependencies
Legal relations
Decompositions
Closures and Canonical Covers
Dependency Preservation
Normal Forms
1st Normal Form
Boyce-Codd Normal Form (BCNF)
3rd Normal Form
Application Design
What’s the Big Question/Problem in
Application Design?
In other words, why are we studying this?
Client Side Scripting and Applets
Browsers can fetch certain scripts (client-side scripts) or
programs along with documents, and execute them in
“safe mode” at the client site
JavaScript
Macromedia Flash and Shockwave for animation/games
VRML
Applets
Client-side scripts/programs allow documents to be
active
E.g., animation by executing programs at the local site
E.g. ensure that values entered by users satisfy some
correctness checks
Permit flexible interaction with the user.
Executing programs at the client site speeds up interaction by
avoiding many round trips to server
Client Side Scripting and Security
Security mechanisms needed to ensure that
malicious scripts do not cause damage to the
client machine
Easy for limited capability scripting languages, harder
for general purpose programming languages like Java
E.g. Java’s security system ensures that the
Java applet code does not make any system
calls directly
Disallows dangerous actions such as file writes
Notifies the user about potentially dangerous actions,
and allows the option to abort the program or to
continue execution.
Web Servers
A Web server can easily serve as a front end to a variety
of information services.
The document name in a URL may identify an executable
program that, when run, generates an HTML document.
When an HTTP server receives a request for such a document, it
executes the program, and sends back the HTML document that
is generated.
The Web client can pass extra arguments with the name of the
document.
To install a new service on the Web, one simply needs to
create and install an executable that provides that
service.
The Web browser provides a graphical user interface to the
information service.
Common Gateway Interface (CGI): a standard interface
between the Web server and the application server
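As a rough illustration of a program that receives extra arguments with the document name and generates an HTML document, here is a minimal Java sketch. The class name QueryPage, its methods, and the sample query string are hypothetical, and the code does not use the CGI interface itself; it only shows the argument-to-HTML step. It assumes Java 10 or later for URLDecoder.decode with a Charset.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns the arguments passed with a document name
// (e.g. "type=account&number=A-101") into a generated HTML page.
class QueryPage {
    // Split the query string into decoded name/value pairs, keeping order.
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    // Escape characters that would otherwise be read as HTML markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Generate the HTML document that the server would send back.
    static String render(Map<String, String> params) {
        StringBuilder html = new StringBuilder("<HTML><BODY><UL>");
        for (Map.Entry<String, String> e : params.entrySet()) {
            html.append("<LI>").append(escape(e.getKey()))
                .append(" = ").append(escape(e.getValue())).append("</LI>");
        }
        return html.append("</UL></BODY></HTML>").toString();
    }

    public static void main(String[] args) {
        System.out.println(render(parseQuery("type=account&number=A-101")));
    }
}
```

Escaping the values before writing them into the page is a design choice of this sketch, not something the slides prescribe.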
HTTP and Sessions
The HTTP protocol is connectionless
That is, once the server replies to a request, the
server closes the connection with the client, and
forgets all about the request
In contrast, Unix logins, and JDBC/ODBC connections
stay connected until the client disconnects
Motivation: reduces load on server
Operating systems have tight limits on the number of open
connections on a machine
Information services need session information
retaining user authentication and other information
E.g. user authentication should be done only once per
session
Solution: use a cookie
Sessions and Cookies
A cookie is a small piece of text containing
identifying information
Sent by server to browser on first interaction
Sent by browser to the server that created the
cookie on further interactions
part of the HTTP protocol
Server saves information about cookies it issued,
and can use it when serving a request
E.g., authentication information, and user preferences
Cookies can be stored permanently or for a
limited time
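The cookie exchange described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class SessionTable, the cookie name sessionid, and the one-hour lifetime are hypothetical, and a real server would normally rely on its framework's session support instead.

```java
import java.net.HttpCookie;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical server-side table of the cookies the server has issued.
class SessionTable {
    private final Map<String, Map<String, String>> sessions = new HashMap<>();

    // First interaction: remember the user and return a Set-Cookie header value.
    String startSession(String userName) {
        String id = UUID.randomUUID().toString();
        Map<String, String> data = new HashMap<>();
        data.put("user", userName);
        sessions.put(id, data);
        HttpCookie cookie = new HttpCookie("sessionid", id);
        cookie.setMaxAge(3600); // limited lifetime, in seconds (assumed value)
        return cookie.getName() + "=" + cookie.getValue()
                + "; Max-Age=" + cookie.getMaxAge();
    }

    // Further interactions: the browser sends the cookie back, e.g.
    // "sessionid=...", and the server looks up what it saved.
    String userFor(String cookieHeader) {
        for (String part : cookieHeader.split(";")) {
            String[] kv = part.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals("sessionid")) {
                Map<String, String> data = sessions.get(kv[1]);
                return data == null ? null : data.get("user");
            }
        }
        return null; // no session cookie present
    }

    public static void main(String[] args) {
        SessionTable server = new SessionTable();
        String setCookie = server.startSession("Johnson");
        // Browser side: parse the header and echo the cookie on the next request.
        List<HttpCookie> received = HttpCookie.parse("Set-Cookie: " + setCookie);
        HttpCookie c = received.get(0);
        System.out.println(server.userFor(c.getName() + "=" + c.getValue()));
    }
}
```

Only the identifier travels in the cookie here; the authentication information stays in the server's table.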
Three-Tier Web Architecture
Two-Tier Web Architecture
Servlets
Java Servlet specification defines an API for communication
between the Web server and application program
E.g. methods to get parameter values and to send HTML text back to
client
Application program (also called a servlet) is loaded into the Web
server
Two-tier model
Each request spawns a new thread in the Web server
thread is closed once the request is serviced
Servlet API provides a getSession() method
Sets a cookie on first interaction with browser, and uses it to identify
session on further interactions
Provides methods to store and look-up per-session information
E.g. user name, preferences, ..
Example Servlet Code
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BankQueryServlet extends HttpServlet
{
    public void doGet(HttpServletRequest request,
                      HttpServletResponse result)
        throws ServletException, IOException
    {
        // Read the parameters passed with the request
        String type = request.getParameter("type");
        String number = request.getParameter("number");
        // …code to find the loan amount/account balance …
        // …using JDBC to communicate with the database..
        // …we assume the value is stored in the variable balance
        result.setContentType("text/html");
        PrintWriter out = result.getWriter();
        out.println("<HEAD><TITLE>Query Result</TITLE></HEAD>");
        out.println("<BODY>");
        out.println("Balance on " + type + number + " = " + balance);
        out.println("</BODY>");
        out.close();
    }
}
Server-Side Scripting
Server-side scripting simplifies the task of connecting a
database to the Web
Define an HTML document with embedded executable code/SQL
queries.
Input values from HTML forms can be used directly in the
embedded code/SQL queries.
When the document is requested, the Web server executes the
embedded code/SQL queries to generate the actual HTML
document.
Numerous server-side scripting languages
JSP, Server-side JavaScript, ColdFusion Markup Language
(CFML), PHP, JScript
General purpose scripting languages: VBScript, Perl, Python
Comparative Advantages
JDBC and ODBC from Client
Positive:
Negative:
Client-side Scripting and Applets
Positive:
Negative:
Two-Tier Server Architecture
Positive:
Negative:
Three-Tier Server Architecture
Positive:
Negative:
Server-Side Scripting
Positive:
Negative:
eXtensible Markup Language (XML)
XML: Motivation
Data interchange is critical in today’s networked
world
Examples:
Banking: funds transfer
Order processing (especially inter-company orders)
Scientific data
Chemistry: ChemML, …
Genetics: BSML (Bio-Sequence Markup Language), …
Paper flow of information between organizations is
being replaced by electronic flow of information
Each application area has its own set of
standards for representing information
XML has become the basis for all new
generation data interchange formats
XML Motivation (Cont.)
Earlier generation formats were based on plain text with
line headers indicating the meaning of fields
Similar in concept to email headers
Does not allow for nested structures, no standard “type” language
Tied too closely to low level document structure (lines, spaces, etc)
Each XML-based standard defines what are valid
elements, using
XML type specification languages to specify the syntax
DTD (Document Type Definition)
XML Schema
Plus textual descriptions of the semantics
XML allows new tags to be defined as required
However, this may be constrained by DTDs
A wide variety of tools is available for parsing, browsing
and querying XML documents/data
Comparison with Relational Data
Inefficient: tags, which in effect represent
schema information, are repeated
Better than relational tuples as a data-exchange format
Unlike relational tuples, XML data is self-documenting due to presence of tags
Non-rigid format: tags can be added
Allows nested structures
Wide acceptance, not only in database systems,
but also in browsers, tools, and applications
Structure of XML Data
Tag: label for a section of data
Element: section of data beginning with <tagname> and
ending with matching </tagname>
Elements must be properly nested
Proper nesting
<account> … <balance> …. </balance> </account>
Improper nesting
<account> … <balance> …. </account> </balance>
Formally: every start tag must have a unique matching end tag,
that is in the context of the same parent element.
Every document must have a single top-level element
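One way to check these rules in practice is to hand the text to an XML parser, which rejects documents that are not well-formed. The sketch below uses the JDK's built-in DOM parser; the class name NestingCheck and the sample strings are hypothetical. The default parser may also print the error it found to standard error.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: reports whether a string is well-formed XML.
class NestingCheck {
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for improper nesting, a missing end tag,
            // or more than one top-level element.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String proper = "<account><balance>500</balance></account>";
        String improper = "<account><balance>500</account></balance>";
        System.out.println(isWellFormed(proper));
        System.out.println(isWellFormed(improper));
    }
}
```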
XML Example
<bank>
  <account>
    <account_number> A-101 </account_number>
    <branch_name> Downtown </branch_name>
    <balance> 500 </balance>
  </account>
  <depositor>
    <account_number> A-101 </account_number>
    <customer_name> Johnson </customer_name>
  </depositor>
</bank>
XML Document Schema
Database schemas constrain what information can be
stored, and the data types of stored values
XML documents are not required to have an associated
schema
However, schemas are very important for XML data
exchange
Otherwise, a site cannot automatically interpret data received
from another site
Two mechanisms for specifying XML schema
Document Type Definition (DTD)
Widely used
XML Schema
Newer, increasing use
Document Type Definition (DTD)
The type of an XML document can be specified using a DTD
DTD constrains structure of XML data
What elements can occur
What attributes can/must an element have
What subelements can/must occur inside each element, and how many
times.
DTD does not constrain data types
All values represented as strings in XML
DTD syntax
<!ELEMENT element (subelements-specification) >
<!ATTLIST element (attributes) >
Element Specification in DTD
Subelements can be specified as
names of elements, or
#PCDATA (parsed character data), i.e., character strings
EMPTY (no subelements) or ANY (anything can be a subelement)
Example
<!ELEMENT depositor (customer_name, account_number)>
<!ELEMENT customer_name (#PCDATA)>
<!ELEMENT account_number (#PCDATA)>
Subelement specification may have regular expressions
<!ELEMENT bank ( ( account | customer | depositor)+)>
Notation:
“|” - alternatives
“+” - 1 or more occurrences
“*” - 0 or more occurrences
Bank DTD
<!DOCTYPE bank [
<!ELEMENT bank ( ( account | customer | depositor)+)>
<!ELEMENT account (account_number, branch_name, balance)>
<!ELEMENT customer (customer_name, customer_street, customer_city)>
<!ELEMENT depositor (customer_name, account_number)>
<!ELEMENT account_number (#PCDATA)>
<!ELEMENT branch_name (#PCDATA)>
<!ELEMENT balance (#PCDATA)>
<!ELEMENT customer_name (#PCDATA)>
<!ELEMENT customer_street (#PCDATA)>
<!ELEMENT customer_city (#PCDATA)>
]>
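To see a DTD constrain document structure, a validating parser can be run over a document that carries the DTD in its DOCTYPE. The following is a minimal sketch using the JDK's DOM parser with validation switched on. The class name DtdCheck is hypothetical, and the DTD is a trimmed copy of the bank DTD (customer elements left out for brevity) with subelements separated by commas, which is how DTD syntax writes a sequence.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: validates a <bank> document against a trimmed bank DTD.
class DtdCheck {
    static final String DTD =
            "<!DOCTYPE bank [\n"
            + "<!ELEMENT bank ((account | depositor)+)>\n"
            + "<!ELEMENT account (account_number, branch_name, balance)>\n"
            + "<!ELEMENT depositor (customer_name, account_number)>\n"
            + "<!ELEMENT account_number (#PCDATA)>\n"
            + "<!ELEMENT branch_name (#PCDATA)>\n"
            + "<!ELEMENT balance (#PCDATA)>\n"
            + "<!ELEMENT customer_name (#PCDATA)>\n"
            + "]>\n";

    // Returns the validity errors found; an empty list means the document is valid.
    static List<String> validate(String bankXml) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity violation
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not well-formed
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + bankXml)));
        } catch (SAXException e) {
            errors.add(e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String incomplete = "<bank><account><account_number>A-101</account_number>"
                + "<branch_name>Downtown</branch_name></account></bank>";
        System.out.println(validate(incomplete)); // reports the missing balance
    }
}
```

Because a comma-separated content model fixes the order of subelements, a depositor that lists account_number before customer_name is also reported as invalid under this DTD.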
Limitations of DTDs
No typing of text elements and attributes
All values are strings, no integers, reals, etc.
Difficult to specify unordered sets of subelements
Order is usually irrelevant in databases (unlike in the document-layout environment from which XML evolved)
(A | B)* allows specification of an unordered set, but
Cannot ensure that each of A and B occurs only once
IDs and IDREFs are untyped
The owners attribute of an account may contain a reference to
another account, which is meaningless
owners attribute should ideally be constrained to refer to customer
elements
XML Schema
XML Schema is a more sophisticated schema language which
addresses the drawbacks of DTDs. Supports
Typing of values
E.g. integer, string, etc
Also, constraints on min/max values
User-defined, complex types
Many more features, including
uniqueness and foreign key constraints, inheritance
XML Schema is itself specified in XML syntax, unlike DTDs
More-standard representation, but verbose
XML Schema is integrated with namespaces
BUT: XML Schema is significantly more complicated than DTDs.
XML Schema Version of Bank DTD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="bank" type="BankType"/>
  <xs:element name="account">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="account_number" type="xs:string"/>
        <xs:element name="branch_name" type="xs:string"/>
        <xs:element name="balance" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- ….. definitions of customer and depositor …. -->
  <xs:complexType name="BankType">
    <xs:sequence>
      <xs:element ref="account" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="customer" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="depositor" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
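The typing that XML Schema adds over DTDs can be observed with the JDK's javax.xml.validation package. The sketch below is illustrative only: the class name SchemaCheck is hypothetical, and the embedded schema contains just the account element from the schema above, since the customer and depositor definitions are not given.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: validates an <account> element against a small schema.
class SchemaCheck {
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"account\"><xs:complexType><xs:sequence>"
            + "<xs:element name=\"account_number\" type=\"xs:string\"/>"
            + "<xs:element name=\"branch_name\" type=\"xs:string\"/>"
            + "<xs:element name=\"balance\" type=\"xs:decimal\"/>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";

    // Compile the schema text; a failure here means the schema itself is broken.
    static Schema loadSchema() {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the document conforms to the schema, including value types.
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator()
                    .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // structure or type violation
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<account><account_number>A-101</account_number>"
                + "<branch_name>Downtown</branch_name><balance>500</balance></account>";
        String bad = good.replace("500", "five hundred");
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```

A DTD would accept both documents, because it treats balance as plain character data; the xs:decimal type is what rejects the second one.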
Where we are in the course …
Fundamentals of Using a Database
Relational Model
SQL
Database Design
Application Design
Implementing a Database
System Architecture
Storage Structure and Indexing
Query Processing and Optimization
Transactions
Data Mining and Databases
Pattern and Association Mining
Information Retrieval