Database Applications:
Web-Enabled Databases and
Search Engines: Cont.
University of California, Berkeley
School of Information Management and
Systems
SIMS 257: Database Management
Oct. 16, 2001
Database Management -- R. Larson
Today
• Databases for Web Applications –
Continued
Why Use a Database System?
• Database systems have concentrated on providing
solutions for all of these issues for scaling up Web
applications
– Performance
– Scalability
– Maintenance
– Data Integrity
– Transaction support
• While systems differ in their support, most offer
some support for all of these.
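Two of the items above, data integrity and transaction support, can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module, not any of the systems discussed in these slides; the accounts table and the transfer helper are hypothetical.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts atomically (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            # The CHECK constraint rejects a negative balance; the credit
            # above is then undone by the rollback.
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 20)")
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "bob", "alice", 500)  # violates CHECK, rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer leaves both balances untouched, which is the behavior a file-based Web application would have to implement by hand.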
Dynamic Web Applications 2
[Diagram: Clients reach the Web Server over the Internet; the Web Server serves Files or invokes CGI programs, which query a DBMS Server managing several databases.]
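The path on this slide (client, Web server, CGI program, DBMS) can be sketched as a program that queries a database and writes an HTTP response. This is a minimal sketch assuming Python with its built-in sqlite3 module standing in for the DBMS server; the courses table and render_page function are hypothetical.

```python
import html
import sqlite3

def render_page(conn):
    """Return what a CGI program would write to stdout: header, blank line, HTML."""
    rows = conn.execute("SELECT title FROM courses ORDER BY title").fetchall()
    # Escape values so that data from the database cannot inject markup.
    items = "\n".join(f"<LI>{html.escape(title)}</LI>" for (title,) in rows)
    body = f"<HTML><BODY>\n<UL>\n{items}\n</UL>\n</BODY></HTML>"
    return "Content-Type: text/html\r\n\r\n" + body

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (title TEXT)")
conn.executemany("INSERT INTO courses VALUES (?)",
                 [("Database Management",), ("Search <Engines>",)])
page = render_page(conn)
print(page)
```

Under plain CGI the Web server starts a program like this once per request, which is the cost the application servers discussed later try to avoid.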
Server Interfaces
[Diagram: interfaces between the Web server and the database. Labels shown: HTML, DHTML, and JavaScript at the Web server; CGI, Web server APIs, a Web DB App, and a Web application server (ColdFusion, PHP, Perl, Java, ASP) in the middle; native DB interfaces, ODBC, and JDBC carrying SQL to the database.]
Adapted from John P Ashenfelter, Choosing a Database for Your Web Site
What Database systems are
available?
• Choices depend on:
– Size (current and projected) of the application
– Hardware and OS Platforms to be used in the
application
– Features required
• E.g.: SQL? Upgrade path? Full-text indexing? Attribute size
limitations? Locking protocols? Direct Web Server access?
Security?
– Staff support for DBA, etc.
– Programming support (or lack thereof)
– Cost/complexity of administration
– Budget
Desktop Database Systems
System (producer)      Platform       SQL   ODBC   Scaling             Price
Access (Microsoft)     Windows        Yes   Yes    SQL Server          ~$200
FoxPro (Microsoft)     Windows, Mac   Yes   Yes    SQL Server          ~$200
FileMaker (FileMaker)  Windows, Mac   No    No     FileMaker Server    ~$200
Excel (Microsoft)      Windows, Mac   No    Yes    Convert to Access   ~$200
Files (owner)          Windows, Mac   No    No     Import into DB      ?
• Individuals or very small enterprises can create
DBMS-enabled Web applications relatively
inexpensively
• Some systems will require an application server
(such as ColdFusion) to provide the access path
between the Web server and the DBMS
Pros and Cons of Database
Options
• Desktop databases
– usually simple to set up and administer
– inexpensive
– often will not scale to a very large number of
users or very large database size
– May lack locking management appropriate for
multiuser access
– Poor handling for full-text search
– Well supported by application software
(ColdFusion, PHP, etc.)
Enterprise Database Systems
System                              Platform               SQL   ODBC   JDBC   Web?
SQL-Server (Microsoft)              Windows NT/2000        Yes   Yes    ?      Yes (IIS)
Oracle Internet Platform            Unix, Linux, NT        Yes   Yes    Yes    Yes
Informix Internet Foundation.2000   Unix, Linux, NT        Yes   Yes    Yes    Yes
Sybase Adaptive Server              Unix, Linux, NT        Yes   Yes    Yes    Yes
DB2 (IBM)                           IBM, Unix, Linux, NT   Yes   Yes    Yes    Yes?
• Enterprise servers are powerful and
available in many different configurations
• They also tend to be VERY expensive
• Pricing is usually based on the number of users or CPUs
Pros and Cons of Database
Options
• Enterprise databases
– Can be very complex to set up and administer
• Oracle, for example, recommends RAID-1 with a 7x2 disk
configuration as a bare minimum, and more is recommended
– Expensive
– Will scale to a very large number of users
– Will scale to very large databases
– Incorporate good transaction control and lock
management
– Native handling of Text search is poor, but most DBMS
have add-on text search options
– Support for applications software (ColdFusion, PHP,
etc.)
Free Database Servers
System       Platform          SQL   ODBC   JDBC    Web?
mSQL         Unix, Linux       Yes   Yes    No(?)   No?
MySQL        Unix, Linux, NT   Yes   Yes    No(?)   No?
PostgreSQL   Unix, Linux, NT   Yes   Yes    Yes     No?
• The systems are free, but there is also no help line.
• Include many of the features of Enterprise
systems, but tend to be lighter weight
• Versions may vary in support for different
systems
• Open Source -- So programmers can add features
Pros and Cons of Database
Options
• Free databases
– Can be complex to set up and administer
– Inexpensive (FREE!)
– usually will scale to a large number of users
– Incorporate good transaction control and lock
management
– Native handling of Text search is poor
– Support for applications software (ColdFusion, PHP,
etc.)
Embedded Database Servers
System         Platform           SQL   ODBC   JDBC       Web?
Sleepycat DB   Unix, Linux, Win   No    No     Java API   No?
Solid          Unix, Linux, Win   Yes   Yes    Yes        Yes
• May require programming experience to
install
• Tend to be fast and economical in space
requirements
Pros and Cons of Database
Options
• Embedded databases
– Must be embedded in a program
– Can be incorporated in a scripting language
– Inexpensive (for non-commercial application)
– May not scale to a very large number of users (depends
on how it is used)
– Incorporate good transaction control and lock
management
– Text search support is minimal
– May not support SQL
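To illustrate what "embedded in a program" and "may not support SQL" look like in practice, here is a minimal sketch using Python's standard dbm module, a key/value store in the same general family as Sleepycat DB. It is a stand-in, not the Sleepycat API; the file name and keys are hypothetical.

```python
import dbm
import os
import tempfile

# The database is just a file opened by the program itself: no server
# process, no SQL, only keys and values (both stored as bytes).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cart")

    with dbm.open(path, "c") as db:      # "c": create the file if needed
        db[b"ball"] = b"0.25"
        db[b"widget"] = b"2.53"

    with dbm.open(path, "r") as db:      # reopen read-only
        price = db[b"widget"].decode()
        missing = db.get(b"gadget")      # absent key returns None

print(price, missing)
```

Any searching beyond lookup by key has to be written by the programmer, which is the trade-off for the small footprint.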
Web Application Server Software
• ColdFusion
• PHP
• ASP
• All of these are server-side scripting
languages that embed code in HTML pages
ColdFusion
• Developing WWW sites typically involved
a lot of programming to build dynamic sites
– e.g. Pages generated as a result of catalog
searches, etc.
• ColdFusion was designed to permit the
construction of dynamic web sites with only
minor extensions to HTML through a
DBMS interface
ColdFusion
• Started as CGI
– Drawback, as noted above, is that the entire
system is run for each CGI invocation
• Split into cooperating components
– NT service -- runs constantly
– Server modules for 4 main Web Server APIs
(glue that binds web server to ColdFusion
service) {Apache, ISAPI, NSAPI, WSAPI}
– Special CGI scripts for other servers
What ColdFusion is Good for
• Putting up databases onto the Web
• Handling dynamic databases (Frequent
updates, etc)
• Making databases searchable and
updateable by users.
Requirements
• Unix or NT systems
• Install as SuperUser
• Databases must be defined via “data source
names” (DSNs) by an administrator
Requirements and Set Up
• Field (attribute) names should be devoid of spaces.
Use the underscore character, like new_items instead
of “new items.”
• Use key fields. Greatly reduces search time.
• Check permissions on the individual tables in your
database and make sure that they have read-access
for the username your Web server uses to log in.
• If your fields include large blocks of text, you'll
want to include basic HTML coding within the text
itself, including boldface, italics, and paragraph
markers.
Templates
• Have a database named
contents_of_my_shopping_cart.mdb with a single table called contents...
• Create an HTML page (uses extension .cfm),
before <HEAD>...
• <CFQUERY NAME="cart" DATASOURCE="contents_of_my_shopping_cart">
SELECT * FROM contents
</CFQUERY>
Templates cont.
<HTML>
<HEAD>
<TITLE>Contents of My Shopping Cart</TITLE>
</HEAD>
<BODY>
<H1>Contents of My Shopping Cart</H1>
<CFOUTPUT QUERY="cart">
<B>#Item#</B> <BR>
#Date_of_item# <BR>
$#Price# <P>
</CFOUTPUT>
</BODY>
</HTML>
Templates cont.
Contents of My Shopping Cart
Bouncy Ball with Psychedelic Markings
12 December 1998
$0.25
Shiny Blue Widget
14 December 1998
$2.53
Large Orange Widget
14 December 1998
$3.75
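The query-then-template pattern on these slides can be imitated in a few lines to show what the server does with CFQUERY and CFOUTPUT. This is a rough sketch in Python with sqlite3, not how ColdFusion is implemented; the cfoutput function and ROW_TEMPLATE name are hypothetical.

```python
import re
import sqlite3

# Body of the CFOUTPUT block: repeated once for every row of the query.
ROW_TEMPLATE = "<B>#Item#</B> <BR>\n#Date_of_item# <BR>\n$#Price# <P>\n"

def cfoutput(template, rows):
    """Repeat `template` per row, replacing each #Column# with that row's value."""
    def fill(row):
        return re.sub(r"#(\w+)#", lambda m: str(row[m.group(1)]), template)
    return "".join(fill(row) for row in rows)

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us look up columns by name
conn.execute("CREATE TABLE contents (Item TEXT, Date_of_item TEXT, Price TEXT)")
conn.executemany("INSERT INTO contents VALUES (?, ?, ?)", [
    ("Bouncy Ball with Psychedelic Markings", "12 December 1998", "0.25"),
    ("Shiny Blue Widget", "14 December 1998", "2.53"),
])

rows = conn.execute("SELECT * FROM contents").fetchall()
page = cfoutput(ROW_TEMPLATE, rows)
print(page)
```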
CFIF and CFELSE
<CFOUTPUT QUERY="cart">
Item: #Item# <BR>
<CFIF #Picture# EQ "">
<IMG SRC="generic_picture.jpg"> <BR>
<CFELSE>
<IMG SRC="#Picture#"> <BR>
</CFIF>
</CFOUTPUT>
More Templates
<CFQUERY DATASOURCE="AZ2">
INSERT INTO Employees(firstname, lastname,
phoneext) VALUES('#firstname#', '#lastname#',
'#phoneext#') </CFQUERY>
<HTML><HEAD><TITLE>Employee Added</TITLE></HEAD>
<BODY><H1>Employee Added</H1>
<CFOUTPUT>
Employee <B>#firstname# #lastname#</B> added.
</CFOUTPUT></BODY>
</HTML>
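The template above pastes form values directly into the SQL text, so a value containing a quote (O'Brien, say) would break the statement or change its meaning. A common alternative is to pass the values as bound parameters. The sketch below shows the idea in Python with sqlite3; the form dictionary is a hypothetical stand-in for the submitted form fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (firstname TEXT, lastname TEXT, phoneext TEXT)")

# Hypothetical submitted form; note the quote in the last name.
form = {"firstname": "Pat", "lastname": "O'Brien", "phoneext": "4521"}

with conn:  # commit if the insert succeeds, roll back otherwise
    # The ? placeholders keep the values out of the SQL text entirely.
    conn.execute(
        "INSERT INTO Employees (firstname, lastname, phoneext) VALUES (?, ?, ?)",
        (form["firstname"], form["lastname"], form["phoneext"]),
    )

stored = conn.execute("SELECT firstname, lastname, phoneext FROM Employees").fetchall()
print(stored)
```

CFML has a comparable construct in the CFQUERYPARAM tag.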
CFML ColdFusion Markup
Language
• Read data from and update data to databases and
tables
• Create dynamic data-driven pages
• Perform conditional processing
• Populate forms with live data
• Process form submissions
• Generate and retrieve email messages
• Perform HTTP and FTP functions
• Perform credit card verification and authorization
• Read and write client-side cookies
PHP
• PHP is an Open Source Software project with
many programmers working on the code.
– Commonly paired with MySQL, another OSS project
– Free
– Both Windows and Unix support
• It is estimated that more than 250,000 web sites use
PHP as an Apache module.
PHP Syntax
• Similar to ASP
<HTML><BODY>
<?php
$myvar = "Hello World";
echo $myvar ;
?>
</BODY></HTML>
• Includes most programming structures (Loops,
functions, Arrays, etc.)
• Loads HTML form variables so that they are
addressable by name
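The last point, form variables addressable by name, amounts to decoding the submitted request body into name/value pairs. As a language-neutral illustration (in Python rather than PHP), the hypothetical form_variables helper below does this with the standard urllib.parse module.

```python
from urllib.parse import parse_qs

def form_variables(body):
    """Decode a URL-encoded form body into a {name: value} dict (last value wins)."""
    return {name: values[-1] for name, values in parse_qs(body).items()}

# What a browser would send for a three-field form (%27 is an encoded quote).
form = form_variables("firstname=Pat&lastname=O%27Brien&phoneext=4521")
greeting = f"Hello {form['firstname']} {form['lastname']}"
print(greeting)
```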
Combined with MySQL
• DBMS interface appears as a set of
functions:
<HTML><BODY>
<?php
// Connect to the local MySQL server and select the database
$db = mysql_connect("localhost", "root");
mysql_select_db("mydb", $db);
$result = mysql_query("SELECT * FROM employees", $db);
// Print two columns from the first row (row 0) of the result
printf("First Name: %s <br>\n", mysql_result($result, 0, "first"));
printf("Last Name: %s <br>\n", mysql_result($result, 0, "last"));
?></BODY></HTML>
ASP – Active Server Pages
• Another server-side scripting language
• From Microsoft using Visual Basic as the
language model (VBScript), though
JavaScript (actually Microsoft JScript) is also
supported
• Works with Microsoft IIS and gives access
to ODBC databases
ASP Syntax
<%
' Build the query and open the ODBC data source named "employee"
SQL = "SELECT last, first FROM employees ORDER BY last"
Set conn = Server.CreateObject("ADODB.Connection")
conn.Open "employee"
Set people = conn.Execute(SQL)
%>
<% Do While Not people.EOF
   ' Plain assignment: Set is only for object references
   resultline = people(0) & ", " & people(1) & "<BR>"
   Response.Write(resultline)
   people.MoveNext
Loop %>
<% people.Close %>
Text Search
• Native text searching within databases is very
poor.
– Typically involves a full scan of the table to resolve “LIKE”
queries, since a pattern with a leading wildcard cannot use an ordinary index.
– Text fields are limited in size
• For example Oracle VARCHAR has a maximum of 4000 bytes
• LONG (BLOBS, etc) fields support larger data, but are not
indexable and can’t be used in WHERE clauses.
• Some Databases offer Text retrieval add-ons
– Oracle’s interMedia or ConText Text retrieval engines
– Informix Text DataBlade
– IBM DB2 Text Extender
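The full-scan cost of “LIKE” queries can be seen by asking a database for its query plan. The sketch below uses Python's sqlite3 module as a convenient stand-in, so the plan wording is SQLite's and not that of the systems named above; the docs table and plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("CREATE INDEX idx_title ON docs (title)")
conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", [
    ("widgets", "Shiny blue widget for sale"),
    ("balls", "Bouncy ball with psychedelic markings"),
])

def plan(sql, params=()):
    """Return SQLite's query-plan description as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

# Substring match inside a text field: every row must be examined.
like_plan = plan("SELECT id FROM docs WHERE body LIKE ?", ("%widget%",))
# Exact match on an indexed key field: the index is used instead.
key_plan = plan("SELECT id FROM docs WHERE title = ?", ("widgets",))
print(like_plan)
print(key_plan)
```

The first plan should report a SCAN of docs and the second a SEARCH using idx_title; the exact wording varies by SQLite version.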
Text Search Options
Search Engine               Manufacturer              Price          Platform
Altavista Search Intranet   Altavista                 $16,000        Unix, NT
Cheshire II, Cha-Cha        UC Berkeley               Free or ?      Unix
Dig                         Open Source               Free           Unix
Fulcrum Knowledge Net       Fulcrum                   $5,000         Unix, NT
Index Server (MS)           Microsoft                 Free           NT
InfoMagnet                  CompassWare               $5000+100      NT
Netscape Compass            Netscape                  $1,295         Unix, NT
PLWeb Turbo                 Personal Library Softw.   $7-10000       Unix, NT
RetrievalWare               Excalibur                 $12,500        Unix, NT
Verity Information Server   Verity                    $5,000         Unix, NT
Ultraseek server            Infoseek                  $1,000         Solaris
Webinator                   Thunderstone              Free or $700   Unix, NT
WebGlimpse                  Univ. of Arizona (Tucson) Free or $200   Unix, NT
Features to look for
• Ranked and Boolean Search
• Proximity search
• Fielded searching
• Concept expansion
• Spider for Indexing
• Document types available
– HTML, PDF, XML, MS-Office, Multimedia?
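The difference between Boolean and ranked search, the first feature listed, can be shown with a toy inverted index. This is a teaching sketch only: the documents are made up, and the scoring (summed term frequency) is far simpler than what the products above use.

```python
import re
from collections import Counter, defaultdict

DOCS = {
    1: "Shiny blue widget",
    2: "Large orange widget, a widget for every cart",
    3: "Bouncy ball with psychedelic markings",
}

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

# Inverted index: term -> {doc_id: number of occurrences in that document}
index = defaultdict(Counter)
for doc_id, text in DOCS.items():
    for term in tokenize(text):
        index[term][doc_id] += 1

def boolean_and(query):
    """Boolean search: ids of documents containing every query term."""
    postings = [set(index.get(term, {})) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()

def ranked(query):
    """Ranked search: documents with any query term, best score first."""
    scores = Counter()
    for term in tokenize(query):
        scores.update(index.get(term, {}))  # add this term's frequencies
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ordered]

print(boolean_and("blue widget"))
print(ranked("orange widget"))
```

Boolean search returns an unordered set that either matches or does not, while ranked search also returns partial matches, ordered by score.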
Other Options
• Have an external search engine crawl and
present your site.
– Inktomi provides portal sites for customers
– Snap uses Inktomi to do the same sort of thing
Conclusions
• Database technology is a required
component for large-scale dynamic Web
sites, especially E-Commerce sites
• Web databases cover most of the needs of
dynamic sites except for text search
• Many solutions and systems are available
for web-enabled databases and search
engines