Summer Class 3

Download Report

Transcript Summer Class 3

The Edge of IT
ITEC-200 Fall 2006
Topic 3: Database
Professor J. Alberto Espinosa
Transaction
Processing
IT
Infrastructure
Business
Applications
Roadmap
Decision Support
Distributed Collaboration
Enterprise Collaboration
Financial Management
etc.
IT &
Business
Server
Appl
Client
Appl
DB
Database
IT Infrastrucure:
- HW & SW
- Database
- Telecom
Information
Business
Applications
DB
IT &
Business
Agenda
• Introduction to database and
database management systems
(DBMS) concepts
• Learn how to query databases to
extract the information you need
• Learn how to design and
implement databases
Topic 3: Database p.3
Database &
Database Management
System (DBMS) Concepts
Follow up course:
ITEC-470 Databases, Data Mining &
Knowledge Management
Topic 3: Database p.4
Definitions
Database:
An organized collection of “logically related” data
that can be retrieved on demand
Database Management System (DBMS):
Software that manages databases
(i.e., define, create, update, and query databases)
Acts as intermediary between business
applications and physical data files
“Most powerful, scalable, flexible and effective business
applications rely on a well designed database and a
powerful underlying DBMS”
Topic 3: Database p.5
The Old Way:
Data files
Data and program files were separate. You had to write
individual programs to: define the data; upload it; update it;
manipulate it; and or retrieve it
Examples:
Data
Files
Application Programs
Accounting,
Human Resources
API1
System Software
Windows
Unix, Linux
INSTRUCTION SET
HARDWARE
PC,
Mainframe
Topic 3: Database p.6
A Better Way: Using a DBMS
A business application passes high level instructions to the DBMS. The
DBMS has capabilities to do all the necessary data management: data
definition, manipulation, and retrieval. So, the business application does
not have to worry about low level data management functions
Examples:
Database Application
API2
Database
DBMS
Accounting,
Human Resources,
ERP, CRM
Oracle, Access,
MS SQL Server
API1
System Software
Windows
Unix, Linux
INSTRUCTION SET
HARDWARE
PC,
Mainframe
Topic 3: Database p.7
Advantages of Using
Databases & DBMSs
•
•
•
•
•
•
•
•
Programs independent of data structure
Less data redundancy
Better consistency in the data
More flexibility & scalability
Easier to integrate & share data
Easier to develop business applications
Easier to enforce business rules/constraints
Easier access to data by users
(e.g., queries, reports, forms, etc.)
Topic 3: Database p.8
DBMS in a Client/Server Environment
Database Server: runs “back-end” part of the DBMS to process
queries and perform database management tasks
Database Client: runs “front-end” part of the DBMS that provides
the user interface (e.g., data entry, screen displays or
presentation, report formatting, query building tools)
CLIENT
SERVER
Database
Front-End
DBMS
Request
(ex. query)
Response
(ex. query result)
Execute query
Back-End
DBMS
Topic 3: Database p.9
DBMS in a Web Server Environment
Request (ex. get a price quote, place an order)
Response (ex. query results with HTML-formatted
product price or order confirmation notice)
Topic 3: Database p.10
Data Warehouse
“A database that stores and consolidates current and historical data
from various systems (internal and external) with tools for
management reporting and sophisticated analysis—i.e., Datamining”
Topic 3: Database p.11
Most Common Database Models
•
•
•
•
Hierarchical (of historical interest only)
Network (of historical interest only)
Relational
Object Oriented databases (new)
Topic 3: Database p.12
Relational Database
A database with structured data consisting of:
•
•
•
Tables: or “entities”
Every table has a unique name
Ex. Students, Courses
Fields: or “columns”, “attributes”
Every field has a unique name within the table
Ex. Students (StudentID, StudentName, Major, Address)
Ex. Courses (CourseNo, CouseName, CreditPoints, Description)
Records: or “rows”, “tuples”, “instances”
Every record is unique (has a unique field that identifies it)
Ex. {“jdoe”, “John Doe”, “CS”, 5000 Forbes Ave.)
Ex. {“MGMT-352-001”, “MIS”, Fall 2002, “A great course”}
Topic 3: Database p.13
Terminology Equivalence
ERD or
Data Model
OO Database
Relational
Database
Entity
Class
Table
Instances
Objects
Records
Relationship
Relationship
Relationship
Attributes
Properties
Fields
Other
Terms Used
Rows, Tuples
Columns
Topic 3: Database p.14
DBMS Functions and Tools
• Performs 3 main functions:
– Data definition (define, create databases)
– Data manipulation (data entry, updates)
– Data retrieval (extraction, reports, displays)
• Plus additional database tools:
– Data dictionary: data about the database
– Visual tools: report & form design
– Data modeling & database design tools
– Macros and programming languages
– Internet/web features, etc.
• Examples:
– Oracle, DB2, Visual FoxPro, MS Access & MS SQL
Topic 3: Database p.15
Define, Create and
Manipulate Data in
Databases
Topic 3: Database p.16
MS Access (structured data)
Topic 3: Database p.17
MS Access Data Definition
Topic 3: Database p.18
MS Access Data Manipulation
Topic 3: Database p.19
Visual FoxPro Data Manipulation
Topic 3: Database p.20
Lotus Notes
Topic 3: Database p.21
Displaying and Printing Data
From Databases:
Forms and Reports
The idea:
(1) organize and store the data in the most efficient way,
without data redundancy
(2) retrieve the data in the way you want to see it (not
necessarily how it is organized internally) using queries
(3) present the results to users and managers using
forms (for data entry) and report (for printouts)
Topic 3: Database p.22
Forms
Typically used for data entry & screen displays
Topic 3: Database p.23
Reports
Typically used for printouts
Topic 3: Database p.24
Database Queries
Topic 3: Database p.25
Queries are DBMS Commands for:
• Data definition (define, create databases)
• Data manipulation (data entry, updates)
• Data retrieval (extraction, reports, displays)
Queries are often thought of as a method to retrieve data, but
queries can also be used to define and manipulate data
Databases can be queried in many ways:
• Proprietary DBMS commands and languages, or
• Standard query methods/languages (QBE, SQL, etc.)
Topic 3: Database p.26
Standard Query Methods
Query by Example (QBE)
• Called Design View in MS Access
• Visual interface using examples of data requested
• Similar to how you do searches in the library
Structured Query Language (SQL)
• Popular with power users
• Works in most DBMS
• Can embed SQL commands in programs, web
scripts, etc.
• English-like commands (4GL), practical
• Exact, mathematical: relational algebra & matrix math
Topic 3: Database p.27
Query by Example (QBE)
•
•
•
Called Query “Design View” in MS Access
Column labels are the fields we want to retrieve
In table cells we enter “examples” of the info we want
Topic 3: Database p.28
8 SQL Commands Only
• Data Definition:
CREATE TABLE, DROP TABLE
• Data Manipulation:
INSERT, UPDATE, DELETE, UNION, JOIN
• Data Retrieval:
SELECT (need to learn this only)
Topic 3: Database p.29
SQL Commands: Data Definition
Example: Create & Delete Table called “Friends”
One SQL
Command
CREATE TABLE Friends
(FriendID integer,
LastName char(24),
FirstName char(24),
Birthday date,
; = End of
Phone char(10),
Command
Notes memo);
DROP TABLE Friends;
Topic 3: Database p.30
SQL Commands: Data Manipulation
•
INSERT: Add new records
•
UPDATE: Modify existing records
•
DELETE: Delete records
•
UNION: Combine records from two tables
•
JOIN: Combine columns from two tables
Topic 3: Database p.31
SQL Commands: Data Manipulation
Add & Update Records
Insert (add) a complete record (values in all fields):
INSERT INTO Friends
VALUES (“ae”, “Espinosa”, “Alberto”, 12/12/2002,
“885-1958”, “Looks tired, needs a vacation”)
Insert (add) partial record (values in some fields only):
INSERT INTO Friends
(FriendID, LastName, FirstName)
VALUES (“ae”, “Espinosa”, “Alberto”)
Update (modify) record with new values:
UPDATE Friends SET LastName=“Espinosa”
Topic 3: Database p.32
Data Retrieval Queries: The Idea
•
Organize database (design, create):
– In the most efficient & consistent way (internally)
– Not based on how you want the data to look
•
Produce the “virtual” tables as you want them to
look using queries
How we store
the data
How we display
the data
Topic 3: Database p.33
Data Retrieval in SQL
The “SELECT” Command
SELECT <field list & function list> –columns to retrieve
FROM <table list> –tables that contain the data
WHERE condition1 –which records to retrieve
AND condition2 …….
ORDER BY field1, ….. –to sort the query result
SELECT can be followed by:
DISTINCT (eliminates duplicate rows from result)
TOP #
(lists only the top # rows of result)
*
(lists all fields in the table)
Topic 3: Database p.34
Complexity of SELECT Queries
• Simple Queries:
Involve a single table
• Complex (Join) Queries:
Involve more than one table
Topic 3: Database p.35
Simple SELECT Queries
SELECT ProdID, ProdName, Type, Price
FROM Products
WHERE Price>=300;
SELECT Avg(Price) AS AvgPrice
FROM Products
WHERE Price>=120
AND Type=“Percussion”;
SELECT Count(*) as TotOrders
FROM Orders
Click here to download this database [local copy]
Topic 3: Database p.36
Complex SELECT (Join) Queries
Tables: Orders (OrderNo, ClientID, OrderDate, OrderStatus)
LineItems (OrderNo, LineItem, ProdID, Qty)
Table Join (2 ways):
SELECT Orders.OrderNo, ClientID, LineItem, ProdID, Qty
FROM Orders, LineItems
WHERE Orders.OrderNo = LineItems.OrderNo;
Join
Condition
Table Product (Don't do this!! Don’t forget the join condition):
SELECT Orders.OrderNo, ClientID, LineItem, ProdID, Qty
FROM Orders, LineItems;
Database Design
Topic 3: Database p.38
Database Design Issue #1:
Data Integrity in Databases
Two main types of data integrity:
Entity Integrity:
• Ensuring that every record in each table in the database can
be addressed (i.e., found) -- this means that there each record
has to have a unique identifier that is not duplicate or null (i.e.,
not blank)
• Examples: every student has an AU ID; every purchase order
has a unique number; every customer has an ID
Referential Integrity:
• Ensuring that the data that is entered in one table is consistent
with data in other tables
• Examples: purchase orders can only be placed by valid
customers; accounting transactions can only be posted to
valid company accounts
Topic 3: Database p.39
Database Design Issue #2:
Special Fields (Keys)
How do you find the record (info) you need?
• Primary key (PK):
– Field(s) that uniquely identifies a record
(CourseNo, InstructorID)
– Entity integrity = PK is not duplicate & not blank
– PK can be:
– A single field (e.g., UserID), or
– Multi-field—i.e., “composite” (OrderNo, LineItem)
– Dual key = a composite PK with only 2 fields
How do you link related tables in a database?
•
Foreign key (FK):
– A field in a table that is a PK in another table
– That is, a field that “must” exist in another table
– This is how referential integrity is maintained
Topic 3: Database p.40
Illustration: Primary and Foreign Keys
PK
FK
PK
Topic 3: Database p.41
Design Issue #3:
Steps in Database Design
1.
2.
3.
4.
5.
Build a Data Model – or Entity-Relationship Diagram (ERD) – for the
business application: this model describes each data element in the
application and how different data elements relate to each other
Review the data model to ensure entity and referential integrity
Convert (i.e., map) the data model into a relational database design
• Convert the data model into a database relationship diagram
Implement Important Data Rules, there are 3 of them:
• Add update rules: (1) don’t allow or (2) cascade updates
• Add delete rules: (1) don’t allow or (2) cascade updates
• Add business rules: e.g., PayDate >= OrderDate; OrderQty > 100
Normalize the database design as necessary and update the data
model and design if necessary
– i.e., re-organize the tables to avoid recording redundant data
Topic 3: Database p.42
Data Model Example (Entity Relationship Diagram--ERD):
Course Registration System
Course
Instructors
CourseNo
CourseDescription
Many
InstructorID
CreditPoints
PreRequisites
ClassroomNo
1
Includes
InstructorID
Teaches
1
Relationships
Enrollment
Comments
Entities
Students
StudentID
Many
StudentID
CourseNo
LastName
FirstName
Telephone
EMailAddr
Enrolls
Many
1
LastName
FirstName
SSN
Department
College
Major
EMailAddr
Topic 3: Database p.43
Entity-Relationship Diagrams (ERDs)
Conceptual Data Modeling
•
•
•
Data-oriented modeling method that describes the data and
relationships among data entities
Goal: capture meaning of the data
2 main ERD constructs:
– Entities and its attributes
– Relationships between entities
Topic 3: Database p.44
Entity
“An object, person, place, event or thing or which we want to
record data”
• Equivalent to a table in a database
• Examples: instructors, students, classrooms, invoices,
registration, machines, countries, states, etc.
• Entity instance: a single occurrence of an entity
Example: Espinosa, Kogod 39, MGMT-352, 001
• Entities can be identified in a requirements analysis
description by following the use of NOUNS
Topic 3: Database p.45
Entity Representation
Entity Name
Peter Chen’s notation
Topic 3: Database p.46
Relationships
• Relationships describe how two entities relate to each other
• Relationships in a database application can be identified following
the VERBS that describe how entities are associated with one
another
• Examples:
students enroll in courses
countries have cities, etc.
Topic 3: Database p.47
Cardinality
• Is an important database concept that helps understand how two
entities are related
• Cardinality: describes how many instances of one entity can be
associated with another entity
• The cardinality of a relationship between two entities has two
components:
– Maximum Cardinality: is the maximum number of instances that can be
associated with the other entity – usually either 1 or many (the exact
number is rarely used)
– Minimum Cardinality: is the minimum number of instances that can be
associated with the other entity – usually either 0 or 1
– Symbols: 0
1
Many
• Examples:
1 student can only park 1 (or 0) cars – 1 to (0 or) 1
1 client can place (0 or ) many orders – 1 to (0 or) many
1 student can enroll in (at least 1 or) many courses and
a course can have (0 or) many students – (0 or) many to (1 or) many
Topic 3: Database p.48
ERD SYMBOLS (cont’d.)
Student
Has
EmailAccount
1 to 1
Maximum
Cardinality
(OUTER symbol)
Student
Car
Parks
Mandatory
Optional
Minimum
Cardinality
(INNER symbol)
Peter Chen’s notation
using Systems Architect software
Topic 3: Database p.49
ERD SYMBOLS (cont’d.)
→ Advises
← Have
Advisor
Student
1 to Many
Maximum
Cardinality
Faculty
1 to Many (or None)
Teaches
Mandatory
Minimum
Cardinality
Course
Optional
Peter Chen’s (“crow’s feet”) notation
using Systems Architect software
Topic 3: Database p.50
Many to Many Relationships?
(0 or) Many to Many (at least 1)
Orders
Products
Not good for database  Need to convert every
Many-to-Many relationship into 2 One-to-Many relationships
with a third table called “Intersection Table”
Orders
1 to Many (at least 1)
Products
LineItems
1 to Many (or None)
Intersection Table
Topic 3: Database p.51
Adding Attributes to ERD
Instructors
PK
Course
InstructorID
Teaches
LastName
FirstName
Telephone
EMail
PK
FK1
CourseNo
CourseDescription
InstructorID
CreditPoints
PreRequisites
Peter Chen’s notation
& MS Visio software
Topic 3: Database p.52
Steps in ERD Modeling
1.
Identify and diagram all ENTITIES
2.
Add PK attributes – i.e., implement entity integrity
Ensure PK’s are non-null & non-duplicates
3.
Identify and diagram all RELATIONSHIPS
and their respective CARDINALITIES
4.
Add FK attributes – i.e., implement referential integrity
(this is automatic in some tools—MS Access)
5.
Add remaining attributes
Topic 3: Database p.53
ERD Example:
Course Registration System
Courses (CourseNo (PK), CourseDescripition, InstructorID,
CreditPoints, ClassroomNo)
PreRequisites (CourseNo (PK), PreRequisiteNo (PK),
Comments)
Students (StudentID (PK), LastName, FirstName, SSN,
Department, College, Major, EMail)
Enrollment (StudentID (PK), CourseNo (PK), Comments)
Instructors (InstructorID (PK), LastName, FirstName,
Telephone, EMail)
Classrooms (ClassroomNo (PK), ClassroomName, Building,
BuildingRoomNo, Equipment, Capacity)
Note: PK denotes a primary key
Topic 3: Database p.54
Example: Course Registration System
Step 1. Draw Entities
PreRequisites
Course
ClassRooms
Enrollment
Instructors
Students
Topic 3: Database p.55
Example: Course Registration System
Step 2. Add PK’s (undeline/separate with a line)
PreRequisites
Course
Instructors
CourseNo
PreRequisiteNo
CourseNo
InstructorID
Enrollment
ClassRooms
ClassroomNo
StudentID
CourseNo
Students
StudentID
Topic 3: Database p.56
Example: Course Registration System
Step 3. Add Relationships (w/Cardinalities)
PreRequisites
PK,FK1
PK
CourseNo
PreRequisiteNo
has
Course
PK
Instructors
Teaches
CourseNo
PK
InstructorID
Includes
Assigned
Enrollment
ClassRooms
PK
ClassroomNo
PK,FK1
PK,FK2
StudentID
CourseNo
Students
Enrolls
PK
StudentID
Topic 3: Database p.57
Example: Course Registration System
Step 4. Add FK’s
PreRequisites
PK,FK1
PK
CourseNo
PreRequisiteNo
Assigned
Course
has
PK
CourseNo
FK1
FK2
InstructorID
ClassroomNo
Instructors
Teaches
PK
InstructorID
Includes
Enrollment
ClassRooms
PK
ClassroomNo
PK,FK1
PK,FK2
StudentID
CourseNo
Students
Enrolls
PK
StudentID
Topic 3: Database p.58
Example: Course Registration System
Step 5. Add Remaining Attributes
Course
PreRequisites
PK,FK1
PK
CourseNo
PreRequisiteNo
Comments
Assigned
Has
PK
CourseNo
FK1
CourseDescription
InstructorID
CreditPoints
FK2
ClassroomNo
Instructors
Teaches
PK
InstructorID
LastName
FirstName
Telephone
EMail
Students
ClassRooms
PK
Includes
PK
ClassroomNo
ClassroomName
Building
BuildingRoomNo
Equipment
Capacity
Enrollment
PK,FK1
PK,FK2
StudentID
CourseNo
Comments
Enrolls
StudentID
LastName
FirstName
SSN
Department
College
Major
EMail
Topic 3: Database p.59
EXAMPLE:
Package Delivery Tracking System
Clients
PK
ClientID
PK
DeliveryNo
PK
PackageNo
LastName
FirstName
Address
Telephone
FK2,FK3
DriverNo
Date
Status
FK2,FK3
DeliveryNo
Size
Charge
Drivers
Trucks
PK
Packages
Deliveries
TruckNo
PK
DriverNo
Make
Model
Year
FK1
TruckNo
DriverName
LicenseNo
Topic 3: Database p.60
EXAMPLE:
Airline Reservation System
Passengers
PK
TripSegments
Reservations
PassengerID
PK
ReservationNo
PK,FK1
PK
ReservationNo
SegmentNo
LastName
FirstName
Address
Telephone
FK1
PassengerID
OtherResvData
FK2
FlightNo
Aircrafts
Flights
Airports
PK
AirportID
AirportName
City
Country
PK
PK
FK2,FK4
FK2,FK5
FK3
AircraftNo
FlightNo
Origin
Destination
AircraftNo
Make
Model
Engine
Capacity
Topic 3: Database p.61
EXAMPLE:
HW2 Accounting Database
(Data Model diagram using MS Access)
Topic 3: Database p.62
ERD Example:
A Music Collection Database
You have such a great music collection that your friends keep borrowing your music.
Things are getting out of control. You no longer know who has what, so you decide to
create a database to inventory your music items and keep track of who has borrowed
which items. You analyze the requirements and come up with the following entities:
Artists: A list of individual or group artists in your collection
MediaTypes: A list of the different media you possess (CD, VHS, DVD, etc.)
MusicItems: A list of the individual CD's, VHS tapes, etc. you own
Suspects: A list of friends and acquaintances who borrowed music items from you
CheckOut: A log where you keep track of who borrowed what
You then decide to collect these attributes for each entity (PK denotes a primary key):
Artists (ArtistID (PK), ArtistName, Nationality, ArtistRating)
MediaTypes (MediaCode (PK), MediaName, Description)
MusicItems (ItemNo (PK), Title, ArtistID, MediaCode, YearRelease, DatePurchased,
MusicType, ItemRating)
Suspects (SuspectID (PK), SuspectName, Address, Telephone)
CheckOut (ItemNo (PK), SuspectID (PK), DateBorrowed (PK),
ReturnExpectedDate, Comments)
Topic 3: Database p.63
ERD Example (cont’d):
A Music Collection Database
Your requirements for this application also need to comply with the following (entities in
bold italics and relationships underlined, note respective use of nouns and verbs):
A music item (CD, tape, etc.) are recorded by one artist, but an artist may appear in
more than one music item in your collection (e.g., you have several albums from Dave
Mathews). A music item is associated with one media type, but a you may have
several music items in a given media type (e.g., you have several CD’s). Over time,
suspects can check out many items (e.g. Alberto has borrowed several CD’s from
you), but each check out entry is associated with only one suspect. Music items may
also be checked out many times over time (e.g. one of you Dave Mathews albums has
been borrowed several times), but every check out entry is associated with only one
music item.
Please draw an ERD with all entities, relationships, attributes (please underline PK's)
and cardinalities. If you feel the information above is incomplete, please make any
business assumptions you need and clearly state them.
Also, describe in words (no need to diagram) how your design would change if music
items can contain more than one artist.
Topic 3: Database p.64
ERD Illustration: Music Collection
Artists
Key Data
ArtistID [PK1]
Non-Key Data
ArtistName
Nationality
ArtistRating
MediaTypes
Key Data
Associated MediaCode [PK1]
Non-Key Data
MediaName
Description
By
MusicItems
Key Data
ItemNo [PK1]
Non-Key Data
Title
ArtistID
MediaCode
YearRelease
DataPurchased
MusicType
ItemRating
Checked Out
CheckOut
Key Data
ItemNo [PK1]
SuspectID [PK2]
DataBorrowed [PK3]
Non-Key Data
ReturnExpectedDate
Comments
Borrows
Suspects
Key Data
SuspectID [PK1]
Non-Key Data
SuspectName
Address
Telephone
Topic 3: Database p.65
What is a good database design?
1. Sound data model (ERD)
2. Implement “Entity Integrity” and “Referential
Integrity”
3. Implement other important rules
(note: MS Access has features to enforce these rules)
• Update Rules: what can be updated and when
• Delete Rules: what can be deleted and when
• Business Rules: needed to conduct business
4. The database is “Normalized” – no redundancy
Topic 3: Database p.66
Update Rules
What can be updated/modified in the database and when?
1. It is OK to update values in any non-PK fields, provided that
referential integrity and business rules are respected
2. It is OK to update values in the PK in one table if it is not linked to
a FK in another table, provided that entity integrity, referential
integrity and business rules are respected
3. If a PK is linked to a FK in another table, we need to ensure that
referential integrity is maintained. Depending on what makes sense
for business and the data itself, either:
• Disallow updates of values in the PK, or
• Allow updates,
but cascade changes to all related FKs in other tables
Topic 3: Database p.67
Delete Rules
What can be deleted in the database and when?
1. It is OK to delete records in a table [only] if its PK is not
linked to a FK in another table
2. If its PK is linked to a FK in another table, we need to
ensure that referential integrity is maintained.
Depending on what makes sense for business and the
data itself, either:
• Disallow deletions, or
• Allow deletions,
but cascade deletions in all related tables that contain a FK
linked to this table
Topic 3: Database p.68
Business Rules
Most DBMS have features that allow you to impose
constraints in the data to meet rules imposed by a
company when conducting business:
i.e., “business rules” – examples:
• CustomerAge >= 18
• OrderQty >= 100
• ProductPrice <= 1000
• PaymentDate <= PurchaseDate + 90
Topic 3: Database p.69