Transcript (Student#).
An Introduction to Database
and Database Designing
R C Goyal
Principal Scientist
IASRI, New Delhi
Databases Before the Use of
Computers
Data kept in books, ledgers, card files,
folders, and file cabinets
Long response time
Labor-intensive
Often incomplete or inaccurate
Data & Information
Data known facts that can be
recorded and that has implicit
meaning
Information processed data
Meaningless data becomes
information when it is processed
and presented to the decision
maker in a meaningful way
Database
A set of data that is required for a specific
purpose or is fundamental to a system,
project, enterprise, or business. A formally
structured collection of data.
A database may consist of one or more data
banks and be geographically distributed
among several repositories.
In automated information systems, the
database is manipulated using a database
management system.
File Mgmt vs. Database Mgmt
File Management
Each data entity is in a separate file
system for creating, retrieving and
manipulating files.
Database Management
Same file but data elements are integrated
and shared among different files
Program controls the structure of a
database and access to data
Database Management for
Strategic Advantage
We live in the Information Age
Information used to make organizations
more productive and competitive
Databases used to support business
operations
Databases used across a range of
applictions
Personal, department, enterprise, web
Technical Advantages of
Database Management
Reduced data redundancy
Improved data integrity
More program independence
Increased user productivity
Increased security
Disadvantages of DBMS
Cost issues
Implementation & maintenance issues
Security issues
Privacy issues
Features of DBMS
Database Engine
heart of DBMS
stores, retrieves and updates data
enforces business rules
Data dictionary
holds definitions of all of the data tables
Describes the data type, allows DBMS to
keep track of data, helps user find data
they need
Features of DBMS…Cont.
Query processor
Enables users to store and retrieve data
Use English command that use such words
as SELECT, DELETE, MODIFY
Report generator
Formats and prints reports after user uses
query processor
Features of DBMS…Cont.
Forms generator
Application generator
Access security
Setup access privileges to protect data
from unauthorized access and sabotage
System recovery
Database Development Cycle
Database planning
System Definition
Requirements collection and analysis
Database design
DBMS selection
Application design
Prototyping
Implementation
Data Conversion and loading
Testing
Operational Maintenance
Database Life Cycle
Database Planning
Systems Definition
Requirements Collection
and analysis
DBMS
Selection
Database Design
Application
Design
Implementation
Data Conversion and loading
Prototyping
Testing
Evaluation & Maintenance
Source: http://www.cs/ucf.edu/courses/cgs2545/CH02/index.htm
Database Planning
Current systems evaluation
Development of standards
Technological feasibility
Operational feasibility
Economical feasibility
Systems definition
Data dictionary
Metadata
Requirements Collection and Analysis
identifying management information
requirements,
determining information requirements by
functional area,
and establishing hardware and software
requirements
Database Design
Conceptual design
Logical design
Physical design
Database design methodology
A structured approach that uses
procedures, techniques, tools, and
documentation aids to support and
facilitate the process of design.
* Conceptual database design
* Logical database design
* Physical database design
Critical Success Factors in
Database Design
Work interactively with the users as much as
possible.
Follow a structured methodology throughout the
data modelling process.
Incorporate structural and integrity
considerations into the data models.
Combine conceptualisation, normalisation, and
transaction validation techniques into the data
modelling methodology.
Design Considerations - Tips
Planning
Tables
Queries
Forms
Reports
Macros
Modules
General
Tips - Planning
Before you start up the computer to build a
database, PLAN ON PAPER!!!
Gather all of the paper forms you use to collect
your data
Interview the people who will be using the
database, from those who conduct the field work
to those who analyze the data
Think about what you want to get back out of
your database, not just about how to get data in
Steal - see if it’s been done before
Tips - Tables
Normalize, normalize, normalize
Use field properties to help maintain data
integrity
When possible limit the possible entries in a
field to a lookup list (domain)
Choose your key fields carefully
Index fields that you will search often
Many-to-many relationships in Access
Steal - link to existing tables
Normalization
The process of breaking down large
tables into smaller ones by removing all
unnecessary or duplicate fields,
eliminating redundant data, and making
sure that each table represents only
one thing.
Why Normalize?
If you don’t normalize, you may run into
anomalies (unexpected results)
Deletion anomaly - deleting a record
unexpectedly removes a value we wanted to
keep
Insertion anomaly - we can’t add a record
because we don’t know the value of all of the
required fields
Change anomaly - one change must be
applied to many records in a single table
Non-normalized Table
Typical flat-file table
Student# StName
1022 Rajat
4123 Patel
AdvName AdvRoom Class1 Class2 Class3
Goyal
412 101-07 143-01 159-02
Gupta
216 202-01 211-02 214-01
Contains all data in one record.
Contains repeating fields.
Contains information on more than one thing.
First Normal Form
Eliminate repeating fields within tables.
Create a separate table for each set of
related data.
Identify each set of related data with a
primary key.
First Normal Form
Student#
1022
1022
1022
4123
4123
StName
Rajat
Rajat
Rajat
Patel
Patel
AdvName AdvRoom Class#
Goyal
412 101-07
Goyal
412 143-01
Goyal
412 159-02
Gupta
216 210-01
Gupta
216 211-02
No Repeating Fields: Class# is now a single field instead of repeating for each class a
student takes.
Now multiple records for each student because >1 record is needed to accommodate all
classes for each student
Non-normalized Table
Student# StName
1022 Rajat
4123 Patel
AdvName AdvRoom Class1 Class2 Class3
Goyal
412 101-07 143-01 159-02
Gupta
216 202-01 211-02 214-01
Second Normal Form
Create separate tables for values that
apply to multiple records.
Relate these tables with a foreign key.
Second Normal Form
Students:
Student# StName
1022 Rajat
4123 Patel
Student#
Classes:
1022
1022
1022
4123
4123
4123
AdvName AdvRoom
Goyal
412
Gupta
216
Class#
101-07
143-01
159-02
210-01
211-02
214-01
Eliminate Redundant Data: no longer repeating
student name, advisor, and adv-room for each class
Third Normal Form
Eliminate fields that do not depend on
the key.
Student# StName
Students:
1022 Rajat
4123 Patel
AdvName AdvRoom
Goyal
412
Gupta
216
AdvRoom is independent of the key field
(Student#).
It does not depend on the student, it depends on
the advisor.
Third Normal Form
Students:
Student# StName
1022 Rajat
4123 Patel
AdvID
1
2
Advisors:
AdvID AdvName AdvRoom
1 Goyal
412
2 Gupta
216
Eliminate Data Not Dependent on Key: remove
AdvName from the Students table and create a
separate table describing the advisors with its own
primary key (AdvID).
Relate Advisors to Students using AdvID.
Deletion Anomalies
Student# StName
1022 Rajat
1022 Rajat
1022 Rajat
4123 Patel
4123 Patel
AdvName AdvRoom Class#
Goyal
412 101-07
Goyal
412 143-01
Goyal
412 159-02
Gupta
216 210-01
Gupta
216 211-02
If we delete student Rajat, we lose the
only records for advisor Goyal
Insertion Anomalies
Student# StName
1022 Rajat
1022 Rajat
1022 Rajat
4123 Patel
4123 Patel
AdvName AdvRoom Class#
Goyal
412 101-07
Goyal
412 143-01
Goyal
412 159-02
Gupta
216 210-01
Gupta
216 211-02
If we want to add a new student we
have to know what class they are in (it’s
part of the key)
Change Anomalies
Student# StName
1022 Rajat
1022 Rajat
1022 Rajat
4123 Patel
4123 Patel
AdvName AdvRoom Class#
Goyal
412 101-07
Goyal
412 143-01
Goyal
412 159-02
Gupta
216 210-01
Gupta
216 211-02
If student Rajat switches to advisor
Gupta, we have to change three records
instead of just one
Field Properties
Data Type - text, integer, double, boolean, etc.
Field Size - for text fields
Input Mask - a pattern for entering data
Default Values - auto-entered for new records
Validation Rule - limits values entered
Required? - force entry of data
Relationships
One to One
One to Many
Many to One
Many to Many
Many-to-Many Relationship
Exist between two tables when:
for one record in the first table, there
can be many corresponding records in
the second table and…
for one record in the second table,
there can be many corresponding
records in the first table
Many-to-Many Relationship
One student can take many classes, and one class can
be taken by many students.
Students
StudentID
Name
Address
City
State
Zip
WorkPhone
HomePhone
Classes
ClassNumber
StudentID
Subject
InstructorID
Days
Time
Comments
Tips - Queries
Take advantage of action queries to handle
batch record operations
Use queries to present calculated values
rather than storing the calculated values in
your tables
Remember that null never equals another null
Joining two tables on a field that may contain a
null value may not give you the results you expect
Searching for duplicate values will not return two
records that have a null
Realize that you can link a table to itself
DBMS Selection
* Costs
* Features and Tools
* Underlying model
* Portability
* DBMS hardware requirements
* Organisational requirements
Implementation
The physical realisation of the database
and application designs
the detailed model is converted to the
appropriate implementation model, the
data dictionary is built, the database is
populated, application programs are
developed and users are trained
Data Conversion and Loading
& Testing
Transferring any existing or new data
into the new database and converting
any existing applications to run on the
new database
Finding errors
Database Evaluation
Interviewing and polling users to
determine whether any data needs are
unmet.
Operational maintenance
preventive maintenance (backup)
corrective maintenance (recovery)1
adaptive maintenance
regular monitoring & periodical check
up
Recent Developments Affecting
Database Design and Use
Data Mining (On-Line Analytical Processing)
Drill down from summary data to detailed data
Data Warehouses/Data Marts
Integrates many large databases into one repository
Linking Web Site Applications to Organizational
Databases
Users have Web view to organizational database
Improves customer contact and service
Adds security as a concern
Data Warehouses
Data warehouses are used to organize
information for quick and effective
queries
Data Warehouses and
Database
In the data warehouse, data are organized
around major subjects
Data in the warehouse are stored as
summarized rather than detailed raw data
Data in the data warehouse cover a much
longer time frame than in a traditional
transaction-oriented database
Data warehouses are organized for fast
queries
Data warehouses are usually optimized for
answering complex queries, known as OLAP
Data Warehouses and
Database
Data warehouses allow for easy access via
data-mining software called software
Data warehouses include multiple databases
that have been processed so that data are
uniformly defined, containing what is referred
to as “clean” data
Data warehouses usually contain data from
outside sources
Data Mining
Statistical analysis
Decision trees
Neural networks
Fuzzy logic
Clustering
Data Mining Patterns
Data mining patterns that decision
makers try to identify include
Associations, patterns that occur together
Sequences, patterns of actions that take
place over a period of time
Clustering, patterns that develop among
groups of people
Trends, the patterns that are noticed over
a period of time
Web Based Databases and
XML
Web-based databases are used for
sharing data
Extensible markup language (XML) is
used to define data used primarily for
business data exchange over the Web
An XML document contains only data and
the nature of the data
Continue………………………………..