Transcript Document

Databases
Databases
• The contents of databases often form the basis on which
business decisions are made
• A considerable proportion of worldwide computer
resources and programming activity are devoted to
database applications
• Databases are the foundation for many types of
applications: accounting, transaction systems, Management
information systems (MIS), Executive Information
Systems (EIS), data warehouses, document management,
expert systems, etc.
Early data processing
IBM punched card
shuffler
From the 20's through the 50's electromechanical machines
handling punched cards were the mainstay of business data
processing.
Magnetic media
300MB IBM 2302 disk drive
(random-access device)
9-track tape drive
(sequential device)
When high-capacity random-access data storage
devices (such as disk drives) and sorting and
indexing methods became available, searching
for a particular record became much faster.
Whatever the data storage and retrieval mechanism used, the fundamental
element of a database is a TABLE. A table is a database object that
consists of a collection of rows (records) that have an identical collection
of properties. A record has several fields.
Each field is devoted to one one property.
Each record is uniquely identified with a primary key (the first field)
Flat-file databases
• A single table database
• Suitable for contact lists but problematic
when records are expanded with more data,
like customers expanded with sales.
• When information like ‘customer’, ‘orders’,
‘salesperson’, ‘salesperson data’ must be
included, a flat-file database will create
repeating groups and inefficient storage.
Problem with flat-file databases = repeating groups
Table ‘Customers’
Table ‘Orders’
The solution is a split-up into related tables, linked through Customer ID
The relational model
• Developed in 1970 by Dr. E. F. Codd
• Records are linked by relations between
attribute values; a relation consists of a
linkage between records in two tables that
have identical attribute values.
• Revolutionised the database world, and is
the dominant model today.
RDB
Dublicate data among tables, but not within a table.
Data in relational tables is independent of the methods used by the
database management system to manipulate the records
RDB
 Tables in a RDB must follow certain rules, called the
‘normal forms’ or ‘normalisation’.
 Certain ‘data integrity’ rules exist for ensure that the
database remains correct and complete at all times.
 A query is a communication to the database, such as a
request for data, insertion of new data, change of structure,
etc.
 Queries to a DRB are predominantly with Structured
Query Language (SQL). Everyday users will not encounter
SQL when using a database system, queries are normally
pre-defined in the front-end application on the users
desktop.
RDB
• SQL can define or create databases through action
queries, e.g. CREATE TABLE or create new
record entries with INSERT. SQL reserved words
and syntax is defined in an ANSI standard
• Partial syntax of SQL query:
Some classic databases...
• In the 1960’s Committee for Data System Languages
(CODASYL) databases solved the problem with repeating
groups through system of pointers. CODASYL databases
and COBOL remain in use today on some mainframe
systems today.
• dBase, Foxpro and Paradox were popular relational
database programs for the PC in 1980s. dBase
programming and query language was xBase and tables
were stored as individual directories on disk.
RDBMS
• A Relational Database Management System
(RDBMS) is an integrated set of programs used to
define, update and control the database
• The RDBMS is the ‘engine’ handing the data in
the database tables
RDBMS
Main-frame and mid-range
systems often has
integrated RDBMS. Such
native support gives a
stable platform for creating
and running various
business applications
IBM AS/400 series,
sometimes with ‘DB2’
RDBMS preinstalled
Centralized computing
(before the PC)
Client-server
Client-server
In a client-server networking environment a central machine
serves a number of clients (PCs). The shared database and
database engine will be located on the server, while a frontend program is located on each desktop.
Client-server
• The philosophy in the client/server implementation
is, you should offload database processing to the
server. Therefore, the database engine should accept
SQL requests from the client and execute them
totally on the server, returning only the answer set to
the client requestor. Network traffic will therefore be
minimized.
• The RDBMS can handle large amounts of data and
also automatically enforces certain data integrity and
data security policies for multiple users.
MIS
A Management Information System (MIS) generates
management information for monitoring performance,
maintaining coordination, and providing background
information about the organization's operation
MIS as a concept has merged with IS over the last decade, so
that ‘MIS’ now may designate the total Information System of
the organisation.
?
MIS
Data Processing
Information Services
IS
Information Department
MIS
 Databases applications
are used for all manner
of transaction handling,
processing orders,
invoices, parking
tickets, etc.
 Management
Information Systems
(MIS) are supplying
high-level business
information to
management.
Other types of DB-bases IS
• Decision Support System (DSS): Designed to support
decision-making processes involving semi-structured or
unstructured problems. DSS are most effective on tactical
level of organisation (engineers,planners, etc.)
• Executive Information System (EIS). An MIS with DSS
characteristics for executive decision support
• Expert System. Interactive ‘case-based reasoning’ system
that responds to questions, requests clarification and gives
recommendations.
• Data Warehouse
Commercial Enterprise Systems
• ES allow companies to replace their
existing information systems, which are
often incompatible with one another, with a
single, integrated system
• ES are off-the-shelf solutions/products.
• Largest ES software company =
• Wellknown product = SAP/R3
Commercial Enterprise Systems
MANAGERS
REPORTING
APPLICATIONS
FINANCIAL
SALES FORCE AND
CUSTOMER
SERVICE
BACK OFFICE
ADMINISTRATORS
AND WORKERS
CENTRAL
DATABASE
MANUFACTURING
SUPPLIERS
CUSTOMERS
SALES
AND
DELIVERY
SERVICE
APPLICATIONS
HUMAN
RESOURCES
MANAGEMENT
INVENTORY
AND
SUPPLY
EMPLOYEES
A central database draws data from and feeds data into a series
of modules supporting diverse company functions
Data Warehouses
Data in organisations
Data in organisations tend to be spread over several
databases and therefore tend to have the following
characteristics:
•
•
•
•
•
•
Massive volume
Dispersed
Difficult to access
Badly integrated
Complex data structures
Not suitable for high level business queries
What is a Data Warehouse?
• A single, complete and consistent source of data obtained
from a variety of sources and made available to end users
in a way that they can understand and use in a business
context
• DW form a framework in which to perform ‘data mining’
in the organisations information. (looking for ‘gold
nuggets’ of information/trends with strategic importance)
• Separate from the operational systems in the enterprise
and populated by data from these systems. It therefore does
not burden the operational systems, but rather ‘sits on top’
of whatever systems provide data for the warehouse
Data Warehouse
Data Warehouse Mining
• Traditional database queries are typified by
relatively simple questions.
• Data mining, on the other hand, through the use
of specific algorithms or search "engines",
attempts to source out discernable patterns and
trends in the data and infers rules from these
patterns. With these rules or functions, the user is
then able to support, review and examine
decisions in some related business or scientific
area