Database (PowerPoint)

Download Report

Transcript Database (PowerPoint)

Databases - their construction
and use
John Dubery, U3A Intermediate Computing
April 2011
Databases are BIG!
E.g. a new Government database is called
COINs: it's 120Gb
Recently I've read the following about
databases:
 “many of today's databases routinely grow
into the 100-800 gigabyte (GB) range.”
 “databases of 400 GB or more are not
uncommon”
 “Wal-Mart:- in 2004: at more than 500
terabytes, the data warehouse dwarfs other
retail databases.”

in 2007 - Plans for a 4PB Data
Warehouse
1 Terabyte – Tb - is 1000 Gigabytes
1 Petabyte – Pb - is 1000000 Gigabytes

in 2007 - Plans for a 4PB Data Warehouse
So that's 4,000,000 Gigabytes!
Consequently, database software doesn't bring
the data into memory even when it's being
worked on. This is quite different from a
spreadsheet.
But databases are highly structured, so the
links between chunks of data are stored on
disk.
Originally disk addresses were used for the links
Customer1 name
Customer2 name
Customer3 name
Customer4 name
Customer5 name
Disk address 1
Disk address 3
Disk address 1
Disk address 5
Disk address 5
Street address 1
Street address 2
Street address 3
Street address 4
Street address 5
But this is difficult to maintain,
particularly when you need to move
chunks of the data around
So, nowadays Identifiers are created and
copied to form the links
Customer1 name
Customer2 name
Customer3 name
Customer4 name
Customer5 name
ID 1
ID 3
ID 1
ID 5
ID 5
ID 1
ID 2
ID 3
ID 4
ID 5
Street address 1
Street address 2
Street address 3
Street address 4
Street address 5
The lesson is that YOU have to create these
identifiers in your own database.
Databases that do this are called
RELATIONAL
Relational Databases



Invented by Ted Codd at IBM, California in
1970
Data is arranged in a tabular form, similar
to a spreadsheet
Links between “chunks” of data are
provided by copying identifiers
Common Relational Databases

Adabas D, DataEase, FileMaker Pro, IBM
DB2, IBM Lotus Approach, Informix, Ingres,
InterBase, Microsoft Access, Microsoft SQL
Server, Microsoft Visual FoxPro, MySQL,
Netezza, NonStop SQL, Openbase, Oracle,
Sybase SQL Anywhere, Teradata
The “Chunks” of data

Filesystem

Database

File

Table

Record

Row

Field

Column
Layout of data

Each Table is named

Each Column is named


The number of tables and the columns
they contain are fixed when the database
is designed
Each column can contain only one type of
data, decided at design time


e.g. date, money, text
The number of rows in each table is
indefinite
And now
A Demo ...
Extracting (a copy of) the Data –
i.e. Queries
SHOW ME ArtistName, AlbumTitle, Media
FROM THE TABLE Query_All-Music
SUCH THAT ArtistName IS LIKE 'Elvis
Presley'


SHOW ME ArtistName, AlbumTitle, Media
FROM THE TABLES Music-Collection
AND ArtistID, MATCHING ON ArtistID
SUCH THAT ArtistName IS LIKE 'Elvis
Presley'
And now
A Demo ...
How to Use a Database

Maintain the data


Insert new rows, update columns, delete old
rows
Extract “actionable” subsets of the data

Regular queries to list items for action




Find all the data related to one
account/member/...


e.g. list of members with overdue subs
e.g. list of deliveries for today
e.g. list of upcoming bills
e.g. at a call centre
In businesses the queries would be
embedded in programs