Creating and maintaining a Database(powerpoint)
Download
Report
Transcript Creating and maintaining a Database(powerpoint)
Creating and Maintaining a
Database
The DBA’S Job
New York Conference 2005
Tasks
Design
– Logical Design
– Physical Design
– Documentation
Implementation
–
–
–
–
Test
Performance
Security
Concurrent Updates
Maintenance
– Backup
– Recovery
– Data Integrity
New Releases
– SIR
– Application
Designing a
Relational Database
Normalization
– Eliminate redundant data
– Identify data dependencies – keys
1st Normal Form
– One value per column
– Unique primary key
2nd Normal Form
– No subsets of data in multiple rows of a table
3rd Normal Form
– All columns fully dependant on primary key
Example
Order #
Customer #
Address
Part #
Description
Unit Price
Quantity
Total
1234
409
xxxxxx
10
xxxxxx
$10.00
5
50.00
1234
409
xxxxxx
20
xxxxxx
$15.00
3
45.00
Possible Tables
Order - Order #
– Customer #
Order Item - Order # Line #
– Product Code
– Qty
– Unit Price ?
Customer – Customer #
– Address
Product – Product #
– Description
– Unit Price
Keys
Must be unique
Good if real world
– Employee Id/Product Code etc.
May not be the only access required
Should be short
Avoid unformatted alphabetic
If subordinate repeating group,
consider sequence number
Normalized Implementation
Know the rules
– Know the application
Alternatives
– How many repeats of a column/group?
– Dependent data volatility/convenience
Document
– Variables – labels, descriptions
– Records – keys, variables, foreign keys
SIR Schema
Case Definition
– Case Id
– Max Counts
Record Definition
–
–
–
–
Key Fields
Max Counts
Default Security
Variables within records
Documentation
command for case and
record
Variable Definition
–
–
–
–
–
–
–
–
Format & Position
Missing Values
Valid Values
Value Labels
Categorical Vars
Variable Ranges
Variable Label
Extended label for
variable documentation
– Variable Security
Schema functions in PQL
60+ database functions
30 tabfile functions
Examples:
–
–
–
–
–
–
–
–
–
NRECS
RECNAME
NKEY
KEYNAME
NVARS
VARNAME
VARLABSC
VFORMAT
VTYPE
Sec Index Functions
–
–
–
–
DBINDS
DBINDR
DBINDV
DBINDT
Quick Data Dictionary
Four Record Types
– Variables
– Records
– Record keys
– Record data
Populate from any database
Check consistency
Example Data Dictionary
Create
Populate from MNYR
– 55 record types
– 2216 variables in records
Check consistent use of variables
– Labels
– Formats
– Types
Identify foreign keys
Look at secondary indexes
SIR Structures
Multiple Database
– Until SIR2000 exactly one database in SIR session
– Design suggests separate databases for separate
hierarchies
– Had to use ‘dummy’ cases in single database
Inverted Lists
– Until SIR2002 no secondary index
– Had to use ‘dummy’ cases for inverted list
Auto Increment Keys
Physical Structure
Single Data File
Two types of blocks
– Index
Contain keys plus pointers to other blocks
Single top level block
From one to six further levels
Bottom level points to data block
– Data
Contain keys and data
SIR Data File
Top Level Index
Index Level 1
Bottom Level Index
Data Blocks
Index Level 1
Index Level 1
Data Record
Header
– Size
– Update level
– Lock status
Separate Key in front of record
– All keys same size in single database
– Case id, record number,record key fields
– Special so can be searched
Record organized by data format
– Real8, real4, I4, I2, I1, Character
CIR
One per case
Count for each record type
– e.g. max rec types – 100
100 integers
– I1 – up to 123
– I2 – up to 32,000
– I4 – over this
Common variables
Size of Blocks
Index
– Calculated from key size and number of
records
– Minimum 2K (253 dwords) Maximum 32K
Data
– Calculated from maximum record size and
number of records
– Minimum 2K (254 dwords) Maximum 32K
– Minimum 4 records per block
Index Calculation
Example: Key Length 16
Number of records 1 million
– At 4 per block need 250,000 blocks
– Each index entry takes 3 dwrds
–
–
–
–
–
Key in dwords + 1 dwrd for pointer & count)
Minimum block holds 253/3 – 84 entries
Top level Single block 84 entries
Second level 84 blocks each with 84 entries
Two level index points to 7,056 data blocks
Three level points to 592,704 data blocks
Data Block Growth
Add first record
Start with 3 blocks
– Find data block
– Put record in block in
correct key sequence
– If new record first in
block, update higher
level that pointed
– Top Level Index
1 entry
key of record
Points to second level
– Second level index
1 entry
Key of record
Points to data block
– Data Block - 1 entry
Add records
Block Too Big
– Split into two blocks
– Add new key to higher
level
Loading Factor
How to split block
Records being added randomly
– Split block in middle
– Any block added to likely to have space
– .5 loading factor
Records added in ascending key
– Leave original block as full as possible
– All adding to new block
– .99 loading factor
Suggested Loading Factors
Standard Updates - .5
– If set very high and activity all on original block,
lots of empty new blocks
IMPORT - .99
– Let standard updates split blocks when needed
RELOAD - .99
– no block splitting
– Block filled to loading factor
– Space for largest record
List Stats Info
Number of Index Levels
2
Max Entries Per Index Block 42
Index/Data Block Size
253/3314
Active/Inactive Data Blocks
92/0
Active/Inactive Index Blocks
4/0
Keysize In Bytes
34
Min/Max Record Size
0/808
Inactive Blocks
New Blocks added at end
List maintained of empty blocks
Re-used when new block needed
All records deleted in stand alone
Block update strategy in Master
Master
Allow multiple users to update a
database concurrently
Intended primarily for multiple
interactive users
Communicates via TCP/IP
Machine dependent database access
Provides a consistent database view for
independent retrievals from database
Master Operation
Start Master
– Starts with an address
– Waits for client message
– Does nothing else, NO database access
Master is NOT permanently connected
to any specific database
Client tells master which database to
connect to
Database Access
Database is opened and closed during
SIR session as needed
– During PQL retrievals
– During utilities (No master)
Batch Data Input
Export, Unload, Spreadsheet, …
– During schema updates (No master)
– ‘Old’ Forms while form is running
Database Access
Database open for write
– Single User
– Master
- Exclusive Use
- Shared Read
SIR database files
– sr1/sr2 - meta data – needed by both client (read
only) and master
– sr3 - data
– controlled by master
– sr4 - procedures
– controlled by client
– sr5 - journal
– controlled by master
– sr6 - sec. index
– controlled by master
Single user SIR allows
Multiple Readers OR
a Single Writer to a database
Master:
How it Works
Part 1
SIR
(Copy 1)
User A
Reads & Writes
Exclusively
SIR
(Copy 2)
OR
SIR
(Copy 3)
Users B & C
Share
Read Only
Master:
How it Works
Part 2
SIR
(Copy 1)
Master allows multiple writers,
readers plus independent
readers
SIR
(Copy 2)
SIR
(Copy 3)
Users A & B
Reads & Writes
User C
Independent
Read Only
Master
SIR Data File
SIR Data File
How does Master work?
Client changes access to use Master
Lock Manager for clients accessing
through Master
Delayed view of updates ‘Difference File
Copy’ for independent readers
Client
SIR session switches from single user to use
a specific master
– Master must be available at this point
– All subsequent retrievals then automatically use
Master
– Various utilities not available
Sends Master a request for single data record
at a time
– Master selects on key values
– Client does any selection on data values
Data requests preceded by lock requests
Master
Gets initial logon from client
– Allocates identifier
Gets database open from client
– Checks if already known (open by another client)
– Allocates identifier
Database Identification
– Full pathname is passed by client
– Path is ‘as seen’ by client
Client needs to find database
– Master needs to find database
– Master needs to know that database referenced
by multiple clients is same database
Master Resource Control
Gets request for lock on resource (e.g.
case/record key)
– Checks lock table
– Creates entry if resource available
Gets request for record retrieval
Gets request for record update
Lock Types
Transmitted by client from PQL. Checks
existing lock on resource
1 = Null - becomes exclusive in Update, concurrent
read in retrieval
2 = Concurrent read - Fails if exclusive
3 = Concurrent write - Fails if protected or exclusive
4 = Protected read - Fails if concurrent write,
protected write or exclusive
5 = Protected write - Fails if concurrent write,
protected or exclusive
6 = Exclusive - Fails if any prior lock
Locks in PQL
Ignored in single user mode
RETRIEVAL
– LOCK = CR, CW,PR,PW,EX (2,3,4,5,6)
– CIRLOCK, RECLOCK
– Default: Update –Ex Retrieval – CR
CASE/RECORD commands
– LOCK = numeric_expression
– Nested case/records inherit outer lock
– Lock held until NEXT or EXIT at this level
Locked Case/Record
Block is entered
Variables set to undefined
Test status with functions:
– SYSTEM(36) = 1 Record available
– SYSTEM(37) = 1 Case available
– SYSTEM(38) = 1 Master mode
Wait and retry, tell user with option,…
– RETRY CASE|RECORD
‘Delayed’ Updates
Enables independent retrieval to have
consistent view of data i.e. no updates seen
while retrieval running
Master creates local copy of master index
block
Whenever index or data block rewritten for
first time, Master allocates new block
Keeps list of redundant blocks (index & data)
Identical process on secondary indexes
Difference File Copy
Makes updates available to independent
retrieval
Increments update level
Creates journal header
Writes master index
Makes redundant blocks available if no
other users (can get exclusive access)
Managing Master
Start Parameters
– MST =
– PW =
– DFC=
MST = parameter
Master finds machine name, port 3000
MST = change port number to even_number
CLIENT MST = machine_name[:port]
– The machine name consists of a host and a
domain. It makes the start up for clients faster to
quote both the host and domain name (DNS)
– Start Master - Master started SirNT:3000
– Start Forms
MST=SirNT
MST=SirNT.sir.com.au
Other parameters
PW = password
– Any remote user who wants to administer
master must specify a matching password
DFC = minutes since a difference file
copy which would force an automatic
copy
Administering Master
Interrupt
–
–
–
–
No users being served
Commands
No password
Usage Statistics
Remote
– Other users still active
– Menu driven
– Password Protected
Administering Master
– List logged on users
– List attached databases
– Stop
Immediately
After users logoff
– Difference File copy
Set interval
Backup and Recovery
Unload
–
–
–
–
Header
Internal copy of database
Machine specific/SIR version specific
Brings all unloaded records up to current schema
definition
– Can have multiple unloads on same file
– Accessed sequentially
Journal
– Header
– Copy of database record after update
Update Level & Restructure
Update level incremented when database
open/closed for update
When record written, update level held in
record header
If schema changed old and new version kept
with update level
When record read, if record update level
earlier than schema change record is
transformed
When record written, in new format
Immediate Unload
Length of key changes
– Record type in key so increase in length
forces (Number of record types > 123)
Key definition changes for existing
record
Record is locked until reloaded
Reload
Reload takes specific unload
Defined by update level
Database is recreated
If complete set of journals, can be
applied to bring reloaded database up
to date
ITEMIZE lists unloads or journals
EXPORT
Text version of database
Machine Independent
IMPORT rebuilds completely
Machine Independent
SIR version independent
Choose for long term archive
VERIFY
Walks index
– Retrieves each data block
– Checks counts and pointers
– Reports structural problems
– Patch puts calculated value in counts
– Clear corruption flag
Walks secondary indexes
Creating and Maintaining
Databases
New York Conference 2005