Transcript Mitilec09

Layouts and Scripts
In this session, we will be looking at a number of different
physical layouts
The physical layout very much influences
 How much data a database can hold
 How many users
 How many concurrent processes
 Recovery capability
 Performance
 Nature of Database Administration
 Cost
 Expansion
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 1
Oracle Architecture
Oracle 8i is an object-relational database management system.
It contains the capabilities of relational and object-oriented
database systems
It is a database server for many types of business applications
including
 On Line Transaction Processing (OLTP)
 Decision Support Systems
 Data Warehousing
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 2
Oracle Architecture
In perspective, Oracle is a not strictly high end application
DBMS - but it is getting close
A high end system has one or more of these characteristics:
 Management of a very large database (VLDB) - probably
hundreds of gigabytes or terabytes
 Provides access to many concurrent users - in the
thousands, or tens of thousands
 Gives a guarantee of constant database availability for
mission critical applications - 24 hours a day, 7 days a week.
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 3
Oracle Architecture
High end applications environments are not normally controlled
by Relational Database Management Systems
High end database environments are controlled by mainframe
computers and non-relational DBMSs.
Current RDBMSs (such as Oracle 9i, DB2/6000) manage very
large amounts of data, or perform under demanding transaction
loads.
However, very fast processors, fast and secure
communications, XML, large and fast storage facilities,
‘intelligent’ software, and advanced distributed database
engines are making RDBMSs very attractive
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 4
Layouts and Scripts
There are some guidelines for designing a database with files
distributed so that optimum performance, from a specific
configuration, can be achieved
The primary aspect which needs to be clearly understood is the
nature of the database
 Is it transaction oriented ?
 Is it read-intensive ?
 What is the mix ?
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 5
Layouts and Scripts
The key items which need to be understood are
 Identifying Input/Output contention among datafiles
 Identifying Input/Output bottlenecks among all database files
 Identifying concurrent Input/Output among background
processes
 Defining the security and performance goals for the
database
 Defining the system hardware and mirroring architecture
 Identifying disks which can be dedicated to the database
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 6
Layouts and Scripts
Let’s look at tablespaces :
These ones will be present in some combination
System
Data
Data_2
Indexes
Indexes_2
RBS
RBS_2
Temp
Temp_user
© 2003 Monash IT Pty Ltd
Data dictionary
Standard-operation tables
Static tables used during standard operation
Indexes for the standard-operation tables
Indexes for the static tables
Standard-operation Rollback Segments
Special rollback segments used for data loads
Standard operation temporary segments
Temporary segments created by a temporary
user
DataBase Design Lect 9 / 7
Layouts and Scripts
Tools
Tools_1
Users
Agg_data
Partitions
RDBMS tools tables
Indexes for the RDBMS tools tables
User objects in development tables
Aggregation data and materialised views
Partitions of a table or index segments; create
multiple tablespaces for them
Temp_Work Temporary tables used during data load
processing
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 8
Layouts and Scripts
(A materialised view stores replicated data based on an underlying query. A
materialised view stores data which is replicated from within the current
database.
A Snapshot stores data from a remote database.
The system optimiser may choose to use a materialised view instead of a
query against a larger table if the materialised view will return the same data
and thus improve response time. A materialised view does however incur an
overhead of additional space usage, and maintenance)
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 9
Layouts and Scripts
Each of the tablespaces will require a separate datafile
Monitoring of I/O performance among datafiles is done after
the database has been created, and the DBA must estimate the
I/O load for each datafile (based on what information ?)
The physical layout planning is commenced by estimating the
relative I/O among the datafiles, with the most active
tablespace given a weight of 100.
Estimate the I/O from the other datafiles relative to the most
active datafile
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 10
Layouts and Scripts
Assign a weight of 35 for the System tablespace files and the
index tablespaces a value of 1/3 or their data tablespaces
Rdb’s may go as high as 70 (depending on the database
activity) - between 30 and 50 is ‘normal’
In production, Temp will be used by large sorts
Tools will be used rarely in production - as will the Tools_2
tablespace
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 11
Layouts and Scripts
So, what do we have ?
Tablespace Weight
Data
100
Rbs
40
System
35
Indexes
33
Temp
5
Data_2
4
Indexes_2
2
Tools
1
- Something like this % of Total
45
18
16
15
2
2
1
1
(220)
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 12
Layouts and Scripts
94% of the Input/Output is associated with the top four
tablespaces
This indicates then that in order to properly the datafile activity,
5 disks would be needed, AND that NO other database files
should be put on the disks which are accommodating the top 4
tablespaces
There are some rules which apply :
1. Data tablespaces should be stored separately from their
Index tablespaces
2. Rbs tablespaces should be stored separately from their
Index tablespaces
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 13
Layouts and Scripts
and 3. The System tablespace should be stored separately
from the other tablespaces in the database
In my example, there is only 1 Data tablespace. In production
databases there will probably be many Data spaces (which will
happen if Partitions are used).
If/when this occurs, the weightings of each of the Data
tablespaces will need to be made (but for my efforts, 1 Data
tablespace will be used).
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 14
Data Protection
There is a Log Writer process (LGWR) whose function is to
write this data to the log file.
Transaction Log Group
Log Switch
Log
Log
group
group
Archived
transaction log
Log Switch
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 15
Layouts and Scripts
As you have probably guessed, there are other tablespaces
which require to be considered - many used by the many and
various ‘processes’ of Oracle
One of these considerations is the on-line redo log files (you
remember these and their purpose ?)
They store the records of each transaction. Each database
must have at least 2 online redo log files available to it - the
database will write to one log in sequential mode until the redo
log file is filled, then it will start writing to the second redo log
file.
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 16
Layouts and Scripts
Redo log files (cont’d)
The Online Redo Log files maintain data about current
transactions and they cannot be recovered from a backup
unless the database is/was shut down prior to backup - this is a
requirement of the ‘Offline Backup’ procedure (if we have time
we will look at this)
On line redo log files need to be ‘mirrored’
A method of doing this is to employ redo log groups - which
dynamically maintain multiple sets of the online redo logs
The operating system is also a good ally for mirroring files
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 17
Layouts and Scripts
Redo log files should be placed away from datafiles because of
the performance implications, and this means knowing how the
2 types of files are used
Every transaction (unless it is tagged with the nologging
parameter) is recorded in the redo log files
The entries are written by the LogWriter (LGWR) process
The data in the transaction is concurrently written to a number
of tablespaces(the RBS rollback segments and the Data
tablespace come to mind) via the DataBase Writer (DBWR)
and this raises possible contention issues if a datafile is located
on the same disk as a redo log file
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 18
Layouts and Scripts
Redo log files are written sequentially
Datafiles are written in ‘random’ order - it is a good move to
have these 2 different demands separated
If a datafile must be stored on the same disk as a redo log files,
then it should not belong to the System tablespace, the Rbs
tablespace, or a very active Data or Index tablespace
So what about Control Files ?
There is much less traffic here, and they can be internally
mirrored. (config.ora or init.orafile). The database will maintain
the control files as identical copies of each other.
There should be 3 copies, across 3 disks
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 19
Layouts and Scripts
The LGWR background process write to the online redo files in
a cyclical manner
When the lst redo file is full, it directs writing to the 2nd file ….
When the ‘last’ file is full, LWGR starts overwriting the contents
of the 1st file .. and so on
When ARCHIVELOG mode is used, the contents of the ‘about
to be overwritten file’ are written to a redo file on a disk device
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 20
Layouts and Scripts
There will be contention on the online redo log as LGWR will be
attempting to write to one redo log file while the Archiver
(ARCH) will be trying to read another.
The solution is to distribute the redo log files across multiple
disks
The archived redo log files are high I/O and therefore should
NOT be on the same device as System, Rbs, Data, or Indexes
tablespaces
Neither should they be stored on the same device as any of the
online redo log files.
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 21
Layouts and Scripts
The database will stall if there is not enough disk space, and
the archived files should directed to a disk which contains small
and preferably static files
Concurrent I/O
A commendable goal, and one which needs careful planning to
achieve.
Placing two random access files which are never accessed at
the same time will quite happily avoid contention for I/O
capability
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 22
Layouts and Scripts
What we have just covered is known as
1. Concurrent I/O - when concurrent processes are being
performed against the same device (disk)
This is overcome by isolating data tables from their Indexes for
instance
2. Interference - when sequential writing is interfered by reads
or writes to other files on the same disk
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 23
Layouts and Scripts
At the risk of labouring this a bit,
The 3 background processes to watch are
1. DBWR, which writes in a random manner
2. LGWR, which writes sequentially
3. ARCH, which reads and writes sequentially
LGWR and ARCH write to 1 file at a time, but DBWR may be
attempting to write to multiple files at once - (can you think of an
example ?)
Multiple DBWR processes for each instance or multiple I/O
slaves for each DBWR is a solution
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 24
Layouts and Scripts
What are the disk layout goals ?
Are they (1) recoverability or (2) performance
Recoverability must address all processes which impact disks
(storage area for archived redo logs and for Export dump files (which so far we haven’t mentioned) come to mind)).
Performance calls for file I/O performance and relative speeds
of the disk drives
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 25
Layouts and Scripts
What are some recoverability issues ?
All critical database files should be placed on mirrored drives,
and the database run in ARCHIVELOG mode
The online redo files must also be mirrored (Operating system
or mirrored redo log groups)
Recoverability issues involve a few disks
and this is where we start to look at hardware specification
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 26
Layouts and Scripts
Mirroring architecture leads to specifying
 the number of disks required
 the models of disks (capacity and speed)
 the strategy
 If the hardware system if heterogeneous, the faster drives
should be dedicated to Oracle database files
 RAID systems should be carefully analysed as to their
capability and the optimum benefit sought - RAID-1, RAID-3
and RAID-5 have different processes relating to parity
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 27
Layouts and Scripts
The disks chosen for mirroring architecture must be dedicated
to the database
This guarantees that non-database load on these disks will not
interfere with database processes
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 28
Layouts and Scripts
Goals for disk layout :
 The database must be recoverable
 The online redo log files must be mirrored via the system or
the database
 The database file I/O weights must be estimated
 Contention between DBWR, LGWR and ARCH must be
minimised
 Contention between disks for DBWR must be minimised
 The performance goals must be defined
 The disk hardware options must be known
 The disk mirroring architecture must be known
 Disks must be dedicated to the database
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 29
Layouts and Scripts
So where does that leave us ?
We’re going to look at ‘solutions’ from Optimal to Practical
and we’ll assume that :
the disks are dedicated to the database
the online redo log files are being mirrored by the Operating
System
the disks are of identical size
the disks have identical performance characteristics
(obviously the best case scenario !)
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 30
Layouts and Scripts
So, with that optimistic outlook let’s proceed
Case 1 - The Optimum Physical Layout
Disk No
Contents
1
Oracle Software
2
SYSTEM tablespace
3
RBS tablespace
4
DATA tablespace
5
INDEXES tablespace
6
TEMP tablespace
7
TOOLS tablespace
8
OnLine Redo Log 1
9
OnLine redo log 2
10
OnLine redo Log 3
11
Control file 1
© 2003 Monash IT Pty Ltd
Disk No.
Contents
12
Control file 2
13
Control file 3
14
Application software
15
RBS_2
16
DATA_2
17
INDEXES_2
18
TEMP_USER
19
TOOLS_1
20
USERS
21
Archived redo dest. disk
22
Archived dump file
DataBase Design Lect 9 / 31
Hardware Configurations
 The 22 disk solution is an optimal solution.
 It may not be feasible for a number of reasons, including
hardware costs
 In the following overheads there will be efforts to reduce the
number of disks, commensurate with preserving performance
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 32
Hardware Configurations
This leads to - 17 disk configuration
Disk Contents
Disk Contents
1 Oracle software
11 Application software
2 SYSTEM tablespace
12 RBS_2
3 RBS tablespace
13 DATA_2
4 DATA tablespace
14 INDEXES_2
5 INDEXES tablespace
15 TEMP_USER
6 TEMP tablespace
16 Archived redo log
7 TOOLS tablespace
destination disk
8 Online Redo log 1, Control file 1 17 Export dump
9 Online Redo log 2, Control file 2
destination disk
10 Online Redo log 3, Control file 3
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 33
Hardware Configurations
The Control Files are candidates for placement onto the three
redo log disks. The altered arrangement reflects this.
The Control files will interfere with the online redo logfiles but
only at log switch points and during recovery
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 34
Hardware Configurations
The TOOLS_1 tablespace will be merged with the TOOLS
tablespace
In a production environment, users will not have resource
privileges, and the USERS tablespace can be ignored
However, what will be the case if users require development
and test access ?
Create another database ? (test ?)
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 35
Hardware Configurations
The RBS and RBS_2 tablespaces have special rollback
segments used during data loading.
Data loads should not occur during production usage, and so if
the 17 disk option is not practical, we can look at combining
RBS and RBS_2 - there should be no contention
TEMP and TEMP_USER can be placed on the same disk
The TEMP tablespace weighting (5 in the previous table) can
vary. It should be possible to store these 2 tablespaces on the
same disk.
TEMP_USER is dedicated to a specific user - (such as Oracle
Financials, and these have temporary segments requirements
which are greater than the system’s users)
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 36
Hardware Configurations
The revised solution is now
Disk Contents
Disk Content
1 Oracle software
11 Application software
2 SYSTEM tablespace
12 DATA_2
3 RBS, RBS_2 tablespace
13 INDEXES_2
4 DATA tablespace
14 Archived Redo Log
5 INDEXES tablespaces
destination disk
6 TEMP, TEMP_USER tablespace 15 Export dump file
7 TOOLS tablespace
destination disk
8 Online Redo Log 1, Control file 1
9 Online Redo Log 2, Control file 2
15 disks
10 Online Redo Log 3, Control file 3
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 37
Hardware Configurations
What if there aren’t 15 disks ? -->> Move to attempt 3
Here the online Redo Logs will be placed onto the same disk.
Where there are ARCHIVELOG backups, this will cause
concurrent I/O and interference contention between LGWR and
ARCH on that disk
What we can deduce from this, is that the combination about to
be proposed is NOT appropriate for a high transaction system
or systems running in ARCHIVELOG mode
(why is this so - Prof. Julius Sumner Miller ?)
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 38
Hardware Configurations
The ‘new’ solution Disk Contents
1 Oracle software
2 SYSTEM tablespace, Control file 1
3 RBS, RBS_2 tablespaces, Control file 2
4 DATA tablespace, Control file 3
5 INDEXES tablespaces
6 TEMP, TEMP_USER tablespaces
7 TOOLS, INDEXES_2 tablespaces
8 OnLine Redo Logs 1, 2 and 3
9 Application software
10 DATA_2
11 Archived redo log destination disk
12 Export dump file destination disk
© 2003 Monash IT Pty Ltd
12 disks
DataBase Design Lect 9 / 39
Hardware Configurations
Notice that the Control Files have been moved to Disks 3, 4
and 5
The Control Files are not I/O demanding, and can safely coexist
with SYSTEM, RBS and DATA
What we have done so far is to ‘move’ the high numbered disks
to the ‘low’ numbered disks - these are the most critical in the
database.
The next attempt to ‘rationalise’ the disk arrangement is to look
carefully at the high numbered disks.
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 40
Hardware Configurations
 DATA_2 can be combined with with the TEMP tablespaces
(this disk has 4% of the I/O load).
 This should be safe as the static tables (which ones are
those ?) are not as likely to have group operations
performed on them as the ones in the DATA tablespace
 The Export dump files have been moved to the Online Redo
disk (the Redo log files are about 100Mb and don’t increase
in size -(is that correct ?) Exporting causes minor
transaction activity.
 The other is the combination of the application software with
the archived redo log file destination area. This leaves
ARCH space to write log files, and avoids conflicts with
DBWR
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 41
Hardware Configurations
Disk Content
1 Oracle software
2 SYSTEM tablespace, Control file 1
3 RBS tablespace, RBS_2 tablespace, Control file 2
4 DATA tablespace, Control file 3
5 INDEXES tablespace
9 disks
6 TEMP, TEMP_USER, DATA_2 tablespaces
7 TOOLS, INDEXES_2 tablespaces
8 Online Redo logs 1, 2 and 3, Export dump file
9 Application software, Archived Redo log destination disk
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 42
Hardware Configurations
Can the number of required disks be further reduced ?
Remember that the performance characteristics will deteriorate
It’s now important to look closely at the weights set during the
I/O estimation process.
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 43
Hardware Configurations
Estimated Weightings for the previous (9 disk) solution are
Disk Weight
Contents
1
Oracle software
2
35
SYSTEM tablespace, Control file 1
3
40
RBS, RBS_2 tablespace, Control file 2
4
100 DATA tablespace, Control file 3
5
33 INDEXES tablespaces
6
9 TEMP, TEMP_USER, DATA_2 tablespace
7
3 TOOLS, INDESES_2 tablespaces
8
40+ Online Redo logs 1,2 and 3, Export dump
file destination disk
9
40+ Application software, archived redo log
destination disk
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 44
Hardware Configurations
A further compromise distribution could be
Disk Weight
Contents
1
Oracle software
2
38
SYSTEM, TOOLS, INDEXES_2
tablespaces, Control file1
3
40
RBS, RBS_2 tablespaces, Control file 2
4
100
DATA tablespace, Control file 3
5
42
INDEXES, TEMP, TEMP_USER, DATA_2
tablespaces
6
40+
Online redo logs 1,2 and 3, Export dump
file destination disk
7
40+
Application software, Archived redo log
destination disk
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 45
Hardware Configurations
A few thoughts for a small database system - 3 disks
1. Suitable for an OLTP application. Assumes that the transactions
a small in size, large in number and variety, and randomly
scattered among the available tables.
The application should be as index intensive as possible, and
the full table scans must be kept to the minimum possible
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 46
Hardware Configurations
2. Isolate the SYSTEM tablespace. This stores the data dictionary
- which is accessed for every query and is accessed many time
for every query
In a ‘typical case’, query execution requires
 the column names to be checked in CODES_TABLE table
 the user’s privilege of access to the CODES_TABLE table
 the user’s privilege to access the Code column of the
CODES_TABLE table
 the user’s role definition(s)
 the indexes defined on the CODES_TABLE table
 the columns of the columns defined on the CODES_TABLE
table
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 47
Hardware Configurations
3. Isolate the INDEXES tablespace. This probably accounts for 35
to 40% of the I/O
4. Separate the rollback segments and DATA tablespaces
There is a point to watch here - with 3 disks there are 4
tablespaces - SYSTEM, INDXES, DATA and RBS.
The placement of RBS is determined by the volume of
transactions. If high, RBS and DATA should be kept apart.
If low, RBS and DATA should work together without causing
contention
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 48
Hardware Configurations
The 3 disk layout would be one of these :
Disk 1: SYSTEM tablespace, control file, redo log
Disk 2 : INDEXES tablespace, control file, redo log, RBS
tablespace
Disk 3 : DATA tablespace, control file, redo log
OR
Disk 1 : SYSTEM tablespace, control file, redo log
Disk 2 : INDEXES tablespace, control file, redo log
Disk 3 : DATA tablespace, control file, redo log, RBS
tablespace
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 49
Hardware Configurations
Summary :
Database Type
Tablespaces
Small development SYSTEM
database
DATA
INDEXES
RBS
TEMP
USERS
TOOLS
© 2003 Monash IT Pty Ltd
DataBase Design Lect 9 / 50
Hardware Configurations
Summary :
Database Type
Production OLTP
database
© 2003 Monash IT Pty Ltd
Tablespaces
SYSTEM
DATA
DATA_2
INDEXES
INDEXES_2
RBS
RBS_2
TEMP
TEMP_USER
TOOLS
DataBase Design Lect 9 / 51
Hardware Configurations
Summary :
Database Type
Production OLTP
with historical
data
© 2003 Monash IT Pty Ltd
Tablespaces
Tablespaces
SYSTEM
TEMP
DATA
TEMP_USER
DATA_2
TOOLS
DATA_ARCHIVE
INDEXES
INDEXES_2
INDEXES_ARCHIVE
RBS
RBS_2
DataBase Design Lect 9 / 52
Hardware Configurations
Summary :
Database Type
Data Warehouse
© 2003 Monash IT Pty Ltd
Tablespaces
SYSTEM
DATA
DATA_2
INDEXES
INDEXES_2
RBS
RBS_2
TEMP
TEMP_USER
TOOLS
Tablespaces
PARTITIONS
PARTITIONS_2
AGG_DATA
AGG_DATA_1
SNAPS
SNAPS_1
TEMP_WORK
TEMP_WORK_1
DataBase Design Lect 9 / 53