Transcript Lec 1

Performance
Week 12 Lecture 1
There are two general measures
of performance
• The time an individual takes to complete a
task – RESPONSE TIME
• The number of transactions the system can
process in a given time period – THROUGHPUT
Some general observations
• Performance is a design criterion from the start –
determine the required response times and
throughput before design starts
• It is very difficult to estimate, but you still need to
know what you are trying to achieve
• The 80:20 rule applies.
– 80% of the transactions use 20% of the components – as
always the trick is, which 20%
– 20% of the transactions generate 80% of the work
• All aspects of the system affect
performance. Technical experts
concentrate on getting their bit right,
and tend to leave performance to the
end or to someone else
– They pass the parcel
– Or demand more iron
• Layering manages complexity but adds to
processing time. Every interface, every
message constructed, passed and parsed,
every database session initiated, costs time.
Performance monitoring is a
continuous process
• DBA and Network manager will have lots of stats to manage
their bit
• The application should collect stats on the areas defined in
response time requirements. This could be:
– Identifying the key transactions and queries – the 80%
– For each, create a statistical record recording transaction type, date, time,
user ID and start and end times (a sketch of such a record follows this list)
– Collection itself can affect times as well, so have a switch to turn it on and off
– Analyse the stats in terms the user can understand - a report for each user
perhaps
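A minimal sketch of such a statistical record as a database table, with a simple reporting query. The table and column names (TransactionStats etc.) are illustrative only, and the average of (end_time - start_time) assumes a DBMS such as PostgreSQL where subtracting timestamps yields an interval; other products use functions like DATEDIFF instead.

-- One row per monitored transaction or query
CREATE TABLE TransactionStats (
    stat_id     INTEGER PRIMARY KEY,
    txn_type    VARCHAR(30),    -- which key transaction or query
    user_id     VARCHAR(20),
    start_time  TIMESTAMP,      -- carries the date and time as well
    end_time    TIMESTAMP
);

-- A report per user, in terms the user can understand
SELECT user_id, txn_type,
       COUNT(*)                   AS transactions,
       AVG(end_time - start_time) AS avg_response
FROM   TransactionStats
GROUP  BY user_id, txn_type;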
All areas of the system affect
performance
• User Interface Design
• System design
• Programming
• System architecture
• Database implementation
• Operating system, Middleware and Server hardware
• Network
User Interface Design
• Primary objective is making it easy to use and to
learn
• But efficiency is integral to this
• Some points to consider:
– Multiple forms take resources and time for the user to reorientate
– Keep the number of processing steps to a minimum
– The system should know the user and complete as much
of the form as possible, and then allow change
• Allow the user to store and reload parameters that
are commonly used. Many users have a small
number of queries they commonly make, or generate
transactions within a specific area
• Allow users quick access to functions they
commonly use
• When the system is checking a completed form,
check all parameters and report all mistakes
• Minimise the amount of “Pretties” on a form. They
take resources, distract the user and don’t help the
task at hand. Keep the form clean
System design
• Keep it simple! When you are reviewing your dataflow
diagrams, if it is complex to you or there is a large number
of layers, it is probably wrong
• Walk it through with someone you respect – not
necessarily someone who knows the area – explaining will
show up problem areas
• Walk it through with the users – prototype or story board
• $1 spent in design is worth $10 in development or $100
after implementation
The time of the day is important
• Most users work 9:00 to 5:00
• Transaction peaks are often between 10:00
and 12:00 and between 2:30 and 4:30
• A system serving multiple time zones has a
smoothing effect on this factor
• Peaks also occur at week or month end
• These have to be designed for
Programming
• Design issues apply here as well – including the
80:20 rule and you will not be able to tell which
20%
• Most programmers test on small amounts of data
and don’t see the inefficient areas of their code
• With large databases, poorly written SQL is a
common problem. If you are new to this, talk to the
DBA. Read the manual sections that apply. Use
stored procedures & triggers
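As a hedged illustration of the kind of poorly written SQL that only shows up on large tables (the Orders table and its columns are hypothetical, and YEAR() assumes a DBMS such as MySQL or SQL Server):

-- Poor: wrapping the column in a function usually prevents the optimiser
-- from using an index on order_date, forcing a full table scan
SELECT order_id, total
FROM   Orders
WHERE  YEAR(order_date) = 2024;

-- Better: a plain range predicate on the column lets an index be used
SELECT order_id, total
FROM   Orders
WHERE  order_date >= '2024-01-01'
  AND  order_date <  '2025-01-01';

On a few hundred test rows both versions look instant, which is exactly why programmers testing on small data don't see the difference.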
System architecture
• If multiple servers are even a remote possibility, then the
system has to be architected for this
• Do all of your estimates over the life of the system and
double or treble them
• Know the hardware and software platform that the system
is to use. Know where the bottlenecks are when the
platform scales
• Design in middleware if multiple servers are possible
• Design components that are logical and do a unit of work
• Leave compute intensive work to the end of the day if time
is not of the essence
• If the application is to be purchased from a major
supplier, they probably have it designed for large
volumes
• If the supplier is a niche player or the market is
small, the system may not be designed for
scalability but they may not tell you
• Before any application is purchased from any
supplier, performance & scalability must be
evaluated and resolved. Visit existing users and
get the specifics, don’t rely on the sales rep. He or
she may fudge it. Make sure they use the target
platform
Database
• The development team develop the ER diagrams and then
the conceptual & logical schemas. The DBA usually takes
over from there and completes the physical schema
• Given that the normalisation process is well known, the
database tables should be structured correctly
[Diagram: Responsibility of people with different skills. The Business Analyst and Programmer work at the Application layer; the Experienced Programmer and Database Administrator work with the DBMS layers (query processor, indexes, concurrency control, recovery); beneath these sit the Operating system (file system) and the Hardware (disk etc).]
Five Basic Principles
of tuning databases
• Think globally, fix locally
• Partitioning breaks bottlenecks
• Start-up costs are high, running costs are low
• Render unto server what is due unto server
• Be prepared for trade-offs
What are indexes?
• Like the index in a book, they are structures that
point to records and can be searched more
quickly than scanning the whole table
• They are usually B-trees
• The key can be one attribute or made up of
multiple attributes
• There is usually one or more indexes for
each table
If we used the following SQL:

SELECT name
FROM Student
WHERE SID = 9912345

then, if we had an index on SID, the query processor would use the B-tree index to find the record quickly. Otherwise the Student table would have to be scanned to find the record with SID = 9912345.
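For completeness, a sketch of how such an index might be declared. The index names are illustrative; most DBMSs build a B-tree by default:

-- Single-attribute index on the key used in the WHERE clause
CREATE INDEX idx_student_sid ON Student (SID);

-- A composite (multi-attribute) index; the leading attribute matters most
-- to the optimiser, so order the attributes with care
CREATE INDEX idx_student_name_sid ON Student (name, SID);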
Indexes & usage plans
• Index definition is not an intuitive process
• Indexes made up of more than one attribute need
care to design
• More indexes is not the answer – more to update
and queries are not necessarily faster
• Study query optimisation plans for common
queries and test alternatives (see the sketch after this list)
• Different DBMS will produce different
optimisation plans
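A sketch of how to study a plan, assuming a PostgreSQL-style DBMS; Oracle uses EXPLAIN PLAN FOR and SQL Server uses SET SHOWPLAN_ALL, but the idea is the same:

-- Run the common query through the optimiser and execute it
EXPLAIN ANALYZE
SELECT name
FROM   Student
WHERE  SID = 9912345;
-- The output shows whether an Index Scan or a Sequential (full table)
-- Scan was chosen, plus estimated and actual row counts and timings.
-- Add or drop an index, re-run, and compare the plans.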
Databases can be tuned for update or query
• If the indexes required for query optimisation affect the
update times, consider:
– Duplicating some tables or perhaps the whole database and tune them
differently
– Have minimal indexes on the update tables, and more indexes on the
query tables
– Update the query tables on a trigger or subsequent event (a sketch follows below)
– Flatten or de-normalise the tables in the query database to optimise
execution of common queries
• I am a believer in de-normalising and carrying some
redundancy if that improves performance
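A sketch of the trigger approach for keeping a denormalised query table up to date. The tables (OrderLine, ProductSalesSummary) are hypothetical and the syntax is Oracle-flavoured PL/SQL; MySQL uses NEW.column without the colon and PostgreSQL requires a separate trigger function:

CREATE OR REPLACE TRIGGER trg_orderline_to_summary
AFTER INSERT ON OrderLine
FOR EACH ROW
BEGIN
    -- push each update through to the flattened, query-optimised table
    UPDATE ProductSalesSummary
    SET    units_sold  = units_sold  + :NEW.quantity,
           sales_value = sales_value + :NEW.quantity * :NEW.unit_price
    WHERE  product_id  = :NEW.product_id;
END;

The update tables keep minimal indexes; the summary table can carry whatever indexes the common queries need.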
Other database issues
• Number and size of memory pools given to databases and
processes
• Consider the order in which data should be held in a table
– affects update and query times
• Avoid records with very high hit rates – major cause of
locking – hot spots
• Give programmers a table locking plan
• Ensure tables that are often used together and log files are
on different devices to avoid head thrash (there are also
facilities in DBMS where related records can be held in the
same or contiguous pages as another method of
overcoming head thrash)
• Cause the physical re-organisation of data in the database
to ensure free space and to ensure records are held in the
correct sequence in contiguous pages
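A sketch of forcing that re-organisation, assuming a PostgreSQL-style DBMS (table and index names are illustrative); other products use utilities such as DB2's REORG or SQL Server's ALTER INDEX ... REBUILD:

-- Rewrite the table so rows sit in index order, in contiguous pages
CLUSTER Orders USING idx_orders_date;

-- Leave 30% free space per page so later inserts and updates do not
-- immediately fragment the table again
ALTER TABLE Orders SET (fillfactor = 70);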
Platform evaluation
• The platform consists of:
– Operating systems
– Middleware (MOM, TP Monitors, Distributed Component
services)
– Server computers
• Usually best evaluated as a unit
• Sometimes all or some of the suppliers of these elements
are organisation standards
• But the precise platform still needs to be specified and
evaluated for suitability for the application
How do we specify and evaluate the
platform?
• Again, with difficulty
• Because of the overall complexity of the
relationship of these platform components,
benchmarking is usually the approach
• That is, how many transactions for this
application can the recommended platform
process per hour, i.e. what is its throughput?
Benchmarks are not easy
• At the time the benchmark needs to be
done, the application code is usually not
written. So we can’t benchmark the actual
application.
• Setting up quantities of benchmark data that
match the structure of the new database is
a difficult and time-consuming task
• An alternative is to use TPC benchmarks
What are TPC benchmarks?
• The Transaction Processing Performance Council (TPC) is an independent organisation
that prepares and audits benchmarks of combinations of operating
system, DBMS and server and publishes those benchmarks in a
comparative form.
• It has been functioning for 10+ years
• It specifies a number of benchmarks, related as far as possible to real-world
situations
• It monitors and audits tests by manufacturers to ensure all conditions
are met and the results are comparable
• Website is www.tpc.org
Four current benchmarks in use
• TPC-C an order entry transaction processing benchmark
with a primary metric of transactions processed per minute – tpmC
• TPC-H a decision support benchmark with a primary
metric of Queries per hour - QphH
• TPC-R similar to H but allows greater optimisation based
on advance knowledge of the queries - QphR
• TPC-W a transactional web benchmark with a primary
metric of Web Interactions Per Second – WIPS
TPC-C
• TPC-C simulates an order entry environment
• Involves a mix of five transaction types of different complexity
• Multiple on-line terminal sessions
• Moderate system and application execution time
• Significant disk input/output
• Transaction integrity (ACID properties)
• Non-uniform distribution of data access through primary and secondary keys
• Databases consisting of many tables with a wide variety of sizes, attributes, and relationships
• Contention on data access and update
TPC-H and TPC-R
• Illustrate decision support systems that examine
large volumes of data, execute queries with a high
degree of complexity, and give answers to critical
business questions.
• Use variable sized databases
• Tests queries submitted sequentially and
concurrently
TPC-W
• Web benchmark simulating a business orientated transactional
web server
• Multiple on-line browser sessions
• Dynamic page generation with database access and update
• The simultaneous execution of multiple transaction types that
span a breadth of complexity with contention on data access
and update
• On-line transaction execution modes
• Databases consisting of many tables with a wide variety of
sizes, attributes, and relationships
• Transaction integrity (ACID properties)
What’s wrong with the TPC tests?
• The main criticism is that the benchmark does not match the actual
application
• But it does:
– Provide cross-platform comparison of different combinations of system
software and hardware in a completely objective manner
– Allow some comparison with the subject application, because the test
application is well documented
– Measure the complete system, with the exception of the network, rather than
relying on raw measures of processor speed such as MIPS and SPEC rates
– Provide a very low cost method of benchmarking
Example
Table 1

                         SPEC rate   tpmC      $/tpmC
SPARCserver 1000 8-way   10,113      1,079.4   $1,032
HP 9000 H70               3,757      1,290.9     $961
• The SPEC rate would suggest the SPARC server is 2.7 times better
• The tpmC rate suggests the HP has a higher
throughput
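Checking the arithmetic behind those two bullets from Table 1:

$$\frac{10{,}113}{3{,}757} \approx 2.7$$

so on SPEC rate alone the SPARC looks 2.7 times better, yet the HP delivers 1,290.9 tpmC against 1,079.4 tpmC, and at $961 per tpmC rather than $1,032, so on the transaction benchmark it is both faster and cheaper per transaction.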
Issues with some benchmarks
• Suppliers can configure their machines to achieve an
optimal performance
• Many machines use about half the processors their design
permits
• This suggests that because of the SMP bottleneck on
memory buses, the best performance is around half the
maximum
End-to-end transmission rate is affected
by two factors
• The bandwidth on each link
• The latency at each switch
• Of these, latency is becoming the more important
• But how can we estimate end-to-end transmission
time as part of the estimate for response time?
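As a rough first cut (a sketch only; real networks add queueing delay under load, which the tests on the next slides are meant to expose):

$$T_{\text{end-to-end}} \approx \sum_{i} \frac{\text{message size}}{B_i} + \sum_{j} L_j$$

where $B_i$ is the bandwidth of link $i$ and $L_j$ is the latency at switch $j$ along the path.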
Determine whether it matters
• Isolate the network element and express it as an overhead
on transactions executed on the LAN
• Carry out tests locally and then across the links that are
expected to be a problem
• On an Australian Frame relay network with relatively low
bandwidth links, we had a network overhead of between
20% and 50%
• If the response time is relatively low anyway, say 1-2
seconds, then this overhead may not be noticeable in many
systems
• Check other applications using the same
network – you may not be the first
• Talk to the Network Administrator or
Supplier, and carry out tests that will
indicate any links that may be a problem
• Simulate the application as closely as
possible and test the network
• If the network is complex and the load high
then full simulation may be necessary –
specialist organisations have test
laboratories that can test for sensitivity
under load – this can be expensive
Network monitoring has to be continuous
• We know that many factors not within our control
are going to affect our end-to-end rates
• So continuous monitoring is necessary, both by the
Network administrator in general and as part of
our application’s response time
• Planning and managing networks is a complex and
specialised job, so use the experts, but know
enough to understand the issues and to discuss
solutions
Conclusion
• Accurately estimating response time is difficult, but you have
to plan to achieve a given response time and throughput rate
• Clearly defined requirements are absolutely necessary
• Be very specific about what the response time refers to
• Ensure all technical experts know what is expected
• Over-provide for each key element – you will need it
• Simulate or test any area that is expected to be a problem
• Have fall-back plans