Transcript Class 8

Pop Quiz
• How does fix response time and fix
quality impact Customer Satisfaction?
• What is a Risk Exposure calculation?
• What’s a Scatter Diagram and why do
we care about it? (what’s it for?)
• Define BMI
• Define Predictive Validity
Software Quality Engineering
CS410
Class 8
Exponential Distribution
Reliability Growth Models
Reliability Growth Models
• A class of software reliability models
• Usually based on data from formal
testing
• Models are more appropriate to the
software product once development is
complete
• Rationale - defect arrival and failure
patterns during testing are good indicators
of reliability when the product is used by the
customer
Reliability Growth Models
• Reliability grows over time
– Defects are identified
– Defects are reported
– Defects are fixed
– Fixes made available to customers
– Models
• Reliability Growth Models are most often
used for projecting reliability of software
before it’s shipped to customers
Reliability Growth Models
• More than 100 Reliability Growth Models
have been proposed for software
engineering
• Each model has its own assumptions,
applicability, and limitations
• Some general limitations
– Cost of gathering data
– Practicality
– Understandability
– Validation in SWE field
Reliability Growth Models
• Two major classes of Reliability Growth Models
1. Time Between Failure Models
• Time is the dependent variable
• Expected that time gets longer as reliability grows
(i.e., as defects are removed)
• Earliest class of SWE reliability models
• Assumption that the time between failure i-1 and failure
i follows a distribution whose parameters are related
to the number of latent defects remaining in the
product
• Mean time to the next failure is the parameter to be
estimated by the model
Reliability Growth Models
2. Fault Count Models
• Number of faults (or failures) in a
specified time interval is the dependent
variable
• Time interval is fixed, and number of
faults (or failures) is a random variable
• As defects are removed it is expected that
the number of failures per unit of time
will decrease
• The number of remaining defects (or
failures) is the parameter to be estimated
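The two model classes look at the same failure log in different ways. A minimal sketch of both views, assuming failure times are recorded as test hours since the start of testing (the data and function names are invented for illustration):

```python
from collections import Counter

def times_between_failures(failure_times):
    """Time-between-failures view: gap before each consecutive failure."""
    ordered = sorted(failure_times)
    previous = [0.0] + ordered[:-1]
    return [t - p for t, p in zip(ordered, previous)]

def fault_counts(failure_times, interval, n_intervals):
    """Fault-count view: number of failures in each fixed-length interval."""
    counts = Counter(int(t // interval) for t in failure_times)
    return [counts.get(k, 0) for k in range(n_intervals)]

# Hypothetical failure times in test hours (not real project data).
failure_hours = [2, 5, 9, 16, 27, 45, 70]

gaps = times_between_failures(failure_hours)
per_interval = fault_counts(failure_hours, interval=40, n_intervals=2)
print(gaps)
print(per_interval)
```

In this made-up log the gaps get longer and the count per 40-hour interval drops, which is the pattern each class of model expects as reliability grows.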
General Reliability
• Basic Statistics (Kiemele & Schmidt) Chap. 10
– Reliability is a measure of the likelihood that
a product (or system) will operate without
failure for a stated period of time (time t).
– Normal Failure Function f(t) is usually either
Bell Shaped, or Exponential
– Probability is determined by the area under the
Failure Function curve
– Reliability is a probability, and therefore is
always a number between 0 and 1
General Reliability
– Example of a Failure Function curve: Fig 10.1
p. 10-3 (Kiemele & Schmidt)
– Hazard Function:
• Rate of failure at an instant of time, or instantaneous
failure rate
• Time between i-1’st failure and i'th failure
h(t) = f(t) / R(t)
• Hazard function equals the failure function over the
reliability function.
• Fig. 10.2 p. 10-4, and fig. 10.3 p. 10-5 (Kiemele &
Schmidt)
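To make the relationship between the failure function, reliability, and hazard concrete, here is a small sketch for a bell-shaped failure function. It assumes the standard relation R(t) = 1 - F(t), where F(t) is the area under f(t) up to time t; the mean and standard deviation are hypothetical and are not taken from Kiemele & Schmidt.

```python
from statistics import NormalDist

# Hypothetical bell-shaped failure function: failures centered at
# 10 time units with a standard deviation of 2 (invented values).
failure_dist = NormalDist(mu=10.0, sigma=2.0)

def reliability(t):
    """R(t): probability of operating without failure up to time t."""
    return 1.0 - failure_dist.cdf(t)

def hazard(t):
    """h(t) = f(t) / R(t): instantaneous failure rate at time t."""
    return failure_dist.pdf(t) / reliability(t)

for t in (8.0, 10.0, 12.0):
    print(f"t={t:4.1f}  R(t)={reliability(t):.3f}  h(t)={hazard(t):.3f}")
```

For a bell-shaped failure function the hazard rises with time, while reliability stays between 0 and 1 and falls toward 0.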
Exponential Distribution
• Another special case of Weibull, where m=1
• Best used for statistical processes that decline
monotonically (steadily, without reversing direction) to an asymptote
(a limit that is approached but never reached).
(Cumulative Distribution Function) CDF: F(t) = 1 - e^{-(t/c)}
(Probability Density Function) PDF: f(t) = (1/c) e^{-(t/c)}
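The CDF and PDF above translate directly into code. The sketch below also computes the hazard as f(t) / R(t), with R(t) = 1 - F(t), to show that the exponential distribution has a constant hazard of 1/c. The value of c is an arbitrary illustration.

```python
import math

def exp_cdf(t, c):
    """F(t) = 1 - e^{-(t/c)}"""
    return 1.0 - math.exp(-t / c)

def exp_pdf(t, c):
    """f(t) = (1/c) e^{-(t/c)}"""
    return math.exp(-t / c) / c

def exp_hazard(t, c):
    """h(t) = f(t) / R(t), where R(t) = 1 - F(t)."""
    return exp_pdf(t, c) / (1.0 - exp_cdf(t, c))

C = 5.0  # hypothetical scale parameter, e.g. in weeks
hazards = [exp_hazard(t, C) for t in (1.0, 5.0, 20.0)]
print(hazards)
```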
Exponential Distribution
• Graph represents a standard distribution
curve (total area under curve equals 1)
• Fig 8.1 and 8.2 p. 199
• Key factors for exponential distributions
– Accurate input data
– Homogeneous (uniform) time
increments (e.g., defects per week), or
else normalized (e.g., defects per n
person hours)
– The more data points the better
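Normalization matters when test effort varies from week to week. A minimal sketch, assuming defects and person hours are logged per week (the numbers and the choice of 1000 person hours as the unit are invented for illustration):

```python
def defects_per_unit_effort(defects, person_hours, per=1000):
    """Normalize defect counts to defects per `per` person hours."""
    return [d * per / h for d, h in zip(defects, person_hours)]

# Hypothetical weekly data (not real project data).
weekly_defects = [30, 24, 12]
weekly_person_hours = [600, 800, 400]

normalized = defects_per_unit_effort(weekly_defects, weekly_person_hours)
print(normalized)
```

Here the raw count halves between weeks 2 and 3, but so does the test effort, so the normalized defect rate is unchanged.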
Jelinski-Moranda (J-M) Model
– Time between failures model
– Assumes:
• There are N software defects at the start of testing
• Failures occur purely at random
• All defects contribute equally to the cause of a failure
• Fix time is negligible
• The fix is perfect for each failure
• The failure rate improves equally for each fix
– Hazard function:
• Z(t_i) = φ[N - (i-1)]
• N = number of defects
• φ (phi) = proportionality constant
• The hazard rate decreases in increments of φ following the removal of each
defect, therefore the time between failures grows as defects are
removed
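A short sketch of the J-M hazard function shows the step-down behavior. It assumes, as the model does, that the time to the next failure is exponentially distributed, so the expected gap is 1 / Z(t_i). The values of N and phi are hypothetical.

```python
def jm_hazard(n_defects, phi, i):
    """J-M hazard rate while waiting for the i-th failure (i starts at 1)."""
    return phi * (n_defects - (i - 1))

def jm_expected_gap(n_defects, phi, i):
    """Expected time to the i-th failure under a constant hazard: 1 / Z(t_i)."""
    return 1.0 / jm_hazard(n_defects, phi, i)

N, PHI = 10, 0.02  # hypothetical defect count and proportionality constant

hazards = [jm_hazard(N, PHI, i) for i in range(1, N + 1)]
gaps = [jm_expected_gap(N, PHI, i) for i in range(1, N + 1)]
for i, (z, gap) in enumerate(zip(hazards, gaps), start=1):
    print(f"failure {i:2d}: hazard={z:.2f}  expected gap={gap:.1f}")
```

Each fix lowers the hazard by phi, so the expected gap before the next failure keeps growing.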
Littlewood (LW) Model
• Similar to J-M except it assumes that different
defects have different sizes, thereby contributing
unequally to failures
• Assumptions:
– Larger defects are found/fixed earlier
– Average defect size decreases as defects are removed
• The model is more accurate than J-M because
of the concept of ‘error size’
– Major path/function defects vs. minor path/function
defects (based on product usage)
Goel-Okumoto (G-O) Imperfect
Debugging Model
• Acknowledges that fixes can introduce
new defects
• Model designed to overcome limitations
of J-M model
• Hazard function:
• Z(t_i) = [N - p(i-1)]λ
• N = number of defects
• p = probability of imperfect debugging (potential for a bad fix)
• λ (lambda) = failure rate per defect
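The imperfect debugging hazard, Z(t_i) = [N - p(i-1)] times the failure rate per defect, can be compared with J-M in a few lines. Note that in this formula p = 1 reproduces the J-M hazard, so the sketch treats p as the share of fixes that actually remove a defect; that reading, and all parameter values, are assumptions made for illustration.

```python
def go_imperfect_hazard(n_defects, p, lam, i):
    """Hazard rate while waiting for the i-th failure: [N - p(i-1)] * lambda."""
    return (n_defects - p * (i - 1)) * lam

N, LAM = 10, 0.02  # hypothetical defect count and failure rate per defect

# With p = 1 every fix counts fully, matching the J-M step-down.
perfect = [go_imperfect_hazard(N, 1.0, LAM, i) for i in range(1, 6)]
# With p = 0.8 each fix lowers the hazard by less, so reliability grows slower.
imperfect = [go_imperfect_hazard(N, 0.8, LAM, i) for i in range(1, 6)]

print(perfect)
print(imperfect)
```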
Goel-Okumoto Nonhomogeneous
Poisson Process (NHPP) Model
• Models the number of failures observed in a
given test interval
• Assumption: Cumulative number of failures in
time t, N(t), can be modeled as a nonhomogeneous
Poisson process
• The time-dependent failure rate follows an
exponential distribution and therefore the NHPP is
an application of the exponential model
• Model estimates the cumulative number of failures
at a specific time t
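The G-O NHPP model is usually written with the mean value function m(t) = a(1 - e^{-bt}) and failure intensity a*b*e^{-bt}, where a is the expected total number of failures and b is the detection rate per failure. The slide does not give these symbols, so the notation and the parameter values below are assumptions for illustration.

```python
import math

def go_nhpp_mean(t, a, b):
    """Expected cumulative number of failures observed by time t."""
    return a * (1.0 - math.exp(-b * t))

def go_nhpp_intensity(t, a, b):
    """Failure intensity (expected failures per unit time) at time t."""
    return a * b * math.exp(-b * t)

A, B = 100.0, 0.1  # hypothetical: 100 expected failures, rate 0.1 per week

for week in (0, 5, 10, 20, 40):
    print(week, round(go_nhpp_mean(week, A, B), 1),
          round(go_nhpp_intensity(week, A, B), 2))
```

The cumulative count rises toward a while the intensity decays exponentially, which is why this model is treated as an application of the exponential model.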
Musa-Okumoto (M-O) Logarithmic
Poisson Execution Time Model
• Similar to NHPP in that it models the
number of failures at a specific time t
• Mean value function takes into account
that later fixes have a smaller effect on
the software’s reliability
• A more appropriate model for systems
which have varying use of functions
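The M-O mean value function is commonly written as mu(t) = (1/theta) ln(lambda0 * theta * t + 1), where lambda0 is the initial failure intensity and theta is the rate at which the intensity decays per failure experienced. The slide does not show the formula, so the notation and the values below are assumptions for illustration.

```python
import math

def mo_mean_failures(t, lambda0, theta):
    """Expected cumulative failures after execution time t."""
    return math.log(lambda0 * theta * t + 1.0) / theta

def mo_intensity(t, lambda0, theta):
    """Failure intensity at execution time t."""
    return lambda0 / (lambda0 * theta * t + 1.0)

LAMBDA0, THETA = 10.0, 0.05  # hypothetical parameter values

first_block = mo_mean_failures(10, LAMBDA0, THETA) - mo_mean_failures(0, LAMBDA0, THETA)
second_block = mo_mean_failures(20, LAMBDA0, THETA) - mo_mean_failures(10, LAMBDA0, THETA)
print(first_block, second_block)
```

Failures expected in the second block of execution time are fewer than in the first, and the intensity falls off as lambda0 * e^{-theta * mu(t)}, which is how the model expresses that later fixes buy less improvement.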
Delayed S and Inflection S Models
• Delayed S Model
– Based on NHPP
– Acknowledges that there is a delay between
defect detection and defect isolation
– The observed growth curve of the cumulative
number of defects is S-shaped
• Inflection S Model
– Assumes a phenomenon where the more defects
detected, the more undetected defects become
detectable
– Fig. 8.3 p. 206
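The two S-shaped curves can be compared with the plain exponential curve in a few lines. The forms used below are the commonly cited ones: K[1 - (1 + λt)e^{-λt}] for delayed S and K(1 - e^{-λt}) / (1 + i*e^{-λt}) for inflection S, with K the total number of defects, λ the detection rate, and i the inflection factor. These symbols and values are assumptions, since the slide does not give the equations.

```python
import math

def exponential_mean(t, k, lam):
    """Cumulative defects under the plain exponential model."""
    return k * (1.0 - math.exp(-lam * t))

def delayed_s_mean(t, k, lam):
    """Cumulative defects under the delayed S model."""
    return k * (1.0 - (1.0 + lam * t) * math.exp(-lam * t))

def inflection_s_mean(t, k, lam, i):
    """Cumulative defects under the inflection S model (i = inflection factor)."""
    return k * (1.0 - math.exp(-lam * t)) / (1.0 + i * math.exp(-lam * t))

K, LAM, INFL = 100.0, 0.5, 5.0  # hypothetical parameter values

for week in range(0, 13, 2):
    print(week,
          round(exponential_mean(week, K, LAM), 1),
          round(delayed_s_mean(week, K, LAM), 1),
          round(inflection_s_mean(week, K, LAM, INFL), 1))
```

Both S curves start more slowly than the exponential curve and then catch up toward the same total K; with an inflection factor of 0 the inflection S model reduces to the exponential model.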
Model Assumptions
• Time Between Failures Models
1. N unknown software defects at the start of
testing - undisputed
2. Failures occur randomly (times between
failures are independent) - not always the case,
focused testing may result following detection
of a defect and cause failures to become closer
together
Note: Assumption 2 is used in all Time Between
Failures Models
Model Assumptions
• Time Between Failures Models
3. All faults contribute equally to cause a
failure - major function (major code
path) defects contribute more to
failures
4. Fix time is negligible - some fixes
require significant cycle time
5. Fix is perfect for each new failure - often fixes introduce new defects,
or fail to correct the failure
Model Assumptions
• Fault Count Models
1. Testing intervals are independent of each other
- i.e., weeks of testing are treated as independent
2. Testing during intervals is reasonably
homogeneous (uniform) - testing can be
normalized using person hours or other
measures
3. Numbers of defects detected in each interval are
independent - i.e., defects per week of testing are
independent of other weeks
Model Assumptions Summary
• The physical process to be modeled in SWE is the
‘Software Failure Phenomenon’
• Physical processes being statistically modeled are
often not precise
• Unambiguous statements of the underlying
assumptions are necessary in the development of a
model
• The performance and accuracy of the model are
linked to the degree to which the underlying
assumptions are met
Criteria for Model Evaluation
1. Predictive Validity - capability of model to
(accurately) predict future failure behavior
2. Capability - ability of model to estimate quantities
of process change
3. Quality of Assumptions - likelihood that model
assumptions can be met
4. Applicability - degree of applicability to SW
development process/product
5. Simplicity - inexpensive data collection, simple in
concept (understandable), supported by SW tools
Modeling Process
1. Examine the data - study nature of data,
identify units of analysis (days, weeks, etc.),
plot the data in a table or scatter diagram,
observe trends and fluctuations
2. Select model(s) - pick a model that seems
appropriate from the initial analysis of step
1 and also from previous experience. For
example if data shows a decreasing trend,
then exponential models are appropriate
Modeling Process
3. Estimate the parameters of the model - use
statistical techniques and/or statistical tools
to estimate the parameters
4. Obtain the fitted model - plug parameter
estimates into the model to get PDF and
CDF
5. Perform goodness-of-fit test - compare
model estimates to observed data,
determine if model is performing within
acceptable limits
Modeling Process
6. Make reliability predictions - use historical
data for model calibration, use cross-model
comparison for reliability assessment, use
model to make reliability predictions based
on the fitted model
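The six steps can be walked through end to end with a small, library-free sketch. It fits a cumulative exponential curve K(1 - e^{-t/c}) by least squares, using a coarse grid search over c. Several things here are assumptions rather than the course's method: the data are synthetic (generated from K=200, c=4 and rounded), the grid search stands in for a proper statistical tool, and R-squared is used as a simple goodness-of-fit measure.

```python
import math

def fit_exponential(times, cumulative):
    """Least-squares fit of K * (1 - exp(-t / c)) to cumulative defect counts.

    For each candidate c on a coarse grid the best K has a closed form;
    the (K, c) pair with the smallest squared error is kept.
    """
    best = None
    for step in range(1, 2001):
        c = step * 0.01  # candidate scale parameter: 0.01 to 20.00 weeks
        g = [1.0 - math.exp(-t / c) for t in times]
        k = sum(y * gi for y, gi in zip(cumulative, g)) / sum(gi * gi for gi in g)
        sse = sum((y - k * gi) ** 2 for y, gi in zip(cumulative, g))
        if best is None or sse < best[0]:
            best = (sse, k, c)
    sse, k, c = best
    return k, c, sse

def r_squared(observed, sse):
    """Share of the variance in the observations explained by the fit."""
    mean = sum(observed) / len(observed)
    return 1.0 - sse / sum((y - mean) ** 2 for y in observed)

# Steps 1-2: synthetic weekly cumulative defect counts with a flattening
# trend, so an exponential model is a reasonable candidate.
weeks = list(range(1, 13))
observed = [round(200 * (1 - math.exp(-t / 4))) for t in weeks]

# Steps 3-4: estimate the parameters and obtain the fitted model.
k_hat, c_hat, sse = fit_exponential(weeks, observed)

# Step 5: goodness of fit.
fit_quality = r_squared(observed, sse)

# Step 6: predictions from the fitted model.
projected_week_16 = k_hat * (1 - math.exp(-16 / c_hat))
estimated_remaining = k_hat - observed[-1]

print(f"K={k_hat:.1f}  c={c_hat:.2f}  R^2={fit_quality:.4f}")
print(f"projected cumulative defects at week 16: {projected_week_16:.1f}")
print(f"estimated defects remaining after week 12: {estimated_remaining:.1f}")
```

In practice the parameter estimation in step 3 would usually be done with a statistics package, and the fit would be checked against historical projects before the predictions are trusted.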
Test Compression Factor
• The main goal of testing is to find defects
• Customer usage should be less vigorous and
comprehensive than testing
• Defect arrival during testing should be higher than
defect arrival during field use
• The actual difference between test and field defect
arrival patterns is called Compression Factor
(testing tends to compress the defect arrival rates)
• Historical data is a good estimator of Compression
Factor
• Software maintenance planning should consider
Compression Factors
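One simple way to put a number on the compression factor is as a ratio of arrival rates. The slide only describes it as the difference between test and field arrival patterns, so expressing it as test rate divided by field rate is an assumption, as are the figures below.

```python
def compression_factor(test_rate, field_rate):
    """Ratio of defect arrival rate in test to arrival rate in the field."""
    return test_rate / field_rate

def projected_field_rate(test_rate, factor):
    """Field arrival rate implied by a test rate and a historical factor."""
    return test_rate / factor

# Hypothetical historical release: 12 defects/week in test, 3 in the field.
historical_factor = compression_factor(test_rate=12.0, field_rate=3.0)

# Hypothetical new release showing 20 defects/week at the end of testing.
expected_field = projected_field_rate(test_rate=20.0, factor=historical_factor)

print(historical_factor, expected_field)
```

A projection like this is only as good as the historical data behind the factor, which is why it is best treated as a planning input for maintenance staffing rather than a precise forecast.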
Models Summary
• Exponential model is the simplest and most
widely used model in SWE
• Models work best for back-end processes, i.e., for
formal testing phases
• Models fit into two categories
• Time Between Failures Models
• Fault Count Models
• Testing intervals must be uniform
• Key Factors for SW Reliability Model Use:
• Correct model to match process attributes
• Degree to which the model's assumptions are met
• Accuracy of test data