NCPO Cost Research Review - Center for Software Engineering


COCOMO II Integrated with
Crystal Ball® Risk Analysis Software
Clate Stansbury
MCR, LLC
(703) 506-4600
Prepared for
19th International Forum on COCOMO Software Cost Modeling
University of Southern California
Los Angeles CA
27 October 2004
1
Contents
• Purpose: Describing Uncertainty
• Representing Uncertain Inputs
• Simulating Costs
• Correlating Inputs and Costs
• Summary
2
Traditional “Roll-Up” Method (Too Simple)
• Define “Best Estimate” of Each Cost Element to be
the Most Likely Cost of that Element
• List Cost Elements in a Work-Breakdown Structure
(WBS)
– Calculate “Best Estimate” of Cost for Each Element
– Sum All Best Estimates
– Define Result to be “Best Estimate” of Total Project
Cost
• Two Problems With Roll-up Method
1. Ignores Uncertainty: Only Outputs a Point Estimate
2. Estimate is Too Low (We’ll Discuss Later)
3
Estimators Must Describe Uncertainty
• Report Cost As a Statistical Quantity, Not a Point
– Cost of Any Incomplete Program Is Uncertain
– Estimator Must Report That Uncertainty as Part of His
or Her Delivered Estimate
• Cost-risk Analysis Allows Estimator to Report Cost
As a Probability Distribution, So Decision-maker Is
Made Aware of
– Expected Cost (Mean)
– 50th Percentile Cost (Median)
– 80th Percentile Cost
– Overrun Probability of Project Budget
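The four quantities above can be read directly off a set of simulated total costs. The following is a minimal Python sketch, assuming NumPy is available; the function name `summarize_cost`, the triangular demo data, and the budget of 600 are illustrative assumptions, not values from this briefing.

```python
import numpy as np

def summarize_cost(samples, budget):
    """Summarize simulated total costs as a decision-maker would want them."""
    samples = np.asarray(samples, dtype=float)
    return {
        "mean": float(samples.mean()),                  # expected cost
        "median": float(np.percentile(samples, 50)),    # 50th percentile cost
        "p80": float(np.percentile(samples, 80)),       # 80th percentile cost
        # share of trials in which cost exceeds the budget
        "overrun_probability": float((samples > budget).mean()),
    }

# Hypothetical stand-in for simulation output: 10,000 draws from one triangle.
rng = np.random.default_rng(0)
demo_costs = rng.triangular(450.0, 580.0, 800.0, size=10_000)
report = summarize_cost(demo_costs, budget=600.0)
```

In practice `demo_costs` would be replaced by the trial-by-trial totals produced by the cost simulation.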
4
What a Cost Estimate Should Look Like
(Crystal Ball Outputs)

Percentile    Value
0%            450.19
10%           516.81
20%           538.98
30%           557.85
40%           575.48
50%           592.72
60%           609.70
70%           629.19
80%           650.97
90%           683.01
100%          796.68

Statistics            Value
Trials                10,000
Mean                  596.40
Median                592.72
Mode                  --
Standard Deviation    63.18
Range Minimum         450.19
Range Maximum         796.68

[Figure: Forecast A8, Cumulative Chart (“S-Curve”), 10,000 trials, 71 outliers; cost axis from 462.43 to 761.35]
[Figure: Forecast A8, Frequency Chart (“Density Curve”), 10,000 trials, 71 outliers; cost axis from 462.43 to 761.35]
5
Representing Uncertain Inputs Using
Triangular Distributions
6
Triangular Distribution of Element Cost,
Reflecting Uncertainty in “Best” Estimate
[Figure: triangular probability density over cost ($), marked at L (Optimistic Cost), M (Best-Estimate Cost; Mode = Most Likely), and H (Cost Implication of Technical, Programmatic Assessment)]
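For reference, the density and mean of a triangular distribution follow from its three parameters alone. The sketch below uses the standard textbook formulas; the function names and the example values (90, 100, 140) are hypothetical and chosen only to show that a right-skewed triangle has a mean above its most likely value.

```python
def triangular_pdf(x, low, mode, high):
    """Density of a triangular distribution (requires low < high, low <= mode <= high)."""
    if x < low or x > high:
        return 0.0
    if x == mode:
        return 2.0 / (high - low)          # height of the peak
    if x < mode:
        return 2.0 * (x - low) / ((high - low) * (mode - low))
    return 2.0 * (high - x) / ((high - low) * (high - mode))

def triangular_mean(low, mode, high):
    """Mean of a triangular distribution: the average of its three parameters."""
    return (low + mode + high) / 3.0

# Hypothetical element cost: optimistic 90, most likely 100, pessimistic 140.
L, M, H = 90.0, 100.0, 140.0
peak = triangular_pdf(M, L, M, H)
mean = triangular_mean(L, M, H)   # exceeds M because the triangle is skewed right
```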
7
COCOMO Cost Drivers as Triangular
Distributions
Why triangular distribution?
• Triangular Distribution is Simple and Malleable
• Parameters (Optimistic, Most Likely, Pessimistic) Are
Easy to Define and Explain
• Could Have User Provide Parameters for Normal,
Lognormal, Exponential, Uniform, or Beta Distributions,
for Example, if More is Known About the Distributions
• Good Topic for Further Research
8
COCOMO Cost Drivers as Triangular
Distributions
• For Each COCOMO II Input …
– Input Request Interpreted as a Triangular Distribution
– User Estimates Optimistic, Most Likely, and Pessimistic Values
(which may not always be all different from each other)
[Figure: triangular probability density over cost, marked at Optimistic, Most Likely (mode), and Pessimistic, which bound the Range of Realistic Input Values]
User provides three values for each COCOMO II input,
as though there were three separate projects.
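Turning the three user-supplied values into random draws might look like the sketch below, which assumes NumPy's `Generator.triangular`. The helper name `sample_input` and the example values are hypothetical. The special case matters because the three values may not all differ: NumPy rejects a triangle whose optimistic and pessimistic values coincide, so that input is treated as a constant.

```python
import numpy as np

def sample_input(rng, optimistic, most_likely, pessimistic, trials):
    """Draw `trials` values for one input from its (O, M, P) triangle."""
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    if optimistic == pessimistic:
        # All three values coincide: the input is certain, not random.
        return np.full(trials, float(most_likely))
    return rng.triangular(optimistic, most_likely, pessimistic, size=trials)

rng = np.random.default_rng(1)
# Hypothetical effort-multiplier input with three distinct values.
uncertain = sample_input(rng, 0.90, 1.00, 1.14, trials=1_000)
# Hypothetical input the user is sure about.
certain = sample_input(rng, 1.00, 1.00, 1.00, trials=5)
```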
9
COCOMO Cost Drivers as Triangular
Distributions
[Figure: illustration for this slide; only the values 0.90 and 1.14 are recoverable]
10
Processing Uncertainty Using
Simulations
11
How to Process Triangular Distributions?
• How to Take the Product of Effort Multipliers When Each
EM is a Triangular Distribution?
• How to Sum Code Counts for All CSCIs?
• How to Compute Rest of COCOMO II Algorithm?
12
Process Optimistic, ML, Pessimistic as 3
Separate Projects (Too Simple)
• Perform “Roll-up” Method Three Times
– Input Optimistic Values into COCOMO II
– Input Most Likely Values into COCOMO II
– Input Pessimistic Values into COCOMO II
• Obtain Total Project Effort as a Triangular
Distribution
13
Why “Roll-up” Doesn’t Work
[Figure: WBS-element triangular input distributions, each marked at its Most Likely cost, are merged into a total-cost distribution. Two points are marked on the total-cost distribution: the roll-up to most likely total cost and the real most likely total cost.]
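The gap between the roll-up and the real total can be reproduced with a small simulation. The sketch below assumes ten identical, independent, right-skewed cost elements with hypothetical values (90, 100, 150); none of these numbers come from the briefing.

```python
import numpy as np

rng = np.random.default_rng(7)
trials = 10_000

# Hypothetical WBS: ten elements, each (optimistic, most likely, pessimistic).
elements = [(90.0, 100.0, 150.0)] * 10

# Roll-up: add the most likely values of the elements.
roll_up = sum(mode for _, mode, _ in elements)

# Simulation: add one random draw per element in every trial.
totals = sum(rng.triangular(lo, mode, hi, size=trials) for lo, mode, hi in elements)

# Share of trials in which the actual total stays at or below the roll-up.
prob_within_roll_up = float((totals <= roll_up).mean())
```

Under these assumptions the mean of each element is (90 + 100 + 150) / 3, so the expected total is about 1133 while the roll-up is 1000, and `prob_within_roll_up` shows how rarely the simulated total comes in at or under the roll-up.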
14
Use Monte Carlo Simulation to Process
the Input Triangular Distributions
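[Figure: in each trial (Trial 1, Trial 2, ..., Trial 10,000) a value is drawn for each Assumption cell (e.g., Cell G5), and the Total Cost Forecast cell, =SUM($G$4:$G$8), is recomputed]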
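The same trial-by-trial idea can be applied to the COCOMO II effort equation itself, which answers the earlier questions about multiplying effort multipliers and summing CSCI code counts: both are done within each trial. The sketch below is not the Crystal Ball spreadsheet. It assumes the published COCOMO II.2000 form PM = A × Size^E × ΠEM with E = B + 0.01 × ΣSF, A = 2.94 and B = 0.91, holds the scale factors fixed at an assumed sum of 18.97, and uses hypothetical CSCI sizes and effort-multiplier ranges.

```python
import numpy as np

A, B = 2.94, 0.91  # COCOMO II.2000 calibration constants

def cocomo_effort(ksloc, scale_factor_sum, effort_multipliers):
    """Person-months; effort_multipliers has one row per multiplier."""
    exponent = B + 0.01 * scale_factor_sum
    return A * ksloc ** exponent * np.prod(effort_multipliers, axis=0)

rng = np.random.default_rng(42)
trials = 10_000

# Hypothetical (optimistic, most likely, pessimistic) sizes in KSLOC, one per CSCI.
csci_sizes = [(20.0, 25.0, 40.0), (8.0, 10.0, 18.0)]
# Hypothetical (O, M, P) ranges for two effort multipliers.
multipliers = [(0.90, 1.00, 1.14), (0.85, 1.00, 1.30)]

# Per trial: sum the CSCI sizes, then multiply the sampled effort multipliers.
size = sum(rng.triangular(lo, ml, hi, size=trials) for lo, ml, hi in csci_sizes)
ems = np.array([rng.triangular(lo, ml, hi, size=trials) for lo, ml, hi in multipliers])

effort = cocomo_effort(size, 18.97, ems)     # one effort value per trial
p50, p80 = np.percentile(effort, [50, 80])
```

Sampling the scale factors as well would follow the same pattern; they are fixed here only to keep the sketch short.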
15
Crystal Ball Risk- Analysis Software
• Commercially Available Third-Party Software Add-on
to Excel, Marketed by Decisioneering, Inc., 2530 S.
Parker Road, Suite 220, Aurora, CO 80014, (800)
289-2550
• Inputs
– Parameters Defining WBS-Element Distributions
– Rank Correlations Among WBS-Element Cost Distributions
• Mathematics
– Monte-Carlo (Random) or Latin Hypercube (Stratified)
Statistical Sampling
– Virtually All Probability Distributions That Have Names Can
Be Used
– Suggests Adjustments to Inconsistent Input Correlation Matrix
• Outputs
– Percentiles and Other Statistics of Program Cost
– Cost Probability Density and Cumulative Distribution Graphics
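The difference between the two sampling options can be illustrated outside Crystal Ball. The sketch below is a generic one-dimensional Latin Hypercube sampler, not Crystal Ball's implementation: it splits [0, 1) into as many equal strata as there are trials, draws one uniform value per stratum, and maps the values through the inverse CDF of a triangular distribution. The function names and the example triangle (90, 100, 140) are hypothetical.

```python
import numpy as np

def latin_hypercube_uniform(rng, trials):
    """One uniform draw from each of `trials` equal-width strata, in random order."""
    return (rng.permutation(trials) + rng.random(trials)) / trials

def triangular_ppf(u, low, mode, high):
    """Inverse CDF of a triangular distribution (requires low < high)."""
    u = np.asarray(u, dtype=float)
    cut = (mode - low) / (high - low)          # CDF value at the mode
    left = low + np.sqrt(u * (high - low) * (mode - low))
    right = high - np.sqrt((1.0 - u) * (high - low) * (high - mode))
    return np.where(u < cut, left, right)

rng = np.random.default_rng(5)
trials = 1_000
u = latin_hypercube_uniform(rng, trials)
stratified_costs = triangular_ppf(u, 90.0, 100.0, 140.0)
```

Because every stratum is hit exactly once, the tails of the distribution are covered even with a modest number of trials, which plain random sampling does not guarantee.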
16
Representing Correlations Among Risks
17
Risks are Correlated
• Resolving One Cost Driver’s Risk Issues by Spending
More Money Often Involves Increasing Values of
Several Other Drivers as Well
– For Example, if Inputs are Sampled Independently, the Monte Carlo Simulation Could
Generate a High RELY Value and a Low DOCU Value for the Same Trial, Which Doesn’t Make Any Sense
– Schedule Slippage Due to Problems in One CSCI Leads to Cost Growth
and Schedule Slippage in Other CSCIs
• As We Will Soon See, Correlation Tends to Increase
the Variance of the Total-Cost Probability Distribution
• Numerical Values of Correlations are Difficult to
Estimate, but That’s Another Story
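One common way to make sampled inputs move together is to reorder independent samples so that their ranks follow correlated normal scores. The sketch below illustrates that rank-reordering idea in general; it is not claimed to be the algorithm Crystal Ball uses. The function names, the RELY and DOCU ranges, and the target correlation of 0.8 are hypothetical.

```python
import numpy as np

def impose_rank_correlation(rng, samples, corr):
    """Reorder each column of `samples` so ranks follow correlated normal scores."""
    samples = np.asarray(samples, dtype=float)
    trials, k = samples.shape
    # Correlated standard normals with correlation matrix `corr`.
    z = rng.standard_normal((trials, k)) @ np.linalg.cholesky(corr).T
    ranks = z.argsort(axis=0).argsort(axis=0)        # rank of each score per column
    # Place the r-th smallest sample wherever the score has rank r.
    return np.take_along_axis(np.sort(samples, axis=0), ranks, axis=0)

def spearman(x, y):
    """Spearman rank correlation (continuous data, so no ties are expected)."""
    rx, ry = np.argsort(np.argsort(x)), np.argsort(np.argsort(y))
    return float(np.corrcoef(rx, ry)[0, 1])

rng = np.random.default_rng(3)
trials = 10_000
independent = np.column_stack([
    rng.triangular(0.92, 1.00, 1.26, size=trials),   # hypothetical RELY range
    rng.triangular(0.91, 1.00, 1.23, size=trials),   # hypothetical DOCU range
])
target = np.array([[1.0, 0.8], [0.8, 1.0]])
correlated = impose_rank_correlation(rng, independent, target)
rho = spearman(correlated[:, 0], correlated[:, 1])
```

Each column keeps exactly the same values, so the triangular shapes are untouched; only the pairing across columns changes. The achieved rank correlation comes out slightly below the target applied to the normal scores.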
18
Maximum Possible Underestimation
of Total-Cost Sigma
• Percent Underestimated σ When Correlation
Assumed to be 0 Instead of r (n=# of Input Values)
[Figure: Percent Underestimated (0 to 100) versus Actual Correlation (0 to 1), with one curve each for n = 10, n = 30, n = 100, and n = 1000]
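Curves of this kind can be regenerated from a closed form, assuming the chart uses the equal-variance, equal-correlation case treated in the backup slide "Correlation Matters". In that case the true sigma is larger than the zero-correlation sigma by a factor of sqrt(1 + (n − 1)r). The function name below is hypothetical.

```python
import math

def percent_sigma_underestimated(n, r):
    """Percent by which total-cost sigma is understated when r is taken as 0.

    Assumes n elements with equal variance and a common pairwise correlation r.
    """
    inflation = math.sqrt(1.0 + (n - 1) * r)   # true sigma / zero-correlation sigma
    return 100.0 * (1.0 - 1.0 / inflation)

# Underestimation at r = 0.2 for the four values of n shown on the chart.
at_default = {n: percent_sigma_underestimated(n, 0.2) for n in (10, 30, 100, 1000)}
```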
19
Determining Correlations Among
COCOMO II Cost Drivers
• Default Correlations to 0.2
• More Detailed Default Correlations?
– Higher Correlation Between RELY and DOCU?
– COCOMO II Security Extension Cost Driver Related to
Existing Cost Drivers
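A default matrix with selected overrides could be built as in the sketch below, which also checks that the result is a consistent correlation matrix (symmetric, with no negative eigenvalues). The function names, the chosen drivers, and the RELY/DOCU override of 0.6 are illustrative assumptions; the briefing proposes only the 0.2 default.

```python
import numpy as np

def default_correlation_matrix(names, r=0.2, overrides=None):
    """Correlation matrix with `r` off the diagonal, plus optional pair overrides."""
    n = len(names)
    corr = np.full((n, n), float(r))
    np.fill_diagonal(corr, 1.0)
    for (a, b), value in (overrides or {}).items():
        i, j = names.index(a), names.index(b)
        corr[i, j] = corr[j, i] = value        # keep the matrix symmetric
    return corr

def is_consistent(corr, tol=1e-10):
    """True if the matrix is symmetric and positive semidefinite."""
    return bool(np.allclose(corr, corr.T) and np.linalg.eigvalsh(corr).min() >= -tol)

drivers = ["RELY", "DOCU", "CPLX", "TIME"]
corr = default_correlation_matrix(drivers, overrides={("RELY", "DOCU"): 0.6})
ok = is_consistent(corr)
```

A set of pairwise values that cannot all hold at once, such as two strong positive correlations combined with a strong negative one among three drivers, fails the check and would need adjusting before sampling.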
20
Summary
• Estimator Must Model Uncertainty
• Describe Uncertainty by Representing COCOMO
Inputs as Triangular Distributions
• Calculate Implications of Uncertainty by Using
Monte Carlo or Latin Hypercube Simulations to
Perform COCOMO II Algorithm
• Consider Correlation Among CSCI Risks and Costs
• Professional Software, e.g., Crystal Ball, is
Available to do Computations
21
Acronyms
AA        Assessment and Assimilation
AT        Automatically Translated Code
CB        Crystal Ball
CM        Percent of Code Modified
COCOMO    Constructive Cost Model
CSCI      Computer Software Configuration Item
DM        Percent of Design Modified
EI        External Input
EIF       External Interface File
EO        External Output
EQ        External Inquiry
ILF       Internal Logical File
IM        Effort for Integration
KSLOC     Thousands of Source Lines of Code
MS        Microsoft
O,M,P     Optimistic, Most Likely, Pessimistic
SCED      Schedule Compression/Expansion Rating
SLOC      Source Lines of Code
SU        Software Understanding
UFP       Unadjusted Function Point
UNFM      Programmer Unfamiliarity Rating
USC       University of Southern California
WBS       Work Breakdown Structure
22
Backup Slides
23
Correlation Matters
• Suppose for Simplicity
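– There are n Cost Elements $C_1, C_2, \ldots, C_n$
– Each $\mathrm{Var}(C_i) = \sigma^2$
– Each $\mathrm{Corr}(C_i, C_j) = r < 1$
– Total Cost $C = \sum_{k=1}^{n} C_k$
•
\[
\begin{aligned}
\mathrm{Var}(C) &= \sum_{k=1}^{n} \mathrm{Var}(C_k) + 2r \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \sqrt{\mathrm{Var}(C_i)\,\mathrm{Var}(C_j)} \\
&= n\sigma^2 + n(n-1)\,r\sigma^2 \\
&= n\sigma^2 \bigl(1 + (n-1)r\bigr)
\end{aligned}
\]

Correlation    Var(C)
0              $n\sigma^2$
r              $n\sigma^2 (1 + (n-1)r)$
1              $n^2\sigma^2$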
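The closed form on this slide can be checked numerically: the variance of a sum equals the sum of every entry of the covariance matrix. The sketch below builds that matrix for n elements with a common standard deviation and a common correlation and compares the two results; the function names and test values are hypothetical.

```python
import numpy as np

def total_variance_from_matrix(n, sigma, r):
    """Var(C) as the sum of all entries of the covariance matrix."""
    corr = np.full((n, n), float(r))
    np.fill_diagonal(corr, 1.0)
    covariance = sigma ** 2 * corr
    return float(covariance.sum())

def total_variance_closed_form(n, sigma, r):
    """Var(C) = n * sigma^2 * (1 + (n - 1) * r)."""
    return n * sigma ** 2 * (1.0 + (n - 1) * r)

# Hypothetical case: 10 elements, sigma = 5, pairwise correlation 0.2.
from_matrix = total_variance_from_matrix(10, 5.0, 0.2)
from_formula = total_variance_closed_form(10, 5.0, 0.2)
```

Setting r to 0 or 1 in either function reproduces the two limiting rows of the table.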
24
Correlation Matrices Allow User to Adjust
Correlations
• One Matrix for Each CSCI Allows Estimator to Set
Correlations Among Cost Drivers for that CSCI
– How to Record Inter-CSCI Information?
• One Matrix for All Inputs in All CSCIs
– Difficult for User (and Developer!) to Manipulate
• One Matrix for Project with which the Estimator
Sets Correlations Among the Efforts of the CSCIs
– But CSCI Costs are Not Inputs (aka Assumptions). Only
Inputs Can Be Correlated
25
Selection of Correlation Values
• “Ignoring” Correlation Issue is Equivalent to
Assuming that Risks are Uncorrelated, i.e., that All
Correlations are Zero
• Square of Correlation (namely, R²) Represents
Percentage of Variation in One WBS Element’s Cost
that is Attributable to Influence of Another’s
• Reasonable Choice of Nonzero Values Brings You
Closer to Truth
• Most Elements are, in Fact, Pairwise Correlated
• 0.2 is at “Knee” of Curve on Previous Charts, thereby
Providing Most of the Benefits at Least Commitment
Correlation    % Influenced
0.00           0%
0.10           1%
0.32           10%
0.50           25%
0.71           50%
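The "% Influenced" column is the squared correlation expressed as a whole-number percentage. A short sketch of that arithmetic, with a hypothetical function name:

```python
def percent_influenced(correlation):
    """Squared correlation as a whole-number percentage."""
    return round(100 * correlation * correlation)

# Recompute the column for the correlations listed in the table.
influence = {r: percent_influenced(r) for r in (0.00, 0.10, 0.32, 0.50, 0.71)}
```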
26
Cost-Risk Analysis Works by
Simulating System Cost
• In Engineering Work, Computer Simulation of System
Performance is Standard Practice, with Key Performance
Characteristics Modeled by Monte Carlo Analysis as
Random Variables, e.g.
– Data Throughput
– Time to Lock
– Time Between Data Receipt and Delivery
– Atmospheric Conditions
• Cost-Risk Analysis Enables the Cost Analyst to Conduct a
Computer Simulation of System Cost
– WBS-element Costs Are Modeled As Random Variables
– Total System Cost Distribution is Determined by Monte
Carlo Simulation
– Cost is Treated as a Performance Criterion
27
Traditional “Roll-Up” Method (Too
Simple)
• Define “Best Estimate” of Each Cost Element
to be the Most Likely Cost of that Element
• List Cost Elements in a Work-Breakdown
Structure (WBS)
– Calculate “Best Estimate” of Cost for Each
Element
– Sum All Best Estimates
– Define Result to be “Best Estimate” of Total
Project Cost
• Unfortunately, It Turns Out That Things are
Not as Simple as They Seem – There are a
Lot of Problems with This Approach
28