Erik Arisholm: Generalizing results through a series of

Generalization through a series
of replicated experiments on
maintainability
Erik Arisholm
An ideal controlled experiment …
• Relevant research question!
• Statistical generalization from a sufficiently large,
representative sample of systems, tasks, settings and subjects
to a well-defined and relevant target population of systems,
tasks, settings and subjects (statistical power and external
validity)
• Well defined independent and dependent variables that fully
and reliably measure the concepts under study (construct
validity and external validity)
• Equivalent treatment groups (e.g., random assignment and
avoiding post-assignment drop-outs) to ensure internal validity
• Supported by theory that can explain the observed effects for
all possible factor level combinations
14/11/2005
2
Experiments on Software Maintenance
• A series of five controlled experiments (which can be considered
one quasi-experiment), in which the subjects,
– 295 junior/intermediate/senior Java consultants from Norway,
Sweden and the UK, and
– 273 undergraduate/graduate students from Norway and Canada,
• performed maintenance tasks on two alternative designs of
the same Java system
• to assess the effects of (combinations of)
– control style (centralized vs delegated),
– maintenance task order (easy vs difficult task first),
– documentation (UML vs no UML), and
– development process (pair programming vs individual)
• on software maintainability (change effort and correctness)
Centralized vs Delegated Control Style
[Figure: object interaction diagrams of the two control styles (Object1 to Object5 exchanging numbered messages), and a chart relating Object-Oriented Design Method (Data Driven Design, Use-Case Driven Design, Role Modelling, Responsibility Driven Design) to Control Style (Centralized Control Style, Delegated Control Style)]
• The Centralized Control Style:
– Rebecca Wirfs-Brock: A centralized control style is characterized by single points of control interacting with many simple objects. To me, centralized control feels like a "procedural solution" cloaked in objects…
– Alistair Cockburn: Any oversight in the "mainframe" object (even a typo!) [in the centralized coffee-machine design] means potential damage to many modules, with endless testing and unpredictable bugs.
• The Delegated Control Style:
– Rebecca Wirfs-Brock: A delegated control style ideally has clusters of well defined responsibilities distributed among a number of objects. To me, a delegated control architecture feels like object design at its best…
– Alistair Cockburn: [The delegated coffee-machine design] is, I am happy to see, robust with respect to change, and it is a much more reasonable "model of the world."
The individual experiments
Exp 1: The Original Experiment with students and pen-and-paper tasks (Fall 1999)
– Effect of Centralized (bad) vs Delegated (good) Control Style
– Arisholm, Sjøberg & Jørgensen, "Assessing the Changeability of two Object-Oriented Design Alternatives - a Controlled Experiment," Empirical Software Engineering, vol. 6, no. 3, pp. 231-277, 2001.
– 36 undergraduate students and 12 graduate students
Exp 2: The Control-style experiment with professional Java developers and Java tools (Fall 2001 - Spring 2002)
– Effect of Centralized vs Delegated Control Style for Categories of Developer
– Arisholm & Sjøberg, "Evaluating the Effect of a Delegated versus Centralized Control Style on the Maintainability of Object-Oriented Software," IEEE Transactions on Software Engineering, vol. 30, no. 8, pp. 521-534, 2004.
– 99 professionals, 59 students
Exp 3: UML experiments (Spring 2003 - Fall 2004)
– Effect of UML (vs No UML) for the Delegated Control Style
– Arisholm, Briand, Hove & Labiche, "The Impact of UML Documentation on Software Maintenance: An Experimental Evaluation," Simula Technical Report 2005-14. Submitted to IEEE Transactions on Software Engineering, 2005.
– 20 students from UiO (Spring 2003) + 78 students from Carleton Univ., Canada (Fall 2004)
Exp 4: Task Order Experiment (Fall 2001 - Spring 2005)
– Effect of Task Order and Centralized vs Delegated Control Style
– Arisholm, Wang & Syrstad, "The impact of Task Order on Maintainability", in preparation
– Easy task first: 59 students from Exp 2 (2001-2002)
– Difficult task first: 66 students from NTNU (Spring 2005)
Exp 5: Pair programming experiment (Fall 2003 - Spring 2005)
– Effect of Pair Programming (vs individual programming) and Delegated vs Centralized Control Style
– Gallis, Arisholm, Dybå & Sjøberg, "Evaluating Pair Programming among Professional Java Developers", in preparation
– 196 professional programmers in Norway, Sweden, and UK (98 pairs)
– 99 individuals (from Exp 2)
A quasi-experiment of increasing scope
[Table: factor levels covered by each experiment, Exp1 to Exp5]
• Research question (effect of …)
– Control Style: Exp1, Exp2, Exp4, Exp5
– UML: Exp3
– Task Order: Exp4
– Pair Programming: Exp5
• Factors and levels
– System: Coffee-machine
– Dependent variables: Duration (minutes), Correctness (%)
– Control style: Centralized (CC), Delegated (DC)
– Documentation: No UML, Some UML, Complete UML
– Task order: Easy first, Difficult first
– Development tools: Java Pen & Paper, Java IDE, UML Tool
– Development process: Individual, Pair Programming
– Subjects: BSc-students, MSc-students, Juniors, Intermediates, Seniors
• Number of subjects
– Exp1: 36 BSc-students, 12 MSc-students
– Exp2: 27 BSc-students, 32 MSc-students, 31 Juniors, 32 Intermediates, 36 Seniors
– Exp3: 98 students
– Exp4: 66 students
– Exp5: 50 Juniors, 70 Intermediates, 76 Seniors
Summary of experimental results
• Performing change tasks on a delegated control style requires, on
average, more time and results in more defects than on a centralized
control style, in particular for novices (undergraduate students and
junior consultants)
– Only seniors seem to have the necessary skills to benefit from the more
"elegant" delegated control style.
– Explanation: Unlike experts, novices mentally trace the code in
order to understand it. This tracing is more difficult in a delegated
control style
– Results are consistent across pen-and-paper tasks and tasks performed with Java tools
– Results are consistent when the most difficult task is performed first
• Two ways to decrease the cognitive complexity of the delegated
control style (for novices in particular)
– Extensive UML documentation
– Pair programming
• Uncertain whether the results are valid for
– other systems and tasks (external validity)
– alternative ways of measuring maintainability (construct validity)
Dealing with non-equivalent groups
(to compare results across experiments)
Necessary to have a common pre-test to adjust for skill
differences between groups (using ANCOVA)*
Y = β0 + β1*x + β2*z
where Y is the post-test score, x is the pre-test score, and z is the treatment (z=1 or z=0)
[Figure: two scatter plots of Post-test (Y) against Pre-test (x), with points marked Z (treatment z=1) and O (treatment z=0); one panel shows the case β2 = 0, the other the case β2 ≠ 0]
*T.D. Cook, and D.T. Campbell (1979), Quasi-Experimentation: Design & Analysis Issues for Field Settings, Houghton Mifflin Company.
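The adjustment described on this slide can be illustrated with a minimal sketch. It fits Y = β0 + β1*x + β2*z by ordinary least squares on made-up, noise-free data in which the treated group (z=1) happens to have higher pre-test scores. The helper names (`solve_linear_system`, `fit_ancova`) and all numbers are assumptions for illustration only; they are not taken from the experiments, and a real analysis would use a statistics package and report uncertainty.

```python
# Illustrative ANCOVA sketch: post-test Y regressed on pre-test x and treatment z.
# All data below are made-up numbers, not results from the experiments.

def solve_linear_system(a, b):
    """Solve a * coef = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ancova(pre, treat, post):
    """Least-squares estimates (b0, b1, b2) for Y = b0 + b1*x + b2*z."""
    rows = [[1.0, x, z] for x, z in zip(pre, treat)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, post)) for i in range(3)]
    return solve_linear_system(xtx, xty)

# Hypothetical non-equivalent groups: the z=1 group has higher pre-test scores.
pre = [40, 50, 60, 70, 55, 65, 75, 85]
treat = [0, 0, 0, 0, 1, 1, 1, 1]
post = [30 + 0.5 * x + 10 * z for x, z in zip(pre, treat)]  # noise-free toy data

b0, b1, b2 = fit_ancova(pre, treat, post)
naive_diff = sum(post[4:]) / 4 - sum(post[:4]) / 4

print(f"unadjusted group difference: {naive_diff:.1f}")
print(f"pre-test-adjusted treatment effect (b2): {b2:.1f}")
```

In this toy data the unadjusted difference between group means mixes the treatment effect with the skill difference between the groups, whereas β2 isolates the treatment effect at a given pre-test score. This is why a common pre-test is needed before results from different experiments can be compared.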
Effect of Pen & Paper vs Tools
(Exp1 + Exp 2)
[Figure: Correctness (% Correct) and Duration (minutes) for CC and DC; panels: BSc - Pen & Paper, BSc - Java IDE]
Effect of Expertise (Exp 2)
[Figure: Correctness (% Correct) and Duration (minutes) for CC and DC; panels: BSc, MSc, Junior, Intermediate, Senior]
Effect of UML (Exp 3)
[Figure: Correctness (% Correct) and Duration (minutes) for DC; panels: No UML (BSc), UML (BSc)]
Effect of Task Order (Exp 2 + Exp 4)
[Figure: Correctness (% Correct) and Duration (minutes) for CC and DC; panels: Easy Task First (Students), Difficult Task First (Students)]
Effect of Pair Programming
(Exp 2 + Exp 5)
[Figure: Correctness (% Correct) and Duration (minutes) for CC and DC; panels: Ind (Junior, Intermediate, Senior), Pair (Junior, Intermediate, Senior)]