Three Software Development Gestalts
Kirby McMaster
Brian Rague
Steven Hadfield
Nicole Anderson
Schemas, Paradigms, and Gestalts
We use the term gestalt to refer to a mental framework that allows
students to organize course topics into a unified whole (greater than
the sum of the parts).
 Donald: A schema ... is a data structure of generic concepts stored in memory and
containing the network of relationships among the constituent parts.... If we are to
understand the relationships between concepts, we need to know in what order and
how closely concepts are linked and the character of the linkage.
 Bain: Students bring paradigms to the class that shape how they construct
meaning.... Even if they know nothing about our subjects, they still use an existing
mental model of something to build their knowledge of what we tell them.
Gestalts for Software Development
Software development is a broad and diverse field.
Our scope in this research is limited to the development of
Information Systems, consisting of:
 a set of application programs
 one or more databases.
In CS and IS curricula, software development principles are taught
in the following types of courses:
 Programming, with an emphasis on object-oriented concepts.
 Databases and database management systems.
 Software Engineering, including systems analysis and design.
Purpose of this Study
Our primary goal in this research was to construct scales to measure
gestalts for Programming, Database, and Software Engineering.
 Our methodology involved analyzing words in books.
 Our assumption is that words used frequently in a book reflect the
gestalt of the author.
 Our findings have relevance in designing ways to teach software
development courses.
 The gestalt scales can also help instructors choose suitable textbooks
for those courses.
Gestalt Scale Development
The construction of measurement scales for our three software
development gestalts involved the following steps:
1. Sampling: Select a diverse sample of Programming (OOP), Database
(DB), and Software Engineering (SE) books having an Amazon
concordance.
2. Measurement: For each concordance word, record the book, word, and
frequency.
3. Conversion: Change nouns, verbs, adjectives, and adverbs to a consistent
form.
4. Transformation: Rescale word frequencies (StdFreq) within a book to
give the average concordance word a score of 100.
5. Grouping: Combine relevant synonyms into word/groups, summing the
StdFreq scores for each group.
6. Scale Construction: Build the Programming (PGestalt), Database
(DGestalt), and Software Engineering (SGestalt) scales using an iterative
process:
 Look for words that are used frequently within each book and consistently
across similar books.
 Build a tentative scale, and calculate scores for every book.
 Remove books with low scale scores, and repeat the process.
 Stop when the scales and list of remaining books stabilize.
Starting with 37 OOP books, 37 DB books, and 36 SE books, we
obtained (after several iterations) PGestalt, DGestalt, and SGestalt
scales constructed from 27 OOP books, 27 DB books, and 26 SE
books, respectively.
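Steps 4 and 5 (Transformation and Grouping) can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes StdFreq is 100 times a word's raw frequency divided by the mean frequency of all concordance words in the same book, which is one straightforward reading of "give the average concordance word a score of 100". The function names, the raw counts, and the synonym groups below are hypothetical.

```python
from statistics import mean


def standardize(freqs):
    """Rescale raw frequencies so the average word in the book scores 100."""
    avg = mean(freqs.values())
    return {word: 100.0 * count / avg for word, count in freqs.items()}


def group_scores(std_freqs, groups):
    """Sum StdFreq over the words in each word/group (absent words add 0)."""
    return {
        name: sum(std_freqs.get(word, 0.0) for word in words)
        for name, words in groups.items()
    }


# Hypothetical raw concordance counts for one book.
raw = {"class": 300, "subclass": 100, "object": 250, "file": 150}
std = standardize(raw)

# Hypothetical synonym groups, named in the slides' word/group style.
groups = {"class/subclass": ["class", "subclass"], "object": ["object"]}
scores = group_scores(std, groups)
```

Under this reading, a StdFreq of 200 means the word/group occurs twice as often as the average concordance word in that book, which makes scores comparable across books of different lengths.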
Programming Gestalt
The PGestalt scale consists of 14 word/groups and weights.
 The most frequent word/groups are:
 class/subclass
 method/algorithm
 object.
 The weights for each word/group are used to calculate an overall
weighted-average PGestalt score for a book.
 Sample calculations are illustrated for McMillan’s Object-Oriented
Programming with Visual Basic.NET (PGestalt score = 394.1).
PGestalt Scale
Word/Group                  Books   Avg StdFreq   Weight
class/subclass                 27         541.1    20.87
method/algorithm               23         391.2    13.78
object                         27         317.5    10.29
code/program                   27         314.3    10.14
function/procedure             18         268.9     7.99
value/variable                 27         268.7     7.98
integer                        21         203.3     4.89
public/private                 24         201.7     4.81
type/datatype                  26         186.2     4.08
string                         24         180.3     3.80
statement/line                 18         168.6     3.25
data/information               26         164.2     3.04
new                            27         154.9     2.60
file                           24         152.2     2.47
TOTAL                                             100.00
PGestalt Calculations
McMillan (2004), OOP With VB.NET
Word/Group                  Weight   StdFreq   Scale
class/subclass               20.87    *765.5   159.8
method/algorithm             13.78    *437.0    60.2
object                       10.29    *337.4    34.7
code/program                 10.14    *434.8    44.1
function/procedure            7.99     168.4    13.5
value/variable                7.98     248.2    19.8
integer                       4.89     236.7    11.6
public/private                4.81    *383.0    18.4
type/datatype                 4.08     139.2     5.7
string                        3.80     218.5     8.3
statement/line                3.25     128.8     4.2
data/information              3.04    *314.3     9.6
new                           2.60     131.0     3.4
file                          2.47      36.3     0.9
TOTAL                                          394.1
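The calculation behind this slide can be reproduced with a short Python sketch. It treats each weight as a percentage, so every Scale entry is Weight x StdFreq / 100 and the book's score is the sum of those entries. The function name is hypothetical. The numbers are the Weight and StdFreq columns from the McMillan slide, in row order.

```python
def gestalt_score(weights, std_freqs):
    """Weighted average of StdFreq values, with weights given as percentages."""
    return sum(w * s for w, s in zip(weights, std_freqs)) / 100.0


# Weight column of the PGestalt scale (class/subclass ... file).
weights = [20.87, 13.78, 10.29, 10.14, 7.99, 7.98, 4.89,
           4.81, 4.08, 3.80, 3.25, 3.04, 2.60, 2.47]

# StdFreq column for McMillan (2004), same row order.
std_freqs = [765.5, 437.0, 337.4, 434.8, 168.4, 248.2, 236.7,
             383.0, 139.2, 218.5, 128.8, 314.3, 131.0, 36.3]

score = gestalt_score(weights, std_freqs)
print(round(score, 1))  # 394.1
```

The same function applies unchanged to the DGestalt and SGestalt scales once their own weight columns are substituted.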
Database Gestalt
The DGestalt scale consists of 14 word/groups and weights.
 The most frequent word/groups are:
 data/information
 table/relation
 database.
 The weights for each word/group are used to calculate an overall
weighted-average DGestalt score for a book.
 Sample calculations are illustrated for Watson’s Data Management
(DGestalt score = 438.7).
DGestalt Scale
Word/Group                  Books   Avg StdFreq   Weight
data/information               27         524.8    19.55
table/relation                 27         438.1    15.56
database                       27         335.3    10.83
query/sql                      26         266.9     7.68
entity/relationship            18         257.0     7.23
attribute/column/field         25         234.7     6.20
key/primary/foreign            22         220.3     5.54
system/subsystem               25         206.2     4.89
object                         21         197.5     4.49
model/modeling                 22         195.1     4.38
user/client/customer           25         190.2     4.15
record/row/tuple               24         187.6     4.03
value/variable                 27         164.8     2.98
type/datatype                  27         154.5     2.51
TOTAL                                             100.00
DGestalt Calculations
Watson (2005), Data Management
Word/Group                  Weight   StdFreq   Scale
data/information             19.55   *1115.6   218.1
table/relation               15.56    *431.7    67.2
database                     10.83    *336.2    36.4
query/sql                     7.68     237.5    18.2
entity/relationship           7.23     281.6    20.4
attribute/column/field        6.20     200.2    12.4
key/primary/foreign           5.54     267.9    14.8
system/subsystem              4.89     235.9    11.5
object                        4.49      81.1     3.6
model/modeling                4.38     261.3    11.4
user/client/customer          4.15     227.8     9.5
record/row/tuple              4.03     217.4     8.8
value/variable                2.98     127.5     3.8
type/datatype                 2.51      98.7     2.5
TOTAL                                          438.7
Software Engineering Gestalt
The SGestalt scale consists of 12 word/groups and weights.
 The most frequent word/groups are:
 software
 system/subsystem
 process.
 The weights for each word/group are used to calculate an overall
weighted-average SGestalt score for a book.
 Sample calculations are illustrated for Thayer’s Software Engineering,
Vol. 1 (SGestalt score = 394.7).
SGestalt Scale
Word/Group                  Books   Avg StdFreq   Weight
software                       26         444.6    18.13
system/subsystem               26         401.3    15.85
process                        26         304.8    10.77
data/information               26         265.1     8.69
code/program                   24         231.8     6.93
requirement/specification      25         229.5     6.81
test/testing                   20         226.8     6.67
user/client/customer           23         223.3     6.49
develop/development            26         221.8     6.41
project                        26         207.1     5.63
design/designer                26         175.6     3.98
model/modeling                 25         169.1     3.64
TOTAL                                             100.00
SGestalt Calculations
Thayer (2002), Software Engineering, Vol. 1
Word/Group                  Weight   StdFreq   Scale
software                     18.13    *727.7   131.9
system/subsystem             15.85    *550.8    87.3
process                      10.77     190.0    20.5
data/information              8.69     263.0    22.9
code/program                  6.93    *382.3    26.5
requirement/specification     6.81    *419.3    28.6
test/testing                  6.67     280.6    18.7
user/client/customer          6.49     219.2    14.2
develop/development           6.41     296.7    19.0
project                       5.63     124.3     7.0
design/designer               3.98    *370.0    14.7
model/modeling                3.64      95.1     3.5
TOTAL                                          394.7
SOFTWARE DEVELOPMENT GESTALTS
Software Engineering (SE)
 software
 system/subsystem
 process
 data/information
 code/program
 requirement/specification
 test/testing
 user/client/customer
 develop/development
 project
 design/designer
 model/modeling
Programming (OOP)
 class/subclass
 method/algorithm
 object
 code/program
 function/procedure
 value/variable
 integer
 public/private
 type/datatype
 string
 statement/line
 data/information
 new
 file
Database (DB)
 data/information
 table/relation
 database
 query/sql
 entity/relationship
 attribute/column/field
 key/primary/foreign
 system/subsystem
 object
 model/modeling
 user/client/customer
 record/row/tuple
 value/variable
 type/datatype
SE + OOP
 code/program
 data/information
SE + DB
 data/information
 model/modeling
 system/subsystem
 user/client/customer
OOP + DB
 data/information
 object
 type/datatype
 value/variable
SE + OOP + DB
 data/information
Gestalt Scale Distributions
Gestalt scale scores varied widely across the books in the sample.
 PGestalt scores for the 37 OOP books ranged from 101.1 to 402.6.
 DGestalt scores for the 37 DB books ranged from 118.7 to 438.7.
 SGestalt scores for the 36 SE books ranged from 124.9 to 394.7.
A graph of the three gestalt distributions is shown on the next slide.
Each distribution is shown only for books in the relevant category:
 PGestalt for OOP books
 DGestalt for DB books
 SGestalt for SE books.
Averages for the three scales for all books are presented in a table
on the following slide.
Gestalt Scale Distributions
37 OOP, 37 DB, and 36 SE Books
[Figure: histogram of gestalt scores. Horizontal axis: Gestalt Score, in
bins 0-49, 50-99, 100-149, 150-199, 200-249, 250-299, 300-349, 350-399,
400-449, 450+. Vertical axis: Books, 0 to 14. Series: PGestalt/OOP,
DGestalt/DB, SGestalt/SE.]
Gestalt Scale Distributions
Averages by Book Category and Scale Type
Category           PGestalt   DGestalt   SGestalt
OOP (37 books)        271.5       75.7       66.3
DB (37 books)         107.0      274.6       99.4
SE (36 books)          86.8       94.2      257.2
ALL (110 books)       155.7      148.6      139.9
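The ALL row appears to be the book-count-weighted mean of the three category rows. The slides do not say how it was computed, so that is an assumption, but the arithmetic can be checked with the short Python sketch below. The variable names are hypothetical; the numbers come from the averages table.

```python
# Books per category and the (PGestalt, DGestalt, SGestalt) category averages.
counts = {"OOP": 37, "DB": 37, "SE": 36}
averages = {
    "OOP": (271.5, 75.7, 66.3),
    "DB": (107.0, 274.6, 99.4),
    "SE": (86.8, 94.2, 257.2),
}

total_books = sum(counts.values())

# Weight each category average by its number of books, one scale at a time.
overall = [
    sum(counts[cat] * averages[cat][i] for cat in counts) / total_books
    for i in range(3)
]
```

The recomputed values agree with the reported 155.7, 148.6, and 139.9 to within 0.1. Small differences are expected because the category averages are themselves rounded.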
Comparing Gestalt Scales
The relationship between PGestalt and DGestalt scores for all sample
books is displayed on the next slide as a scatter plot.
 The correlation between the PGestalt and DGestalt scales is negative
(r = -0.392) when all 110 books are included.
 The negative correlation is stronger (r = -0.769) when only OOP and DB
books are considered.
 Note that 3 DB books and 1 SE book have PGestalt scores above 200
(high on the “wrong” scale).
 Similar patterns exist for the PGestalt-SGestalt and DGestalt-SGestalt
pairs of scales.
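A correlation of this kind can be computed as in the Python sketch below. The slides do not name the coefficient used, so Pearson's r is an assumption here. The five pairs of scores are made-up values for illustration only, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical (PGestalt, DGestalt) scores for five books.
p_scores = [300.0, 250.0, 90.0, 110.0, 80.0]
d_scores = [70.0, 90.0, 280.0, 260.0, 100.0]

r = pearson_r(p_scores, d_scores)
```

A negative r means that books scoring high on one scale tend to score low on the other, which is the pattern the slide describes.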
PGestalt vs. DGestalt Scatter Plot
37 OOP, 37 DB, and 36 SE Books
[Figure: scatter plot. Horizontal axis: PGestalt, 0 to 450. Vertical axis:
DGestalt, 0 to 450. Legend: PG, DB, SE.]
Gestalt Mixtures
The mixture of gestalts within a book can be obtained by expressing
the gestalt scores as percentages.
 The next slide is a triangular scatter plot, with PGestalt and DGestalt
percentages on the axes.
 The SGestalt percentage is implied, since the sum of all 3 is 100%.
 A gestalt is dominant if its percentage score exceeds 50%.
 Ninety of the 110 books have a dominant gestalt.
 Only one (DB) book was dominant on the "wrong" (PGestalt) scale.
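The mixture and dominance rules above can be sketched in Python. The sketch assumes each percentage is the gestalt score divided by the sum of the book's three scores, which is consistent with the three percentages adding to 100%. The function names are hypothetical. For convenience, the example input is the set of OOP category averages reported elsewhere in this presentation (271.5, 75.7, 66.3), not the scores of an individual book.

```python
def gestalt_mixture(p_score, d_score, s_score):
    """Express three gestalt scores as percentages that sum to 100."""
    total = p_score + d_score + s_score
    return {
        "PGestalt": 100.0 * p_score / total,
        "DGestalt": 100.0 * d_score / total,
        "SGestalt": 100.0 * s_score / total,
    }


def dominant_gestalt(mixture, threshold=50.0):
    """Return the gestalt whose share exceeds the threshold, else None."""
    name, share = max(mixture.items(), key=lambda item: item[1])
    return name if share > threshold else None


mix = gestalt_mixture(271.5, 75.7, 66.3)
dominant = dominant_gestalt(mix)
```

Because the three shares sum to 100%, at most one of them can exceed 50%, so a book has either one dominant gestalt or none.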
Gestalt Scale Mixture Plot
37 OOP, 37 DB, and 36 SE Books
[Figure: triangular scatter plot of gestalt mixtures, with PGestalt and
DGestalt percentages on the axes.]
Choosing a Textbook
Question: Would McConnell’s Code Complete be a suitable textbook
for a Software Engineering course?
 The SGestalt score for this book is 185.4 (see next slide).
 The most frequent word/groups are:
 code/program
 data/information
 test/testing.
 The Gestalt mixture is: PGestalt = 42.7%, SGestalt = 38.1%, and
DGestalt = 19.2%.
 We would not recommend Code Complete as a primary SE textbook,
but it would provide worthwhile supplemental reading.
SGestalt Calculations
McConnell (2004), Code Complete, 2nd ed
Word/Group                  Weight   StdFreq   Scale
software                     18.13     183.1    33.2
system/subsystem             15.85     107.7    17.1
process                      10.77      50.1     5.4
data/information              8.69    *303.3    26.4
code/program                  6.93    *915.4    63.4
requirement/specification     6.81      65.2     4.4
test/testing                  6.67     223.6    14.9
user/client/customer          6.49        --      --
develop/development           6.41      76.6     4.9
project                       5.63     148.4     8.4
design/designer               3.98     185.2     7.4
model/modeling                3.64        --      --
TOTAL                                          185.4
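Two word/groups (user/client/customer and model/modeling) have a dash in place of a StdFreq value on this slide. The Python sketch below checks one reading of those dashes: that a word/group with no entry contributes nothing to the score. That reading is an assumption, but it reproduces the reported total. The names are hypothetical, and None stands for a dash.

```python
# (word/group, weight in percent, StdFreq or None when the slide shows a dash)
rows = [
    ("software", 18.13, 183.1),
    ("system/subsystem", 15.85, 107.7),
    ("process", 10.77, 50.1),
    ("data/information", 8.69, 303.3),
    ("code/program", 6.93, 915.4),
    ("requirement/specification", 6.81, 65.2),
    ("test/testing", 6.67, 223.6),
    ("user/client/customer", 6.49, None),
    ("develop/development", 6.41, 76.6),
    ("project", 5.63, 148.4),
    ("design/designer", 3.98, 185.2),
    ("model/modeling", 3.64, None),
]

# Skip missing entries, so they add zero to the weighted sum.
sgestalt = sum(w * f for _, w, f in rows if f is not None) / 100.0
print(round(sgestalt, 1))  # 185.4
```

Under this reading, a book is penalized for every scale word/group it lacks, which partly explains why Code Complete scores well below the SE books used to build the scale.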
Summary and Conclusions
In this paper, we have described three scales for measuring gestalts in
software development:
1. Programming – for writing object-oriented programs.
2. Database – for designing and implementing databases.
3. Software Engineering – for building complex software systems.
We calculated three gestalt scores for each sample book and analyzed
the distributions of these scores.
 The scales were able to discriminate between the types of books used to
construct the scales.
 A few books had high gestalt scores on the "wrong" scale.
 When viewed as a mixture of gestalts, 90 of the sample books had a
dominant gestalt.