Debugging Integrated Systems:
An Ethnographic Study of Debugging Practice
Thomas Østerlie & Alf Inge Wang
Overview
• Background
– Debugging
– Motivation
– Sensemaking
• Research setting
– Overview Gentoo
– Gentoo’s formal debugging process
• Five characteristics of debugging practice in Gentoo
– C1: Spans a variety of operating environments
– C2: Collective
– C3: Social
– C4: Heterogeneous
– C5: Ongoing
• Concluding remarks
– Brief summary
– Transferability of findings
– Implications
Debugging: The traditional view
• Trace a causal chain from reported failure to its corresponding fault
• Access to and control over source code critical for this form of
debugging
Motivation
• Increased focus on systems integration
– Component-based development
– Information systems and enterprise integration
– Service-Oriented Architecture
• Debugging integrated systems
– Software being integrated is developed and maintained by third parties
– Integrators have to debug systems without access to the source
code of the components being integrated
• What do systems integrators do in practice when
debugging?
– Little is known about the debugging of integrated systems
– Need to understand what is going on before improving on the
process
Sensemaking
• Problem solving
– Individual cognitive activity
– Clearly defined problems and resources to solve the problem
– Problem: software failure
– Resources: tools and techniques for locating the fault
• Problem setting (Schön 1991)
– Problems do not present themselves as givens
– Make a situation that is puzzling, troubling, uncertain into something that
makes sense
– Constructing problems from the materials of problematic situations
• Sensemaking (Weick 1995)
– ‘What is going on?’
• A group’s collective experiences of a problem situation progressively
clarified
– Action precedes understanding
• Active engagement with the problem situation
• Understanding is retrospective
Research setting: Gentoo
• Community of volunteer software developers
– 320 official Gentoo developers
– Distributed across 38 countries, 17 time zones
– Develop and maintain a software system for integrating third-party OSS
packages with various Unix operating systems
• The Gentoo software distribution – Integrator’s view
– 8,480 third-party packages supported
– One installation script for every version of each supported package
– Total of 23,900 installation scripts in package database
– Total SLOC of 671,971 in package database
– Supports 6 different Unix operating systems (5 processors on GNU/Linux)
• Gentoo installation – User’s view
– Runs on a local computer
– Local copy of the Gentoo package database
– Uses the Gentoo package manager to integrate third-party OSS packages
with the local system
Formal process of debugging in Gentoo
C1: Spans a variety of operating environments
• Variety of operating environments among individual
Gentoo installations:
– Configuration of individual packages
• Optionals
• Virtuals
– Operating system
– System evolution
• “Variation kills reproducibility”
– State of individual Gentoo installations often impossible to
replicate
– Reproduction of reported failure therefore often impossible
• Problem situation, rather than clearly defined software
failures
C2: Collective
• Users and developers work together to make sense
of the problem situation
• Collaborative debugging
– User provides data - creates the materials of the
problem situation
– Developers interpret data, request new data
• Collaborative environment
– User to developer communication: Problem reports
– Developer to developer communication: IRC channel
C3: Social
• Workload
Date               New reports   Reports closed   Open reports   Number of developers
January 6, 2003    269           Not avail.       1893           102
January 5, 2004    837           428              4479           259
January 3, 2005    700           390              7877           Not avail.
January 16, 2006   799           447              9083           320
• Priorities
– Reproducible vs. irreproducible
– Curbing workload vs. retaining users’ interest
• Responsibilities
– Formal roles
– Actual problem
– Negotiate to bridge reality of formal roles with reality of actual problem
– Determining responsibility inherent in sensemaking process
Exhibit: Enacting responsibilities
• Statement: Reporting user: I have installed my system from scratch
– Researchers’ commentary: The problem is related to the way Gentoo integrates software, and therefore the Gentoo developers' responsibility
• Statement: Developer A: [making reference to the systems information provided with the problem report] Is using an x86 profile for an amd64 machine troublesome?
– Researchers’ commentary: The reported problem is related to the user's Gentoo system configuration, and therefore the user's responsibility
• Statement: Developer B: [making reference to the installation script] Turning off the optional esound support might solve the problem.
– Researchers’ commentary: The problem may be related to how the package integrates with the esound package, and therefore the third-party provider's responsibility.
• Statement: Developer A: [making reference to the compiler error provided with the problem report] Why is it that the thing can't find pthread? is that because of a missing pthread
– Researchers’ commentary: The problem is related to the use of the pthreads library, and therefore the responsibility of another herd.
• Statement: Developer B: sounds like the glibc library was upgraded
– Researchers’ commentary: Related to the user's system configuration, and therefore his responsibility
Summary of findings
• Debugging integrated systems is a
collective sensemaking process
• Cyclic rather than linear
• Influenced by technical as well as social
factors
• More than individual problem solving
• Process driven by plausibility rather than
accuracy
Transferability of findings
• Presented findings to systems integrators
– International network of researchers and practitioners
– R&D department of a global telecom
– National consultancy specializing in systems integration
• Similarities between commercial and volunteer context
– Finding the problem is the problem
– The role of organizational issues
– The precarious relationship between the systems integrator as provider
and its customers
• Similarities between systems integration and application
development (tentative)
– Finding the problem is the problem
– The role of organizational issues
Implications
• For research #1: Distinction between corrective maintenance and debugging
– Not useful when studying practice
– The practice of debugging closely intertwined with the administration of the
corrective maintenance process
• For research #2: For systems integrators the software failure is not an
unproblematic phenomenon
– Subject to interpretation and negotiation
– What constitutes a software failure contingent upon the situation (workload,
priorities, responsibilities, access to technical data)
– Clear relation between error in code and observed failures too simplistic
– Need better models for understanding software failures in practice
• Implications for practice
– Difficult to determine what data is relevant prior to engaging with the problem
– Comprehensive classification schemas of limited use for users reporting problems
– Defect tracking systems need to support interaction between the reporting user and
the system integrators debugging the problem
Thank you for your time.
Questions?