Draft Presentation


Innovation Success:
An Empirical Study of Software
Development Projects in the Context of the
Open Source Paradigm
Jorge Colazo, Ph.D. Candidate
Dissertation Proposal Committee:
Prof. Kevin B. Hendricks Operations Management (Chair)
Prof. Larry J. Menor Operations Management
Prof. Anabel Quan-Haase Sociology / Information and Media Studies
August 2006
At a Glance
• Topic: Innovation Success (Software)
• Practical: How to design a successful software project?
• Academic:
– IP is a multi-billion dollar business and is being protected
more than ever. Can opening up IP rights nonetheless
benefit project success?
– What kind of development collaboration pattern is best?
– Do users and peripheral developers help or hinder?
• Field study of working software projects using archival
data from source code repositories and other e-artifacts
• Unit of study: project
• Regression Analysis (OLS / Robust / Cox / Time Series)
• Proposal Approved; Data Collection at Advanced Stage
Motivation
• Success rates in new product
development are stubbornly low
• This is especially true in the case of
software (4% success rate). Issues:
– Developer productivity
– Developer retention
– Timeliness
– Quality
• All this can be very costly
Research Focus
• How can the problem of improving new software development
success be approached in a scholarly way?
• Growing paradigm in software development:
Open Source Software
– In OSS the “source code” is available and may be
modified and redistributed as per special licenses
– Engenders the intervention of a “community” of
volunteer developers and users
– Development team collaborates freely, choosing
own assignments and creating asynchronous
collaboration networks
Research Questions
How can the unique characteristics of the OSS
model improve the software new product development process
(NPDP) and enrich what we know about innovation management?
1. How are different OSS licenses associated with
development success?
2. How are the spontaneous collaboration patterns
in the development team associated with
development success?
3. How is the community of peripheral developers
and users associated with development success?
Expected Contributions
• In a context where IP management is a
billion-dollar business and IP is being
protected more than ever, will opening the
IP rights be beneficial for project success?
• What is the most effective networked
collaboration structure?
• Do users and peripheral developers help
or hinder?
The definition of software
development success
• OM / NPD: Product and Project success
• IS: Team outcomes as success indicator
• CS: Static metrics can be obtained
• Product Success
– Popularity
– Quality
• Process Success
– Developer Productivity
– Development speed
– Developer permanence
Research Model
Research Question 1: IP / Licenses
All Licenses: Warranty Disclaimer
BSD: The names of original contributors cannot be used to endorse or promote derivatives
GPL: Derivatives to be licensed under GPL regardless of original license (modified + non-modified parts)
LGPL: Only modifications to LGPL’ed part in derivatives need to be licensed under LGPL
Research Question 1: IP / Licenses
• Key concept: Copyleft
• The tenets of “hacker ethics” support the concept of
non-appropriability better than they support the
concept of “viral licenses”.
• Copylefted projects are more strongly
identified with archetypal moral norms of
OSS developers
Research Question 2: Collaboration Structure
• Individualistic paradigm
– Structure is defined through the aggregation of individuals
into meaningful categories (e.g. managers vs. staff)
• Network paradigm 
– Structure is defined by observed interaction patterns
– Actors are the core developers
– A tie exists between two developers when they have worked on
one or more files in common
– The relation is undirected, valued by the number of files
worked in common
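The network definition above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual tooling: the function name `build_cowork_network` and the commit log of (developer, file) pairs are hypothetical, and the sketch assumes each developer-file pair counts once no matter how many times the developer touched the file.

```python
from collections import defaultdict
from itertools import combinations


def build_cowork_network(commits):
    """Return {(dev_a, dev_b): number of files both developers worked on}."""
    # Collect the set of developers who touched each file.
    devs_by_file = defaultdict(set)
    for developer, path in commits:
        devs_by_file[path].add(developer)

    # Every pair of developers sharing a file gains one unit of tie value.
    ties = defaultdict(int)
    for developers in devs_by_file.values():
        # sorted() gives each undirected pair a single canonical key.
        for a, b in combinations(sorted(developers), 2):
            ties[(a, b)] += 1
    return dict(ties)


# Hypothetical commit log, for illustration only.
commits = [
    ("alice", "main.c"), ("bob", "main.c"),
    ("alice", "util.c"), ("bob", "util.c"), ("carol", "util.c"),
    ("carol", "io.c"),
    ("alice", "main.c"),  # repeated work on the same file adds nothing
]
network = build_cowork_network(commits)
```

Storing each pair under a sorted key keeps the relation undirected, and the integer value is the number of files worked on in common.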
Research Question 2: Collaboration Structure
Temporal Dispersion
Research Question 3: Community
Hypotheses
[Figure: hypotheses 1, 2 and 3]
Sampling
• OSS projects on SourceForge.net
– Written in 100% pure “C”
– 5 or more core developers
– ~120 projects
– Quarterly snapshots from beginning of the
project
– Information retrieval needs custom-written
spidering / scraping software
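As a small illustration of the quarterly sampling scheme, the snapshot dates for one project could be generated as follows. This is only a sketch under assumptions: the function name `quarterly_snapshots` is hypothetical, the first snapshot is assumed to fall on the project's start date, and a "quarter" is taken to mean three calendar months, with the day clamped to the end of shorter months.

```python
import calendar
from datetime import date


def quarterly_snapshots(start, end):
    """Return snapshot dates every three months from start, up to end."""
    snapshots = []
    k = 0
    while True:
        # Count months from the start date in steps of three.
        months = start.month - 1 + 3 * k
        year = start.year + months // 12
        month = months % 12 + 1
        # Clamp the day so that, e.g., the 30th maps to Feb 28 or 29.
        day = min(start.day, calendar.monthrange(year, month)[1])
        snapshot = date(year, month, day)
        if snapshot > end:
            break
        snapshots.append(snapshot)
        k += 1
    return snapshots


# Hypothetical project start and cut-off dates.
dates = quarterly_snapshots(date(2004, 11, 30), date(2005, 9, 1))
```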
Some Measures
Construct | Definition | Metric | Source
Network Density | Degree of connectedness | Network density | (Wasserman and Faust 1994)
Network Centralization | The extent to which core developers differ in their importance | Degree centralization | (Freeman 1977)
Boundary Spanning Activity | Communication activity that spans the project’s boundaries | The project’s degree centrality in the inter-project network | New metric
Quality | Number of potential defects | Number of pre-test estimated bugs | (Ottenstein 1981)
Productivity | Code written per core developer per unit time | Core developer factor score | New metric
Development Speed | Time to produce working release | Inter-release time | (Stewart et al. 2005)
Product popularity | People using the software | Downloads | (Crowston 2003)
Software complexity | Logic intricacy, understandability | Cyclomatic complexity | (McCabe 1976)
Software size | Size of the project | Source lines of code | Very common measure in CS
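The two network measures listed above can be computed directly from a list of ties. The sketch below uses the standard textbook formulas for an undirected, binary network: density as the share of possible ties that are present, and degree centralization as the summed gap to the most central actor divided by its maximum, (n - 1)(n - 2). It assumes the valued collaboration network has first been dichotomized (a tie is either present or absent); the function names and the four-developer example are hypothetical.

```python
def network_density(actors, ties):
    """Share of possible undirected ties that are actually present."""
    n = len(actors)
    possible = n * (n - 1) / 2
    return len(ties) / possible if possible else 0.0


def degree_centralization(actors, ties):
    """Degree centralization: 0 for equal degrees, 1 for a star."""
    degree = {actor: 0 for actor in actors}
    for a, b in ties:
        degree[a] += 1
        degree[b] += 1
    n = len(actors)
    if n < 3:
        return 0.0  # the measure is undefined for fewer than three actors
    max_degree = max(degree.values())
    # Sum of gaps to the most central actor, over the maximum possible sum.
    return sum(max_degree - d for d in degree.values()) / ((n - 1) * (n - 2))


# Hypothetical star-shaped team: one hub developer tied to three others.
actors = ["hub", "a", "b", "c"]
ties = [("hub", "a"), ("hub", "b"), ("hub", "c")]
density = network_density(actors, ties)                # 3 of 6 possible ties
centralization = degree_centralization(actors, ties)   # star network
```

A star-shaped team scores the maximum centralization of 1, while a team in which every developer has the same number of ties scores 0.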
Data Collection
Analysis
• Regression Models (OLS / Time Series /
Robust Regression / Cox)
– Appropriate for exploratory stage studies
– Many diagnostic tests exist that can be used
to assess sensitivity of results
– Can be adjusted for non-linearities, censored
data, etc.
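For orientation, the simplest of these models, OLS with a single predictor, can be written out in closed form. The sketch below is illustrative only: the function name `ols_fit` is hypothetical, the numbers are made up and are not data from the study, and a real analysis would involve several predictors, control variables and the diagnostic tests mentioned above.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the ratio of the x-y co-variation to the variation in x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up values: x could stand for a network measure of a project,
# y for one of its success measures.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = ols_fit(x, y)
```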