Process Mining Software Repositories
Master project midterm presentation
Wouter Poncin, [email protected]
Agenda
• Introduction
• Project goal
• Example analysis project
• FRASR features
• Future work
/ Department of Mathematics and Computer Science
18-7-2015
PAGE 1
Introduction
• Software development
• Various repositories
− Version control systems
− Bug databases
− Mailing lists
− Wiki articles
−…
• Analysis tools
− Most only work on a single type of repository
Introduction
Problem description
• The goal of this project is to develop an application which
facilitates process analysis of data from various software
repositories, in an easy manner.
• Facilitate → export data to log
• Process analysis → global process overview
• Various repositories → combine data
• Easy manner → add a data source by URL
• Open source & closed source projects
Introduction
FRASR
• FRASR (FRamework for Analyzing Software Repositories)
• Single data source analysis
− How has the source code grown?
• Multiple data source analysis (combined)
− What are the roles of the developers of a project?
− How has the project evolved in terms of developer
participation?
• Multiple data source analysis (extended)
− Projects switch between software repositories
Process mining software repositories
S.E. Question → FRASR → ProM → Answer
FRASR steps: Define data sources → Define case mapping → Calculate developer matching → Export
Example Analysis Project
Single data source analysis - S.E. Question
• TU/e Software engineering project
• Project manager
• Senior management
• Analysis question:
• Has the prototype been used as (a part of) the final
implementation?
Example Analysis Project
Single data source analysis - FRASR
Example Analysis Project
Single data source analysis - FRASR → Data sources
• Define data sources
• SVN
• TRAC tickets
• TRAC wiki
• Mailing lists
• Provide:
• URL
• (Authentication)
Example Analysis Project
Single data source analysis - FRASR → Case mapping
• Define case mapping
• Determines analysis options
• Selected mapping:
• Data field case
− Map each event to a data field of the data source
− Example: “/trunk/documents/srd/srd.tex”
Example Analysis Project
Single data source analysis - FRASR → Case mapping
• Attach data sources to case definition
• Detailed binding:
• Per element from the data hierarchy:
− Determine whether to include it
− Which field should be used as the event name
• Example:
• Revision (not included)
− Modification (included → modification type)
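The detailed binding can be illustrated with a minimal sketch. It assumes the data hierarchy is a nested structure (a revision containing modifications) and that the binding records, per element type, whether the element is included and which field supplies the event name. The `BINDING` table, the `REVISION` sample, the revision number and the `emit_events` helper are all hypothetical and are not FRASR's actual data model.

```python
# Hypothetical binding: element type -> (included?, field used as event name).
BINDING = {
    "revision": (False, None),       # revisions themselves produce no event
    "modification": (True, "type"),  # modifications are named by their type
}

# Hypothetical SVN revision with two nested modifications.
REVISION = {
    "kind": "revision",
    "number": 42,
    "children": [
        {"kind": "modification", "type": "added",
         "path": "/trunk/documents/srd/srd.tex", "children": []},
        {"kind": "modification", "type": "modified",
         "path": "/trunk/documents/srd/srd.tex", "children": []},
    ],
}


def emit_events(element, binding):
    """Walk the hierarchy and collect event names of the included elements."""
    included, name_field = binding.get(element["kind"], (False, None))
    events = [element[name_field]] if included else []
    for child in element.get("children", []):
        events.extend(emit_events(child, binding))
    return events


print(emit_events(REVISION, BINDING))
```

With this binding only the modifications yield events; switching the "revision" entry to `(True, "number")` would add one event per revision as well.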
Example Analysis Project
Single data source analysis - FRASR → Developer matching
• Developer matching
• (Different) aliases per data source
• Automatic matching
− Matches (parts of) names / user IDs / email addresses
• Manual modifications
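A minimal sketch of what automatic matching on (parts of) names, user IDs and email addresses could look like. It assumes two aliases belong to the same developer when they share a normalized token; the `Alias` type, the helper functions and the sample aliases are hypothetical, and FRASR's real matching rules may differ.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Alias:
    source: str        # data source the alias was found in (hypothetical labels)
    name: str = ""
    user_id: str = ""
    email: str = ""


def tokens(alias):
    """Lower-cased name parts, the user ID and the local part of the email."""
    local_part = alias.email.split("@")[0]
    raw = f"{alias.name} {alias.user_id} {local_part}".lower()
    # Drop very short tokens (such as initials) to limit false matches.
    return {t for t in re.split(r"[^a-z0-9]+", raw) if len(t) >= 3}


def match_aliases(aliases):
    """Group aliases that share at least one token, directly or transitively."""
    groups = []  # list of (token set, alias set) pairs with disjoint token sets
    for alias in aliases:
        toks = tokens(alias)
        merged_toks, merged_members = set(toks), {alias}
        remaining = []
        for group_toks, group_members in groups:
            if group_toks & toks:
                merged_toks |= group_toks
                merged_members |= group_members
            else:
                remaining.append((group_toks, group_members))
        groups = remaining + [(merged_toks, merged_members)]
    return [members for _, members in groups]


ALIASES = [
    Alias("SVN", user_id="jdoe"),
    Alias("Mail", name="John Doe", email="john.doe@example.org"),
    Alias("TRAC", name="J. Doe", user_id="jdoe"),
    Alias("Mail", name="Alice Smith", email="asmith@example.org"),
]
```

Token sharing is deliberately crude and can both miss and over-merge developers, which is why a manual correction step after the automatic pass remains useful.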
Example Analysis Project
Single data source analysis - FRASR → Export
• Export data to log
• Apply filters to reduce the log size
• Multiple export formats (CSV, MXML, …)
• Choose export format based on the analysis application
• ProM requires MXML
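The export step can be sketched as follows: filter the events, then serialize them either as CSV or as MXML. The event tuples, the `/trunk/code/main.c` path, the originator names and all function names are hypothetical. The element names follow the commonly documented MXML layout (WorkflowLog, Process, ProcessInstance, AuditTrailEntry); the output has not been validated against the MXML schema or against FRASR's own exporter.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical events: (case, event name, originator, timestamp).
EVENTS = [
    ("/trunk/documents/srd/srd.tex", "added", "dev1", datetime(2010, 2, 12, 8, 32, 55)),
    ("/trunk/documents/srd/srd.tex", "modified", "dev2", datetime(2010, 2, 15, 10, 0, 0)),
    ("/trunk/code/main.c", "added", "dev1", datetime(2010, 3, 1, 9, 0, 0)),
]


def filter_events(events, start, end):
    """Keep only events with start <= timestamp < end to reduce the log size."""
    return [e for e in events if start <= e[3] < end]


def to_csv(events):
    """Serialize events as CSV with one row per event."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["case", "event", "originator", "timestamp"])
    for case, name, originator, ts in events:
        writer.writerow([case, name, originator, ts.isoformat()])
    return buf.getvalue()


def to_mxml(events, process_id="example"):
    """Serialize events as MXML-style XML, one ProcessInstance per case."""
    root = ET.Element("WorkflowLog")
    process = ET.SubElement(root, "Process", id=process_id)
    instances = {}
    for case, name, originator, ts in events:
        if case not in instances:
            instances[case] = ET.SubElement(process, "ProcessInstance", id=case)
        entry = ET.SubElement(instances[case], "AuditTrailEntry")
        ET.SubElement(entry, "WorkflowModelElement").text = name
        ET.SubElement(entry, "EventType").text = "complete"
        ET.SubElement(entry, "Timestamp").text = ts.isoformat()
        ET.SubElement(entry, "Originator").text = originator
    return ET.tostring(root, encoding="unicode")


february = filter_events(EVENTS, datetime(2010, 2, 1), datetime(2010, 3, 1))
print(to_mxml(february))
```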
Example Analysis Project
Single data source analysis - ProM
[Figure: ProM visualization. Legend: Blue = file added, White = file modified, Red = file deleted]
Example Analysis Project
Multiple data sources (combined) analysis - S.E. Question
• TU/e Software engineering project
• Senior management
• Analysis question:
• Did the Project Manager (PM) perform his tasks?
− Communicate the agendas
− Create and update the SPMP
Example Analysis Project
Multiple data sources (combined) analysis - FRASR
• Case mapping:
• Constant-case
− Instance 1: Mails
− Instance 2: Subversion
• Data source bindings:
• Component-binding
− URD, SRD, ADD, Code, Agenda, Minute, …
Example Analysis Project
Multiple data sources (combined) analysis - ProM
• Agendas were regularly emailed
• But only 11 out of 14 times
• The SPMP was updated
• But not created by the PM
[Figure: ProM visualization; partially legible labels: SR, UR, AD. Legend: Agenda (black), SPMP (blue), Minutes (red), Other (white)]
Example Analysis Project
Answer
• The analysis mainly provides indications instead of
‘concrete evidence’
• The problem owner can decide whether something is to
be considered as ‘undesired’
FRASR
Case mapping
• Available case types:
• Component case
− Map each event to a “component”
− Example: “/trunk/documents/srd/srd.tex” → “SRD”
• Constant case
− Manually define case instances
− Example: “SVN”, “Mails”, “Bugreports”, …
• Date/time case
− Map each event to “a function of” the timestamp
− Example: “2010-02-12 08:32:55” → “2010-02”
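The three case types above can be sketched as small mapping functions that turn an event attribute into a case instance. The prefix table, the data source identifiers and the function names are hypothetical; only the SRD path and the timestamp come from the slide examples, and FRASR's real configuration may be richer.

```python
from datetime import datetime

# Hypothetical prefix table; only the SRD entry is taken from the slide.
COMPONENT_PREFIXES = {"/trunk/documents/srd/": "SRD"}

# Hypothetical assignment of data sources to manually defined case instances.
CONSTANT_CASES = {"svn-repository": "SVN", "mailing-list": "Mails"}


def component_case(path, default="Other"):
    """Component case: map a file path to the component it belongs to."""
    for prefix, component in COMPONENT_PREFIXES.items():
        if path.startswith(prefix):
            return component
    return default


def constant_case(data_source_id):
    """Constant case: every event of a data source goes to one fixed case."""
    return CONSTANT_CASES[data_source_id]


def datetime_case(timestamp, fmt="%Y-%m"):
    """Date/time case: derive the case from a function of the timestamp."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return parsed.strftime(fmt)


print(component_case("/trunk/documents/srd/srd.tex"))
print(constant_case("mailing-list"))
print(datetime_case("2010-02-12 08:32:55"))
```

Changing `fmt` (for example to `"%Y"`) gives a coarser date/time case without touching the events themselves.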
FRASR
Case mapping
• Available case types (continued):
• Data field case
− Map each event to a data field of the data source
− Example: “bug-created” → “high-priority”
• Originator case
− Use the originator of an event as the case instance
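Both of these case types amount to grouping events by the value of one attribute. A minimal sketch, assuming events are simple records; the field names, the sample events and the `group_by_field` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical bug tracker events; field names are illustrative only.
EVENTS = [
    {"event": "bug-created", "priority": "high-priority", "originator": "dev1"},
    {"event": "bug-closed", "priority": "high-priority", "originator": "dev2"},
    {"event": "bug-created", "priority": "low-priority", "originator": "dev1"},
]


def group_by_field(events, field):
    """Use the value of `field` as the case instance of each event."""
    cases = defaultdict(list)
    for event in events:
        cases[event[field]].append(event["event"])
    return dict(cases)


# Data field case: the priority field decides the case.
print(group_by_field(EVENTS, "priority"))
# Originator case: the developer who caused the event decides the case.
print(group_by_field(EVENTS, "originator"))
```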
FRASR
Case mapping
• Attach data sources to case definition
• Types of bindings (binding → event name):
• Constant binding → CONST
• Data source category binding → Bug tracker
• Data source type binding → TRAC Tickets
• Data source name binding → PRIFES
• Detailed binding → “event per update”
• <DataSource> binding → “create/update/close/…”
• <DataSourceComponent> binding → map fields to “components”
FRASR Features
• Add resources by URL
• Various types of case definitions
• Automatic developer matching
• Originator anonymization
• Multiple export formats
Future work
• Extra features
• Caching of (downloaded) data
• Enable the use of attributes of log elements
• Optional
• GUI layout
• Configurable developer matching
• Batch exports
Questions?
References
• http://frasr.hogeq.nl
Official website of FRASR
• http://www.win.tue.nl/~aserebre/2IS55/
Software Evolution Course
• http://netbeans.org/
NetBeans Website
• http://sourceforge.net/projects/amsn/
aMSN SourceForge project
• http://sourceforge.net/projects/gallery/
GALLERY SourceForge project
• [Nak02] Nakakoji, K., Yamamoto, Y., Nishinaka, Y., Kishida, K., Ye,
Y. Evolution patterns of open-source software systems and
communities. In IWPSE '02: Proceedings of the International
Workshop on Principles of Software Evolution, pages 76-85, New
York, NY, USA, (2002). ACM.