Extreme Grammaring

Download Report

Transcript Extreme Grammaring

Extreme Grammaring
Development of an
industrial strength
ISO VDM-SL Grammar
Introduction
Need of building a parser for VDM from a VDMSL grammar (VooDooM project)
Although parsing is a subset of a well studied area like
compilers, grammars were always looked upon as
the “ugly duckling”.
Solution: Extreme Programming + Engineering of Grammars
=
Extreme Grammaring
Background

VDM:



Vienna Development Method (VDM) is one of the
most mature formal methods
Primarily intended for formal specification and
development of functional aspects of software
systems.
The importance of a VDM-SL Grammar:


Documentation
Build a parser (metric generators, language
generators, ...)
Starting point

Previous work



VDM-SL grammar in Happy + Alex
Some problems
State of Art (Hacking v.s. Engineering)




Grammar was encoded directly
Difficult to maintain/change (300 rules)
Lack of tool support
...
Principles of Grammar
Engineering
Introduced by Lämmel in “Towards an engineering
discipline for grammaware”
1.
2.
3.
4.
5.
6.
Start from specifications - base-line grammar
Implement by customization - technology,
implementation
Separate concerns - modularization
Enable evolution - minimize impact of changes
Ensure quality - metrics, correctness
Automate - traceability and scalability
Extreme Programming
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
The Planning Game - scope, priorities, technical estimates
Small Releases - very short release cycle
Metaphor - shared story
Simple Design - remove complexity when found
Testing - continuous unit testing, test-first design
Refactoring - restructure without functionality changes
Pair Programming - two programmers one machine
Collective Ownership - change each others code (anytime)
Continuous Integration - build and test
40-Hour Week - work more = produce less
On-Site Customer - user in team
Coding Standards - no irrelevant personal preferences
Extreme Grammaring
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
The Planning Game - scope, priorities, technical estimates
Small Releases - very short release cycle
Metaphor - shared story
Simple Design - remove complexity when found
Testing - continuous unit testing, test-first design
Refactoring - restructure without functionality changes
Pair Programming - two programmers one machine
Collective Ownership - change each others code (anytime)
Continuous Integration - build and test
40-Hour Week - work more = produce less
On-Site Customer - user in team
Coding Standards - no irrelevant personal preferences
The Planning Game
Scope


Follow strictly the ISO VDM-SL grammar spec
Priorities

1.
2.
3.
Disambiguate types
Disambiguate full grammar
Tree construction
Technical estimates


Not defined...
Small Releases

Programmed releases (completed):




0.0.1 - Grammar typed from standard
0.0.2 - Disambiguated grammar
0.0.3 - AST construction
Future Releases


1.0 - Haskell front-end (finished - 29-12-2004)
2.0 - Java front-end
Testing

White box



Structural testing
Full visibility into how system works
Black box


Functional or behavioral testing
Only the interface with exterior is available
Grammar Unit Testing

Unit test


Different types of unit tests:



Test a single method, function, etc...
Parsing (succeeds, fails)
Well-formness of the tree
Test suite

Combination of all unit test
Test Coverage

Rule coverage




Introduced by Purdon (1971)
Explores all rules of a grammar
Simple measure but doesn’t cover all cases
Context-dependent rule coverage



Introduced by Lämmel in “Grammar Testing”
Generalization of the above in which the context
is taken in account
No known implementations
Test Coverage Metrics
version
KP
KPr
S
RSa
RSm
RC
0.0.1
0.0.2
0.0.3
1005
946
900
815
2753
2753
2278
3396
3518
2
2
2
17
17
17
52%
54%
55%






KP - Kernel Productions
KPr - Kernel Priorities
S
- States
RSa - Rule Size average
RSm - Rule Size maximum
RC - Rule Coverage Percentage
Test Coverage Metrics (2)


version
Generics
expressions
functiontypes
All
0.0.1
0.0.2
0.0.3
52%
51%
50%
0%
24%
25%
0%
6%
5%
52%
54%
56%
Although the “Generics” test-suite does not change
the coverage gets lower (Injections, total nr rules)
The “expressions” and “functiontypes” were only
added in 0.0.2 version.
Refactoring

Semantic preserving transformations


Study made by Lämmel in “Grammar Adaptation”
Operators:






preserve - replace a phrase by an equivalent
fold
- replace for its definition
unfold
- extract definition
introduce - introduction of non-terminals
eliminate - elimination of non-terminals
rename - renaming non-terminals
Continuous Integration


The integration test suite is a set of generic real
world examples
Only 52% coverage



Examples are difficult to find
Most of the examples use language extensions
Examples:


Found on internet
Used a pre-processor for extracting code from literal
programs.
Code Standards


Nothing found about the subject
The following can be applied:







Limiting the number of children in a rule
Limiting the number of alternatives in a rule
Prefer some sort of constructs than other
Convention for the non-terminal names
Convention for syntax specification
Limit module size
...
Supporting the Methodology
SDF - Syntax Definition Formalism
 Purely declarative
 Very expressive with natural and concise syntax
 Modular structure
 Supported by Scannerless Generalized LR Parsing



Supports compositional grammars
Allows parsing with ambiguities (allows earlier testing)
Disambiguation is separated from the grammar using
priority and associative rules
SDF - Technology

Parsing


Testing


tree2graph, graph2dot
Transformation


test-unit, ambtracker, SdfCoverage
Tree visualization


sdf2table, sglr
trm2baf, implodePT
Haskell Generation

Sdf2Haskell (AST, Pretty Printer)
Setting up the bases


Hard copy of the ISO VDM-SL standard
(ISO/IEC 13817-1)
Initial test suite



Real world examples (loc: 1970)
Exercises from Formal Methods course (loc: 507)
Software:





CVS to keep track of all changes
parse-unit (sdf unit testing tool)
Sdf2 software bundle (sdf2table, sglr)
SdfCoverage
Starndard unix tools (text editor, make, ...)
Development cycle
Initial grammar
Correction
1.
2.
1.
2.
Correct grammar rules
Correct test suite
Disambiguation
3.
1.
2.
Add filters
Change grammar
shape
Steps 2 and 3 should
make heavy use of
testing
Grammar correction
Isolate problem
1.


Source location
Grammar rules involved
Correct grammar
2.



Change syntax (test suite)
Run to verify test succeeds
Run entire test battery
Commit Change
3.

Document change in message
Grammar Disambiguation
Isolate problem
1.


Source location
Grammar rules involved
Create unit test
2.


Captures error
Run to guarantee this
Correct grammar
3.



Add disambiguation filter (change syntax)
Run to verify unit test succeeds
Run entire test battery
Commit Change
4.

Document change in message
Grammar Metrics

Simple metrics



Total Number of Terminals (AVG per rule)
Total Number of Non-terminals (AVG per rule)
Complex metrics

Introduced by Malloy in “A metrics suite for
grammar-base software”



McCabe Cyclomatic complexity
Halstead Effort
...
Problems found


ISO Document has ambiguities in its
specification
Syntax

Expressions:



EqualsDefinition


Apply v.s. RecordConstructor
Apply v.s. IsDefTypeExpr
CallStatement v.s. Expression
Lexical

Quotes are allowed in strings and in characters
Future plans

Short-term (VDM parser “clients”):





VooDooM
Formal methods projects
MCZ Objectifier
Camila revival?
Long-term

Topic open for discussion...
What’s next?

Test set completion (fill the rest 44%)



Analyze the rules that were not covered



Test generation
Add examples manually
Try to find pathologies
Compute Grammar Metrics
Test the methodology developing other
grammars.
Conclusion




Work was completed in only 3 weeks
A complete grammar of the ISO VDM-SL is
for the first time public available (parser)
A strong methodology for grammar
developing was defined
Grammar testing were put to practice


Different types of tests
Test coverage
Thank you!
Questions / Discussion