Documentation Costs Avoided using Python and other Open Standards
Andrew Jonathan Fine
Operating Systems Software Organization
Engines, Systems, and Services
Honeywell International
Original Core Data Flow
[Diagram: Generator Application, Company Database → Translators → Paragraphs, Tables, Pictures → Inserter → Raw.doc → Formatter (with Company Template.dot) → Final.doc (Company Document)]
Single Python application
• set of front end translators
• content inserter
• post-processing formatter
Front End Translator
• Selected by caller
• Caller specifies input file containing corporate data
• Extracts components from file: pictures, tables, paragraphs
• Saves to Python dictionary
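A minimal sketch of the front-end idea, under stated assumptions: the real translators parsed corporate formats such as BEACON property lists, which the slides do not show, so a plain list of (kind, payload) records stands in for the parsed input. The function name `translate` and the dictionary keys are hypothetical.

```python
# Hypothetical front end: a list of (kind, payload) records stands in for
# whatever the real translator extracted from the corporate input file.

def translate(records):
    """Group extracted components by kind into one Python dictionary."""
    components = {"pictures": [], "tables": [], "paragraphs": []}
    for kind, payload in records:
        if kind not in components:
            raise ValueError("unknown component kind: %r" % (kind,))
        components[kind].append(payload)
    return components


# Illustrative input only; the file name and text are made up.
sample = [
    ("paragraphs", "State variables used by the controller."),
    ("tables", [("statex", "Integer"), ("statey", "Long")]),
    ("pictures", "controller_overview.png"),
]
extracted = translate(sample)
```

Keeping the result in an ordinary dictionary means the later stages only need to know the keys, not the format of the original data source.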
Inserter
• Caller selects components from the Python dictionaries made by the front ends for the respective documents.
• Inserter creates a Word document.
• Inserter uses Python/COM to insert components into the document.
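The inserter step might look roughly like the sketch below. It is not the author's code: the function names are hypothetical, and `build_document` assumes Windows, Microsoft Word, and the pywin32 package. The Word calls used (`Documents.Add`, `Content`, `InsertAfter`, `InsertParagraphAfter`) are part of the Word object model.

```python
def insert_paragraphs(document, texts):
    """Append each text as its own paragraph at the end of the document.

    `document` is expected to behave like a Word Document COM object.
    Every text costs two COM calls, which hints at why inserting huge
    numbers of individual elements this way is slow.
    """
    for text in texts:
        document.Content.InsertAfter(text)
        document.Content.InsertParagraphAfter()
    return len(texts)


def build_document(components, keys):
    """Create a new Word document and insert the selected components.

    Assumption: `components` maps hypothetical keys to paragraph text.
    """
    import win32com.client  # imported lazily; only available on Windows
    word = win32com.client.Dispatch("Word.Application")
    document = word.Documents.Add()
    insert_paragraphs(document, [components[key] for key in keys])
    return document
```

Passing the document object into `insert_paragraphs` keeps the insertion logic separate from the COM start-up, so it can be exercised with a stand-in object on machines without Word.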
Back End Formatter
• Scans corporate Word document template
• Scans Word document made by inserter
• Makes final style corrections.
Why?
The flow was designed to cope with changes in requirements!
• New projects
• New teams
• New data source formats
• New standards for existing formats
First front-end translator
Take pictures, tables, and data from a recursive property list constructed by an aerospace industry software visual programming tool called BEACON.
(… actual design of translator outside the scope of this paper …)
Initial Design of Inserter
• Straightforward use of principles demonstrated by Mark Hammond's book, Python Programming on Win32.
• The book has a chapter containing a thorough treatment of how to have Python use the Word 97 COM object model to create and manipulate a Word document.
Problems!!!
• Must cope with huge amounts of corporate data, such as table cells.
• The COM interface is slow when creating each new individual element.
• Detailed typesetting of elements is hard to reuse.
What I wanted:
• Faster conversion
• Existing standard
• Callable from Python
What I found:
• Faster conversion (OpenJade)
• Existing standard (DocBook SGML)
Why Call from Python?
• New scripting language to replace islands of automation (Perl, MS-DOS, internal test stand controller language).
• Easier to connect islands after writing in Python.
• Open source, thus continuously peer reviewed.
• Tremendous user base! Plenty of wrappers written in Python around open source libraries supporting open standards.
… so I wrote a Python wrapper around some DocBook rules …
Revised Core Data Flow
[Diagram: Generator Application, Company Database → Translators → Paragraphs, Tables, Pictures → DocBook.py → typesetting text in DocBook SGML (DocBook.sgml) → OpenJade → Result.rtf → Cleanup.py (with Company Template.dot) → Final.doc (Company Document)]
[OpenJade inputs under \usr\packages\sgml: DocBook SGML definition and default stylesheets; local DocBook DSSSL stylesheets (Local.dsl, Local.dtd)]
• Python wrapper writes DocBook SGML
• OpenJade translates DocBook SGML to Word RTF
A DocBook informal table rendered by OpenJade into Word

Name      Type
statex    Integer
statey    Long

Input to OpenJade as local DocBook SGML
<!DOCTYPE informaltable SYSTEM "C:\Local.dtd">
<informaltable frame='all'>
  <tgroup cols='2' colsep='1' rowsep='1' align='center'>
    <colspec colname='Name' colwidth='75' align='left'></colspec>
    <colspec colname='Type' colwidth='64' align='center'></colspec>
    <thead>
      <row>
        <entry><emphasis role='bold'>Name</emphasis></entry>
        <entry><emphasis role='bold'>Type</emphasis></entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry><phrase role='xe' condition='italic'>statex</phrase></entry>
        <entry>Integer</entry>
      </row>
      <row>
        <entry><phrase role='xe' condition='italic'>statey</phrase></entry>
        <entry>Long</entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>
from DocBook import DocBook

class ItalicIndexPhrase (DocBook.Rules.Phrase):
    "italic indexible text phrase"
    TITLE = DocBook.Rules.Phrase
    def __init__ (self, text):
        DocBook.Rules.Phrase.__init__ (self, 'xe', 'italic')
        self.data = [ text ]

class NameCell (DocBook.Rules.Entry):
    "table row cell describing name of identifier (italic and indexible text!)"
    TITLE = DocBook.Rules.Entry
    def __init__ (self, text):
        DocBook.Rules.Entry.__init__ (self)
        self.data = [ ItalicIndexPhrase (text) ]

class StorageCell (DocBook.Rules.Entry):
    "table row cell describing storage type of identifier (ordinary text)"
    TITLE = DocBook.Rules.Entry
    def __init__ (self, text):
        DocBook.Rules.Entry.__init__ (self)
        self.data = text

class TRow (DocBook.Rules.Row):
    "each row in application's informal table body"
    TITLE = DocBook.Rules.Row
    def __init__ (self, binding):
        (identifier, storage) = binding
        DocBook.Rules.Row.__init__ (self, [ NameCell (identifier),
                                            StorageCell (storage)
                                          ])

class TBody (DocBook.Rules.TBody):
    "application's informal table body"
    TITLE = DocBook.Rules.TBody
    def __init__ (self, items):
        DocBook.Rules.TBody.__init__ (self, map (TRow, items))

class TGroup (DocBook.Rules.TGroup):
    "application's informal table group"
    COLSPECS = [ DocBook.Rules.ColSpec ('Name', 75, 'left'),
                 DocBook.Rules.ColSpec ('Type', 64, 'center')
               ]
    SHAPE    = [ '2', '1', '1', 'center' ]
    TBODY    = TBody

class InformalTable (DocBook.Rules.InformalTable):
    "application's informal table"
    TGROUP = TGroup

class Example (DocBook):
    'example application of DocBook formatting class'
    SECTION = str (InformalTable)
    def __call__ (self):
        self.data = [ InformalTable ()(self.data) ]
        return DocBook.__call__ (self)

if __name__ == '__main__':
    print Example ([('statex', 'Integer'), ('statey', 'Long')]) ()
Python code to translate data into OpenJade input in local DocBook SGML (based on Python to DocBook sample wrapper class DocBook)
Using class DocBook
• Class DocBook from DocBook.py in Appendix F is the top-level interface callable class.
• Application inherits from class DocBook.
• Contents of application inherit from classes contained by DocBook.Rules.
• Use overrides to specify structure, formatting, and text.
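The inherit-and-override pattern can be illustrated with a small self-contained sketch. This is not the DocBook.py wrapper from Appendix F, whose internals are not shown in the slides: the `Element` base class and its `TAG` and `ATTRS` attributes are hypothetical stand-ins for DocBook.Rules, and no SGML escaping is attempted.

```python
class Element(object):
    """Hypothetical base rule: a tag, fixed attributes, and child data."""
    TAG = None
    ATTRS = ()

    def __init__(self, data=()):
        self.data = list(data)

    def __str__(self):
        # Render attributes, then children, inside the element's own tag.
        attrs = "".join(" %s='%s'" % pair for pair in self.ATTRS)
        body = "".join(str(item) for item in self.data)
        return "<%s%s>%s</%s>" % (self.TAG, attrs, body, self.TAG)


class Entry(Element):
    TAG = "entry"


class Row(Element):
    TAG = "row"


class ItalicIndexPhrase(Element):
    """Override that fixes the formatting for one kind of phrase."""
    TAG = "phrase"
    ATTRS = (("role", "xe"), ("condition", "italic"))


def binding_row(identifier, storage):
    """Build one table row from an (identifier, storage type) pair."""
    return Row([Entry([ItalicIndexPhrase([identifier])]), Entry([storage])])


sgml = str(binding_row("statex", "Integer"))
```

Each subclass states only what differs from its parent, so a typesetting decision such as "identifiers are italic and indexed" lives in one place and is reused by every table that needs it.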
OpenJade
• OpenJade is an open source DSSSL execution engine available from SourceForge.
• DSSSL is an ISO standard for typesetting specification and document conversion.
• OpenJade reads DocBook DSSSL stylesheets and our local DSSSL stylesheets, if any.
• OpenJade executes the DSSSL upon SGML source text to write a final document for later loading into a word processor.
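Calling OpenJade from Python could be sketched as follows. The `-t`, `-d`, and `-o` options are standard OpenJade command-line options (output type, DSSSL specification, output file). The function names are hypothetical, the file names are taken from the data flow diagram, and the sketch assumes `openjade` is on the PATH and can find its SGML catalogs.

```python
import subprocess


def openjade_command(sgml_path, stylesheet_path, rtf_path):
    """Build the argument list for an OpenJade run that writes RTF."""
    return [
        "openjade",
        "-t", "rtf",            # back end: Rich Text Format
        "-d", stylesheet_path,  # DSSSL stylesheet to execute
        "-o", rtf_path,         # output file
        sgml_path,              # DocBook SGML source
    ]


def run_openjade(sgml_path, stylesheet_path, rtf_path):
    """Run OpenJade and return its exit status (0 means success)."""
    return subprocess.call(
        openjade_command(sgml_path, stylesheet_path, rtf_path))


command = openjade_command("DocBook.sgml", "Local.dsl", "Result.rtf")
```

Building the argument list separately from running it makes the command easy to log or inspect before the conversion is launched.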
DocBook Post-Processing using Word Automation with Python/COM
• DocBook/OpenJade emits RTF with different Word document style identifier names than those in the corporate Word DOT file.
• Much faster to change a document using Python/COM than to create one!
• Cannibalized Python code from the inserter first draft to create the post-processor.
• Reads the RTF, makes the changes, and saves the result as the final DOC.
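A rough sketch of such a post-processor is given below. It is not the author's Cleanup.py: the function names and the style map are hypothetical, it assumes Windows, Microsoft Word, and pywin32, and it assumes the target styles already exist in the opened document. `Documents.Open`, `Paragraphs`, `Style`, `NameLocal`, and `SaveAs` are part of the Word object model; file format 0 is Word's wdFormatDocument constant.

```python
# Hypothetical mapping from an emitted style name to a corporate style name.
STYLE_MAP = {"Heading 1": "Company Heading 1"}

WD_FORMAT_DOCUMENT = 0  # Word's wdFormatDocument constant


def rename_styles(document, style_map):
    """Re-style every paragraph whose style name appears in style_map."""
    changed = 0
    for paragraph in document.Paragraphs:
        name = paragraph.Style.NameLocal
        if name in style_map:
            paragraph.Style = style_map[name]
            changed += 1
    return changed


def cleanup(rtf_path, doc_path, style_map=STYLE_MAP):
    """Open the RTF in Word, fix the styles, and save as a Word document."""
    import win32com.client  # imported lazily; only available on Windows
    word = win32com.client.Dispatch("Word.Application")
    document = word.Documents.Open(rtf_path)
    changed = rename_styles(document, style_map)
    document.SaveAs(doc_path, FileFormat=WD_FORMAT_DOCUMENT)
    document.Close()
    return changed
```

Only paragraphs with a mapped style are touched, which keeps the number of COM calls small compared with building the document element by element.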
Return on Investment
5 projects ranging from 30 to 150 BEACON files, averaging about 75 files.
Each project has 2 releases per year, and each file must generate hard copy.

Previously (cut/paste by hand):
Each project release:
  1/5 * 75 * 4 hours                   =     60 hours
  3/5 * 75 * 8 hours                   =    360 hours
  1/5 * 75 * 16 hours                  =    240 hours
                                          ------
                                            660 hours
Two releases per year:             * 2 =  1,320 hours
Five projects needing releases:    * 5 =  6,600 hours
Two year period (2002-2003):       * 2 = 13,200 hours
                                          ------
Total effort avoided:                    13,200 hours

Automated:
Automated releases over 2 year period:                      160 hours
My effort (12 * 140 hours per labor month):               1,680 hours
Total investment:                                         1,840 hours
Net effort avoided, 2002-3:                              11,360 hours
Net avoided by customers 2002-3 at $100/hour:         1,136,000 dollars
Net labor years avoided 2002-3 at 1680 hours/year:         6.76 years
Headcount avoided per year:                                3.38 people
ROI (Total effort avoided / total invested) 2002-3:        7.17
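The arithmetic above can be checked with a few lines of Python. All figures come from the slide; only the variable names are introduced here.

```python
files = 75  # average BEACON files per project

# Manual effort per project release: a fifth of the files took 4 hours,
# three fifths took 8 hours, and a fifth took 16 hours each.
per_release = (files // 5) * 4 + (3 * files // 5) * 8 + (files // 5) * 16

# 2 releases per year, 5 projects, 2 years (2002-2003).
effort_avoided = per_release * 2 * 5 * 2

automated_releases = 160           # hours spent on automated releases
my_effort = 12 * 140               # 12 labor months at 140 hours each
investment = automated_releases + my_effort

net_hours = effort_avoided - investment
net_dollars = net_hours * 100      # at $100 per hour
labor_years = net_hours / 1680.0   # at 1680 hours per labor year
headcount_per_year = labor_years / 2
roi = effort_avoided / float(investment)

print(per_release, effort_avoided, investment, net_hours, net_dollars)
print(round(labor_years, 2), round(headcount_per_year, 2), round(roi, 2))
```

The results match the slide: 660 hours per release, 13,200 hours avoided, 1,840 hours invested, 11,360 net hours, 6.76 labor years, 3.38 people per year, and an ROI of 7.17.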
Python and DocBook together
• Python connects our department's engineering-specific islands of automation.
• Python with DocBook created Word documents from engineering data.
• The combination of an open language with an open standard eliminated a real-world business process bottleneck.
• The return on investment was substantial.