DOM Presentation


CDT DOM Roadmap
Doug Schaefer
Oct 26, 2005
Parser History

CDT 1.0
► JavaCC based parser
   Used to populate CModel and Structure Compare
► ctags based indexer
   Used for open declaration, text hover, content assist

CDT 1.1
► Introduced first handwritten recursive descent parser
   Used for CModel and Structure Compare
Parser History

CDT 1.2
► Ctags based indexer replaced with parser based indexer
   Parser symbol table added to capture semantic info
► Added C/C++ Search that used the index
   Hooked up content assist to search engine

CDT 2.0
► Added content assist as parser client (more accurate)
► Added Type Cache to cache types for Open Type, Class Browser
► Added parsing of text selection for search features
CDT 3.0 The Dawn of the DOM


Previous architecture used callbacks to communicate with clients
► Passed in objects describing grammar rules that got accepted
► Left it to clients to create the necessary data structures
► Thought it would reduce memory consumption
► However, made it very difficult to build clients
► And the objects the parser was passing were almost an AST anyway

Wanted to support advanced features with parser
► Refactoring
► Code analysis such as call hierarchies
► i.e. JDT catch up…
CDT 3.0 The Dawn of the DOM


Better approach is to follow traditional architecture of compilers
► Abstract Syntax Tree captures structure of code
► Symbol Table captures semantic information
► No more callbacks, clients get root node of AST and go from there

We also added links from AST nodes back to source locations
► Including navigation through macros and inclusions
► Facilitate refactoring
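
As a rough illustration of "clients get root node of AST and go from there", here is a minimal sketch of a tree whose nodes carry source locations, plus one client that walks it. Every type and method name below (AstSketch, Node, Location, Visitor, accept, functionNames, sampleTree) is hypothetical and far simpler than the real CDT DOM interfaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified types for illustration only; not the CDT DOM interfaces.
final class AstSketch {

    /** A source range that a node maps back to. */
    static final class Location {
        final String file;
        final int offset;
        final int length;

        Location(String file, int offset, int length) {
            this.file = file;
            this.offset = offset;
            this.length = length;
        }
    }

    /** A tree node with a kind, some text, a source location, and children. */
    static final class Node {
        final String kind;
        final String text;
        final Location location;
        final List<Node> children = new ArrayList<>();

        Node(String kind, String text, Location location) {
            this.kind = kind;
            this.text = text;
            this.location = location;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Implemented by clients; return false to skip a node's children. */
    interface Visitor {
        boolean visit(Node node);
    }

    /** Depth-first, pre-order walk starting at any node. */
    static void accept(Node node, Visitor visitor) {
        if (visitor.visit(node)) {
            for (Node child : node.children) {
                accept(child, visitor);
            }
        }
    }

    /** Example client: collect function names, starting from the root. */
    static List<String> functionNames(Node root) {
        List<String> names = new ArrayList<>();
        accept(root, node -> {
            if (node.kind.equals("function")) {
                names.add(node.text);
            }
            return true;
        });
        return names;
    }

    /** Builds a tiny tree by hand; a real parser would produce this. */
    static Node sampleTree() {
        Node root = new Node("translation-unit", "main.c", new Location("main.c", 0, 60));
        Node helper = new Node("function", "helper", new Location("main.c", 0, 25));
        Node main = new Node("function", "main", new Location("main.c", 27, 33));
        main.add(new Node("call", "helper", new Location("main.c", 45, 8)));
        return root.add(helper).add(main);
    }
}
```

The client decides what to collect and when to stop descending, instead of reacting to parser callbacks in whatever order the parser fires them.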
CDT DOM Architecture
[Diagram labels: Abstract Syntax Tree, Locations, Bindings, declarations, references, Names]
CDT 3.0 DOM Clients

DOM (Full) Indexer

Search Actions
► Open Declaration, Open Definition

Content Assist

Refactoring
CDT 3.0 Clients Still on Old Parser



CModel and Structure Compare
► Requires very fast parsing to satisfy Views
► Generally only cares about contents of a given file

Type Cache
► Used by Open Type and Class Browser
► Previously required since it needs all types in workspace
► Also needs to be updated when types are added, removed, or changed

Objective: Move these clients to DOM
► Need to make sure DOM meets their requirements
► Then we can get rid of the old parser
Problems with the DOM


DOM is complete but requires a lot of processing and memory
► Caching DOM parse results would exacerbate the memory problem
► Optimized algorithms as much as we could
► DOM Indexer is faster than CDT 2.x indexer but still takes a long time on large projects

No rewrite capability
► JDT DOM supports translating DOM changes into TextEdits
► Required to properly support refactoring
Solving Performance with PDOM

PDOM – Persisted DOM
► Persist highly used parts of the DOM in a database

Assumptions:
► Many clients do not require 100% completeness
   Some do
► Header files always produce the same AST Nodes
   That’s not 100% true (e.g., stddef.h)
► Declarations do not span files
   I have seen cases where that is not true (includes in the middle of a function)
► Database lookups faster than parsing header files
   We’ll see, but so far I’ve seen that to be true with embedded Derby
PDOM Explained

Skip over header files that have up-to-date information already stored in the PDOM

Persist Names and Bindings in PDOM to satisfy
► resolveBinding and resolvePrefix
   Navigate from Names to Bindings
► getDeclarations, getDefinitions, getReferences
   Navigate from Bindings to Names
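
The two navigation directions can be pictured with a small in-memory model. This is only a sketch: the method names are taken from the slide, but the classes (BindingSketch, Name, Binding, Store, Role) and their signatures are hypothetical, and keying bindings by plain identifier text ignores scoping, which a real symbol table cannot do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model for illustration; only the method names come from the slides.
final class BindingSketch {

    enum Role { DECLARATION, DEFINITION, REFERENCE }

    /** One occurrence of an identifier in the source. */
    static final class Name {
        final String text;
        final String file;
        final int offset;
        final Role role;
        private final Binding binding;

        Name(String text, String file, int offset, Role role, Binding binding) {
            this.text = text;
            this.file = file;
            this.offset = offset;
            this.role = role;
            this.binding = binding;
        }

        /** Names to Bindings. */
        Binding resolveBinding() {
            return binding;
        }
    }

    /** The semantic entity that a set of names refers to. */
    static final class Binding {
        final String name;
        private final List<Name> names = new ArrayList<>();

        Binding(String name) {
            this.name = name;
        }

        /** Bindings to Names, filtered by the role each name plays. */
        List<Name> getDeclarations() { return withRole(Role.DECLARATION); }
        List<Name> getDefinitions() { return withRole(Role.DEFINITION); }
        List<Name> getReferences() { return withRole(Role.REFERENCE); }

        private List<Name> withRole(Role role) {
            List<Name> result = new ArrayList<>();
            for (Name n : names) {
                if (n.role == role) {
                    result.add(n);
                }
            }
            return result;
        }
    }

    /** Stand-in for the persisted store; a sorted map makes prefix lookups cheap. */
    static final class Store {
        private final TreeMap<String, Binding> bindings = new TreeMap<>();

        Name addName(String text, String file, int offset, Role role) {
            Binding binding = bindings.computeIfAbsent(text, Binding::new);
            Name name = new Name(text, file, offset, role, binding);
            binding.names.add(name);
            return name;
        }

        /** All bindings whose name starts with the prefix (content assist style). */
        List<Binding> resolvePrefix(String prefix) {
            return new ArrayList<>(
                bindings.subMap(prefix, prefix + Character.MAX_VALUE).values());
        }
    }
}
```

Open Declaration would go from the name under the cursor to its binding and then to the binding's declarations; content assist would use the prefix lookup.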
CDT PDOM Architecture
[Diagram labels: Abstract Syntax Tree, Locations, Bindings, declarations, references, Names; PDOM objects shown in black]
PDOM Database Engine

Want to be flexible to allow ISVs to plug in their own embedded database technology

Default implementation is on Apache Derby
► Embedded SQL database engine
► Apache licensed
► Already used by other Eclipse projects (BIRT, TPTP?)

Big worry is performance of database writes
► Will need to tune caching to make sure it is fast enough
► Objective: populate PDOM with Mozilla source in 20 minutes
   On my Athlon XP 2800 512MB FC4 Linux
   Current full index time is 90 minutes
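
One way to picture the pluggable database goal is a narrow storage interface that the PDOM code depends on, with the concrete engine supplied from outside. The sketch below is an assumption about how such a seam could look, not the actual design: NameStore, InMemoryStore, and Pdom are hypothetical names, and the in-memory map merely stands in for Derby or an ISV's engine.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage seam for illustration; not the CDT PDOM API.
final class PluggableStoreSketch {

    /** Contract that a database back end would implement. */
    interface NameStore {
        void put(String key, String value);

        /** Returns null when the key is absent. */
        String get(String key);
    }

    /** Trivial back end used here in place of a real embedded database. */
    static final class InMemoryStore implements NameStore {
        private final Map<String, String> rows = new HashMap<>();

        @Override
        public void put(String key, String value) {
            rows.put(key, value);
        }

        @Override
        public String get(String key) {
            return rows.get(key);
        }
    }

    /** Client code sees only the interface, so the engine can be swapped. */
    static final class Pdom {
        private final NameStore store;

        Pdom(NameStore store) {
            this.store = store;
        }

        void recordDefinition(String symbol, String file) {
            store.put(symbol, file);
        }

        String definitionFile(String symbol) {
            return store.get(symbol);
        }
    }
}
```

A write cache, which the slide expects to need, could sit behind the same interface without clients noticing.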
Final Migration

Need to move all features to the DOM
► Code reduction exercise

Need for indexer removed
► May still want to populate PDOM using ctags

Migrate Search Engine to query the PDOM

Migrate CModelBuilder and CStructureBuilder to DOM
► Since we can skip header files, this should be pretty fast.
► Need to monitor it though to make sure.
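
The header skipping that this speed-up relies on can be sketched as a freshness check before parsing. This is a minimal sketch under an assumption the slides do not state: that "up-to-date" is decided by comparing a file's modification time with the one recorded when it was last stored. HeaderSkipSketch and its members are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical freshness check for illustration; the timestamp policy is an assumption.
final class HeaderSkipSketch {

    /** Modification time recorded for each header when it was last stored. */
    private final Map<String, Long> storedStamp = new HashMap<>();

    /** Counts how often the (simulated) parser actually ran. */
    int parseCount = 0;

    boolean isUpToDate(String path, long lastModified) {
        Long stamp = storedStamp.get(path);
        return stamp != null && stamp.longValue() == lastModified;
    }

    /**
     * Handles one #include. Returns true if the header had to be parsed,
     * false if it was skipped because the stored information is current.
     */
    boolean include(String path, long lastModified) {
        if (isUpToDate(path, lastModified)) {
            return false;
        }
        parseCount++; // stand-in for parsing the header and persisting its names
        storedStamp.put(path, lastModified);
        return true;
    }
}
```

With this policy a header included by a thousand source files is parsed once, then parsed again only after it changes.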
Final Migration

Remove Type Cache
► PDOM queries should be fast enough

Migrate Class Browser, Type Hierarchy, and Open Type to PDOM
► Use queries to find list of types and relationships
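
The kinds of queries these features would need can be sketched over a flat list of type records. The record layout and every name here (TypeQuerySketch, TypeRecord, typeNamesStartingWith, directSubtypes, allSubtypes) are hypothetical; a real PDOM would answer the same questions from its database rather than from a list scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical type store for illustration; not the CDT PDOM API.
final class TypeQuerySketch {

    static final class TypeRecord {
        final String name;
        final String kind; // e.g. "class", "struct", "enum"
        final List<String> bases;

        TypeRecord(String name, String kind, List<String> bases) {
            this.name = name;
            this.kind = kind;
            this.bases = bases;
        }
    }

    private final List<TypeRecord> records = new ArrayList<>();

    void add(String name, String kind, String... bases) {
        records.add(new TypeRecord(name, kind, Arrays.asList(bases)));
    }

    /** Open Type style query: sorted type names matching a typed prefix. */
    List<String> typeNamesStartingWith(String prefix) {
        List<String> result = new ArrayList<>();
        for (TypeRecord r : records) {
            if (r.name.startsWith(prefix)) {
                result.add(r.name);
            }
        }
        Collections.sort(result);
        return result;
    }

    /** Type Hierarchy style query: types that list the given base directly. */
    List<String> directSubtypes(String base) {
        List<String> result = new ArrayList<>();
        for (TypeRecord r : records) {
            if (r.bases.contains(base)) {
                result.add(r.name);
            }
        }
        return result;
    }

    /** Transitive subtypes; the seen set also guards against cyclic data. */
    List<String> allSubtypes(String base) {
        Set<String> seen = new LinkedHashSet<>();
        collect(base, seen);
        return new ArrayList<>(seen);
    }

    private void collect(String base, Set<String> seen) {
        for (String sub : directSubtypes(base)) {
            if (seen.add(sub)) {
                collect(sub, seen);
            }
        }
    }
}
```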