here - Computer Science

Download Report

Transcript here - Computer Science

Efficient Representation of Data
Structures on Associative Processors
Jalpesh K. Chitalia
(Advisor Dr. Robert A. Walker)
Computer Science Department
Kent State University
Presentation Outline







ASC Processor Architecture
Associative Features
Structure Codes
Represent Data Structures
Structure Code Operations
Implementation of Structure Codes
Summary and Future Work
The ASC Processor
 A scalable design implemented on a
million gate Altera FPGA
 SIMD-like architecture
 Currently, 36 8-bit Processing
Elements (PE) available
 8-bit Instruction Stream (IS) control
unit with 8-bit Instruction and Data
addresses, 32-bit instructions
The ASC Architecture
PE and Me mory
Network
Data
Bus
Responder
Resoluti on
Uni t
PE and Me mory
Instruction
Bus
memory
and supporting circuitry
Con trol
Uni t
From Control Unit
PE and Me mory
Common
Registers
PE and Me mory
PE Array
The ASC Architecture
 Each PE listens to the IS through the
broadcast and reduction network
 PEs can communicate amongst
themselves using the PE Network
 PE may either execute or ignore the
microcode instruction broadcast by IS
under the control of the Mask Stack
The ASC Features
 Associative Search
 Each PE can search its local memory for
a key under the control of IS
 Responder Resolution
 A special circuit signals if ‘at least one’
record was found
 Masked Operation
 Local Mask Stacks can turn on or off the
execution of instruction from IS
The ASC Example
Select * from Students where Grade > 90
The ASC Features
 Constant Time Associative Operations
 Associative Search
 Finding minimum or maximum in a field
 Ideal for Database processing
 Data is organized in tabular format
 Each tuple in a table can be processed by one PE
 PE Network
 Many parallel algorithms require all PEs to move
contents in a regular pattern
 E.g.: Image Convolution, Matrix Multiplication
Non Tabular Data Structures
 Many applications use linked list based data
structures
 For example, plain HTML parsing can be done
using tree-structure
 Similarly, XML or Object-relational databases
can be represented using trees only
 Game programming uses tree-based algorithms,
and demand much of processing power
 Complier construction uses directed acyclic
graphs and tree structures
Data Structure Codes
 A unique coding scheme
 Allows representation of any data structure into
a tabular format
 Tabular format allows division of data amongst
the PEs
 Also known as “structure code”
 Different coding schemes for different data
structures may be required
 Uses Associative Search feature of the ASC
Processor
Simple List-based Structures
 The left figure shows a possible representation of
1D and 2D arrays using structure codes
 The right figure represents stack and queue,
depending on the use of appropriate functions
Complex List-based Structures
 Each digit-position
indicates the level
of a tree
 Each value in that
position indicates
the position of that
child from the left
 Discussions
henceforth are
confined to trees
 Graphs are read in
a slightly different
manner
Structure Code Operations
 Two sets of constant-time operations:
scalar and parallel
 Scalar Instructions are simple mathematic
operations
 Search parent, child or root
 Finds value of structure code for next or
previous nodes
 Parallel Instructions use complex
associative operations
 Finds code for next, previous or both siblings
 ‘Locates’ the required node for further
processing
Scalar Operations
fstcd: leftmost child
nxtcd: right sibling
prvcd: left sibling
trncd: truncate
(parent) node
 trnacd: truncate all
(root) node




 Can be used to
allocate a new node
 Limited use in
searching records
Parallel Operations
 Index instructions:
 Flags a node of the result to ease further
processing
 nxtdex (next or right), prvdex (previous or left)
and sibdex (siblings or both left and right)
 Value instructions:
 Returns the exact structure code of the result
 nxtval (next or right) and prvval (previous or
left)
 Can be used to ‘locate’ nodes in any tree
 Uses parallel and associative hardware
resources
Applications of DSC Ops
 Scalar
 Associative searching: the parent or the
root in a given data structure space
 Allocating elements in a data structure
space
 Value
 Finding the value of resultant structure
code
 Index
 Most useful in associative search (eg, tree
traversal)
Implementation Requirements
 Hardware functionality
 Debugging and developing I/O functionality
 Debugging few other instructions
 Structure Code Operations
 Scalar Operations: nxtcd, prvcd
 Parallel Operations: nxtdex, nxtval, prvdex,
prvval, sibdex
 Parallel Operations
 Most complex set amongst ASC operations
Implementation – Value Ops
 Processing of Parallel Value Ops
 Associative search, associative min/max
 Input Cycle
 All the elements of data structure are
distributed amongst the PEs sequentially
 Reference node is broadcast to them
 Output Cycle
 Multi-byte structure code is stored in
destination address (scalar memory)
Implementation – Index Ops
 Processing of Parallel Index Ops
 The resultant element(s) is (are)
‘marked’ for further processing
 Note: sibdex may select two results
 Input and Output
 Input: same as in value operations
 Output: Bit flags for all the input nodes
are stored sequentially in destination
address (scalar memory)
Summary
 SIMD-based computers are more suited for
database processing
 ASC processor with its associative
operations makes them more efficient
 Structure codes translate non-tabular data
structures into a tabular format
 Tree and Graphs can be represented and evenly
divided like records in a table
 Object-relational databases, HTML processing
 Stacks, Queues, multi-dimensional arrays can be
efficiently processed
 Structure codes not required for even
representation
Future Work
 Efforts are in place to clean the architecture
 To allow multiplier/divider on each PE
 To accommodate RISC instruction set
 Unpacked bytes are used to support
variable-length structure code
 Can be avoided with an efficient divider unit
 Input/Output
 Special structure code operations are defined
but not implemented in this work
 Parallel input/output is also required
 Developing applications that use structure
codes
Acknowledgements




Professor Walker
Professor Potter
Committee members for their time
Kevin Schaffer, Hong Wang, Meiduo
Wu, Lei Xie, Ping Xu
Questions?
Thank
You!