An Optoelectronic
Neural Network Packet
Switch Scheduler
K. J. Symington, A. J. Waddie, T. Yasue,
M. R. Taghizadeh and J. F. Snowdon.
http://www.optical-computing.co.uk
Outline
• Packet switch scheduler.
• Previous demonstrator has proven system
feasibility.
• Current demonstrator enhances functionality
and performance.
• Motivation.
• Implementation and scalability.
• Conclusions.
The Assignment Problem
Can be found in situations such as:
• Network service management.
• Distributed computer systems.
• Work management systems.
• General scheduling, control or resource
allocation.
Solution is computationally intensive.
Neural networks are capable of solving the
assignment problem.
Their inherent parallelism allows them to
outperform any other known method at higher
orders.
Crossbar Switching
A size N crossbar switch has the same
number of inputs as outputs: i.e. m=n=N.
Crossbar Switching
• Packets stored in buffer until output free.
• Packets can request any output line.
• Buffer depth very important.
• Real traffic tends to be ‘bursty’.
Crossbar Switching
• Channel operation
exclusive.
• Maximum capacity
of N packets per
switch cycle.
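The exclusivity rule can be sketched as a small check: a switch configuration is valid only if no input line and no output line carries more than one packet in the same cycle, which caps throughput at N packets. The function name and the 1-indexed (input, output) pair representation are illustrative assumptions, not part of the original system.

```python
def is_valid_configuration(crosspoints, n):
    """Return True if each input and each output is used at most once.

    crosspoints: iterable of (input, output) pairs, 1-indexed, values in 1..n.
    This representation is an assumption made for illustration.
    """
    pairs = list(crosspoints)
    inputs = [i for i, _ in pairs]
    outputs = [j for _, j in pairs]
    in_range = all(1 <= v <= n for v in inputs + outputs)
    # Exclusive channels: no repeated input line and no repeated output line.
    return (in_range
            and len(set(inputs)) == len(inputs)
            and len(set(outputs)) == len(outputs))


print(is_valid_configuration([(1, 3), (2, 4), (4, 2)], n=4))  # True
print(is_valid_configuration([(2, 2), (4, 2)], n=4))          # False: output 2 used twice
```

Any valid configuration necessarily contains at most N pairs, since each of the N inputs appears at most once.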
Crossbar Switching
• Packet can only pass when crosspoint set.
• N² crosspoint switches required.
• Generic crossbar switch architecture.
Crossbar Switching
• Neural network chooses optimal set of packets.
• One neuron required for every crosspoint.
Crossbar Switching
Banyan Switching
Solution Optimality
• Routing input 2 to output 2 allows only 1 packet to pass.
Solution is sub-optimal.
• Routing input 2 to output 4 and input 4 to output 2 allows 2
packets to pass. Solution is optimal.
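The two outcomes above can be reproduced with a brute-force search. This is a minimal sketch, assuming the request set is (2, 2), (2, 4) and (4, 2) as the bullets imply, and assuming a crossbar where the only constraint is one packet per input and per output. Exhaustive search is used here only as a reference for small cases; it is not the neural method described in the talk.

```python
from itertools import combinations


def best_schedules(requests):
    """Return (max_packets, list of largest conflict-free request subsets)."""
    requests = sorted(set(requests))
    # Try the largest subsets first and stop at the first size that works.
    for size in range(len(requests), 0, -1):
        found = []
        for subset in combinations(requests, size):
            inputs = {i for i, _ in subset}
            outputs = {j for _, j in subset}
            if len(inputs) == size and len(outputs) == size:
                found.append(subset)
        if found:
            return size, found
    return 0, []


# Assumed request set: input 2 wants outputs 2 and 4, input 4 wants output 2.
requests = [(2, 2), (2, 4), (4, 2)]
size, schedules = best_schedules(requests)
print(size, schedules)  # 2 [((2, 4), (4, 2))]
```

Granting (2, 2) blocks both other requests, so it can never be part of a two-packet schedule.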
The Neuron
• Inputs taken from the outputs of
other neurons.
• Synaptic weights
multiply inputs.
• Inputs are summed and bias
added.
• Transfer function f(x) performed
before output.
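The four steps above can be sketched as a single function. The sigmoid used for f(x) is the transfer function given later in the talk; the function name, argument names and example values are assumptions chosen for illustration.

```python
import math


def neuron_output(inputs, weights, bias, gain=1.0):
    """Weighted sum of inputs plus bias, passed through a sigmoid f(x)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Synaptic weights multiply the inputs; the products are summed with the bias.
    x = sum(w * y for w, y in zip(weights, inputs)) + bias
    # Transfer function applied before output; gain sets the sigmoid's steepness.
    return 1.0 / (1.0 + math.exp(-gain * x))


# Two fully-on inhibitory inputs exactly cancel a bias of 2, giving f(0) = 0.5.
print(neuron_output([1.0, 1.0], [-1.0, -1.0], bias=2.0))  # 0.5
```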
Neural Algorithm
Next state defined by:
$$x_{ij}(t) = x_{ij}(t-1) + \Delta t \left( -A \sum_{k \neq j}^{n} y_{ik} - B \sum_{k \neq i}^{n} y_{kj} + \frac{C}{2} \right)$$

$x_{ij}$: Summation of all the inputs to the neuron referenced by ij, including the bias.
$y_{ij}$: Output of neuron ij.
A, B and C: Optimisation parameters.
‘Iterations to Convergence’ is an important parameter.
Iterations related to, but not necessarily equal to, time.
Neural transfer function:
$$y_{ij} = f(x_{ij}) = \frac{1}{1 + e^{-\beta x_{ij}}}$$

$\beta$: Controls gain of neuron.
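A minimal software sketch of the iteration follows. It reads the update rule with inhibitory (negative) A and B terms and a positive C/2 bias, and applies the sigmoid transfer function with a gain parameter. The parameter values, the starting value `x0`, the iteration count, and the choice to hold unrequested neurons at zero output are all assumptions made for illustration; they are not the values used in the demonstrator.

```python
import math


def sigmoid(x, beta):
    """Numerically safe sigmoid 1 / (1 + exp(-beta * x))."""
    z = beta * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def schedule(requests, n, A=1.0, B=1.0, C=1.0, beta=1.0,
             dt=0.1, x0=1.0, iterations=200):
    """Iterate the next-state rule; return outputs y keyed by (i, j), 1-indexed."""
    # Start state: every requested neuron is on (positive x). Assumed value x0.
    x = {ij: x0 for ij in requests}
    for _ in range(iterations):
        y = {ij: sigmoid(v, beta) for ij, v in x.items()}
        new_x = {}
        for (i, j) in x:
            # Inhibition from other neurons on the same input line (row) ...
            row = sum(y.get((i, k), 0.0) for k in range(1, n + 1) if k != j)
            # ... and on the same output line (column).
            col = sum(y.get((k, j), 0.0) for k in range(1, n + 1) if k != i)
            new_x[(i, j)] = x[(i, j)] + dt * (-A * row - B * col + C / 2)
        x = new_x
    return {ij: sigmoid(v, beta) for ij, v in x.items()}


y = schedule([(2, 2), (2, 4), (4, 2)], n=4)
selected = sorted(ij for ij, v in y.items() if v > 0.5)
print(selected)  # [(2, 4), (4, 2)]
```

Neuron (2, 2) receives inhibition from two neighbours while each of the others receives it from only one, so (2, 2) is driven off first and the remaining pair recover. A perfectly symmetric conflict (for example, two requests for the same output and nothing else) would not resolve in this deterministic sketch; some source of noise or asymmetry is needed to break the tie.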
Neural Interconnect
Convergence Example
Start state – all requested neurons are on.
Convergence Example
1/3 Evolved: Neurons (2, 4) and (4, 2) are beginning to inhibit neuron (2, 2).
Convergence Example
2/3 Evolved: Neuron (2, 2) is nearly off.
Convergence Example
Fully Evolved. Optimal solution reached.
Why Optoelectronics?
• Neural network scalability limited in silicon.
• Optoelectronics allows scalable networks.
• Free-space optics can be used to perform
interconnection.
• Only transfer function f(x) need be performed in silicon.
• Input summation is done in an inherently
analogue manner.
• Noise added naturally.
The VCSEL Array
• Optical output element.
• A laser that emits from the
surface of the substrate.
• High optical output powers.
The VCSEL Array
• Each neuron has one VCSEL for
optical output.
• Performance: Capable of >1GHz
operation.
• Scalability: Currently N=16.
Detector Arrays
• Optical input element.
• Available in a wide range
off the shelf.
• Performance: >1GHz.
• Caveat: faster detectors
require more power.
Diffractive Optic Elements
(DOEs)
• Large fan-out
possible.
• Efficiency:
~50-60%.
• Non-uniformity:
<3%.
• Period Size:
90µm.
These elements are used as array generators and
interconnection elements.
Optical Interconnect
• DOE interconnect is space invariant.
Figure panels: crossbar switch interconnect; banyan switch interconnect.
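One way to read "space invariant" for the crossbar case: every neuron fans its output out through the same pattern, shifted to its own position. Under that reading, the row and column inhibition sums are a 2D correlation of the output array with a single cross-shaped kernel. The sketch below checks this equivalence in software; the kernel shape and function names are assumptions for illustration and say nothing about the actual DOE design.

```python
def cross_kernel(n):
    """(2n-1) x (2n-1) kernel: ones on the centre row and column, zero at centre."""
    size = 2 * n - 1
    c = n - 1
    return [[1 if (r == c) != (q == c) else 0 for q in range(size)]
            for r in range(size)]


def apply_kernel(y, kernel):
    """Correlate the n x n output array y with the same kernel at every position."""
    n = len(y)
    c = n - 1
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[i][j] = sum(kernel[p - i + c][q - j + c] * y[p][q]
                            for p in range(n) for q in range(n))
    return out


def direct_inhibition(y):
    """Row sum (k != j) plus column sum (k != i), computed neuron by neuron."""
    n = len(y)
    return [[sum(y[i][k] for k in range(n) if k != j)
             + sum(y[k][j] for k in range(n) if k != i)
             for j in range(n)] for i in range(n)]


# Neurons (2, 2), (2, 4) and (4, 2) on, written 0-indexed in a 4 x 4 array.
y = [[0, 0, 0, 0],
     [0, 1, 0, 1],
     [0, 0, 0, 0],
     [0, 1, 0, 0]]
result = apply_kernel(y, cross_kernel(4))
print(result[1][1])                     # 2.0: neuron (2, 2) sees two active neighbours
print(result == direct_inhibition(y))   # True
```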
Optical System
Figure labels: photodetector array; DOE (Ø15mm, period 90µm); VCSEL array; lens (Ø10mm, f=100mm-150mm).
First Generation System
• Constructed using discrete components.
• Lacked ability to prioritise packets: can lead to channel saturation.
• Uses similar optical system (~330mm).
Current System
• System uses 4×40MHz Texas
Instruments 320C5x DSPs.
• DSPs perform transfer function.
• Transfer function fully
programmable.
• Reduction of hardware by digital
thresholding.
System Scalability
Digital vs. Analogue
Analogue: Optimal ~97%. Digital: Optimal ~91%.
Crossbar Switch Results
Histogram of packets routed successfully in a crossbar switch.
Banyan Switch Results
Histogram of packets routed successfully in a banyan switch.
Mean Packet Delay
• ISLIP4 cannot be implemented larger than N=16.
Engineering Issues
Three major effects to consider:
• Active effects: <1Hz thermal changes and component
creep.
• Static effects: Tolerances in fabricated components
could lead to misalignment in final system.
• Adaptive effects: Vibrational effects >1Hz - e.g.
10kHz.
Solutions:
• Measurement and correction of focusing and positional
error in real time (active optic alignment or adaptive
optics).
• Commercially viable: e.g. personal CD player, ASDA £22.95.
• Pre-packaged, pre-aligned modules.
Encapsulated System
R. Stone, J. Kim and P. Guilfoyle,
“High Performance Shock Hardened
Optoelectronic Communications
Module”, OC2001, Lake Tahoe,
pp. 105-107.
Conclusions
• Performance of 100MHz feasible, 1GHz foreseeable.
• Scalability mainly limited by VCSEL array size (N=16).
• Scalability independent of number of inputs/outputs (N).
• A digital system running at 1GHz could supply 2.5 million switch configurations per second.
• Second generation builds on first in that it supports
prioritisation.
• What good is a truck without a steering wheel?
• Further work:
  • Smart pixel implementation and packaging.
  • Examination of QoS provided by scheduler.
  • FPGA or custom ASIC implementation using optical interconnects.
  • Novel neural algorithms and learning.
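As a quick check on the 1GHz figure in the conclusions: 2.5 million configurations per second at a 1GHz clock leaves a budget of 400 clock cycles, or 400ns, per switch configuration. Whether one clock cycle corresponds to one network iteration is not stated in the talk, so that mapping is left open here.

```python
clock_hz = 1_000_000_000          # 1GHz, from the conclusions
configs_per_second = 2_500_000    # 2.5 million, from the conclusions

cycles_per_config = clock_hz // configs_per_second
ns_per_config = 1_000_000_000 // configs_per_second

print(cycles_per_config)  # 400
print(ns_per_config)      # 400
```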