USATLAS_WorkshopJun15_MLx

Machine Learning and ATLAS
Paolo Calafiura
What is ML: data-driven statistical modeling of complex systems
Why now: GPU/MIC/FPGA/Neuromorphic hardware evolution
pushes us towards computing with many simple elements
Applications: many! Today will focus on
• Neuromorphic computing for low-power scalable tracking
• Deep Neural Networks for data analysis, data movement
US ATLAS SCTPM
Motivating Example: Tracking
During Run 4 we have to process ~60M tracks/s (~20x Run 2)
I/O will likely constrain us to run full tracking online, and to write
only an xAOD-like format
Guesstimating x86 cost & performance evolution, we should find
x2-5 CPU offline, (much) more @ Point 1
Surely we can parallelize our way out of trouble?
Why is it so hard to do particle tracking
in parallel?
Algorithms: Iterative (propagation, fitting), irregular
(combinatorial searches with lots of branch points)
Data: sparse (hits), non-local access (B-field integration)
Can ML allow us to train Neural Networks (NN) that use regular,
trivial algorithms, and a naturally data parallel approach?
Neural Networks:
Computing with simple elements
‘neuron’
Simple computing elements…
by themselves, limited functional repertoire.
(Kristofer Bouchard, LBNL)
Feed-forward NN: Classification
‘neuron’
Simple computing elements…
as a network, learn to perform diverse functions
Flow of information
Classification
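The feed-forward picture above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the layer sizes, the sigmoid activation, and the random (untrained) weights are assumptions made here, not something taken from the slides.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: the 'simple computing element' applied per neuron."""
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x, layers):
    """Propagate input x through a list of (weights, bias) layers, front to back."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

# Hypothetical 3-input, 4-hidden, 2-output network with untrained random weights.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(4)),
    (rng.normal(size=(2, 4)), np.zeros(2)),
]
scores = feed_forward(np.array([0.5, -1.0, 2.0]), layers)
predicted_class = int(np.argmax(scores))  # index of the larger output score
```

Information flows in one direction only; in practice the weights would be learned from labelled data rather than drawn at random.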
Convolutional NN: Feature Extraction
‘neuron’
Simple computing elements…
as a network, learn to perform diverse functions
Feature Extraction
Classification
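As a rough illustration of feature extraction by convolution, the sketch below slides a small kernel over a toy image. The image, the hand-written edge kernel, and the function name `conv2d_valid` are all assumptions for illustration; a real convolutional network would learn its kernels. As in most deep-learning libraries, the operation is implemented as a cross-correlation (the kernel is not flipped).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image ('valid' mode, no padding, no kernel flip)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 5x5 image containing a single vertical line in column 2.
image = np.zeros((5, 5))
image[:, 2] = 1.0

# Horizontal-difference kernel: responds to left/right intensity changes.
kernel = np.array([[-1.0, 1.0]])
features = conv2d_valid(image, kernel)  # each row is [0, 1, -1, 0]
```

The same small kernel is reused at every position, which is what makes the computation regular and naturally data parallel.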
Recurrent NN: Time-varying Functions
‘neuron’
Simple computing elements…
as a network, learn to perform diverse functions
Dynamics
F(t)
Feature Extraction
Classification
Time-varying Functions
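A minimal sketch of the recurrent idea, assuming a plain tanh recurrent cell (the slides do not specify an architecture): the hidden state is fed back at every step, so the output depends on the history of the inputs. All sizes and weights below are hypothetical.

```python
import numpy as np

def rnn_step(h, x, W_h, W_x, b):
    """One recurrent update: new state from previous state h and current input x."""
    return np.tanh(W_h @ h + W_x @ x + b)

def run_rnn(xs, W_h, W_x, b):
    """Feed a sequence of inputs through the cell, returning every hidden state."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x in xs:
        h = rnn_step(h, x, W_h, W_x, b)
        states.append(h)
    return np.array(states)

# Hypothetical sizes: 2 inputs per time step, 3 hidden units, 10 time steps.
rng = np.random.default_rng(1)
W_h = 0.5 * rng.normal(size=(3, 3))
W_x = rng.normal(size=(3, 2))
b = np.zeros(3)
xs = rng.normal(size=(10, 2))
states = run_rnn(xs, W_h, W_x, b)
```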
Tracking Kaggle Challenge
Inspired by the very successful Higgs Challenge
Competition among ML experts:
• Problem: Given a list of space-points, produce a list of track
candidates
• Figure of merit: efficiency for given fake rate and CPU budget
(still under discussion)
More next week during C&S plenary
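Since the figure of merit is still under discussion, the following is only one hypothetical way to score a submission, not the challenge definition. It assumes tracks are sets of hit IDs, that truth tracks do not share hits, that a candidate matches a truth track when more than half of its hits come from it, and that candidates are non-empty. The names `match` and `efficiency_and_fake_rate` are invented for this sketch.

```python
def match(candidate, truth_tracks, min_purity=0.5):
    """Index of the truth track supplying more than min_purity of the candidate's hits, else None."""
    for idx, truth in enumerate(truth_tracks):
        shared = len(candidate & truth)
        if shared / len(candidate) > min_purity:
            return idx
    return None

def efficiency_and_fake_rate(candidates, truth_tracks, min_purity=0.5):
    """Efficiency: fraction of truth tracks found. Fake rate: fraction of unmatched candidates."""
    matches = [match(c, truth_tracks, min_purity) for c in candidates]
    found = {m for m in matches if m is not None}
    n_fake = sum(m is None for m in matches)
    return len(found) / len(truth_tracks), n_fake / len(candidates)

# Toy event: three truth tracks, three candidates (two good, one combinatorial fake).
truth_tracks = [{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}]
candidates = [{1, 2, 3, 5}, {5, 6, 7, 8}, {1, 6, 9, 13}]
eff, fake = efficiency_and_fake_rate(candidates, truth_tracks)
```

A CPU budget would have to be enforced separately, for example by timing each submission on a reference machine.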
LHCb Trigger Retina Processor
Track parameter space: 22K bins, one “receptor” per bin
Figure 1. Schematic representation of the detector mapping (more details in the text). The grid in parameter space (left) and the corresponding receptors in the detector (right).
FPGA implementation
1 µs tracking
Offline-quality performance
Certainly good enough for seeding
The centre of each cell identifies a track in the detector space, that intersects detector layers in spatial points that we call receptors. Therefore each $(m_i, q_j)$-cell of the parameter space corresponds to a set of receptors $\{x_k^{ij}\}$, where $k = 1, \ldots, n$ runs over the detector layers, as shown in figure 1. This procedure is called detector mapping and it is done for all the cells of the track parameter space, covering all the detector acceptance. For each incoming hit, the algorithm computes the excitation intensity, i.e. the response of the receptive field, of each $(m_i, q_j)$-cell as follows:
$$R_{ij} = \sum_{k,r} \exp\left(-\frac{s_{ijkr}^2}{2\sigma^2}\right) \qquad (2.1)$$
using the distance
$$s_{ijkr} = x_{k,r} - x_k^{ij} \qquad (2.2)$$
where $x_{k,r}$ is the $r$-th hit on the detector layer $k$, while $\sigma$ is a parameter of the retina algorithm, that …
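The retina response can be sketched as follows: for every cell of the parameter grid, sum a Gaussian of the distance between each hit and that cell's receptor on the same layer. This is a toy sketch, not the LHCb implementation. It assumes straight-line tracks x = m·z + q in two dimensions, so that receptors sit at m_i·z_k + q_j; the grid, layer positions, and the value of sigma are made up for illustration.

```python
import numpy as np

def detector_mapping(m_values, q_values, z_layers):
    """Receptor positions: receptors[i, j, k] = m_i * z_k + q_j (straight-line assumption)."""
    return (m_values[:, None, None] * z_layers[None, None, :]
            + q_values[None, :, None])

def retina_response(receptors, hits_per_layer, sigma):
    """Sum over layers k and hits r of exp(-s^2 / (2 sigma^2)), s = hit minus receptor."""
    response = np.zeros(receptors.shape[:2])
    for k, hits in enumerate(hits_per_layer):
        s = hits[None, None, :] - receptors[:, :, k, None]
        response += np.exp(-s**2 / (2.0 * sigma**2)).sum(axis=-1)
    return response

# Hypothetical 21 x 21 parameter grid and four detector layers.
z_layers = np.array([1.0, 2.0, 3.0, 4.0])
m_values = np.linspace(-1.0, 1.0, 21)
q_values = np.linspace(-1.0, 1.0, 21)
receptors = detector_mapping(m_values, q_values, z_layers)

# One noiseless toy track with slope 0.3 and intercept -0.2: one hit per layer.
true_m, true_q = 0.3, -0.2
hits_per_layer = [np.array([true_m * z + true_q]) for z in z_layers]

response = retina_response(receptors, hits_per_layer, sigma=0.05)
i, j = np.unravel_index(np.argmax(response), response.shape)
best_m, best_q = m_values[i], q_values[j]  # cell with the strongest excitation
```

Every cell is computed independently with the same arithmetic, which is why the approach maps well onto FPGAs; track candidates correspond to local maxima of the response.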
Neuromorphic Computing
• “Spikey” from the Electronic Visions group in Heidelberg
• Qualcomm’s NPUs for robots
• IBM’s TrueNorth
• SpiNNaker’s 1B neuron machine
• Stanford’s Neurogrid
• Intel’s concept design...
(Peter Nugent, LBNL)
IBM TrueNorth
• 1 million programmable neurons
• 256 million synapses
• 4096 neurosynaptic cores
• Uses 70mW per chip
• 5.4 billion transistors
• Spiking rate >1000Hz
Merolla+ Science (2014)
A single chip can process color video in real time while consuming 176,000 times less energy than a current Intel chip performing the exact same analysis. Note that the Intel chip cannot do this analysis in real time and is in fact 300 times slower!
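A quick back-of-the-envelope check of the chip figures quoted above. It assumes that "1 million" and "256 million" stand for the powers of two 2^20 and 2^28, and that the energy and speed ratios refer to the same workload; the per-neuron power is simply the chip power divided evenly, not a measured quantity.

```python
cores = 4096
neurons = 2**20        # "1 million" neurons, assumed to mean 2^20
synapses = 2**28       # "256 million" synapses, assumed to mean 2^28
chip_power_w = 0.070   # 70 mW per chip

neurons_per_core = neurons // cores          # 256
synapses_per_core = synapses // cores        # 65536, i.e. a 256 x 256 crossbar
power_per_neuron_w = chip_power_w / neurons  # roughly 67 nW if shared evenly

# Energy = power x time. With 176,000x less energy and a 300x shorter run time,
# the implied ratio of average power between the two chips is about 587.
energy_ratio = 176_000
time_ratio = 300
implied_power_ratio = energy_ratio / time_ratio
```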
Neuromorphic Kalman Filters
(LBNL LDRD FY16 proposal)
Paolo Calafiura (CRD), Kristofer Bouchard (Life Sciences),
David Donofrio (CRD), Rebecca Carney (Physics),
Maurice Garcia-Sciveres (Physics), Craig E. Tull (CRD)
Implement Kalman filters on neuromorphic chips for
low-power, high-throughput, real-time data processing
Brain-machine interfaces
Charged particle tracking
Deep Learning for Data Analysis
deep learning for particle collider data analysis
Motivated by successes of deep learning in vision and speech.
• Huge progress on benchmark supervised learning tasks
• Replacement of engineered features with learned features
[Figure panels: Engineered features, Learned features]
Peter Sadowski (UCI)
Optimizing Higgs Detection
detecting the higgs boson
• Current approach: shallow models
  • Boosted decision trees* (BDT)
  • Shallow neural networks (NN)
• Our approach: deep neural networks (DNN)
[Figure labels: BDT, NN, DNN]
* Used for Higgs discovery in 2012
Peter Sadowski (UCI)
Optimizing Higgs Detection
detecting the higgs boson
Shallow networks
Deep networks
Peter Sadowski (UCI)
dianahep (NSF S2I2 project)
Improve ML tools in ROOT
Machine Learning and Data Management
Vast and growing amount of data on user access patterns.
Combine engineered and learned features to:
• Pre-fetch data and pre-allocate resources
• Optimize data clustering and replication
• Suggest related data sets
Conclusions
• ML will be predominant in Run 4 analysis (wager #1)
• Deep neural networks in tracking, jet reco, and clustering will
allow us to exploit GPUs and FPGAs, and possibly
neuromorphic architectures (wager #2)
• 2025 grad students will wonder why we wrote all that C++
junk instead of training a few good networks (wager #3)
ML experts are not formed overnight (and command
high six-figure salaries, so tend to disappear fast)
US ATLAS should start developing ML expertise by supporting
pilot projects in all relevant areas, e.g. data analysis,
reconstruction (tracking), and data/job management
Thanks
• Kristofer Bouchard
• David Clark
• Kyle Cranmer
• Maurice Garcia-Sciveres
• Peter Nugent
• Peter Sadowski
• Tracking Kaggle group
• …
Backup
Possible Tracking Network
[Figure: network diagram with labels Si Seeds, Seeding, Selection, Fitting, Dynamics F(t), Track Candidates]
[Background slide text, partly clipped: seeds not constrained by vertex; significantly more seeds; more time consuming; more efficient for some channels with loosely constrained primary vertex (h->gg); foreseen default in 13.0.0]
Kalman Filters and Recurrent NNs
Classic Data Assimilation algorithm (1960, NASA)
Iteratively track evolution of a dynamic system
[Figure: a Kalman filter (Data, State, Dynamics) shown next to a recurrent neural network (Data, State, Dynamics)]
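To make the parallel concrete, here is a minimal sketch of one textbook linear Kalman filter iteration (predict, then update), applied to a toy straight-line track. The state layout (position, slope), the unit layer spacing, and all noise values are assumptions chosen for illustration; this is not ATLAS tracking code.

```python
import numpy as np

def kalman_step(x, P, z, F, Q, H, R):
    """One iteration: propagate the state with the dynamics, then correct it with data z."""
    # Predict: apply the dynamics F and inflate the covariance by process noise Q.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: weigh prediction against measurement using the Kalman gain K.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy model: state = (position, slope); each step moves one layer; only position is measured.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 1e-6 * np.eye(2)
R = np.array([[0.01]])  # measurement variance (resolution 0.1)

x = np.zeros(2)         # vague initial guess
P = 10.0 * np.eye(2)
rng = np.random.default_rng(2)
true_x0, true_slope = 1.0, 0.5
for k in range(1, 21):
    z = np.array([true_x0 + true_slope * k + rng.normal(scale=0.1)])
    x, P = kalman_step(x, P, z, F, Q, H, R)
```

Each iteration has the form new state = f(previous state, new data), which is the same recurrence a recurrent network implements, except that there the update function is learned rather than derived from a linear-Gaussian model.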