
Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images
Article by Nguyen, Yosinski and Clune (CVPR 2015)
Presented by: Bella Fadida Specktor
Human vs. Computer Object Recognition
Given the near-human ability of DNNs to classify objects, what differences remain between human and computer vision?
Find the differences
Changing an image that was originally correctly classified, in a way imperceptible to human eyes, can cause a DNN to label the image as something else entirely (Szegedy et al., 2013).
Recognize the object
It is easy to produce images that are completely unrecognizable to humans, yet state-of-the-art DNNs believe them to be recognizable objects with over 99% confidence.
Experimental setup
1. LeNet model trained on the MNIST dataset – "MNIST DNN"
2. AlexNet DNN trained on the ImageNet dataset – "ImageNet DNN" (larger dataset, bigger network)
Fooling Images Generation – Evolutionary Algorithms (EA)
[Diagram: the EA loop – Selection → Crossover → Mutation → Fitness Evaluation, where fitness is the DNN prediction score (the highest prediction score for any class).]
Direct Encoding
Grayscale values for MNIST, HSV values for ImageNet. Each pixel value is initialized with uniform random noise within the [0, 255] range; those values are then independently mutated.
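For intuition, the sketch below combines the EA loop from the diagram with the direct encoding just described. It is illustrative only and is not the authors' exact algorithm; `predict_proba(images)`, assumed to return per-class DNN scores, is a hypothetical helper, and the selection, crossover, and mutation operators are deliberately simplistic.

```python
import numpy as np

def evolve_fooling_image(predict_proba, target_class, img_shape=(28, 28),
                         pop_size=50, n_generations=200, mutation_rate=0.1):
    """Toy EA with a direct encoding: fitness is the DNN's score for target_class.
    predict_proba(images) is assumed to return an (N, n_classes) array."""
    # Direct encoding: each pixel initialized with uniform noise in [0, 255]
    population = np.random.uniform(0, 255, size=(pop_size,) + img_shape)

    for _ in range(n_generations):
        # Fitness evaluation: DNN prediction score for the target class
        fitness = predict_proba(population)[:, target_class]

        # Selection: keep the fitter half of the population as parents
        parents = population[np.argsort(fitness)[-(pop_size // 2):]]

        # Crossover: blend two randomly chosen parents per child
        idx_a = np.random.randint(len(parents), size=pop_size)
        idx_b = np.random.randint(len(parents), size=pop_size)
        children = (parents[idx_a] + parents[idx_b]) / 2.0

        # Mutation: independently resample a random subset of pixel values
        mask = np.random.rand(*children.shape) < mutation_rate
        children[mask] = np.random.uniform(0, 255, size=int(mask.sum()))

        population = children

    # Return the individual the DNN is most confident about
    return population[np.argmax(predict_proba(population)[:, target_class])]
```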
CPPN (Compositional Pattern-Producing Network) Encoding
• This encoding is more likely to produce "regular" images, e.g. images containing symmetry and repetition.
• Similar to ANNs, but with different nonlinear activation functions.
CPPN (Compositional Pattern-Producing Network) Encoding – cont.
• Begins with a population of small, simple genomes (no hidden units) and elaborates them over generations by adding new genes – "complexification".
• Evolution determines the topology, weights, and activation function of each CPPN in the population.
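To make the "different nonlinear functions" point concrete, here is a hand-wired, fixed CPPN-style generator that maps pixel coordinates (x, y, r) through sine, Gaussian, and sigmoid nodes to a grayscale intensity. In a real CPPN the topology, weights, and per-node activation functions are evolved rather than fixed, so treat this only as a sketch of the coordinate-to-pixel idea.

```python
import numpy as np

def render_cppn_like(width=64, height=64):
    """Hand-wired CPPN-style image: inputs are pixel coordinates, output is intensity."""
    ys, xs = np.mgrid[0:height, 0:width]
    x = xs / (width - 1) * 2.0 - 1.0      # x in [-1, 1]
    y = ys / (height - 1) * 2.0 - 1.0     # y in [-1, 1]
    r = np.sqrt(x ** 2 + y ** 2)          # distance from the image center

    # "Hidden nodes" with different nonlinearities
    h1 = np.sin(4.0 * x + 2.0 * y)              # periodic -> repetition
    h2 = np.exp(-(3.0 * r) ** 2)                # Gaussian -> radial symmetry
    h3 = 1.0 / (1.0 + np.exp(-5.0 * x * y))     # sigmoid

    # Output node: combine and map to a grayscale value in [0, 255]
    out = np.tanh(h1 + h2 + h3)
    return ((out + 1.0) / 2.0 * 255.0).astype(np.uint8)
```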
Fooling Images via Gradient Ascent
Calculate the gradient of the posterior probability for a specific class (here, a softmax output unit of the DNN) with respect to the input image using backprop, and then follow the gradient to increase the chosen unit's activation (Simonyan et al., 2013).
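A minimal PyTorch sketch of this procedure, assuming a generic pretrained classifier `model`; it ascends the raw class score rather than the softmax probability and omits the image regularization usually added in practice.

```python
import torch

def gradient_ascent_image(model, target_class, img_shape=(1, 3, 224, 224),
                          steps=200, lr=1.0):
    """Maximize the score of target_class with respect to the input image."""
    model.eval()
    image = torch.zeros(img_shape, requires_grad=True)   # start from a blank image

    for _ in range(steps):
        score = model(image)[0, target_class]   # chosen output unit
        model.zero_grad()
        if image.grad is not None:
            image.grad.zero_()
        score.backward()                         # gradient w.r.t. the input (backprop)
        with torch.no_grad():
            image += lr * image.grad             # follow the gradient upward
            image.clamp_(0.0, 1.0)               # keep pixels in a valid range
    return image.detach()
```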
Results – MNIST Dataset
In less than 50 generations, each run of evolution produces unrecognizable images that MNIST DNNs classify as digits with ≥ 99.99% confidence. By 200 generations, the median confidence is 99.99%.
Certain patterns that appear indicative of a digit repeatedly evolve in some digit classes; e.g., images classified as a 1 tend to contain vertical bars.
Results – ImageNet Dataset
Directly encoded EA
• Even after 20,000 generations, evolution failed to produce high-confidence images for many categories (median confidence 21.59%), although confidence scores for some categories are high.
• Possible reasons: 1. Bigger image size leads to less overfitting and thus makes the DNN more difficult to fool -> larger datasets result in less fooling. 2. The EA had difficulty finding an image that scores high on a specific dog category, since the Dogs and Cats categories are overrepresented in the ImageNet dataset -> data with more similar classes can help ameliorate fooling.

CPPN Encoding
• Many of the produced images are unrecognizable, yet receive DNN confidence scores ≥ 99.99%.
• After 5000 generations, evolution produced images with ≥ 99% confidence for 45 categories; the median confidence is 88.11%, similar to that for natural images.
CPPN Encoding on ImageNet
Evolution needs only to produce features that are unique
to, or discriminative for, a class, rather than produce an
image that contains all of the typical features of the class.
CPPN Encoding on ImageNet – cont.
• Many images are related to each other phylogenetically, which leads evolution to produce similar images for closely related categories.
• Different runs of evolution produce different image types, revealing that there are different discriminative features per class that evolution exploits.
Repetition Contribution Test
To test whether repetition improves
the confidence scores, some of the
repeated elements were ablated to
see if the DNN confidence score for
that image drops.
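As an illustrative sketch of the ablation (not the authors' code), the following assumes a hypothetical `predict_proba` helper and hand-specified bounding boxes around the repeated elements:

```python
import numpy as np

def repetition_ablation(predict_proba, image, target_class, boxes, fill_value=0.0):
    """Ablate each repeated element (a (top, bottom, left, right) box) and
    record how much the DNN's confidence in target_class drops."""
    base = predict_proba(image[None])[0, target_class]
    drops = []
    for top, bottom, left, right in boxes:
        ablated = image.copy()
        ablated[top:bottom, left:right] = fill_value    # remove one repeated element
        conf = predict_proba(ablated[None])[0, target_class]
        drops.append(base - conf)
    return base, drops
```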
Results:
In many images, ablation led to a drop in confidence, but only a small one.
These results suggest that DNNs tend to learn low- and mid-level features rather than the global structure of the object!
Do Different DNNs Learn the Same
Discriminative Features?
CPPN images were evolved on one DNN (e.g. DNN_A), and then used as inputs to another DNN (DNN_B).
a. Many images are given the same top-1 prediction label by both networks, but some images are labeled differently by the two networks.
b. Among those given the same label, many receive ≥ 99.99% confidence scores from both networks.
c. Higher confidence scores are given by the original DNN.
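A small sketch of this cross-network check, assuming hypothetical `predict_a` and `predict_b` helpers that return per-class probabilities for the two DNNs:

```python
import numpy as np

def cross_network_check(predict_a, predict_b, images):
    """Compare top-1 labels and confidences of two DNNs on images evolved on DNN_A."""
    probs_a = predict_a(images)     # DNN_A: the network the images were evolved on
    probs_b = predict_b(images)     # DNN_B: an independently trained network
    same_top1 = probs_a.argmax(axis=1) == probs_b.argmax(axis=1)
    conf_a, conf_b = probs_a.max(axis=1), probs_b.max(axis=1)
    both_high = same_top1 & (conf_a >= 0.9999) & (conf_b >= 0.9999)
    return {
        "fraction_same_top1": float(same_top1.mean()),
        "fraction_both_high_conf": float(both_high.mean()),
        "mean_conf_original_dnn": float(conf_a.mean()),
        "mean_conf_other_dnn": float(conf_b.mean()),
    }
```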
Adding “Fooling Images” class
1. MNIST dataset: evolution still produces many unrecognizable images with high confidence scores, even after 15 iterations, although by that point the negative class is overrepresented (25% of the training set).
2. ImageNet dataset: with the overrepresented negative class, the median confidence score decreased significantly, from 88.1% for DNN_1 to 11.7% for DNN_2.
The authors suspect that it is easier to learn to tell CPPN images apart from natural images than it is to tell CPPN images apart from MNIST digits.
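The retraining loop can be sketched as follows; `train_dnn` and `evolve_images` are hypothetical helpers standing in for ordinary supervised training and the CPPN EA, respectively:

```python
import numpy as np

def retrain_with_fooling_class(train_dnn, evolve_images, images, labels,
                               n_classes, n_iterations=15, n_fooling=1000):
    """Repeatedly evolve fooling images against the current DNN, add them as an
    extra negative class, and retrain on the augmented dataset."""
    fooling_label = n_classes                        # index of the new "fooling images" class
    model = train_dnn(images, labels, n_classes + 1)
    for _ in range(n_iterations):
        fooled = evolve_images(model, n_fooling)     # high-confidence fooling images for DNN_i
        images = np.concatenate([images, fooled])
        labels = np.concatenate([labels, np.full(len(fooled), fooling_label)])
        model = train_dnn(images, labels, n_classes + 1)   # DNN_{i+1}
    return model
```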
Why High-Confidence Unrecognizable Images?
In a high-dimensional space, the area that a discriminative model (a model that learns p(y|x)) allocates to a class may be much larger than the area occupied by the training examples. Since generative models also model p(x), they may be more difficult to fool, but they currently do not scale well.
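As a brief aside on why modeling p(x) could help, recall Bayes' rule:

```latex
\[
  p(y \mid x) \;=\; \frac{p(x \mid y)\, p(y)}{p(x)}
\]
```

A purely discriminative model learns the left-hand side directly and never evaluates p(x), so it must assign every input, however unnatural, to some class; a generative model's low p(x) for an unrecognizable image could in principle flag it as unlikely.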
Additional Notes
1. The evolved images have been displayed at an art museum.
2. DNNs could potentially be combined with EAs to produce open-ended, creative search algorithms.
3. The approach can be considered a visualization technique, indicating the diversity of features learnt for a specific class.
4. Caution: such false positives could be exploited.
References
• Secretan, Jimmy, et al. "Picbreeder: Evolving pictures collaboratively online." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2008.
• Nguyen, Anh, Jason Yosinski, and Jeff Clune. "Deep neural networks are easily
fooled: High confidence predictions for unrecognizable images." arXiv preprint
arXiv:1412.1897 (2014).
• Stanley, Kenneth O., and Risto Miikkulainen. "Competitive coevolution through
evolutionary complexification." J. Artif. Intell. Res.(JAIR) 21 (2004): 63-100.
• Szegedy, Christian, et al. "Intriguing properties of neural networks." arXiv
preprint arXiv:1312.6199 (2013).
• Stanley, Kenneth O. "Compositional pattern producing networks: A novel
abstraction of development." Genetic programming and evolvable machines 8.2
(2007): 131-162.