Adversarial Attacks and Defences
Papers
Intriguing properties of neural networks
- intro: Szegedy et al.; the first systematic demonstration of adversarial examples for deep networks
- arxiv: https://arxiv.org/abs/1312.6199
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
- intro: CVPR 2015
- arxiv: http://arxiv.org/abs/1412.1897
- github: https://github.com/Evolving-AI-Lab/fooling/
Explaining and Harnessing Adversarial Examples
- intro: argues that the primary cause of neural networks’ vulnerability to adversarial perturbations is their linear nature; introduces the fast gradient sign method (FGSM), sketched below
- arxiv: http://arxiv.org/abs/1412.6572
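FGSM takes a single step of size eps in the direction of the sign of the loss gradient, which is the worst-case L-infinity perturbation for a truly linear model. A minimal PyTorch sketch; `model`, `x`, `y` are placeholder names, not the paper's reference code:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """Fast gradient sign method: one eps-sized step in the direction
    that maximally increases the loss under a linear approximation."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # sign() gives the worst-case L-infinity perturbation direction
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()  # keep pixels in the valid range
```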
Distributional Smoothing with Virtual Adversarial Training
- arxiv: http://arxiv.org/abs/1507.00677
- github: https://github.com/takerum/vat
Confusing Deep Convolution Networks by Relabelling
Exploring the Space of Adversarial Images
Learning with a Strong Adversary
Adversarial examples in the physical world
- author: Alexey Kurakin, Ian Goodfellow, Samy Bengio. Google Brain & OpenAI
- arxiv: http://arxiv.org/abs/1607.02533
DeepFool: a simple and accurate method to fool deep neural networks
- arxiv: http://arxiv.org/abs/1511.04599
- github: https://github.com/LTS4/DeepFool
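DeepFool repeatedly linearizes the classifier around the current point and takes the smallest step that crosses the linearized decision boundary, stopping once the predicted label flips; the overshoot factor pushes slightly past the boundary so the flip actually happens. A sketch of the two-class case, assuming `f` returns a scalar logit whose sign is the predicted class (the paper generalizes this to the nearest of several class boundaries):

```python
import torch

def deepfool_binary(f, x, max_iter=50, overshoot=0.02):
    """Iteratively step onto the linearized boundary f(x) = 0."""
    x_adv = x.clone().detach()
    orig_sign = torch.sign(f(x_adv))
    for _ in range(max_iter):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        out = f(x_adv)
        if torch.sign(out) != orig_sign:  # label flipped: done
            break
        grad = torch.autograd.grad(out, x_adv)[0]
        # Minimal L2 step onto the hyperplane of the local linearization
        r = -(out.detach() / grad.norm().pow(2)) * grad
        x_adv = x_adv.detach() + (1 + overshoot) * r
    return x_adv.detach()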
Adversarial Autoencoders
- arxiv: http://arxiv.org/abs/1511.05644
- slides: https://docs.google.com/presentation/d/1Lyp91JOSzXo0Kk8gPdgyQUDuqLV_PnSzJh7i5c8ZKjs/edit?pref=2&pli=1
- notes (by Dustin Tran): http://dustintran.com/blog/adversarial-autoencoders/
- TFD manifold: http://www.comm.utoronto.ca/~makhzani/adv_ae/tfd.gif
- SVHN style manifold: http://www.comm.utoronto.ca/~makhzani/adv_ae/svhn.gif
Understanding Adversarial Training: Increasing Local Stability of Neural Nets through Robust Optimization
(Deep Learning’s Deep Flaws)’s Deep Flaws
- intro: by Zachary Chase Lipton
Deep Learning Adversarial Examples – Clarifying Misconceptions
- intro: By Ian Goodfellow, Google
- blog: http://www.kdnuggets.com/2015/07/deep-learning-adversarial-examples-misconceptions.html
Adversarial Machines: Fooling A.Is (and turn everyone into a Manga)
How to trick a neural network into thinking a panda is a vulture
Assessing Threat of Adversarial Examples on Deep Neural Networks
- intro: preprint; to appear in IEEE ICMLA 2016
- arxiv: https://arxiv.org/abs/1610.04256
Safety Verification of Deep Neural Networks
Adversarial Machine Learning at Scale
- intro: Google Brain & OpenAI
- arxiv: https://arxiv.org/abs/1611.01236
Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks
- arxiv: https://arxiv.org/abs/1704.01155
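Feature squeezing detects adversarial inputs by comparing the model's prediction on the original input against its prediction on a "squeezed" copy (e.g. reduced color bit depth or spatial smoothing); large disagreement flags the input. A minimal sketch of the bit-depth squeezer and the detection score, assuming `predict` returns a probability vector (names here are illustrative):

```python
import numpy as np

def bit_depth_squeeze(x, bits=4):
    """Quantize pixel values in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def squeeze_score(predict, x, bits=4):
    """L1 distance between predictions on the raw vs. squeezed input.
    Inputs scoring above a threshold tuned on clean data are flagged
    as adversarial."""
    return np.abs(predict(x) - predict(bit_depth_squeeze(x, bits))).sum()
```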
Parseval Networks: Improving Robustness to Adversarial Examples
- intro: Facebook AI Research
- arxiv: https://arxiv.org/abs/1704.08847
Towards Deep Learning Models Resistant to Adversarial Attacks
- intro: MIT
- arxiv: https://arxiv.org/abs/1706.06083
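This is the projected gradient descent (PGD) adversarial-training paper: the defense trains on worst-case examples found by iterated gradient-sign steps projected back onto an L-infinity ball around the input. A minimal sketch of the inner maximization, with `model`, `x`, `y` as placeholder names:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=0.03, alpha=0.007, steps=10):
    """Iterated FGSM steps, each followed by projection onto the
    L-infinity ball of radius eps around x."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)  # random start
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto eps-ball
        x_adv = x_adv.clamp(0, 1)                 # valid pixel range
    return x_adv.detach()
```

Adversarial training then minimizes the training loss on `pgd_attack(model, x, y)` in place of the clean `x`.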
NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles
- intro: CVPR 2017 workshop, spotlight oral
- arxiv: https://arxiv.org/abs/1707.03501
One pixel attack for fooling deep neural networks
- intro: Kyushu University
- arxiv: https://arxiv.org/abs/1710.08864
- github: https://github.com/Hyperparticle/one-pixel-attack-keras
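The attack modifies a single pixel, found with differential evolution over (row, col, r, g, b) using only the model's output probabilities, so it is black-box. A rough SciPy sketch; `predict`, the image layout, and the untargeted objective are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import differential_evolution

def one_pixel_attack(predict, img, true_class):
    """Search for one pixel (row, col, r, g, b) whose modification
    minimizes the model's confidence in the true class."""
    h, w, _ = img.shape
    bounds = [(0, h - 1), (0, w - 1), (0, 1), (0, 1), (0, 1)]

    def perturb(p):
        row, col, r, g, b = p
        out = img.copy()
        out[int(row), int(col)] = (r, g, b)
        return out

    result = differential_evolution(
        lambda p: predict(perturb(p))[true_class], bounds, maxiter=20)
    return perturb(result.x)
```

The paper also runs targeted variants by maximizing the probability of a chosen target class instead.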
Enhanced Attacks on Defensively Distilled Deep Neural Networks
- arxiv: https://arxiv.org/abs/1711.05934
Adversarial Attacks Beyond the Image Space
- arxiv: https://arxiv.org/abs/1711.07183
On the Robustness of Semantic Segmentation Models to Adversarial Attacks
- arxiv: https://arxiv.org/abs/1711.09856
Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser
- arxiv: https://arxiv.org/abs/1712.02976
A Rotation and a Translation Suffice: Fooling CNNs with Simple Transformations
- arxiv: https://arxiv.org/abs/1712.02779
Training Ensembles to Detect Adversarial Examples
- arxiv: https://arxiv.org/abs/1712.04006
Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models
- arxiv: https://arxiv.org/abs/1712.04248
- openreview: https://openreview.net/forum?id=SyZI0GWCZ
Where Classification Fails, Interpretation Rises
- intro: Lehigh University
- arxiv: https://arxiv.org/abs/1712.00558
Query-Efficient Black-box Adversarial Examples
- arxiv: https://arxiv.org/abs/1712.07113
Adversarial Examples: Attacks and Defenses for Deep Learning
- intro: University of Florida
- arxiv: https://arxiv.org/abs/1712.07107
Wolf in Sheep’s Clothing - The Downscaling Attack Against Deep Learning Applications
- arxiv: https://arxiv.org/abs/1712.07805
Note on Attacking Object Detectors with Adversarial Stickers
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning
- intro: UC Berkeley
- arxiv: https://arxiv.org/abs/1712.05526
Awesome Adversarial Examples for Deep Learning
- github: https://github.com/chbrian/awesome-adversarial-examples-dl
Exploring the Space of Black-box Attacks on Deep Neural Networks
- arxiv: https://arxiv.org/abs/1712.09491
Adversarial Patch
- arxiv: https://arxiv.org/abs/1712.09665
Adversarial Generative Nets: Neural Network Attacks on State-of-the-Art Face Recognition
- intro: CMU & University of North Carolina at Chapel Hill
- arxiv: https://arxiv.org/abs/1801.00349
Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey
- arxiv: https://arxiv.org/abs/1801.00553
Spatially transformed adversarial examples
- arxiv: https://arxiv.org/abs/1801.02612
Generating adversarial examples with adversarial networks
- intro: University of Michigan & UC Berkeley & MIT CSAIL
- arxiv: https://arxiv.org/abs/1801.02610
Adversarial Spheres
- intro: Google Brain
- arxiv: https://arxiv.org/abs/1801.02774
LaVAN: Localized and Visible Adversarial Noise
- intro: Bar-Ilan University & DeepMind
- arxiv: https://arxiv.org/abs/1801.02608
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
- intro: ICML 2018 Best Paper Award. MIT & UC Berkeley
- arxiv: https://arxiv.org/abs/1802.00420
- github: https://github.com/anishathalye/obfuscated-gradients
Adversarial Examples that Fool both Human and Computer Vision
- intro: Google Brain & Stanford University
- arxiv: https://arxiv.org/abs/1802.08195
On the Suitability of Lp-norms for Creating and Preventing Adversarial Examples
- arxiv: https://arxiv.org/abs/1802.09653
Protecting JPEG Images Against Adversarial Attacks
- intro: IEEE Data Compression Conference
- arxiv: https://arxiv.org/abs/1803.00940
Sparse Adversarial Perturbations for Videos
- arxiv: https://arxiv.org/abs/1803.02536
DeepDefense: Training Deep Neural Networks with Improved Robustness
- intro: Tsinghua National Laboratory for Information Science and Technology (TNList) & Intel Labs
- arxiv: https://arxiv.org/abs/1803.00404
Improving Transferability of Adversarial Examples with Input Diversity
Adversarial Attacks and Defences Competition
- intro: Google Brain & Tsinghua University & The Johns Hopkins University
- arxiv: https://arxiv.org/abs/1804.00097
Semantic Adversarial Examples
- arxiv: https://arxiv.org/abs/1804.00499
Generating Natural Adversarial Examples
- intro: ICLR 2018
- arxiv: https://arxiv.org/abs/1710.11342
- github: https://github.com/zhengliz/natural-adversary
An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks
- intro: Northeastern University & MIT-IBM Watson AI Lab & IBM Research AI
- keywords: Deep Neural Networks; Adversarial Attacks; ADMM (Alternating Direction Method of Multipliers)
- arxiv: https://arxiv.org/abs/1804.03193
On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses
- arxiv: https://arxiv.org/abs/1804.03286
VectorDefense: Vectorization as a Defense to Adversarial Examples
- keywords: MNIST
- arxiv: https://arxiv.org/abs/1804.08529
On the Limitation of MagNet Defense against L1-based Adversarial Examples
- arxiv: https://arxiv.org/abs/1805.00310
Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models
- intro: ICLR 2018
- arxiv: https://arxiv.org/abs/1805.06605
- github: https://github.com/kabkabm/defensegan
Siamese networks for generating adversarial examples
- arxiv: https://arxiv.org/abs/1805.01431
Generative Adversarial Examples
- intro: Stanford University & Microsoft Research
- arxiv: https://arxiv.org/abs/1805.07894
Detecting Adversarial Examples via Key-based Network
- arxiv: https://arxiv.org/abs/1806.00580
Adversarial Attacks on Variational Autoencoders
- arxiv: https://arxiv.org/abs/1806.04646
Non-Negative Networks Against Adversarial Attacks
- intro: Laboratory for Physical Sciences & Nvidia
- arxiv: https://arxiv.org/abs/1806.06108
Gradient Similarity: An Explainable Approach to Detect Adversarial Attacks against Deep Learning
- arxiv: https://arxiv.org/abs/1806.10707
Adversarial Reprogramming of Neural Networks
- intro: Google Brain
- arxiv: https://arxiv.org/abs/1806.11146
Defend Deep Neural Networks Against Adversarial Examples via Fixed and Dynamic Quantized Activation Functions
- intro: University of Central Florida & JD AI Research & Tencent AI Lab
- arxiv: https://arxiv.org/abs/1807.06714
Motivating the Rules of the Game for Adversarial Example Research
- intro: Google Brain & Princeton
- arxiv: https://arxiv.org/abs/1807.06732
Defense Against Adversarial Attacks with Saak Transform
- arxiv: https://arxiv.org/abs/1808.01785
Are adversarial examples inevitable?
- arxiv: https://arxiv.org/abs/1809.02104
Open Set Adversarial Examples
- arxiv: https://arxiv.org/abs/1809.02681
Towards Query Efficient Black-box Attacks: An Input-free Perspective
- intro: 11th ACM Workshop on Artificial Intelligence and Security (AISec), co-located with the 25th ACM Conference on Computer and Communications Security (CCS)
- arxiv: https://arxiv.org/abs/1809.02918
SparseFool: a few pixels make a big difference
- arxiv: https://arxiv.org/abs/1811.02248
- github: https://github.com/LTS4/SparseFool
Lightweight Lipschitz Margin Training for Certified Defense against Adversarial Examples
- arxiv: https://arxiv.org/abs/1811.08080
Adversarial Defense by Stratified Convolutional Sparse Coding
- arxiv: https://arxiv.org/abs/1812.00037
Learning Transferable Adversarial Examples via Ghost Networks
- intro: Johns Hopkins University & University of Oxford
- arxiv: https://arxiv.org/abs/1812.03413
Feature Denoising for Improving Adversarial Robustness
- intro: Johns Hopkins University & Facebook AI Research
- arxiv: https://arxiv.org/abs/1812.03411
Defense-VAE: A Fast and Accurate Defense against Adversarial Attacks
- intro: Georgia State University
- arxiv: https://arxiv.org/abs/1812.06570
Curls & Whey: Boosting Black-Box Adversarial Attacks
- intro: CVPR 2019 Oral
- arxiv: https://arxiv.org/abs/1904.01160
Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks
- arxiv: https://arxiv.org/abs/1904.00887
Black-box Adversarial Attacks on Video Recognition Models
- arxiv: https://arxiv.org/abs/1904.05181
Interpreting Adversarial Examples with Attributes
- arxiv: https://arxiv.org/abs/1904.08279
On the Design of Black-box Adversarial Examples by Leveraging Gradient-free Optimization and Operator Splitting Method
- intro: ICCV 2019
- arxiv: https://arxiv.org/abs/1907.11684
Deep Neural Rejection against Adversarial Examples
- arxiv: https://arxiv.org/abs/1910.00470
Unrestricted Adversarial Attacks for Semantic Segmentation
- arxiv: https://arxiv.org/abs/1910.02354