EagleEye: Attack-Agnostic Defenses against Adversarial Inputs



Deep learning systems are inherently vulnerable to adversarial inputs, which are maliciously crafted samples to trigger deep neural networks (DNNs) to misbehave, leading to disastrous consequences in security-critical applications. The fundamental challenges of defending against such attacks stem from their adaptive and variable nature: adversarial inputs are tailored to target DNNs, while crafting strategies vary greatly with concrete attacks. This project develops EagleEye, a universal, attack-agnostic defense framework that (i) works effectively against unseen attack variants, (ii) preserves predictive power of deep neural networks, (iii) complements existing defense mechanisms, and (iv) provides comprehensive diagnosis about potential risks in deep learning outputs.

In particular, EagleEye leverages a set of invariant properties underlying most attacks, including the “minimality principle”: to maximize attack evasiveness, an adversarial input is generated by applying the minimum possible distortion to a legitimate input. By exploiting such properties in a principled manner, EagleEye effectively discriminates adversarial inputs (integrity checking) and even uncovers their correct outputs (truth recovery).


Code & Datasets

nsf We are grateful for the National Science Foundation (NSF) to support our research.

rss facebook twitter github youtube mail spotify lastfm instagram linkedin google google-plus pinterest medium vimeo stackoverflow reddit quora quora