Every day, neural networks are being integrated into a wide range of high-impact systems, from self-driving cars to biomedical screening and diagnosis. This raises a critical question: just how secure are these systems against attack?
The unfortunate answer is that they are quite vulnerable. Researchers at Georgia Tech and Intel have recently demonstrated how attackers can trick computer vision systems (i.e., neural networks) into seeing things that don’t actually exist. This has serious implications for self-driving cars and other safety-critical systems where human life is at stake.
To preempt these attacks, Georgia Tech’s Polo Club of Data Science and Intel are working to defend deep learning systems from adversarial attacks through DARPA’s Guaranteeing Artificial Intelligence Robustness against Deception (GARD) program.
Our research aims to detect adversarial attacks in real time, before an attacker can cause significant damage. Currently, these deep learning systems do not recognize objects the way humans do. For example, when humans see a bicycle, we see its handlebar, frame, wheels, saddle, and pedals (Figure 2, top). Through our visual perception and cognition, we synthesize these detections with our knowledge to determine that we are actually looking at a bicycle.
However, when a bicycle (or a stop sign) is modified to fool a model into misclassifying it as, say, a bird, we humans still see the bicycle’s robust features (e.g., its handlebar). The deep learning system, on the other hand, fails to perceive these robust features and is often tricked into misclassifying the image.
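To make this intuition concrete, here is a minimal sketch of a feature-based consistency check. The part sets and the `looks_adversarial` helper are hypothetical placeholders for illustration only, not the actual defense system: the point is simply that if the parts detected in an image (handlebar, wheels, pedals) do not match the parts we would expect for the model’s predicted class (bird), the prediction can be flagged as suspicious.

```python
# Hypothetical sketch: flag predictions whose detected parts disagree
# with the robust features expected for the predicted class.

# Expected robust features (parts) for a few example classes.
EXPECTED_PARTS = {
    "bicycle": {"handlebar", "frame", "wheel", "saddle", "pedal"},
    "bird": {"beak", "wing", "tail", "leg"},
}

def looks_adversarial(predicted_class, detected_parts, min_overlap=0.5):
    """Return True if the detected parts overlap too little with the
    parts we would expect for the predicted class."""
    expected = EXPECTED_PARTS.get(predicted_class, set())
    if not expected:
        return False  # no knowledge about this class; cannot judge
    overlap = len(expected & detected_parts) / len(expected)
    return overlap < min_overlap

# Example: the classifier says "bird", but the image still yields
# bicycle parts -- the mismatch suggests an adversarial input.
detected = {"handlebar", "wheel", "pedal"}
print(looks_adversarial("bird", detected))  # True -> flag for review
```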
The question is: how do we build this detection capability, which comes naturally to human beings, into deep learning models to protect them from harm?