In this work, we show how to jointly exploit adversarial perturbation and
model poisoning vulnerabilities to practically launch a new stealthy attack,
dubbed AdvTrojan. AdvTrojan is stealthy because it can be activated only when:
1) a carefully crafted adversarial perturbation is injected into the input
examples during inference, and 2) a Trojan backdoor is implanted during the
training process of the model. We leverage adversarial noise in the input space
to move Trojan-infected examples across the model's decision boundary, making
the attack difficult to detect. This stealthy behavior fools users into
mistakenly trusting the infected model as a robust classifier against
adversarial examples. Like conventional Trojan backdoor attacks, AdvTrojan can
be implemented by poisoning only the training data. Our thorough
analysis and extensive experiments on several benchmark datasets show that
AdvTrojan can bypass existing defenses with a success rate close to 100% in
most of our experimental scenarios and can be extended to attack federated
learning tasks as well.
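The abstract does not include code, but the two ingredients it names can be illustrated with a minimal sketch: a corner-patch Trojan trigger stamped during data poisoning, combined with an FGSM-style sign-of-gradient perturbation at inference time. The patch location, trigger value, epsilon, and the use of FGSM here are illustrative assumptions, not details from the paper.

```python
import numpy as np

def stamp_trigger(x, trigger_value=1.0, size=3):
    """Stamp a small square trigger patch in the bottom-right corner,
    as in conventional Trojan backdoor poisoning (patch shape/location
    are illustrative assumptions)."""
    x = x.copy()
    x[-size:, -size:] = trigger_value
    return x

def fgsm_perturb(x, grad, epsilon=0.03):
    """FGSM-style adversarial perturbation: step along the sign of the
    loss gradient, clipped to the valid pixel range [0, 1]."""
    return np.clip(x + epsilon * np.sign(grad), 0.0, 1.0)

# Toy 8x8 grayscale "image" and a stand-in loss gradient (in practice
# the gradient would come from the victim model).
x = np.random.default_rng(0).random((8, 8))
grad = np.random.default_rng(1).standard_normal((8, 8))

x_poisoned = stamp_trigger(x)               # training-time poisoning
x_trigger = fgsm_perturb(x_poisoned, grad)  # inference-time activation
```

In this sketch, neither ingredient alone activates the backdoor: the trigger patch is implanted during training, and the bounded adversarial noise pushes the triggered input across the decision boundary at inference, matching the two activation conditions listed above.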