Adversarial attacks have been expanded to speaker recognition (SR). However,
existing attacks are often assessed using different SR models, recognition
tasks and datasets, and only a few adversarial defenses borrowed from computer
vision are considered. Yet, these defenses have not been thoroughly evaluated
against adaptive attacks. Thus, there is still a lack of quantitative
understanding about the strengths and limitations of adversarial attacks and
defenses. More effective defenses are also required for securing SR systems. To
bridge this gap, we present SEC4SR, the first platform enabling researchers to
systematically and comprehensively evaluate adversarial attacks and defenses in
SR. SEC4SR incorporates 4 white-box and 2 black-box attacks and 24 defenses,
including our novel feature-level transformations. It also contains techniques
for mounting adaptive attacks. Using SEC4SR, we conduct the largest-scale
empirical study to date on adversarial attacks and defenses in SR,
involving 23 defenses, 15 attacks and 4 attack settings. Our study provides
many useful findings that may advance future research, such as: (1) all the
transformations slightly degrade accuracy on benign examples and their
effectiveness varies with attacks; (2) most transformations become less effective
under adaptive attacks, but some transformations become more effective; (3) few
transformations combined with adversarial training yield stronger defenses over
some but not all attacks, while our feature-level transformation combined with
adversarial training yields the strongest defense over all the attacks.
Extensive experiments demonstrate the capabilities and advantages of SEC4SR, which
can benefit future research in SR.
