[Date] 2018.12.22
[Topic] Common terminology of adversarial attacks
Overview
This post is a translation of Section 2 of the paper "Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey", which covers the common terminology used in the adversarial attack literature.
1. Common terminology of adversarial attacks
In this section, we describe the common technical terms used in the literature related to adversarial attacks on deep learning in Computer Vision.
1.1 Adversarial example/image
An adversarial example/image is a modified version of a clean image that is intentionally perturbed (e.g. by adding noise) to confuse/fool a machine learning technique, such as a deep neural network.
1.2 Adversarial perturbation
An adversarial perturbation is the noise that is added to a clean image to turn it into an adversarial example.
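As an illustration (not part of the original paper), the relation between a clean image, an adversarial perturbation, and the resulting adversarial example can be sketched in a few lines of Python. The use of PyTorch, the eps bound of 8/255, and the random perturbation are all assumptions made for this example:

```python
# Minimal sketch, assuming PyTorch and image tensors with pixel values in [0, 1]:
# an adversarial example is the clean image plus a small perturbation,
# clipped back into the valid pixel range.
import torch

def make_adversarial(x_clean: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    """Add an adversarial perturbation to a clean image and clip to [0, 1]."""
    return torch.clamp(x_clean + delta, 0.0, 1.0)

# Example: a random perturbation bounded by eps = 8/255 (an illustrative choice).
eps = 8.0 / 255.0
x = torch.rand(1, 3, 224, 224)                   # stand-in for a clean image batch
delta = torch.empty_like(x).uniform_(-eps, eps)  # adversarial perturbation
x_adv = make_adversarial(x, delta)               # adversarial example
```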
1.3 Adversarial training
Adversarial training uses adversarial images, besides the clean images, to train machine learning models.
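A minimal sketch of what one such training step might look like, assuming PyTorch; `craft_adversarial` is a hypothetical helper standing in for whatever attack is used to generate the adversarial images, and the 50/50 loss weighting is an arbitrary choice for illustration:

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x_clean, y, craft_adversarial):
    """One training step that uses adversarial images besides the clean ones."""
    x_adv = craft_adversarial(model, x_clean, y)     # hypothetical attack helper
    optimizer.zero_grad()
    loss_clean = F.cross_entropy(model(x_clean), y)  # loss on clean images
    loss_adv = F.cross_entropy(model(x_adv), y)      # loss on adversarial images
    loss = 0.5 * (loss_clean + loss_adv)             # weighting is a design choice
    loss.backward()
    optimizer.step()
    return loss.item()
```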
1.4 Adversary
Adversary more commonly refers to the agent who creates an adversarial example. However, in some cases the example itself is also called an adversary.
1.5 Black-box attacks & ‘semi-black-box’ attacks
Black-box attacks feed a targeted model with adversarial examples (during testing) that are generated without knowledge of that model. In some instances, it is assumed that the adversary has limited knowledge of the model (e.g. its training procedure and/or its architecture) but definitely does not know its parameters. In other instances, using any information about the target model is referred to as a 'semi-black-box' attack. We use the former convention in this article.
1.6 White-box attacks
White-box attacks assume complete knowledge of the targeted model, including its parameter values, architecture, training method, and in some cases its training data as well.
1.7 Detector
A detector is a mechanism to (only) detect whether an image is an adversarial example.
1.8 Fooling ratio/rate
Fooling ratio/rate indicates the percentage of images on which a trained model changes its prediction label after the images are perturbed.
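For concreteness, the fooling rate can be computed as follows. This is a sketch assuming PyTorch tensors and a classification model, not code from the paper:

```python
import torch

@torch.no_grad()
def fooling_rate(model, x_clean, x_adv):
    """Percentage of images whose predicted label changes after perturbation."""
    pred_clean = model(x_clean).argmax(dim=1)
    pred_adv = model(x_adv).argmax(dim=1)
    return 100.0 * (pred_clean != pred_adv).float().mean().item()
```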
1.9 One-shot/one-step methods & iterative methods
One-shot/one-step methods generate an adversarial perturbation by performing a single-step computation, e.g. computing the gradient of the model loss once. The opposite are iterative methods, which perform the same computation multiple times to obtain a single perturbation. The latter are often computationally expensive.
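The contrast between the two can be sketched as follows, assuming PyTorch: the one-step version is an FGSM-style single gradient computation, while the iterative version repeats the same gradient step and projects back into an eps-ball around the clean image. The eps, alpha and step values are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def one_step_attack(model, x, y, eps=8/255):
    """One-shot method: a single gradient computation (FGSM-style)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return torch.clamp(x + eps * grad.sign(), 0, 1).detach()

def iterative_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Iterative method: repeat the gradient step, staying within an eps-ball."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()
        x_adv = x_orig + torch.clamp(x_adv - x_orig, -eps, eps)  # project back
        x_adv = torch.clamp(x_adv, 0, 1)                         # valid pixel range
    return x_adv.detach()
```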
1.10 Quasi-imperceptible perturbations
Quasi-imperceptible perturbations impair images only very slightly as far as human perception is concerned.
1.11 Rectifier
A rectifier modifies an adversarial example so as to restore the targeted model's prediction to its prediction on the clean version of the same example.
1.12 Targeted attacks & non-targeted attacks
Targeted attacks fool a model into falsely predicting a specific label for the adversarial image. They are the opposite of non-targeted attacks, in which the predicted label of the adversarial image is irrelevant, as long as it is not the correct label.
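The difference essentially comes down to which label the loss is computed against and the direction of the gradient step. A hedged single-step sketch, assuming PyTorch; for a targeted attack, `label` is the desired (wrong) target label rather than the true one:

```python
import torch
import torch.nn.functional as F

def single_step_attack(model, x, label, eps=8/255, targeted=False):
    """Non-targeted: ascend the loss w.r.t. the true label.
    Targeted: descend the loss w.r.t. the desired target label."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    grad = torch.autograd.grad(loss, x)[0]
    sign = -1.0 if targeted else 1.0
    return torch.clamp(x + sign * eps * grad.sign(), 0, 1).detach()
```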
1.13 Threat model
A threat model refers to the types of potential attacks considered by an approach, e.g. black-box attacks.
1.14 Transferability
Transferability refers to the ability of an adversarial example to remain effective even for models other than the one used to generate it.
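A simple way to measure transferability is to evaluate, on a different target model, adversarial examples that were crafted on another model. A sketch under the assumption of PyTorch classifiers:

```python
import torch

@torch.no_grad()
def transfer_success_rate(target_model, x_adv, y_true):
    """Percentage of adversarial examples (crafted on another model) that the
    target model also misclassifies."""
    pred = target_model(x_adv).argmax(dim=1)
    return 100.0 * (pred != y_true).float().mean().item()
```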
1.15 Universal perturbation & universality
A universal perturbation is able to fool a given model on 'any' image with high probability. Note that universality refers to the property of a perturbation being 'image-agnostic', as opposed to having good transferability.
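To illustrate the 'image-agnostic' aspect: a single perturbation v is added to every image and the fooling rate of the given model is measured. This sketch (assuming PyTorch) only evaluates a given universal perturbation; it does not show how to compute one:

```python
import torch

@torch.no_grad()
def universal_fooling_rate(model, images, v):
    """Apply one image-agnostic perturbation v to every image and measure
    how often the model's prediction changes."""
    pred_clean = model(images).argmax(dim=1)
    pred_adv = model(torch.clamp(images + v, 0, 1)).argmax(dim=1)
    return 100.0 * (pred_clean != pred_adv).float().mean().item()
```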