IBM Research has been looking at data and model poisoning for some time and has open-sourced the Adversarial Robustness Toolbox [0]. They also made a game where you try to find a backdoor [1].
I would guess that it might be possible to poison a model by perturbing training examples in a way that is imperceptible to humans. That is, I wonder whether you could tamper with the noise or the frequency-domain spectrum of a training example such that a model trained on it would have adversarial singularities that are easy to find, given knowledge of how the imperceptible components of the training data were perturbed.
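For what it's worth, here is a minimal sketch of what that could look like, assuming grayscale training images as float arrays in [0, 1] and plain numpy: a small bump is added to a couple of chosen FFT coefficients of a fraction of the training images, which are then relabeled to an attacker-chosen class. The coefficient positions, amplitude, and poison rate are all illustrative, not taken from the paper.

    import numpy as np

    def add_freq_trigger(img, coords=((30, 30), (31, 29)), amplitude=0.02):
        """Embed a small perturbation at fixed frequency coordinates of a
        grayscale image in [0, 1] (assumed large enough to index them)."""
        spectrum = np.fft.fft2(img)
        for (u, v) in coords:
            spectrum[u, v] += amplitude * img.size    # bump the chosen coefficient
            spectrum[-u, -v] += amplitude * img.size  # mirror it so the inverse stays real
        poisoned = np.real(np.fft.ifft2(spectrum))
        return np.clip(poisoned, 0.0, 1.0)

    def poison_dataset(images, labels, target_label, poison_rate=0.05, seed=0):
        """Apply the trigger to a small random fraction of the data and
        relabel those examples to the attacker's target class."""
        rng = np.random.default_rng(seed)
        images, labels = images.copy(), labels.copy()
        n_poison = int(poison_rate * len(images))
        for i in rng.choice(len(images), size=n_poison, replace=False):
            images[i] = add_freq_trigger(images[i])
            labels[i] = target_label
        return images, labels

A model trained on the poisoned set can behave normally on clean data while associating the hidden spectral pattern with the target class, which is exactly the kind of weakness only the perturber knows how to reach.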
Yes, much of the research on adversarial examples is essentially about how to generate adversarial examples with minimally perceptible perturbations. IMHO the difficult part is having a good model of what actually is more or less perceptible to humans. However, since that overlaps with other popular research areas, such as error metrics for realistic image generation in GANs, there are reasonable ways to optimize for it.
On the other hand, a seemingly benign perturbation is not necessarily an imperceptible one. A larger, visually obvious perturbation with a plausible explanation can be less suspicious than a smaller but weirder perturbation.
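As a concrete illustration of the "minimally perceptible perturbation" framing, here is a standard FGSM-style sketch in PyTorch: the eps bound on the L-infinity norm is the usual crude stand-in for perceptibility, and that constraint is what you would want to replace with a better perceptual metric (SSIM, LPIPS, etc.). `model` is any differentiable classifier over image batches in [0, 1]; nothing here is specific to the paper.

    import torch
    import torch.nn.functional as F

    def fgsm_perturb(model, images, labels, eps=2 / 255):
        """One gradient-sign step inside an eps L-infinity ball around
        `images` (a batch of tensors with values in [0, 1])."""
        images = images.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(images), labels)
        loss.backward()
        adv = images + eps * images.grad.sign()   # move each pixel by at most eps
        return torch.clamp(adv, 0.0, 1.0).detach()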
Hasn't that already been studied in psychophysics, specifically as applied to lossy/perceptual compression?
I suppose the real goal would be a training procedure that tries to ignore everything outside the human percept. Metamers, masking, noise, and attention... oh my.
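One crude version of that idea, sketched below under the assumption that "outside the human percept" just means "above some spatial-frequency cutoff": low-pass every example before it reaches the model, so imperceptible high-frequency content (including a hidden trigger living there) is stripped out. A real psychophysical model would account for contrast sensitivity, masking, and so on; the cutoff here is purely illustrative.

    import numpy as np

    def perceptual_lowpass(img, cutoff=0.25):
        """Zero out frequency content above `cutoff` cycles/pixel for a
        grayscale image in [0, 1], then transform back."""
        h, w = img.shape
        fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies in cycles/pixel
        fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
        keep = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
        filtered = np.fft.ifft2(np.fft.fft2(img) * keep)
        return np.clip(np.real(filtered), 0.0, 1.0)

Applied as a fixed preprocessing step to every training (and test) example, the model never sees the out-of-percept components an attacker might use.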
That's the idea! We know about adversarial inputs at inference time, and this paper talks about adversarial perturbation of the model itself during training. What about undetectable adversarial training inputs, where people do their own training but the model still ends up with weaknesses that are hard to find for anyone except the adversary?
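Continuing the hypothetical frequency-trigger attack sketched upthread: at deployment, the adversary, who alone knows which coefficients were perturbed, embeds the same pattern in an otherwise benign input to reach the planted weakness. `add_freq_trigger` is the illustrative helper from that earlier sketch, and `model` is assumed to map a batch of images to class scores; none of this comes from the paper.

    import numpy as np

    def exploit_backdoor(model, benign_img):
        """Compare the poisoned model's prediction on a clean image with
        its prediction on the same image carrying the secret trigger."""
        clean = np.argmax(model(benign_img[None, ...]))
        triggered = add_freq_trigger(benign_img)          # visually near-identical
        backdoored = np.argmax(model(triggered[None, ...]))
        return clean, backdoored   # they differ only if the backdoor was learned

The point is that anyone else training or auditing the model would have to search the whole input space to stumble on the weakness, while the adversary can construct it on demand.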
You should really consider things from a “what can humans perceive” standpoint.
There are things you can do with ML and eye saccades that you will literally never see because of perceptual delay. If I can push a saccadic event below 50 ms, you will never notice it.
https://en.wikipedia.org/wiki/Saccade
[0] https://art360.mybluemix.net/resources
[1] https://guessthebackdoor.mybluemix.net/