IBM Research has been looking at data and model poisoning for some time and has open-sourced the Adversarial Robustness Toolbox [0]. They also made a game where you try to find a backdoor [1].
I would guess that it might be possible to poison a model by perturbing training examples in a way that is imperceptible to humans. That is, I wonder whether you could tamper with the noise or the frequency-domain spectrum of a training example such that a model trained on it would have adversarial singularities that are easy to find, given knowledge of how the imperceptible components of the training data were perturbed.
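For what it's worth, here is a minimal sketch of what that could look like, assuming grayscale training images as float arrays in [0, 1] and plain numpy: a small bump is added to a couple of chosen FFT coefficients of a fraction of the training images, which are then relabeled to an attacker-chosen class. The coefficient positions, amplitude, and poison rate are all illustrative, not taken from the paper.

    import numpy as np

    def add_freq_trigger(img, coords=((30, 30), (31, 29)), amplitude=0.02):
        """Embed a small perturbation at fixed frequency coordinates of a
        grayscale image in [0, 1] (assumed large enough to index them)."""
        spectrum = np.fft.fft2(img)
        for (u, v) in coords:
            spectrum[u, v] += amplitude * img.size    # bump the chosen coefficient
            spectrum[-u, -v] += amplitude * img.size  # mirror it so the inverse stays real
        poisoned = np.real(np.fft.ifft2(spectrum))
        return np.clip(poisoned, 0.0, 1.0)

    def poison_dataset(images, labels, target_label, poison_rate=0.05, seed=0):
        """Apply the trigger to a small random fraction of the data and
        relabel those examples to the attacker's target class."""
        rng = np.random.default_rng(seed)
        images, labels = images.copy(), labels.copy()
        n_poison = int(poison_rate * len(images))
        for i in rng.choice(len(images), size=n_poison, replace=False):
            images[i] = add_freq_trigger(images[i])
            labels[i] = target_label
        return images, labels

A model trained on the poisoned set can behave normally on clean data while associating the hidden spectral pattern with the target class, which is exactly the kind of weakness only the perturber knows how to reach.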
Yes, much of the research on adversarial examples is essentially about how to generate adversarial examples with minimally perceptible perturbations. IMHO the difficult part is having a good model of what actually is more or less perceptible to humans. However, since that overlaps with other popular research areas, such as error metrics for realistic image generation in GANs, there are reasonable ways to optimize for it.
On the other hand, a seemingly benign perturbation is not necessarily an imperceptible one. A larger, visually obvious perturbation with a plausible explanation can be less suspicious than a smaller but weirder perturbation.
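As a concrete illustration of the "minimally perceptible perturbation" framing, here is a standard FGSM-style sketch in PyTorch: the eps bound on the L-infinity norm is the usual crude stand-in for perceptibility, and that constraint is what you would want to replace with a better perceptual metric (SSIM, LPIPS, etc.). `model` is any differentiable classifier over image batches in [0, 1]; nothing here is specific to the paper.

    import torch
    import torch.nn.functional as F

    def fgsm_perturb(model, images, labels, eps=2 / 255):
        """One gradient-sign step inside an eps L-infinity ball around
        `images` (a batch of tensors with values in [0, 1])."""
        images = images.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(images), labels)
        loss.backward()
        adv = images + eps * images.grad.sign()   # move each pixel by at most eps
        return torch.clamp(adv, 0.0, 1.0).detach()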
Hasn't that already been studied in psychophysics, specifically as applied to lossy/perceptual compression?
I suppose the real goal would be a training procedure that tries to ignore everything outside the human percept. Metamers, masking, noise, and attention... oh my.
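One crude version of that idea, sketched below under the assumption that "outside the human percept" just means "above some spatial-frequency cutoff": low-pass every example before it reaches the model, so imperceptible high-frequency content (including a hidden trigger living there) is stripped out. A real psychophysical model would account for contrast sensitivity, masking, and so on; the cutoff here is purely illustrative.

    import numpy as np

    def perceptual_lowpass(img, cutoff=0.25):
        """Zero out frequency content above `cutoff` cycles/pixel for a
        grayscale image in [0, 1], then transform back."""
        h, w = img.shape
        fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies in cycles/pixel
        fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
        keep = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
        filtered = np.fft.ifft2(np.fft.fft2(img) * keep)
        return np.clip(np.real(filtered), 0.0, 1.0)

Applied as a fixed preprocessing step to every training (and test) example, the model never sees the out-of-percept components an attacker might use.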
That's the idea! We know about adversarial inputs at inference time, and this paper talks about adversarial perturbation of the model itself during training. What about undetectable adversarial training inputs, where people do their own training but the model still ends up with weaknesses that are hard to find for anyone except the adversary?
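Continuing the hypothetical frequency-trigger attack sketched upthread: at deployment, the adversary, who alone knows which coefficients were perturbed, embeds the same pattern in an otherwise benign input to reach the planted weakness. `add_freq_trigger` is the illustrative helper from that earlier sketch, and `model` is assumed to map a batch of images to class scores; none of this comes from the paper.

    import numpy as np

    def exploit_backdoor(model, benign_img):
        """Compare the poisoned model's prediction on a clean image with
        its prediction on the same image carrying the secret trigger."""
        clean = np.argmax(model(benign_img[None, ...]))
        triggered = add_freq_trigger(benign_img)          # visually near-identical
        backdoored = np.argmax(model(triggered[None, ...]))
        return clean, backdoored   # they differ only if the backdoor was learned

The point is that anyone else training or auditing the model would have to search the whole input space to stumble on the weakness, while the adversary can construct it on demand.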
You should really consider things from a “what can humans perceive” standpoint.
There are things you can do with ML and eye saccades that you will literally never see because of perceptual delay. If I can push a saccadic event below 50 ms, you will never notice it.
https://en.wikipedia.org/wiki/Saccade
[0] https://art360.mybluemix.net/resources
[1] https://guessthebackdoor.mybluemix.net/