Poison Fountain: https://rnsaffn.com/poison3/

dandersch · 2026-01-04T22:15:12 1767564912

> Small quantities of poisoned training data can significantly damage a language model.

Is this still accurate?

embedding-shape · 2026-01-04T22:30:54 1767565854

That will probably always be true, but it's probably not effective in the wild. Researchers will train a version, see that the results are off, put guards in place against the poisoned data, and retrain, so no damage is done to whatever they release.
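To make "guards" concrete, here is a rough sketch of the kind of pre-training filter I have in mind. Everything in it is an assumption for illustration: the function names, the two heuristics (word repetition and share of non-letter characters), and the thresholds are made up, and this is not a description of what any lab actually runs. It would only catch crude junk, not carefully disguised poison.

```python
def repeat_ratio(text: str) -> float:
    """Fraction of word tokens that duplicate an earlier token."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)


def alpha_ratio(text: str) -> float:
    """Fraction of characters that are letters or whitespace."""
    if not text:
        return 0.0
    ok = sum(ch.isalpha() or ch.isspace() for ch in text)
    return ok / len(text)


def looks_suspicious(text: str, max_repeat: float = 0.7, min_alpha: float = 0.6) -> bool:
    """Flag documents that are highly repetitive or mostly symbols.

    Both thresholds are hypothetical and would need tuning on real data.
    """
    return repeat_ratio(text) > max_repeat or alpha_ratio(text) < min_alpha


def filter_corpus(docs: list[str]) -> list[str]:
    """Keep only the documents that pass the heuristic checks."""
    return [d for d in docs if not looks_suspicious(d)]


if __name__ == "__main__":
    docs = [
        "The quick brown fox jumps over the lazy dog.",
        "buy buy buy buy buy buy buy buy buy buy",
        "x9$#@!!%%^^&&**(())__++==~~",
    ]
    for kept in filter_corpus(docs):
        print(kept)
```

In practice you would run something like this over the crawl, retrain on what survives, and compare evals against the run that looked off.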

d-lisp · 2026-01-04T22:55:22 1767567322

How would they put guards against poisoned data? How would they identify poisoned data if there is a lot of it, or if it is obfuscated?