> Small quantities of poisoned training data can significantly damage a language model.

Is this still accurate?


That will probably always be true, but it's also probably not effective in the wild. Researchers will train a version, see that the results are off, put guards in place against poisoned data, and retrain, so no damage is done to whatever they release.

How would they put guards in place against poisoned data? How would they identify poisoned data if there is a lot of it, or if it is obfuscated?


