LLMs become more covertly racist with human intervention
Since their inception, it’s been clear that large language models like ChatGPT absorb racist views from the millions of pages of the internet they are trained on. Developers have responded by trying to make them less toxic. But new research suggests that those efforts, especially as models get larger, are only curbing racist views that…