In short: many have pointed out that AI progress is decreasing the cost to “polish” or “embroider” all sorts of artifacts. This means that communities that once used polish (broadly construed) as a cheap proxy for community-specific measures of “real value” are being forced to consider other signals of provenance and verifiability (or are simply losing the ability to curate information). This is a problem that communities and institutions face every time there are new ways to get “cheap polish”, but it’s especially bad right now. Even communities with careful review processes to prevent “false positives” are affected.
[Image: Lace. Samuel L. Goldenberg, Public domain, via Wikimedia Commons]
This is a topic that’s already been discussed a lot!¹ This post is an attempt to connect ongoing, well-trodden discussions to ideas for improving curation and introspection, with a focus on peer review in particular. It’s also an opportunity to highlight the data leverage angle. Threats to the production of high-quality information artifacts are also a problem for the AI field itself: the communities and institutions that curate knowledge provide data for pre-training and for some kinds of post-training. And the training of information producers like scientists is itself a critical process for the eventual evaluation of AI systems. Finally, at the end of the post, I’ll foreshadow another possible solution to this problem: what if we go “full transparency” and just post the full AI-interaction transcripts for any AI-touched artifacts we produce?
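To make that “full transparency” idea slightly more concrete before the post returns to it, here is a minimal sketch of what publishing a transcript alongside an artifact could look like. Everything here is my own hypothetical illustration (the file names, manifest fields, and format are assumptions, not a scheme proposed elsewhere in this post): the artifact and its full AI-interaction transcript are content-hashed and paired in a small provenance manifest, so a reader can check that the posted transcript actually matches the posted artifact.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Content hash so readers can verify the artifact/transcript pairing."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(artifact: Path, transcript: Path, model_name: str) -> dict:
    """Pair an artifact with the full AI-interaction transcript behind it.

    The field names here are illustrative, not a standard.
    """
    return {
        "artifact": {"file": artifact.name, "sha256": sha256_of(artifact)},
        "transcript": {"file": transcript.name, "sha256": sha256_of(transcript)},
        "model": model_name,  # which AI system was involved
        "published_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    # Hypothetical file names; in practice these would sit next to the
    # published artifact (e.g., in a paper's supplementary materials).
    manifest = build_manifest(
        Path("paper.pdf"), Path("transcript.jsonl"), "some-model"
    )
    Path("provenance.json").write_text(json.dumps(manifest, indent=2))
```

The hashes are the whole point of a sketch like this: polish itself is now cheap, but a checkable link between an artifact and the interactions that produced it is exactly the kind of provenance signal curating communities could fall back on.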