Preprint server arXiv has announced a new policy targeting AI-generated content: authors whose papers contain unchecked model output — including hallucinated references or unfilled model instructions left in the text — will receive a one-year submission ban, and every future submission will require prior acceptance at a peer-reviewed journal. The policy was described on May 15, 2026 by Thomas Dietterich, emeritus professor at Oregon State University and a member of both arXiv's editorial advisory council and moderation team, in a thread on X.
Key takeaways
- One-year arXiv submission ban for unchecked AI content — applied to all listed co-authors
- After the ban: acceptance by a peer-reviewed journal required before any future arXiv submission
- Examples of "incontrovertible evidence": hallucinated references, model metacomments (e.g., "fill in this table with real data")
- Policy derives from existing arXiv moderation standards — not a new rulebook
- Appeal process exists — arXiv has a procedure for contesting moderation decisions
Background: AI slop in scientific literature
The problem of AI slop in academic publications has been growing for years. Unchecked LLM output has appeared in peer-reviewed articles: fake citations, non-existent passages, and, in one widely cited 2024 incident, a diagram of a rat with anatomically implausible genitalia that passed review at a biomedical journal. As a preprint server that sits upstream of formal peer review, arXiv had until now served only as an informal first barrier.
Existing arXiv moderation standards required authors to exercise care in preparation — appropriate structure, figures, tables, references — without specific reference to AI tools. The new policy fills that gap by interpreting the existing "scrupulousness and care" requirement in the context of model-generated content.
The penalty mechanism and what violates the rules
The penalty operates in two stages. Stage one: a one-year block on arXiv submissions, applied to all listed co-authors, not just the person responsible for a specific passage. Stage two: after the ban expires, the author may submit to arXiv again, but only work that has already been accepted by a peer-reviewed journal.
Central to the policy is the concept of "incontrovertible evidence." Dietterich named two types: first, hallucinated references — citations to papers that do not exist, generated by a language model as plausible-sounding titles. Second, model metacomments left in the text — phrases such as "here is a 200-word summary, would you like changes?" or table instructions reading "this data is illustrative, fill in the real numbers from your experiments."
Both types point to the same conclusion: the author did not check the model's output before submission. In arXiv's view, this makes the entire paper untrustworthy — if one part was left unverified, no part can be trusted.
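For authors who want a last check before submitting, the second type of evidence lends itself to a simple text scan. The following is a minimal sketch, not arXiv's moderation tooling: the function name `find_metacomments` and the pattern list are hypothetical, built only from the example phrases quoted above, and a real manuscript would need a broader list plus a human read-through.

```python
import re

# Hypothetical pattern list: derived from the example phrases quoted in
# this article, NOT from any published arXiv moderation criteria.
METACOMMENT_PATTERNS = [
    r"here is a \d+-word summary",
    r"would you like (any )?changes",
    r"this data is illustrative",
    r"fill in (the|this) .{0,40}(real|actual) (data|numbers)",
]

_COMPILED = [re.compile(p, re.IGNORECASE) for p in METACOMMENT_PATTERNS]


def find_metacomments(text):
    """Return (line_number, stripped_line) for each line matching a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(rx.search(line) for rx in _COMPILED):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    draft = (
        "We evaluate the method on three benchmarks.\n"
        "Here is a 200-word summary, would you like changes?\n"
        "Table 1: this data is illustrative, fill in the real numbers.\n"
    )
    for number, line in find_metacomments(draft):
        print(f"line {number}: {line}")
```

A scan like this only catches leftover phrases that happen to be on the list; it says nothing about whether the references exist, which still requires looking each one up.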
Consequences for preprint-dependent fields
In physics, mathematics, and theoretical computer science, an arXiv preprint is effectively the first public form of publication. Papers are cited, commented on, and built upon before formal peer review. A one-year ban means a year of absence from this layer of scientific circulation — with potentially serious consequences for doctoral students and early-career researchers.
There is also a risk of abuse: a malicious actor could list people who had no involvement in the work as co-authors, then submit a contaminated paper to trigger a ban against them. ArXiv's appeal process is designed to handle such scenarios, though Dietterich's announcement did not detail the mechanism.
Why it matters
ArXiv is critical infrastructure for rapid scientific knowledge exchange — particularly in AI and physics, where the speed of result sharing matters more than in traditional publication cycles. Introducing hard consequences for unchecked AI-generated content is the clearest signal yet that key scientific repositories do not intend to passively watch the degradation of submission quality.
The policy also sets a precedent: if arXiv sustains and enforces these rules, it becomes a model for other preprint servers (bioRxiv, medRxiv, SSRN) and journal editors, who until now have largely limited themselves to declarative guidelines without enforcement mechanisms. The key open question is feasibility — detecting hallucinated references is relatively straightforward, but identifying subtle factual errors generated by models still requires expert review.
What's next?
- Policy is in effect — arXiv confirmed it through moderator Dietterich in a public thread on May 15, 2026
- Ars Technica received a response from arXiv leadership suggesting implementation details are still being finalized — the final form of the policy may be modified
- Other preprint servers (bioRxiv, medRxiv, SSRN) have not announced similar policies — industry response may follow in coming months

