OpenAI's Text Watermarking Gamble: High Accuracy, Higher Stakes for Europe's Non-Native Speakers
6 min read

OpenAI has developed text watermarking technology capable of identifying ChatGPT-generated content with claimed accuracy of 99.9%, but the company is holding back from release. Concerns centre on circumvention by bad actors and disproportionate harm to non-English speakers, a category that encompasses a significant share of European users.

OpenAI has built a text watermarking system that can identify ChatGPT-generated content with near-perfect accuracy in controlled tests, yet the company refuses to ship it. That restraint is deliberate, and for once it is probably the right call. The technology embeds an invisible signature into ChatGPT's word-selection process during generation, creating a detectable pattern that survives light editing. What it cannot survive is determined circumvention, and that gap matters enormously for European institutions weighing whether to trust it.

How the Technology Actually Works

Key figures

Below 30% — real-world accuracy of OpenAI's previous AI detector, which the company shut down in July 2023, raising the stakes considerably for the watermarking successor.

800+ — European universities represented by the European University Association across 48 countries; the EUA has called for nuanced AI use policies that distinguish between AI-assisted drafting and full AI authorship.

24 — official EU languages; any watermarking rollout that disproportionately flags non-native English speakers would affect a substantial share of European students and professionals.

Text watermarking does not tag a document after the fact. Instead, it subtly biases which tokens ChatGPT selects during generation, producing a statistical fingerprint that a detection tool can later read. Because the signature is baked into the generation process itself rather than appended as metadata, it is invisible to the reader and resistant to routine paraphrasing.

OpenAI's own research shows the system holds up well against localised tampering. Run the output through a thesaurus or shuffle a few sentences, and the watermark largely survives. However, feed the text through a translation engine or a rival large language model and the signature degrades significantly. That vulnerability is not a minor implementation detail; it is a fundamental weakness that any determined bad actor can exploit with minimal technical knowledge.
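OpenAI has not published the details of its scheme, but the behaviour described above matches the "green list" token-biasing approach from the academic watermarking literature. The sketch below is a minimal toy illustration of that general idea, not OpenAI's implementation: generation nudges the model toward a pseudo-random subset of the vocabulary seeded by the previous token, and detection counts how often tokens land in that subset. It also shows why translation defeats the signal: replace the token sequence and the statistical excess vanishes.

```python
import hashlib
import math
import random

# Toy vocabulary standing in for a real model's token set.
VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran", "fast", "slowly"]

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    """Derive a reproducible 'green' subset of the vocabulary from the
    previous token, so detection needs no stored metadata."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    return set(rng.sample(vocab, int(len(vocab) * fraction)))

def watermarked_pick(prev_token: str, vocab: list[str],
                     scores: list[float], bias: float = 2.0) -> str:
    """Nudge generation toward green-list tokens by adding a small bonus
    to their scores before choosing the next token (greedy, for brevity)."""
    greens = green_list(prev_token, vocab)
    adjusted = [s + (bias if t in greens else 0.0) for t, s in zip(vocab, scores)]
    return max(zip(vocab, adjusted), key=lambda pair: pair[1])[0]

def detection_z_score(tokens: list[str], vocab: list[str],
                      fraction: float = 0.5) -> float:
    """Count how often each token lands in its predecessor's green list
    and compare with the rate expected by chance. Translation or heavy
    paraphrasing replaces the token sequence, erasing the signal."""
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(1 for prev, tok in pairs if tok in green_list(prev, vocab, fraction))
    n = len(pairs)
    expected, std = n * fraction, math.sqrt(n * fraction * (1 - fraction))
    return (hits - expected) / std  # large positive z suggests a watermark
```

Because the green list is recomputed from the text itself, light edits leave most token pairs intact and the z-score stays high, whereas a round trip through a translator rebuilds every pair from scratch.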

The company shut down its previous AI text detector in July 2023 after accuracy rates proved embarrassingly low, reportedly below 30 per cent in real-world conditions. The stakes for this successor are therefore considerably higher. A second public failure would not merely damage OpenAI's credibility; it would set back legitimate efforts to develop reliable provenance tools across the industry.

"The text watermarking method we're developing is technically promising, but has important risks we're weighing whilst we research alternatives, including susceptibility to circumvention by bad actors and the potential to disproportionately impact groups like non-English speakers," an OpenAI spokesperson told TechCrunch.

[Image: students working at laptops in a modern European university library]

The European Dimension: Non-Native Speakers at Risk

For European audiences, the non-English speaker concern is not an abstract worry about distant markets. The EU has 24 official languages. Across France, Germany, Poland, Italy, Spain, and beyond, millions of students and professionals routinely use ChatGPT as a writing aid when composing in English, a language that is not their first. Under a naive watermarking rollout, their documents could be flagged as AI-generated simply because they leaned on ChatGPT to polish grammar or smooth phrasing, even when the underlying ideas and arguments were entirely their own.

Sandra Wachter, Professor of Technology and Regulation at the Oxford Internet Institute, has consistently argued that automated classification systems carry discriminatory risks when they treat non-native speakers differently from native ones. Her research on algorithmic decision-making in high-stakes contexts applies directly here: a detection system with a differential false-positive rate along language lines is not a neutral tool. It is a mechanism that penalises people for where they were born.

The concern extends into higher education, where European universities are still negotiating what responsible AI use looks like. The European University Association, which represents more than 800 institutions across 48 countries, has called for nuanced approaches that distinguish between AI-assisted drafting and wholesale AI authorship. A blunt watermarking system risks collapsing that distinction entirely.

Regulatory Pressure and the EU AI Act

OpenAI's caution also reflects the regulatory environment it faces in Europe. The EU AI Act, which entered into force in August 2024, imposes transparency obligations on providers of general-purpose AI models. Article 50 of the Act requires that AI-generated content be marked in a machine-readable format where technically feasible. Watermarking is one candidate mechanism, but the Act does not mandate a specific technical approach, leaving room for alternatives such as cryptographic provenance or metadata-based labelling.
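To make the contrast concrete, metadata-based labelling attaches a machine-readable record alongside the text rather than altering the text itself. The sketch below is purely illustrative: the AI Act mandates machine-readable marking but prescribes no schema, so every field name here is an assumption, not an official format.

```python
import hashlib
import json
from datetime import datetime, timezone

def label_generated_text(text: str, model_name: str) -> str:
    """Attach a machine-readable provenance record to AI-generated text.
    Field names are illustrative only: Article 50 requires machine-readable
    marking where technically feasible but specifies no particular schema."""
    record = {
        "ai_generated": True,
        "generator": model_name,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "content_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
    }
    return json.dumps(record, sort_keys=True)
```

The trade-off is the mirror image of watermarking: a label like this is trivially readable by any compliance tool, but equally trivially stripped, whereas a watermark travels with the text until the text itself is rewritten.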

Dragos Tudorache, the Romanian MEP who co-led the European Parliament's negotiations on the AI Act, has emphasised that provenance tools must be evaluated not only for technical efficacy but for their societal impact. Writing and speaking about the Act's implementation, he has argued that a tool which reliably catches bad actors whilst systematically disadvantaging non-native speakers would fail both goals simultaneously. That framing neatly captures OpenAI's dilemma.

The broader European regulatory posture also matters here. Ofcom in the UK is developing online safety frameworks that intersect with AI-generated content, and the Information Commissioner's Office has flagged data-processing concerns around AI detection tools. Any watermarking system deployed at scale in the UK would need to satisfy both bodies, adding compliance complexity on top of the technical challenges.

Industry Alternatives Under Consideration

OpenAI is not the only organisation working on this problem, and the alternatives being explored are instructive about where the industry is heading.

The collaborative disclosure approach is gaining traction in European academic circles. Rather than trying to catch students using AI after the fact, institutions including ETH Zurich have piloted assignment structures that require students to document how they used AI tools, treating the interaction as part of the learning record rather than evidence of misconduct.

Detection Accuracy in Context

The 99.9 per cent accuracy figure that has circulated in coverage of OpenAI's watermarking research requires careful interpretation. That figure applies to controlled conditions where the original ChatGPT output is tested without significant post-processing. In real-world deployment, where users routinely edit, translate, and reformat AI-assisted text, accuracy drops sharply. Earlier statistical analysis methods achieved 40 to 60 per cent accuracy, whilst hybrid approaches combining multiple signals have reached 70 to 85 per cent, still well below the threshold required for consequential decisions such as academic disciplinary proceedings.
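A quick base-rate calculation shows why a headline accuracy figure is not enough for disciplinary decisions. The 0.1 per cent false-positive rate below is an assumption for illustration, since a single "accuracy" number does not directly specify the error rates that matter.

```python
def expected_false_flags(documents: int, false_positive_rate: float) -> float:
    """At institutional scale, even a tiny false-positive rate produces
    a large absolute number of wrongly flagged authors."""
    return documents * false_positive_rate

# If the headline 99.9% figure implied a 0.1% false-positive rate
# (an assumption, for illustration), screening one million human-written
# essays would still wrongly flag about a thousand of them.
wrongly_flagged = expected_false_flags(1_000_000, 0.001)
```

A thousand wrongly accused students per million essays is the optimistic, laboratory-conditions case; the field accuracies of 40 to 85 per cent cited above would multiply that number many times over.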

The gap between laboratory performance and field performance is a recurring problem in AI detection. OpenAI's previous detector failed largely because its training data did not reflect the diversity of editing and rewriting that real users apply to AI-generated text. There is no reason to assume the watermarking system will be immune to the same degradation once it meets the messiness of actual usage.

What Comes Next

OpenAI has offered no release timeline. The company's messaging suggests that launch depends on resolving the circumvention vulnerability and the non-native speaker impact, neither of which has an obvious near-term technical fix. That timeline ambiguity is frustrating for European educational institutions seeking clarity, but it is preferable to a rushed release that entrenches a flawed standard.

The more productive near-term path is the one already emerging from the European regulatory and academic ecosystem: invest in transparent disclosure frameworks, build AI literacy into curricula, and use provenance tools as one signal among many rather than as a binary verdict. Detection technology will improve, but the institutional and pedagogical frameworks needed to use it responsibly need to be built now, regardless of when OpenAI decides to ship.


AI Terms in This Article
tokens

Small chunks of text (words or word fragments) that AI models process.

at scale

Applied broadly, to a large number of users or use cases.

ecosystem

A network of interconnected products, services, and stakeholders.

responsible AI

Developing and deploying AI with consideration for ethics, fairness, and safety.

