
The Long-Awaited Answer to “How to Identify AI-Generated Text” Is Here: Google’s SynthID Text Watermark Tool

Author: Andrea Mondello
Date: October 28, 2024

A paper published October 23, 2024, titled “Scalable watermarking for identifying large language model outputs,” introduces a new watermarking scheme called SynthID Text. According to the researchers, SynthID Text successfully distinguishes AI-generated text from human-written text, and it does so at scale.

SynthID Text is a watermarking scheme that was tested across multiple large language models (LLMs) and showed improved detectability compared to existing methods. It was also evaluated in a live experiment with nearly 20 million responses from Google’s Gemini and Gemini Advanced chatbots, confirming that it does not degrade text quality.

In the simplest terms, when the AI writes something, SynthID adds a hidden “signature” to the text by slightly adjusting its word choices. It’s like choosing specific words that fit a pattern only the tool can recognize, while the text still reads naturally and smoothly to us humans. Later, if someone wants to check whether the text was written by AI, they can use a special detector that looks for the hidden signature.
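To make the idea concrete, here is a toy Python sketch of keyed word-choice biasing. This is not the actual SynthID Text algorithm (which uses a more sophisticated tournament-sampling approach); the key, function names, and the simple “green list” partition are all illustrative assumptions, in the spirit of earlier green-list watermarking schemes. A keyed hash secretly splits the vocabulary in two, generation leans toward one half, and a detector measures how lopsided a text’s word choices are.

```python
import hashlib

# Hypothetical secret key; a real system would keep this private.
SECRET_KEY = "demo-key"

def is_green(word: str, key: str = SECRET_KEY) -> bool:
    """Keyed hash splits the vocabulary into 'green' and 'red' halves."""
    digest = hashlib.sha256((key + word.lower()).encode()).digest()
    return digest[0] % 2 == 0

def watermark_choice(options: list[str], key: str = SECRET_KEY) -> str:
    """Toy biasing step: among candidate words, prefer a green-listed one."""
    for word in options:
        if is_green(word, key):
            return word
    return options[0]  # no green candidate available; fall back

def green_fraction(text: str, key: str = SECRET_KEY) -> float:
    """Toy detector: fraction of words that land in the keyed green list.

    Watermarked text (generated with watermark_choice) skews toward 1.0,
    while ordinary human text hovers near 0.5 on average.
    """
    words = text.split()
    if not words:
        return 0.0
    return sum(is_green(w, key) for w in words) / len(words)
```

In practice a detector would compare the observed green fraction against the roughly 50% expected by chance and flag text whose skew is statistically significant. Without the secret key, neither the pattern nor the detector can be reproduced, which is what makes the signature hard to spot by eye.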

The authors of the paper note that “generative watermarks require coordination between actors running the LLM text generation services to apply the watermark.” They also raise concerns about open-source models, since enforcing watermarks on decentralized deployments will be difficult.

Finally, bad actors have been trying to find ways around generative text watermarking; further research will be needed to determine how SynthID compares with other watermarking schemes in terms of vulnerability.   

Are you ready to join the AI revolution? Early and effective AI adoption is crucial for maintaining a competitive edge. 

Despite the noted limitations, the development of watermarking schemes for large language models like SynthID will go a long way toward improving the AI text generation landscape through:

  1. Enhanced Transparency: It will be easier to distinguish between human-written and AI-generated content, promoting transparency and trust in digital communications. 
  2. Combatting Misinformation: By identifying AI-generated text, it becomes possible to mitigate the spread of misinformation and deepfakes, which can be particularly harmful in areas like news, social media and political discourse. 
  3. Ethical AI Use: This technology supports the ethical use of AI by ensuring that AI-generated content is clearly marked, helping to prevent misuse in areas such as academic writing, journalism and content creation. 
  4. Regulatory Compliance: As regulations around AI use become stricter, watermarking can help organizations comply with legal requirements by providing a clear method to identify and manage AI-generated content. 
  5. Content Moderation: Platforms can use watermarking to better manage and moderate content, ensuring that AI-generated text adheres to community guidelines and standards. 

The code to generate and detect text watermarks with SynthID is available right here on GitHub.  

Understanding how to detect AI-generated text is crucial. Text watermark tools like Google’s SynthID are at the forefront of this effort, providing innovative solutions so we can better navigate the complexities of the digital landscape and maintain trust in the content we encounter. SynthID can be a big step in the right direction toward a safer, more compliant and more transparent AI future.