As AI text detection gets better, so does AI text
Tools to identify whether a piece of text was written by AI have started to emerge in recent months, including one created by OpenAI
International New York Times
The ChatGPT logo is seen in this illustration taken February 3, 2023. Credit: Reuters Photo

It may soon become common to encounter a tweet, essay or news article and wonder if it was written by artificial intelligence software. There could be questions over the authorship of a given piece of writing, as in academic settings, or over the veracity of its content, in the case of an article.

There could also be questions about authenticity: If a misleading idea suddenly appears in posts across the internet, is it spreading organically, or have the posts been generated by AI to create the appearance of real traction?

Tools to identify whether a piece of text was written by AI have started to emerge in recent months, including one created by OpenAI, the company behind ChatGPT. That tool uses an AI model trained to spot differences between generated and human-written text.

When OpenAI tested the tool, it correctly identified AI text in only about half of the generated writing samples it analysed. The company said at the time that it had released the experimental detector “to get feedback on whether imperfect tools like this one are useful.”

Identifying generated text, experts say, is becoming increasingly difficult as software like ChatGPT continues to advance and turns out text that is more convincingly human. OpenAI is now experimenting with a technology that would insert special words into the text that ChatGPT generates, making it easier to detect later. The technique is known as watermarking.

The watermarking method that OpenAI is exploring is similar to one described in a recent paper by researchers at the University of Maryland, said Jan Leike, the head of alignment at OpenAI.

If someone tried to remove a watermark by editing the text, they would not know which words to change. And even if they managed to change some of the special words, they would most likely reduce the percentage of special words in the text by only a couple of points.
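
The idea of hiding a statistical signal in word choices can be illustrated with a minimal sketch. This is not the method used by OpenAI or the University of Maryland researchers. It assumes a made-up vocabulary, a hypothetical secret key (SECRET_KEY) and a simple rule in which a keyed hash marks roughly half of all words as "special" depending on the word that came before. All function names are illustrative.

```python
import hashlib
import random

SECRET_KEY = "demo-key"  # hypothetical key; whoever holds it can run the detector


def is_special(prev_word, word, key=SECRET_KEY):
    """Decide whether `word` counts as 'special' after `prev_word`.

    Assumption for this sketch: a keyed hash marks roughly half of all
    words as special in any given context.
    """
    digest = hashlib.sha256(f"{key}|{prev_word}|{word}".encode("utf-8")).digest()
    return digest[0] % 2 == 0


def special_fraction(text, key=SECRET_KEY):
    """Detector: fraction of words that are special given the word before them."""
    words = text.split()
    if len(words) < 2:
        return 0.0
    hits = sum(is_special(prev, cur, key) for prev, cur in zip(words, words[1:]))
    return hits / (len(words) - 1)


def generate_watermarked(vocab, length, rng, key=SECRET_KEY):
    """Toy stand-in for a chatbot: picks words at random but prefers special ones."""
    words = [rng.choice(vocab)]
    while len(words) < length:
        candidates = rng.sample(vocab, k=min(8, len(vocab)))
        special = [w for w in candidates if is_special(words[-1], w, key)]
        # Fall back to any candidate if none of them happens to be special.
        words.append(rng.choice(special or candidates))
    return " ".join(words)


def edit_words(text, replacements, vocab, rng):
    """Simulate a person swapping a few words without knowing the key."""
    words = text.split()
    for index in rng.sample(range(len(words)), k=replacements):
        words[index] = rng.choice(vocab)
    return " ".join(words)
```

In this toy setup, text from generate_watermarked should score close to 1.0 in special_fraction, while text written without the key should land near one half. Swapping a handful of words with edit_words changes only the few word pairs it touches, so the score barely moves. Running the detector with a different key also gives a score near one half, which is why a detector needs the same list of special words as the generator.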

Tom Goldstein, a professor at the University of Maryland and co-author of the watermarking paper, said a watermark could be detected even from “a very short text fragment,” such as a tweet. By contrast, the detection tool OpenAI released requires a minimum of 1,000 characters.

Like all approaches to detection, however, watermarking is not perfect, Goldstein said. OpenAI’s current detection tool is trained to identify text generated by 34 different language models, while a watermark detector could only identify text that was produced by a model or chatbot that uses the same list of special words as the detector itself.

That means that unless companies in the AI field agree on a standard watermark implementation, the method could lead to a future where questionable text must be checked against several different watermark detection tools.

To make watermarking work well every time in a widely used product like ChatGPT, without reducing the quality of its output, would require a lot of engineering, Goldstein said.

Leike of OpenAI said the company was still researching watermarking as a form of detection, and added that it could complement the current tool, since the two “have different strengths and weaknesses.”

Still, many experts believe a one-stop tool that can reliably detect all AI text with total accuracy may be out of reach. That is partly because tools could emerge that could help remove evidence that a piece of text was generated by AI. And generated text, even if it is watermarked, would be harder to detect in cases where it makes up only a small portion of a larger piece of writing.

Experts also say that detection tools, especially those that do not use watermarking, may not recognise generated text if a person has changed it enough.

“I think the idea that there’s going to be a magic tool, either created by the vendor of the model or created by an external third party, that’s going to take away doubt — I don’t think we’re going to have the luxury of living in that world,” said David Cox, a director of the MIT-IBM Watson AI Lab.

Sam Altman, CEO of OpenAI, shared a similar sentiment in an interview with StrictlyVC last month.

“Fundamentally, I think it’s impossible to make it perfect,” Altman said. “People will figure out how much of the text they have to change. There will be other things that modify the outputted text.”

Part of the problem, Cox said, is that detection tools themselves present a conundrum, in that they could make it easier to avoid detection. A person could repeatedly edit generated text and check it against a detection tool until the text is identified as human-written — and that process could potentially be automated. Detection technology, Cox added, will always be a step behind as new language models emerge, and as existing ones advance.

“This is always going to have an element of an arms race to it,” he said. “It’s always going to be the case that new models will come out and people will develop ways to detect that it’s a fake.”

Some experts believe that OpenAI and other companies building chatbots should come up with solutions for detection before they release AI products, rather than after. OpenAI launched ChatGPT at the end of November, for example, but did not release its detection tool until about two months later, at the end of January.

By that time, educators and researchers had already been calling for tools to help them identify generated text. Many signed up to use a new detection tool, GPTZero, which was built by a Princeton University student over his winter break and was released January 1.

“We’ve heard from an overwhelming number of teachers,” said Edward Tian, the student who built GPTZero. As of mid-February, more than 43,000 teachers had signed up to use the tool, Tian said.

“Generative AI is an incredible technology, but for any new innovation we need to build the safeguards for it to be adopted responsibly, not months or years after the release, but immediately when it is released,” Tian said.

How artificial intelligence generates text

When artificial intelligence software like ChatGPT writes, it considers many options for each word, taking into account the response it has written so far and the question being asked.

It assigns a score to each option on the list, which quantifies how likely the word is to come next, based on the vast amount of human-written text it has analysed.

ChatGPT, which is built on what is known as a large language model, then chooses a word with a high score, and moves on to the next one.
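
The scoring-and-choosing step described above can be sketched in a few lines. This is a simplified illustration, not ChatGPT's actual code: the scores are made up, the function names (softmax, choose_next_word) and the temperature parameter are assumptions for the example, and a real model computes its scores with a neural network over a far larger list of options.

```python
import math
import random
from typing import Dict


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn raw scores into probabilities that add up to 1."""
    highest = max(scores.values())
    # Subtracting the highest score keeps the exponentials numerically stable.
    exps = {word: math.exp(score - highest) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


def choose_next_word(scores: Dict[str, float], rng: random.Random,
                     temperature: float = 1.0) -> str:
    """Pick one word at random, favouring words with higher scores.

    A lower temperature makes the highest-scoring word even more likely.
    """
    probs = softmax({word: score / temperature for word, score in scores.items()})
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical scores for the word that follows "The cat sat on the".
example_scores = {"mat": 4.0, "sofa": 3.0, "moon": 0.5}
```

Calling choose_next_word(example_scores, random.Random()) repeatedly would return "mat" most of the time, "sofa" fairly often and "moon" only rarely. Because a high-scoring word is chosen rather than always the top one, the same question can produce different answers.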

The model’s output is often so sophisticated that it can seem like the chatbot understands what it is saying — but it does not.

Every choice it makes is determined by complex math and huge amounts of data, so much so that it often produces text that is both coherent and accurate. But when ChatGPT says something that is untrue, it inherently does not realise it.

This article originally appeared in The New York Times.

(Published 27 February 2023, 11:56 IST)