So far, even AI companies have struggled to build tools that can reliably spot when a piece of writing was produced by a large language model. Now, a team of scientists has come up with a clever way to estimate how widely these AI systems are being used across scientific papers. They did this by looking at which "extra words" started showing up far more often once AI writing tools became popular (that is, in 2023 and 2024). What they found suggests that "at least 10 percent of paper summaries from 2024 were made with help from AI," according to the study.
In a paper shared online earlier this month, four researchers from a German university and an American university explained their new method. They got the idea from studies of excess mortality, which estimate how many more people died during Covid-19 than in normal years. By measuring "excess word use" in a similar way after AI writing tools became widely available in late 2022, the team found that "the arrival of these AI tools led to a sudden jump in how often certain words were used" in a way that was "unlike anything seen before in terms of how big and fast the change was."
Digging into the Numbers
To measure these word changes, the scientists looked at 14 million paper summaries published in an online index of scientific papers between 2010 and 2024. They tracked how often each word showed up each year. Then, they compared how often they expected words to appear (based on the trend before 2023) with how often they actually appeared in summaries from 2023 and 2024, when AI writing tools were in heavy use.
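The expected-versus-observed comparison can be sketched in a few lines. Everything below is illustrative: the per-year frequencies are invented, and a simple least-squares trend stands in for whatever baseline model the study actually fits. The numbers are chosen so the hypothetical word jumps 25-fold, the size of the increase the article reports for "explores".

```python
# A minimal sketch of the excess-word-frequency idea, using made-up counts.
# For each word, compare its observed 2024 frequency against a baseline
# projected from the pre-2023 years (here: a simple linear trend).

def projected_frequency(yearly_freq: dict[int, float], target_year: int) -> float:
    """Extrapolate a word's frequency to target_year with a least-squares line
    fitted on the years before 2023 (a stand-in for the study's baseline)."""
    years = sorted(y for y in yearly_freq if y < 2023)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(yearly_freq[y] for y in years) / n
    slope = sum((y - mean_x) * (yearly_freq[y] - mean_y) for y in years) / \
            sum((y - mean_x) ** 2 for y in years)
    # Floor at a tiny positive value so the ratio below is always defined.
    return max(mean_y + slope * (target_year - mean_x), 1e-9)

# Hypothetical per-year frequencies (fraction of summaries containing the word).
explores = {2018: 0.0001, 2019: 0.0001, 2020: 0.0001, 2021: 0.0001,
            2022: 0.0001, 2024: 0.0025}

expected = projected_frequency(explores, 2024)
ratio = explores[2024] / expected
print(f"expected {expected:.5f}, observed {explores[2024]:.5f}, ratio {ratio:.1f}x")
# → expected 0.00010, observed 0.00250, ratio 25.0x
```

Any trend model would do for the baseline; the key idea is only that the expectation comes exclusively from pre-2023 data, so the post-2022 jump cannot contaminate it.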
They found some words that were very rare in science summaries before 2023 but suddenly became very popular after AI writing tools came out. For example, the word "explores" showed up 25 times more often in 2024 papers than you'd expect based on earlier trends, and words like "highlights" and "emphasizes" were used nine times more often. Even common words became more popular in AI-assisted summaries: "implications" appeared 4.1 percentage points more often, "results" 2.7 points more, and "significant" 2.6 points more.
These kinds of word changes could happen naturally as language evolves. But the scientists found that before AI writing tools, such large and sudden shifts in word use only happened for words tied to major world health events: "ebola" in 2015, "zika" in 2017, and words like "coronavirus," "lockdown," and "pandemic" from 2020 to 2022.
After AI writing tools came out, though, the scientists found hundreds of words that suddenly became much more common in science writing without any connection to world events. During Covid-19, the excess words were mainly nouns, but the words that surged after AI writing tools arrived were mostly "style words" like verbs, adjectives, and adverbs. Some examples are: "across, additionally, comprehensive, crucial, enhancing, exhibited, insights, notably, particularly, within".
How Words Work Together
Armed with hundreds of these "marker words" that became much more common after AI writing tools came out, it can sometimes be easy to spot when AI was used. Consider this example sentence from a paper summary, which strings several marker words together: "A thorough understanding of the complex interactions between [...] and [...] is essential for effective treatment approaches."
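Scanning text for marker words is mechanical once you have a list. The sketch below uses a tiny, assumed word list (not the study's actual list) and the example sentence above, with the elided terms stood in by the placeholders X and Y.

```python
# A toy marker-word scanner. MARKER_WORDS is a small assumed sample, not the
# hundreds of words identified by the study.

MARKER_WORDS = {"thorough", "complex", "essential", "effective",
                "comprehensive", "crucial", "notably", "insights"}

def find_markers(text: str) -> list[str]:
    """Return the marker words found in text, in order of first appearance."""
    words = [w.strip('.,;:()[]').lower() for w in text.split()]
    seen = []
    for w in words:
        if w in MARKER_WORDS and w not in seen:
            seen.append(w)
    return seen

sentence = ("A thorough understanding of the complex interactions between "
            "X and Y is essential for effective treatment approaches.")
print(find_markers(sentence))
# → ['thorough', 'complex', 'essential', 'effective']
```

Four hits in a single sentence is the kind of density that makes AI involvement plausible, even though any one of these words on its own proves nothing.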
After analyzing how often these marker words showed up in individual papers, the scientists estimate that at least 10 percent of the papers published after 2022 were written with some help from AI. They warn that the true number could be even higher, because their method misses AI-assisted summaries that happen not to use any of the marker words they identified.
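The lower-bound logic can be illustrated with a back-of-the-envelope calculation. The per-word frequencies below are invented (their gaps echo the percentage-point increases quoted earlier), and taking the largest single-word excess is a deliberate simplification of whatever aggregation the study performs.

```python
# If a marker word appears in f_obs of this year's summaries but the pre-AI
# baseline predicts only f_exp, the excess (f_obs - f_exp) summaries must have
# gained the word somehow. Attributing that excess to AI assistance gives a
# conservative lower bound, because AI-assisted summaries that avoid the word
# are not counted at all.

def excess_fraction(f_obs: float, f_exp: float) -> float:
    """Excess share of summaries containing a marker word (floored at 0)."""
    return max(f_obs - f_exp, 0.0)

# Hypothetical observed vs. expected frequencies (fractions of summaries).
observed = {"implications": 0.111, "results": 0.387, "significant": 0.366}
expected = {"implications": 0.070, "results": 0.360, "significant": 0.340}

# Excesses for different words overlap (one summary can contain several), so
# even the best single-word excess is only a lower bound on total AI use.
bound = max(excess_fraction(observed[w], expected[w]) for w in observed)
print(f"lower bound on AI-assisted share: {bound:.1%}")
# → lower bound on AI-assisted share: 4.1%
```

This is why the scientists call 10 percent a floor rather than an estimate of actual usage: every way the arithmetic can err, it errs toward undercounting.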
The percentages vary a lot across different groups of papers. Interestingly, papers written in countries like China, South Korea, and Taiwan showed AI marker words 15 percent of the time. This suggests that "AI might help non-native English speakers edit their writing, which could explain why they use it so much." On the other hand, the scientists think that native English speakers "might be better at noticing and removing unusual words from AI-generated text," which could hide their AI use from this kind of analysis.
Spotting AI use is essential, the scientists stress, because "AI is known for making up sources, giving wrong summaries, and making false claims that sound true and convincing." However, as more people learn about these AI marker words, human editors might get better at removing these words from AI-generated text before it's published.
Looking ahead, future AI writing tools might do this kind of word analysis themselves, using marker words less often to make their writing seem more human-like. Soon, we might need to call in experts (like in the movie "Blade Runner") to find the AI-generated text hiding in our science papers.
This research shows how widely AI is already being used in science writing and opens up new ways to spot and understand how AI is changing academic writing. Studies like this will be crucial for keeping science communication honest and transparent as it gets harder to tell human and AI-generated content apart.
What do you think about this? Is this how important publications will be written from now on?