What is AI writing detection?
AI writing detection involves using artificial intelligence and machine learning technologies to analyze and identify text produced by generative AI—a rapidly growing influence in the education community.
The public arrival of Large Language Model (LLM) applications such as ChatGPT in late 2022 has revolutionized content creation, offering both opportunities and challenges for academia. While many see the potential for these tools to enhance creativity, others worry about the implications for academic integrity.
In a recent study, Casal and Kessler (2023) found that 72 linguistics experts correctly identified AI-generated content only 38.9% of the time, highlighting the difficulty of distinguishing AI-written text from human-authored work. As educators face the struggle of sorting through student papers, AI writing detection tools have become an additional data point in helping identify content that may have been generated by AI.
The rise of more sophisticated AI writing tools has spurred ongoing advancements in detection technology. Many institutions are now exploring how organizations like Turnitin can bridge the gap between AI-generated and ethically produced writing, and how these tools will continue to evolve in the future.
Why is AI writing detection important?
Many industries, particularly academia, are recognizing that generative AI is here to stay. A 2023 Turnitin study conducted by Tyton Partners* found that three times as many students as faculty reported regularly using generative AI writing tools like ChatGPT.
Following ChatGPT’s launch, many institutions attempted to prohibit its use. However, as Kevin Roose, technology columnist for the New York Times, comments, “Sure, a school can block the ChatGPT website on school networks and school-owned devices. But students have phones, laptops and any number of other ways of accessing it outside of class.” The ubiquity of this emerging technology makes banning it impractical and potentially obstructive to student skills development. In the same Turnitin and Tyton Partners study*, 46% of students reported they would use generative AI tools even if prohibited by instructors or institutions.
While opinions on AI’s role in academic assignments vary, some educators are now embracing AI writing as a teaching aid, helping students overcome creative blocks and generate ideas. Despite these benefits, there is a clear need to manage the risks AI poses to academia. AI writing detection provides a tool to assist educators in maintaining trust in student integrity.
To stay ahead of AI advancements, educators are now being actively encouraged to adapt their practices. This includes revising assessments, rethinking proof of learning, and considering adoption of AI writing detection tools.
Reflective feedback from students who’ve used AI for learning, collected by Marc Watkins, assistant director of academic innovation at the University of Mississippi, indicates that AI can enhance creativity when used cautiously. However, this also highlights a crucial consideration: how can educators balance the benefits of AI with the need to maintain academic integrity?
While generative AI has the potential to boost creativity and productivity, it also poses risks, including blurring the lines between human and synthetic content (Beyer). Since the launch of Turnitin’s AI writing detection tool in April 2023, over 250 million submissions have been reviewed, with 8.4 million flagged as having at least 80% potential AI writing.
Integrating AI writing detection into the classroom can serve as both a deterrent and a learning tool, helping educators navigate this new dimension of education. Faculty can use detection as a data point to determine if AI use aligns with their institution’s guidelines and academic integrity policy.
How does Turnitin's AI writing detection tool work?
When a paper is submitted to Turnitin, the submission is first broken into segments of text that are roughly a few hundred words (about five to ten sentences). Those segments are then overlapped with each other to capture each sentence in context.
The segments are run against our AI detection model, and each sentence receives a score between 0 and 1 to indicate whether it was written by a human or by AI. If our model determines that a sentence was not generated by AI, it will receive a score of 0. If it determines that the entirety of the sentence was likely generated by AI, it will receive a score of 1.
Using the average scores of all the segments within the document, the model then generates an overall prediction of how much text in the submission we believe has been generated by AI.
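The steps above (segmenting, overlapping, scoring, and averaging) can be illustrated with a minimal sketch. This is not Turnitin's implementation: the window size, the stride, the rule for deriving a sentence score from the segments that contain it, and the function names `overlapping_segments`, `analyze`, and `toy_scorer` are all assumptions made for illustration, and the scorer is a placeholder standing in for a trained model.

```python
from statistics import mean
from typing import Callable, List, Tuple


def overlapping_segments(n_sentences: int, size: int = 5, stride: int = 2) -> List[List[int]]:
    """Return overlapping windows of sentence indices.

    `size` and `stride` are assumed values; a stride smaller than the
    window size is what makes neighbouring segments overlap.
    """
    if n_sentences == 0:
        return []
    if n_sentences <= size:
        return [list(range(n_sentences))]
    starts = list(range(0, n_sentences - size + 1, stride))
    last_start = n_sentences - size
    if starts[-1] != last_start:
        # Add a final window so the closing sentences are always covered.
        starts.append(last_start)
    return [list(range(s, s + size)) for s in starts]


def analyze(
    sentences: List[str],
    score_segment: Callable[[str], float],
    size: int = 5,
    stride: int = 2,
) -> Tuple[List[float], float]:
    """Score each segment, then derive per-sentence and overall scores.

    Assumption: a sentence's score is the mean score of every segment
    that contains it, and the overall score is the mean segment score.
    """
    segments = overlapping_segments(len(sentences), size, stride)
    seg_scores = [score_segment(" ".join(sentences[i] for i in seg)) for seg in segments]
    per_sentence = []
    for idx in range(len(sentences)):
        covering = [score for seg, score in zip(segments, seg_scores) if idx in seg]
        per_sentence.append(mean(covering))
    overall = mean(seg_scores) if seg_scores else 0.0
    return per_sentence, overall


def toy_scorer(text: str) -> float:
    """Placeholder scorer (hypothetical): 1.0 if a marker word appears, else 0.0."""
    return 1.0 if "synthetic" in text else 0.0


# Example: eight sentences, only the last of which trips the placeholder scorer.
example = ["Plain sentence."] * 7 + ["A synthetic sentence."]
sentence_scores, overall_score = analyze(example, toy_scorer)
```

Because the windows overlap, a sentence near the middle of the document is judged in several contexts, which is why its score here is an average rather than a single model output.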
Currently, Turnitin’s AI writing detection model is trained to detect content from the GPT-3.5 and GPT-4 language models, which are used in the ChatGPT application. We are actively working on expanding our model to enable us to better detect content from other AI language models.
Does Turnitin’s AI writing detection work in non-English languages?
For the first iteration of Turnitin’s AI writing detection capabilities, English was the only detectable language. Now, Turnitin has released the same capabilities in Spanish to provide a tool that helps educators uphold academic integrity while ensuring that students are treated fairly. The tool shows an overall percentage of the document that AI may have generated and the indicator further links to a report that highlights the text segments that our model predicts were likely written by AI.
What is the false positive rate for Turnitin’s AI writing detection tool?
AI writing detection tools are designed to give you a guide to start conversations with a submitting student, but we want to be clear that the AI writing report is not an absolute proof or disproof of AI writing.
At Turnitin, we strive to maximize the effectiveness of our detector, and we consider false positive rates when assessing the AI writing score attached to a student paper. False positives occur when fully human-written text is identified as being AI generated.
While the risk of false positives from Turnitin’s AI writing indicator is less than 1% for a document with over 20% likely AI-generated content, their presence highlights the importance of using AI writing detection as a signaling tool and one piece of an investigative puzzle. The AI writing score exists to facilitate formative conversations with students and has much more impact when assessed in combination with other supporting factors.
To reduce the incidence of false positives, Turnitin does not display a score or highlights when the AI detection result falls in the 1% to 19% range. When AI writing is detected below the 20% threshold, the report shows an asterisk (*%) in place of a percentage.
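The display rule described above can be expressed as a short sketch. The function name `display_ai_score` is hypothetical, and the handling of a 0% result (shown as a plain percentage) is an assumption, since the text only describes the 1% to 19% range and the 20% threshold.

```python
def display_ai_score(percent: int) -> str:
    """Format an AI writing score for display.

    Scores from 1 to 19 are suppressed and shown as "*%"; scores of 20 and
    above are shown as a percentage. Showing 0 as "0%" is an assumption.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if 1 <= percent < 20:
        return "*%"
    return f"{percent}%"
```

For instance, `display_ai_score(12)` yields `"*%"`, while `display_ai_score(20)` yields `"20%"`.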
As large language models grow, we are focused on adapting and optimizing our AI writing indicator based on our learnings.
How can institutions introduce AI writing detection to students and faculty members?
The introduction of a new AI writing detection tool is likely to be daunting for both students and faculty members across an institution. Students with positive intent may wonder why their integrity is under scrutiny. Faculty members may be hesitant to take on new technologies due to time constraints, or may not fully understand the changing landscape around generative AI writing and why being on high alert for this type of misconduct is necessary.
Remaining transparent with members of your institution about how an AI writing detection tool will be adopted is key to preventing confusion and panic.
Communicate the impact of generative AI with both educators and students
As generative AI tools become even more accessible and sophisticated, there are both benefits and drawbacks associated with their presence in the classroom. Being aware of the impact that generative AI stands to have on reforming education in the coming years will give your institution a head start in modernizing its pedagogical approaches, including proof-of-learning methods, and in safeguarding assignments against misuse. This insight will also help students to understand the impact of generative AI in order to minimize their exposure to unintentional academic misconduct.
Learn how to use and interpret the AI writing detection tool
Educators and investigators should take time to understand the capabilities and limitations of AI writing detection before looking to adopt it as part of their assessment process. Because false positives can occur, we do not claim that our AI writing indicator is foolproof. When initially approaching the issue of false positives in a student paper, we recommend offering the benefit of the doubt to your students. Only when further, more definitive evidence has been gathered should an investigator opt to move forward with next steps according to the institution’s academic integrity policy.
Be transparent about how and when the AI writing detection tool will be used
Whilst the AI writing score is visible only to Turnitin instructors and administrators, this does not discount the importance of openly discussing the usage of AI writing detection with students, whether this be verbally, via email, or by updating your institution’s academic integrity policy. Educators must be open about when and how the AI writing score will be closely monitored. Will it be used during formative assessment, or summative only? Will students be able to see the tool in action ahead of time to understand its capabilities? How will the institution manage potential AI writing misuse cases?
Susan D’Agostino quotes Nestor Pereira, Vice Provost of Academic and Learning Technologies at Miami Dade College (USA), as describing AI writing detection tools as “a springboard for conversations with students.” Pereira goes on to say that students who are inclined to use generative AI to replace their writing may think twice about it if an AI writing detection tool is in place within the institution.
How can an AI writing detection tool be used to aid an investigative process?
Having an investigative process in place is indispensable should you encounter a paper potentially written by a generative AI tool, or if potential false positives arise when using an AI writing detection tool. While the risk of false positives is less than 1% for a document with over 20% likely AI-generated content, being prepared to have a direct conversation with a student can make the investigation as pain-free as possible for student and educator alike.
We must remember that although AI writing detection tools are an investigative aid, “...as we all glide into an artificially drafted future, it's clear that a human questioning mindset will be needed. Indeed, our investigative skills and critical thinking techniques could be in more demand than ever before” (O'Brien, 2023).
Download the AI writing report
As a first step in preparing for a conversation with a student, we recommend downloading the AI writing report and thoroughly analyzing it in conjunction with any other available findings and the expertise of the educator. Our goal is to equip reviewers with insights that help them make informed decisions; however, it’s crucial to recognize that the AI writing report is just one piece of the puzzle. Context, such as the writing style and any historical work submitted, is equally important in understanding the overall picture. By considering both the AI score and the broader context alongside the academic misconduct policy, educators can view the totality of the data when assessing the student’s work.
Rely on educator relationships
In the academic environment, an educator’s relationship with their student can be a valuable starting point when investigating potential AI writing in a student’s paper. Whatever the score highlighted by the AI writing detection tool, an accusation should never be made without a respectful dialogue with the student in question. Has the educator worked with the student throughout the semester? Has feedback been offered during the writing process? As part of their interactions with the student, do they recall the student having sufficient subject knowledge? The relationship an educator has built with a student is itself useful context when evaluating that student’s work.
Ask for proof of critical thinking
If a student is the true author of the paper they’ve submitted, they are likely to have several items that can corroborate their thought process as they wrote the paper. These might include research notes, outlines, document version histories and metadata, and previous draft print-outs. They may also have received feedback from you, another educator, a peer, a trusted reviewer, or software that is able to track the writing process. Evidence of this kind can be a solid indicator of human rather than AI writing.
Assess previous writing samples
If the assessment does not permit generative AI as part of the writing process, you may wish to use previous writing samples to compare writing style, grammar sophistication, and vocabulary complexity to the paper in question. Do they match?
We understand that acquiring a writing sample from before generative AI existed may become more challenging as time progresses. If that is the case, we recommend asking students to write a piece under test conditions at the start of the term, making it clear to them why you are doing this. This can help deter students from using generative AI to write their papers when it is not permitted, whilst offering educators reassurance that they have original writing to compare against should the issue of potential AI writing, or ghostwriting, arise in future. Also ask yourself whether you can put guardrails in place to protect future assignments from AI misuse.
Assume positive intent
If you are unable to reach a definitive conclusion of AI versus human writing, this may be the right time to start assuming positive intent. If there is any question of uncertainty during your investigation—from AI writing detection to having all of the right conversations and asking the right questions—we recommend moving forward without accusation or penalty. The good news is that this experience alone should act as a powerful deterrent for any potential future misuse.
Overview: What academic leaders need to know about AI writing detection
Generative AI is developing at speed, but so too is AI writing detection. At Turnitin, we recognize the needs of our education community and are already hard at work building detection systems for future Large Language Models.
However, AI writing detection is just one component of a larger process. Additional evidence-based practice also plays a crucial role in investigations and decision-making for the sensitive area of academic integrity; it’s important to approach academic integrity matters with thoroughness and care, ensuring that every avenue of investigation is considered before drawing conclusions.
An AI writing detection tool provides one data point, but it cannot offer a definitive conclusion. We encourage prioritizing human judgment when evaluating a student’s work, taking into account the potential for false positives, the student’s intent, and, most importantly, understanding of their skills and abilities.