With generative AI capable of producing exam answers that outperform medical students, educators face a pressing challenge – how to mitigate the risk of AI misuse in online exam settings without having to go back to traditional pen-and-paper assessment. This issue is particularly critical in fields like healthcare and law, where genuine knowledge and understanding are essential for professional competence.
In this blog post, we’ll explore the implications of using generative AI to produce exam answers. We'll discuss how this practice can potentially undermine academic and professional growth, the damage it can inflict on institutional reputations, and the extent to which AI can accurately answer exam questions.
We’ll also outline effective strategies to help you safeguard exam integrity in the face of advancing AI technologies – and it isn’t all about a return to invigilated exams and handwritten papers.
How prevalent is using AI to generate exam answers?
AI-generated exam answers are a growing concern for educators. The availability of generative AI tools – paired with the increasing use of online examination to make assessment more accessible and scalable – has created a new front for academic misconduct.
Institutions are rightly vigilant. The University of Glasgow and the University of Oxford are just two of the institutions tightening their approach to AI and assessment to mitigate the risk of exam misconduct.
There are numerous online videos discussing the use of AI in education, including "The correct way to cheat with AI" (2.7 million views) and "Writing my research paper with AI without cheating" (2 million views). Interestingly, comments on the latter video question whether the poster is engaging in academic misconduct, highlighting the complexity students face when trying to integrate AI into their learning responsibly.
There are also students taking to online forums asking whether using AI in an open-book exam is cheating, reflecting a growing uncertainty and a lack of clear guidelines about the ethical use of AI in academic settings. At the same time, educators are criticizing institutions for failing to adapt assessment to the new age of AI, highlighting the urgent need for educational reforms that align with technological advancements and ensure fair and effective evaluation of student knowledge and skills.
Furthermore, a report from the UK’s Joint Council for Qualifications quotes a student sanctioned for AI-based cheating as saying using ChatGPT was ‘no different to asking a teacher for advice’. It’s clear that student use of AI is on the rise – whether it is intentional or through a lack of clarity around what constitutes misuse.
But is there evidence of exam cheating using AI?
In a student survey conducted by BestColleges, 56% of students admitted to having used AI on assignments or exams – a steep 34% increase on the same survey carried out just a year earlier. Interestingly, 21% of those surveyed did not believe that using AI on assignments or exams constitutes cheating.
With ChatGPT released in November 2022, last year’s cohort was the first that could use the gen-AI tool to create exam answers. However, FE Week reports that, while cases of exam malpractice resulting in penalties rose from 4,105 in 2022 to 4,895 in 2023, the proportion relating to tech devices remained the same at 44%. Could this be down to AI use in exams flying under the radar?
What are the risks of relying on generative AI for exam answers?
The first and most obvious risk of AI-generated exam answers is that they allow people to pass exams and qualify for professions without developing the necessary knowledge and competence.
Using AI to generate exam answers undermines academic integrity in any discipline – ChatGPT has passed law and business school exams, for example – and it poses a serious risk to critical professions.
When you visit a doctor or consult a lawyer, you expect them to have genuine expertise in their discipline. When you cross a bridge or walk into a building, you trust the engineers know what they’re doing. But if students rely on AI to pass their exams, they risk graduating without the knowledge and skills required to perform their duties effectively. This can lead to dire consequences – from misdiagnoses and flawed legal advice to structural failures in public spaces.
Long-term, it may erode public confidence in these professions and the institutions that qualify them. To maintain trust, institutions must safeguard academic and professional rigor, and continue to produce high-caliber graduates and professionals.
Over-reliance on AI can undermine learning habits, cognitive development, and critical thinking skills – attributes that are needed beyond the classroom for future success. Plus, it creates an unfair advantage for people willing to bend the rules, which compromises the integrity of the educational system overall.
A further risk of AI exam answers is that they’re not always right. As this is more of a risk to the student using AI, you might be tempted to let them lie in the bed they’ve made for themselves. However, student misconduct can be linked to academic pressure, as well as mental and emotional health challenges. Timely intervention to identify misconduct could nurture struggling students to succeed on their own merits. This benefits the individual, the institution, and the integrity of professional services.
How effective is AI at generating exam answers?
Generative AI is very effective at creating exam answers. It can produce responses with a high level of accuracy and pass degree-level examinations. ChatGPT scored a B on the final exam of an MBA course at the Wharton School of the University of Pennsylvania. It also received a ‘low passing grade’ across 95 multiple-choice questions and 12 essay questions at the University of Minnesota Law School, achieved the highest score on an AP Biology exam, and passed a freshman year at Harvard.
However, it can equally create generic or over-simplistic answers. And, as research from The University of Bath shows, it can simply make things up – a phenomenon known as AI hallucination.
Despite the hype around the ‘intelligence’ part of AI, it doesn’t think for itself.
Generative AI tools are built on large language models trained to provide responses based on analysis of vast amounts of data. While highly sophisticated, they don’t possess genuine understanding or knowledge. They simply compile text based on patterns in the data they were trained on. Generative AI is essentially predictive text – but at length.
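To see what “predictive text at length” means, here is a deliberately tiny sketch: a bigram model that only ever picks a word seen to follow the previous one. The corpus, function names, and output length are all illustrative assumptions – real LLMs are vastly more sophisticated – but the underlying idea of continuing text from learned patterns, with no understanding, is the same.

```python
import random
from collections import defaultdict

# Toy corpus standing in for the vast training data a real LLM uses.
corpus = (
    "the model predicts the next word from the words before it "
    "the model does not understand the words it predicts"
).split()

# Record which word follows which: the essence of "predictive text".
follows = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current].append(nxt)

def generate(start, length=8, seed=0):
    """Generate text by repeatedly picking a word seen to follow the last one."""
    random.seed(seed)
    out = [start]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the"))
```

The output is always locally plausible – every pair of adjacent words occurred in the training text – yet the model has no idea what any of it means. That is the gap between fluent text and genuine knowledge.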
As such, AI-generated answers can be fairly easy to spot. Generative AI has a habit of repeating the question at the start of the answer, for example, and of providing fairly superficial responses. So while it can be used to create exam answers, those answers are often detectable – even to the human eye.
Researchers from Wharton concluded that, “Chat GPT3 does an amazing job [...] Not only are the answers correct, but the explanations are excellent. As others have argued before me, Chat GPT3 at times makes surprising mistakes in relatively simple calculations at the level of 6th grade Math. These mistakes can be massive in magnitude.”
This raises the question – how can institutions make exams ChatGPT-proof?
What strategies can help secure exams, and deter and detect AI-generated exam answers?
How can institutions deter students from using AI to generate exam answers?
While some institutions are responding to the risk of AI misuse by returning to invigilated, handwritten exams, that isn’t the only answer. Here are some practical steps you can take to mitigate the issue of AI-generated exam responses.
Promoting academic integrity
The first step to securing exam answers should be education and support around academic integrity. Institutions need to acknowledge that students can cheat with AI but encourage them to see the value in completing exams and assignments using their own academic effort.
Educating on ethical AI use
As we’ve seen from the videos and forum posts above, many students struggle to understand when and how to integrate AI into their learning and assessments. Rather than ignore AI, you should include AI writing in your honor code and curriculum.
Provide guidance on appropriate and ethical uses of AI in education – to enhance rather than undermine learning – and educate students on the limitations of AI and its frequent factual inaccuracies.
Amending assessment design
Generative AI is better at some types of assessment than others. Institutions can consider moving to harder-to-replicate exam formats; for example, using open-ended questions rather than multiple-choice, or adding oral exams to your assessment portfolio.
Discover more about what makes effective test questions and answers for assessments and how to uphold integrity in online assessment design.
Providing interventions and support
If misconduct is detected, institutions should seek to intervene and understand the root cause. It may be caused by academic pressure or poor mental health. Institutions can offer support in the form of study skills and tutoring to encourage students back to the right path. Read more on how to nurture academic integrity under academic pressure.
Publishing transparent policies
It is also important to have clear policies outlining the consequences of using AI to generate exam answers, following due process, being transparent in investigations, and providing a right to appeal.
How can technology prevent the use of AI for generating exam answers?
As intractable as it may seem, educators have plenty of options for mitigating the risk of AI-generated exam answers in their assessments. As technology continues to evolve, it’s important for institutions to keep pace with change and implement tech-based solutions to tech-based problems. Here are three ways you can maintain exam integrity in the age of AI.
Implement monitoring and proctoring
Online exam proctoring tools use a range of methods to monitor student behavior during online exams and discourage misconduct. These include:
- Live monitoring via the device webcam and microphone, to detect any unusual activity
- Screen recording software to monitor what websites the student visits during the exam
- Keystroke analysis to detect unusual keyboard activity (such as copying and pasting text)
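As a simplified sketch of the last point, one signal keystroke analysis can use is timing: a long run of characters arriving faster than any human could type suggests the text was pasted rather than typed. The thresholds and function below are illustrative assumptions, not how any particular proctoring product works.

```python
HUMAN_MIN_INTERVAL = 0.03  # assumed floor (seconds) for genuine keypress spacing
BURST_LENGTH = 10          # how many rapid characters before we flag a run

def flag_paste_events(keystroke_times):
    """Return index ranges where text likely arrived as a single paste."""
    flags, run_start = [], None
    for i in range(1, len(keystroke_times)):
        gap = keystroke_times[i] - keystroke_times[i - 1]
        if gap < HUMAN_MIN_INTERVAL:
            if run_start is None:
                run_start = i - 1
        else:
            if run_start is not None and i - run_start >= BURST_LENGTH:
                flags.append((run_start, i - 1))
            run_start = None
    if run_start is not None and len(keystroke_times) - run_start >= BURST_LENGTH:
        flags.append((run_start, len(keystroke_times) - 1))
    return flags

# Normal typing (~0.2s between keys) followed by 15 characters in milliseconds.
typing = [i * 0.2 for i in range(20)]
paste = [typing[-1] + 0.001 * i for i in range(1, 16)]
print(flag_paste_events(typing + paste))  # → [(19, 34)]
```

Real tools combine signals like this with webcam and screen data, which is why no single metric is treated as proof of misconduct on its own.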
Integrate AI writing detection tools
It might seem very meta, but educators can use AI to mitigate the risk of AI in exams and assessment. AI writing detection tools can analyze long-form student exam responses at scale and identify any patterns or anomalies that suggest the use of ChatGPT or other generative AI tools.
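To give a flavor of what “patterns” means here, the sketch below checks one habit noted earlier in this post: answers that open by restating the question. This is a toy heuristic with an assumed overlap threshold – commercial detectors rely on far stronger statistical signals – but it illustrates the pattern-matching idea.

```python
def restates_question(question, answer, min_overlap=0.6):
    """Flag answers whose opening sentence heavily overlaps the question wording."""
    q_words = set(question.lower().rstrip("?").split())
    opening = answer.split(".")[0].lower().split()
    if not opening or not q_words:
        return False
    # Fraction of the opening sentence's words that also appear in the question.
    overlap = sum(1 for w in opening if w in q_words) / len(opening)
    return overlap >= min_overlap

question = "What are the main causes of inflation?"
ai_style = "The main causes of inflation are demand-pull and cost-push factors."
human_style = "Prices rise when demand outstrips supply or production costs climb."
print(restates_question(question, ai_style))     # True
print(restates_question(question, human_style))  # False
```

A heuristic this crude would misfire constantly on its own, which is exactly why detection tools weigh many signals together – and why human judgment stays in the loop.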
The UK Joint Council for Qualifications advises institutions to use more than one detection tool and consider ‘all available information’ when assessing potential exam misconduct, utilizing each item of information as one piece of an investigative puzzle.
Next steps for institutions
As technology advances, and without the right tools in place, AI exam answers are likely to become as prevalent as AI-generated coursework. A key challenge for educators is how to prevent, detect, and correct AI misuse in online examination – to support academic integrity, professional competence, and public trust.
At Turnitin, we believe it starts with integrity education and awareness, to help students make positive choices about their approach to learning and self-development. But we know this isn’t enough on its own. That’s why we also provide an extensive suite of sector-leading tools to support academic integrity and student outcomes.
Turnitin ExamSoft provides educators with a fully secure offline exam platform. You can administer various digital exam types – including multiple choice, hot spot, matching, and more – without requiring an internet connection during exam time. This prevents students from accessing AI tools during their exams, letting their own learning shine and providing you with reliable insights into their progress.