What do you know about the Turnitin Similarity Report and how it works?
In 2000, Turnitin.com launched in its most basic form, leveraging database pattern-matching technology developed from Berkeley students’ doctoral research. Originally designed to detect pre-internet “frat file” plagiarism, the Turnitin Similarity Report was later adapted to address internet plagiarism, which technology made, and continues to make, easier to commit. Fast-forward over twenty years, and the Similarity Report is no longer just about plagiarism.
Join us as we navigate the end-to-end experience of the Similarity Report—learn what the report is (and isn’t), and how you can leverage this constantly evolving tool to keep integrity at the core of all student and researcher submissions.
What is the Turnitin Similarity Report, and does it detect plagiarism?
The Turnitin Similarity Report is a powerful tool that quantifies how similar a student’s work is to other pieces of writing by highlighting similarities to the world’s largest collection of internet, academic, and student paper content.
Contrary to a common misconception, the Turnitin Similarity Report is not a plagiarism detection tool. Each Similarity Report generates a similarity score, which is the percentage of the submission’s text that matches or is similar to content in the Turnitin database.
While the Similarity Report is unable to conclusively determine plagiarism (only humans can do this), it can flag matching text for further review, enabling assessors to conclude whether similarities are acceptable or potential cases of improper referencing. Some matches against the Turnitin database are perfectly normal. Quotations and citations are generally acceptable matches since they illustrate research findings and extend a second voice to a piece of work.
The Similarity Report features on-paper highlights, filtration options, and flag insights that allow users to scrutinize the source of text matches and identify discrepancies, such as replaced characters or hidden text. These features not only help to identify unoriginal content but also serve as educational resources. For instance, by allowing students to resubmit papers and conduct full source analyses, the Similarity Report promotes formative assessment practices and helps students develop their academic writing and citation skills.
Research supports the educational benefits of using the Similarity Report. Li et al. (2021) found that tools like the Turnitin Similarity Report encourage students to improve their writing and paraphrasing skills, especially when paired with explicit instruction on academic integrity. Similarly, Daoud et al. (2019) observed a reduction in plagiarism among students who used the Turnitin Similarity Report, highlighting the importance of coupling the tool with hands-on education in academic writing.
For academic institutions, researchers, and publishers, the Similarity Report provides a means to uphold scholarly standards, helping to identify potential plagiarism or improper citation before final submission. This ensures that manuscripts meet ethical standards and protects the integrity and reputation of published work.
How does Turnitin generate a Similarity Report?
When a paper is submitted, Turnitin generates a comprehensive Similarity Report by breaking the text into phrases and assigning a unique ID to each phrase, excluding common words like “and” or “the.” These phrases are then compared against seven trillion possible matches in the Turnitin database. Turnitin employs natural language processing and strict matching techniques to minimize false positives and ensure accuracy. Additionally, the algorithm flags any unusual content for further review.
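To make the phrase-matching idea concrete, here is a minimal sketch in Python. It is not Turnitin’s algorithm, which is proprietary; it only illustrates the general technique of dropping common words, splitting text into overlapping phrases, and hashing each phrase into an ID. The stop-word list, the three-word phrase length, and the function names `phrase_ids` and `shared_phrase_fraction` are all assumptions made for illustration.

```python
import hashlib
import re

# Assumed stop-word list; the article itself names only "and" and "the".
STOP_WORDS = {"and", "the", "a", "an", "of", "to", "in"}


def phrase_ids(text: str, phrase_len: int = 3) -> set[str]:
    """Split text into overlapping phrases and give each a stable ID."""
    # Lowercase, tokenize, and drop common words before forming phrases.
    words = [w for w in re.findall(r"[a-z0-9']+", text.lower())
             if w not in STOP_WORDS]
    phrases = (" ".join(words[i:i + phrase_len])
               for i in range(len(words) - phrase_len + 1))
    # A truncated SHA-1 digest stands in for the "unique ID" of a phrase.
    return {hashlib.sha1(p.encode("utf-8")).hexdigest()[:12] for p in phrases}


def shared_phrase_fraction(submission: str, source: str,
                           phrase_len: int = 3) -> float:
    """Fraction of the submission's phrase IDs that also occur in the source."""
    sub_ids = phrase_ids(submission, phrase_len)
    if not sub_ids:
        return 0.0
    return len(sub_ids & phrase_ids(source, phrase_len)) / len(sub_ids)


submission = "The cat sat on the mat and purred"
source = "A cat sat on a mat yesterday"
print(shared_phrase_fraction(submission, source))
```

In this toy example, two of the submission’s three phrases also appear in the source. A real system would compare against an indexed database rather than a single source text.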
However, the accuracy of the Similarity Report depends not only on Turnitin’s default settings but also on the advanced settings configured by users before submissions are made. These settings allow parameters to be defined that refine the similarity score, ensuring that the report meets the specific needs of each assignment.
How can publishers and educators effectively interpret the Similarity Report?
Possibly one of the Similarity Report’s most-talked-about features is the similarity score, which indicates the percentage of a paper’s content that matches Turnitin’s database. While some institutions and publishers may set thresholds for acceptable scores, the score alone doesn’t tell the whole story. Context is key to accurately interpreting the score and determining next steps.
A high similarity score doesn’t always indicate plagiarism, just as a low score cannot always rule out academic misconduct. For instance, a low similarity score might suggest original content, but it could also raise concerns about contract cheating or AI writing, where the student submitting the paper is not the original author. These forms of misconduct highlight the need for institutions to reconsider how they assess proof of learning and measure originality and critical thinking skills.
Relying solely on the similarity score can be misleading. Mphahlele and McKenna (2019) warn that using Turnitin primarily as a policing tool can deny students the opportunity for pedagogical growth and may negatively impact their behavior.
A comprehensive evaluation that considers writing genre, assignment length, and the presence of direct quotes is essential for a fair assessment. For example, quantitative research papers often produce different similarity scores than qualitative analyses due to their citation volume. Even a 0% score might mask questionable content due to rounding, while a high score from excessive quoting—even if properly cited—could signal the need for better instruction on paraphrasing.
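The rounding point is easiest to see with numbers. The sketch below assumes the score is the matched share of words rounded to the nearest whole percent; the article does not state the exact rounding rule Turnitin applies, and `similarity_score` is a hypothetical helper, not a Turnitin function.

```python
def similarity_score(matched_words: int, total_words: int) -> int:
    """Return the matched share of a paper as a whole-number percentage.

    Assumes round-to-nearest; the real rounding rule is not given in the text.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return round(100 * matched_words / total_words)


# A 15-word copied sentence in a 4,000-word paper is 0.375% of the text,
# so the displayed score is still 0 even though a match exists.
print(similarity_score(15, 4000))  # prints 0
```

Under this assumption, a reviewer who sees 0% would still need to open the report to confirm that no small matches are present.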
To aid in accurate interpretation, Turnitin’s advanced exclusion settings allow users to filter out certain types of content, such as bibliographic material, which refines the similarity score. By dynamically excluding non-original material, the Similarity Report can focus on truly original writing, helping educators and publishers make informed decisions. This process not only reduces false positives and unnecessary scrutiny but also ensures that the similarity score reflects genuine originality, creating a level playing field for all types of academic work.
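As a rough illustration of how exclusions can change the score, the following sketch recomputes a percentage after filtering out matches of certain kinds. The `Match` record, its `kind` labels, and the choice to keep the full word count as the denominator are assumptions for this example only, not a description of how Turnitin calculates the score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    words: int  # number of matched words in this block
    kind: str   # hypothetical label: "body", "quote", or "bibliography"


def score_with_exclusions(matches, total_words, excluded_kinds=frozenset()):
    """Percentage of the paper matched, ignoring excluded kinds of content."""
    counted = sum(m.words for m in matches if m.kind not in excluded_kinds)
    # Assumption: the denominator stays the paper's full word count.
    return 100 * counted / total_words


matches = [Match(120, "body"), Match(200, "bibliography"), Match(80, "quote")]

print(score_with_exclusions(matches, 2000))                    # 20.0
print(score_with_exclusions(matches, 2000, {"bibliography"}))  # 10.0
print(score_with_exclusions(matches, 2000, {"bibliography", "quote"}))  # 6.0
```

Here the same paper moves from 20% to 6% once bibliography and quoted material are set aside, which is why two reports on the same text can differ depending on the settings chosen.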
How does the new Similarity Report enhance the user experience?
Turnitin is excited to introduce the enhanced experience of the Similarity Report, elevating our commitment to innovation. Featuring a long-awaited, intuitive interface, this update promises an improved experience for all.
The new tab navigation allows easy access to and switching between key functionalities, like AI-writing detection, similarity matching, and the Flags Panel.
With a streamlined layout optimized for readability and simplicity, the updated Similarity Report design saves time and makes interpreting report results more efficient. It also meets the latest accessibility standards, ensuring an inclusive experience for all Turnitin users.
Our updated technology lays a robust foundation for faster delivery of new features, ensuring that institutions and publishers have access to the most advanced tools. We view this as a significant step toward enhancing the effectiveness of the writing journey and making academic integrity a core value among students and researchers.
How does the new Similarity Report support formative feedback and publication quality?
Unintentional plagiarism lends itself well to developmental opportunities at all educational levels—even postgraduate. But data and insights have long been difficult to gather, making it a challenge to determine intentionality when it comes to academic misconduct.
Turnitin’s ongoing enhancements to the Similarity Report support continuous feedback. When students receive ongoing feedback and the chance to revise and resubmit, they can better practice making informed decisions about their writing and the sources they cite.
Match Groups
Turnitin’s new Similarity Report has been thoughtfully redesigned with an intuitive interface and a match categorization panel, making it easier to draw the line between citation mistakes and deliberate omissions. Match Groups categorize matches based on the extent to which a student or researcher has cited or quoted throughout their paper:
- Not Cited or Quoted: Text matches are not quoted, and the original source is not cited. These matches could suggest plagiarism and require further investigation.
- Missing Quotations: Text matches are cited, but the match is so exact that it may also require quotation marks. For students, these matches may be an opportunity to provide formative feedback on how to properly cite and attribute sources.
- Missing Citation: Text matches are quoted, but the original source is not cited. For students, these matches may be an opportunity to provide formative feedback on how to properly cite and attribute sources.
- Cited and Quoted: Text matches are quoted and cited to a source. These matches are a great opportunity to spotlight student strengths or to highlight a well-polished research manuscript.
In an age where there is an abundance of information available to quote and cite, match categorization makes interpreting and sorting through matches easier than ever, helping to quickly discern between intent, areas for improvement, and writing success.
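The four Match Groups can be read as the combinations of two yes-or-no questions: is the matched text quoted, and is its source cited? The sketch below encodes that reading. It is an interpretation of the descriptions above, not Turnitin’s implementation; the `MatchGroup` enum and `match_group` function are hypothetical names, and the sketch treats “Not Cited or Quoted” as the case where both are missing, since the other groups cover the cases where only one is.

```python
from enum import Enum


class MatchGroup(Enum):
    NOT_CITED_OR_QUOTED = "Not Cited or Quoted"
    MISSING_QUOTATIONS = "Missing Quotations"
    MISSING_CITATION = "Missing Citation"
    CITED_AND_QUOTED = "Cited and Quoted"


def match_group(is_quoted: bool, is_cited: bool) -> MatchGroup:
    """Map the two attributes of a text match to one of the four groups."""
    if is_quoted and is_cited:
        return MatchGroup.CITED_AND_QUOTED
    if is_cited:   # cited, but no quotation marks
        return MatchGroup.MISSING_QUOTATIONS
    if is_quoted:  # quoted, but no citation
        return MatchGroup.MISSING_CITATION
    return MatchGroup.NOT_CITED_OR_QUOTED


print(match_group(is_quoted=False, is_cited=True).value)  # Missing Quotations
```

Deciding whether a given passage is actually quoted or cited is the hard part in practice, and this sketch simply takes those two answers as inputs.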
Source cards
A key component of the new Similarity Report is its source cards, which present detailed information about a highlighted match and its source material. Educators, students, publishers, and researchers can gain insights from source cards, such as the percentage of matched text, the number of matched text blocks, and the Match Group associated with the source. Source cards provide context-specific feedback that enriches the learning and evaluation process.
The Flags Panel
The Similarity Report also features a Flags Panel, which highlights text manipulations such as replaced or hidden characters. While these forms of match evasion may seem more deliberate than missing citations or quotations, they can indicate a student’s struggle and need for additional support.
For publishers, the Flags Panel serves as a quality control measure, ensuring that submitted manuscripts adhere to ethical standards and citation practices before publication. This feature not only helps to uphold a publisher’s reputation but also reinforces the reliability and credibility of the academic record. By incorporating the Flags Panel into their workflow, both researchers and publishers can enhance the accuracy, originality, and scholarly value of academic writing.
Does the new Similarity Report address AI writing?
In April 2023, Turnitin’s AI-writing detection capabilities launched across many of our integrity solutions—a milestone in combating the improper use of AI writing tools, such as ChatGPT.
As a first step, accessing Turnitin’s AI Writing report meant opening a separate window and leaving the Similarity Report experience, which required you to change how you worked. But with its updated technology, the new Similarity Report provides a foundation to deliver AI writing detection at a faster pace.
We are proud to now have the means to provide institutions with the latest and most advanced tools to support their teaching and assessment needs, hosting a fully integrated experience that gathers similarity, flag insights, and AI writing detection tools, and brings them into one cohesive workflow.
How can users access the new Similarity Report?
The new Similarity Report is now automatically available to all customers with an Originality Check (OC), Originality Check Plus (OC+), SimCheck, Similarity, or Originality standalone license. A toggle in the administrator account settings allows existing customers to transition smoothly to the new Similarity Report with minimal disruption, giving administrators control over when and how this transition occurs. Depending on your institution’s needs, administrators can choose from three options:
- Set the new Similarity Report as the view for all users
- Enable the new Similarity Report view as optional
- Do not enable the new Similarity Report view for now
While the enhanced Similarity Report experience is now the automatic default view for all customers who have an eligible license, if your institution is not ready to transition, you can opt out at any time by adjusting your administrator account settings.
All iThenticate 2.0 customers, alongside new customers of eligible licenses, will see the new Similarity Report without access to toggle options.
Overview: How the new Similarity Report supports integrity and feedback
A full understanding of the multi-faceted role of the Similarity Report promises to bring new meaning to the way that institutions and publishers adopt it for teaching, assessment, manuscript review, and the plagiarism investigation process.
With an increased number of data insights now readily available, the illustrious similarity score can take on a more subsidiary role in determining intent, reinforcing our position as a similarity checker that supports—rather than defines—academic misconduct inquiries.
Turnitin’s upgraded technology makes it possible to deliver Similarity Report functionality at a faster pace, paving the way for a new approach to how institutions and publishers around the world uphold academic integrity and manage the threat of academic misconduct.