On Tuesday, OpenAI unveiled AI Text Classifier, a tool designed to distinguish text written by artificial intelligence from text written by humans. The tool arrives in the wake of criticism surrounding ChatGPT, the text generator released in November 2022.
Despite its popularity, the chatbot has been accused of plagiarism and has been barred from some schools. Additionally, there have been reports that students used ChatGPT to compose essays for homework assignments.
OpenAI’s AI Text Classifier is a language model pretrained on a vast amount of text from the web and then fine-tuned to identify whether a piece of text was created by an AI. More specifically, the AI-written examples used for fine-tuning came from 34 models from five different organizations, including OpenAI itself. The human-written examples included text from Wikipedia, websites collected from Reddit links and human demonstrations for a previous OpenAI system.
However, given the sheer volume of AI-generated content online, there is a chance that some of the text treated as human-written in the training data was in fact written by AI.
OpenAI’s Text Classifier cannot process just any text; it requires a minimum of 1,000 characters, or roughly 150 to 250 words. The classifier also cannot detect plagiarism, which is a major drawback considering that AI-generated text has been shown to repeat information from its training data.
Additionally, OpenAI has warned that accuracy may suffer on texts written by children or in languages other than English, because the classifier's training data consists primarily of English content.
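For readers who want to check a passage against the length requirement before pasting it into the tool, here is a minimal sketch. The 1,000-character minimum comes from the figures above; the function name and the decision to ignore leading and trailing whitespace are assumptions for illustration, not a description of how OpenAI's tool counts characters.

```python
MIN_CHARS = 1000  # minimum length cited in the article


def long_enough(text: str, min_chars: int = MIN_CHARS) -> bool:
    """Return True if the text meets the minimum character count.

    Assumption: leading and trailing whitespace is not counted.
    The real tool may count characters differently.
    """
    return len(text.strip()) >= min_chars


if __name__ == "__main__":
    sample = "A short paragraph that is well under the limit."
    print(long_enough(sample))      # a short sample does not qualify
    print(long_enough("x" * 1000))  # exactly 1,000 characters qualifies
```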
OpenAI reported that the classifier incorrectly labeled human-written text as AI-written 9% of the time (false positives). I did not encounter this mistake in my own testing, though my sample was small. OpenAI also stated that the classifier correctly identifies 26% of AI-written text as likely AI-written (true positives).
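To put those two figures in perspective, the sketch below estimates how often a "likely AI-written" flag would actually be correct. The 26% and 9% rates are the ones OpenAI reported; the share of AI-written texts in the pool being checked is a hypothetical input chosen for illustration, and the function name is an assumption as well.

```python
TRUE_POSITIVE_RATE = 0.26   # AI-written text correctly flagged (reported by OpenAI)
FALSE_POSITIVE_RATE = 0.09  # human-written text wrongly flagged (reported by OpenAI)


def share_of_flags_correct(ai_share: float,
                           tpr: float = TRUE_POSITIVE_RATE,
                           fpr: float = FALSE_POSITIVE_RATE) -> float:
    """Fraction of flagged texts that really are AI-written.

    ai_share is the hypothetical fraction of AI-written texts
    in the pool being checked (an assumption, not a reported figure).
    """
    flagged_ai = tpr * ai_share            # AI texts that get flagged
    flagged_human = fpr * (1 - ai_share)   # human texts that get flagged
    return flagged_ai / (flagged_ai + flagged_human)


if __name__ == "__main__":
    # Hypothetical pools: half AI-written, then one in ten AI-written.
    print(round(share_of_flags_correct(0.5), 2))  # 0.74
    print(round(share_of_flags_correct(0.1), 2))  # 0.24
```

Under these assumptions, when only one text in ten is AI-written, roughly three out of four flags would land on human authors, which is why the false positive rate matters so much in a classroom setting.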
In the future, OpenAI aspires to share more advanced techniques and to make its classifier better at detecting AI-generated content beyond English text written by adults, the case its warnings suggest it currently handles best. Text classification through machine learning is becoming ever more crucial for purposes as diverse as email spam filtering and sentiment classification; a tool to detect AI-generated text is therefore a significant stride forward.
OpenAI’s decision to release the AI Text Classifier as a free detection tool for AI-generated text is important as more people use these generators. The classifier has clear limitations, as the error rates above show, but it can still serve as a useful aid for detecting AI-generated text.