For this week's blog post, I was tasked with writing an open-ended post about Chat GPT. My mom sent me a message in January about an article on Chat GPT, so I decided to start there. The article discussed the new software written by Edward Tian that is intended to detect whether material was written by Chat GPT. I decided to take a look at this topic of detecting the chat bot's material, particularly with regard to assignments written by students.
First, for exploration's sake, I asked Chat GPT to write a blog post on Chat GPT and this is a picture of the response that I got:
I could probably have been more specific about what I wanted it to focus on regarding the software, but this gave me an idea of what this bot was capable of. I followed this up by looking at a number of articles and posts written about the topic of detecting the software (links to these pages at the end). There are a number of existing detection tools, including the one written by Edward Tian:
OpenAI AI Text Classifier - Predicts likelihood written by AI (uses words like “likely” to describe probability)
OpenAI GPT-2 Output Detector Demo - Gives a numeric probability and a real vs. fake percentage
GPTZeroX by Edward Tian (Princeton University) - Looks at the "perplexity" and "burstiness" (human patterns versus computer patterns - different degrees of randomness) and gives scores for each, tells you if likely written by AI
DetectGPT by Stanford University - Specifically looks at detecting GPT-2, doesn’t work as well for other things
Corrector App’s AI Content Detector - Targets GPT-3 output, looks at syntax and semantics, and then reports a percentage of “fake”
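To make "perplexity" and "burstiness" a bit more concrete, here is a toy sketch in Python. It is not how GPTZero (or any of the tools above) actually works; their internals are not covered here. It assumes a tiny word-frequency model built from a reference text, where a real detector would use a large language model, and it takes burstiness (as an assumption) to be the spread of perplexity from sentence to sentence. All function names are made up for illustration.

```python
import math
import re
from collections import Counter
from statistics import pstdev


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (enough for a toy example)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def unigram_model(reference_text):
    """Return an add-one smoothed word-probability function.

    Assumption: word frequencies in reference_text stand in for a real
    language model. All unseen words share one extra vocabulary slot.
    """
    counts = Counter(tokenize(reference_text))
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 slot for unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab_size)

    return prob


def perplexity(sentence, prob):
    """exp(average negative log-probability per word); higher = more surprising."""
    tokens = tokenize(sentence)
    if not tokens:
        return float("nan")
    avg_neg_log = -sum(math.log(prob(t)) for t in tokens) / len(tokens)
    return math.exp(avg_neg_log)


def burstiness(text, prob):
    """Standard deviation of per-sentence perplexity (toy definition)."""
    scores = [perplexity(s, prob) for s in split_sentences(text)]
    scores = [x for x in scores if not math.isnan(x)]
    return pstdev(scores) if len(scores) > 1 else 0.0


if __name__ == "__main__":
    prob = unigram_model("the cat sat on the mat. the dog sat on the rug.")
    print(perplexity("the cat sat", prob))
    print(perplexity("purple quantum zebra", prob))
    print(burstiness("the cat sat. purple quantum zebra.", prob))
```

The idea is that text the model finds predictable gets a low perplexity, and text whose sentences are all about equally predictable gets a low burstiness. A detector built on these ideas would treat low values of both as a hint of machine-written text, though a toy like this is far too crude to use for that.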
I also found a page that assesses the likelihood that text you wrote will be judged as written by an AI (https://writer.com/ai-content-detector/), which exists specifically because certain search engines are apparently penalizing sites for AI-generated content (lowering their search ranking). So I took my blog post written by Chat GPT and pasted it in there and here is what it put out:
While I didn't write any of that text (as opposed to the suggested 5%), it did accurately tell that this was written by an AI.
Additionally, and of importance for students and educators, Turnitin has been developing (for years) and will release a version that tries to detect AI-written material.
From a more qualitative perspective, authors say that a lack of surprise and a sort of "rote" style make the writing detectable. As the Wired article put it, the writing of Chat GPT "lacks a certain chutzpah."
Some people have pointed out that while we are able to detect AI writing right now, it is likely that as NLP (Natural Language Processing, an active field of CS research) evolves, the AIs will get better and harder to detect. Others have suggested embedding a "watermark" or some other means of detecting AI-written text directly into the code of the AI. But it was rightly pointed out that a non-watermarked version of the software would inevitably get out.
A few of the articles also discussed one thing that stands out as a sort of "current problem" with Chat GPT that the software hasn't fixed: the computer "hallucinates" (this is the term the phenomenon has been given). Essentially, the bot will tell you factually inaccurate information and make it seem entirely correct. It will cite references that straight up don't exist.
My research so far on this topic was intriguing, but I was actually surprised and curious to see an article where a teacher was specifically talking about how he uses Chat GPT in his classroom. He is essentially having the students in class use Chat GPT to do things like suggest project ideas. They then have to dig into those ideas (still using Chat GPT), and the end result of the whole process is a great project idea. For essays in the class, the students are told to use Chat GPT and tell him how they are using it. He tells his students that because they are free to use Chat GPT, he expects great essays and grades accordingly. He compares Chat GPT to the way that we use a calculator to speed things up in math, even as we still teach students how to do things by hand. The teacher also notes that so many people are already cheating or are going to cheat anyway that we should learn how to work with the system.
Overall, I don’t know how to react to this. Apparently a lot of people are cheating already, and now their capability is just higher. But I never cheated in school and certainly not when writing essays. Is this an example of work smarter not harder? Is this an example of me (in comparison to others) truly learning something and now I’m able to write my thesis etc.? But as this evolves, will we continue to need to be able to write at a professional level? Will valuing that ability be seen as an elitist sort of perspective? Comparable to being able to do long division by hand or play the violin or write in cursive? Can we use Chat GPT to learn to write?
But in the vein of positive perspectives on Chat GPT, it has made me wonder about its ability to really work for us. I wondered how I could use it productively in my research. Here is the resulting conversation from my first attempt. I asked Chat GPT to summarize an article, given a URL:
And it seems great right?? WRONG.
It quickly became apparent that there was a problem. First, this is not the article that is attached to that URL. Second, the article it is summarizing straight up does not exist. I looked. Thoroughly. Now, interestingly, the nice summary is an entirely believable synopsis of a non-existent paper. And it is a planetary science paper supposedly published in the same journal, suggesting that it was able to access some information but not all of it. So then I did this:
Now, this initially seems right, but this summary does not actually make the same arguments or conclusions as those in the actual journal article.
As a last attempt, I copied in the text of the actual article and asked it to summarize it:
Except it turns out that the whole article was apparently way too long. So I changed it and copied in the Discussion and Conclusions and asked it to summarize. Still too long. So I copied in just the 6-7 paragraph long conclusions and asked for a short summary and here is what I got:
Which is actually pretty good and kind of helpful as a slightly more lay person version of the points in the paper. So. This could potentially be useful for digesting complicated prose or material in a more accessible way. But it probably still needs further exploration. We would need to look for the "hallucinations"!
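For anyone who runs into the same "too long" problem, one workaround is to split the article into smaller pieces and ask for a summary of each piece separately. Here is a rough Python sketch of that splitting step. The function name and the `max_chars` value are made up for illustration; I don't know what Chat GPT's actual length limit is, so the number is just a placeholder to adjust by trial and error.

```python
def chunk_paragraphs(text, max_chars=2000):
    """Group paragraphs into chunks of roughly max_chars characters or fewer.

    Paragraphs are assumed to be separated by blank lines. A single paragraph
    longer than max_chars is kept whole as its own chunk rather than being
    cut mid-sentence. max_chars is a placeholder, not a known Chat GPT limit.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = ""
    for para in paragraphs:
        candidate = para if not current else current + "\n\n" + para
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first paragraph of a new chunk.
            current = candidate
        else:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    article = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    for i, chunk in enumerate(chunk_paragraphs(article, max_chars=40), start=1):
        print(f"--- chunk {i} ---")
        print(chunk)
```

Each chunk could then be pasted in one at a time with a request to summarize it. The summaries would still need to be checked against the original text, since splitting the article up does nothing to prevent "hallucinations".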
I'm still undecided about Chat GPT as a whole, but I am intrigued by the possibilities of things we haven't even thought of to ask it to do and how it might move humanity forward. At the same time, some companies, such as Buzzfeed, are going to begin using Chat GPT/AI-written content, lessening the need for human employees, which could potentially begin a harmful trend.