Monday, September 30, 2024

Employers: Your AI-Detection Checker May Be Bogus

So, if you can't detect that it was written by AI, how do you expect a computer to do it for you? Someone posted a job stating s/he would be using several AI detection programs to ensure that if you were using ChatGPT, you were, at least, "humanizing" it. Well, I decided to take the garbage ChatGPT put out for me and check it, along with a sample of my own writing. Here's what I got:


Grammarly

Said the AI work was 100% written by a human. Granted, this is supposed to be for writers checking to see if their AI-generated work will pass, but I would not use it. I have paid for a subscription to Grammarly in the past ($30/mo) to check for typos. It did a good job of that, but it would not let me choose which style manual I wanted to use for each check. Since different style manuals have different rules, it would sometimes flag something as a typo when it was not. I also did not feel the plagiarism checker was that great, so I used a different service for that.

Quillbot 

Actually detected the AI generated writing and my writing correctly. The free checker allows up to 25,000 words. They also offer other free services and paid services for $19.95/mo.

ZeroGPT

Said only 80% of the AI generated writing was written by a computer. For my writing it said it was only 1.72% AI generated. Only allows 15,000 characters for free. Paid plans are $7.99/$18.99 per month.

Merlin

Similar results to ZeroGPT. This one only gives you three free credits and a max of 10,000 characters each time. It costs $15 per month to get full access--which I had to click my "credits" button to see--and more expensive plans exist.

GPTZero

Correctly detected AI generated writing as being 100% AI. They were sure that my writing was human, but still gave it a 3% chance of being generated by AI. This one only allowed 5,000 characters without signing up. For $10/mo you can process 150,000 words each month. There are more expensive pricing plans as well.

Copyleaks

Does not give you a percentage--it just says AI generated or not. It correctly identified the AI and mine. The AI detector alone is $7.99/mo.

Undetectable AI

This one said the AI generated writing was likely written by a human. They are selling an AI humanizer so it makes sense their detector would say an AI generated text is human.

Phrasly AI

Said only 80% of the AI generated writing was written by a computer. For mine, it was 100% sure it was written by a human. They only let you do 2000 words per check. This is another AI "humanizer."

Leap

Said the AI text was "possibly" written by a human and gave 57%. My content was "very likely human" with a score of 16%. They allowed 20,000 characters. They are also in the business of generating AI content.

Originality AI

This one said the AI content was 100% written by a computer and accurately said mine was 100% human. This website gives you three free scans (fewer than 300 words each), but it requires a credit card to use it. You can pay $60 for 600,000 scanned words that expire in 2 years if unused or pay $15.95/$136.58 for the two monthly plans.

Brandwell (formerly Content at Scale): This correctly identified the AI content and the human content for me but did horribly in the study discussed below. They only allow 5,000 characters to test their detector. At $249/month for one user, I don't think it's worth it.

GPT-2 Output Detector Demo: This said the ChatGPT article was 99% real. It also said my work was 99% real. This is open source, but I am not seeing a benefit to using it because its results are highly variable.

Turnitin: I did not test this myself because they do not have a place to test their software on the website. You have to contact an agent to get any information. They also don't have prices posted that I could find because they have a monopoly on schools--and they probably shouldn't. However, this tool was tested here. The results are concerning because 100% human produced content was flagged as AI 11% of the time, and this is the tool most teachers use. It is very important for teachers getting a positive with this software to double check the content in a couple of other places.

Or--better yet--why don't we just train our teachers to detect garbage? When a high school student turns in garbage, make him/her redo it. Ask questions in the margins that must be answered by the student or addressed in the redo. Make the student redo it during class without a computer. (Whoa--writing something by hand--scary.) If s/he did the research, recalling important points without looking them up is necessary. I, personally, believe making students redo poor work is the better approach for three reasons: (1) it teaches them to do it better because they have to apply what the teacher criticized; (2) if they end up having to do the work anyway, AI generated schoolwork does not help them; (3) it is possible that some students do write as badly as AI--they will probably grow up to be computer programmers--so why fail them for skills they never learned solely on the basis of another computer program?
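To put that 11% figure in perspective, here is a rough back-of-the-envelope sketch in Python. The class size of 30 and the second detector's 10% false-positive rate are made-up numbers for illustration, and the calculation assumes the two detectors make their mistakes independently. That assumption is probably too generous, since detectors may trip on the same kinds of writing, so treat the second number as a best case.

```python
def expected_false_flags(num_human_essays: int, false_positive_rate: float) -> float:
    """Average number of honest, human-written essays a detector would flag as AI."""
    return num_human_essays * false_positive_rate


def chance_all_detectors_flag(false_positive_rates: list[float]) -> float:
    """Chance that every detector wrongly flags the same human-written essay.

    Assumes the detectors err independently (an assumption, not a measured fact).
    """
    chance = 1.0
    for rate in false_positive_rates:
        chance *= rate
    return chance


if __name__ == "__main__":
    class_size = 30             # hypothetical class where every student wrote their own essay
    turnitin_fp_rate = 0.11     # the 11% false-positive figure from the test mentioned above
    second_fp_rate = 0.10       # hypothetical rate for a second detector

    flagged = expected_false_flags(class_size, turnitin_fp_rate)
    both = chance_all_detectors_flag([turnitin_fp_rate, second_fp_rate])

    print(f"Honest students flagged by one detector: about {flagged:.1f} of {class_size}")
    print(f"Chance two detectors both flag the same honest essay: {both:.1%}")
```

With these assumed numbers, one detector alone would wrongly flag about 3 honest students in the class, while requiring two detectors to agree drops the chance of a false accusation to roughly 1% per essay. That is the whole argument for double checking before anyone gets failed.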

In a recent study using five of the above, Originality.ai was 100% accurate for AI, but 10% of the human-written articles were classified as AI. Turnitin had huge variability. Ironically, none of the human-written articles were flagged, even though Turnitin is notorious for flagging human work as AI generated. GPT-2 Output Detector did terribly for me, but in the study it apparently performed well. GPTZero, which did well in my limited test, did terribly in the study along with Content at Scale (which is now Brandwell). Like my tests, these five in the study were only tested in one subject. Notably, humans trained to look for AI scored 96%.

I discussed the biggest problem with these AI detectors being used in academics, but in freelancing, they can also cause problems. For an employer to use these tools as the sole basis for deciding whether a person should be paid for their work is just as wrong as the teachers who fail students based on them. If you read the article and it sounds fine to you, does the source matter? I have no clue how an AI-generated article could sound fine, but if you think it's okay, there shouldn't be a problem. If you do have a problem with the work, I would take the same approach as the teacher--make the freelancer rewrite it and include questions and critique.

I allow two free rewrites for my work (I can't think of a time when someone asked me for a second). Every freelancer should allow at least one--make sure you are upfront in asking about this. Either way, if the freelancer can fix the issues, they should be allowed a chance to do so before you refuse to pay. This is why I like Guru's SafePay--the freelancer will get paid if s/he does the work and you will get your money back if s/he does not.
