Understanding How AI Functions to Better Spot Its Use

My name is Lucas, and I'm an engineering major taking a class on AI this semester. AI is steadily becoming more prevalent in all aspects of media and life. We, as writing center consultants, need to be able to tell how AI-written papers and student-written papers differ from each other.

In the past, I've explored various strengths and weaknesses of AI through research papers, and I want to briefly share what I've learned.

To start, the type of AI we encounter is called a Large Language Model, or LLM. This type of AI specializes in text generation; ChatGPT is a well-known example of a tool built on one.

How does an LLM work?

An LLM's process can be generally chunked up into three steps:

The first step is giving the LLM documents. The language model breaks each document into tokens. From there, it groups the tokens into chunks before encoding them.

Secondly, the AI receives an input prompt, which it tokenizes and encodes in its entirety.

The final step is for the AI to choose the encoded chunks that are most similar to the prompt. After this, it selects a token that has a high probability of coming next in the line of text. The token is then decoded and printed. This process repeats until the response is finished.

Tokens: The segments that a document's sentences are broken into. They can be whole words, syllables, or even single characters, depending on their length and uniqueness. Hugging Face is a good site for playing around with tokens.
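
To make the idea of tokens concrete, here is a minimal sketch in Python. It is a toy, not how any particular LLM does it: real tokenizers learn their vocabulary from huge amounts of text, while the tiny VOCAB set and the tokenize function below are made up for illustration. The sketch takes the longest piece it recognizes and falls back to single characters for anything unfamiliar.

```python
# Toy vocabulary: hypothetical, chosen only for illustration.
VOCAB = {"writ", "ing", "center", "the", " "}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenizer that falls back to single characters."""
    tokens = []
    i = 0
    max_len = max(len(piece) for piece in vocab)
    while i < len(text):
        # Try the longest possible piece first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a last resort.
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens

print(tokenize("the writing center"))
# ['the', ' ', 'writ', 'ing', ' ', 'center']
print(tokenize("cat"))
# ['c', 'a', 't']
```

Notice that "writing" splits into two familiar pieces, while "cat", which is not in the toy vocabulary, falls apart into characters. That matches the point above: common text becomes a few large tokens and unusual text becomes many small ones.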

Chunks: Grouping of a set length of tokens to reduce.

Encoding: A process that converts chunks into numbers that can be used in conditional probability.
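
The chunking, encoding, and "most similar to the prompt" steps can be sketched in a few lines of Python. This is a simplified illustration under loud assumptions: tokens are just whole words, the encoding is a plain word count (real systems use learned numeric representations), and the function names chunk, encode, and cosine_similarity are my own, not taken from any library.

```python
import math
from collections import Counter

def chunk(tokens, size):
    """Group tokens into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

def encode(tokens, vocabulary):
    """Turn tokens into a list of counts, one slot per vocabulary word."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

def cosine_similarity(a, b):
    """Score from 0 to 1 for how closely two count vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up document and prompt, with whole words standing in for tokens.
document = "cite sources in mla format tutors read drafts aloud".split()
prompt = "how do tutors read drafts".split()

chunks = chunk(document, 5)
vocabulary = sorted(set(document) | set(prompt))
prompt_vector = encode(prompt, vocabulary)

# Pick the chunk whose encoding is most similar to the prompt's encoding.
best_chunk = max(
    chunks,
    key=lambda c: cosine_similarity(encode(c, vocabulary), prompt_vector),
)
print(best_chunk)
# ['tutors', 'read', 'drafts', 'aloud']
```

The second chunk wins because it shares three words with the prompt, while the first chunk shares none.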

What does this make AI good at?

Defined styles, ideas, and rules are easy for an AI to copy. This is because the AI has plentiful data to work from, likely even having the direct examples in its data.

What you should NOT look at to spot AI:

MLA, APA, and Chicago formatting.
Scientific and medical papers.
General spelling and grammar.
Simple math processes ~ anything below calculus.

Pitfalls AI tends to fall into

AI is designed to give you an answer concisely and confidently. According to Elizabeth Steere, a lecturer at the University of North Georgia, words like "should", "feels", or "I think" are very unlikely to be present in AI generations. LLMs also tend to avoid "I" and "you", as they are primarily trained on third-person papers and media.
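
If you want a quick tally of these words in a draft, something like the Python sketch below would do it. The MARKERS list only contains the words named above and is not exhaustive, and count_markers is a hypothetical helper I wrote for this post. A low count is only a reason to read more closely, never proof that a paper is AI-written.

```python
import re

# Hedging words and pronouns named in the paragraph above.
# The list is illustrative, not exhaustive.
MARKERS = ["should", "feels", "i think", "i", "you"]

def count_markers(text, markers=MARKERS):
    """Count whole-word, case-insensitive occurrences of each marker."""
    lowered = text.lower()
    counts = {}
    for marker in markers:
        # \b keeps "i" from matching inside words like "it" or "this".
        pattern = r"\b" + re.escape(marker) + r"\b"
        counts[marker] = len(re.findall(pattern, lowered))
    return counts

sample = "I think you should revise this. It feels rushed."
print(count_markers(sample))
# {'should': 1, 'feels': 1, 'i think': 1, 'i': 1, 'you': 1}
```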

But the easiest thing to catch is when an AI hallucinates. This is where the AI will make up information seemingly from out of nowhere.

Hallucinations happen with specific information in the paper, such as quotes or references. As I stated earlier, AI selects each token based on a conditional probability distribution. Here is an example:

The AI picks the next token by randomly sampling from the candidate tokens according to their probabilities, rather than always taking the most likely one.

This is primarily done to make outputs unique. If it just selected the highest-probability token every time, the outputs would become repetitive and boring.
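
The difference between always taking the top token and sampling can be sketched in Python. The probabilities below are invented for illustration (imagine they are the candidates for the word after "The cat sat on the"), and pick_greedy and pick_sampled are names I made up rather than anything from a real LLM library.

```python
import random

# Hypothetical probabilities for the token that follows "The cat sat on the".
next_token_probs = {"mat": 0.6, "sofa": 0.25, "roof": 0.1, "moon": 0.05}

def pick_greedy(probs):
    """Always return the single most likely token."""
    return max(probs, key=probs.get)

def pick_sampled(probs, rng):
    """Return a random token, where likelier tokens are chosen more often."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so repeated runs give the same picks
greedy_picks = [pick_greedy(next_token_probs) for _ in range(5)]
sampled_picks = [pick_sampled(next_token_probs, rng) for _ in range(5)]

print(greedy_picks)   # the same token five times
print(sampled_picks)  # a mix that favors "mat" but can include the others
```

The greedy version prints "mat" every time, which is the repetitive behavior described above. The sampled version can occasionally land on an unlikely token such as "moon". When the token in question is part of a quote, a name, or a URL, that same randomness is one way a detail can come out wrong.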

With all these ideas in mind, here are some signs that a paper may be AI-written, along with some steps you can take:

Incorrect direct quotes.
Incorrect information or reasoning.
Writing that is extremely concise and direct.
Incorrect URLs, for example: Nabi, Javaid. “All You Need to Know about LLM Text Generation.” Medium, 9 Apr. 2024, medium.com/@javaid.nabi/all-you-need-to-know-about-llm-generation-03b138e0ee19.
If applicable, the version history will show you their timeline of making their paper.
Talk to them if you have concerns; people will trust us more because we are students, not staff or faculty.

Closing Thoughts

While AI is still a major pain in the butt to deal with, it is important for us not to become automatically biased against it.

My mother is a nurse; in her job, she spends hours just writing reports on procedures both in and out of the workplace. Cutting that time short, even by a small amount, can go a long way in improving people's lives.

It certainly is an issue for schools because of academic integrity. When AI generates text for students, it removes them from the creation process, compromising the whole premise of their education. Educators need to teach students the downsides of AI without making it strictly taboo. For real-world applications, it has the potential to save people lots of time. People like my mom can use that time to see their families and possibly save others' lives.

References

Cooperman, Stephan R, and Roberto A Brandão. “AI Assistance with Scientific Writing: Possibilities, Pitfalls, and Ethical Considerations.” Foot & Ankle Surgery: Techniques, Reports & Cases, Elsevier, 14 Dec. 2023, www.sciencedirect.com/science/article/pii/S2667396723000885.

Nabi, Javaid. “All You Need to Know about LLM Text Generation.” Medium, 9 Apr. 2024, medium.com/@javaid.nabi/all-you-need-to-know-about-llm-text-generation-03b138e0ed19.

Steere, Elizabeth. “Ways to Distinguish AI-Composed Essays from Human-Composed Ones (Opinion).” Inside Higher Ed, 2 July 2024, www.insidehighered.com/opinion/career-advice/teaching/2024/07/02/ways-distinguish-ai-composed-essays-human-composed-ones.

Coe Writing Center Blog
