Decoding Human and AI Persuasion in National College Debate: Analyzing Prepared Arguments Through Aristotle's Rhetorical Principles

Reading time: 5 minutes

📝 Original Info

  • Title: Decoding Human and AI Persuasion in National College Debate: Analyzing Prepared Arguments Through Aristotle’s Rhetorical Principles
  • ArXiv ID: 2512.12817
  • Date: 2025-12-14
  • Authors: Mengqian Wu, Jiayi Zhang, Raymond Z. Zhang

📝 Abstract

Debate has been widely adopted as a strategy to enhance critical thinking skills in English Language Arts (ELA). One important skill in debate is forming effective argumentation, which requires debaters to select supportive evidence from literature and construct compelling claims. However, the training of this skill largely depends on human coaching, which is labor-intensive and difficult to scale. To better support students in preparing for debates, this study explores the potential of leveraging artificial intelligence to generate effective arguments. Specifically, we prompted GPT-4 to create an evidence card and compared it to those produced by human debaters. The evidence cards outline the arguments students will present and how those arguments will be delivered, including components such as literature-based evidence quotations, summaries of core ideas, verbatim reading scripts, and tags (i.e., titles of the arguments). We compared the quality of the arguments in the evidence cards created by GPT and student debaters using Aristotle's rhetorical principles: ethos (credibility), pathos (emotional appeal), and logos (logical reasoning). Through a systematic qualitative and quantitative analysis, grounded in the rhetorical principles, we identify the strengths and limitations of human and GPT in debate reasoning, outlining areas where AI's focus and justifications align with or diverge from human reasoning. Our findings contribute to the evolving role of AI-assisted learning interventions, offering insights into how student debaters can develop strategies that enhance their argumentation and reasoning skills.

💡 Deep Analysis

Figure 1 (see the original paper)

📄 Full Content

Debate is a powerful tool in English Language Arts (ELA) for developing students' critical thinking, argumentation, and rhetorical skills. However, selecting effective arguments typically relies on one-on-one or small-group coaching, which is labor-intensive and hard to scale. Large language models (LLMs) like GPT offer a scalable alternative, but their ability to generate arguments on par with human debaters remains uncertain. This study investigates GPT's capacity to select compelling evidence and craft high-quality arguments, focusing on its use in generating evidence cards during debate preparation. We employed both quantitative and qualitative methods to evaluate GPT's rhetorical effectiveness. Qualitatively, we developed a codebook grounded in Aristotle's classic rhetorical appeals of ethos, pathos, and logos (Braet, 1992). Quantitatively, we used indicators derived from prior research (Moon et al., 2024) to identify patterns in student-produced content reflecting these rhetorical strategies. Our integrated framework combines thematic analysis with measurable subdimensions to systematically assess argument quality. Given the abstract nature of ethos, pathos, and logos, automating their evaluation is complex. Moreover, while LLMs can produce fluent, persuasive arguments, these often lack credibility or sound logic (Kjeldsen, 2024). Thus, we aimed to establish objective criteria and stable evaluation methods for assessing rhetorical quality in GPT-generated content.
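
The paper's eight indicators are not enumerated in this excerpt, so the sketch below uses placeholder indicator names purely to illustrate how a qualitative codebook and quantitative counts could be tied together; it is an assumed structure, not the authors' actual rubric.

```python
from collections import Counter

# Hypothetical rubric: each Aristotelian appeal maps to example indicator names.
# The paper defines eight indicators; the names below are placeholders only.
RUBRIC = {
    "ethos": ["cites_credible_source", "signals_author_expertise"],
    "pathos": ["emotionally_charged_language", "appeals_to_shared_values"],
    "logos": [
        "explicit_causal_claim",
        "quantitative_evidence",
        "premise_to_conclusion_link",
        "counterargument_addressed",
    ],
}

def tally_codes(coded_segments):
    """Count how often each indicator was applied across coded card segments.

    Each segment is a dict such as:
    {"text": "...", "codes": ["logos:explicit_causal_claim", "ethos:cites_credible_source"]}
    """
    counts = Counter()
    for segment in coded_segments:
        counts.update(segment.get("codes", []))
    return counts
```

Tallying applied codes in this way is one straightforward route from thematic coding to the kind of quantitative comparison between GPT and student cards the study describes.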

Competitive debate is a widely practiced extracurricular activity at both the high school and college levels, serving as an effective intervention for fostering critical thinking skills (Anderson & Mezuk, 2012). Few studies have explored the specific mechanisms involved or the potential of LLMs to support students in debate preparation. This study addresses that gap by focusing on evidence preparation in competitive debate, a process central to argument quality and participant success. Competitive debate in the United States comprises multiple formats, each following a structured procedure in which debaters engage in rounds discussing a predetermined resolution. Participants, whether individuals or teams, adhere to set speaking times and are judged on the quality of their arguments. A key skill in debate is the persuasive presentation of high-quality evidence, demonstrated through the essential task of “cutting cards”: the systematic collection and organization of evidence from sources such as news articles, blogs, peer-reviewed journals, and think tank reports. The card-cutting process entails selecting relevant excerpts (hereafter, full text), underlining key points (hereafter, key points), highlighting sections for oral presentation (hereafter, highlights), and crafting a “tag” as a concise, argumentatively framed summary of the evidence (Roush et al., 2024). Effective card-cutting requires advanced cognitive skills, including synthesis, emotional appeal, logical reasoning, and strategic framing, making it a high-level critical thinking exercise (Naqia et al., 2023). It also requires justifying selected arguments by logically linking premises to outcomes through accurate cause-and-effect or correlational reasoning. Hence, card cutting poses a highly complex, multi-dimensional, multi-step decision-making challenge for both human and AI debaters.
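
To make the card-cutting components concrete, a minimal sketch of an evidence card as a data structure follows; the class and field names are illustrative, not the OpenDebateEvidence schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EvidenceCard:
    """Illustrative model of a debate evidence card; field names are hypothetical."""
    tag: str                   # concise, argumentatively framed summary of the evidence
    source: str                # citation for the underlying article, journal, or report
    full_text: str             # relevant excerpt selected from the source
    key_points: List[str] = field(default_factory=list)   # underlined passages
    highlights: List[str] = field(default_factory=list)   # sections read aloud in round

# Example: a minimal card a debater might cut while preparing a resolution
card = EvidenceCard(
    tag="Broad economic sanctions rarely change target-state behavior",
    source="Hypothetical Policy Review, 2023",
    full_text="Decades of case studies suggest that broad sanctions seldom ...",
    key_points=["broad sanctions seldom change behavior"],
    highlights=["Decades of case studies suggest"],
)
print(card.tag)
```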

The current study explores GPT’s ability to “cut cards”: it is prompted to identify and extract effective arguments from full text. Specifically, we collected debate data consisting of the resolution and the cards that students created when preparing for a debate (Roush et al., 2024). Using the full text from a card, we prompted GPT to extract key points, highlights, and tags as if it were a student preparing for a debate. We analyzed the texts extracted by GPT and compared them to the texts selected and produced by students, using a rubric developed from Aristotle’s rhetorical principles, which evaluate the persuasiveness of an argument along three dimensions: ethos (credibility), pathos (emotional appeals), and logos (logical reasoning) (McCormack, 2014). We coded the three principles based on their definitions and further developed eight closely associated indicators. This exploratory study provides insights into the effectiveness of GPT in supporting debate preparation, highlighting its strengths and limitations in selecting persuasive arguments. Understanding how GPT’s selections compare to those of human debaters can inform the development of AI-assisted debate tools and contribute to discussions on the role of AI in critical thinking and rhetoric.
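
The paper does not reproduce its exact prompt in this excerpt, so the following is only a hedged sketch of the kind of GPT-4 call described, using the OpenAI Python client; the prompt wording, model name, and function are assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def cut_card_with_gpt(resolution: str, full_text: str) -> str:
    """Ask the model to act as a student debater and produce key points,
    highlights, and a tag. The prompt is an illustrative reconstruction,
    not the authors' exact wording."""
    prompt = (
        "You are a college debater preparing evidence for the resolution:\n"
        f"{resolution}\n\n"
        "From the article below, (1) list the key points you would underline, "
        "(2) quote the exact sentences you would highlight to read aloud, and "
        "(3) write a one-line tag that frames the argument.\n\n"
        f"Article:\n{full_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4",  # model name assumed from the paper's description
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

The returned text would then be parsed into key points, highlights, and a tag before being compared against the student-produced card under the rhetorical rubric.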

Our data are drawn from the OpenDebateEvidence dataset (Roush et al., 2024), a large-scale resource for argument mining and summarization containing 3.5 million documents. As an exploratory study, we randomly selected 30 evidence cards from the dataset.
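
As a rough illustration of this sampling step, and assuming the cards are available locally as line-delimited JSON (the file name and layout below are assumptions, not the actual OpenDebateEvidence distribution format), a reproducible 30-card sample could be drawn as follows.

```python
import pandas as pd

# Load the evidence cards; file name and column layout are hypothetical.
cards = pd.read_json("open_debate_evidence.jsonl", lines=True)

# Reproducible random sample of 30 evidence cards for qualitative coding
sample = cards.sample(n=30, random_state=42)
sample.to_json("sampled_cards.jsonl", orient="records", lines=True)
```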

Reference

This content is AI-processed based on open access ArXiv data.
