Toward Pedagogically Effective AI for Introductory Computer Science
Michael Wu
EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2025-119
May 16, 2025
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-119.pdf
The rise of large language models (LLMs) like ChatGPT has begun to revolutionize the educational landscape. Specifically in computer science, this has led to the emergence of LLM-based tutors intended to guide students through the learning process, much like how a human instructor would. However, these systems still face major challenges: hallucination and inaccuracy can mislead students, and over-helping can stifle independent learning. In this thesis, we investigate different design strategies for building more accurate and pedagogically effective LLM-based tutoring systems. We begin by outlining key design principles aimed at balancing assistance with promoting student learning. We then explore different designs and architectures to improve performance measured against these standards. We ultimately find that the best results come from a dual-agent model combined with structured chunking retrieval-augmented generation (RAG), few-shot prompting, and fine-tuning. We hypothesize that this architecture improves performance by combining the generation agent’s reasoning with a verification agent that can catch inaccuracies or pedagogical oversteps before they reach the student. Experimental results demonstrate that these techniques improve accuracy and adherence to pedagogical guidelines, particularly in iterative, multi-turn learning scenarios. However, these gains come at the cost of increased computational overhead, requiring more tokens and two model calls per interaction, which negatively impacts inference time and operational cost. Our findings offer a framework for building tutoring agents that better support student learning and describe the trade-offs inherent in LLM deployment.
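The dual-agent design described in the abstract — a generation agent whose draft is screened by a verification agent before it reaches the student — can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: `generate_reply` and `verify_reply` are hypothetical stand-ins for the actual LLM calls, and the retry policy is an assumption.

```python
def generate_reply(question: str, context: str) -> str:
    """Generation agent: drafts a tutoring response (LLM call stubbed out)."""
    return f"Hint based on {context}: think about the base case in '{question}'."

def verify_reply(draft: str) -> tuple[bool, str]:
    """Verification agent: flags inaccuracies or pedagogical oversteps,
    e.g. handing the student a complete solution (LLM call stubbed out)."""
    gives_answer = "full solution" in draft.lower()
    return (not gives_answer, "over-helping" if gives_answer else "ok")

def tutor_turn(question: str, context: str, max_retries: int = 2) -> str:
    """One interaction: generate a draft, verify it, retry on failure.

    Each turn costs two model calls (generate + verify), which is the
    inference-time overhead the abstract notes.
    """
    draft = generate_reply(question, context)
    for _ in range(max_retries):
        ok, reason = verify_reply(draft)
        if ok:
            return draft
        draft = generate_reply(question, context + f" (avoid: {reason})")
    # Fall back to a purely Socratic prompt if no draft passes verification.
    return "Let's work through this together -- what have you tried so far?"
```

The key property is that the verifier sits between generation and the student, so a failed check triggers regeneration rather than delivery.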
Advisor: Prabal Dutta
BibTeX citation:
@mastersthesis{Wu:EECS-2025-119,
    Author = {Wu, Michael},
    Editor = {Dutta, Prabal and Pierson, Emma},
    Title = {Toward Pedagogically Effective AI for Introductory Computer Science},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-119.html},
    Number = {UCB/EECS-2025-119},
    Abstract = {The rise of large language models (LLMs) like ChatGPT has begun to revolutionize the educational landscape. Specifically in computer science, this has led to the emergence of LLM-based tutors intended to guide students through the learning process, much like how a human instructor would. However, these systems still face major challenges: hallucination and inaccuracy can mislead students, and over-helping can stifle independent learning. In this thesis, we investigate different design strategies for building more accurate and pedagogically effective LLM-based tutoring systems. We begin by outlining key design principles aimed at balancing assistance with promoting student learning. We then explore different designs and architectures to improve performance measured against these standards. We ultimately find that the best results come from a dual-agent model combined with structured chunking retrieval-augmented generation (RAG), few-shot prompting, and fine-tuning. We hypothesize that this architecture improves performance by combining the generation agent’s reasoning with a verification agent that can catch inaccuracies or pedagogical oversteps before they reach the student. Experimental results demonstrate that these techniques improve accuracy and adherence to pedagogical guidelines, particularly in iterative, multi-turn learning scenarios. However, these gains come at the cost of increased computational overhead, requiring more tokens and two model calls per interaction, which negatively impacts inference time and operational cost. Our findings offer a framework for building tutoring agents that better support student learning and describe the trade-offs inherent in LLM deployment.}
}
EndNote citation:
%0 Thesis
%A Wu, Michael
%E Dutta, Prabal
%E Pierson, Emma
%T Toward Pedagogically Effective AI for Introductory Computer Science
%I EECS Department, University of California, Berkeley
%D 2025
%8 May 16
%@ UCB/EECS-2025-119
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-119.html
%F Wu:EECS-2025-119