Large Language Models in SE

Author

Neil Ernst

Published

June 9, 2026

AI-supported development tools such as Codex, Claude, and ChatGPT have taken on a major role in software engineering (SE) recently. What underpins these tools, why do they work so well, what ethical concerns do they raise, and what can we expect for SE in an AI-driven future?

Learning Outcomes

  • develop a more than passing awareness of how large language models “work” on code
  • take a deep dive into language parsing and embedding for LLMs
  • discuss the (current) tradeoffs of these tools
  • analyze how such tools are evaluated, and discern hype from reality
  • apply these tools to SE problems like refactoring or code comprehension

Before Class

Lectures

Readings

  1. Hindle et al., On the Naturalness of Software
  2. Codex
  3. SWE-bench Verified
  4. Components of A Coding Agent
  5. Chapters 1-3 of Raschka, “Build an LLM”
  6. Come to class June 10 with a working embedding from the samples in the book.
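
For orientation on the embedding task, here is a minimal pure-Python sketch of what an embedding lookup does to a line of code. It is not the book's code: the book's samples should be treated as the reference, and the toy regex tokenizer, the helper names (`tokenize`, `build_vocab`, `make_embedding_table`, `embed`), and the 4-dimensional random vectors are all assumptions made for illustration. A real embedding layer has trainable weights; here they are fixed random numbers.

```python
import random
import re


def tokenize(text):
    # Toy tokenizer (assumption): words and single punctuation marks.
    # Real LLM tokenizers use subword schemes such as byte-pair encoding.
    return re.findall(r"\w+|[^\w\s]", text)


def build_vocab(tokens):
    # Map each distinct token to an integer ID, sorted for reproducibility.
    return {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}


def make_embedding_table(vocab_size, dim, seed=0):
    # One vector of `dim` floats per vocabulary entry.
    # In a trained model these values are learned; here they are random.
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(vocab_size)]


def embed(token_ids, table):
    # An embedding layer is a table lookup: token ID -> row of the table.
    return [table[i] for i in token_ids]


source = "def add(a, b): return a + b"
tokens = tokenize(source)
vocab = build_vocab(tokens)
token_ids = [vocab[t] for t in tokens]
table = make_embedding_table(len(vocab), dim=4)
vectors = embed(token_ids, table)

for tok, vec in zip(tokens, vectors):
    print(f"{tok!r:10} {[round(x, 2) for x in vec]}")
```

Repeated tokens (both occurrences of `a`, for example) map to the same vector, because the lookup depends only on the token ID and not on its position.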

In Class

Slides

Data and code

  1. Implement Command in code: Command Pattern

Optional Readings and Activities

Helpful tutorials and summaries: