Early Approaches and Problems in SE Data

Author

Neil Ernst

Published

February 5, 2026

Early work in data science (DS) for software engineering (SE), and some of the challenges it faces.

Learning Outcomes

  • Appreciate how the field started and what lessons the past offers, including Taylorism.

  • Differentiate between SE “engineering” and factory work.

  • Identify challenges with SE data mining and data sources.

  • Understand the limitations of SE inferences.

Topics and Slides

These are the submodules I covered in class.

  • Early Work in DSSE

  • Problems in Data Science for SE

Readings

Exercises

These are done in class. The source code below combines what I typed live and what I prepared beforehand.

  1. Explore a repo using SonarQube metrics (linked in the notes)
  2. Write R code to load and explore simple data
  3. Power calculations app
  4. Sampling exercise (in class)
  5. Data validity exercise

Optional Readings and Activities

These readings enrich the material but are optional.