Data Science Basics

Author

Neil Ernst

Published

February 5, 2026

An introduction to data science as applied to software engineering problems, with a refresher on core data science concepts and tools.

Learning Outcomes

  • position data science in the scope of SE activities
  • refresh knowledge of basic statistical approaches
  • re-examine regression concepts and the interpretation of coefficients

Topics and slides

These are the submodules I covered in class.

Readings

  1. Confusion matrices - Wikipedia
  2. R for Data Science - Tidy Data
  3. EDA in R
  4. Regression intro/overview, Regression and Other Stories (ROS) ch. 6.1-6.5

Exercises

These are done in class. The source code below combines what I typed live with what I prepped beforehand.

  1. Basic data exploration
  2. P-value displays
  3. Run glm

Optional Readings and Activities

These readings enrich the material but are not required.

  1. Quarto overview