Within and Out of Sample Data
Neil Ernst
University of Victoria
2025-08-13
Discussion
We discussed the need for out of sample training data. This is known as the principle of Generalizability.
Exercise
- Two tools for building web apps are Vue.js and React. These both basically do the same thing, and are written in the same languages (JS/TS).
- Go to React and to Vue.js
- Find a PR in both repos that deals with something similar. For example, you might want to look at security issues or UI issues.
Exercise cont
One example: https://github.com/vuejs/core/pull/13550 and https://github.com/facebook/react/pull/30451 both deal with whitespace issues.
- Pretend you and your group are building a Data Science tool to help with PR assignment. What elements of each project will make it:
- Harder to generalize your results?
- Easier to generalize your results?
- Report back.