2025-07-29
(McElreath chapter 5; hint: helpful for assignment)
A DAG (directed acyclic graph) can show the causal associations in the model you are building.
It helps answer the question: which elements matter for the inference, and which are potential confounding factors?
Is there any additional value in knowing a variable, once I already know all of the other predictor variables?
This might help us identify when we are collecting useless extra data.
The model in Statistical Rethinking, section 5.1.3, shows how to build a model that checks how two variables, or factors, each influence the outcome.
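To make the "additional value" question concrete, here is a minimal simulation sketch. It is not the book's code: it uses plain least squares instead of the Bayesian models in Statistical Rethinking. The variable names (a, m, d), the effect sizes, and the assumed causal structure (a causes both m and d, and m has no direct effect on d) are all illustrative assumptions.

```python
import random


def simulate_fork(n=5000, seed=0):
    """Simulate the assumed structure a -> m and a -> d, with NO m -> d effect."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    m = [0.8 * ai + rng.gauss(0, 0.6) for ai in a]  # m depends only on a
    d = [0.7 * ai + rng.gauss(0, 0.7) for ai in a]  # d depends only on a
    return a, m, d


def cov(x, y):
    """Population covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)


def slope_one(y, x):
    """Least-squares slope of y on a single predictor x."""
    return cov(x, y) / cov(x, x)


def slopes_two(y, x1, x2):
    """Least-squares slopes of y on x1 and x2 jointly (2x2 normal equations)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


a, m, d = simulate_fork()
slope_m_alone = slope_one(d, m)
slope_m_given_a, slope_a_given_m = slopes_two(d, m, a)

print(f"d ~ m:     slope on m = {slope_m_alone:.2f}")
print(f"d ~ m + a: slope on m = {slope_m_given_a:.2f}, slope on a = {slope_a_given_m:.2f}")
```

In this simulated world, m looks predictive of d on its own, but its slope should shrink toward zero once a is in the model: knowing m adds little once a is known. That conclusion holds only because of the structure assumed in the simulation; real data needs its own DAG.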
The point of causal modeling is to create a model outside of statistics to describe what we think is happening in our research world.
It is about causation, i.e., we specifically want to see if some intervention is bringing about a change in the predicted variable.
If we don’t care about causation, then this kind of model doesn’t apply. In my experience, though, causal questions are usually the ones data science is curious about.
Does expressing negative sentiments in code comments affect prioritization of those comments?
// USED ONLY FOR REGRESSION TESTING!!!! todo: obviously get rid of all this junk
// Used only for regression testing! todo: clearly remove all this unnecessary code
With your project team, draw a DAG expressing causality in this experiment.
(reference: github.com/r-causal/ggdag)
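ggdag is an R package. For teams that want to sanity-check a DAG in code before drawing it, the following is a rough Python sketch using only the standard library (graphlib, Python 3.9+). The nodes X, Y, and Z are hypothetical placeholders and deliberately not a proposed answer to the exercise; substitute your own variables and arrows.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical placeholder DAG: each node maps to the set of its direct causes.
parents = {
    "X": {"Z"},       # Z -> X
    "Y": {"X", "Z"},  # X -> Y and Z -> Y
    "Z": set(),       # Z has no causes in this sketch
}


def causal_order(parents):
    """Return nodes ordered causes-before-effects; raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError as err:
        # graphlib reports the detected cycle as the second element of args
        raise ValueError(f"not a DAG, cycle found: {err.args[1]}") from err


order = causal_order(parents)
print(order)
```

If the function raises, some chain of arrows loops back on itself, so the graph is not acyclic and cannot be read as a causal DAG until an arrow is removed or reversed.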


Neil Ernst ©️ 2024-5