Posts

Module 4: Regression with Dummy Variables

Dummy variables are essential whenever categorical predictors appear in a model, such as sex or gender, hospital type, race or ethnicity, treatment group, or place of residence. This module explains how to include those predictors correctly and why one category must be omitted and treated as the reference group. Students learn the dummy variable trap, the logic of the baseline category, and how to interpret coefficients as differences relative to that reference group.

Key points
- Use one fewer dummy than the total number of categories.
- The omitted category becomes the reference group.
- Interpretation is always relative to that baseline.

STATA examples
    tab group, gen(d_)
    regress y x1 d_2 d_3
    regress y i.group x1
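The reference-group logic can be tried on a concrete dataset. The sketch below assumes the auto dataset that ships with STATA, which is not part of this module; price, weight, and rep78 stand in for y, x1, and group.

```stata
* Sketch only: the bundled auto dataset stands in for course data
sysuse auto, clear

* rep78 has five categories (1-5); this creates indicators d_1 ... d_5
tabulate rep78, generate(d_)

* Omit d_1 by hand so that rep78 == 1 is the reference group
regress price weight d_2 d_3 d_4 d_5

* Factor-variable notation builds the same dummies and omits the lowest level
regress price weight i.rep78

* Choose a different baseline (here rep78 == 3) with the ib#. prefix
regress price weight ib3.rep78

* Dummy variable trap: with all five indicators plus the constant,
* STATA drops one of them and notes the collinearity
regress price weight d_1 d_2 d_3 d_4 d_5
```

The hand-built dummies and i.rep78 should produce the same coefficients, each read as a difference from the omitted category. Changing the baseline changes the individual coefficients but not the fitted values.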

Module 3: Binary Outcomes and Why OLS Fails

Health research often focuses on binary outcomes such as disease versus no disease, admitted versus not admitted, and survived versus did not survive. A central lesson from the logistic regression notes is that OLS can produce fitted values below 0 or above 1, which makes no sense when the quantity of interest is a probability. This module uses that problem to motivate the move from OLS to logistic and related models. The key point is practical: binary outcomes require a model that respects the probability scale.

Why this matters
- Linear predictions can fall outside the range of valid probabilities.
- A straight-line fit is unbounded, while a binary outcome takes only the values 0 and 1.
- This is why logistic regression is the appropriate choice rather than an optional refinement.

STATA illustration
    regress hiqual avg_ed
    predict yhat
    logit hiqual avg_ed
    predict phat
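One way to see the problem is to count the fitted values that leave the 0 to 1 range. The sketch below assumes the auto dataset that ships with STATA rather than the hiqual and avg_ed variables used in the module; foreign is the binary outcome and weight the predictor.

```stata
* Sketch only: the bundled auto dataset stands in for course data
sysuse auto, clear

* Linear fit to a 0/1 outcome
regress foreign weight
predict yhat

* Count fitted values outside [0, 1]; missing values are excluded
* explicitly because STATA treats missing as larger than any number
count if (yhat < 0 | yhat > 1) & !missing(yhat)

* Logistic fit: predict returns probabilities by default
logit foreign weight
predict phat

* Compare the ranges of the two sets of fitted values
summarize yhat phat
```

The minimum and maximum columns of the final summarize make the comparison direct: phat is bounded by construction, while nothing constrains yhat.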

Module 2: Foundations of Regression (OLS)

Ordinary least squares (OLS) is the starting point for most applied analysis because it provides a simple way to estimate linear relationships between an outcome and one or more predictors. In this course, OLS matters both as a method in its own right and as the baseline model students must understand before learning why binary-outcome models require a different framework. This module focuses on coefficient interpretation, the idea of a fitted line, and the logic of estimating expected changes in the dependent variable while holding other variables constant.

Learning goals
- Interpret coefficients clearly.
- Understand what OLS is estimating.
- Recognize why OLS is the foundation for later modules.

Core STATA command
    regress y x1 x2 x3
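Reading a coefficient as an expected change, holding other variables constant, is easier to see with real variables. The sketch below assumes the auto dataset that ships with STATA; price, mpg, and weight are stand-ins for y and the predictors, not variables from the course.

```stata
* Sketch only: the bundled auto dataset stands in for course data
sysuse auto, clear

* Outcome: price; predictors: mpg and weight
regress price mpg weight

* Coefficient on weight: expected change in price for one extra pound,
* holding mpg constant
display _b[weight]

* The same comparison scaled to a 100-pound difference, with a
* standard error and confidence interval
lincom 100*weight

* Fitted values from the estimated equation
predict price_hat, xb
summarize price price_hat
```

Rescaling with lincom is optional. It can make a coefficient on a finely measured predictor easier to interpret.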

Module 1: Introduction to Data, Variables, and STATA

This module introduces the building blocks of quantitative analysis: variable types, data structure, value labels, and basic STATA workflow. A recurring theme of the course is that the type of outcome variable determines the model you should use, so this module lays the conceptual groundwork for everything that follows. Students should be able to distinguish between continuous, categorical, binary, and count outcomes; understand why coding matters; and begin navigating STATA confidently.

Key ideas
- Variable type shapes model choice.
- Labels and coding choices matter for interpretation.
- Clean setup in STATA makes later regression work easier.

Starter commands
    describe
    codebook
    tab varname
    tab varname, nolabel
    summarize
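A short session shows how labels and underlying codes differ. The sketch below assumes the auto dataset that ships with STATA; the variable heavy and the label heavy_lbl are hypothetical names created only for illustration.

```stata
* Sketch only: the bundled auto dataset stands in for course data
sysuse auto, clear

* Overview of variables, storage types, and labels
describe

* Detailed look at one variable, including its value labels
codebook foreign

* The same tabulation with labels, then with the underlying 0/1 codes
tabulate foreign
tabulate foreign, nolabel

* Summary statistics for continuous variables
summarize price mpg weight

* Create a binary variable and attach value labels
* (heavy and heavy_lbl are hypothetical names)
generate byte heavy = weight > 3000 if !missing(weight)
label define heavy_lbl 0 "3,000 lb or less" 1 "Over 3,000 lb"
label values heavy heavy_lbl
tabulate heavy
```

Checking the tabulation with and without nolabel is a quick way to confirm which numeric code sits behind each category before it enters a regression.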