What is “residual” in the context of an OLS regression model?
An OLS regression model is usually the first statistical model you learn in STAT 101. The equation looks like this:

Y = intercept + beta*X + residual

Before thinking about how this equation works, we should look at a much simpler model:

Y = intercept + residual

This model has no predictor, and fitting it is the same as computing an average. If the data set contains the math scores of John, Mike, and Luke, it will look like this:

Y = [70, 80, 90]
intercept = 80
residual = [-10, 0, +10]

OLS is a technique for obtaining the values (here the intercept, which is the average score) that minimize the size of the residuals, measured as the sum of their squares.

Imagine I completely ignored the algorithm and guessed the average. I say the average is 70 just because I feel like it! Then observe the size of the residuals (it gets bigger):

Y = [70, 80, 90]
intercept = 70
residual = [0, +10, +20]

Compare:

residual = [-10, 0, +10] vs. residual = [0, +10, +20]

Can you tell the residual got bigger because I guessed the average/intercept without relying on