How does one distinguish between the regression line of X on Y and the regression line of Y on X?
A regression of “Y on X” means that Y is the dependent (response) variable and X is the independent (predictor) variable. A regression of “X on Y” means that X is the dependent (response) variable and Y is the independent (predictor) variable. The two lines are generally different because they answer different questions: one describes the average of Y at a given X, the other describes the average of X at a given Y.

Consider a regression of Y on X. For any given value of X, there are many possible outcomes of Y. The regression model says that the population averages of Y for each value of X are connected by a straight line: avg. Y = Intercept + slope*X. Hence, the regression of Y on X tells you what happens with Y, on average, given an X.

Now consider a regression of X on Y. For any given value of Y, there are many possible outcomes of X. The regression of X on Y says that the population averages of X for each value of Y are connected by a straight line: avg. X = Intercept + slope*Y, where the intercept and slope are specific to this line and are not the same numbers as in the Y-on-X line. Hence, the regression of X on Y tells you what happens with X, on average, given a Y.

In real analyses, you get to