Is the difference statistically significant?
The second check should be: Are the differences large enough to be meaningful for policy purposes? These questions are addressed below:

• Is the difference statistically significant? Are the p-values less than 0.05? If so, you can reasonably conclude that the underlying distributions come from different populations or experiences. But there are other considerations. A statistical test of differences is affected by the number of observations from which the measures were generated. For example, if the measures were generated from hundreds of thousands of records, then summary measures (such as averages) have less sampling variance, and the test yields lower p-values, which imply “statistical significance” even when the magnitude of the differences is tiny. Conversely, when differences are large but observations are few, the absence of statistical significance might simply mean that the data set does not have enough observations for a powerful test. This happens frequently with the Behavioral Risk Fac