I'm a psychological researcher (grad student) and am familiar with correlations and multiple regression analysis. For those unfamiliar with multiple regression, it can be understood as a type of correlation, except that you use multiple independent variables to predict your dependent variable.
I thought multiple regression could be useful for explaining the variance in a team's points. Specifically, my purpose is to assess how important the power play "truly" is, compared to 5-on-5, to a team's performance. We all know that 5-on-5 play is very important; more important than power plays. That makes logical sense, as most goals are scored 5-on-5. No fancy stats needed here. But how much does the PP contribute to a team's performance relative to 5-on-5? And what about the penalty kill's contribution to team performance?
Using this season's stats from NHL.com, I calculated the following predictor variables: (a) 5-on-5 goals/game, (b) 5-on-5 goals against/game, (c) power play goals/game, & (d) short-handed goals-against/game. The correlations between these 4 predictors and team points were as follows: (1) r(a) = .52 (p<.01), (2) r(b) = -.53 (p<.01), (3) r(c) = .26 (p=.10), (4) r(d) = -.18 (p=.17).
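For anyone who wants to try this at home, here is a minimal sketch of how a correlation like the ones above can be computed. The six-team data set is made up for illustration (it is NOT the NHL.com data), and the function names are my own. The t value is what a stats package converts into the p-value, using n - 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical numbers for six teams, not the real NHL.com stats.
goals_per_game = [2.1, 1.8, 2.4, 1.6, 2.0, 2.3]
points = [98, 85, 105, 74, 90, 101]

r = pearson_r(goals_per_game, points)
t = t_statistic(r, len(points))
print(f"r = {r:.2f}, t = {t:.2f}")
```

The same two functions would be run once per predictor, each time against team points.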
Using a statistical significance cut-off (i.e., p-value) of .05 (meaning that, if there were truly no relationship, a correlation at least this strong would show up by chance less than 1 time in 20), the correlations with 5-on-5 were significant, but the power play & penalty kill correlations were not. So that may be telling in itself (or a case of small sample size). But what about unique predictions? When controlling for (or holding constant) the other variables, how much does each variable uniquely predict a team's points, if at all? This is where multiple regression saves the day.
In creating the regression equation, I entered all 4 predictor variables simultaneously, with team points as the dependent variable. The result is a beta coefficient for each predictor variable. You may be asking, "What the heck does that mean?" The beta coefficient is like a correlation, except that in this case, the beta coefficient (B) reflects how well one variable predicts the dependent variable (team points) while holding the other 3 predictor variables constant. In other words, the beta coefficient shows the extent to which one variable predicts team points over and above the other 3 variables, that is, its unique predictive power. The beta coefficients were as follows:
B (a) = .55 (t=4.69, p<.01) (You can ignore the t-value. It's proper form to include it.)
B (b) = -.63 (t=-5.46, p<.01)
B (c) = .20 (t=1.67, p=.11)
B (d) = -.12 (t=-1.00, p=.33)
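For the curious, here is a rough sketch of where numbers like these come from. It assumes the betas are standardized coefficients, which can be found by solving a small system built from the correlations among the predictors and between each predictor and the outcome. The eight-team data set is invented for illustration, the function names are my own, and a real stats package would also report the t and p values.

```python
import math

def zscores(values):
    """Rescale values to mean 0 and standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def corr(x, y):
    """Pearson correlation, computed from z-scores."""
    zx, zy = zscores(x), zscores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

def solve(matrix, rhs):
    """Solve matrix * b = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    b = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (aug[i][n] - known) / aug[i][i]
    return b

def standardized_betas(predictors, outcome):
    """Betas for all predictors entered at once: solve Rxx * b = rxy."""
    rxx = [[corr(p, q) for q in predictors] for p in predictors]
    rxy = [corr(p, outcome) for p in predictors]
    return solve(rxx, rxy)

# Hypothetical per-game numbers for eight teams, not the real NHL.com stats.
ev_gf = [2.1, 1.8, 2.4, 1.6, 2.0, 2.3, 1.9, 2.2]   # (a) 5-on-5 goals
ev_ga = [1.7, 2.0, 1.6, 2.3, 1.9, 1.8, 2.1, 1.5]   # (b) 5-on-5 goals against
pp_gf = [0.6, 0.5, 0.7, 0.6, 0.4, 0.8, 0.5, 0.6]   # (c) power play goals
sh_ga = [0.5, 0.6, 0.4, 0.6, 0.7, 0.5, 0.5, 0.6]   # (d) short-handed goals against
points = [98, 85, 105, 74, 90, 101, 84, 103]

for name, beta in zip("abcd", standardized_betas([ev_gf, ev_ga, pp_gf, sh_ga], points)):
    print(f"B ({name}) = {beta:.2f}")
```

When the predictors happen to be uncorrelated with each other, each beta comes out equal to that predictor's plain correlation with team points. The more the predictors overlap, the more the two numbers drift apart.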
First, let's get the business of statistical significance out of the way. Because of their large p-values, the special teams predictors are not statistically significant. The PP beta coefficient (B = .20) may turn out to be significant with a larger sample, but short-handed goals-against (B = -.12) is so far from significant that, on this evidence, we can state that it does not uniquely contribute to a team's points.
A useful way to interpret beta coefficients is to square them. These squared values give a rough estimate of how much of the variance in team points is uniquely explained by each variable (rough because this holds exactly only when the predictors are uncorrelated with one another). Thus, 5-on-5 goals account for 31%, 5-on-5 goals-against account for 40%, and PP goals account for 4% of the variance in team points. However, recall that the unique contribution of the PP is not statistically significant. I suspect a larger sample would fix that.
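The squaring step is simple enough to check by hand, but here is a short sketch of the arithmetic using the rounded betas listed above. Because it starts from the rounded values, a result can land a point away from the percentages quoted in the text (for example, .55 squared is about 30%, not 31%). The dictionary labels are my own.

```python
# Rounded beta coefficients as reported in the post.
betas = {
    "5-on-5 goals": 0.55,
    "5-on-5 goals-against": -0.63,
    "power play goals": 0.20,
    "short-handed goals-against": -0.12,
}

# Square each beta and express it as a whole-number percentage.
squared_pct = {name: round(100 * b ** 2) for name, b in betas.items()}

for name, pct in squared_pct.items():
    print(f"{name}: about {pct}% of the variance in team points")
```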
That still leaves about 25% of the variance in team points unaccounted for, but that's a question for another day. Some of this variance comes from shoot-outs, which aren't captured by the 4 predictor variables. Perhaps shoot-out wins can be a 5th predictor? Looking at NHL.com, I see a shoot-out wins table. I'll post these results separately.
In any case, this multiple regression suggests something that I would not have suspected: that the power play and penalty kill contribute very little to a team's performance when compared to 5-on-5 play. Indeed, a major proportion of a team's points (71%) is decided by 5-on-5 goals for & against, whereas the power play only contributes a very minor portion (4%). Even if the penalty kill were statistically significant, its contribution to team performance would be only 1%. According to this analysis, then, I think we can safely say that the penalty kill is irrelevant to a team's performance.
Using regression analysis to assess team performance is a work-in-progress. Perhaps some more knowledgeable hockey stats gurus have already played around with regression and I'm merely reinventing the wheel, and maybe a broken wheel at that. I'd like to hear from anyone who does this sort of thing, or knows of someone who does, or just knows what they're talking about. I'm not sure I do.
All questions, comments, and criticisms are welcome.