Lots of discussion lately about the Corsi number, at MC79hockey, in Kukla's Corner, and here at The Copper & Blue. In the first of these I got into a bit of a hey rube with some of the more formidable stats buffs in the sphere. You'd think I'd know better by now.
In his post, Tyler picked out a comment I'd made in an earlier thread:
The bad Corsi goes with the territory of fourth liners everywhere, who by definition face many more unfavourable match-ups than favourable ones.
For this one remark at least, I wasn't talking out of my ass but had done some background research. The results are interesting enough on a number of levels to be worth sharing.
Using Gabe Desjardins' outstanding resource Behind the Net, I identified the 344 NHL forwards who played 50+ games in 2008-09 and sorted them by Corsi rates. At the top of the list were a handful of guys whose teams had out-attempted their opponents by more than 20 shots per even-strength hour the player was on the ice; at the bottom were just a couple who had sunk below minus-20. The top 10 included many players whom you might expect: Datsyuk, Hossa, Franzen, Holmstrom, Zetterberg, Ovechkin. It also included a few surprises: the top three in the league were Sergei Fedorov, David Moss, and Eric Fehr for goodness' sake, while Curtis Glencross (shown up top causing problems on Nik Khabibulin's doorstep) ranked 9th in the NHL. (That Glencross was in the top 10 in the league while his erstwhile linemates Brodziak and Stortini were both in the bottom 10 speaks volumes. What a colossal blunder.)
Besides great Corsi rates, what do all these guys have in common?
Rather a lot, it would seem. I decided to sort into groups of 50, with one slightly smaller group of 44 put in the middle to make the groups at the top and bottom of equal size. The resulting table showed some surprising uniformity:
Of course the last column is forced since Corsi was used for the sort. Still, it's interesting to see the incremental gaps in Corsi rates from one group to the next, with the largest differentials being at the extremes, especially on the good side of the ledger.
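For anyone who wants to reproduce the grouping, here is a minimal sketch of how I would split a ranked list into groups of 50 with the leftover 44 parked in the middle. The function name and the placeholder ranks are mine, purely for illustration; no real player data is included.

```python
def split_into_groups(ranked, size=50):
    """Split a ranked list into groups of `size`, putting any remainder
    into a single smaller group in the middle of the ranking."""
    full, remainder = divmod(len(ranked), size)
    top = full // 2            # full-size groups above the middle group
    bottom = full - top        # full-size groups below it
    sizes = [size] * top + ([remainder] if remainder else []) + [size] * bottom

    groups, start = [], 0
    for s in sizes:
        groups.append(ranked[start:start + s])
        start += s
    return groups


# Placeholder ranks 1..344 standing in for the forwards sorted by Corsi rate.
players = list(range(1, 345))
groups = split_into_groups(players)
print([len(g) for g in groups])  # [50, 50, 50, 44, 50, 50, 50]
```

With 344 forwards this gives three groups of 50 on either side of the middle 44, so the extremes can be compared like for like.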
Far more interesting is how the one sort works right across the table. The top players have both the highest average GP and the highest average minutes per game, which means they have a significant edge in total ice time. This does not give them an advantage in Corsi rates, since Gabe cleverly presents these on a per-60 basis, but the combination of more ice time at a higher rate obviously means the gross Corsi totals of the top guys are outstanding.
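To make the per-60 point concrete, here is a small sketch of the rate calculation as I understand it: shot attempts for minus shot attempts against, scaled to 60 minutes of even-strength ice time. The function name and both sets of numbers are hypothetical, chosen only to show that two players can share a rate while their gross totals differ.

```python
def corsi_per_60(attempts_for, attempts_against, ev_minutes):
    """Shot-attempt differential per 60 minutes of even-strength ice time."""
    if ev_minutes <= 0:
        raise ValueError("ev_minutes must be positive")
    return 60 * (attempts_for - attempts_against) / ev_minutes


# Hypothetical players: same rate, very different ice time.
heavy_minutes = corsi_per_60(1100, 900, 1000)   # +200 gross over 1000 minutes
light_minutes = corsi_per_60(550, 450, 500)     # +100 gross over 500 minutes
print(heavy_minutes, light_minutes)
```

Both come out at the same per-60 rate, but the first player's gross differential is double the second's, which is the edge the top group enjoys.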
Moreover, the top Corsi performers face the highest quality of competition, although the increments from one level to the next are very small indeed when they are not zero. Only the bottom group shows a significant drop, which is no doubt due to what Dennis calls the "gentleman's agreement" where coaches frequently match their fourth lines against one another. This accounts for a significant fraction of the scrubs' ice time, but by no means all of it.
The QualTeam column is most revealing. There is a little hiccup in the third group which breaks an otherwise orderly descent; in fact, it's the only anomaly on the entire table. The QT values span a far wider range than those recorded under QualComp, running from +0.14 for the top 50 to -0.14 for the bottom 50. The main reason for this is surely that the coach has full control over who his players line up with, but only partial control of who they line up against. I expect home/road splits might be particularly revealing if they were available, for the top guys and especially those at the bottom.
The splits become even clearer when we divide the list into just three groups, like so:
If one accepts QC and QT at face value, the positive Corsi guys have the huge advantage of playing with the best teammates, the negative guys the huge disadvantage of lining up with the worst. Of course Tomas Holmstrom had a great Corsi: he was always playing with Datsyuk, Rafalski, Lidstrom, Hossa, et al. And of course Kyle Brodziak had a crummy Corsi: he lined up frequently with Staios, Moreau, Stortini, Strudwick ...
The problem with these metrics, QT in particular, is that they are unavoidably self-referential. The teammates' results as measured in Corsi, goal differential, or any other outcome, are going to be in part - but only in part - due to the contribution of the player being considered. If there were a large sample size where each guy played a lot with each different teammate it might be possible to disentangle that with some confidence, but when a guy like Brodziak plays most of the time with scrubs he's gonna look like a scrub. Especially sans Glencross. *sigh*
The QT factor becomes even larger if we refine the extremes of the list.
I have added a third decimal place to QC to show the minuscule incremental change, as the results round to +0.01 in all cases. QT on the other hand just soars at the elite end of the list.
From the above tables one could derive a crude working formula: (QT - QC) * 100 ≈ Corsi/60, suggesting that QT is the driving factor of shots. I'm not suggesting we should do that - in fact I think it draws dangerous conclusions - just that it is consistent with the data at hand. Certainly the quality of teammates is a hugely important factor.
Two more small tables, the first showing the offensive production of the Corsi groups:
... and the second showing team results for the same groups:
In both cases we again see a fairly orderly progression from top to bottom with the muddled middle less stratified. Even though a positive Corsi in theory hinges as much on effective shot prevention as shots attempted, the effects of outshooting appear to be much greater at the offensive end of the ice. (Of course, this study involves only forwards.)
The final column is one I concocted to determine the relationship between outshooting and outscoring. The middle groups should probably be discounted as the divisor (Corsi +/-) passes through zero, but the top and bottom 100 show a consistent ratio, whether positive to positive or negative to negative, of about .045. I don't have actual numbers to hand, but if one assumes that ~55% of all attempted shots make it to goal, and that about 8% of those shots (EV Sh%) find twine, it lines up very nicely with these results.
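The back-of-envelope check can be sketched in a few lines. I am assuming here that the concocted column is simply goal differential divided by Corsi differential; the 55% and 8% figures are the same rough assumptions as above, and the function name, the cutoff, and the sample differentials are hypothetical.

```python
# Rough assumptions from the text, not measured values.
SHARE_ON_GOAL = 0.55   # fraction of attempted shots that reach the net
EV_SH_PCT = 0.08       # fraction of shots on goal that score at even strength

goals_per_attempt = SHARE_ON_GOAL * EV_SH_PCT
print(round(goals_per_attempt, 3))  # 0.044


def goal_to_corsi_ratio(goal_diff, corsi_diff, min_abs_corsi=1.0):
    """Goal differential per unit of Corsi differential.

    Returns None when the divisor is too close to zero to be meaningful,
    which is why the middle groups should be discounted.
    """
    if abs(corsi_diff) < min_abs_corsi:
        return None
    return goal_diff / corsi_diff


# Hypothetical differentials: the sign cancels, so strong positive and
# strong negative groups can produce the same positive ratio.
print(goal_to_corsi_ratio(9.0, 200.0), goal_to_corsi_ratio(-9.0, -200.0))
print(goal_to_corsi_ratio(0.5, 0.2))  # None: divisor near zero
```

The product of the two assumed rates lands within a whisker of the .045 seen at both ends of the list.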
Collectively, the results undeniably show that outshooting and outscoring are strongly correlated when measured in terms of the individual playing at even strength. The relationship is especially strong at the level of first- and fourth-line players, which is in part, but to an unknown degree, a result of the quality of players with whom they play. This is at variance with my own previous studies on outshooting, which were conducted at the team level and considered the game holistically, including special teams situations; those showed that outshooting is certainly a positive but a relatively weak driver of wins and losses. While there are many individual exceptions to the rule, I emerge from this exercise with an increased (if still qualified) respect for Corsi as a measurement of individual 5v5 performance.