One of the most promising areas of statistical study in recent years has been faceoff locations. While it is impossible to record where the puck is throughout a player's shift, or its location when he begins or ends a shift on the fly, we do know where the disc is both on the faceoff and at the whistle, and which players are on the ice for all such events. In what zone do coaches deploy their players, and in what direction do those players drive the subsequent play? Vic Ferrari of Irreverent Oiler Fans has done some outstanding pioneering work in this area, inventing (to the best of my knowledge) terms like ZoneStart, ZoneFinish, and ZoneShift. More recently, Gabe Desjardins of Behind the Net has made the raw stats for the first two of those categories readily available, in a sortable form that proved extremely useful for a global study.
In Gabe's version of zone stats, all faceoffs and whistles are counted while a player is on the ice, not just those at the beginning or end of his shift. Zone start and finish are both expressed by discounting neutral zone faceoffs altogether (a somewhat dangerous practice in my opinion) and calculating what percentage of a player's end zone faceoffs occurred in the offensive zone. Both figures are somewhat confusingly recorded at Behind the Net as "OPCT"; to avoid ambiguity I have relabelled them OZS% and OZF% respectively.
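To make the definition concrete, here is a minimal Python sketch of the percentage as described above. The function name and the raw counts in the usage note are assumptions made for illustration; they are not taken from Behind the Net.

```python
def offensive_zone_pct(off_zone: int, def_zone: int) -> float:
    """Share of a player's end-zone events that came in the offensive zone.

    Neutral-zone events are deliberately left out, as described above.
    The same calculation serves for OZS% (faceoffs) and OZF% (whistles).
    """
    end_zone = off_zone + def_zone
    if end_zone == 0:
        raise ValueError("no end-zone events recorded")
    return 100.0 * off_zone / end_zone
```

With made-up counts of 300 offensive-zone and 200 defensive-zone faceoffs, `offensive_zone_pct(300, 200)` gives 60.0, however many neutral-zone draws the player was on the ice for.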
I decided to focus my (first) study on defencemen, specifically the 200 rearguards who had 41+ GP in 2009-10. It's worth bearing in mind that by excluding those players with <41 GP - who could arguably be labelled "replacement level players" - I was no longer playing a zero-sum game. The 200-man group should be slightly better than the average defenceman overall. But the method was such that I was reluctant to give the same credit to a 10 GP injury call-up as to a "regular".
The plan was to sort the players into groups based on ZoneStart (even strength only), and then to track the results of each group. All players under study had an OZS% between 35% and 65%, with the low-end percentages representing the heavy lifting and the high-end consisting largely of offence-first guys and/or young guys being protected. The St. Louis Blues took this to an extreme level, in that Roman Polak (36.1%) and Barret Jackman (38.9%) were 1-2 in the entire NHL in toughest ZoneStarts, while Erik Johnson (59.9%) and Carlo Colaiacovo (59.6%) were both among the top six for easiest OZS%. Pat Quinn, on the other hand, "protected" his rookie defenceman Taylor Chorney by saddling him with the fourth most difficult ZoneStarts in the NHL (39.2%). I honestly have no idea what he was thinking, or indeed, if he was thinking. Makes no sense to me.
Anyway, the percentages broke nicely into 6 groups at 5% intervals, distributed along a fairly regular bell curve with only a handful of outliers <40% or >60%.
|Bracket||#||GP||TOI/GP||QUALCOMP||QUALTEAM||OZS%||GF ON||GA ON||Goal%|
Interesting to note that, as a group, the guys below the 50% line also had heavier workloads in GP, in ATOI, and particularly in QualComp, which weakens in lockstep with ZoneStarts. Funny that.
Lest you think that ZoneStart (and QualComp) is overrated, I draw your attention to this graph showing Goal% = GF/(GF+GA) for each identified group.
My apologies for the confusing similarity between the axes. The X axis is the groups (or brackets if you prefer), and the Y axis the results. The blue line is, by definition, an almost straight diagonal since it's simply a group average OZS% that is sure to fall near the middle of its range. The graph clearly shows that the guys who start at the good end of the ice tend to be outscorers, while the guys starting out in their own end collectively come out on the short end of the scoreboard. I'm just a numbers guy, not a trained math guy, so I'll let the math whizzes tell you what the Pearson correlation is, but I can say this much: it's positive, and it's strong.
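For anyone who wants to run the grouping and the correlation themselves, here is a rough Python sketch of the steps described above: sort players into 5-point OZS% brackets, pool GF and GA within each bracket, and compute Goal% = GF/(GF+GA). The function names and the input layout (one `(OZS%, GF, GA)` tuple per player) are assumptions made for illustration, and the Pearson helper is just the textbook formula, not a result from this study.

```python
from collections import defaultdict
from math import sqrt


def bracket(ozs_pct, width=5):
    """Label the 5-point bracket an OZS% falls into, e.g. 36.1 -> '35-40%'."""
    low = int(ozs_pct // width) * width
    return f"{low}-{low + width}%"


def goal_pct(gf, ga):
    """Goal% = GF / (GF + GA), expressed as a percentage."""
    if gf + ga == 0:
        raise ValueError("no goals for or against")
    return 100.0 * gf / (gf + ga)


def group_goal_pct(players):
    """Pool GF and GA by OZS% bracket, then compute Goal% per bracket.

    `players` is an iterable of (ozs_pct, gf, ga) tuples (hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0])
    for ozs_pct, gf, ga in players:
        pooled = totals[bracket(ozs_pct)]
        pooled[0] += gf
        pooled[1] += ga
    return {label: goal_pct(gf, ga) for label, (gf, ga) in totals.items()}


def pearson(xs, ys):
    """Textbook Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists of at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)
```

Feeding `pearson` the six group-average OZS% values and the six group Goal% values would put a number on "positive and strong". Bear in mind that a correlation across six pooled groups will look tidier than one computed across all 200 individual players.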
The other useful outcome of the group method, and indeed the prime focus of my study, was to try to better establish the relationship between ZoneStart and ZoneFinish. Vic's first attempt was something called ZoneShift, which was simply the net differential between the two. It was pointed out in the comments section of this terrific post - and Vic readily agreed - that this method was heavily biased in favour of those players getting the tough ZoneStarts, as a defensive zone start could only generate a plus but no minus. Further discussion ensued as to how to come up with a weighting system to account for the built-in bias.
When introducing ZoneFinish stats to BtN a while back, Gabe Desjardins further illustrated that "hockey swings very hard towards the equilibrium". It's a short article, but the beautiful graph at the bottom tells a thousand words.
How hard is that swing? Perhaps our grouped players will provide a clue. The first step was to adjust ZoneFinish to account for GF and GA. The adjustment was simple: credit one more offensive ZoneFinish for each goal scored, charge one more defensive ZoneFinish for each GA, and take both away from the faceoffs officially recorded as neutral zone draws (as in, centre ice, after the goal). Not all neutral zone faceoffs can be considered innocent!
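In code, the goal adjustment might look something like the following Python sketch. The function and parameter names are assumptions made for illustration, as are the numbers in the usage note.

```python
def adjusted_zone_finish(off_finishes, def_finishes, neutral_finishes,
                         goals_for, goals_against):
    """Reassign post-goal centre-ice draws to the zone where the goal was scored.

    Each GF becomes an extra offensive finish, each GA an extra defensive
    finish, and both are removed from the neutral-zone count.
    Returns (adj_off, adj_def, adj_neutral, adjusted OZF%).
    """
    adj_off = off_finishes + goals_for
    adj_def = def_finishes + goals_against
    adj_neutral = neutral_finishes - goals_for - goals_against
    if adj_off + adj_def == 0:
        raise ValueError("no end-zone finishes recorded")
    ozf_pct = 100.0 * adj_off / (adj_off + adj_def)
    return adj_off, adj_def, adj_neutral, ozf_pct
```

With invented totals of 200 offensive, 220 defensive and 150 neutral finishes plus 30 GF and 50 GA, the adjusted counts come to 230, 270 and 70, for an adjusted OZF% of 46.0 (versus about 47.6 unadjusted).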
The pendulum swings most of the way, it would seem.
The anticipated bias in ZoneShift is readily apparent, and in fact is even higher than I had expected. After several trials (50%, 33%, 25%) I ultimately found that a slope of 30% was the best fit; that is, each additional point of OZS% was worth about 0.3 points of OZF%. Strictly speaking that is the slope of the fitted line, which is related to but not the same thing as a correlation of 0.3. Whatever the terminology, here's a graph that'll save me a thousand words.
The 30% solution does a very nice job of "predicting" every group except the one outlier at the heavily sheltered end of the spectrum. Just four regular defenders started more than 60% of the time in the O-zone, and two of those guys (Mark Fraser and Paul Bissonnette) were so bad that their ZoneFinish was below 50%, dragging down the group.
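As a rough illustration of the 30% solution, here is a small Python sketch. It assumes the fitted line is anchored at the 50/50 equilibrium, so that a 50% ZoneStart predicts a 50% ZoneFinish; that anchoring is an assumption on top of the 0.3 slope, and the function names are hypothetical.

```python
EQUILIBRIUM = 50.0  # assumed anchor: an even ZoneStart predicts an even ZoneFinish
SLOPE = 0.3         # the best-fit slope found above


def expected_zone_finish(ozs_pct, slope=SLOPE):
    """Expected OZF% for a given OZS%, regressed toward the 50% equilibrium."""
    return EQUILIBRIUM + slope * (ozs_pct - EQUILIBRIUM)


def finish_vs_expected(ozs_pct, actual_ozf_pct, slope=SLOPE):
    """Actual (adjusted) OZF% minus expected OZF%; positive means better than predicted."""
    return actual_ozf_pct - expected_zone_finish(ozs_pct, slope)
```

Under these assumptions a 60% ZoneStart predicts a ZoneFinish of about 53%, and a 40% ZoneStart about 47%. A heavily sheltered player who finishes below 50% therefore falls well short of what his ZoneStart predicts.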
Of course if I wanted to be truly scientific about all this, I would repeat the above study with the forwards, or for other seasons, but for now, I've seen enough to accept 0.3 as my working fudge factor from which we can derive Expected ZoneFinish. On an individual level, it's pretty darn interesting to compare that number to Actual (adjusted) ZoneFinish. But that's a subject for another day.