Hockey Analytics is a Method, Part 1

The most valuable insights are the last to be discovered; but methods are the most valuable insights.

--the methods, it should be said ten times over, are the essential thing, as well as the most difficult thing, as well as the thing that can be blocked by habit and laziness for a very long time.

(Nietzsche, Anti-Christ: 13, 59)

The great analytics test case of the Toronto Maple Leafs is careening to its glorious conclusion. To recap:

  • The Leafs are a horrible possession team. They are 29th in the league in 5x5 Fenwick Close percentage (29th last year; 27th the year before).
  • The Leafs, despite horrible possession numbers, have enjoyed extreme and unsustainable luck (e.g., they are currently 5th in the league in PDO), which has masked their poor play and kept them afloat in the standings.
  • We've known for a long time that possession stats correlate very strongly with winning games.
  • Toronto is the center of the hockey media complex. Everything that occurs there reverberates throughout the hockey landscape. As in all major hockey markets, the Toronto media complex enjoys a symbiotic relationship with the team it covers, often reflecting and/or amplifying its thinking and messaging.
  • The Toronto Maple Leafs and the old guard media that covers them have, in the main, proved impervious to the methods and knowledge gleaned from analytics.
  • A rather brash stand-off has taken place between the stats and anti-stats communities that has, to some extent, hung on the fate of the Maple Leafs for one side or the other's vindication.
  • With the long-predicted collapse of the Leafs at hand, a palpable re-appraisal of what analytics has to offer is underway throughout the hockey community, most notably at the top of the main channels of hockey chatter.
  • Parallel to this flash-point event, analytics have over the years slowly gained acceptance and expanded their reach among NHL decision makers, the media that covers them, and the fans that voraciously consume and discuss hockey media.
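
For anyone new to the numbers in the recap above, here is a minimal sketch of how a Fenwick percentage and PDO are usually computed. The function names and the sample totals are made up for illustration; they are not the Leafs' actual counts. The "Close" qualifier simply means the counts are restricted to situations where the score is close, which this sketch assumes has already been done upstream.

```python
def fenwick_pct(shots_for, missed_for, shots_against, missed_against):
    """Share of unblocked shot attempts (shots on goal + misses) taken by the team."""
    f_for = shots_for + missed_for
    f_against = shots_against + missed_against
    return 100.0 * f_for / (f_for + f_against)


def pdo(goals_for, shots_for, goals_against, shots_against):
    """On-ice shooting percentage plus on-ice save percentage, both in percent."""
    shooting_pct = 100.0 * goals_for / shots_for
    save_pct = 100.0 * (1 - goals_against / shots_against)
    return shooting_pct + save_pct


# Hypothetical 5-on-5 totals, not real team data.
print(round(fenwick_pct(900, 400, 1100, 500), 1))  # about 44.8: out-attempted
print(round(pdo(90, 900, 88, 1100), 1))            # about 102.0: above the 100 baseline
```

Because every goal and shot for one team is a goal and shot against another, PDO averages out to 100 across the league, which is why a team sitting well above that mark while being badly out-attempted is usually read as running on luck.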

With this recap in mind, I'd like to make three points.

1) The manner in which the Leafs (and before them the Wild and the Kings) erupted into hockey consciousness dictated that the conversation revolve around possession stats. That is, case studies have driven the conversation and framed its lines of argument.

2) Possession stats, while incredibly important, are not the sum-total of what analytics has to offer (this is obvious to most, I imagine). However, insofar as the various mainstream debates (that is, debates occurring above the level of the analytics community's internal debates) have taken place, they have by and large been framed by the case study at hand and have therefore revolved around the import of possession.

3) Insofar as possession has become a prism through which analytics have been viewed and discussed, not only has it been granted an outsized importance in relation to analytics, it has indirectly encouraged a hyperbolic situation, whereby possession is positioned as an all-or-nothing proposition. A particularly strident form of this hyperbolic framing popped up on Hockey Twitter yesterday:

That's a comment from a Pension Plan Puppets post. And, while it is obtusely hyperbolic, it does resonate with a line of thought I'm sure many have encountered. Typically, the analytics-as-possession mainstream conversation goes one of two ways:

1) an insistence that possession -- absent any other context -- correlate perfectly, or near so, with actual game, player and team outcomes. So, when a poor possession team wins a game or a string of games (or, like the Leafs last season, makes the playoffs), it is treated as evidence that analytics is invalid.

2) a concession that possession is useful, or a tool (one of many), but that it doesn't tell the whole story -- the whole story including a lot of bias-prone "intangibles."

The path of the first conversation is mere frustration and a failure to communicate. Part of the problem here stems from the way the conversation begins, with possession framing the issue. Another key part is surely the sense sports fans and media have that "stats" (typically boxcars, win/loss record) tell you in a short-hand but definitive way how good a player or team is. This has no doubt prepped fans and media to expect short-hand but definitive answers from stats, but it is also an approach that fights off context and nuance.

The path of the second conversation allows for a lot more context. However, it still defers to analytics as possession and treats corsi or fenwick numbers as an isolated point of information to be added to the cloud of chatter about the "will to win" and so forth. The problem here is that possession stats are divorced from analytics and become merely one element in a sea of ideas about hockey. Here, a corsi number can easily slide in and out of a narrative about a given player or team to reinforce any argument.

A problem here is that analytics ≠ possession stats.

What I mean here is that analytics is better understood as a method, or a process of working through problems, than as a series of tools or numbers. I would define analytics as

using all available data (e.g., what is provided by the NHL's game sheets) to build tools (like corsi, fenwick, zone starts, etc.) in order to craft reasonable and contextual arguments concerning why and how an event occurred, or is likely to occur, and/or to make inferences regarding the true talent of a player or team.
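
To make the "data to tools" step of that definition a little more concrete, here is a rough sketch that tallies an event log into corsi, fenwick and zone-start numbers. The event format (dicts with `type`, `team` and `attacking` keys) and the helper names are invented for illustration; this is not the NHL's actual game-sheet format, and real data would need far more cleaning.

```python
from collections import Counter

# Hypothetical event log; keys and values are assumptions for illustration.
events = [
    {"type": "shot", "team": "TOR"},
    {"type": "miss", "team": "OPP"},
    {"type": "block", "team": "OPP"},  # blocked attempt, credited to the shooting team
    {"type": "goal", "team": "OPP"},
    {"type": "faceoff", "attacking": "OPP"},  # draw in the end OPP is attacking
    {"type": "faceoff", "attacking": "TOR"},
    {"type": "faceoff", "attacking": "OPP"},
    {"type": "faceoff", "attacking": None},   # neutral-zone draw
]

CORSI_EVENTS = {"goal", "shot", "miss", "block"}  # all shot attempts
FENWICK_EVENTS = CORSI_EVENTS - {"block"}         # unblocked attempts only


def tally(events, team):
    """Count shot attempts for/against and offensive/defensive zone starts."""
    counts = Counter()
    for e in events:
        kind = e["type"]
        if kind == "faceoff":
            if e["attacking"] is None:
                continue  # neutral-zone draws are ignored for zone starts
            counts["oz_start" if e["attacking"] == team else "dz_start"] += 1
            continue
        side = "for" if e["team"] == team else "against"
        if kind in CORSI_EVENTS:
            counts["corsi_" + side] += 1
        if kind in FENWICK_EVENTS:
            counts["fenwick_" + side] += 1
    return counts


def zone_start_pct(counts):
    """Offensive-zone starts as a share of all non-neutral zone starts."""
    return 100.0 * counts["oz_start"] / (counts["oz_start"] + counts["dz_start"])


tor = tally(events, "TOR")
print(dict(tor))
print(round(zone_start_pct(tor), 1))
```

The point of the sketch is only that each "tool" is a different cut of the same underlying events. Which cut matters depends on the question being asked, which is the sense in which analytics is a method rather than a single number.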

That's certainly convoluted. But, it transfers the burden of analytics conversations away from the framework of "analytics as possession." The idea being that the "analytics as possession" framework encourages "failure to account for" arguments, i.e., tired anti-stat arguments such as:

you said the Leafs would lose [no, actually, I did not], and they won! how do you account for that?! [rhetorical question].

corsi doesn't account for x, y, or z, therefore it's irrelevant or extremely limited in its use.

Both of these arguments are bogus and based on faulty premises, i.e., that analytics fails to account for a variety of outcomes and/or variables.

But, I'd like to suggest, these faulty premises are in part encouraged by the way in which the analytics conversation has reached the mainstream, i.e., through case studies (Wild, Kings, Leafs) that focus primarily on possession.

The key difference between a framework that posits "analytics as possession" and one that posits "analytics as a method" is that the latter starts with problems (e.g., how is this poor possession team overcoming its handicap?) and does not limit itself to any particular tool or combination of tools to provide an answer. It shifts the burden from defending the utility of a single reference point to carrying out an unfolding process capable of contextualizing a wide variety of variables.

Eventually, one assumes, just as possession is now beginning to go mainstream, analytics as a method, or process of working through problems, will enter the mainstream as well.

For the moment, however, one strategy to counter the limiting framework of "analytics as possession" might be to add counter case studies to the debate. In another article I will briefly go through the case of the New Jersey Devils, an incredibly strong possession team that is nevertheless faltering in the standings, to illustrate how analytics is anything but a "failure to account for" proposition.