On October 5th, 2005, Daniel Alfredsson fired a wrist-shot through Ed Belfour to take the lead after netting two goals in the third period. It was the first shootout attempt in NHL history. Following a poke-check by Dominik Hasek on Jason Allison, a stick save by Belfour on Martin Havlat, a high-and-wide miss by Eric Lindros, and a five-hole Dany Heatley goal, the Ottawa Senators put up the first NHL shootout victory.
That same post-lockout season was also the first to implement a salary cap, and the Oilers were quick to benefit from both changes, picking up Chris Pronger from a Blues organization looking to shed salary, and squeaking their way into the playoffs with seven shootout victories and nine Bettman points for shootout losses. The 2006 cup run saw the reunification of a fractured fan-base and laid the foundation for the eventual sale and growth of the Edmonton franchise. Had the sixteen-year finals drought not ended, the post-dynasty era would have been somewhat bleaker. It’s possible that fans might have sought other hockey options, causing the Oilers to leave Edmonton. Oil Country, it seems, owes a debt of gratitude to the Bettman point and the shootout.
One often-repeated statistical claim is that the shootout is a "coin-toss" or "crap-shoot", implying that its results are random. Is it possible that the Edmonton Oilers are still in Edmonton by chance? Do we owe the current enthusiasm surrounding the team to the fluke operation of the ball-lottery and the fluke operation of the shootout? Such questions are largely unanswerable by statistical analysis and involve a considerable amount of conjecture. But the question of the shootout as a chance operation is one worth revisiting. If it isn’t a chance operation, it’s an opportunity for the current Oilers to pick up a point or two, thus book-ending the longest playoff drought in Oilers’ history with another shootout playoff appearance (and avoiding the dreaded honour of sharing a record with the Florida Panthers). Should the Oilers invest in the shootout? Should any team, for that matter, invest in shootout specialists? Statistical analysis can suggest answers to such questions.
Both journalists and statisticians have promoted theories and models comparing the shootout to a coin-toss. Accomplished analysts have argued both for and against the shootout as a random operation. Like the flip of a coin, the shootout is tallied by two possible outcomes: goals and saves. The purpose of the coin comparison is to assign probability values that express the possibility of arriving at a certain number of goals or saves by chance alone. Such "p-values" do not indicate the truthfulness, accuracy, certainty, or reasoning behind the statistic in question; they simply indicate the probability of arriving at "x" or more goals, or "y" or more saves, by chance alone given "z" trials. Analysts must still interpret the results and defend their position.
Experimental design tasks with two outcomes are often called "two-alternative forced-choices" (2AFCs). 2AFCs typically involve presenting participants with one of two randomly assigned stimuli, or one of two randomly assigned conditions. A typical 2AFC is the so-called "Coke-Pepsi" test, where participants are presented with the two beverages and asked which one they prefer. There are numerous biases involved in such a test. One is avoided by making certain that the test is blind, meaning that respondents can’t be influenced by appearances not directly related to the taste of the beverage. Another bias can be avoided by having the same number of participants taste Coke first as taste Pepsi first. Some people, it seems, are fooled by first impressions. Let’s say that taste researchers invaded Rexall Place and asked five hundred fans to taste two unmarked cups (one Coke and one Pepsi) and say which taste they preferred. 250 respondents were given Coke first, while 250 were given Pepsi first. Fans chose Coke 248 times as their preferred beverage and Pepsi 252 times. In this situation, the experimenters must ask themselves if the 4 extra answers in favour of Pepsi are significant. Did they measure a real difference, or were the respondents perhaps just guessing? Maybe they didn’t taste a difference at all?
Experimenters tend to rely on established formulas from statisticians in order to measure such differences. A typical criterion for estimating the significance of results for a 2AFC is the binomial test, which for most experimenters simply involves using a look-up table. Such tables typically display the number of responses required to reach significance at p=0.05 or p=0.01.
Most binomial distribution tables are calculated with a 50% probability of arriving at a single response by chance (although they may use other probabilities); this is the coin-flip situation. In our Coke-Pepsi example there is a 50% chance of each respondent making their selection by chance; it’s entirely possible, for example, that people can’t discriminate between the two drinks or prefer them equally. With our result of Pepsi preferred 252 times out of 500, the probability of seeing at least that many Pepsi answers from pure guessing is about p=0.45. (The chance of landing on exactly 252 is about 0.03, but it is the "at least this many" probability that a binomial test reports.) A 252-248 split is just what guessing would produce.
What you may have realized by now is that experimenters can never know if a result is the product of chance or an actual result. All they can do is state the probability that a result at least as lopsided might be the product of chance. What this means is that different communities establish different criteria, different p-value thresholds, at which they decide to place some "faith" in their results, and thereby accept or reject the "null hypothesis" (thank you Karl Popper). In the food sciences and engineering p=0.05 is typically held to be "significant". In the hard sciences p=0.01 is held to be significant. Our result of Pepsi being preferred 252/500 would convince neither camp. Had the result come in at p=0.03 instead, food engineers would likely view it with some degree of "faith", while physicists would likely be more skeptical.
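For readers who would rather compute than consult a look-up table, here is a minimal Python sketch of the binomial test for the Coke-Pepsi example. The function names are my own, and I'm assuming a one-sided test (Pepsi preferred at least this often) with a 50% chance per respondent. It also computes the chance of landing on exactly 252, which is a different and much smaller number than the "this many or more" probability that a binomial test reports.

```python
from math import comb

def binomial_point(successes, trials, chance=0.5):
    """Probability of exactly `successes` out of `trials` by chance alone."""
    return comb(trials, successes) * chance**successes * (1 - chance) ** (trials - successes)

def binomial_tail(successes, trials, chance=0.5):
    """Probability of `successes` or more out of `trials` by chance alone."""
    return sum(binomial_point(k, trials, chance) for k in range(successes, trials + 1))

pepsi_tail = binomial_tail(252, 500)    # one-sided p-value, roughly 0.45
pepsi_exact = binomial_point(252, 500)  # chance of exactly 252, roughly 0.035

print(f"P(252 or more of 500) = {pepsi_tail:.3f}")
print(f"P(exactly 252 of 500) = {pepsi_exact:.3f}")
```

On this reading, a 252-248 split is nowhere near the p=0.05 or p=0.01 criteria; it is roughly what a room full of guessers would produce.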
If you have a look at a binomial distribution table, smaller values of "percent-correct" or "percent-preferred" are held to be significant for experiments with a greater number of trials. For example, both 9/10 and 59/100 meet the p=0.05 criterion for our beverage test. Generally, because the notion of our "Coke-Pepsi" test is to represent the population of Oilers fans, more samples are held to be better, tending to produce results with a lower probability of being spurious representations of the fan-base's preference. We see here why colloquial use of the terms "sample" and "sample-size" among the broader hockey analytics community is often incorrect. More shots or more saves for a particular player are not more "samples"; they are more "trials" from a single player.
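The way the required percentage shrinks as trials grow can be sketched by rebuilding a few rows of such a table. This is only an illustration: the function name is mine, and I'm assuming a one-sided test at p=0.05 with a 50% chance per trial. A two-sided test or a different chance value would shift the cut-offs.

```python
from math import comb

def smallest_significant_count(trials, alpha=0.05, chance=0.5):
    """Smallest count whose 'this many or more' probability is at most alpha."""
    tail = 0.0
    smallest = None
    # Walk down from a perfect score, accumulating the upper-tail probability.
    for k in range(trials, -1, -1):
        tail += comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        if tail > alpha:
            break
        smallest = k
    return smallest

for n in (10, 100, 500):
    k = smallest_significant_count(n)
    print(f"{n} trials: {k} or more needed ({100 * k / n:.0f}%)")
```

Under these assumptions the cut-off is 9 of 10 and 59 of 100, and the required percentage keeps falling toward 50% as the number of trials climbs.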
The shootout as a "three-sided coin"
There have been 11,166 shootout shots taken in the regular season, with 3,630 of those resulting in goals (=0.325). Goalies have saved 7,536 of those shots (=0.675). This is pretty close to 1/3rd and 2/3rds respectively, with slightly more than 2/3rds resulting in "saves". I believe the reason is that there are actually three possible results to our test; in other words, we’re actually dealing with a three-sided coin. The three sides of our coin are: goal (counted as goal), miss (counted as save), and save (counted as save). We may have a 3AFC test counted as a 2AFC. For this analysis, I’m going to go with the 3AFC as a model for the shootout, where both misses and saves are counted as saves.
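The league-wide arithmetic above is easy to check. A small sketch using only the totals quoted in this post (the variable names are mine):

```python
shots = 11_166
goals = 3_630
saves = 7_536  # saves and misses combined, as the NHL counts them

# Every shot ends up in exactly one of the two tallies.
assert goals + saves == shots

goal_rate = goals / shots
save_rate = saves / shots

print(f"goal rate: {goal_rate:.3f} (three-sided coin predicts {1 / 3:.3f})")
print(f"save rate: {save_rate:.3f} (three-sided coin predicts {2 / 3:.3f})")
```

The observed rates sit within about one percentage point of the 1/3 and 2/3 that the three-sided coin model predicts.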
Full-sized chart and table visible here. Shots are charted on the horizontal axis while goals are on the vertical. Bubble-size corresponds to shot-percentage. Bubble colour corresponds to statistical significance category (p-value equal-to or smaller-than 0.05, calculated by a 1/3rd probability of scoring a goal by chance). Green bubbles are better than chance, white bubbles equal to, and red bubbles are significantly worse than chance. Mousing-over the bubbles displays relevant data.
843 shooters have participated in the NHL shootout. Viktor Kozlov (pictured in the photo at the top of the blog) is statistically the best shootout shooter over the last 10 years as determined by our criteria. Over 46 shots he scored 27 goals, giving him a shooting-percentage of greater than 58% (p=0.00). We see several superstar players that have significant results, and have repeated these results over many trials, including cup winners Jonathan Toews and Patrick Kane, and American specialist T.J. Oshie. There are also surprises in the upper end of the spectrum, such as Brad Boyes.
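To make the shooter criterion concrete, here is a rough sketch of the calculation for two stat lines quoted in this post: Kozlov's 27 goals on 46 shots and Alexander Ovechkin's 27 goals on 90 shots. It assumes a 1/3 chance of scoring on any attempt and a one-sided tail in the direction of each player's record; the helper name is mine and the charts may have been built with a slightly different procedure.

```python
from math import comb

CHANCE_OF_GOAL = 1 / 3  # the three-sided coin assumption

def pmf(k, n, chance):
    """Probability of exactly k goals in n attempts by chance alone."""
    return comb(n, k) * chance**k * (1 - chance) ** (n - k)

# Kozlov scored more often than chance predicts, so use the upper tail:
# the probability of 27 or more goals in 46 attempts.
kozlov_p = sum(pmf(k, 46, CHANCE_OF_GOAL) for k in range(27, 47))

# Ovechkin's 27 of 90 is slightly below 1/3, so use the lower tail:
# the probability of 27 or fewer goals in 90 attempts.
ovechkin_p = sum(pmf(k, 90, CHANCE_OF_GOAL) for k in range(0, 28))

print(f"Kozlov   27/46: p = {kozlov_p:.4f}")
print(f"Ovechkin 27/90: p = {ovechkin_p:.4f}")
```

Kozlov's value rounds to 0.00 at two decimal places, while Ovechkin's is far above 0.05, which is what "equal to chance" means in the charts.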
Based on these stats it would be more prudent to pick Kozlov to take a shootout for the Russian National team (he missed the 2014 Olympics) than Alexander Ovechkin. Ovechkin has scored 27 goals over 90 attempts, which is a performance equal to chance. Given the rules of Olympic hockey for the shootout, Kozlov's NHL shootout record is sufficient evidence to have him play on the team at a ripe old age, if in fact he can still "bring-it" in the skills competition. In the Olympics the same player can repeatedly take shootout attempts for his team. Kozlov could give the Russians a significant statistical advantage.
We also see some players who should not be asked to shoot in the shootout, those red bubbles worse than chance. Among the red are Vincent Lecavalier, Daniel Sedin, and Steven Stamkos. Team Canada take note: if Stamkos is in the next Olympics don't put him in the shootout, pick Jonathan Toews or maybe Sidney Crosby!
Full-sized chart and table visible here. Charted using the same criteria as above.
This chart includes shootout performances from 72 Oilers and ex-Oilers. These shots may have been taken as Oilers, or before or after the player in question played for Edmonton. There are only two performances that are statistically significant: Dustin Penner (worse than chance) and Rob Schremp (better than chance). Players like Eberle, Stoll, and Satan were/are approaching significance and might reach significance given more trials.
Full-sized chart and table visible here. Shots-against are charted on the horizontal axis while saves are on the vertical. Bubble-size corresponds to save-percentage. Bubble colour corresponds to statistical significance category (p-value equal-to or smaller-than 0.05, calculated by a 2/3rds probability of making a save or causing a miss by chance). Green bubbles are better than chance, white bubbles equal to, and red bubbles are significantly worse than chance. Mousing-over the bubbles displays relevant data.
Please note that this analysis (following the NHL's stats page) counts saves and misses together (as saves). We can't technically tell if goalies are making more saves or just causing players to miss more. But we can calculate the probability of a saves-plus-misses data point (indicated in the chart as saves) being a product of chance. These data points are more robust because goalies see more trials than shooters in the NHL.
160 goalies have participated in the NHL shootout. Henrik Lundqvist, Roberto Luongo, and Ryan Miller have been very dominant. They are certainly outliers. Lundqvist's 245 saves over 327 shots-against for a save-percentage of 74.9% (p=0.00) is incredible. Many of the top goalies are better than chance, including Carey Price.
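The green, white, and red categories for goalies can be sketched as a small classifier. This is my own illustration rather than the exact procedure behind the charts: it assumes a 2/3 chance of a save (or miss) per shot, a p=0.05 cut-off, and a one-sided test in each direction. Lundqvist's line comes from this post; the other two stat lines are hypothetical goalies invented for the example.

```python
from math import comb

def pmf(k, n, chance):
    """Probability of exactly k saves in n shots-against by chance alone."""
    return comb(n, k) * chance**k * (1 - chance) ** (n - k)

def classify(saves, shots, chance=2 / 3, alpha=0.05):
    """Label a record as better than, worse than, or equal to chance."""
    upper = sum(pmf(k, shots, chance) for k in range(saves, shots + 1))
    lower = sum(pmf(k, shots, chance) for k in range(0, saves + 1))
    if upper <= alpha:
        return "better than chance"  # green bubble
    if lower <= alpha:
        return "worse than chance"   # red bubble
    return "equal to chance"         # white bubble

print("Lundqvist 245/327:", classify(245, 327))
print("Hypothetical goalie 66/100:", classify(66, 100))
print("Hypothetical goalie 55/100:", classify(55, 100))
```

With a 2/3 baseline, a goalie has to stop well over two-thirds of attempts across many trials before the record stands out from chance.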
Goalies significantly worse than chance include Niklas Backstrom, Vesa Toskala, and Jean-Sebastien Giguere.
Full-sized chart and table visible here. Charted using the same criteria as above.
This chart includes shootout performances from 20 Oilers and ex-Oilers. These saves may have been made as Oilers, or before or after the player in question played for Edmonton. I have included Tim Thomas because he was with the organization and I like to fantasize about what may have been. Other than Thomas, only Mathieu Garon has exceeded chance, while higher-profile goalies Nikolai Khabibulin and Jussi Markkanen are worse than chance. Garon is "your man in a pinch". Devan Dubnyk, Dwayne Roloson, and Ben Scrivens have performances equal to chance.
A few thoughts
If the Oilers are to use the shootout to their advantage the best tactic is to train their goalies to do well. Their secondary tactic should be to push players like Eberle into statistical significance.
Is the shootout random? The answer is that it depends on who's shooting and who's saving. If Alexander Ovechkin is facing Devan Dubnyk it's probably random. If Jonathan Toews is facing Henrik Lundqvist it's not random at all.
If the NHL wants to make the shootout less random and more exciting, it should adopt the same rules as the Olympics, where teams are allowed to send out their best shooters as often as they want. This is the best way to guarantee that the most skilled team wins. I like watching skilled players in a duel as long as it's the best possible match-up. I'd rather have triple-overtime, but for the regular season I'm just fine with the shootout, thanks.