Outshooting leads to winning. News at eleven.
In this post, I talked about the inverse relationship between Corsi and Zonestart as a discussion point on context in microstats. The graph lent a visual to a not-so-obvious relationship for some. Below is a chart that shows a more obvious relationship. Without enlarging the chart, do you have thoughts on what this represents? The scale on the Y-axis is marked in tenths.
We'll get back to that chart in a moment.
The importance of outshooting has been discussed endlessly by Tyler at MC79, Vic at Irreverent Oiler Fans, Jonathan Willis here and at Hockey or Die, and of course Gabriel Desjardins at Behind the Net. Recently, Jonathan made an excellent post, taking up a topic that Tyler had presented as well.
Internet arguments against the concept of outshooting = outscoring = winning are typically reduced to the absurd, as in "The Oilers have lost the last four games where they outshot the opponents!" That very small sample size aside, the concept holds.
The chart above is the even strength shot ratios and goal ratios for each team by finishing position for the last three years. The main point of the chart is that it is possible to outscore while being outshot, but it's not likely. The linear trend line inserted into the graph shows the relationship between outshooting and outscoring. The middle four teams are muddled, as expected, and if we were to continue to track the outshooting versus outscoring ratios, eventually only the middle four teams would be muddled. Everyone else would fall into the trend, because even though three years of data is more than the single games mentioned above, it's still the short-term.
Yes, it's possible to outscore while being outshot over the course of a season or two, but it's not likely.
Next up: the relationship between even strength outshooting and winning. Jonathan looked at the results of Tyler's study and concluded:
Finally, with regard to point four I again turn to Dellow’s work. He did a massive big-picture post on teams’ records when out-shooting or getting out-shot; all of the data from 1987-2008 is included. The full study is here; for the sake of brevity I’ll just post the records:
- Outshooting Team: 9451 W – 7116 L – 1979 T – 612 OTL
- Outshot Team: 7728 W – 8647 L – 1979 T – 804 OTL
None of this should be surprising - the team that outshoots at even strength will outscore. The team that outscores at even strength will win.
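As a rough sketch of what those records imply, the Python below converts each line into a win share and a points percentage. The win, loss, tie and OTL counts are the ones quoted above; the `Record` class and the 2-1-1-0 points weighting are assumptions made for illustration, and they gloss over how the standings rules changed between 1987 and 2008.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    wins: int
    losses: int
    ties: int
    otl: int  # overtime losses

    @property
    def games(self) -> int:
        return self.wins + self.losses + self.ties + self.otl

    def win_share(self) -> float:
        """Fraction of all games won outright."""
        return self.wins / self.games

    def points_pct(self) -> float:
        """Share of a nominal 2 points per game (assumed 2-1-1-0 weighting)."""
        points = 2 * self.wins + self.ties + self.otl
        return points / (2 * self.games)


# Records quoted above from the 1987-2008 study.
outshooting = Record(wins=9451, losses=7116, ties=1979, otl=612)
outshot = Record(wins=7728, losses=8647, ties=1979, otl=804)

for label, rec in (("outshooting", outshooting), ("outshot", outshot)):
    print(f"{label:12s} win share {rec.win_share():.3f}  "
          f"points pct {rec.points_pct():.3f}")
```

Run as written, this puts the outshooting side at roughly a 49% win share against roughly 40% for the outshot side, with ties and overtime losses making up the rest.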
Update: Here is the graph with the shots ratio flipped:

Comments
oh sure
Another fancy microstat! You know what counts pal! Sticktoitiveness. And Stanley Cup rings. And, and, and ….
Truculence!
SNN Sports - A theoretical Oilers blog (i.e. theoretically, I write stuff there). Link now 100% less broken.
I’ll preface by saying I believe in everything that you’re saying (you took a gander at Kent’s site so you know I’m very vocal in this regard).
The graph was a bit confusing at first though, it took a while for me to realize that shots were an A/F ratio, and goals were an F/A ratio. Also I was confused by the negative ratios, I presume you subtracted out “one” to get the middle of the y-axis to be zero?
After all that is said and done: good job, the chart is pretty convincing. Shots now as a predictor of goals tomorrow was already pretty solidly covered, now you have shots now as an indicator of goals now. And goals now is the language of casual fandom.
And goals now is the language of casual fandom.
Goals Now is the language of the game. The game is played in the present tense and is decided on the scoreboard. Goals are the currency that separates winners from losers. Shots are a good but far-from-perfect indicator of goals, and of winners and losers.
My own crude research in this area is based on different parameters, in part because I’m a lazy bastard and in part because I’m a stubborn bastard and in part because I believe that if independent methodologies find similar trends or reach similar conclusions that strengthens those conclusions. My method is to use “whole game” shots without filtering out special teams. More broad brush than fine comb. Both have their advantages and could prove to be quite complementary.
So mine is more a holistic rather than a reductionist method. It is cruder in some ways, but more “pure” in the sense that teams don’t play to win the battle at even strength, they play to win the game. The score that’s on the board affects how the game is played right now, whether those goals happened to come on the powerplay or not. A team won’t play to defend a 2-1 lead at evens on a night they’ve been outscored 2-0 on the powerplay. Microstats are great but always run the risk of decontextualizing themselves from the real game unless they are specifically tabulated by leading / trailing / tied, an area in which much promising work is being done. Anyway, there’s a real interesting philosophical debate in there, which path I don’t want to go any further down just now.
A few months ago I looked at the readily available outshooting/outshot records at NHL.com, which showed that since the lockout (i.e. the Bettman Point 2.0 Era) the outshooting team wins ~53%, which is significant but not earth-shattering. As I recall (and I’m too lazy to look it up now, it’s on a different computer and the hockey gods know what thread on whose blog I might have posted it!) it used to be more like 55%. It’s about as significant as home-ice advantage, which is to say it’s important but not the be-all and end-all. The majority of teams will tend towards the mainstream of doing a little better when they outshoot and worse when they don’t, but there are always a significant number of outliers who buck the trend. Furthermore, no team outshoots all the time, so that’s a second order consideration: e.g. if a generic team outshoots x% of the time they should be expected to win 53% of the points in those games and 47% in the others. Plug in some numbers and the margins are small. The difference between a team outshooting in 60% of its games and 50% is about one standings point.
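The arithmetic in that last step can be sketched in a few lines of Python. The 53%/47% splits are the ones given above; the function name, the 82-game schedule and the flat two points per game are simplifying assumptions, so treat the result as a back-of-envelope figure rather than a model of the real standings.

```python
def expected_points(outshoot_rate: float,
                    pts_when_outshooting: float = 0.53,
                    pts_when_outshot: float = 0.47,
                    games: int = 82,
                    points_per_game: int = 2) -> float:
    """Expected standings points for a generic team.

    outshoot_rate is the fraction of games in which the team outshoots.
    The 0.53 / 0.47 splits come from the comment above; the 82-game
    schedule and flat 2 points per game are simplifying assumptions.
    """
    if not 0.0 <= outshoot_rate <= 1.0:
        raise ValueError("outshoot_rate must be between 0 and 1")
    share = (outshoot_rate * pts_when_outshooting
             + (1.0 - outshoot_rate) * pts_when_outshot)
    return share * games * points_per_game


gain = expected_points(0.60) - expected_points(0.50)
print(f"60% vs 50% outshooting: {gain:.2f} standings points")
```

With those inputs the gap comes out just under one point over a full season, which matches the estimate above.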
So it’s a useful trend but one can’t be that confident in it. Even in your example above which because it’s a 3-year average should sand down the outliers, nearly a quarter (7 of 30) of all positions have an outshooting ratio which has the “wrong sign”. Within any given season the number can be larger than that, maybe a third of all teams?
The other real interesting thing about your graph is the shape of the red curve, which shows how the #1 team in GF/GA has been a major outlier from the otherwise almost-linear distribution.
Two minor beefs about the graph; one echoes R O about the inversion. Did you do that to make it easier to see? Cause it’s counterintuitive for sure. I’d rather see both curves the same way. The other is the strength of the trend lines which tends to bias the perception. The green bars deviate quite a lot from the derived trend line. Turn those lines off (or make them fainter commensurate with their standard deviation) and the trend will still be there but not hitting you over the head with false pretences. :)
Anyway, good work Scott, lots of food for thought here. Can you send me your raw data please?
Writer for The Copper & Blue and primary shareholder of Zorg Industries
"Never be ashamed of who you are" -- Jean-Baptiste Emanuel Zorg
Not my piece! But I’m sure Derek will send it to you.
I think the strength of the outshooting data is on the margins. There haven’t been any teams with a shot differential of -4.5 per game or worse since the lockout that have made the playoffs (until this year’s Avs?). Of the teams that are +4.5 per game or better since the lockout, they’ve all been pretty dominant teams. If you’re between the two extremes, you might be good and you might be bad. What Derek has tried to show here is that the better your shooting record, the better chance you have of being good. Like you say, it’s far from perfect. Still, in the big picture (i.e. results over several games rather than game-to-game data) it’s much better than “53%” since we can talk in terms of degree rather than a simple “yes” or “no” (although I’d be interested to see data on games where a team outshot by, say, at least 5 or at least 10).
by Scott Reynolds on Jan 29, 2010 6:00 PM MST
Not my piece! But I’m sure Derek will send it to you.
* Blushes * Sorry Derek! I got so lost in there and the links I forgot to go back and double check, it seemed like the sort of thing Scott would do.
by Bruce McCurdy on Jan 29, 2010 6:35 PM MST
Two minor beefs about the graph; one echoes R O about the inversion. Did you do that to make it easier to see? Cause it’s counterintuitive for sure. I’d rather see both curves the same way.
Leaving the shots data as SF/SA didn’t have the visual impact. However, by popular demand, I added it to the story.
Editor of The Copper & Blue, and leader of The Cult Of Hartikainen.
Thanks. I think it does show much more clearly that the slope of SF:SA is less than half that of GF:GA. When they were sloping in opposite directions that was much less apparent.
by Bruce McCurdy on Jan 30, 2010 12:23 AM MST
Anyway, good work Scott, lots of food for thought here. Can you send me your raw data please?
Sent.
Editor of The Copper & Blue, and leader of The Cult Of Hartikainen.
Thanks, Scott Derek.
by Bruce McCurdy on Jan 29, 2010 8:49 PM MST
What is the error bar or standard deviation on the slope for the SA/SF graph? And/or what is the statistical confidence level (i.e. probability) that the slope is positive and non-zero? Or what is the correlation (in this case, inverse correlation) coefficient?
If you guys are going to do statistics, do statistics properly. If you don’t determine and state an error bar or confidence level in/from your analysis, it really doesn’t mean anything.
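For anyone who wants to attach numbers to that question, here is a minimal sketch in plain Python of an ordinary least-squares fit that also reports the standard error of the slope and the correlation coefficient. The function name and the sample values are placeholders invented for illustration, not the data behind the chart, and the standard-error formula assumes independent, equally noisy points, which may not hold for grouped team data.

```python
import math
from statistics import mean


def slope_with_stderr(xs, ys):
    """Ordinary least-squares fit of y = a + b*x.

    Returns (slope, intercept, standard error of the slope, r).
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares, with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    stderr = math.sqrt(sse / (n - 2) / sxx)
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, stderr, r


# Placeholder numbers only, NOT the data behind the chart.
positions = [1, 2, 3, 4, 5, 6]
ratios = [0.10, 0.07, 0.06, 0.02, 0.01, -0.03]
b, a, se, r = slope_with_stderr(positions, ratios)
print(f"slope {b:.4f} +/- {se:.4f}, r = {r:.3f}, t = {b / se:.2f}")
```

The ratio of the slope to its standard error (the t value printed at the end) is one common way to express how far the slope sits from zero.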
it really doesn’t mean anything.
Well, I could “do statistics” (whatever that means) on the numbers and come up with a P and a S and all of the fun involved. Or I could ignore it because I know that the abbreviated data is going to have some issues, because as I said in the post itself:
even though three years of data is more than the single games mentioned above, it’s still the short-term.
though we both know that the 30 sample points will only trend closer to the slope as we add more seasons of data to them.
The point, not lost on everyone here, is that single game looks are inane to the point of insane. On a game-by-game basis, the correlation, as Tyler demonstrated, is significant, but not dominant. On a season-by-season basis, the correlation gets stronger. With three seasons of data, the correlation is now obvious, but not as statistically significant as Tyler’s sample, and we don’t have the details necessary to run the correlation over ten years.
Editor of The Copper & Blue, and leader of The Cult Of Hartikainen.
godot10:
That is nonsense.
Any kid can run a linear regression with Excel, R, MATLAB or similar software and post the error values without understanding what they mean. I fail to see how that would move the conversation forward.
A scatter plot of GF/GA vs SF/SA might have made the point more succinctly, but that’s in the eye of the beholder. And enough people have already complained about the presentation :)
And he could have inserted a trend line onto that scatter plot, any type … I’ll assume he’s used least squares error, spreadsheets use that as the default for best fit lines. Or he could have created his own error measure. Let’s call it Zona error = sum:(absolute deviation(i))^ZONA! … where “ZONA!” is the ratio of the jersey numbers of Derek’s two least favourite Oiler players.
There is a very real chance that the ZONA! trend line would be a fairer reflection of reality than the LSE trend line, by chance alone.
You could argue that the data points should all be weighted as well, because there are different underlying sample sizes. And being multidimensional, that would be a bitch. Everyone start building their covariance-variance matrices now! Whoopie!
Postdictive regressions are the slow road to nowhere in the first place. They are an arrow in the right direction, a starting point for reasonable investigation, nothing more or less.
Sensibly, we all know that some teams had top end talent that played against the other team’s best, outshot them by a decent margin and could really finish their chances … so outscored them by even more … and they had bottom 6 forward groups and bottom D that got their asses owned, so gave back a lot of that gained shot differential, but little of the goal diff … because they were usually playing against schleps.
Plus for some teams their PP’s kick ass. And other teams had great goalies (BUF didn’t need to outplay you to outscore you when they had Hasek.) And we know that there is some error in shot counting venue to venue. Then there’s empty net goals … on and on. There is never such a thing as too much common sense.
Still, even through all that noise, as the sample grows larger the trend grows visibly stronger.
As well, it is flagrantly obvious that the p value for the null hypothesis of “outshooting doesn’t matter”, which is nothing more than a rewording of your slope request, will be extraordinarily small.
Derek could get the r for the two sets of data … correl() in Excel … then run them at timeonice.com/coincidence.html. He’d get the p value you’re after … and as I say, it will be minuscule.
And before you questioned his result, you would have to really think about the underlying math posed by your original comment … because it is precisely the same. I mean I think it would be useless, but it would satisfy your request.
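The r-then-p-value route can also be sketched without Excel. The Python below computes Pearson's r and then estimates a p value with a permutation test, which is a swapped-in technique and not a reproduction of whatever the timeonice.com page does. The function names, the trial count and the sample ratios are all invented placeholders.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least 2 paired points")
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, trials=10_000, seed=0):
    """Two-sided permutation test for Pearson's r.

    Shuffles ys to simulate "no relationship" and counts how often the
    shuffled |r| is at least as large as the observed |r|.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Placeholder numbers only, NOT the data behind the chart.
shot_ratio = [1.12, 1.08, 1.05, 1.01, 0.99, 0.97, 0.93, 0.90]
goal_ratio = [1.30, 1.10, 1.15, 1.02, 0.95, 1.00, 0.85, 0.80]
r = pearson_r(shot_ratio, goal_ratio)
p = permutation_p_value(shot_ratio, goal_ratio, trials=2_000)
print(f"r = {r:.3f}, permutation p ~ {p:.4f}")
```

A permutation test makes no distributional assumptions, at the cost of a p value that can never be smaller than one over the number of trials plus one.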
You are missing my point.
The analysis suggests that the slope of the “shots” line (say slope = m) is positive and non-zero. One should pursue the statistical analysis far enough to show that the standard deviation in the slope is less than (+/- m), or that a true slope of zero is statistically improbable.
A statistical result without a quantification of exactly its statistical significance doesn’t mean anything.
It is like developing a test for a disease without telling me how many false positives and missed positives are likely. Without those, one doesn’t know how good the test is.
How likely is it that the positive slope is a “false positive”? That likelihood is probably pretty small, but the analysis is only half done unless one does the significance analysis as well.
Yes. Running statistics properly is idiotic. Because running that shit through SPSS/SigmaStat/whatever would take more than five minutes.
Don’t ever change, Vic.
SNN Sports - A theoretical Oilers blog (i.e. theoretically, I write stuff there). Link now 100% less broken.
Doogie
You’re another one that throws out some gems. I suspect that you don’t understand what you are saying.
I think most people see these types of comments and just assume: “This kid sees the game differently, he doesn’t like what is being inferred. Plus he’s taken a couple of University math courses lately.” Still, I feel moved to comment.
OLS regressions aren’t the key to the kingdom. Remember trivia craps? Simulate that and run an OLS regression on the stats for a season … you will come to the irrevocable conclusion that trivia ability is insignificant and the ability to throw hot dice is key. Your scatter plot will look even more random than that above.
A good published paper on the subject, in a statistics or mathematics journal, will almost certainly start with a general scatter plot … more or less what Derek has done. No error bars, no trend line. It’s a check for real effects. They will probably compute an r value … but will warn that the value itself doesn’t have much meaning, other than that a large r value indicates that there is likely to be value in performing a study on the data. The size of the sample and the underlying sample (frame) size have an impact, as does the variation in frame sizes … but an experienced guy, whether a mathematician or a hockey fan who reads places like this a lot … they’ll have a pretty good sense of whether or not it makes sense to proceed.
From there, the mathematically pure, but subject naive route would be to compute the ability distributions for, in Derek’s case, SH%/oppSH% and SF/SA … the structure would be a bit hairy. And being subject naive you’d probably assume ability distributions of the Dirichlet form dispersed through multinomial or multivariate normal distributions.
That’s some big math. And it would be more sensible, though even more work, to break it down into its components. That’s what Thinker B was doing with the rookie shooting% in a recent post at IOF. He used a binomial dispersion, the gain in using a multinomial dispersion is very small … and the math is miles simpler that way. Thinker ‘B’ also had knowledge of the subject matter, that is evidenced by the fact that he referred the rookie shooter back to the population of forwards, he knew that defenders have far less opportunity to bury shots, and therefore didn’t constitute the same peer group.
In baseball, Brad Null has done some terrific work with nested Dirichlets, Jim Albert has done some terrific work with nested betas. Google Scholar those, Doogie, quite a few papers from both are publicly available. Jay Bennett is another.
In games with a large degree of inherent randomness (baseball, hockey, trivia craps) regression analysis is just an inappropriate tool to separate the two. You will end up with player evaluations that have tragically poor predictive value.
Career trajectories would be a better application, though essentially there we are looking at best fit lines. And even there a Bayesian approach makes more sense if one is trying to fit trajectories to individual players, and not the group as a whole. Least squares won’t work worth a damn even there, not unless the individual errors are weighted. Because if a player had 200 AB in 2003 and 550 AB in 2004 … and you were fitting a multinomial regression for batting average … using simple OLS error regression will be a train wreck. If you weighted the error of each point to 500 AB … then it would be fairer.
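A minimal sketch of the weighting idea, in Python: a straight-line fit in which each season counts in proportion to its at-bats. This is one plausible reading of the suggestion above, not a reconstruction of any published method. The 200 and 550 at-bat figures echo the example in the text; the third season, the batting averages and the function name are invented for illustration.

```python
def weighted_line_fit(xs, ys, weights):
    """Weighted least-squares fit of y = a + b*x.

    Minimises sum(w * (y - a - b*x)**2), so heavier points pull harder.
    Returns (intercept, slope).
    """
    if not (len(xs) == len(ys) == len(weights)) or len(xs) < 2:
        raise ValueError("need at least 2 points with matching weights")
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total
    y_bar = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, xs, ys))
    if sxx == 0:
        raise ValueError("x values must vary")
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical player: the averages and the 2005 season are invented.
seasons = [2003, 2004, 2005]
averages = [0.310, 0.270, 0.275]
at_bats = [200, 550, 520]

print("unweighted :", weighted_line_fit(seasons, averages, [1, 1, 1]))
print("AB-weighted:", weighted_line_fit(seasons, averages, at_bats))
```

Weighting by at-bats is a natural choice because the sampling variance of a batting average shrinks roughly in proportion to the number of at-bats behind it.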
Even then you would end up with some players with screwy looking career trajectories … because chance alone demands that. So to get some more Thinker B action going … you take these players in the study to represent the population, compute the actual spread of curvatures you would expect from chance alone, then work back to determine the most likely true career trajectory of each player. So a guy whose career trajectory might look like a bowl (concave) with OLSE, goes more of a shallow dish with weighted OLSE, and to a shallow convex (upside down) dish with the population correction.
You could argue that some form of an intra class correlation system, a random effects model regression, would be the best option. You would be wrong, but at least you’d be in the conversation. RE modelling would be the best choice if we were ignorant of the source of results (test scores, adverse reactions to drugs, etc.) … but that’s not the case with batting average. We know where batting average comes from, it’s hits divided by at bats. And we know (thanks to the work of Bill James and hundreds of others over the years) that Bernoulli dispersion is an effective model for simple BA. We also know of park effects, PED effects, injury effects etc. And the result of the study would likely show certain types of players had a comparable career trajectory. So you could lump these player types into classes and try again.
So that would be a place where you could sensibly apply regression analysis in hockey. Granted I’ve simplified a touch. I don’t see you or godot doing that, though.
I know you guys probably both feel that “outshooting doesn’t matter” and bust in with this stuff for that reason alone. That doesn’t bother me, Doogie. Guys like Scott, Derek, Tyler, etc … I don’t think that they fully realize how bewilderingly naive your comments are, and they don’t have math backgrounds, so there is a chance that they will find yours, godot’s, etc. comments discouraging. My hope is that, by pointing out your folly, they learn to ignore you in a casual way, understanding that you just don’t like what the information is implying.
It’s for the greater good, Doogie.
I suppose you think if you throw enough big words out there, I’ll get intimidated and bow to your superior bullshit? Don’t give yourself blisters.
Your assumptions are erroneous, thus making your entire argument flawed. I know a little more than you think I do. Not as much as I could, but that’s what grad stats are for once I’m done my B.Sc. In the meantime, I’ll take solace in the fact that you continue to miss the point utterly in your quest to be the most superior in all the land.
SNN Sports - A theoretical Oilers blog (i.e. theoretically, I write stuff there). Link now 100% less broken.
Oiler History 101
Yes, it’s possible to outscore while being outshot over the course of a season or two, but it’s not likely.
Season     GF:GA   SF:SA   PDO
-------    -----   -----   -----
1983-84    1.420   0.988   1.052
1984-85    1.346   0.985   1.042
1985-86    1.374   0.953   1.049
1986-87    1.310   0.995   1.038
1987-88    1.260   0.988   1.033
1988-89    1.062   0.913   1.021
1989-90    1.113   0.965   1.019
How about over the course of a season or seven? And it would be more, except SA data aren’t available before ’83-84, at least at my source (hockey-reference.com).
Yes of course I know they are an outlier, but the 80s Oilers proved it is possible to outscore, consistently and significantly, without outshooting.
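One way to read that table: if goals and shots are counted over the same situations, then GF/GA = (SF/SA) × (Sh% / opponent Sh%), so dividing the goal ratio by the shot ratio gives the percentage edge the Oilers needed each year. A short Python sketch using only the figures listed above (the helper name is made up for illustration):

```python
# (season, GF:GA, SF:SA, PDO) exactly as listed in the table above.
OILERS = [
    ("1983-84", 1.420, 0.988, 1.052),
    ("1984-85", 1.346, 0.985, 1.042),
    ("1985-86", 1.374, 0.953, 1.049),
    ("1986-87", 1.310, 0.995, 1.038),
    ("1987-88", 1.260, 0.988, 1.033),
    ("1988-89", 1.062, 0.913, 1.021),
    ("1989-90", 1.113, 0.965, 1.019),
]


def implied_pct_ratio(goal_ratio: float, shot_ratio: float) -> float:
    """Sh% / opponent Sh% implied by GF/GA = (SF/SA) * (Sh% / oppSh%)."""
    if shot_ratio <= 0:
        raise ValueError("shot_ratio must be positive")
    return goal_ratio / shot_ratio


for season, gf_ga, sf_sa, pdo in OILERS:
    edge = implied_pct_ratio(gf_ga, sf_sa)
    print(f"{season}  shots {sf_sa:.3f}  goals {gf_ga:.3f}  "
          f"implied Sh%/oppSh% {edge:.3f}  PDO {pdo:.3f}")
```

Every season in the list has a shot ratio below 1 and a goal ratio above 1, so the implied percentage ratio is above 1 throughout.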
Writer for The Copper & Blue and primary shareholder of Zorg Industries
"Never be ashamed of who you are" -- Jean-Baptiste Emanuel Zorg
But it’s plain and obvious that they’re not the norm. What’s the point of building a model that will reflect the results of the 80s Oilers when we all agree that their results are the exception and not the rule and that it’s very unlikely that we’ll see another team that’s similar to them in terms of talent level? I suppose it’s possible that the Colorado Avalanche are earning their success in the same way as the 80s Oilers but… I wouldn’t bet on it. I guess my question is, how relevant do you think your example is to Derek’s analysis of the league today?
by Scott Reynolds on Jan 30, 2010 1:09 AM MST
I wouldn’t bet on it. I guess my question is, how relevant do you think your example is to Derek’s analysis of the league today?
It’s irrelevant because the 80’s Oilers wouldn’t exist for seven seasons in the league today. Messier would be in New York by 1986 and Kurri would be in Detroit. Fuhr would get $6 million from Waddell and Coffey would get $7 million from Philadelphia.
Editor of The Copper & Blue, and leader of The Cult Of Hartikainen.
Of course, that era is obsolete. It’s a dynamic sport. This era will become obsolete in time.
I guess my question is, how relevant do you think your example is to Derek’s analysis of the league today?
All I’m saying is no matter what you think is the model, the next successful strategy might be the one that breaks the model. The Oilers sure did that in the ‘80s. Who knows who’ll do what in the ’10s.
But it’s plain and obvious that they’re not the norm. What’s the point of building a model that will reflect the results of the 80s Oilers when we all agree that their results are the exception and not the rule
I guess it depends if you are trying to find a model of a normal team that plays by “the rules”, or of an exceptional team. The latter are much more interesting, and oftentimes more successful.
I still find it amazing that the greatest offensive team in the history of the game generated fewer shots than their opponents every season. It was ALL about shot quality, not quantity. Hard to argue that it didn’t work.
by Bruce McCurdy on Jan 30, 2010 1:45 PM MST
Since we don’t have shot differential by game state, we have no idea if the Oilers outshot the hell out of teams for the first 27:30, got a three goal lead and then hung around and drank beers while Fuhr was under a barrage.
Editor of The Copper & Blue, and leader of The Cult Of Hartikainen.
That did happen quite a bit. There were also games where Fuhr (or Moog) got peppered in the first 35 minutes and then the Oilers would wake up and bam, bam, bam.
I daresay one could learn quite a bit on that subject by adding up shots by period from the Hockey Summary Project. My guess would be the Oilers got outshot in a lot of third periods while they had a lead, comfortable or otherwise. I’d also place a small wager that Oilers’ Sh% was highest in the third period.
For sure strategy #1 was to get the lead, not just force the other team to open it up but invite them to do so, and then kill ’em on the counter attack.
by Bruce McCurdy on Jan 30, 2010 2:03 PM MST
I daresay one could learn quite a bit on that subject by adding up shots by period from the Hockey Summary Project.
Meant to add and will here that The Contrarian Goaltender has done some real good work using this method, for example this post. While it’s not as precise a method as stripping play-by-play data, information already collected in those three crude buckets called periods has a lot of value. The best part is that there are decades of such data.
by Bruce McCurdy on Jan 30, 2010 2:17 PM MST
Yeah good point Bruce. Right now we’re talking about variation in percentages looking like noise but that’s only been demonstrated (afaik) at the NHL level post-2000s. I think it’s pretty clear that if you put an NHL team vs. a beer league team that differences in SH% would drive some of the results. And that’s an extreme example but I’m sure there are high-level leagues where there are more delineated tiers in terms of retaining skilled men.
I think we’re after hockey truth, specifically at the highest levels, and I understand that the 80s Oilers had a ton of the most offensive talents of that generation, but if I understand correctly teams like NYI, MTL and CGY could also hold their own in the offense department too. So while I would buy that EDM could be an outlier, I also am very interested in Derek’s question about the effect of the scoreboard on EDM’s outshooting/outscoring splits.
I think we’re after hockey truth, specifically at the highest levels, and I understand that the 80s Oilers had a ton of the most offensive talents of that generation, but if I understand correctly teams like NYI, MTL and CGY could also hold their own in the offense department too. So while I would buy that EDM could be an outlier
Which is why I grouped the data by finishing position 1-30, rather than by team. It takes away the year to year ups and downs of a team and consistently groups equivalent teams from year-to-year.
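A rough Python sketch of that grouping step: bucket each team-season by finishing position and average the ratios within each bucket. Whether the chart averaged per-season ratios or pooled the raw shot and goal totals is not stated, so the simple mean here is an assumption, and the sample rows are invented placeholders.

```python
from collections import defaultdict
from statistics import mean


def average_by_position(rows):
    """Average shot and goal ratios across seasons for each finishing position.

    rows: iterable of (season, position, shot_ratio, goal_ratio).
    Returns {position: (mean shot ratio, mean goal ratio)}, ordered by position.
    """
    buckets = defaultdict(list)
    for _season, position, shot_ratio, goal_ratio in rows:
        buckets[position].append((shot_ratio, goal_ratio))
    return {
        pos: (mean(s for s, _ in vals), mean(g for _, g in vals))
        for pos, vals in sorted(buckets.items())
    }


# Placeholder rows only: three unnamed seasons, two finishing positions.
sample_rows = [
    ("season-1", 1, 1.10, 1.40),
    ("season-2", 1, 1.06, 1.30),
    ("season-3", 1, 1.02, 1.35),
    ("season-1", 30, 0.90, 0.75),
    ("season-2", 30, 0.94, 0.70),
    ("season-3", 30, 0.92, 0.80),
]

for pos, (shots, goals) in average_by_position(sample_rows).items():
    print(f"position {pos:2d}: SF/SA {shots:.3f}  GF/GA {goals:.3f}")
```

Grouping this way means a given bar on the chart can be made up of a different team each season, which is the point of the design.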
I also am very interested in Derek’s question about the effect of the scoreboard on EDM’s outshooting/outscoring splits.
Vic’s new toy should help us going forward with that, but like Bruce says above, it’s going to require a bunch of manual work.
Editor of The Copper & Blue, and leader of The Cult Of Hartikainen.
Right now we’re talking about variation in percentages looking like noise but that’s only been demonstrated (afaik) at the NHL level post-2000s.
To quibble somewhat, right now you’re talking about variation in percentages looking like noise. To me there’s a lot of information in that “noise”, and I think it’s a bad mistake to call it that, smooth it out and pretend it’s not there. Thus my comment above about the trend line, which obscured at least as much as it revealed. What is demonstrated beyond a doubt is there is a strong correlation between goal ratio and standings placement, which is almost but not quite a no-brainer cuz there are outliers there too. Nonetheless after just three seasons of data the red bars proceed in a most orderly manner.
What is also demonstrated is that there is a much weaker correlation, albeit a positive one, between shots ratio and standings placement. You will not get an argument from me that outshooting your opponent is one path to success.
However it is not the only one. Those pesky percentages play a huge role. Sure they are more volatile and are more difficult to control, but to handwave them all away as “noise” or “luck” is a mistake. Ultimately it is the best combination of outshooting and positive percentages (PDO) that will succeed. Shot quantity and quality, if you will. It is possible to be successful by being dominant in one of those categories and average in the other, as the 80s Oilers and 2008 Red Wings proved, but surely the idealized model for consistent success is to be above average in both.
by Bruce McCurdy on Jan 30, 2010 2:54 PM MST
To quibble somewhat, right now you’re talking about variation in percentages looking like noise.
Sure Bruce, but I think you’re in the minority.
I myself am convinced, not just by the numbers either. The teams (in the NHL) that look to me like they play hockey the best are the guys that can keep the puck in the offensive end of the rink. That looks fucking hard from where I sit.
And of course I doubt anyone can really get a concrete feel for whether territorial advantage truly leads to better goal differential, after all none of us could possibly watch all 1230 NHL games for seasons at a time and then somehow juggle all of the tens of thousands of events (which direction the puck was moving when shifts started, who was on the ice for every shot for and against, who drove every offensive zone entry, etc. etc.).
That’s what we have numbers for and they are convincing. I look at what Vic did with the correlation between events today (goals, shots, Corsi, faceoff starts) and goals tomorrow and I don’t think it could be argued against. Same with the PDO half-season splits MC did.
Sure Bruce, but I think you’re in the minority.
Oh yeah, it’s a place I’m quite used to being. I’m quite comfortable down here, and think I can actually serve a useful function. As in science, any theory is/should be open to question. If the theory is strong it will survive such tests, may even emerge a little stronger if a weak link is exposed. So I apply my weird brand of logic probing for weak links, and I ask questions and propose alternative hypotheses. Some days I can be a real pain in the ass, but hopefully sometimes the conversation gets advanced a little bit. As you say, we’re all after hockey truth.
Another consideration is that many place the emphasis on the predictive power of these numbers (goals tomorrow) and I’m at least as interested in their capability as an analytical tool of past events. I’m an historian, not a gambler. :) Also, I’m at least as interested in the outlier as I am in the norm, even as it’s good to establish what that norm is. So I’m often coming from a completely different direction.
I myself am convinced, not just by the numbers either. The teams (in the NHL) that look to me like they play hockey the best are the guys that can keep the puck in the offensive end of the rink. That looks fucking hard from where I sit.
I agree with this, by and large, certainly in the modern NHL. But what you do with the puck while it’s there is pretty important too. You generate shots and attempted shots, create scoring opportunities, and convert them at different rates which collectively vibrate around the true results. Some of those metrics seem to be more reproducible than others, and the work you cite that Vic, MC and others are doing in that area is very significant.
But shots are just a proxy. It’s possible to dominate in the offensive zone without necessarily generating a lot of shots, or it’s possible to shoot from everywhere like Toronto does and have a nice looking shots ratio but a shitty PDO and still find yourself looking up the standings. In such a case it’s a stretch to equate shots ratio with possession or zone time; their (negative) goals ratio may well be a truer barometer of that.
And if I may be permitted to dwell in the house of the 80s a while longer, it’s possible to dominate on the scoreboard without necessarily dominating zone time. The Oilers tended to spend about five seconds in the offensive zone and then get together for another celebration. They were counter attackers de luxe, scored an incredible number of goals off the rush. This can be seen statistically through the prism of their special teams, where their powerplay was good but rarely great while their shorthanded production was truly extraordinary. I suspect in their admittedly special case that if it were possible to calculate zone time it wouldn’t be any more impressive than their very ordinary shots ratio, perhaps even less so. There would be times of course where they would hold and hold the puck trying to set up that perfect shot, but more often than not that shot occurred very early in the attacking sequence.
Writer for The Copper & Blue and primary shareholder of Zorg Industries
"Never be ashamed of who you are" -- Jean-Baptiste Emanuel Zorg
by Bruce McCurdy on Jan 30, 2010 5:13 PM MST up reply actions
I’m sure there are high-level leagues where there are more delineated tiers in terms of retaining skilled men.
Agreed. You needn’t look further than international hockey, which produces severe mismatches. I don’t even remember who it was that Canada played to open the world junior — Kazakhstan? Latvia? Tuva? — but I do remember they outshot them 66-10 and outscored them 16-0. Both of which are mind-boggling stats, but when you express them as ratios one is much more severe than the other. That Canada scored on every fourth shot but the butt-kickees couldn’t score even on one in ten speaks anecdotally to shot quality as well as quantity. That’s just a single game obviously, but at such levels the top teams tend to have goal ratios that exceed their shot ratios, and commensurately excellent PDO #s.
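For anyone who wants the arithmetic behind that game spelled out, here is a minimal sketch in Python. It assumes the usual definition of PDO as shooting percentage plus save percentage, computed here from the all-situations totals quoted above rather than even strength only, and the function name `game_summary` is just a label made up for this example.

```python
def game_summary(shots_for, shots_against, goals_for, goals_against):
    """Ratios and percentages for one team in one game.

    Shots are shots on goal, so goals are counted among them.
    """
    sh_pct = goals_for / shots_for              # shooting percentage
    sv_pct = 1 - goals_against / shots_against  # save percentage
    # A shutout has no finite goal ratio, so report infinity instead.
    goal_ratio = goals_for / goals_against if goals_against else float("inf")
    return {
        "shot_ratio": shots_for / shots_against,
        "goal_ratio": goal_ratio,
        "sh_pct": sh_pct,
        "sv_pct": sv_pct,
        "pdo": sh_pct + sv_pct,
    }


# Totals quoted in the comment: shots 66-10, goals 16-0.
canada = game_summary(66, 10, 16, 0)
latvia = game_summary(10, 66, 0, 16)

for name, team in (("Canada", canada), ("Latvia", latvia)):
    print(name, {key: round(value, 3) for key, value in team.items()})
```

The shot ratio is large but finite, while the goal ratio has no finite value at all in a shutout, which is the sense in which one ratio is much more severe than the other. The two PDO figures always sum to 2 within a single game, so one team's excess is exactly the other's deficit.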
Your comment about analysis looking at the post-2000s NHL is well-taken, R O. The parity, both real and artificial, that exists in Gary Bettman’s NHL will have the effect of constraining the outliers while consigning a large number of clubs to the mushy middle.
by Bruce McCurdy on Jan 30, 2010 3:07 PM MST up reply actions
Latvia, it was not pretty.
16-0 probably wasn’t the fairest outcome though, shot quality was surely a factor but puck luck is always huge in any single game.
Of course, R O, I did say “anecdotally”. To cite a single game is more a parable than a statistic. That said, a Latvia goal would have been a surprise, wouldn’t it?
One way one could verify this effect statistically would be to look at a data set of international games where the shot ratio exceeded certain thresholds, say 1.5, 2.0, 3.0, and calculate the goal ratios in those games. I would be stunned if the latter wasn’t significantly higher, perhaps even exponentially so. The winners would have great PDOs. The better countries have better shooters and better goalies, better playmakers and better team play. They should be expected to produce at a higher rate.
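A rough sketch of how that threshold check might be set up in Python follows. Everything named here (`Game`, `summarize_above`, `SAMPLE_GAMES`) is hypothetical, and apart from the 66-10, 16-0 game mentioned earlier in the thread the rows are made-up placeholders to show the mechanics, not real results. A real attempt would need an actual set of international box scores.

```python
from typing import NamedTuple, Optional


class Game(NamedTuple):
    """One game from the point of view of the team that outshot its opponent."""
    shots_for: int
    shots_against: int
    goals_for: int
    goals_against: int


def summarize_above(games, threshold) -> Optional[dict]:
    """Pool every game whose shot ratio exceeds the threshold."""
    picked = [
        g for g in games
        if g.shots_against > 0 and g.shots_for / g.shots_against > threshold
    ]
    if not picked:
        return None
    sf = sum(g.shots_for for g in picked)
    sa = sum(g.shots_against for g in picked)
    gf = sum(g.goals_for for g in picked)
    ga = sum(g.goals_against for g in picked)
    return {
        "games": len(picked),
        "shot_ratio": sf / sa,
        "goal_ratio": gf / ga if ga else float("inf"),
        # PDO of the outshooting side: shooting pct plus save pct.
        "pdo": gf / sf + (1 - ga / sa),
    }


# Placeholder rows. Only the first comes from the thread; the rest are invented.
SAMPLE_GAMES = [
    Game(66, 10, 16, 0),
    Game(45, 20, 6, 1),
    Game(36, 22, 4, 2),
    Game(30, 28, 2, 3),
    Game(40, 13, 3, 4),  # a "stolen" game: heavily outshot the opponent, lost anyway
]

for threshold in (1.5, 2.0, 3.0):
    print(threshold, summarize_above(SAMPLE_GAMES, threshold))
```

Pooling shots and goals across the selected games, rather than averaging per-game ratios, keeps shutouts from blowing up the goal ratio. Comparing the pooled goal ratio to the pooled shot ratio at each threshold is then the test of whether the former really runs ahead of the latter.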
A study of similar games in the NHL would be interesting but, I suspect, much more ambiguous. There would be a lot fewer such games (esp. 3:1), and while some would be blowouts there would be a few stolen games in there where the team that was behind forced the play and thus achieved the shots ratio threshold. As a group the PDO #’s in such a specific group of games might be fairly flat, certainly more so than in a setting with a wider range of talent levels such as an international tournament.
by Bruce McCurdy on Jan 30, 2010 4:07 PM MST up reply actions
What “big words” have confused you Doogie? There are no big words there, this is run of the mill stuff. It’s not basic for someone talking about hockey on the internet and making their point with data, but it is very fundamental for someone with a maths background.
Jesus, what’s happening to our education system? You’re UofC, no? That’s my alma mater. What math courses have you taken?
I’m not the educating type, and frankly the type of person that throws out your sort of argument is never contributing to the community in a way that is meaningful (IMO) anyways.
I suppose I could explain how simple regressions are built, from the ground up. Brick by brick. Unless it’s going to help Scott, Zona and the like … I just can’t justify that use of my time.
It’s okay, don’t bother. You’re clearly more interested in making things sound hard than anything else. I can assure you that finding the statistical significance and confidence interval of a simple linear regression like the one above takes about 30 seconds, because I’ve done it lots in my various work in science labs over the last six years. That was the only point I was interested in making, and I stand by that.
SNN Sports - A theoretical Oilers blog (i.e. theoretically, I write stuff there). Link now 100% less broken.
