A few days ago, Allan Mitchell aka Lowetide noted that "the Leon Draisaitl v. Sam Bennett debate is reaching a fever pitch". To foster the discussion – or fan the flames? – as to who the Oilers should draft, he posted the Vollman NHL equivalencies for the Top 10 projected picks.
One of the factors that subsequently sparked discussion in that thread is that Vollman’s NHLE rates the OHL as a better league than the WHL. In the discussion following, several commenters postulated or questioned whether this was actually true, or just bias:
- "there is a definite eastern/ohl bias"
- "What’s the rationale for OHL love?" and "I’m still not sure about this high rating of the OHL"
- "very skeptical of this rating the OHL much higher than the WHL"
As a born and bred Edmontonian, of course I will bring up the spectre of Eastern Bias at the drop of a hat. I mean that literally, I once dropped a hat that crumpled slightly in a vaguely easterly direction, and I reflexively shouted "Eastern bias! Corrupt anti-West Federal Government!" Luckily I was visiting an elderly relative in a retirement home at the time, and all the seniors in the area cheered. True story*.
* No it’s not. Except for the part about me once dropping a hat.
Now my first reaction to this situation was pretty straightforward: given that both Vollman and Desjardins have looked at how OHL and WHL numbers translate to actual NHL scoring, and both of them show a higher translation rate for the OHL, isn’t that prima facie evidence that the OHL is a better league?
I think so, but when looking at any situation, it usually helps to pull multiple sources of data and evidence. So … having just recently loaded and analyzed a pile of data with respect to draft success, I figured why not take a look at that data as well, compare the two leagues, and see if it adds anything to the discussion.
Purpose of Analysis
- Where are players drafted from?
- Is there any indication of an OHL drafting bias?
- If so, is the bias justified?
As with my previous work, this information is based on the draft years 1998 to 2007, a range striking a balance between data quantity, data uniformity/recency, and data staleness.
Players drafted from …
Pretty straightforward for the first step. I discombobulated the data to take a look at where players were drafted from, with some binning to give the data some structure. Here are the results:
More players were drafted from the OHL than anywhere else, but on its own that doesn’t necessarily mean anything, since I don’t have information on how many players there are in each of the respective leagues. The only surprise to me was how many kids were drafted out of high school.
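For the curious, the binning step could look something like the sketch below. The rows, column names, and league labels are all made up for illustration (they are not my actual hockeydb.com extract), and `bin_league` is a hypothetical helper rather than anything from a library.

```python
import pandas as pd

# Hypothetical rows standing in for the draft data; the column names
# and league labels are assumptions, not the real schema.
picks = pd.DataFrame({
    "player": ["A", "B", "C", "D", "E", "F"],
    "league": ["OHL", "WHL", "OHL", "QMJHL", "High-MN", "USHL"],
})

MAJOR_JUNIOR = {"OHL", "WHL", "QMJHL"}

def bin_league(league: str) -> str:
    """Collapse the long tail of source leagues into a few bins."""
    if league in MAJOR_JUNIOR:
        return league
    if league.startswith("High"):
        return "High school"
    return "Other"

picks["source"] = picks["league"].map(bin_league)

# Number of drafted players per bin, largest first.
counts = picks["source"].value_counts()
print(counts)
```

The only real design decision is which leagues get their own bin and which get lumped together; everything after that is a one-line count.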
Players by round …
The next step was to compare the WHL and the OHL round by round. For this section, I separated out the other significant junior league (QMJHL) but lumped everything else together. Again, few surprises:
Note that not all the numbers add up to exactly 360 (30 teams x 12 years) – there were a few skipped draft picks along the way, but that seems to be in the data itself. It’s a small number, so I didn’t investigate further: a discrepancy of 12 mostly late-stage picks out of 2630 total is just noise. Again, we can see that the OHL is the preferred player source, especially in the first round.
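A round-by-round breakdown like this, along with the "does each round add up?" sanity check, can be sketched with a pandas cross-tabulation. The data here is a toy example, and `EXPECTED_PER_ROUND` is a placeholder I have set to 3 purely so the arithmetic is easy to follow.

```python
import pandas as pd

# Toy data: one row per drafted player. Round 2 is deliberately one
# pick short to mimic a skipped selection.
picks = pd.DataFrame({
    "round":  [1, 1, 1, 2, 2],
    "source": ["OHL", "WHL", "Other", "OHL", "QMJHL"],
})

# Rows are draft rounds, columns are source bins, cells are pick counts.
by_round = pd.crosstab(picks["round"], picks["source"])
by_round["Total"] = by_round.sum(axis=1)

# Placeholder for "teams x draft years"; an assumption for this toy data.
EXPECTED_PER_ROUND = 3
missing = EXPECTED_PER_ROUND - by_round["Total"]

print(by_round)
print(missing)
```

Any round with a non-zero entry in `missing` is one where a pick was skipped (or lost somewhere in the data).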
Success charts …
The following charts show the success curves for the two leagues.
In my previous draft success analysis, I showed charts for both games played and a success rating (using a low 10 game cutoff). To my surprise, this sparked argument in the comment section as to what the ‘right’ cutoff should be – as if there’s some universally correct answer.
Let me reiterate this point: there is no right answer. It’s a classic filtering problem. If you set the cutoff low (as I did), you maximize signal but you also have to contend with more noise. If you set the cutoff high (as some argued was ‘right’), you reduce noise but you also reduce signal.
The high-signal/high-noise approach will include players who shouldn’t be counted as successes. But the low-noise/low-signal approach (assuming, say, a 200 game cutoff) will completely fail to distinguish between a draft round where every single one of the 30 players played two full seasons and a draft round where no players played any games; it treats both as having zero success. Tough to argue the latter approach is ‘more right’ than the former, isn’t it? That’s what I mean by less noise but also less signal.
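To make that concrete, here is the thought experiment as a few lines of NumPy. I am assuming an 82-game season, so "two full seasons" means 164 games; `success_rate` is a hypothetical helper, not a standard function.

```python
import numpy as np

# Round A: all 30 picks play two full 82-game seasons (164 games each).
# Round B: none of the 30 picks ever plays a game.
round_a = np.full(30, 164)
round_b = np.zeros(30, dtype=int)

def success_rate(games: np.ndarray, cutoff: int) -> float:
    """Fraction of picks who played at least `cutoff` games."""
    return float(np.mean(games >= cutoff))

for cutoff in (10, 200):
    print(cutoff, success_rate(round_a, cutoff), success_rate(round_b, cutoff))

# Total games played needs no cutoff at all.
print(round_a.sum(), round_b.sum())
```

With the 10 game cutoff the two rounds look completely different; with the 200 game cutoff they are indistinguishable.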
The bottom line is, there is no ‘right’ answer for your choice of filter; it all depends on what you’re using the data for. In my case, I am trying to produce a (smoothish) curve over all picks, which by definition is a noise reduction mechanism. So for my purposes, more data (including more noise) is much better than less data.
That said, other than the above paragraphs of ranting, I have no desire to spend more time arguing that there is no right answer to a question where some are convinced that there is a right answer, so I’ve skipped the issue altogether and just stuck with curves based on total games played. Less work for me!
End of Sidebar Rant - Let's get to the Charts
The first two charts below show the detailed 'total games played by draft position' results for the WHL and the OHL, along with a couple of smoothing lines. As the two charts are basically impossible to compare, the third chart superimposes the smoothed results of both leagues on the same chart for comparative purposes:
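For anyone wanting to reproduce a smoothing line of this kind, here is a minimal sketch. A centered rolling mean is just one reasonable choice of smoother and is an assumption for this example, as are the window size and the made-up games-played totals.

```python
import pandas as pd

# Hypothetical total games played at each overall draft position.
games = pd.DataFrame(
    {
        "OHL": [900, 400, 700, 100, 500, 0],
        "WHL": [600, 800, 200, 300, 0, 100],
    },
    index=pd.RangeIndex(1, 7, name="pick"),
)

# Centered 3-pick rolling mean; min_periods=1 keeps the first and last
# picks instead of turning them into NaN.
smoothed = games.rolling(window=3, center=True, min_periods=1).mean()
print(smoothed)
```

Because both leagues sit in one DataFrame, calling `smoothed.plot()` (with matplotlib installed) would put both curves on the same axes, which is the whole point of the superimposed chart.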
OK, we’ve got some charts and data now. Let’s also calculate some comparative numbers for context and see if we can draw any conclusions. I have used the Desjardins junior-to-AHL equivalencies as my starting point for comparison. This might not be 100% fair game, but it’s what I’ve got for now! Desjardins’ NHLE is 0.3 for both leagues (interesting that there is a distinction at the AHL level but not the NHL level), while Vollman’s data does draw a distinction. As I don’t have Vollman’s NHLE at hand at this time, I’ll use the Desjardins AHL numbers as the NHLE comparative, and close my eyes and pretend:
Conclusions from the Analysis
So after this long and arduous process, what can we conclude?
Honestly ... I would say not very much!
First, remember that (like with any hockey stat exercise), we have to be careful with the numbers because of wide error bars – small samples, data recency issues (the NHLE I believe will be more recent than my draft data), data smoothing, data transposition errors, etc. So many ways to make mistakes! But let’s ignore that and just dive right in anyway:
- Yeah, there might be a slight OHL bias. There was a greater proportion of OHL players picked, especially in the first round, than the NHLE or the games played would indicate should be the case.
- On the other hand, maybe the drafters have it right in the first round. Those OHL picks on a per player basis in the first round appear to have had more success in the NHL than comparable WHL players, at least on a games played basis. So maybe the OHL really does do a slightly better job of preparing elite players for the NHL. Maybe the WHL really is a tougher league and does a better job of preparing third and fourth liners! Maybe.
- One oddity to mention – and this goes out specifically to Mr. Magnificent Bastard, who I’m sure reads C&B FanPosts diligently. Note the odd little seventh round bump for the WHL in terms of players and games. Small sample of course, so it could mean nothing. Or … maybe it's not a toughness thing like some have guessed. It could be that the WHL, which is generally acknowledged as being less heavily scouted than the OHL, offers more chances to pick undervalued gems late in the draft. The gap in games played between the two leagues narrows markedly as you get deeper in the draft rounds, and then the WHL takes the lead in Round 7. In financial terms, the implication is that the WHL might be a "less efficient" market, and there is a greater likelihood of being able to 'beat' the market as a result. Conclusion … MBS, get your WHL scouts out there and find us a seventh round WHL gem!
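The comparisons behind those bullets boil down to two numbers per league per round: games played per pick, and the gap between a league's share of picks and its share of games. A rough sketch follows. Every number in it is invented purely to show the arithmetic and is not my actual data, and the column names are assumptions.

```python
import pandas as pd

# Invented per-round summary; NOT real draft results.
summary = pd.DataFrame({
    "round":  [1, 1, 7, 7],
    "league": ["OHL", "WHL", "OHL", "WHL"],
    "picks":  [60, 40, 50, 50],
    "games":  [30000, 18000, 2000, 3000],
})

# Average NHL games played per drafted player.
summary["games_per_pick"] = summary["games"] / summary["picks"]

# Each league's share of picks and of games within its round.
totals = summary.groupby("round")[["picks", "games"]].transform("sum")
summary["pick_share"] = summary["picks"] / totals["picks"]
summary["games_share"] = summary["games"] / totals["games"]

# Positive gap: the league was drafted more heavily than its games
# played would justify. Negative gap: it was under-drafted.
summary["share_gap"] = summary["pick_share"] - summary["games_share"]
print(summary)
```

A persistently positive `share_gap` for one league would be the drafting-bias signal, while a league whose late-round `games_per_pick` beats the other's is the "less efficient market" signal.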
The draft information came from hockeydb.com. The number crunching was done using the staggeringly useful open source data analysis tools that run under Python, specifically NumPy, SciPy, and pandas. Thanks to the Lowetidians for the inspiration. Thanks to the good folks here at C&B for providing me a forum for ranting at you without having to start my own blog.
The Copper & Blue is a fan community that allows members to post their own thoughts and opinions on the Edmonton Oilers and hockey in general. These posts do not necessarily represent the views of the staff of The Copper & Blue.