Wednesday, October 26, 2016

I Started Trying To Create A New Way To Measure NHL Goalie Performances

Please allow me to preface this post (it's going to be a journey) by saying I'm not a math guy. I'm a sabermetrics guy. We're hopefully going to explore a new sabermetric that is both sufficiently analytical and easily comprehensible.

And it's all about hockey goalies.

Goals against average is useless. Save percentage can even be kind of deceptive. Vezina trophies are bullshit. We don't have a way to measure goalies, or at least we don't have one that's good enough for me.

So I'm going to use the tools available to me (basically just Corsica) and try to come up with a new goalie measuring formula. Primarily, I want it to include how many shots the goalie faces (accounting for shot quality) and how well they do against those shots.

Some quick statistics notes: I'm using 50 minutes of 5v5 ice time as my minimum because anyone who played less than one game is probably not worth including. I'm going to drop it down to 10 minutes for 4v5 though, because I want to include as much data there as possible.

2015-16 5v5 Goalie Stats

Total time on ice: 115327.1 minutes

All types of chances
  • 105826 shot attempts faced
  • 78521 unblocked shot attempts faced (74.20% of all shot attempts)
  • 56445 shots on goal faced  (53.34% of all shot attempts)
  • 5v5 shots on goal against are 84.67% of all shots on goal against (5v5 plus 4v5)
Low-danger chances
  • 25655 shots on goal faced 
  • 45.45% of shots on goal, 24.24% of shot attempts
  • 0.222 shots on goal per minute
  • 0.9791 save percentage (0.0219 goal percentage)
Mid-danger chances
  • 18306 shots on goal faced
  • 32.43% of shots on goal, 17.30% of shot attempts
  • 0.159 shots on goal per minute
  • 0.9252 save percentage (0.0748 goal percentage)
High-danger chances
  • 12484 shots on goal faced
  • 22.12% of shots on goal, 11.80% of shot attempts
  • 0.108 shots on goal per minute
  • 0.8134 save percentage (0.1866 goal percentage)
2015-16 4v5 Goalie Stats

Total time on ice: 13196.9 minutes

All types of chances
  • 19273 shot attempts faced
  • 14339 unblocked shot attempts faced (74.40% of all shot attempts)
  • 10216 shots on goal faced (53.01% of all shot attempts)
  • 4v5 shots on goal against are 15.33% of all shots on goal against (5v5 plus 4v5)
Low-danger chances
  • 1681 shots on goal faced
  • 16.45% of shots on goal, 8.72% of all shot attempts
  • 0.127 shots on goal per minute
  • 0.9672 save percentage (0.0328 goal percentage)
Mid-danger chances
  • 4902 shots on goal faced
  • 47.98% of shots on goal, 25.43% of all shot attempts
  • 0.371 shots on goal per minute
  • 0.9117 save percentage (0.0883 goal percentage)
High-danger chances
  • 3633 shots on goal faced
  • 35.56% of shots on goal, 18.85% of all shot attempts
  • 0.275 shots on goal per minute
  • 0.7922 save percentage (0.2078 goal percentage)

The relationships between Corsi, Fenwick, and shots on goal largely remain the same between even strength and the penalty kill. So, too, does the save percentage in each of the three categories of chances. As you might expect, the penalty kill numbers are a bit worse, but only by about one to two percentage points. Honestly, I would have assumed it was more.

The biggest disparity is the frequency of shot creation, especially of the mid- and high-danger varieties. Per minute, teams generate about 2.3 times as many mid-danger shots on goal and about 2.5 times as many high-danger shots on goal on the power play as they do at even strength (0.371 vs. 0.159 and 0.275 vs. 0.108).
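A quick way to check that comparison is to divide the 4v5 per-minute rates by the 5v5 ones. This is only a minimal Python sketch; the dictionary and function names are made up for illustration, and the numbers are the shots-on-goal-per-minute figures from the lists above.

```python
# Shots on goal per minute by danger level, copied from the lists above.
RATE_5V5 = {"low": 0.222, "mid": 0.159, "high": 0.108}
RATE_4V5 = {"low": 0.127, "mid": 0.371, "high": 0.275}


def rate_ratio(danger):
    """Return the 4v5 rate divided by the 5v5 rate for one danger level."""
    return RATE_4V5[danger] / RATE_5V5[danger]


for danger in ("low", "mid", "high"):
    print(f"{danger}: {rate_ratio(danger):.2f}x the 5v5 rate")
```

A ratio above 1 means that kind of shot shows up more often per minute on the penalty kill; a ratio below 1 means it shows up less often.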

How The Fuck Do I Turn This Into A Formula

Bear with me while I try to figure out a way to turn this into a formula that we can apply to traditional box scores. 

Attempt 1

We're going to start wide here, and just create a formula to plug shots on goal into.

We're then going to multiply that by our expected percentage of those shots that are on the powerplay (and then by the quality that we expect those 5v5 and 4v5 shots to be). That'll be our expected goals total. 

The nuts and bolts of this is (shots on goal) x (% of shots that are 5v5 or 4v5) x (% of shots that are low, mid, or high danger) x (the goal percentage of that particular strength/danger) 

A1 = (SOG x .8467 x .4545 x .0219) + (SOG x .8467 x .3243 x .0748) + (SOG x .8467 x .2212 x .1866) + (SOG x .1533 x .1645 x .0328) + (SOG x .1533 x .4798 x .0883) + (SOG x .1533 x .3556 x .2078)

So if a goalie faces no shots, the expected goals total is 0. That's a good sign. Each shot is worth about 0.083 expected goals, so ten shots work out to about 0.8 expected goals and thirty shots to about 2.5.
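For anyone who would rather plug numbers into code than a calculator, here is a rough Python sketch of A1. The constant and function names are invented for this example; the values are the 2015-16 shares and goal percentages listed above.

```python
# Share of all shots on goal faced at each strength.
STRENGTH_SHARE = {"5v5": 0.8467, "4v5": 0.1533}

# Share of shots on goal at each danger level, by strength.
DANGER_SHARE = {
    "5v5": {"low": 0.4545, "mid": 0.3243, "high": 0.2212},
    "4v5": {"low": 0.1645, "mid": 0.4798, "high": 0.3556},
}

# Goal percentage at each danger level, by strength.
GOAL_PCT = {
    "5v5": {"low": 0.0219, "mid": 0.0748, "high": 0.1866},
    "4v5": {"low": 0.0328, "mid": 0.0883, "high": 0.2078},
}


def expected_goals_per_shot():
    """Sum the six strength/danger terms of A1 for a single shot on goal."""
    total = 0.0
    for strength, strength_share in STRENGTH_SHARE.items():
        for danger, danger_share in DANGER_SHARE[strength].items():
            total += strength_share * danger_share * GOAL_PCT[strength][danger]
    return total


def a1(sog):
    """Expected goals against from shots on goal alone."""
    return sog * expected_goals_per_shot()


for sog in (0, 10, 30):
    print(sog, round(a1(sog), 2))
```

Because every term is multiplied by SOG, the whole formula collapses to one per-shot weight times the shot count, which is why the per-shot value can be computed once and reused.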

Twelve shots per goal doesn't seem like a terrible ratio, but this is obviously a very simple formula that doesn't take into account the actual on-ice happenings of the game. For example, a goalie facing a lot of high-danger chances is going to look bad based on this formula. We could always adjust the percentages based on the actual distribution of dangers, but that's tricky and time-consuming. 

For season-long measurements, this formula would work. On a night-to-night basis, though, too much changes from game to game for this to really be an effective tool. 

Attempt 2

The two assumptions we can't make on a game-to-game basis are (1) danger and (2) percentage of power play shots. 

With this second attempt, I'm going to try to tackle the 5v5/4v5 issue. 

The formula skeleton is based on minutes played, both at even strength and on the penalty kill.

A2 = (5v5TOI x 0.222 x .0219) + (5v5TOI x 0.159 x .0748) + (5v5TOI x .108 x .1866) + (4v5TOI x .127 x .0328) + (4v5TOI x .371 x .0883) + (4v5TOI x .275 x .2078)

This formula ends up being a scale, from Totally Even Strength All Game (2.21 expected goals over 60 minutes) to Totally Killing A Penalty All Game (5.64 expected goals over 60 minutes). I'm trying to wrap my head around that 4v5 number being so low, but 2.21 seems fair at even strength and power plays are generally about twice as effective at shot generation, so it makes sense.
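The same idea as a rough Python sketch of A2, in case it helps to see the two ends of that scale computed. The names are made up for illustration; the rates and goal percentages are the ones from the stat lists above.

```python
# Shots on goal per minute at each danger level, by strength.
SHOT_RATE = {
    "5v5": {"low": 0.222, "mid": 0.159, "high": 0.108},
    "4v5": {"low": 0.127, "mid": 0.371, "high": 0.275},
}

# Goal percentage at each danger level, by strength.
GOAL_PCT = {
    "5v5": {"low": 0.0219, "mid": 0.0748, "high": 0.1866},
    "4v5": {"low": 0.0328, "mid": 0.0883, "high": 0.2078},
}


def expected_goals_per_minute(strength):
    """Expected goals against for one minute played at the given strength."""
    return sum(
        SHOT_RATE[strength][danger] * GOAL_PCT[strength][danger]
        for danger in SHOT_RATE[strength]
    )


def a2(toi_5v5, toi_4v5):
    """Expected goals against from minutes at 5v5 and minutes at 4v5."""
    return (
        toi_5v5 * expected_goals_per_minute("5v5")
        + toi_4v5 * expected_goals_per_minute("4v5")
    )


print(round(a2(60, 0), 2))  # a full game at even strength
print(round(a2(0, 60), 2))  # a full game on the penalty kill
```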

Most games will end with about 2 expected 5v5 goals against and about 1 expected 4v5 goal against, which also makes sense. 

Attempt 3

And now we have to deal with the final piece of this puzzle. Some defenses are good and hold the opposition to low-quality shots from the outside. Other defenses are not as good and allow a disproportionate number of high-danger chances.

Short of just plugging in the counts for low-, mid-, and high-danger shots against, is there any way we can factor that in? I don't really see how. So I'm not going to try. 

Attempt 4

Are you ready to take this to the next level? I'm going to combine the SOG formula (which measures shot count but not quality) and the TOI formula (which measures quality of ice time but not shot count) by averaging the two results.

They should end up hedging each other. I think. Shots are generally good. Powerplay time is generally good. Both of those things together should mean good things for expected goals. 

This isn't going to be perfect because we aren't accounting for shot quality, but the only three variables we need here are shots on goal, 5v5 ice time, and 4v5 ice time. Plug 'em in and fire it up!

A game with 30 shots on goal against, 50 mins of 5v5 ice time, and 5 mins of 4v5 ice time would result in an Expected Goals of 2.40. Bump it to 40 shots against, 48 minutes of 5v5 time, and 10 minutes of 4v5 time and it rises up to 3.01 expected goals. 
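Spelled out as a rough Python sketch, assuming the combination is a plain average of the SOG estimate and the TOI estimate (which is what the "Avg Expg" column in the table below shows). The three weights are the A1 and A2 formulas collapsed to a single per-shot and per-minute value and rounded to six decimals; the names are made up for illustration.

```python
# A1 collapsed to one weight: expected goals per shot on goal.
XG_PER_SHOT = 0.082565

# A2 collapsed to two weights: expected goals per minute at each strength.
XG_PER_MIN_5V5 = 0.036908
XG_PER_MIN_4V5 = 0.094070


def a1(sog):
    """Expected goals against from shot count alone."""
    return sog * XG_PER_SHOT


def a2(toi_5v5, toi_4v5):
    """Expected goals against from ice time alone."""
    return toi_5v5 * XG_PER_MIN_5V5 + toi_4v5 * XG_PER_MIN_4V5


def a4(sog, toi_5v5, toi_4v5):
    """Assumed combination: the plain average of the two estimates."""
    return (a1(sog) + a2(toi_5v5, toi_4v5)) / 2


print(round(a4(30, 50, 5), 2))
print(round(a4(40, 48, 10), 2))
```

An unweighted average treats shot count and ice time as equally informative. That is a simplification; a weighted blend would work the same way with a different divisor.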

Let's apply that to the Flyers and Sabres goalies last night:
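| SOG | SOG Expg | 5v5 TOI | 5v5 Expg | 4v5 TOI | 4v5 Expg | TOI Expg | Avg Expg | Act Goals | Notes |
|---|---|---|---|---|---|---|---|---|---|
| 27 | 2.23 | 54.9 | 2.03 | 5.52 | 0.52 | 2.55 | 2.39 | 3 | 10/25 both PHI goalies |
| 17 | 1.40 | 28.78 | 1.06 | 3.52 | 0.33 | 1.39 | 1.40 | 3 | 10/25 Neuvirth |
| 10 | 0.83 | 24.7 | 0.91 | 2 | 0.19 | 1.10 | 0.96 | 0 | 10/25 Mason |
| 44 | 3.63 | 49.64 | 1.83 | 3.58 | 0.34 | 2.17 | 2.90 | 3 | 10/25 Nilsson |
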
I am sorry if that looks like shit for you. Let's break that down into segments:
  • Shots on goal: Nilsson got buried, and actually performed better than he should have since he got pummeled (score effects are a real thing). Neuvirth gave up twice as many goals as he should have based on shots, and Mason came in to relieve him and kept a clean sheet. Mason had an easier time because he faced fewer shots (again, score effects), but Neuvirth under-performed and Mason over-performed.
  • 5v5 time on ice: Somehow I came up with different totals for 5v5 ice time across the teams. This isn't perfect, especially because Corsica doesn't give you the goalie's ice time. 
  • 4v5 time on ice: Neuvirth got stuck killing several penalties to Mason's single one, and I'm thinking I need to tweak that Expg formula for penalty kill scenarios. Three and a half minutes of time only results in a third of a goal? The tweak has to somehow factor in power play opportunities (and not just minutes), I guess. Buffalo had a six-second powerplay last night that resulted in a goal (it would translate to 0.01 expected goals). 
  • Average expected goals: Even with the kinks, the model projected the Flyers would win 2.90-2.39. They won, in a shootout, 4-3. So we're close. Going by the table, Mason did better than expected (0 goals against 0.96 expected), Neuvirth did worse (3 against 1.40), and Nilsson did about what he was expected to do (3 against 2.90). That seems fair.
  • Maybe we just need to bump up the impact of powerplays by somehow including power play attempts. I'll tackle that after lunch. 
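To put a number on the over- and under-performing, here is a small Python sketch that subtracts each goalie's averaged expected goals from what he actually allowed. The values come straight from the "Avg Expg" and "Act Goals" columns above; the names are made up for illustration.

```python
# (averaged expected goals, actual goals allowed) for the 10/25 game.
GAME = {
    "Neuvirth": (1.40, 3),
    "Mason": (0.96, 0),
    "Nilsson": (2.90, 3),
}


def goals_vs_expected(expected, actual):
    """Positive means more goals allowed than the model expected."""
    return actual - expected


for goalie, (expected, actual) in GAME.items():
    print(f"{goalie}: {goals_vs_expected(expected, actual):+.2f}")
```

A negative number is a good night for the goalie, and a number near zero means he did about what the model expected.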
