Sunday, February 2, 2014

How to Accurately Rank Fantasy Basketball Players

In part I, I explained how fantasy players are usually ranked - with Standard Scoring. In part II, I introduced another way, Rarity Scoring. This, part III, is putting the finishing touches on Rarity Scoring by introducing what I call "True Zero" - finally giving us a quality means of ranking fantasy basketball players. In part IV, I will discuss the differences between Standard Scoring and Rarity Scoring (and hopefully show how much better my system is than the standard system).

True Zero

True Zero (t0) is the baseline amount of a statistic that you expect from any player who is good enough to be owned in your league. Pure Rarity Scoring assigns a weight to each statistical category so that all categories are equal in value, since each category is just as valuable as another. Using True Zero values, we will be able to equally value production across all statistics above the minimum expected from owned players.

I know that this isn't a simple concept to grasp - at least with how well I described it - so here is an example to try to make it more obvious. Earlier in the Manifesto, to demonstrate pure Rarity Scoring, I grabbed the stats from the top 12 players and ranked a few of their stats. The weights ("Value" in the pictures) were for pure rarity scoring. Since points are by far the most common statistic, the other coefficients ("Values") are much higher, i.e. points is 1.00 and blocks is 29.23.

This example and these coefficients aren't applicable to actual leagues, since they only account for twelve players and six categories, but the concepts are nearly identical. In this league, the worst player scores almost twenty points. If we assume that these are the only players available, then even the worst player you or your opponent can start will score about twenty points, so only the points above twenty help you out-score your opponent. That makes a player who can score 30 points much more valuable!

The relative value ("Coefficient" is the term I will usually use to refer to these values that scale different categories so they are equally weighted) of each category is represented by the blue "Value" row. Since the worst player in this league (like most leagues!) records 0 blocks, the True Zero (t0) is actually zero. The result is that blocks are 6 times (the blue "Value" region) as rare as points scored. In the future, I will refer to the resulting score (Coefficient*Production) as Fantasy Points Equivalent (FPE).

In actual leagues, determining True Zero is much more complicated. For each category, we have to figure out the lowest production we would expect from a player who is still good enough to be owned. For example, t0 would be the number of blocks we expect a lazy point guard to have or the number of points a pure defensive specialist will score. To be clear, a player who scored the t0 value in every category would be a terrible fantasy player. Using a smooth-line approach gives us the minimum we expect to find on the waiver wire in each category.

True Zero by Category

It turns out that production in fantasy basketball is best represented with exponential decay. Using this knowledge, I smooth the lines and find what the worst player in each category should produce in that category. For categories like Blocks and Threes, these values are basically zero, but for categories like Points and Rebounds, there is significant expected value for everyone in the league.
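Here is a minimal sketch of what that smoothing could look like. I do this in a spreadsheet, so treat everything here as an assumption: the function names are made up, the fit is a plain least-squares line through the logarithm of each player's production, and zero values are simply skipped because their logarithm is undefined. The sample data is synthetic, not real player stats.

```python
import math

def fit_exponential_decay(ranked_values):
    """Fit y = a * exp(-b * rank) by least squares on ln(y).

    Assumption: non-positive values are skipped, since ln(0) is undefined.
    """
    points = [(rank, math.log(y)) for rank, y in enumerate(ranked_values) if y > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

def true_zero(values, population):
    """Smoothed production of the last owned player in one category."""
    ranked = sorted(values, reverse=True)[:population]
    a, b = fit_exponential_decay(ranked)
    return a * math.exp(-b * (population - 1))

# Synthetic category that decays exactly exponentially across 156 players.
sample = [30.0 * math.exp(-0.01 * rank) for rank in range(156)]
t0 = true_zero(sample, population=156)
print(round(t0, 2))
```

On real stats the smoothed line will not match every player, which is exactly the point: t0 is read off the line, not off the single worst player.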

These plots are FPE for the best 156 players ranked using all of the categories except turnovers. Unfortunately, the way my spreadsheet is set up this is much easier than the actual stats and I'd have to tinker with all of my numbers to get these screenshots to reflect production instead of the equivalent FPE. The important thing to notice is how the smoothed lines match up with production (or don't!) and the general shape of the plots. OK, maybe it's not important, but I thought it was interesting to see the shapes of the stats.

To calculate the weight of each category, I add up all the production of the top players (the population depends on league size: with 10 teams of 13 players, your population is 130), then subtract (number of players in the league)*(minimum expected production), or Population*t0. Then I assign coefficients for each category so they are weighted equally above true zero.
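As a rough sketch of that calculation, here it is in Python with points pinned at 1.00. The helper names and the four-player toy league are made up for illustration, and I am assuming the owned players have already been picked.

```python
def production_above_true_zero(owned_values, t0):
    """Total production of the owned players minus Population * t0."""
    return sum(owned_values) - len(owned_values) * t0

def relative_coefficients(owned_stats, true_zeros, base="PTS"):
    """Scale every category so its production above t0 matches the base category."""
    surplus = {
        category: production_above_true_zero(values, true_zeros[category])
        for category, values in owned_stats.items()
    }
    return {category: surplus[base] / s for category, s in surplus.items()}

# Hypothetical four-player league, for illustration only.
toy_stats = {
    "PTS": [20.0, 15.0, 12.0, 9.0],
    "BLK": [2.0, 1.0, 0.5, 0.5],
}
toy_true_zeros = {"PTS": 8.0, "BLK": 0.0}

coeffs = relative_coefficients(toy_stats, toy_true_zeros)
print(coeffs)
```

In this toy league there are 24 points above true zero and 4 blocks above true zero, so a block is weighted as 6 points.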

The t0 is only used to develop the coefficients for each category, not in ranking players. This means I don't subtract the t0 value from production individually when I rank players - it would have no effect. If you took that value away from everyone, it would be like giving every NBA team 30 points to start the game. It changes the total score at the end, but the winner is still the person who scored the most during the game. All players start at zero and get the same credit for each point, steal, block, FGOP, or assist as every other player.
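A quick way to convince yourself of this is to rank a few players both ways. The sketch below uses the points and rebounds coefficients and t0 values from my table; the three players and their stat lines are invented for the example.

```python
# Coefficients and true zeros for two categories, taken from the table in this post.
coefficients = {"PTS": 1.57, "REB": 2.98}
true_zeros = {"PTS": 8.17, "REB": 2.17}

# Hypothetical players, for illustration only.
players = {
    "Player A": {"PTS": 22.0, "REB": 4.0},
    "Player B": {"PTS": 12.0, "REB": 11.0},
    "Player C": {"PTS": 16.0, "REB": 6.0},
}

def fpe(stats, subtract_t0=False):
    """Sum of Coefficient * Production, optionally measured from t0 instead of 0."""
    total = 0.0
    for category, coeff in coefficients.items():
        baseline = true_zeros[category] if subtract_t0 else 0.0
        total += coeff * (stats[category] - baseline)
    return total

rank_raw = sorted(players, key=lambda name: fpe(players[name]), reverse=True)
rank_shifted = sorted(players, key=lambda name: fpe(players[name], True), reverse=True)
print(rank_raw, rank_shifted)
```

Subtracting t0 lowers every player's total by the same constant, so the two orderings come out identical.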

I also do not find a t0 for the percentage categories, Field Goal Percentage and Free Throw Percentage, since the true zeros for FGOP and FTOP are actually 0: that is what an empty spot on your roster scores, and what an average shooter scores.

Finishing the System

When developing these numbers, the coefficients for the traditional counting categories stay pretty constant (points, rebounds, steals, assists, etc) no matter what time period you analyze over, but the percentages vary wildly. After some panic, I realized this is due to it being common for players to get in shooting streaks, so high values of FGOP and FTOP happen in short time periods compared to points scored. However, these values tend to flatten out in longer time periods. This means high and low shooting percentages are more rare if you look at season-long stats, and therefore FGOP and FTOP are "worth more" compared to points scored. For roto leagues that compare statistics over the course of the entire season, we should use the larger coefficients, but for more common weekly leagues, we should use the smaller coefficients.

It would simplify the numbers to use a flat 1.00 for points scored. I have decided, instead, to normalize the numbers to have a total of 10 FPE above the t0 value for points. So in my system, the average player would get 10 FPE in every category above the true zero. This makes total scores more consistent across different leagues, but is wholly unnecessary for analysis - using 1.00 would work identically; to convert my numbers listed below simply divide all the coefficients by the points coefficient. The end result is that the average score, for each category and no matter what your league size or settings are, is 10 FPE above t0.
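To make the normalization concrete, here is a small sketch. The function name and the four-player toy numbers are assumptions for illustration; the only thing taken from my system is the target of 10 FPE above t0 for the average player.

```python
TARGET_FPE = 10.0  # average FPE above t0 in every category

def normalized_coefficient(owned_values, t0, target=TARGET_FPE):
    """Coefficient that gives the average owned player `target` FPE above t0."""
    population = len(owned_values)
    surplus = sum(owned_values) - population * t0
    return target * population / surplus

# Hypothetical four-player league, for illustration only.
toy_points = [20.0, 15.0, 12.0, 9.0]
toy_t0 = 8.0

coeff = normalized_coefficient(toy_points, toy_t0)
average_points = sum(toy_points) / len(toy_points)
average_fpe_above_t0 = coeff * (average_points - toy_t0)

# Total FPE in the category, computed directly and via (10 + t0*coeff) * population.
total_fpe = coeff * sum(toy_points)
total_fpe_from_identity = (TARGET_FPE + toy_t0 * coeff) * len(toy_points)
print(coeff, average_fpe_above_t0, total_fpe, total_fpe_from_identity)
```

Dividing every coefficient by the points coefficient brings you back to the points = 1.00 scale without changing any rankings.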

So, How Do I Use This?

Now, for the numbers! The numbers below are for a league size of 156 (12 teams of 13 players) and 8 categories - Points, 3PM, FG%, FT%, Rebounds, Assists, Steals, and Blocks. I then computed the value of a Turnover, but the players were not ranked using turnovers. The process for creating these is the same as I went over in Part II, but using the t0 values.

Category                Stat   Coefficient   True Zero
Field Goal Percentage   FGOP   12.30         0
Free Throw Percentage   FTOP   31.30         0
Three Pointers Made     3PTM   10.25         0.10
Points Scored           PTS    1.57          8.17
Total Rebounds          REB    2.98          2.17

Some notes on these numbers:

1. The t0 values listed above are actually stats, not Fantasy Point Equivalents.

2. FGOP over the season has a coefficient of 23.1, FTOP has 51.0. These numbers I used here are over the past 7 days, which we are assuming are average values (these are a little low, but not very far away).

3. An FTOP coefficient of over 31 does seem really high, but imagine how hard it is to make an entire free throw more than the average percentage (78.8% so far in this example league) would give you, per game. You would have to average 5/5 from the line, or 9/10, for a full FTOP. It is just as helpful to your fantasy team to have a player score 31.3 FPE in FTOP or in Points - which is going 9/10 from the line or scoring 20 points (31.3/1.57=20). That sounds about right to me.

4. Note that some plays help you in multiple categories - making shots (free throws or regular) helps you in points and percentages. Some leagues have FTA or OREB as categories, which help in multiple categories as well.

5. I use Yahoo's in-game average stats, which are rounded to one decimal place, so .245 blocks is calculated the same as .155 blocks: both show up as .2. These errors will average out almost all of the time, so I haven't tweaked my spreadsheet to fix this.

6. Due to normalizing for 10 FPE over t0, the total amount of FPE in every category will equal (10+t0*coeff)*Players_Owned.

7. Blocks are much more rare than steals, but since they are so much more spread out, they are nearly equal in true rarity. It is approximately the same to average 1 block as it is to average 1.5 steals, for fantasy valuation purposes.
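The arithmetic in note 3 can be checked with a few lines of Python. This is only a sketch: the function name is made up, and it assumes that "over percentage" means makes minus the makes an average shooter would get on the same attempts, which is how I describe it in that note.

```python
LEAGUE_FT_PCT = 0.788  # average free throw percentage from note 3
COEFFICIENTS = {"FTOP": 31.30, "PTS": 1.57}  # from the table above

def over_percentage(made, attempted, league_pct):
    """Makes beyond what a league-average shooter gets on the same attempts."""
    return made - attempted * league_pct

ftop_5_of_5 = over_percentage(5, 5, LEAGUE_FT_PCT)    # roughly 1.06
ftop_9_of_10 = over_percentage(9, 10, LEAGUE_FT_PCT)  # roughly 1.12

# Points needed to match the FPE of one full FTOP.
points_for_one_ftop = COEFFICIENTS["FTOP"] / COEFFICIENTS["PTS"]  # just under 20

print(ftop_5_of_5, ftop_9_of_10, points_for_one_ftop)
```

Both shooting lines come out at a little over one full FTOP, and one FTOP is worth a shade under 20 points.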

Using These Results

To close, I've created a WolframAlpha widget to calculate player values using these coefficients. Disclaimers: leagues of a different size or with different settings would need different numbers to be perfectly accurate. For leagues not using Turnovers, enter 0. For leagues using categories not listed here, or different sizes, stay tuned: I will eventually be posting results for all the categories I've heard of!