Fantasy Football: Optimizing Permutations of Teams Based on Statistics

Derick Owens and Amanda Zimecki
Gene Tagliarini, Professor

Abstract

The idea of fantasy football has been around since the early 1960s and, with the invention of the internet, has become one of the biggest industries in sports. Millions of people participate in the game, making it extremely competitive. Many, if not most, of those participating intend to make money, as it is a form of gambling. And, of course, in any form of gambling, the odds are against the player. We have set out to design an algorithm that increases the player's odds of winning in a week-by-week fantasy draft on DraftKings. We will discuss the type of algorithm we will be using and how it will increase the player's odds of winning.

Key Words: Gambling, gaming, fantasy football, algorithms, computer science, programming, neural network, knapsack problem.

1. Introduction

Gambling is a multi-billion dollar industry with an extremely large demographic. Its impact on society ranges from what can be considered positive to what can be considered negative. Gambling has an impact on the economy and various parts of society. Its addictive nature can ruin people's lives. That is part of the reason why we are developing an algorithm: to make it easier. We want to level the playing field for players of fantasy football.

We will be combining a neural network with a knapsack implementation. DraftKings.com uses a $50,000 salary cap to draft a nine-player team. The team's positions consist of:

1 Quarterback (QB)
2 Running Backs (RB)
3 Wide Receivers (WR)
1 Defense/Special Teams (D/ST)
1 Tight End (TE)
1 Flex (RB, WR, or TE)

Given that there are 32 starting players per position (since there are 32 teams), that yields 32 (QB) * 32 (RB) * 31 (RB) * 32 (WR) * 31 (WR) * 30 (WR) * 32 (TE) * 32 (D/ST) * (30 [RB] + 29 [WR] + 31 [TE]) (Flex) permutations, or 87,063,684,710,400 possible permutations of teams.

We will be using a neural network to give each player a rating based on 2014's statistics imported from fftoday.com. We will then retrieve the player's dollar value assigned by DraftKings and add it as an attribute or statistic to each player. This value will assist us in a knapsack implementation of the problem, as DraftKings gives each participant a $50,000 salary cap. We will also create a quantity referred to as "true value," which will be determined by multiplying the player's dollar value by our rating. The produced rating will always be less than 1, except for those cases in which the player is elite. The true value, when summed for a complete and drafted team, will therefore be less than or equal to the salary cap except in rare cases.

But how will we produce these permutations of teams? We will be using a greedy implementation of the knapsack problem to produce possible permutations of teams. The closer to the salary cap of $50,000, theoretically, the better. However, a permutation's summed dollar value may be very close to the salary cap while its true value is much lower, since each player's dollar value is multiplied by that same player's rating, produced by the neural network, to obtain the true value. In order to draft the theoretically "best" team, the user would use the permutation with the highest summed true value.

2. Neural Network

Our neural network will create ratings for each position: QB, RB, WR, TE, and D/ST. We will be using a different rating system for each position, as each position has different statistics that determine the quality of the player. As discussed in the introduction, the statistics will be the most recent available statistics in 2014 and will be retrieved from fftoday.com. Once retrieved, each statistic must be converted into a rating for the player. Ratings will be calculated in the following manner:

Yards per season, TDs per season, Games Played, Touches, Targets:
• R = (S_p / (ΣS_q / |q|)) * w, where:
  • S = statistic
  • p = individual player
  • w = weight
  • q = all players (|q| is the number of players)
  • R = individual player rating

Turnovers:
• R = (1 - (T * 0.05)) * w, where:
  • R = individual player rating
  • T = turnovers
  • w = weight

Note: It may seem that .05 is a "magic number" in calculating the turnover rating, but the concept is quite simple. A player with 0 turnovers in a season gets a perfect score of 1 for the turnovers rating (before weighting). A player with 20 turnovers gets a score of 0. It is possible for the player to receive a negative rating if the player has more than 20 turnovers during the season.

Catch Percentage:
• R = a ± 0.1 * b, where:
  • a = base score, the average of all players
  • b = number of standard deviations the player is above (+) or below (-) the position average

Each calculated statistic must then be scaled against a threshold. The threshold we decided on is the average player's statistic, multiplied by two, and the scaled score is calculated by the following:
• T_p = S_p / [(ΣS_q / |q|) * 2], where:
  • T = threshold-scaled score
  • S = statistic (being calculated)
  • q = all players (|q| is the number of players)
  • p = individual player

Note that a player who excels in a certain statistic can earn a score greater than the "soft" maximum of 1. This happens when he exceeds the threshold of twice the average player. Therefore, if the player excels in such a manner in enough of the statistics, he can be defined as an elite player and a rating greater than 1 is computed.

2.1. Neural Network Implementation

Our neural network implementation will be using weights produced, at this point, by the software developer's football knowledge. While these appear to be "magic numbers," unless there is some form of back propagation, we have no choice but to use "magic numbers." There are some issues with the possibility of an implementation using back propagation.
The training signal is limited to a single comparison: the points actually produced by the permutation (which are, of course, known only after the games occur) versus the best possible permutation (which we have roughly a 1 in 87 trillion chance of producing). This means that implementing a weight-learning algorithm would be difficult, or even impossible, considering there is one unique permutation to one unique result. There are also too many unaccounted factors to possibly produce an algorithm that learns weights. We will define such factors as "football knowledge." Football knowledge refers to recognizing external factors in football: tracking injury reports, coaching history between two opponents (how well do the coaches know one another?), a player's or team's performance in certain weather conditions, a player's or team's performance in certain types of venues such as turf or natural grass fields, etc. This list could go on for quite a long time, and it would be extremely strenuous to program football knowledge, let alone set weights that would operate on an efficient level for those factors. Therefore, it is better, in our opinion, to focus on the hard statistics and optimize the permutations for the user, so the user can use their football knowledge to make the best choice out of those optimized permutations produced by our algorithm.

2.2.1. QB Rating Implementation

Quarterbacks will be given their rating based on the following statistics and respective weights:

STATISTIC                                  WEIGHT
Total Yards (Passing + Rushing)            .20
Total Touchdowns (Passing + Rushing)       .25
Games Played                               .35
Turnovers                                  .15
Completion %                               .05

2.2.2. RB Rating Implementation

Running backs will be given their rating based on the following statistics and respective weights:

STATISTIC                                  WEIGHT
Total Yards                                .20
Total Touchdowns (Receiving + Rushing)     .20
Games Played                               .25
Touches                                    .15
Targets                                    .10
Turnovers                                  .10

2.2.3. WR Rating Implementation

Wide Receivers will be given their rating based on the following statistics and respective weights:

STATISTIC                                  WEIGHT
Total Yards                                .20
Total Touchdowns (Receiving + Rushing)     .20
Games Played                               .25
Turnovers                                  .10
Targets                                    .15
Catch %                                    .10

2.2.4. TE Rating Implementation

Tight Ends will be given their rating based on the following statistics and respective weights:

STATISTIC                                  WEIGHT
Total Yards                                .15
Total Touchdowns (Receiving + Rushing)     .25
Games Played                               .25
Turnovers                                  .10
Targets                                    .15
Catch %                                    .10

2.2.5. D/ST Rating Implementation

Defense/Special Teams will be given their rating based on the following statistics and respective weights:

STATISTIC                                  WEIGHT
Turnovers Forced                           .20
Points Allowed                             .20
Total Touchdowns                           .25
Sacks                                      .10
Yards Allowed                              .15

3.1. Knapsack Implementation

Our knapsack implementation has a maximum value of $50,000, which is the salary cap set by DraftKings. It is extremely unlikely that we "fill" the knapsack, but that is not the objective. Our knapsack, although extremely important, simply gives us our dollar value permutations. One may believe that the "best" permutation is one that completely fills the knapsack. While that is ideal, it does not necessarily improve the odds of winning. You could be filling the knapsack with dirt rather than gold. How do we make it more likely to be picking up gold rather than dirt?

Once we have the possible permutations produced, we will multiply each player's dollar value by the player's rating, producing that player's true value, and sum those to produce the team's true value:

• V = d * R, where:
  • V = true value
  • d = dollar value
  • R = our produced rating
• V_T = Σ(V_p), where:
  • V = true value
  • T = team
  • p = player

Note: There is a possibility that, if a team consists of enough elite players, the sum of the true values exceeds the salary cap of $50,000.
This, however, is extremely unlikely, as DraftKings sets its dollar values for the players according to their performance, and the elite players are set at a much higher dollar value than non-elite players.

Once we have generated a true value for each permutation, we can then display to the user the top permutations available to choose from. There will surely be thousands and thousands of permutations available to the user, some of which have the same true value. The user will then have to use their football knowledge to make the best selection out of the permutations we produced with our algorithm.

4. Algorithm Complexity

The complexity of the entire algorithm is at least O(n), where n is the number of players. The complexity of the neural network is O(n), since the purpose of the neural network is to compute the rating value of each player. In the "black box" of the neural network, it is simply computing arithmetic in order to produce the rating.

Once we have ratings produced, it is then time to organize those player ratings into permutations. We will use the greedy implementation of the knapsack problem to get the permutations that are as close to the salary cap as possible. The complexity of this step depends highly upon the implementation we use. If a brute force algorithm is used, the complexity will be O(2^n), where n is the number of players. If we could produce an implementation of the knapsack problem using dynamic programming, however, the complexity could be held to roughly O(nW), where W is the salary cap; because W is fixed at $50,000, this grows linearly in n.

Brute force is a very straightforward approach and would be an exhaustive search, where we would create all permutations of teams and produce the optimal solution(s). Dynamic programming, however, works by breaking this problem down into smaller problems and using those smaller problems to construct possible permutations in a bottom-up format.
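The bottom-up idea described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not our implementation: the function name `best_lineup`, the `Candidate` tuple, and all salaries and ratings are hypothetical, each roster slot is assumed to have its own candidate list, and no player is assumed to appear in more than one slot. Partial lineups are extended one slot at a time, and for every amount of salary spent only the partial lineup with the highest summed true value is kept.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical candidate record: (name, salary in dollars, rating R)
Candidate = Tuple[str, int, float]


def best_lineup(slots: List[List[Candidate]], cap: int) -> Optional[Tuple[float, List[str]]]:
    """Pick one candidate per slot, maximizing summed true value (salary * rating)
    without exceeding the cap. Assumes no player appears in more than one slot."""
    # best[spent] = (summed true value, chosen names) for the best partial
    # lineup that has spent exactly `spent` dollars so far.
    best: Dict[int, Tuple[float, List[str]]] = {0: (0.0, [])}
    for candidates in slots:  # extend partial lineups one slot at a time
        extended: Dict[int, Tuple[float, List[str]]] = {}
        for spent, (value, names) in best.items():
            for name, salary, rating in candidates:
                total = spent + salary
                if total > cap:  # over the salary cap: discard
                    continue
                new_value = value + salary * rating  # V = d * R
                if total not in extended or new_value > extended[total][0]:
                    extended[total] = (new_value, names + [name])
        best = extended
    if not best:
        return None  # no complete lineup fits under the cap
    return max(best.values(), key=lambda entry: entry[0])


# Made-up two-slot example; salaries and ratings are illustrative only.
example_slots = [
    [("QB_A", 7000, 0.75), ("QB_B", 5000, 1.0)],
    [("RB_A", 6000, 1.25), ("RB_B", 4000, 0.5)],
]
result = best_lineup(example_slots, cap=11000)
print(result)
```

The work per slot is bounded by the number of distinct salary totals times the number of candidates, so if salaries fall on a coarse grid (for example, multiples of $100, which is an assumption here) the table stays small even with a $50,000 cap.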
During the bottom-up construction, if a partial permutation with one or more missing players cannot reach a given threshold X even when a missing position is filled with that position's maximum-value player, we can throw out the entire permutation and eliminate that position from consideration for it, then try again with another position. This method could reduce the complexity dramatically, since it crosses out a "column" of position players and a "row" of an entire permutation at once. Furthermore, this process can be repeated until the threshold X is met by all remaining permutations, so what is left over is the optimal solution set.

We could further reduce the computational complexity by adding a GUI with checkboxes beside the name of each player. If the user knows exactly who they want in a certain position, they could check the box of that player. Since the knapsack would be "partially filled" with that player, that position would no longer need to be filled in the knapsack. Given that there are 32 starters for each position, the complexity would be reduced by a factor of up to 2^32, all with a simple Boolean value and user input.

5. Rating Results

Our rating results matched very well with those of the experts on ESPN.com. We also noticed slight variations, which is what we were looking for. These discrepancies are what make our player values unique in comparison to those of professional analysts. Figure 1 provides a visual representation of our highest valued Running Backs compared to those of the sport's experts. The same kind of graph was produced for all of the other positions, giving us a visual model of our values' uniqueness. We are very pleased with our results, and we look forward to implementing a GUI and making future enhancements to the algorithm.

Figure 1
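For reference, the per-statistic formulas from Section 2 and the true value from Section 3.1 can be written out as a short sketch. The function names and the sample yardage figures are hypothetical and chosen only for illustration. How the individual scores are combined into a single position rating is not shown, so each formula is kept as a separate function rather than assuming one.

```python
from statistics import mean
from typing import Sequence


def stat_rating(player_stat: float, all_stats: Sequence[float], weight: float) -> float:
    """R = (S_p / average of S over all players) * w."""
    return player_stat / mean(all_stats) * weight


def threshold_score(player_stat: float, all_stats: Sequence[float]) -> float:
    """T_p = S_p / (2 * average); exceeds 1 only above twice the average."""
    return player_stat / (2 * mean(all_stats))


def turnover_rating(turnovers: int, weight: float) -> float:
    """R = (1 - 0.05 * T) * w; 0 turnovers gives w, 20 turnovers gives 0."""
    return (1 - 0.05 * turnovers) * weight


def true_value(salary: float, rating: float) -> float:
    """V = d * R."""
    return salary * rating


# Made-up season yardage for three players at one position.
yards = [800, 1000, 1200]
yards_rating = stat_rating(1200, yards, weight=0.20)
yards_threshold = threshold_score(1200, yards)
clean_season = turnover_rating(0, weight=0.10)
worst_case = turnover_rating(20, weight=0.10)
print(yards_rating, yards_threshold, clean_season, worst_case)
```

In this example the third player is 20% above the position average, so the yards rating comes out slightly above the weight itself, while the threshold score stays below the soft maximum of 1 because the player is not at twice the average.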