Understanding Baseball Through Computer Simulation: Assessing Hitting, Stealing, and Hot Hand

Accurate analysis of baseball is difficult because so many variables are in play and multiple players affect the game simultaneously. For example, we can't truly quantify a hitter's individual contribution to an offense because scoring runs is a team effort. Analysis is further limited by sample size, which makes judgments about decisions like stealing third base highly imprecise. I can get around these barriers with a computer simulation, where I can change one variable while holding the rest constant and quickly run millions of trials. The simulation can isolate and evaluate a batter by simulating a lineup made up entirely of that one player and calculating this team's runs per 9 innings. For example, you can simulate an offense made up of 9 Mike Trouts and determine how many runs it would score every 9 innings, or simulate an offense of players who hit home runs 15% of the time and make outs the rest of the time. When tested with actual MLB teams, the simulation accurately reproduces their runs per 9 innings, and it produces 4.5434 runs per 9 innings for the 2016 MLB average, just 1% above the actual average. This simulation can help improve our understanding of a player's value as a hitter, the relative value of walks and different types of hits, the hot hand fallacy, and baserunning decisions.

To analyze hitting, the program simulates only the portion of the game where the team is on offense, and it has all baserunners act like the 2016 MLB average baserunner. The program works by randomizing each plate appearance, with the probability of each outcome taken directly from the stats entered. For example, if the user entered five doubles out of 100 plate appearances, the simulated batter would have a 5% chance of hitting a double each time he bats. It also includes the chance of errors, double plays, and sacrifice flies, which are based on league averages. The program tracks the number of outs, the runs scored, the inning, and the status of each base. So that the result is not swayed by chance, it simulates 9 million innings and then returns the number of runs scored per 9 innings.
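The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual program: the names `OUTCOMES`, `simulate_half_inning`, and `runs_per_nine` are hypothetical, and the baserunning is a deliberate simplification (every runner advances exactly as many bases as the batter, and only forced runners move on a walk). It leaves out errors, double plays, sacrifice flies, and league-average baserunning, so its output should not be expected to match the figures quoted in this article.

```python
import random

# Outcome labels used by this sketch (hypothetical names, not from the original program).
OUTCOMES = ("out", "walk", "single", "double", "triple", "home_run")
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def simulate_half_inning(probs, rng):
    """Simulate one half-inning for a lineup of identical hitters; return runs scored.

    probs maps outcome name -> probability per plate appearance.
    """
    weights = [probs.get(outcome, 0.0) for outcome in OUTCOMES]
    outs, runs = 0, 0
    bases = [False, False, False]  # first, second, third
    while outs < 3:
        outcome = rng.choices(OUTCOMES, weights=weights)[0]
        if outcome == "out":
            outs += 1
        elif outcome == "walk":
            # Only forced runners advance on a walk.
            if bases[0]:
                if bases[1]:
                    if bases[2]:
                        runs += 1  # bases loaded: runner on third is forced home
                    bases[2] = True
                bases[1] = True
            bases[0] = True
        else:
            # ASSUMPTION: each runner advances as many bases as the batter does.
            gained = BASES_GAINED[outcome]
            new_bases = [False, False, False]
            for i, occupied in enumerate(bases):
                if occupied:
                    if i + gained >= 3:
                        runs += 1
                    else:
                        new_bases[i + gained] = True
            if gained == 4:
                runs += 1  # the batter scores on a home run
            else:
                new_bases[gained - 1] = True
            bases = new_bases
    return runs


def runs_per_nine(probs, innings=100_000, seed=0):
    """Average runs scored per 9 innings over many simulated innings."""
    rng = random.Random(seed)
    total = sum(simulate_half_inning(probs, rng) for _ in range(innings))
    return 9 * total / innings


if __name__ == "__main__":
    # The hypothetical hitter from the introduction: home run 15% of the time, out otherwise.
    slugger = {"out": 0.85, "home_run": 0.15}
    print(runs_per_nine(slugger, innings=20_000))
```

Fixing the seed makes a run repeatable, and raising `innings` narrows the random variation, which is the reason the real program simulates millions of innings.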

This simulation creates a unique stat for a player's value as a hitter, called Simulated Runs Per Game (SRPG). This is the number of runs scored per 9 innings in the simulation by a team made up entirely of this player. This is an effective measure of a hitter because it isolates him from other factors and converts all parts of hitting into runs scored, which is what results in wins. It can be used to compare and rank the value of MLB hitters, or of hypothetical players, such as a player who walks half the time and makes an out the other half versus a player who hits a home run 15% of the time and makes an out otherwise (the walker wins by 3.8 runs). The most common batting statistics do not accurately weigh the relative value of walks and different types of hits. Batting average and on-base percentage don't value a home run more than a single, slugging percentage values a home run four times as much as a single, and on-base plus slugging simply adds the two together. Through this simulation, we can determine the value of walks and each type of hit in terms of runs added. I did this by taking the 2016 league average, adding or subtracting one outcome, such as walks, and finding the runs per game gained for each walk added, and then doing the same for each type of hit. I then divided to find the relative value of each outcome, setting the value of a walk equal to one. The relative run value of a single is 1.226 times the value of a walk, a double is 1.713 times, a triple is 2.211 times, and a home run is 2.977 times. Weighting each outcome by these values and dividing by plate appearances gives a new stat I'll call Batting Value, defined as: (Walks + 1.226*Singles + 1.713*Doubles + 2.211*Triples + 2.977*Home Runs) / Plate Appearances. This is similar to weighted on-base average (wOBA), an advanced statistic created by sabermetrics expert Tom Tango. wOBA similarly weights outcomes by their run value, but is not based on a computer simulation.
The relative value of each type of hit to walks in wOBA is 1.29 for singles, 1.84 for doubles, 2.348 for triples, and 3.043 for home runs. These weights are very similar to those of Batting Value, but Batting Value weights walks slightly more heavily. While there are slight differences, both wOBA and Batting Value are much more accurate and comprehensive measures of a hitter's value than the commonly known statistics. Both Batting Value and Simulated Runs Per Game can be used to rank the effectiveness of hitters. The 2016 MLB average is 4.5434 for SRPG and 0.4524 for Batting Value. According to SRPG, the best hitter of the 2016 season was Mike Trout, with an SRPG of 9.712871. In terms of Batting Value, David Ortiz was the best hitter at .607, just above Trout's .603. This difference in ranking makes sense because Ortiz's power is a major advantage in the context of the MLB average, on which Batting Value is based, whereas Trout's ability to get on base is a major advantage in the context of the simulated superteam of 9 Mike Trouts.
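The Batting Value formula is simple enough to compute by hand, but a short function makes the weights explicit. The function name `batting_value` and the sample stat line are illustrative assumptions; only the weights come from the formula above.

```python
def batting_value(walks, singles, doubles, triples, home_runs, plate_appearances):
    """Batting Value: outcomes weighted relative to a walk, per plate appearance."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    weighted = (
        walks
        + 1.226 * singles
        + 1.713 * doubles
        + 2.211 * triples
        + 2.977 * home_runs
    )
    return weighted / plate_appearances


if __name__ == "__main__":
    # A made-up stat line for illustration only, not a real player.
    print(batting_value(walks=10, singles=20, doubles=5, triples=1,
                        home_runs=4, plate_appearances=100))
```

Because every weight is expressed relative to a walk, a hitter who did nothing but walk in every plate appearance would have a Batting Value of exactly 1.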

This simulation also offers insight into the idea of the hot hand. The program doesn't take a hot or cold hand into account; for example, it doesn't make a pitcher nervous after giving up hits so that he throws less effectively for the rest of the inning. The entire program is random, with no such momentum effect, yet the simulation still accurately reproduces runs per game for MLB teams. This suggests that there is no real hot hand for offenses or pitchers, because if there were, hits would be clustered in certain innings more than they are in the simulation, resulting in more runs per game than the simulation predicts. This supports the idea that the hot hand is a fallacy: streaks are misinterpreted as the result of a hot hand when they are just one possible random outcome, like getting heads three times in a row on a coin toss.
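The coin-toss point can be checked directly: even when every at-bat is independent, long streaks show up by chance alone. The sketch below is a separate illustration rather than part of the simulation program, and the function names and the example hit probability are assumptions chosen for the demonstration.

```python
import random


def longest_streak(results):
    """Length of the longest run of consecutive True values (hits)."""
    best = current = 0
    for hit in results:
        current = current + 1 if hit else 0
        best = max(best, current)
    return best


def average_longest_streak(hit_prob, at_bats, trials, seed=0):
    """Average longest hitting streak when every at-bat is independent."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        results = [rng.random() < hit_prob for _ in range(at_bats)]
        total += longest_streak(results)
    return total / trials


if __name__ == "__main__":
    # Hypothetical hitter: a 30% chance of a hit in each of 500 independent at-bats.
    print(average_longest_streak(hit_prob=0.30, at_bats=500, trials=2_000))
```

Any streak this prints arises with no hot hand built in, which is the sense in which a streak alone is not evidence of one.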

When the simulation is used to evaluate hitting, all baserunning is simulated at the 2016 MLB average, so, for example, runners score from second on a single about 60% of the time. However, an additional feature of the program lets the user enter success rates for stealing second and third, and then adds steal attempts to the simulation accordingly. This can be used to determine what success rate a base stealer needs in order to add to the team's runs per game. It is found by trying different success rates until the breakeven rate is reached. For stealing second with all other bases empty, the breakeven rate is 76.5%. When broken down by the number of outs, the breakeven rates are 79.5% with 0 outs, 74.4% with 1 out, and 69.5% with 2 outs. For stealing third with all other bases empty, the overall breakeven rate is 77%, with 76% for 0 outs, 74% for 1 out, and 84% for 2 outs. This confirms the conventional wisdom that the best time to steal third is with 1 out and the worst is with 2 outs. These figures can help determine whether a steal attempt is a good idea by checking whether the runner's success rate is higher than the situational breakeven rate. However, other factors still need to be considered, such as the type of hitters behind the runner, the score, and the inning.
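Trying success rates until the breakeven point appears can be automated with a bisection search. The sketch below is an assumption about how such a search could be written, not the program's actual method. `find_breakeven` and `toy_run_change` are hypothetical names, and the `gain` and `loss` values in the toy function are placeholders chosen only to make the example run, not MLB run values. In real use, `run_change` would call the simulation twice (with and without stealing) and return the difference in runs per game.

```python
def find_breakeven(run_change, lo=0.0, hi=1.0, tol=1e-6):
    """Find the success rate at which stealing neither adds nor costs runs.

    run_change(rate) returns runs per game with stealing at that success rate
    minus runs per game without stealing; it is assumed to increase with rate.
    """
    if run_change(lo) > 0 or run_change(hi) < 0:
        raise ValueError("no breakeven rate between lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if run_change(mid) < 0:
            lo = mid  # stealing still costs runs: need a higher success rate
        else:
            hi = mid  # stealing already helps: breakeven is at or below mid
    return (lo + hi) / 2


def toy_run_change(rate, gain=0.3, loss=0.8):
    """Stand-in for the simulation. PLACEHOLDER numbers, not real run values."""
    return rate * gain - (1 - rate) * loss


if __name__ == "__main__":
    print(find_breakeven(toy_run_change))
```

With a real simulation behind `run_change`, each evaluation carries random noise, so the tolerance should be no tighter than the simulation's own precision.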

In conclusion, this program is capable of performing simulations that are impossible in real life, giving us a new way of analyzing baseball. Using it, we can isolate one variable from all the others in real baseball and test its effect on runs scored. This includes eliminating all other players and making a team entirely of one hitter, adjusting the probability of a given outcome such as a home run, and adding stolen bases to a specific situation. The simulation can accurately reproduce the runs per 9 innings for MLB teams and for the 2016 MLB average, which provides evidence against the hot hand.
