Modeling MLB Wins - Stat 410 Final Project 

In the modern era of baseball analytics, understanding the driving forces behind team success necessitates a multifaceted approach that considers performance across the three facets of the sport: batting, pitching, and defense. This project aims to develop predictive models for Major League Baseball (MLB) team wins using data from the 2022 and 2023 seasons, with the goal of predicting 2024 wins by applying our model to the 2024 data. Leveraging a combination of linear regression and principal component analysis (PCA), we construct composite scores that reflect team-level performance in each major phase of the game.

Presently, the majority of baseball discourse focuses on new-age analytics, and we are very aware of the credibility that modern analytics and metrics possess, especially when evaluating player talent and performance. However, we aim to determine whether modern analytical measures are sufficient in predicting team performance. After all, individual player-by-player outcomes are only so valuable; ultimately, there is no better way to quantify team success than by win totals, which lead to playoff success and a potential World Series title. 

Before exploring the data, we conducted brief background research to supplement our existing understanding of baseball analytics. We read “Baseball Analytics” from the Catapult website to gain a better understanding of the advanced analytics currently being used and their applications. One of their primary functions is to conduct proper player evaluation. We also read “Stats to Avoid: Batting Average” by Neil Weinberg on FanGraphs to learn more about why the batting average statistic has fallen out of favor. The answer is that batting average only tells us how good a player is at getting on base via a base hit, but it leaves out important context such as how many bases they totaled, where the defense was positioned, and much more. Other metrics are preferred because they provide more information and context. With this additional background research, we are confident that we have a well-defined problem to solve, as we are testing whether advanced analytics are indeed effective in predicting team performance.

Our analysis began with team-level data scraped from FanGraphs. For each team across the 2022 and 2023 seasons, we compiled separate datasets for batting, pitching, and defense, choosing features that are both widely accepted as predictive of team performance and rich in underlying baseball insight. In the batting model, we prioritized metrics that reflect plate discipline, batted ball quality, and overall offensive approach. These included variables such as BB% (walk rate), K% (strikeout rate), Hard Hit%, Launch Angle, O-Swing%, and Z-Swing%, among others. These features were selected for their ability to quantify how often and how well teams make contact, as well as their tendencies in the strike zone. We initially included Contact% and Barrel%, but due to issues with multicollinearity, we decided to remove both.

The graph above displays our selected features, which provide a balance of characteristics that are not overly collinear and offer unique perspectives on our research topic that we wish to explore.

For the pitching model, we included stats that measure both outcome-based and process-based performance, such as F-Strike% (first-pitch strike rate), CSW% (Called Strike + Whiff %), LOB% (left on base %), Hard Hit%, Launch Angle, and traditional indicators like K/9, HR/9, and BB/9. We avoided features like ERA and FIP, which summarize overall performance and track win outcomes too closely, to prevent multicollinearity. We also initially included GB% (ground ball %) and Contact%, but we found that these variables were highly collinear with some of our other variables, so we removed them.
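The text does not say how collinearity was assessed, so the following is only a minimal sketch of one common screen: compute pairwise Pearson correlations and flag any pair above a cutoff. The 0.8 threshold, the function names, and every number in the example are assumptions made for illustration, not values from the project data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def collinear_pairs(features, threshold=0.8):
    """Return (name_a, name_b, r) for every feature pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(features), 2):
        r = pearson(features[a], features[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical team-level values, invented purely for illustration.
pitching = {
    "K/9": [8.1, 9.4, 8.8, 9.9, 7.6, 9.0],
    "Contact%": [79.0, 76.9, 77.8, 76.0, 79.7, 77.5],
    "BB/9": [3.4, 2.9, 3.3, 3.0, 3.1, 3.6],
}

# With these made-up numbers, only the Contact% / K/9 pair is flagged.
print(collinear_pairs(pitching))
```

A pairwise screen like this can miss collinearity that involves three or more variables at once; variance inflation factors are a common complement in that case.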

To synthesize these variables into a single interpretable metric per phase of the game, we constructed a “score” for batting, pitching, and defense using linear regression coefficients as weights. Each score was generated by training a model to predict team wins using only the respective domain’s features (e.g., batting-only model for batting score), then multiplying the standardized features by the model coefficients to produce a composite. We scaled and rescaled these scores to make them comparable across teams and years.

This allowed us to evaluate the contribution of each component to team success and build a final model that predicted team wins using only these three scores, effectively creating a modular and interpretable system for quantifying overall team quality.
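The score construction described above can be sketched as follows. This is a dependency-free illustration under stated assumptions, not our exact pipeline: the helper names (`standardize`, `solve`, `fit_weights`, `composite_score`) are hypothetical, the data are invented, the regression is fit through the normal equations, and the final rescaling step is omitted because its exact form is not specified here. The sketch also assumes that a new season is standardized with the training means and standard deviations.

```python
from statistics import mean, pstdev


def standardize(columns, stats=None):
    """Z-score each column; reuse `stats` (mean, sd per column) when given."""
    if stats is None:
        stats = {name: (mean(v), pstdev(v)) for name, v in columns.items()}
    z = {name: [(x - stats[name][0]) / stats[name][1] for x in v]
         for name, v in columns.items()}
    return z, stats


def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_weights(z, wins):
    """OLS of wins on standardized features; one coefficient per feature."""
    names = list(z)
    win_mean = mean(wins)
    centered = [w - win_mean for w in wins]
    # No intercept column is needed because features and wins are centered.
    xtx = [[sum(p * q for p, q in zip(z[a], z[b])) for b in names] for a in names]
    xty = [sum(p * w for p, w in zip(z[a], centered)) for a in names]
    return dict(zip(names, solve(xtx, xty)))


def composite_score(z, weights):
    """Weighted sum of standardized features, one value per team-season."""
    n = len(next(iter(z.values())))
    return [sum(weights[name] * z[name][i] for name in weights) for i in range(n)]


# Hypothetical batting features and win totals, invented for illustration only.
batting = {
    "ISO": [0.150, 0.180, 0.165, 0.140, 0.200, 0.170],
    "BB%": [8.0, 9.5, 8.5, 7.5, 9.0, 10.0],
    "K%": [23.0, 21.0, 24.5, 22.0, 20.5, 25.0],
}
wins = [78, 95, 80, 70, 101, 84]

z_train, train_stats = standardize(batting)
weights = fit_weights(z_train, wins)
batting_score = composite_score(z_train, weights)

# Scoring a new season reuses the training means and standard deviations.
new_season = {"ISO": [0.175], "BB%": [9.2], "K%": [22.5]}
z_new, _ = standardize(new_season, train_stats)
new_score = composite_score(z_new, weights)
```

Because the features are standardized before fitting, the coefficients are directly comparable in magnitude, which is what makes them usable as weights. The same three steps would be repeated for the pitching and defense features.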

Batting Score

This chart illustrates the standardized coefficients from our batting model, revealing the relative importance of each offensive metric in predicting team wins. ISO (Isolated Power) and BB% (Walk Rate) emerge as the most positively influential variables, underscoring the value of power hitting and plate discipline in driving offensive success. Conversely, K% (Strikeout Rate) has the largest negative impact, suggesting that teams with high strikeout rates are at a significant disadvantage. Metrics like Hard Hit% and O-Swing% also contribute positively, indicating that quality contact and a disciplined approach at the plate are beneficial. Overall, the results highlight the importance of an efficient, power-oriented offense with a reduced number of strikeouts.


This table highlights the top 10 team batting scores from the 2022 and 2023 MLB seasons, showing a clear correlation between offensive output and win totals. The 2023 Braves top the list with a massive batting score of 278 and 104 wins, exemplifying how elite offense drives team success. The Dodgers appear twice, and the 2022 Yankees, boosted by Aaron Judge’s 62-home-run season, also rank highly. World Series winners, such as the 2022 Astros and 2023 Rangers, further support this connection. However, a few outliers emerge: the 2023 Padres ranked 7th in batting score but managed only 82 wins, hinting at weaknesses in pitching, defense, or clutch performance. Similarly, the 2023 Cardinals posted a top-10 batting score yet won just 71 games, likely due to broader team deficiencies. These exceptions underscore that while offense is crucial, balanced performance across all phases is key to sustained success.

Pitching Score

The chart above illustrates which pitching metrics have the most significant influence on team wins, based on standardized coefficients. LOB% is the top positive predictor, highlighting the importance of stranding baserunners before they become runs. BB/9 and HR/9 are strongly negative, emphasizing the need to avoid walks and limit home runs. K/9 stands out as a key positive factor, reflecting the value of strikeouts. Other stats like Hard Hit% and F-Strike% contribute modestly. Overall, the model indicates that pitching success is largely determined by command, strikeout ability, and limiting damage.


This table showcases the top 10 team pitching scores from the 2022 and 2023 MLB seasons, and overall, it supports a strong correlation between elite pitching performance and high win totals. The 2022 Dodgers, who posted an MLB-best 111 wins, top the list with a dominant pitching score of 206. They are followed closely by the 2022 Mets and 2022 Astros, both of whom also exceeded 100 wins and featured deep, efficient rotations and bullpens, underscoring the value of run prevention.

Several teams in the top 10 had strong win totals that align with their pitching strength, such as the 2022 Braves, 2022 Yankees, and 2022 Blue Jays, all of whom made the postseason. Notably, the 2023 Twins rank 4th in pitching score despite a more modest 87 wins, reflecting a well-pitched but perhaps offensively inconsistent team. Similarly, the 2023 Mariners and 2023 Blue Jays also appear with strong pitching metrics and respectable win totals, but may have been hindered by inconsistent hitting or poor situational play.

Defense Score

The chart above displays the standardized coefficients from the defense regression model, showing which components of team defense most strongly correlate with winning. FRM (Framing Runs) stands out as the most impactful feature, suggesting that catcher framing plays a significant role in run prevention and, consequently, team success. RngR (Range Runs) follows closely, highlighting the importance of defensive range in converting batted balls into outs. ErrR (Error Runs) also has a positive impact, indicating that teams minimizing errors gain a measurable advantage. Meanwhile, DPR (Double Play Runs) and ARM (Outfield Arm Runs) contribute less to the model, suggesting their effects are more situational or less consistent across teams.

Upon examining our results, we immediately noticed a weaker correlation between win totals and our defense score compared to our pitching and batting scores. The 2023 Pirates and 2022 Diamondbacks are examples of that, with just 76 and 74 wins, respectively. This lends support to the idea that good defense, in most cases, cannot compensate for bad hitting and bad pitching.

2024 Predictions

To predict 2024 wins, we combined all three of our performance metrics—batting, pitching, and defense—by applying the trained model coefficients to the corresponding 2024 features. 

Examining the coefficients for each of our scores from the linear model, we observe that pitching is the most indicative of team success. Naturally, teams that can consistently prevent runs throughout the course of the season are most likely to succeed. On the other hand, we find that defense is considerably less predictive of team success: its 95% confidence interval contains zero, so its coefficient is not statistically significant at the 5% level.
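For reference, the interval check mentioned above takes the standard form for a regression coefficient. The notation is generic rather than taken from our model output: n is the number of team-seasons used to fit the model and p is the number of predictors (here, the three scores).

```latex
\hat{\beta}_j \;\pm\; t_{0.975,\,n-p-1}\,\operatorname{SE}\!\left(\hat{\beta}_j\right)
```

If this interval contains zero, the null hypothesis that the coefficient equals zero cannot be rejected at the 5% level.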

When we plotted the actual 2024 win totals against our predicted 2024 totals, we saw that our model performed very well, with a correlation coefficient of 0.917 between the two.

Testing

Here, we looked at our Breusch-Pagan test results, observing that none of our models – batting, pitching, defense, or win prediction – indicate evidence of heteroscedasticity. Each model displays a p-value greater than 0.05, indicating that we fail to reject the null hypothesis of constant variance in our residuals.
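To show what the test computes, here is a minimal sketch of the studentized (Koenker) form of the Breusch-Pagan test, restricted to a single predictor so that it needs only the standard library. That restriction, the function names, and the example numbers are assumptions for illustration; our models have several predictors, in which case the squared residuals are regressed on all of them and the statistic is compared against a chi-square distribution with as many degrees of freedom as there are predictors.

```python
from math import erfc, sqrt
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """R^2 of the simple regression of y on x."""
    a, b = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def breusch_pagan_1d(x, y):
    """Return (LM statistic, p-value) for a one-predictor model.

    LM = n * R^2 from regressing the squared residuals on x; under constant
    variance it is approximately chi-square with 1 degree of freedom.
    """
    a, b = fit_line(x, y)
    sq_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    lm = max(len(x) * r_squared(x, sq_resid), 0.0)  # guard against rounding
    p_value = erfc(sqrt(lm / 2))  # upper tail of chi-square with 1 df
    return lm, p_value


# Hypothetical score and win values, invented for illustration only.
score = [-12, -8, -5, -1, 0, 3, 6, 9, 11, 15]
wins = [68, 75, 74, 82, 79, 85, 84, 92, 90, 97]

lm, p = breusch_pagan_1d(score, wins)
print(f"LM = {lm:.3f}, p = {p:.3f}")
```

A p-value above 0.05 here would be read the same way as in our results: no evidence against constant residual variance.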

This chart presents the Shapiro-Wilk test results for normality of residuals across the four models, all of which show p-values well above the 0.05 threshold (marked by the red dashed line). This indicates that we fail to reject the null hypothesis in each case, suggesting that the residuals are approximately normally distributed.

Summary and Discussion

This project aimed to build predictive models for Major League Baseball (MLB) team wins using data from the 2022 and 2023 seasons, and then ultimately predict 2024 team win totals by applying our model to 2024 data. The project incorporated the three core components of baseball – batting, pitching, and defense – to inform our findings and analysis most effectively. We developed batting, pitching, and defense scores as predictors to quantify overall team quality and performance and predict team win totals. 

Our batting model revealed that ISO and BB% had the greatest positive coefficients, while K% had the strongest negative coefficient. Therefore, teams that generate more power, draw more walks, and strike out less should win more games. While we are not surprised by this general premise, we are still intrigued by the specific metrics that our model identified as the strongest contributors to wins. For example, the ISO coefficient being larger than the launch angle coefficient indicates that there may be a greater advantage gained by consistently hitting for extra bases than by consistently hitting baseballs with a high launch angle. Similarly, we see that K% has a greater negative coefficient than a metric such as CSW%. One reason for this could be that outcomes such as balls in play or foul balls are not baked into CSW%. Meanwhile, K% is a reliable metric because it is sequence-agnostic; the stat does not care how a batter’s count progressed toward a strikeout, only that the batter actually struck out. For the most part, batting score and wins were positively related. There were a few outliers, but we believe that is acceptable because it likely means those teams were stronger or weaker in a different category, such as pitching or defense.

Our pitching model revealed that preventing baserunners – and, consequently, runs – is a significant contributor to team success. Interestingly, LOB% had the strongest positive coefficient, rather than a strikeout metric like K/9. Similarly, BB/9 and HR/9 had the greatest negative coefficients. From these results, we can conclude that one of the best pitching-related predictors of team success is the ability to strand runners on base. While allowing runners to reach in the first place is not ideal, our findings show that how a pitcher continues to perform after allowing runners to reach is extremely important. A higher LOB% means that runners are reaching base but not scoring at a high rate. As pitchers work into trouble but escape these jams, their teams benefit. This also explains why strong bullpens are crucial to team success, as these pitchers specialize in entering difficult situations and shutting down hitters to prevent the opposing team from scoring.

Our defensive model found that FRM and RngR were the most influential features in predicting wins. We are not surprised to see FRM, specifically, as the top defense-related indicator of success. While an infielder or outfielder may not have the ball hit to them in an individual inning, the catcher is likely to catch a large number of pitches, often more than a dozen per inning. Every time a pitcher throws a pitch, the catcher is responsible for framing that pitch to appear closer to the strike zone. If the catcher can repeatedly deceive the umpire, they can steal strikes, which in turn leads to more outs, fewer baserunners, and more wins for the pitching team. With that said, we are also not very surprised to see that defense has less of an impact on team wins than pitching or offense. In particular, we are confident in the interpretation that good defense contributes to wins; however, it may not compensate for deficiencies in pitching or offense (or vice versa). This understanding is reflected in our defense score leaderboard, which has a weaker relationship between score and wins compared to our batting and pitching leaderboards.

In conclusion, when combining batting, pitching, and defense into a single composite score, we felt confident that we could accurately predict team win totals for a given season. This was supported by a 0.917 correlation coefficient between the actual and predicted 2024 win totals. This correlation coefficient also suggests that our approach of breaking down performance into three categories was an effective strategy for determining success.

One potential application of these findings is strategic decision-making within an MLB front office. As we reveal which metrics have the strongest impact on team win totals – and which ones may not be as influential – our findings can offer a framework for general managers to follow when making offseason free agent signings, Rule 5 draft selections, player addition and subtraction via trades, and midseason roster construction through promoting players from the minor leagues. Our findings can also be very useful on a smaller, game-by-game scale by informing a team’s manager when making substitution decisions. If a manager knows which metrics – and, therefore, which player skills and traits – best influence the season-long win totals, then they can select pitchers or batters to enter the game based on that interpretation. For example, our finding that a high LOB% correlates to team success could encourage a manager to call upon a reliever with a high LOB% in a close game, rather than someone with strong values in other statistical categories (such as strikeout, walk, and home run rates) but a relatively poor LOB%.

In the future, we would like to explore even more features. Not only could we refine our existing feature selection, but we could also explore other metrics that yield an even stronger relationship between statistics and win totals. Along these lines, we could attempt to identify which types of players, rather than which metrics, impact team success. By doing so, we could address existing topics of baseball controversy, such as determining the optimal lineup slot for the best hitter, the value of the left-handed relief specialist, or balancing offensive and defensive productivity at the catcher position.

An additional area of focus in the future is the consideration of multicollinearity. Because each of our individual scores was itself fit with win total as the dependent variable, the three scores are likely to be correlated with one another, so using them together to predict wins can introduce multicollinearity.

Despite some limitations and areas for further improvement, we remain confident that we have successfully achieved our goal of developing predictive models to forecast individual team win totals.

