Baseball research submitted to the 2025 SMT Data Challenge by Tejas Rama and me.

Brief summary of what we did:

  • Engineered derived features (velocity, horizontal break, pre-pitch movement vectors) from 2D/3D MiLB player tracking data in R, using the tidyverse, arrow, and sportyR packages.
  • Built logistic regression models to predict pitch type (fastball vs. offspeed) from fielder position, handedness, and movement, identifying 13 statistically significant positional tipping effects across 7 defensive positions.
  • Designed faceted visualizations with ggplot2 showing fielders’ movement patterns by team and pitch type.
  • Proposed implications for advance scouting.
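
To give a rough idea of what a pre-pitch movement vector looks like, here is a minimal sketch in Python (the project itself was done in R). The tuple layout, the units, and the function name `movement_vector` are assumptions made for illustration; they are not the tracking data's actual schema or the code from the paper.

```python
from math import hypot


def movement_vector(samples):
    """Return (dx, dy, speed) for one fielder between the first and last sample.

    `samples` is a list of (timestamp_ms, x_ft, y_ft) tuples ordered by time.
    This layout and these units are hypothetical, chosen only for the example.
    """
    (t0, x0, y0), (t1, x1, y1) = samples[0], samples[-1]
    dx, dy = x1 - x0, y1 - y0          # net displacement in feet
    elapsed_s = (t1 - t0) / 1000.0     # milliseconds to seconds
    if elapsed_s <= 0:
        raise ValueError("samples must span a positive time interval")
    return dx, dy, hypot(dx, dy) / elapsed_s


# Made-up track for a single fielder over the second before the pitch.
track = [(0, -40.0, 140.0), (500, -39.0, 140.5), (1000, -37.0, 144.0)]
dx, dy, speed = movement_vector(track)
print(dx, dy, speed)
```

Features like `dx`, `dy`, and `speed`, computed per fielder and per pitch, are the kind of inputs a fastball-vs-offspeed logistic regression could take alongside position and handedness.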

Link to the Paper