Automatic Pitch Type Recognition System from Single-View Video Sequences of Baseball Broadcast Videos

Masaki Takahashi, Mahito Fujii, Masahiro Shibata, Nobuyuki Yagi, Shin’ichi Satoh
DOI: 10.4018/jmdem.2010111202


This article describes a system that automatically recognizes individual pitch types, such as screwballs and sliders, in baseball broadcast videos. These decisions are currently made by human baseball specialists who watch the broadcast video of the game. No automatic system has yet been developed for identifying individual pitch types from single-view camera images. Techniques using multiple fixed cameras promise highly accurate pitch type identification, but such systems tend to be large. Our system is designed to identify the same pitch types using only the single-view broadcast videos that the human specialists use. Accordingly, on the advice of those specialists, we used a number of features, such as the ball’s location, the ball speed, and the catcher’s stance. The system identifies the pitch type using a classifier trained with the Random Forests ensemble learning algorithm, and it achieved about 90% recognition accuracy in experiments.

1. Introduction

Many kinds of metadata on various sports are being distributed in real time via data broadcasting and the Internet (Kon’ya, Kuwano, Yamada, Kawamori, & Kawazoe, 2005; Liddy et al., 2002). For baseball in particular, the data generated for each pitch is diverse and includes the player at bat, the count, the ball speed, and so on. Much research has gone into scene analysis in baseball videos (Chan, Han, & Gong, 2002; Ando, Shinoda, Furui, & Mochizuki, 2007; Lien, Chiang, & Lee, 2007).

Even though baseball spectators pay a lot of attention to the type of pitch, it is difficult to determine the pitch type automatically from a baseball broadcast video. So far, human specialists have been needed to make the decision from single-view pitching sequences (Hoshikawa, 2006; Shibata, 2007).

To make this work less costly, we developed a system that automatically recognizes the type of pitch by using only the information contained in the baseball broadcast video. We selected features such as the ball trajectory, the ball speed, and the catcher’s stance just prior to the pitch, based on the opinions of the experts who judge pitch types.
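The abstract states that the classifier is trained with the Random Forests algorithm on features of this kind. The following is a minimal pure-Python sketch of that general technique (bootstrap sampling, a random feature subset per split, majority vote), not the authors' implementation. The three feature columns (speed in km/h, horizontal shift and vertical drop in pixels), the helper names, and every number in the toy table are hypothetical values invented for illustration; they are not measurements from the article.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    for f in feature_ids:
        uniq = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for t in [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, rng, max_depth=4):
    """Grow one tree; internal nodes are tuples, leaves are class labels."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # random feature subset per split
    split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return majority
    _, f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo], rng, max_depth - 1),
            build_tree([r for r, _ in hi], [y for _, y in hi], rng, max_depth - 1))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        f, t, lo, hi = node
        node = lo if row[f] <= t else hi
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

# Hypothetical toy data: (speed_kmh, horizontal_shift_px, vertical_drop_px).
ROWS = [(150, 1, 2), (147, 0, 4), (144, 3, 5), (148, 2, 3),
        (134, 24, 12), (130, 20, 15), (128, 26, 16), (132, 22, 13),
        (116, 12, 30), (112, 10, 28), (110, 13, 34), (114, 11, 31)]
LABELS = ["straight"] * 4 + ["slider"] * 4 + ["curveball"] * 4

forest = train_forest(ROWS, LABELS)
print(predict_forest(forest, (146, 2, 4)))
```

In practice a library implementation would normally replace this hand-written forest; the sketch only makes the voting structure visible. A categorical feature such as the catcher’s stance would also need a numeric encoding before it could be used here.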

Various methods of analyzing baseball broadcast video images to display the pitch trajectory or segment the baseball events have been researched (Chen, 2006; Chen, Chen, Tsai, Lee, & Yu, 2007; Shum & Komura, 2004), but no method has yet been established for identifying the pitch type. A method that shows the ball speed and the changes in breaking balls by analyzing their trajectory has been presented (Chu, Wang, & Wu, 2006), but it goes no further than deciding whether a pitch is a straight ball or a breaking ball. Thus, we aimed to develop a new system that can automatically classify the same individual pitch types as human specialists do, with comparable accuracy.

Data on pitch types for all professional baseball games in Japan is available from metadata distributors such as Data Stadium Inc. (n.d.). Although there are various pitch types in professional baseball, the human specialists at Data Stadium classify pitches into nine types: straight balls, screwballs, curveballs, sliders, cutballs, forkballs, change-ups, sinkers, and other. Our system classifies pitches into the same nine types.
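As a small illustration, the nine-class label set could be represented as follows. This is only a sketch: the class name `PitchType` and the helper `to_label` are hypothetical, and the fallback to `OTHER` for unrecognized names is an assumption about how a catch-all class might be used, not a rule stated in the article.

```python
from enum import Enum

class PitchType(Enum):
    """The nine pitch classes listed in the text."""
    STRAIGHT = "straight ball"
    SCREWBALL = "screwball"
    CURVEBALL = "curveball"
    SLIDER = "slider"
    CUTBALL = "cutball"
    FORKBALL = "forkball"
    CHANGE_UP = "change-up"
    SINKER = "sinker"
    OTHER = "other"

def to_label(name: str) -> PitchType:
    """Map a free-text pitch name to a PitchType (assumed fallback: OTHER)."""
    cleaned = name.strip().lower()
    for pitch in PitchType:
        if pitch.value == cleaned:
            return pitch
    return PitchType.OTHER

print(to_label(" Slider "))
```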

Techniques using multiple fixed cameras to obtain the 3-D position of a ball have been developed (Gueziec, 2002; Rander, 1998); one method creates three-dimensional ball trajectories by using multiple stereo cameras (Theobalt, Albrecht, Haber, Magnor, & Seidel, 2004). These techniques can reproduce the curve of a breaking ball. 3-D ball position measurements promise highly accurate pitch identification, but the systems incorporating them tend to be large. Japan Broadcasting Corporation (NHK), to which the authors belong, broadcasts more than 100 live professional baseball programs a year. If we used multiple cameras for each game, the amount of setup and adjustment required would be large. Considering operability and the need to get results quickly, determining the type of pitch from the broadcast video alone is a more desirable method for us.

In addition, our system can be used for video searches of previous baseball broadcast videos because it needs only broadcast video as input. In contrast, techniques using multiple fixed cameras cannot be used for video searches because they do not work on broadcast video. Our system therefore has the advantage that it can annotate the type of pitch in previous broadcasts.

Moreover, the number of pitch types, the ball speed, and the degree of change in trajectory vary depending on the pitcher. Hence, we would need to create a classifier or decide thresholds for each pitcher even if the 3-D positions of the ball were measured.

Another study analyzed changes in trajectory by considering the aerodynamics of breaking balls (Alaways, 1998). However, special sensor cameras are needed to measure the roll of a ball. It is difficult to measure the roll from a broadcast image because the ball appears small in the image and has motion blur.
