Introduction
Reading digital video clocks, also called time recognition, is an application-oriented research problem because clock time is critical information for many applications in video analysis, video surveillance, panorama video production, and video indexing and retrieval (Bu, Sun, Ding, Miao, & Yang, 2008; Covavisaruch & Saengpanit, 2004; Li, Wan, Yan, Yu, & Xu, 2006; Li, Xu, Wan, Yan, & Yu, 2006; Xu, Wang, Wan, Li, & Duan, 2006; Yin, Hua, & Zhang, 2002; Yu, 2012; Yu & Ding, 2015; Yu, Li, & San Lee, 2008; Yu, Li, & Leong, 2009; Yu, Cheng, Wu, & Song, 2016; Yu, Ding, Zeng, & Leong, 2015; Yu, Lyu, Xiang, & Leong, 2017). Reading digital video clocks, especially multiple clocks in the same video, is a very challenging special case of reading text from overlaid video objects, because it poses additional difficulties such as multiple asynchronous clocks, low resolution, and tight processing-time constraints. In fact, reading general scene text is itself still an open research problem (Anthimopoulos, Gatos, & Pratikakis, 2013; Epshtein, Ofek, & Wexler, 2010; Ghanei & Faez, 2015, 2016; Jaderberg, Simonyan, Vedaldi, & Zisserman, 2016; Lee, Lee, Lee, Yuille, & Koch, 2011; Lyu, Song, & Cai, 2005; Mishra, Alahari, & Jawahar, 2012; Neumann & Matas, 2012, 2013, 2015; Pan, Hou, & Liu, 2011; Shi, Wang, Xiao, Zhang, & Gao, 2013; Shi, Wang, Xiao, Gao, & Hu, 2014; Shivakumara, Phan, & Tan, 2011; Wang, Babenko, & Belongie, 2011; Wang, Wu, Coates, & Ng, 2012; Weinman, Learned-Miller, & Hanson, 2009; Zhong, Jin, Zhang, & Feng, 2016; Zhu & Zanibbi, 2016).
Clock time plays a critical role in video semantics analysis. The time shown on a clock often indicates the game time or event time in sports and surveillance videos (Xu et al., 2006; Zhong et al., 2016; Zhu & Zanibbi, 2016). This paper considers the common case in which digital video clocks have been superimposed on the video. Although current videos can already carry a text channel that stores encoded clock and/or timestamp information, the algorithm proposed in this paper does not need these encoded clocks or timestamps (Bu et al., 2008; Covavisaruch & Saengpanit, 2004). Thus, the proposed algorithm has a wider application range. More importantly, it is not affected by malicious modification of the encoded timestamp stored in the text channel.
Many sports and surveillance videos have superimposed digital video clocks and/or timestamps for various reasons, such as showing game-related time or the time of the recording. For example, the video clock in a soccer video indicates the game time elapsed at a frame, whereas the countdown game clock in a basketball video indicates the remaining game time at a frame and the countdown shot clock indicates the maximum time remaining in the current ball possession. Examples of single and multiple digital video clocks in soccer and basketball videos are shown in Figure 1. In surveillance videos, superimposing digital video clocks or timestamps on the video (Yu et al., 2016) is one method to guard against malicious tampering with the encoded timestamp information stored in the video text channel. Hence, there is a need for algorithms that read the superimposed digital video clock independently of the clock or timestamp encoded in the video text channel.