Introduction
Computer vision is now widely applied in industry. One of the most pressing problems is the creation of automatic real-time controllers for production lines. In this paper, we consider a quite specific but practically important task in video-based object detection: detecting a moving forklift truck and computing its key attributes (direction of movement, presence of cargo) in order to automate a route-building system in an existing cargo warehouse.
At first sight, universal state-of-the-art object detectors seem suitable for this task. Particular attention should be paid to such algorithms as SURF (Speeded-Up Robust Features) (Bay, Ess, Tuytelaars, Van Gool, 2008), SIFT (Scale-Invariant Feature Transform) (Lowe, 2004), ORB (Rublee, Rabaud, Konolige, Bradski, 2011) and FAST (Rosten, Porter, Drummond, 2010). These algorithms detect keypoints and compute their descriptors on a target image of the object of interest. The same procedure is repeated for each video frame, and the similarity between keypoint descriptors is computed to obtain an appropriate affine transform of the model object (Lowe, 2004; Savchenko, 2013). Next, this transform matrix is verified with, e.g., the RANSAC method (Forsyth, Ponce, 2002). Finally, an appropriate object tracking method is used to speed up further processing of the detected object (Lucas, Kanade, 1981).
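The descriptor-matching step of this pipeline can be illustrated with a minimal sketch. It assumes binary descriptors (as produced by, e.g., ORB) stored as plain Python integers, Hamming distance as the similarity measure, and a ratio test in the spirit of Lowe (2004) to reject ambiguous matches. The function names and the ratio value of 0.75 are illustrative assumptions and are not taken from the paper; keypoint detection, affine estimation and RANSAC verification are not shown.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(
    model: List[int], frame: List[int], ratio: float = 0.75
) -> List[Tuple[int, int]]:
    """Return (model_index, frame_index) pairs that pass the ratio test.

    `ratio` is an assumed illustrative value, not a parameter from the paper.
    """
    matches = []
    for i, d in enumerate(model):
        # Rank all frame descriptors by distance to this model descriptor.
        ranked = sorted((hamming(d, f), j) for j, f in enumerate(frame))
        if len(ranked) < 2:
            continue  # the ratio test needs a runner-up candidate
        (best, j), (second, _) = ranked[0], ranked[1]
        # Keep the match only if it is clearly better than the runner-up.
        if best < ratio * second:
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    model_desc = [0b11110000, 0b00001111]
    frame_desc = [0b00001110, 0b11110001, 0b10101010]
    print(match_descriptors(model_desc, frame_desc))
```

In a full pipeline, the surviving matches would be passed to the transform estimation and verification stages described above. Matches whose two best candidates are equally close are discarded, which is one simple way to limit false correspondences.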
Unfortunately, this approach is highly sensitive to noise in the query video and is therefore characterized by a high false negative rate (FNR) and/or false positive rate (FPR) when noise (variable illumination, shadows, color of light sources, light flashing, partial occlusion of objects, etc.) is present in the images. Thus, the purpose of our research is to reduce the impact of noise on object detection accuracy when a video of the moving object is available (Savchenko, 2012a) by using specific domain knowledge about the object of interest. In this case, the video frame is segmented (Chien, Ma, Chen, 2002) and the moving object's silhouette may be obtained with the conventional Motion History Image (MHI) method (Ahad, Rahman, Tan, Kim, Ishikawa, 2012). Next, based on information about the object's orientation, its key attributes (width, height, area, etc.) can be computed using binary mathematical morphology operations (Shapiro, Stockman, 2001; Najman, Talbot, 2010). Finally, these features are compared with known thresholds determined from the domain knowledge.
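The silhouette-based steps can be sketched roughly as follows. The sketch assumes the common MHI update rule (pixels where motion is detected are set to a duration tau, all others decay by one per frame down to zero), takes the silhouette to be the set of non-zero MHI pixels, and measures only bounding-box width, height and pixel area. The function names, the dictionary layout and the threshold values are hypothetical and are not the values used in the paper; frame segmentation and the morphological operations are omitted.

```python
from typing import Dict, List

Grid = List[List[int]]


def update_mhi(mhi: Grid, motion_mask: Grid, tau: int) -> Grid:
    """One MHI step: moving pixels are set to tau, others decay by 1 down to 0."""
    return [
        [tau if moved else max(0, old - 1) for old, moved in zip(mhi_row, mask_row)]
        for mhi_row, mask_row in zip(mhi, motion_mask)
    ]


def silhouette_attributes(mhi: Grid) -> Dict[str, int]:
    """Bounding-box width and height, and pixel area, of the non-zero MHI region."""
    coords = [(r, c) for r, row in enumerate(mhi) for c, v in enumerate(row) if v > 0]
    if not coords:
        return {"width": 0, "height": 0, "area": 0}
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return {
        "width": max(cols) - min(cols) + 1,
        "height": max(rows) - min(rows) + 1,
        "area": len(coords),
    }


def passes_thresholds(attrs: Dict[str, int], minimums: Dict[str, int]) -> bool:
    """Compare measured attributes with (assumed) domain-knowledge minimums."""
    return all(attrs[name] >= limit for name, limit in minimums.items())


if __name__ == "__main__":
    empty = [[0] * 5 for _ in range(4)]
    # Hypothetical motion mask: a 2x3 block of moving pixels.
    mask = [[1 if 1 <= r <= 2 and 1 <= c <= 3 else 0 for c in range(5)] for r in range(4)]
    mhi = update_mhi(empty, mask, tau=3)
    attrs = silhouette_attributes(mhi)
    print(attrs, passes_thresholds(attrs, {"width": 3, "height": 2, "area": 5}))
```

Because stale motion decays instead of vanishing immediately, the silhouette remains available for a few frames after the object stops moving, and thresholds on simple geometric attributes can then be applied to it.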
This article is an extended version of the conference paper (Chernousov, Savchenko, 2014). We implemented our algorithm as complete software and carried out a much more thorough experimental study of the FPR/FNR of the proposed method, with various kinds of noise added to the available video data.
The rest of the paper is organized as follows. In Section 2, we briefly describe the existing solutions for motion and object detection tasks. In Section 3, we formulate the task of detecting an empty moving forklift truck. In Section 4, we conduct an experimental study of the detection methods, compare the results of the conventional local descriptors (SURF, SIFT, ORB, FAST) with those of the proposed morphological algorithm, and analyze the algorithms in terms of their resistance to additive noise. Finally, we present the findings and give concluding comments.