A Novel Fast Hierarchical Projection Algorithm for Skew Detection in Multimedia Big Data Environment

Li Cheng (School of Power and Mechanical Engineering, Wuhan University, Wuhan, China & School of Computer Science, South-Central University for Nationalities, Wuhan, China) and Gongping Wu (School of Power and Mechanical Engineering, Wuhan University, Wuhan, China)
DOI: 10.4018/IJMCMC.2017070104


Optical character recognition (OCR) is an effective way to digitize information from paper media, and skew detection of document images is one of its key stages. This paper proposes a skew detection algorithm based on hierarchical projection. Projection histograms are first computed for directions within a given angle range, sampled at an initial angle step length. The variance of each histogram and the absolute differences of the variances are then calculated, and the angle corresponding to the maximum difference is taken as the rough skew estimate. The same procedure is then repeated over a narrower projection angle range, whose width is twice the initial step length and which is symmetric about the rough estimate. Finally, the maximum of the variances is found, and the corresponding angle is taken as the skew angle. Experimental results show that the algorithm is fast, accurate, insensitive to noise, and suitable for complex layouts.
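
The coarse-to-fine search described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the search range of ±15°, the 1° coarse step, the 0.1° fine step, and the synthetic test image are all assumptions made for the example. Because the exact form of the "absolute difference of the variances" criterion is not specified here, the coarse pass in this sketch simply takes the maximum-variance angle as the rough estimate; only the fine pass (a window of twice the coarse step, symmetric about the rough estimate, scored by maximum variance) follows the description directly.

```python
import math


def profile_variance(pixels, angle_deg, n_bins, offset):
    """Variance of the projection histogram taken along direction angle_deg."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    hist = [0] * n_bins
    for x, y in pixels:
        # Row index of the pixel after rotating the page back by angle_deg.
        hist[int(round(y * cos_t - x * sin_t)) + offset] += 1
    mean = sum(hist) / n_bins
    return sum((h - mean) ** 2 for h in hist) / n_bins


def best_angle(pixels, lo, hi, step, n_bins, offset):
    """Scan [lo, hi] in increments of step and return the max-variance angle."""
    count = int(round((hi - lo) / step)) + 1
    angles = [lo + i * step for i in range(count)]
    return max(angles, key=lambda a: profile_variance(pixels, a, n_bins, offset))


def hierarchical_skew(pixels, max_angle=15.0, coarse_step=1.0, fine_step=0.1):
    """Coarse-to-fine skew estimate in degrees (illustrative sketch only)."""
    # Size the histogram so that every projected row index fits.
    reach = max(math.hypot(x, y) for x, y in pixels)
    offset = int(math.ceil(reach)) + 1
    n_bins = 2 * offset + 1
    # Coarse pass. Assumption: max variance stands in for the paper's
    # max-difference criterion, which is not fully specified in the abstract.
    rough = best_angle(pixels, -max_angle, max_angle, coarse_step, n_bins, offset)
    # Fine pass: window of width 2 * coarse_step, symmetric about the rough estimate.
    return best_angle(pixels, rough - coarse_step, rough + coarse_step,
                      fine_step, n_bins, offset)


def synthetic_lines(skew_deg, n_lines=5, spacing=20, width=200):
    """Hypothetical test data: thin 'text lines' tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(x, int(round(x * slope + k * spacing)))
            for k in range(n_lines) for x in range(width)]


estimate = hierarchical_skew(synthetic_lines(3.4))
print(f"estimated skew: {estimate:.1f} degrees")
```

With these settings the coarse pass evaluates 31 angles and the fine pass 21, compared with 301 for a single exhaustive scan at 0.1° resolution over the same range.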

Nowadays, the amount of multimedia data is increasing dramatically, extending people's information sources to multiple media modalities such as image, video, audio, and Flash. As a rich and intuitive visual form of multimedia information, images are applied in fields including national defense, industrial manufacturing, media, and health care. Optical Character Recognition (OCR) is an important means of information input for images of paper media, especially in the current big data era of information explosion. It has been applied in many fields, such as the digitization of large numbers of ancient books, license plate recognition in intelligent traffic management systems, and video caption recognition. For a computer to recognize text in an image, the image usually has to be deskewed so that the text lines are horizontal or vertical before individual characters are segmented; the skew angle of the document image must therefore be detected first. Over the years, offline character recognition has achieved great success, driven by demand in many fields. The major advantage of offline recognizers is that they allow previously written and printed texts to be processed and recognized.

Skew in an acquired image generally arises from inaccuracies in the image acquisition process. Skew detection and correction is one of the most important pre-processing steps in OCR. This paper deals with the skew detection and correction of document images, i.e. images containing textual information.

Many research results have been reported on skew detection for document images. Existing algorithms fall into five main classes (Felhi et al., 2011; Li et al., 2007): (i) the Nearest Neighbor clustering algorithm (NN); (ii) the traditional Projection Profile (PP) based approach; (iii) the Hough Transform algorithm (HT); (iv) the Mathematical Morphology algorithm (MM); (v) the Cross Correlation algorithm (CC).

NN was proposed in (Hashizume et al., 1986), and many improved variants followed (Lu et al., 2003; Michael et al., 2010). The algorithm is fast and is not limited to images with large text areas or to a certain range of skew angles, but its detection accuracy is generally low. A nearest-neighbor clustering algorithm was applied to detect the skew of Chinese document images (SUN et al., 2006), with the least squares method used to estimate the angle, but the text complexity it can handle and its detectable skew range are unclear.

PP for skew detection was first proposed in (Postl et al., 1986), and the algorithm was later improved (Bloomberg et al., 1995) by down-sampling the image to reduce the number of pixels processed and save computing time. However, both the original and the improved algorithms are so sensitive to the layout of the document image that they often fail to produce correct results when the image contains multiple fonts or many non-text areas (e.g. pictures, forms or graphs).
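
As an illustration of the basic projection-profile idea, the sketch below scores each candidate angle by how sharply peaked its projection histogram is and keeps the best one. It is a simplified example rather than the method of the cited papers: the function names, the integer-degree search over ±10°, the sum-of-squares score, and the synthetic image are assumptions. For a fixed number of bins and pixels, the sum of squared bin counts ranks angles in the same order as the histogram variance.

```python
import math


def projection_profile(pixels, angle_deg):
    """Count foreground pixels per row after rotating the page back by angle_deg."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    profile = {}
    for x, y in pixels:
        row = int(round(y * cos_t - x * sin_t))
        profile[row] = profile.get(row, 0) + 1
    return profile


def peakedness(profile):
    """Sum of squared bin counts; larger means a more sharply peaked profile."""
    return sum(count * count for count in profile.values())


def pp_skew(pixels, max_angle=10):
    """Exhaustive scan over integer angles in [-max_angle, max_angle] degrees."""
    candidates = range(-max_angle, max_angle + 1)
    return max(candidates, key=lambda a: peakedness(projection_profile(pixels, a)))


def synthetic_lines(skew_deg, n_lines=5, spacing=20, width=200):
    """Hypothetical test data: thin 'text lines' tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(x, int(round(x * slope + k * spacing)))
            for k in range(n_lines) for x in range(width)]


page = synthetic_lines(3.0)
skew = pp_skew(page)
print("estimated skew:", skew, "degrees")
```

Every foreground pixel is visited once per candidate angle, so the running time scales with the number of pixels times the number of angles tested. The score also assumes that text lines dominate the profile, which is consistent with the sensitivity to non-text areas noted above.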

The HT-based algorithm was first applied in (Srihari et al., 1989), and many improvements have been made to it (Hinds et al., 1990; Le et al., 1994; Pal et al., 1996). The algorithm is robust and simple, but its time and space complexity are both very high. Yu and Jain proposed a multi-step strategy that makes the algorithm faster and more accurate (Yu et al., 1996). Singh et al. (2008) analyzed the algorithm in terms of both time and space complexity and proposed many improvements that greatly increased its performance, although some problems remain. The improved hierarchical algorithm based on the block adjacency graph (BAG) significantly increases the speed, but it fails to give satisfactory results when straight lines make up the major part of the text lines. In (GAO et al., 2013), further improvements raised both speed and accuracy, but the results on images with unclear boundary directions were unsatisfactory.
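
A brute-force Hough-style vote restricted to near-horizontal lines can be sketched as follows. It is only an illustration under stated assumptions, not any of the cited variants: the function names, the integer-degree range of ±10°, the one-pixel ρ resolution, and the synthetic image are all hypothetical. Each foreground pixel votes for every line (skew, ρ) that could pass through it, using the normal form ρ = x cos φ + y sin φ with φ = 90° + skew, and the skew of the most-voted line is returned.

```python
import math
from collections import Counter


def hough_skew(pixels, max_angle=10):
    """Return the integer skew (degrees) of the line that collects the most votes."""
    votes = Counter()
    for skew in range(-max_angle, max_angle + 1):
        # A line tilted by `skew` degrees has its normal at 90 + skew degrees.
        phi = math.radians(90 + skew)
        cos_p, sin_p = math.cos(phi), math.sin(phi)
        for x, y in pixels:
            rho = int(round(x * cos_p + y * sin_p))
            votes[(skew, rho)] += 1
    (best_skew, _rho), _count = votes.most_common(1)[0]
    return best_skew


def synthetic_lines(skew_deg, n_lines=5, spacing=20, width=200):
    """Hypothetical test data: thin 'text lines' tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(x, int(round(x * slope + k * spacing)))
            for k in range(n_lines) for x in range(width)]


detected = hough_skew(synthetic_lines(3.0))
print("detected skew:", detected, "degrees")
```

The accumulator holds up to one cell per (skew, ρ) pair and every pixel votes once per angle, which illustrates why both time and memory grow quickly as the angle resolution and image size increase.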

Before skew detection with the mathematical morphology algorithm, opening and closing operations must be applied to the document image. The CC algorithm can detect the skew angle accurately, but it is too time-consuming (Kumar et al., 2012). A combination of mathematical morphology and the Hough transform has also been applied (ZHANG et al., 2008), but the experimental verification was insufficient, so its effectiveness is unclear.
