Spam filtering algorithms have been proposed for many different kinds of media. Spam images in particular are widespread in e-mail and on the web, and many machine learning approaches have been proposed for filtering image spam in e-mail (Biggio, Fumera, Pillai, & Roli, 2011; Guzella & Caminhas, 2009). Because text spam in e-mail is abundant and spam filtering systems have therefore become very good at catching it, image spam has increased rapidly. One response is to run image analysis such as OCR (Optical Character Recognition) on images embedded in e-mail (Fumera, Pillai, & Roli, 2006). As an alternative to computationally expensive OCR processing, many approaches train classifiers on features of spam images (Aradhye, Myers, & Herson, 2005; Nhung & Phuong, 2007; Wakade, Liszka, & Chan, 2013). Artificial neural networks have been used for more advanced feature extraction (Soranamageswari & Meena, 2010). Al-Duwairi, Khater, and Al-Jarrah (2011) proposed an image spam filtering algorithm based on texture analysis that uses low-level image texture features. These works show that image features can deliver good filtering performance without relying on expensive OCR techniques.
Mahajan and Slaney (2010) proposed an image spam classification model that fuses image, text, and web-graph features to handle spam images on the web. For automatic spam image identification, Cheng, Deng, Fu, Wang, and Qin (2011) proposed a graph-based spectral semi-supervised feature selection algorithm that also handles redundant features. Their work showed that graph features of spam images have a positive impact on spam image filtering.