Improvement of Segmentation Efficiency in Mammogram Images Using Dual-ROI Method

Mammogram segmentation using multiple regions of interest is one of the fastest-growing research areas in medical image analysis. The work involves two tasks: 1) segmentation of mammogram images and 2) extraction of texture features from mammogram images. To address the difficulties of these tasks, this paper proposes a technique that comprises three phases. In the first phase, mammogram images from the INbreast database are selected and enhanced using Laplacian filtering. The pre-processed mammogram images are then segmented using modified adaptively regularized kernel-based fuzzy C-means (M-ARKFCM). After segmentation, statistical texture feature extraction (FE) is applied to recognize the patterns of cancer and non-cancer regions in mammogram images. The experimental results show that the proposed approach improves segmentation efficiency, as measured by statistical parameters, compared with existing methods.


INTRODUCTION
Image segmentation is a basic and essential step in mammography, which supports the extraction of effective texture features from mammogram images, Tai et al. 2013. Segmentation provides cues about the textures of tumor and non-tumor regions in mammograms, Zhong et al. 2015. Textures take diverse shapes or noticeable patterns, which are used to distinguish cancer from non-cancer cases in mammogram images based on the location of abnormalities, García-Manso et al. 2013. Texture-based segmentation consists of two key steps: segmenting the mammogram image and extracting texture features from the selected region of the mammogram image, Rampun et al. 2017. A region of interest (ROI) is an effective aid to segmentation that helps to recognize the suspected masses, Li, J. B et al. 2012. ROI segmentation is a powerful approach that performs segmentation on different regions of mammogram images, Muramatsu et al. 2016. However, ROI cannot support segmentation for a large number of mammogram images Beura, S

LITERATURE SURVEY
Researchers have proposed several approaches to breast cancer segmentation. This section briefly reviews some of the important contributions in the available literature. Jasmine et al. 2009 presented a new approach for detecting micro-calcification in digital mammograms that combines wavelet analysis of the image with artificial neural networks (ANN) for building the classifiers. Micro-calcifications correspond to high-frequency components, so they are detected by extracting micro-calcification features from the wavelet analysis of the image and using these results as the input of a neural network for classification. The neural network contains one input layer, two hidden layers and one output layer. The system classifies normal versus abnormal cases, mass versus micro-calcification, and abnormality severity (benign or malignant). The experiments demonstrate that the approach provides a true detection rate of approximately 87% with 0 false detections per image. The system was evaluated on the INbreast dataset. Mazurowski et al. 2011 presented a computer-aided detection (CAD) system for mammographic masses that uses a mutual information-based template matching scheme with intelligently selected templates. Having previously presented the principles of template matching with mutual information for mammography, they implemented those principles in a complete computer-aided detection system. Through an automatic optimization process, the system chooses the most useful templates (mammographic regions of interest) from a large database of previously collected and annotated mammograms. In this way, knowledge about the task of detecting masses in mammograms is incorporated into the system.
They then evaluated whether their system, developed for screen-film mammograms, can be applied successfully not only to other mammograms but also to digital breast tomosynthesis (DBT) reconstructed slices without adding any DBT cases for training. J. Dheeba, et al. 2014 investigated a new classification approach for detecting breast abnormalities in digital mammograms using a Particle Swarm Optimized Wavelet Neural Network (PSOWNN). The proposed abnormality detection algorithm extracts Laws Texture Energy Measures from the mammograms and classifies the suspicious regions by applying a pattern classifier. The method was applied to a real clinical database of 216 mammograms collected from mammogram screening centers. The detection performance of the CAD system was analyzed using the Receiver Operating Characteristic (ROC) curve. This curve indicates the trade-offs between sensitivity and specificity that are available from a diagnostic system, and thus describes the inherent discrimination capacity of the proposed system. The results show that the area under the ROC curve of the proposed algorithm is 0.96853, with a sensitivity of 94.167% and a specificity of 92.105%.
M.M. Eltoukhy, et al. 2010 presented an approach for breast cancer diagnosis in digital mammograms using the curvelet transform. After the mammogram images are decomposed in the curvelet basis, a special set of the largest coefficients is extracted as the feature vector. The Euclidean distance is then used to construct a supervised classifier. The experimental results gave a 98.59% classification accuracy rate, which indicates that the curvelet transform is a promising tool for the analysis and classification of digital mammograms. Mammogram breast images from the MIAS dataset were used to evaluate the proposed method.
M.M. Eltoukhy, et al. 2012 proposed a method for breast cancer diagnosis in digital mammogram images. Multi-resolution representations, wavelet or curvelet, are used to transform the mammogram images into a long vector of coefficients. A matrix is constructed by placing the wavelet or curvelet coefficients of each image in a row vector, so that the number of rows is the number of images and the number of columns is the number of coefficients. A feature extraction method is developed based on the statistical t-test. The method ranks the features (columns) according to their capability to differentiate the classes. A dynamic threshold is then applied to optimize the number of features, so as to achieve the maximum classification accuracy rate. Because the method extracts the features that maximize the ability to discriminate between classes, the dimensionality of the feature data is reduced and the classification accuracy rate is improved. A support vector machine (SVM) is used to classify normal and abnormal tissues and to distinguish between benign and malignant tumors. The proposed method is validated using 5-fold cross validation. The obtained classification accuracy rates demonstrate that the proposed method could contribute to the successful detection of breast cancer.
Elter and Hasslmer 2008 proposed a novel, knowledge-based approach to the computer-aided discrimination of mammographic mass lesions that uses computer-extracted attributes of mammographic masses and clinical data as input attributes to a case-based reasoning system. This approach emphasizes a transparent reasoning process, which is important for the acceptance of a CADx system in clinical practice. The authors evaluated the performance of the proposed system on a large publicly available mammography database using receiver operating characteristic curve analysis. Their results indicate that the proposed CADx system has the potential to significantly reduce the number of unnecessary breast biopsies in clinical practice.
Bala, B. K., & Audithan, S. 2014 presented an efficient classification system for micro-calcification in digital mammogram images, motivated by the fact that early prediction of breast cancer is the key to reducing mortality among women. The micro-calcification classification system is based on the discrete curvelet transform (DCT) and the discrete wavelet transform (DWT). Energy features are extracted from the mammogram images using these transforms at various levels of decomposition, and a k nearest neighbour (KNN) classifier is used for the classification task. Experimental results show that the DCT-based classification system gives better results than the DWT-based one.
Li Y et al. 2016 proposed a mass classification method for mammograms based on two-concentric masks and discriminating texton. First, the two-concentric masks divide each mass region into the centre region and the periphery region. Then the discriminating texton is obtained by integrating linear discriminant analysis (LDA) with the traditional texton, which addresses the traditional texton's failure to consider class information. Finally, features are extracted with the discriminating texton for both the centre region and the periphery region, which alleviates the problem of disregarding spatial layout information. The proposed method was tested on 130 mass regions from the Digital Database for Screening Mammography (DDSM). The classification accuracy rate reaches 86.92% and the area under the receiver operating characteristic (ROC) curve is 0.91, which is higher than the traditional texton and some other texture-based methods. The periphery region was found to be more effective than the centre region for classifying masses.
Sampaio et al. 2011 presented a computational methodology for detecting masses in mammogram images, which can be described in the following steps: (1) noise and objects outside the breast boundary are removed and the internal structures of the breast are highlighted, (2) regions containing masses are segmented using a cellular neural network, (3) the shapes of these regions are analyzed through shape descriptors, (4) candidate regions are classified as masses or non-masses by a Support Vector Machine. Dalmiya et al. 2012 introduced a segmentation method for mammograms using wavelets and k-means clustering. The authors defined their method in the following steps: (1) the discrete wavelet transform is used to extract high-level details from MRI images, (2) the output image is added to the original input image to obtain a sharpened image, (3) k-means clustering is performed on the sharpened image to locate the tumor region. The final tumor region is extracted by thresholding the clustered image. Mubarak et al. 2012 examined the homogeneity of the image and represented it with quad trees. In a quad tree, each node has four descendants and the root represents the entire image. The contribution of this method is that the image is split according to the requirements of the application, because the splitting level depends on the chosen criteria. Its major limitation is that it may create blocky segments. Splitting to a finer level avoids blocky segments, but it increases computation time.
To overcome the above drawbacks, modified-ARKFCM is implemented with statistical texture features to improve the performance of mammogram breast image segmentation.

PROPOSED METHODOLOGY
The proposed strategy for segmenting the cancer and non-cancer regions of a mammogram breast image is divided into three major steps: image pre-processing, FE and segmentation.

Image Acquisition and Pre-processing
In the first phase of mammogram breast cancer segmentation, the mammogram images are taken from an established benchmark dataset (the INbreast dataset). The original INbreast dataset is digitized at a 50 micron pixel edge; the database is then reduced to a 200 micron pixel edge with 1024 × 1024 pixels. After the mammogram images are acquired, pre-processing is carried out using a Laplacian filter.
In the proposed framework, a Laplacian filter is used for pre-processing because, compared with other filters, it reduces noise easily. The filter includes a smoothing operator, which transforms the noisy image into a noise-free image so that objects can be recovered from the original mammogram images. The Laplacian filter also plays an important role in edge detection in mammogram images.
Consider a set of smoothed images in a linear scale space, represented as $G_\sigma$. Each is formed by convolving the original image $G_0$ with a Gaussian kernel $K_\sigma$.
The goal is to choose the appropriate scale $\sigma(x, y)$ from the scale space at each pixel $(x, y)$. The criterion for this choice relies on the Minimal Description Length (MDL) principle, which yields an effective descriptive model of minimal complexity. Laplacian filtering is expressed in (1).
where $\epsilon_\sigma$ is the residual of the smoothed image. The optimal model, however, uses a limited number of bits, which helps to increase the information associated with the model. The maximal smoothness with minimum residual is expressed in (2).
Based on the MDL principle, the minimum value of $dl_{G_\sigma}$ is selected at each $(x, y)$. This minimum, $\min dl_{G_\sigma}$, determines the optimal smoothness $\sigma^*$, where $x$ denotes the horizontal axis, $y$ the vertical axis, and $\sigma$ the standard deviation. A sample pre-processed mammogram image is shown in Fig 1. The resulting noise-free images are segmented using modified-ARKFCM.
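As a rough illustration of Laplacian-based enhancement, the sketch below applies a fixed 4-neighbour Laplacian and subtracts it from the image to sharpen edges. It is not the MDL-driven scale selection described above: the kernel, the `strength` parameter, the border handling and the function names are all assumptions chosen for the example.

```python
# Illustrative only: a fixed 3x3 (4-neighbour) Laplacian sharpening step.
# This is NOT the MDL-based scale selection of the paper; kernel, border
# handling and parameter names are assumptions made for this sketch.

def laplacian(img):
    """Return the 4-neighbour Laplacian of a 2-D list, replicating borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            up = img[max(y - 1, 0)][x]
            down = img[min(y + 1, h - 1)][x]
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, w - 1)]
            out[y][x] = up + down + left + right - 4.0 * img[y][x]
    return out


def sharpen(img, strength=1.0, lo=0.0, hi=255.0):
    """Subtract the scaled Laplacian from the image and clip to [lo, hi]."""
    lap = laplacian(img)
    return [
        [min(hi, max(lo, img[y][x] - strength * lap[y][x]))
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]


# A flat region is left unchanged; a vertical step edge gains contrast.
flat = [[10.0] * 4 for _ in range(4)]
step = [[0.0, 0.0, 100.0, 100.0] for _ in range(4)]
sharp_flat = sharpen(flat)
sharp_step = sharpen(step)
```

In this toy case the flat image is returned unchanged, while the bright side of the step edge is pushed above its original value, which is the edge-emphasis behaviour the pre-processing stage relies on.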

Segmentation Utilizing Modified-ARKFCM
The pre-processed mammogram image is then segmented: modified-ARKFCM is used to separate the tumor and non-tumor regions of mammogram breast images. Let $I$ be an image consisting of gray levels $x_i$ at pixels $i = 1, \ldots, N$, and let $\|x_i - v_j\|$ be the grayscale Euclidean distance between pixel $i$ and the cluster center $v_j$, as stated in (5).
Using the membership function from the alternate optimization, the cluster centers are updated iteratively using (6) and (7).
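The alternating update of memberships and cluster centers can be sketched with plain fuzzy C-means on gray levels. This is the standard FCM baseline, not M-ARKFCM: there is no spatial term, kernel or correlation function here, and the initialisation, the fuzzifier `m = 2` and the fixed iteration count are assumptions made for the example.

```python
# Standard fuzzy C-means on a list of gray levels (baseline only; the
# spatial, kernel and correlation terms of M-ARKFCM are not included).

def fcm(pixels, n_clusters=2, m=2.0, n_iter=50, eps=1e-9):
    lo, hi = min(pixels), max(pixels)
    # Assumed initialisation: centers spread evenly over the intensity range.
    centers = [lo + (hi - lo) * (j + 0.5) / n_clusters
               for j in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    u = []
    for _ in range(n_iter):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).
        u = []
        for x in pixels:
            d = [abs(x - v) + eps for v in centers]
            u.append([
                1.0 / sum((d[j] / d[k]) ** power for k in range(n_clusters))
                for j in range(n_clusters)
            ])
        # Center update: membership-weighted mean of the gray levels.
        centers = [
            sum((u[i][j] ** m) * pixels[i] for i in range(len(pixels)))
            / sum(u[i][j] ** m for i in range(len(pixels)))
            for j in range(n_clusters)
        ]
    return centers, u


gray_levels = [10, 12, 11, 200, 202, 198]
centers, memberships = fcm(gray_levels)
# Hard labels are obtained by taking the cluster with the largest membership.
labels = [row.index(max(row)) for row in memberships]
```

On this toy input the two centers settle near the dark and bright groups, and each pixel's memberships sum to one.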
The effect of noise is reduced by adding the spatial information of neighboring pixels, as denoted in (8).
where $\alpha$ controls the spatial information, $N_i$ is the set of neighboring pixels of pixel $i$, and $N_R$ is its cardinality. To avoid recomputing the neighborhood term at every iteration, it is replaced by $\bar{x}$, a grayscale filtered image, and a kernel function replaces the Euclidean distance. The updated equation is given in (9). In addition, a Gaussian kernel-based FCM is proposed, which calculates the parameter $\eta_j$ at every iteration to replace $\alpha$ for every cluster. The kernel functions are used to calculate the parameter value, as represented in (10).
Here, $K$ is the kernel function. Identifying $K$ in general requires a large number of patterns, and many cluster centers are needed to find the optimal value of $\eta_j$. To overcome this problem, spatial context and grayscale information are combined using a fuzzy factor. The fuzzy factor $G_{ij}$, which is included in the objective function of the FCM, is stated in (11).
Then, the altered fuzzy factor $G'_{ij}$ is derived using (12).
This altered fuzzy factor controls the local neighbor relationship and replaces the distance with a kernel function, where $w_{iK}$ denotes the fuzzy factor of pixel $i$ and $1 - K(\cdot)$ denotes the kernel metric function. In the proposed methodology (M-ARKFCM), the kernel function $K$ is replaced by a correlation function $c$, so equation (10) is updated as (13).
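To make the idea of a kernel metric concrete, the sketch below evaluates a Gaussian kernel and the induced distance $1 - K(x, v)$ between a gray level and a cluster center. The exact kernel form and the bandwidth `sigma` are assumptions for illustration; the correlation function used in M-ARKFCM is not specified in enough detail here to reproduce and is not implemented.

```python
import math

# Assumed Gaussian kernel form: K(x, v) = exp(-(x - v)^2 / (2 * sigma^2)).
# The bandwidth value is a placeholder, not a value taken from the paper.

def gaussian_kernel(x, v, sigma=10.0):
    """Similarity in (0, 1] between gray level x and cluster center v."""
    return math.exp(-((x - v) ** 2) / (2.0 * sigma ** 2))


def kernel_distance(x, v, sigma=10.0):
    """Kernel-induced distance 1 - K(x, v): 0 for identical values."""
    return 1.0 - gaussian_kernel(x, v, sigma)


near = kernel_distance(0.0, 10.0)
far = kernel_distance(0.0, 100.0)
```

Unlike the squared Euclidean distance, this distance saturates at 1, so a single outlying pixel cannot dominate the cluster-center update.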
After the segmentation phase, a dual-ROI is applied to the segmented output, as briefly described below.

Dual-Region of Interest
The basic operation is the same for both ROI and dual-ROI; however, dual-ROI is an iterative version. It is therefore adaptable and faster for selecting ROIs in large image datasets. The textural properties of the segmented object play an essential role in detecting breast cancer or locating abnormalities in mammogram images.
The dual-ROI framework is designed to choose the boundary of the ROI and to model the interior intensity distribution of a large group of mammogram images. The image intensity probability map is computed for a given set of mammogram images. The mean probability is then computed for the set and applied to the image intensity probability map PI to obtain the binary image BPI, whose pixels are higher than the threshold value I. The binary mask of the dual-ROI is represented as BIr, and the shape image of BIr is then computed. The dual-ROI intensity information term is given in (14).
where n denotes the total number of mammogram images, x denotes the overall outline or model of the mammogram images, and i represents the shape of the non-selected region of the mammogram image. The general flow of the proposed methodology is shown in Fig 2.
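One possible reading of the probability-map and thresholding step is sketched below. The normalisation of each image to a probability map, the pixel-wise averaging over the image set, and the use of the mean probability as the threshold are assumptions made for illustration; the function names are hypothetical and the shape-image step is not covered.

```python
# Hypothetical sketch of the mask-building step: all choices below are
# assumptions, not the paper's exact dual-ROI formulation.

def probability_map(img):
    """Scale intensities so they sum to 1 (assumes a non-zero image)."""
    total = float(sum(sum(row) for row in img))
    return [[p / total for p in row] for row in img]


def mean_map(maps):
    """Pixel-wise mean of several equally sized probability maps."""
    n, h, w = len(maps), len(maps[0]), len(maps[0][0])
    return [[sum(m[y][x] for m in maps) / n for x in range(w)]
            for y in range(h)]


def binary_mask(pmap):
    """Mark pixels whose probability exceeds the mean probability."""
    values = [p for row in pmap for p in row]
    threshold = sum(values) / len(values)
    return [[1 if p > threshold else 0 for p in row] for row in pmap]


images = [
    [[0, 0, 0], [0, 9, 0], [0, 0, 0]],
    [[0, 1, 0], [1, 8, 1], [0, 1, 0]],
]
pi_map = mean_map([probability_map(img) for img in images])
mask = binary_mask(pi_map)
```

For the two toy images, only the bright central pixel exceeds the mean probability, so the mask selects it alone.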

Statistical Feature Extraction
After dual-ROI is applied to the segmented output, FE is performed on the segmented mammogram images. FE is the process of mapping an image from image space to feature space. Here, FE is applied to texture measures, namely contrast, correlation, cluster prominence, cluster shade, dissimilarity, energy, entropy, homogeneity, maximum probability, variance, sum average, sum variance, sum entropy, area and difference variance, to extract the texture features of mammogram images. These features capture the feature information of the mammogram images.
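Several of the listed texture measures are commonly computed from a gray-level co-occurrence matrix (GLCM). The sketch below builds a symmetric, normalised GLCM for one pixel offset and derives a handful of the features. The text does not state that a GLCM was used, nor which offsets or quantisation levels, so these are assumptions; the formulas are common textbook definitions (homogeneity here uses $1/(1+|i-j|)$).

```python
import math

# Assumed approach: GLCM-based texture features for a single offset.

def glcm(img, levels, dx=1, dy=0):
    """Symmetric, normalised co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                a, b = img[y][x], img[y2][x2]
                counts[a][b] += 1.0
                counts[b][a] += 1.0
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Compute a few co-occurrence features from a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1.0 + abs(i - j)) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "maximum_probability": max(v for _, _, v in cells),
    }


checker = texture_features(glcm([[0, 1], [1, 0]], levels=2))
uniform = texture_features(glcm([[1, 1], [1, 1]], levels=2))
```

A uniform patch gives zero contrast and entropy with energy 1, whereas a checkerboard patch gives high contrast, which is the kind of separation these features are meant to expose between region types.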

RESULTS AND DISCUSSIONS
The experiments were run in MATLAB (version 2017a) on a PC with a 3.2 GHz i5 processor. To assess the effectiveness of the proposed algorithm, its performance was compared with FCM and ARKFCM on the INbreast database. A set of mammograms was selected from the publicly available INbreast database. Every image has 1536 × 1024 pixels, sampled at a 200 micrometer pixel size. Performance was evaluated in terms of area under the curve (AUC), Dice coefficient, and Jaccard coefficient.

Performance Measures
The relationship between the input and output variables of a system is assessed through appropriate performance metrics such as AUC, Dice coefficient, and Jaccard coefficient. In segmentation validation, the Dice coefficient is expressed in terms of the TP, TN, FP and FN counts, which are obtained by matching the segmented result to the ground truth image. These values are used to compute the Dice coefficient as shown in (15).

$$\text{Dice coefficient} = \frac{2\,TP}{2\,TP + FP + FN} \qquad (15)$$

where a Dice coefficient of 0 indicates no similarity between the output and the ground truth image, and a value of 1 indicates complete similarity. For the Jaccard coefficient, the TP values are identified by the overlap between the manually segmented ground truth cancer labels and the machine-generated cancer labels. The general formula for the Jaccard coefficient is given in (16).

$$\text{Jaccard coefficient} = \frac{TP}{TP + FP + FN} \qquad (16)$$

where TP denotes true positives, FP false positives, TN true negatives and FN false negatives. Area under the curve (AUC) is a simple metric that measures accuracy by reducing the ROC curve to a scalar value [14]. Its value is normalized to the range between 0 and 1, and a higher AUC indicates better segmentation performance. It is calculated as $AUC = \int_{x}^{y} f(a)\,da$, where $x$ and $y$ are the minimum and maximum axis points of the curve and $f(a)$ is the function describing the curve. In simple terms, AUC is the area under the ROC curve.
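The three measures can be computed from binary masks and ROC points as in the sketch below. The flattened-mask representation, the function names, and the trapezoidal rule for AUC are assumptions made for the example; the sketch also assumes at least one positive pixel in the prediction or the ground truth so that the denominators are non-zero.

```python
# Minimal sketch of the evaluation metrics on flattened binary masks.

def confusion_counts(pred, truth):
    """Return (TP, FP, FN, TN) for two equally long 0/1 sequences."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def dice(tp, fp, fn):
    """Dice coefficient: 2TP / (2TP + FP + FN)."""
    return 2.0 * tp / (2.0 * tp + fp + fn)


def jaccard(tp, fp, fn):
    """Jaccard coefficient: TP / (TP + FP + FN)."""
    return tp / float(tp + fp + fn)


def auc_trapezoid(fpr, tpr):
    """Area under an ROC curve given as matching FPR/TPR lists."""
    points = sorted(zip(fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(pred, truth)
d = dice(tp, fp, fn)
j = jaccard(tp, fp, fn)
```

The two overlap measures are related by $J = D / (2 - D)$, so they always rank segmentations in the same order; the example values satisfy this identity.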

Experimental Test Result on INbreast Database
In this experiment, the INbreast dataset is used to compare the performance of the existing methods and the proposed scheme. Table 1 compares the performance of the proposed and existing techniques for both benign and malignant cases.
The AUC of the proposed method for benign cases is 0.96142 ± 0.0031, whereas the existing methods achieve 0.94 ± 0.0083 and 0.9613 ± 0.0067. Similarly, the AUC of the proposed method for malignant cases is 0.9613 ± 0.0029, whereas the existing methods achieve 0.944 ± 0.0079 and 0.9611 ± 0.0052. The AUC comparison is also portrayed graphically. Table 1 confirms that the proposed approach performs better than the existing methods on the INbreast mammogram images, because modified-ARKFCM encodes both the local and shape features of wavelet-transformed mammogram images, which improves the segmentation efficiency for breast cancer.

Comparative Analysis
Table 2 compares the performance of existing works with the proposed work. C. Varela, et al. 2007 proposed a procedure named adaptive threshold segmentation, which relies on contour-related, gray level texture and morphological features. This existing work achieves an AUC of 0.7877 ± 0.003. Similarly, P. Agrawal, et al. 2014 proposed a segmentation methodology based on saliency. After segmentation, an SVM classifier was used to classify the cancer and non-cancer regions of the segmentation output. This analysis was carried out on a publicly available database (i.e. the INbreast database) and achieved an AUC of 0.8917 ± 0.001. In contrast, the proposed work achieves an AUC of 0.96136 ± 0.0030, which is higher than the existing works.

Pre-predicting the Classification Result Using Feature Values
In this section, the mammogram breast cancer classification result is pre-predicted using statistical feature values; Table 3 lists the feature vector values of the statistical features. In total, fifteen statistical features are considered for texture FE: contrast, correlation, cluster prominence, cluster shade, dissimilarity, energy, entropy, homogeneity, maximum probability, variance, sum average, sum variance, sum entropy, area and difference variance. Among these statistical features, area is the most critical segmentation feature for the classification result. By setting the cumulative score in the range 1000-3700, 80% classification accuracy can be achieved. Moreover, by considering features such as correlation, cluster shade, dissimilarity, entropy and area, more than 70% classification accuracy can be achieved by setting the cumulative score below 800.

CONCLUSION
In this paper, a new texture-based mammogram segmentation methodology is proposed, which relies on modified-ARKFCM with dual-ROI. In the experiments, modified-ARKFCM is used to segment the cancer and non-cancer regions in mammogram images, and statistical texture features are then used to recognize the patterns of those regions. The proposed methodology combines the advantages of statistical texture features and modified-ARKFCM. The experimental analysis is carried out on a publicly available database (the INbreast dataset) and demonstrates the superiority of the proposed framework: segmentation of mammogram images is more efficient with modified-ARKFCM than with the existing FCM and ARKFCM methods. In future work, the feature vectors obtained from the statistical texture features will be used in an appropriate binary classification method for classifying the cancer regions in mammogram images.

FUNDING AGENCY
The publisher has waived the Open Access Processing fee for this article.