Abstract
Breast cancer is a leading cause of cancer death in women. Early diagnosis and treatment are crucial to reduce the mortality rate and increase patients’ lifespan. Mammography is effective in early detection. This study proposes a computer-aided diagnosis system based on the mini-Mammographic Image Analysis Society database for analyzing mammograms. After selecting the regions of interest, we computed three typical features: the shape, spatial, and spectral domain features. We then applied the structural equation model to obtain relations between the features and the breast tissue type, lesion class, and tumor severity after feature extraction by information gain. Finally, we used the decision tree and classification and regression tree to construct computer-aided diagnosis rules; we generated 10 rules for predicting the classification of abnormal lesions and 11 rules for classifying the tumor severity. These rules can help clinicians detect and identify breast cancer efficiently from mammograms and improve the quality of medical care.
1 Introduction
Breast cancer is a leading cause of cancer death in women; however, effective methods to prevent its occurrence are lacking. Moreover, it ranked first among all causes of death in 2011, representing 28% of all cases in Taiwan [23]. However, there is evidence that early detection accompanied by early treatment can increase patients’ chances of survival. Thus, early detection is critical for improving breast cancer prognosis. Recent research has emerged on the mechanisms that may contribute to breast cancer [29], suggesting new avenues for cancer control strategy. Nonetheless, mammography screening continues to be a key strategy and has been used as such for decades [35]. Detection relies chiefly on microcalcifications or masses that become visible in mammographic X-ray images.
According to the representations of breast cancer in mammograms, lesions can be classified as space-occupying lesions and microcalcifications (CALC). Space-occupying lesions are further divided into three types: masses, architectural distortion (ARCH), and asymmetry (ASYM). Among them, masses and ARCH are the typical signal characteristics of breast cancer. On the basis of the shape and boundary characteristics, masses can further be divided into spiculated masses (SPIC), circumscribed masses (CIRC), and other masses (MISC) [9, 26].
Breast calcifications are deposits of calcium inside the breast tissue. They appear widespread in the breast, and most women have a few on their mammograms at some time, more commonly after menopause [15]. Two major types of calcifications exist depending on the size: macrocalcifications and microcalcifications. Whereas macrocalcifications are nearly always non-cancerous and require neither an additional follow-up nor a biopsy, microcalcifications should be diagnosed after further examinations. Microcalcifications are tiny specks of calcium deposits with an average diameter of 0.3 mm in individual calcifications, and can be scattered throughout the mammary gland or appear in clusters. The size, shape, and distribution of microcalcifications vary. Figure 1 shows typical examples of microcalcification and space-occupying lesions.

Typical Examples of (A) Microcalcification and (B–F) Space-Occupying Lesions.
2 Background
Space-occupying lesions are defined as groups of cells appearing with varying density (bright regions surrounded by a darker homogeneous background); however, their boundaries are often blurred and difficult to identify. Therefore, the detection of space-occupying lesions on mammograms is difficult. Cheng et al. [4] discussed methods for the automated detection and classification of masses and compared their advantages and disadvantages. They reported that the average sensitivity of radiologists in breast cancer screening is only approximately 75%. However, the performance would be improved if computer-aided detection (CAD) were applied to locate abnormalities. Kom et al. [14] presented an algorithm for the detection of suspicious masses from mammographic images. The algorithm was tested on a database of 61 mammograms, on which masses had previously been marked by experienced radiologists. The results showed that a sensitivity of 95.91% was achieved for mass detection.
Suliga et al. [27] proposed a Markov random field (MRF)-based technique for the automatic detection and classification of masses, which are typically among the first symptoms analyzed in the early diagnosis of breast cancer. The presented MRF model was shown to be an efficient tool for mammogram processing. Kozegar et al. [16] implemented two steps for detecting masses. The first step was to extract suspicious regions from the mammograms through an adaptive thresholding technique. Second, an ensemble classifier was applied to reduce false-positive rates. Experimental results showed that the proposed mass detection algorithm outperformed other competing methods. Oliver et al. [20] provided additional details after reviewing existing methods for automatically detecting and segmenting masses in mammograms.
Because microcalcifications appear as small bright spots within the inhomogeneous background of a mammogram, detecting them is a difficult task. Papadopoulos et al. [22] examined the effect of an image enhancement processing stage and the parameter tuning of a CAD system for the detection of microcalcifications in mammograms. They tested five image enhancement algorithms. The optimal performance for two mammographic data sets was achieved by local range modification (Az = 0.932 on MIAS and Az = 0.915 on Nijmegen) and wavelet-based linear stretching (Az = 0.926 on MIAS and Az = 0.904 on Nijmegen), where Az denotes the area under the ROC curve. Oliver et al. [21] used local features extracted from a bank of filters for the automatic detection of microcalcifications and clusters in mammographic images. The experimental evaluation used receiver operating characteristic (ROC) analysis for microcalcification detection and free-response ROC (FROC) analysis for cluster detection, resulting in a sensitivity of > 80% at a single false-positive cluster per image.
Zhang and Gao [33] presented a novel procedure for detecting microcalcification clusters in mammograms that involves image enhancing, feature selection, and supervised learning. The primary contribution of their method is the combination of subspace learning algorithms and the twin support vector machine (SVM). The proposed approach has been evaluated through numerous experiments. Chen et al. [3] used a topological structure over a range of scales in graphical form to classify microcalcification clusters in mammograms. The experimental results showed that the classification accuracy was as high as 96%, and the ROC area was as high as 0.96. Zhang et al. [34] applied mathematical morphology and the SVM to detect microcalcification clusters. They obtained a high level of detection precision and substantially reduced the number of false-positive object regions.
Several studies on microcalcification and space-occupying-lesion detection have been performed. For example, Arodz et al. [1] used the AdaBoost and SVM algorithms to detect suspicious anomalies. The AdaBoost algorithm has an accuracy of 76% for all lesion types and 90% for masses under ideal conditions. Hu et al. [10] developed an adaptive global thresholding segmentation algorithm for detecting suspicious lesions in mammograms. Their experimental results indicated that the algorithm had a sensitivity of 91.3% and 0.71 false-positives per image. Kendall et al. [13] detected anomalies in screening mammograms by using two-dimensional (2D) discrete wavelet transforms, statistical features, and naïve Bayesian classifiers. After a series of tests, the proposed approach resulted in 100% sensitivity and up to 79% specificity for abnormalities. For detecting and classifying suspicious lesions, Veena and Jayakrishna [28] developed a systematic scheme that consists of a back-propagation neural network (BPNN), wavelet transform, and window-based adaptive thresholding. Experimental results showed that the approach yields good detection and classification results. Table 1 lists the summary of studies related to the detection of masses, microcalcifications, and suspicious lesions in mammograms.
Summary of Related Studies for Breast Cancer Analysis.
Author | Purpose | Main Techniques | Data Sets | Performance |
---|---|---|---|---|
Arodz et al. [1] | Detection of suspicious anomalies | AdaBoost and SVM | DDSMa | Accuracy of 76% for all lesion types and 90% for masses |
Kom et al. [14] | Detection of suspicious masses | Linear transformation filter and a local adaptive thresholding technique | YGOPH: Yaounde Gynaeco-Obstetric and Pediatric Hospital | Sensitivity of 95.91% and ROC area of 0.946 |
Papadopoulos et al. [22] | Detection of microcalcifications | Five image enhancement algorithms | MIASb and University Hospital Nijmegen | LRM (Az = 0.932 on MIAS, 0.915 on Nijmegen) and wavelet-based linear stretching (Az = 0.926 on MIAS, 0.904 on Nijmegen) |
Hu et al. [10] | Detection of suspicious lesions | Adaptive global thresholding segmentation algorithm | MIAS | Sensitivity of 91.3% and 0.71 false positives per image |
Oliver et al. [21] | Detection of microcalcifications | Local features extracted from a bank of filters and a boosted classifier | DDSM and MIAS | Sensitivity > 80% for a single false-positive cluster per image |
Zhang and Gao [33] | Detection of microcalcification clusters | Combine subspace learning algorithms and the twin SVM | DDSM | Proposed framework is effective and efficient |
Kozegar et al. [16] | Detection of masses | An adaptive thresholding technique and ensemble classifier | MIAS and INBreast | Sensitivity of 91% with false-positive rate of 4.8 per image |
Kendall et al. [13] | Detection of anomalies | 2D discrete wavelet transforms, statistical features, and naïve Bayesian classifiers | DDSM and MIAS | Sensitivity of 100% and up to 79% specificity for abnormalities |
Chen et al. [3] | Classification of microcalcification clusters | Topological structure | DDSM and MIAS | Accuracy is as high as 96% and ROC area is as high as 0.96 |
Zhang et al. [34] | Detection of microcalcification clusters | Mathematical morphology and SVM | MIAS | High level of precision and reduced false-positive rate |
Veena and Jayakrishna [28] | Detection and classification of suspicious lesions | BPNN, wavelet transform, and window-based adaptive thresholding | MIAS | Sensitivity of 92.13% |
a DDSM: Digital Database for Screening Mammography, available at http://marathon.csee.usf.edu/Mammography/Database.html.
b MIAS: Mammographic Image Analysis Society, available at http://peipa.essex.ac.uk/info/mias.html.
Existing studies are limited by the specificity of the detected microcalcifications, the characteristics of the extracted space-occupying lesions (e.g., size, shape, and number), or the procedure used for detection. Considering these drawbacks, this study proposes a CAD system for analyzing mammograms. The database used to develop our system was provided by the Mammographic Image Analysis Society (mini-MIAS). After selecting the region of interest (ROI), we determined three distinct features: the shape, spatial, and spectral domain features. In addition, we used the structural equation model (SEM) to calculate relations between features and the breast tissue type, lesion class, and tumor severity. Finally, the decision tree (DT) and classification and regression tree (CART) were applied to frame computer-aided breast cancer diagnosis rules. Figure 2 shows a diagram of the proposed method.

Diagram of the Proposed Method.
The main contributions of the current study are as follows:
A novel method for analyzing irregularities in mammograms that consists of SEM, DT, and CART is proposed.
Several computer-aided diagnosis rules are generated to aid radiologists in classifying abnormal lesions and in classifying tumor severity.
Two practical data sets, namely mini-MIAS and Digital Database for Screening Mammography (DDSM), are used to develop and evaluate the proposed approach.
The remainder of this article is organized as follows: Section 2 reviews related work; Section 3 details the materials and methods used in the article; Section 4 presents an analysis of the experimental results as well as a discussion; and, lastly, Section 5 offers a conclusion.
3 Materials and Methods
3.1 Data Set
The data set used in this study was provided by the MIAS (mini-MIAS). In the data set, X-ray films have been digitized using a Joyce-Loebl scanning microdensitometer with a resolution of 50 μm × 50 μm and an 8-bit word, and the original images have a size of 1024 × 1024 pixels. The mammographic images were obtained from 161 subjects. Both right and left breast images of the subjects are provided, totaling 322 files. Odd (even) file numbers correspond to the right (left) breast mammogram. Of these files, 207 do not show tumor lesions and are presented as normal mammograms, and the remaining 115 show space-occupying or calcification lesions. Because some of these images contain two or three abnormal areas, the 115 abnormal mammograms yield 123 abnormal regions (115 plus 8) [26].
There are three types of breast tissues: fatty (F), fatty-glandular (FG), and dense-glandular (DG). Among the 207 normal mammograms, 66 show fatty tissue, 65 show fatty-glandular tissue, and the remaining 76 show dense-glandular tissue. Among the 115 abnormal mammograms, 40 show fatty tissue, 39 show fatty-glandular tissue, and the remaining 36 show dense-glandular tissue. Figure 3 shows the distributions of the types of breast tissues.

Distributions of Types of Breast Tissues.
The class of abnormal lesions describes the type of lesion present. Apart from the normal (NORM) mammograms, the abnormal lesions are divided into six categories: CIRC, SPIC, ARCH, ASYM, CALC, and MISC. Among the 123 abnormal cases, 25 are well-defined/circumscribed masses, 19 are spiculated masses, 19 are architectural distortions, 15 are asymmetries, 30 are calcifications, and the remaining 15 are other ill-defined masses. The severity of each tumor was determined by biopsy as either benign (BENIGN) or malignant (MALIG). There are 69 benign cases and 54 malignant cases. Figure 4 shows the classification of mammographic images, with the number of mammograms given in parentheses.

Classification of Mammographic Images.
3.2 Shape Domain Features
Shape domain features use the intensity histogram of an image to provide various statistical and shape properties. They are based on the distribution of individual pixel values rather than on the interaction or co-occurrence with neighboring ones. We calculated the 15 gray-level features shown in Table 2, where k and p(k) denote the intensity and its probability determined from the mammogram histogram, respectively. The term bac stands for background, which is the average intensity of the margin of the ROI. Yu and Guan [32] used these features for automatically detecting microcalcification clusters. Cheng et al. [4] also applied these features to detect and classify masses in mammograms.
Shape Domain Features [32].
Features | Expression |
---|---|
Mean (μ) | $\mu = \sum_k k\,p(k)$ |
Standard deviation (σ) | $\sigma = \sqrt{\sum_k (k-\mu)^2\,p(k)}$ |
Foreground background ratio (FBR) | |
Foreground background difference (FBD) | FBD = μ – bac |
Difference ratio (DIFF) | |
Area (AREA) | AREA of the ROI |
Compactness (COMP) | COMP = perimeter²/AREA |
Elongation (ELONG) | ELONG = max axis/min axis |
Moment invariant features (PHI1 – PHI7) | As follows |
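As an illustration of how the histogram-based entries of Table 2 can be evaluated, the following is a minimal Python sketch. It assumes the usual histogram-weighted definitions of the mean and standard deviation in terms of k and p(k), and takes the FBD, COMP, and ELONG expressions directly from the table. The function names and the `bac`, `perimeter`, and axis arguments are hypothetical and not part of the original implementation.

```python
import numpy as np

def histogram_features(roi, bac):
    """Mean, standard deviation, and FBD of an ROI computed from its histogram.

    roi: 2D array of non-negative integer intensities (e.g., 8-bit).
    bac: average intensity of the ROI margin (background), supplied by the caller.
    """
    roi = np.asarray(roi, dtype=np.int64)
    counts = np.bincount(roi.ravel(), minlength=256)
    p = counts / counts.sum()              # p(k): probability of intensity k
    k = np.arange(p.size)
    mu = float(np.sum(k * p))              # histogram-weighted mean
    sigma = float(np.sqrt(np.sum((k - mu) ** 2 * p)))
    return {"mean": mu, "std": sigma, "FBD": mu - bac}

def compactness(perimeter, area):
    """COMP = perimeter^2 / AREA, as given in Table 2."""
    return perimeter ** 2 / area

def elongation(max_axis, min_axis):
    """ELONG = max axis / min axis, as given in Table 2."""
    return max_axis / min_axis

if __name__ == "__main__":
    toy_roi = [[10, 10], [20, 20]]          # toy values, not mammogram data
    print(histogram_features(toy_roi, bac=5.0))
    print(compactness(perimeter=4.0, area=1.0))
```

FBR and DIFF are omitted here because their expressions are not recoverable from the table.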
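Moment invariant features are invariant to scale, position, and rotation [5]. For a 2D continuous function f(x, y), the moment of order (p + q) is defined as
$$m_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^{p} y^{q} f(x, y)\,dx\,dy, \qquad p, q = 0, 1, 2, \ldots \tag{1}$$
The central moments are defined as
$$\mu_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \bar{x})^{p} (y - \bar{y})^{q} f(x, y)\,dx\,dy \tag{2}$$
where
$$\bar{x} = \frac{m_{10}}{m_{00}}, \qquad \bar{y} = \frac{m_{01}}{m_{00}}.$$
If f(x, y) is a digital image, then equation (2) becomes
$$\mu_{pq} = \sum_{x}\sum_{y} (x - \bar{x})^{p} (y - \bar{y})^{q} f(x, y) \tag{3}$$
and the normalized central moments, denoted ηpq, are defined as
$$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}, \qquad \gamma = \frac{p + q}{2} + 1, \qquad p + q = 2, 3, \ldots \tag{4}$$
A set of seven invariant moments can be derived from the second and third moments, as proposed by Hu [9]:
$$\phi_1 = \eta_{20} + \eta_{02} \tag{5}$$
$$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \tag{6}$$
$$\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \tag{7}$$
$$\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \tag{8}$$
$$\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \tag{9}$$
$$\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \tag{10}$$
$$\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \tag{11}$$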
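The moment invariants PHI1 to PHI7 can be sketched in a few lines of Python. The sketch below assumes the standard discrete definitions of the central and normalized central moments and the standard form of Hu's seven invariants. The function names are hypothetical, and the toy image is illustrative only. The original study may additionally have rescaled the values (for example, by a logarithm), which this sketch does not attempt.

```python
import numpy as np

def normalized_central_moment(img, p, q):
    """eta_pq of a 2D gray-level image (x = column index, y = row index)."""
    img = np.asarray(img, dtype=float)
    y, x = np.mgrid[: img.shape[0], : img.shape[1]]
    m00 = img.sum()
    x_bar = (x * img).sum() / m00
    y_bar = (y * img).sum() / m00
    mu_pq = ((x - x_bar) ** p * (y - y_bar) ** q * img).sum()
    gamma = (p + q) / 2 + 1
    return mu_pq / m00 ** gamma            # mu_00 equals m_00

def hu_moments(img):
    """Return the seven Hu invariants as a NumPy array [phi1, ..., phi7]."""
    n = lambda p, q: normalized_central_moment(img, p, q)
    n20, n02, n11 = n(2, 0), n(0, 2), n(1, 1)
    n30, n03, n21, n12 = n(3, 0), n(0, 3), n(2, 1), n(1, 2)
    a, b = n30 + n12, n21 + n03            # recurring sums
    c, d = n30 - 3 * n12, 3 * n21 - n03    # recurring differences
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])

if __name__ == "__main__":
    img = np.zeros((12, 12))
    img[2:5, 3:9] = 1.0                    # toy asymmetric blob
    img[5:8, 3:5] = 2.0
    print(hu_moments(img))
    print(hu_moments(np.rot90(img)))       # should match up to rounding
```

Comparing the output for an image with the output for its translated or rotated copy is a quick way to check the invariance property.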
3.3 Spatial Domain Features
The gray-level co-occurrence matrix is a well-established, robust method for extracting spatial domain features from images [7]. The matrix element PΔx, Δy(i, j) is the relative frequency with which two pixels separated by distance (Δx, Δy) occur in a given neighborhood, one with intensity i and the other with intensity j. In other words, the matrix element Pd, θ(i, j) contains the second-order statistical probability values for changes between gray levels i and j at a particular displacement distance d and at a particular angle θ.
With an M × N input image containing L gray levels from 0 to L – 1, let I(m, n) be the intensity at sample m and line n of the image. First, the square matrix W with size L × L is established. The element in W is calculated with
where δd, θ(l, k) is defined as follows:
This matrix satisfies the symmetry property because the relationship between i and j has the same meaning as that between j and i. Second, each element in the matrix is normalized to a probability term describing how frequently a gray tone appears in a specified spatial relationship to another gray tone in the image:
$$P_{d,\theta}(i, j) = \frac{W(i, j)}{\sum_{a=0}^{L-1}\sum_{b=0}^{L-1} W(a, b)} \tag{14}$$
where Pd, θ(i, j) denotes the normalized probability term. In this study, d was set to 1, 2, 4, and 6, and θ was set to 0°, 45°, 90°, and 135°. Table 3 lists 12 spatial domain features, for which μx, μy, σx, and σy denote the means and standard deviations of Px and Py.
Spatial Domain Features [31].
Feature | Expression |
---|---|
Contrast (CONT) | |
Correlation (CORR) | |
Energy | |
Entropy | |
Homogeneity (HOMO) | |
Dissimilarity (DISS) | |
Intensity | |
Sum of squares variance – X-axis (SSVX) | |
Sum of squares variance – Y-axis (SSVY) | |
Cluster shade (CS) | |
Cluster prominence (CP) | |
Maximum probability (MP) |
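The construction of the symmetric, normalized co-occurrence matrix and a few of the Table 3 features can be sketched as follows. This is a minimal illustration, not the original implementation. The pixel offset `(dx, dy)` stands in for a distance d and angle θ (for example, d = 1 and θ = 0° is assumed to map to `dx=1, dy=0`), and the contrast, energy, homogeneity, and maximum-probability formulas are the common Haralick-style definitions, assumed here because the table expressions are not shown.

```python
import numpy as np

def glcm(img, dx, dy, levels):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (dx, dy)."""
    img = np.asarray(img, dtype=np.int64)
    rows, cols = img.shape
    w = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = img[r, c], img[r2, c2]
                w[i, j] += 1               # count the pair (i, j)
                w[j, i] += 1               # and its mirror, enforcing symmetry
    return w / w.sum()                     # normalize counts to probabilities

def glcm_features(p):
    """A few texture features of a normalized co-occurrence matrix p."""
    i, j = np.indices(p.shape)
    return {
        "CONT": float(np.sum((i - j) ** 2 * p)),
        "Energy": float(np.sum(p ** 2)),
        "HOMO": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "MP": float(p.max()),
    }

if __name__ == "__main__":
    toy = [[0, 0, 1],
           [1, 1, 0]]                      # toy 2-level image
    p = glcm(toy, dx=1, dy=0, levels=2)
    print(p)
    print(glcm_features(p))
```

Feature names such as CP_4_90 in the experiments would then correspond to one feature evaluated at one (d, θ) pair, under this reading.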
3.4 Spectral Domain Features
The spectral domain features employ the determination of values as the texture unit. The technique is based on a two-level version of the texture spectrum method [19]. First, image pixels are labeled using a step function that records the differences between the central pixel and its neighbors. Pixel values in the neighborhood are multiplied by binomial weights assigned to the corresponding pixels. Finally, the products are summed to obtain the number of the neighborhood.
The information for a pixel can be extracted from a neighborhood of 3 × 3 pixels, which represents the smallest complete unit (with eight directions surrounding the pixel). Texture units thus characterize the local texture for a given pixel and its neighborhood, and the statistics of all of the texture units over the entire image reveal the global texture aspects [8].
With a neighborhood of 3 × 3 pixels, which are denoted by a set of nine elements V = {V0, V1, …, V8}, where V0 represents the intensity value of the central pixel and Vi denotes the intensity value of the neighboring pixel i, the texture unit TU is given by TU = {E1, E2, …, E8}, where Ei is determined as follows:
$$E_i = \begin{cases} 0, & V_i < V_0 \\ 1, & V_i = V_0 \\ 2, & V_i > V_0 \end{cases} \qquad i = 1, 2, \ldots, 8 \tag{15}$$
From equation (15), each element can be assigned one of three possible values, so the total number of possible texture units for the eight elements is 3^8 = 6561. The texture unit number is defined as
$$N_{TU} = \sum_{i=1}^{8} E_i \cdot 3^{\,i-1} \tag{16}$$
where NTU varies from 0 to 6560. The set of 6561 texture units corresponds to the relative gray-level relationships between a pixel and its neighbors in all possible directions, i.e., the local texture aspect of a given pixel in accordance with its neighbors.
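The texture unit number of a single 3 × 3 neighborhood can be computed with the short Python sketch below. It assumes the three-valued labeling (0 for a neighbor darker than the center, 1 for equal, 2 for brighter) and a base-3 weighting, which is consistent with the stated range of 0 to 6560. The clockwise ordering starting at the top-left neighbor is one of the possible orderings and is chosen here only for illustration. The function name is hypothetical.

```python
# Clockwise neighbor order starting at the top-left corner (an assumed choice).
CLOCKWISE = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def texture_unit_number(window, order=CLOCKWISE):
    """Return N_TU for a 3x3 window given as a nested list or array."""
    v0 = window[1][1]                      # central pixel
    n_tu = 0
    for idx, (r, c) in enumerate(order):
        v = window[r][c]
        if v < v0:
            e = 0
        elif v == v0:
            e = 1
        else:
            e = 2
        n_tu += e * 3 ** idx               # base-3 weight of the idx-th neighbor
    return n_tu

if __name__ == "__main__":
    flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
    peak = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
    pit = [[9, 9, 9], [9, 1, 9], [9, 9, 9]]
    print(texture_unit_number(flat), texture_unit_number(peak), texture_unit_number(pit))
```

Accumulating a histogram of these numbers over every pixel of the ROI gives the texture spectrum from which summary features are derived.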
Moreover, the eight elements can be ordered differently. If they are ordered clockwise as shown in Figure 5, the first element can take eight possible positions from the top left (A) to the middle left (H), and then the 6561 NTU can be labeled by the above formula under eight different ordering ways (from A to H). Table 4 lists eight spectral domain features, where S(i) is the occurrence of the ith NTU; P(a, b, c) is the probability of Ea = Eb = Ec; and K(i) is the probability of Ea = Ee, Eb = Ef, Ec = Eg, and Ed = Eh.
Spectral Domain Features.
Features | Expression |
---|---|
Black-white symmetry (BWS) | |
Geometric symmetry (GS) | |
Degree of direction (DD) | |
Micro horizontal structure (MHS) | |
Micro vertical structure (MVS) | |
Micro left diagonal structure (MLDS) | |
Micro right diagonal structure (MRDS) | |
Central symmetry (CS) |
The spectral domain features are also called local binary patterns. Some studies applied them to analyze breast cancers, such as Lladó et al. [18] and Joseph and Balakrishnan [11].

Eight Possible Positions Associated with the Central Pixel.
3.5 DT and Information Gain
A DT is a hierarchical model consisting of a root node, internal nodes, and leaf nodes. The root node has no incoming branches and zero or more outgoing branches. Each internal node has exactly one incoming branch and two or more outgoing branches, and each leaf node (also called a terminal node) has exactly one incoming branch and no outgoing branches. The DT algorithm selects the root test that yields the highest purity over all training samples. Each candidate attribute is evaluated separately as a way of partitioning these samples. A branch is created for each value of an attribute, and the corresponding subset of samples is moved to the newly created child node. Numerical attributes must be transformed into categorical attributes, and the purity of the node is measured according to the expected amount of information gain. The attribute with the highest information gain is selected because it produces the child nodes with the highest purity. The training samples are successively split until all subsets consist of samples belonging to a single class [6, 30].
The information gain is defined as follows:
$$\mathrm{Gain}(X) = \mathrm{INFO}(T) - \mathrm{INFO}_X(T) \tag{17}$$
where T is a set of training samples and X is a possible test with n outcomes that partition the set T into subsets T1, T2,…, Tn. The terms INFO(T) and INFOx(T) denote the information of T before and after the training samples are partitioned, respectively. The parameter INFO(T) is based on the information theory concept called entropy and is defined as follows:
$$\mathrm{INFO}(S) = -\sum_{i} P(C_i, S)\,\log_2 P(C_i, S) \tag{18}$$
where S is any set of samples and P(Ci, S) represents the probability that the samples in S belong to class Ci. The parameter INFOx(T) can be calculated as the weighted sum of entropies over the subsets:
$$\mathrm{INFO}_X(T) = \sum_{i=1}^{n} \frac{|T_i|}{|T|}\,\mathrm{INFO}(T_i) \tag{19}$$
The gain criterion selects a test X to maximize Gain(X). Therefore, a typical decision learning system recursively selects attributes to test and splits the data set into subsets according to the outcome of the information gain function.
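The gain criterion can be illustrated with a small Python sketch. It assumes base-2 entropy and a test whose outcomes are already categorical; the function names and the toy labels are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter

def info(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, outcomes):
    """Gain of a test whose outcome for each sample is given in `outcomes`."""
    n = len(labels)
    subsets = {}
    for label, outcome in zip(labels, outcomes):
        subsets.setdefault(outcome, []).append(label)
    # Weighted sum of subset entropies after the split.
    info_x = sum(len(s) / n * info(s) for s in subsets.values())
    return info(labels) - info_x

if __name__ == "__main__":
    labels = ["BENIGN", "BENIGN", "MALIG", "MALIG"]
    print(information_gain(labels, ["low", "low", "high", "high"]))   # informative test
    print(information_gain(labels, ["low", "high", "low", "high"]))   # uninformative test
```

A feature-selection step based on this criterion would rank features by their gain and discard those that score lowest.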
3.6 Structural Equation Model
The SEM is a statistical model for testing and estimating causal relations by using a combination of statistical data and qualitative causal assumptions (Figure 6).

Structural Equation Model.
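The general SEM can be represented using the following three matrix equations:
$$\eta = B\eta + \Gamma\xi + \zeta \tag{20}$$
$$Y = \Lambda_y \eta + \varepsilon \tag{21}$$
$$X = \Lambda_x \xi + \delta \tag{22}$$
where B and Γ are the matrices of the coefficients β and γ, respectively.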
An SEM includes two types of latent variables: exogenous and endogenous. Parameters ξ indicate exogenous constructs, which are independent variables in all equations, whereas η denotes endogenous constructs that are dependent variables in at least one equation. Parameter γ represents regression relations between exogenous constructs and endogenous constructs. Parameter β represents regression relations between two endogenous constructs. Typically, in SEM, exogenous constructs are allowed to co-vary freely. Parameter ϕ represents these co-variances. Manifest variables associated with exogenous constructs are labeled X, whereas those associated with endogenous constructs are labeled Y. An SEM includes two separate λ matrices that connect manifest variables with latent variables, one on the X side and the other on the Y side. Parameters δ and ε denote measurement errors, whereas ζ represents the structural error [25].
3.7 Classification and Regression Tree
CART is a prediction and classification tool that is applied here to new mammograms. The data set can be categorical or continuous, and values can be missing. A categorical target produces classification trees, and a continuous target produces regression trees. The approach proposed by Breiman et al. [2] involves three stages, namely growing, pruning, and optimizing [17]. The growing stage involves creating the tree in a recursive manner by partitioning the training samples into successively purer subsets according to a splitting criterion. For regression trees, the criterion can be least squares, the trimmed mean, or least absolute deviations. For classification trees, the criterion can be Gini, twoing, ordered twoing, or ϕ coefficients.
The pruning stage involves discarding one or more subtrees according to a minimal-cost complexity measure. The pruning procedure begins with the largest tree and replaces subtrees with leaves to simplify the tree. The procedure continues until only one node of the tree remains. In the optimizing stage, the tree with the lowest predicted error rate and the highest classification quality is obtained through cross-validation, which divides the samples into training and testing samples. The process entails removing parts of the tree that do not contribute to the classification accuracy on unseen testing samples, and produces a less complex and more comprehensible tree. Furthermore, the one-standard-error rule is applied to obtain a stable tree, that is, the smallest tree whose accuracy is within one standard error of the best tree. The ranking of features can be measured by computing the decrease in the predicted error rate when another feature is used to replace the primary split. Therefore, if a feature has an increased chance of serving as a primary split, then the ranking of the feature is high.
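The growing stage can be illustrated by the search for a single binary split on one numeric feature. The following Python sketch assumes the Gini criterion and candidate thresholds at the midpoints between consecutive distinct feature values. It is not the procedure of the commercial CART software used in the study, and the function names and toy values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue                        # no threshold between equal values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

if __name__ == "__main__":
    area = [1.0, 2.0, 10.0, 11.0]           # toy feature values
    severity = ["BENIGN", "BENIGN", "MALIG", "MALIG"]
    print(best_threshold(area, severity))
```

Growing a full tree repeats this search over every feature at every node, and pruning then removes the splits that do not improve cross-validated error.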
4 Experimental Design and Analysis
We implemented the proposed method by using Matlab 7.0, Weka 3.7, SPSS 12.0, Lisrel 8.5, and SPM 7.0 (provided by Salford Systems, San Diego, CA, USA) on a personal computer (Intel Core i7, 2.93-GHz CPU, and 3.46 GB RAM). After selecting features by information gain, we obtained the remaining features for shape (FBD, DIFF, AREA, COMPACT, ELONG, PHI2, PHI3, and PHI6) and for spatial (CP_2_90, Intensity_4_90, CP_4_90, Energy_4_90, MP_4_90, Intensity_6_90, CP_6_90, Energy_6_90, and MP_6_90). These features plus the spectral features are then fed into the SEM to establish a structural model.
Figure 7 shows the path model that was fitted using the maximum likelihood method in LISREL to estimate the path parameters; an asterisk marks paths that are significant at the 5% level (type I error). The χ2 of the SEM was 10,530.57 (p = 0.0) with 546 degrees of freedom. The root mean square error of approximation was 0.24, which indicates that the model was acceptable. For the structure model of the SEM, nine paths reached the significance level of 0.05: shape→type (0.18), shape→class (0.05), shape→severity (0.11), spatial→class (0.06), spatial→severity (–0.12), spectral→class (–0.05), spectral→severity (–0.12), type→class (–0.01), and type→severity (–0.03). The value in parentheses is the correlation between latent variables. Therefore, we applied shape, spatial, spectral, and type to predict both class and severity.

Path Model of the Features.
Figure 8 shows the DT for predicting the classification of abnormal lesions. AREA, CS, MP_6_90, FBD, BWS, CP_2_90, and PHI6 play a critical role in rule induction. For instance, if a mammogram ROI is characterized by AREA ≤ 10,050, CS > 0.075, MP_6_90 ≤ 0.5655, FBD ≤ 16.40, BWS ≤ 0.1828, and CP_2_90 ≤ 0.9033, it falls into the ARCH class with an accuracy of 83.3%. Table 5 summarizes the rules and the classification results for lesion class obtained from the constructed tree. Table 6 lists the detailed accuracy by class of the DT results for predicting the classification of abnormal lesions. The average accuracy is 73.64%, and the average ROC area is 0.865.
Rules for Predicting the Classification of Abnormal Lesions.
Node | Rule |
---|---|
1 | If AREA ≤ 10,050 and CS ≤ 0.075, then class = CALC (57.69%). |
2 | If AREA ≤ 10,050, CS > 0.075, and MP_6_90 > 0.5655, then class = MISC (42.85%). |
3 | If AREA ≤ 10,050, CS > 0.075, MP_6_90 ≤ 0.5655, FBD ≤ 16.40, BWS ≤ 0.1828, and CP_2_90 ≤ 0.9033, then class = ARCH (83.33%). |
4 | If AREA ≤ 10,050, CS > 0.075, MP_6_90 ≤ 0.5655, FBD ≤ 16.40, BWS ≤ 0.1828, and CP_2_90 > 0.9033, then class = CIRC (75.00%). |
5 | If AREA ≤ 10,050, CS > 0.075, MP_6_90 ≤ 0.5655, FBD ≤ 16.40, and BWS > 0.1828, then class = CIRC (80.00%). |
6 | If AREA ≤ 10,050, CS > 0.075, MP_6_90 ≤ 0.5655, and FBD > 16.40, then class = ASYM (66.67%). |
7 | If AREA > 10,050 and AREA ≤ 11,277, then class = NORM (97.18%). |
8 | If AREA > 11,277 and PHI6 ≤ –56.28, then class = ASYM (60.00%). |
9 | If AREA > 11,277, PHI6 > –56.28, and FBD ≤ 11.49, then class = ARCH (80.00%). |
10 | If AREA > 11,277, PHI6 > –56.28, and FBD > 11.49, then class = SPIC (100.00%). |
Detailed Accuracy by Class of DT Results.
Class | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
---|---|---|---|---|---|---|
CIRC | 0.480 | 0.089 | 0.308 | 0.48 | 0.375 | 0.681 |
NORM | 1 | 0.057 | 0.967 | 1 | 0.983 | 0.967 |
MISC | 0.133 | 0.022 | 0.222 | 0.133 | 0.167 | 0.698 |
ASYM | 0.067 | 0.032 | 0.091 | 0.067 | 0.077 | 0.699 |
ARCH | 0.105 | 0.016 | 0.286 | 0.105 | 0.154 | 0.674 |
SPIC | 0 | 0.032 | 0 | 0 | 0 | 0.632 |
CALC | 0.633 | 0.070 | 0.475 | 0.633 | 0.543 | 0.744 |
Average | 0.736 | 0.054 | 0.704 | 0.736 | 0.715 | 0.865 |
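The F-Measure column of Table 6 can be cross-checked against the Precision and Recall columns. The short Python sketch below assumes the usual definition of the F-measure as the harmonic mean of precision and recall; the function name is hypothetical, and the input values are copied from the CIRC and CALC rows of the table.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    print(round(f_measure(0.308, 0.480), 3))   # CIRC row
    print(round(f_measure(0.475, 0.633), 3))   # CALC row
```

Under this definition, both rows reproduce the tabulated F-measures to three decimal places.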

Tree of DT for Predicting the Classification of Abnormal Lesions.
Figure 9 shows the CART tree used for classifying the tumor severity. AREA, DD, FBD, COMPACT, BWS, DIFF, GS, and PHI3 play a critical role in rule induction. For example, if a mammogram ROI is characterized by AREA ≤ 10,663.50, DD ≤ 0.42, and FBD ≤ 13.98, it falls into the BENIGN class with an accuracy of 84.6%. Table 7 lists the classification rules for tumor severity obtained from the built tree. The average accuracy is 87.14%. Figure 10 shows the ROC curve for classifying tumor severity. The overall area under the curve is 0.9336.

Tree of CART for Classifying the Tumor Severity.

ROC for Classifying the Tumor Severity.
To test the performance of the proposed approach, a set of samples was collected from the DDSM database, which contains 2620 cases, including normal images and images with benign and malignant lesions. One hundred benign images and 100 malignant images were randomly selected. Each image has an assessment code between 1 and 5 that is assigned according to the American College of Radiology Breast Imaging Reporting and Data System (ACR BI-RADS) standard. The distributions of the data are shown in Figure 11. For the lesion severity classification, the average accuracy of CART was 82.5%, and the overall area under the ROC curve was 0.844. For the ACR BI-RADS assessment classification, the average accuracy of CART was 65%, and the overall area under the ROC curve was 0.69. Although these results were lower than those obtained on mini-MIAS, the accuracy of the ACR BI-RADS assessment classification was acceptable compared with radiologists’ performance.

Distributions of DDSM Samples.
Table 8 lists the accuracy and ROC area obtained using the proposed method relative to those obtained by applying other methods to the mini-MIAS and DDSM databases. The methods were naïve Bayes [13], SVM [34], multilayer perceptron [28], and AdaBoost [1]. The highlighted sections of the table indicate that the proposed approach obtained the classification rules and outperformed other methods when applied to the mini-MIAS database. For the DDSM database, the proposed approach was not the superior classification method. However, CART and DT provide explicit classification rules, whereas the results obtained using the other methods are difficult to interpret.
Rules for Classifying the Tumor Severity.
Node | Rule |
---|---|
1 | If AREA ≤ 10,663.50, DD ≤ 0.42, and FBD ≤ 13.98, then severity = BENIGN (84.6%). |
2 | If AREA ≤ 10,663.50, DD ≤ 0.42, FBD > 13.98, and FBD ≤ 21.10, then severity = MALIG (85.7%). |
3 | If AREA ≤ 10,663.50, DD ≤ 0.42, and FBD > 21.10, then severity = BENIGN (100.0%). |
4 | If AREA ≤ 10,663.50, DD > 0.42, and COMPACT ≤ 17.45, then severity = MALIG (84.6%). |
5 | If AREA ≤ 10,663.50, DD > 0.42, COMPACT > 17.45, BWS ≤ 0.14, and DIFF ≤ 0.05, then severity = MALIG (72.2%). |
6 | If AREA ≤ 10,663.50, DD > 0.42, COMPACT > 17.45, BWS ≤ 0.14, and DIFF > 0.05, then severity = BENIGN (80.0%). |
7 | If AREA ≤ 10,663.50, DD > 0.42, COMPACT > 17.45, and BWS > 0.14, then severity = BENIGN (80.0%). |
8 | If AREA ≤ 1283.00, then severity = NORMAL (98.1%). |
9 | If AREA > 1283.00, GS ≤ 0.64, and PHI3 ≤ −43.21, then severity = MALIG (100.0%). |
10 | If AREA > 1283.00, GS ≤ 0.64, and PHI3 > −43.21, then severity = BENIGN (88.9%). |
11 | If AREA > 1283.00 and GS > 0.64, then severity = MALIG (88.9%). |
Table 8: Comparison of Accuracy and ROC Area.

Radiologists visually search mammograms for abnormalities; however, the task is repetitive and time consuming. The rules generated by this approach can assist them in detecting mammographic lesions that may indicate the presence of breast cancer. First, radiologists determine an ROI, which is then tested using the generated rules. If a suspicious abnormality is detected, further examination is performed to determine the course of action that may be required. However, CAD acts only as a second reader, and the final decision is made by the radiologist [12, 24]. Moreover, the computational cost of classifying a new case is constant because the features and thresholds used by the classification rules are fixed once the tree has been built.
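The step in which an ROI is tested against the generated rules can be sketched as follows. The conditions are transcribed from Table 7 as printed; the function name `matching_rules` and the example feature values are hypothetical. Because the AREA ranges in the printed rules overlap (for example, rules 1 to 7 and rules 9 to 11 can hold at the same time), this sketch makes no assumption about the order in which the tree applies them and simply reports every rule whose conditions are satisfied.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
# (node, condition, predicted severity, reported accuracy in percent)
Rule = Tuple[int, Callable[[Features], bool], str, float]

# Conditions copied from Table 7 as printed; nothing here is re-derived.
RULES: List[Rule] = [
    (1, lambda f: f["AREA"] <= 10663.50 and f["DD"] <= 0.42
        and f["FBD"] <= 13.98, "BENIGN", 84.6),
    (2, lambda f: f["AREA"] <= 10663.50 and f["DD"] <= 0.42
        and 13.98 < f["FBD"] <= 21.10, "MALIG", 85.7),
    (3, lambda f: f["AREA"] <= 10663.50 and f["DD"] <= 0.42
        and f["FBD"] > 21.10, "BENIGN", 100.0),
    (4, lambda f: f["AREA"] <= 10663.50 and f["DD"] > 0.42
        and f["COMPACT"] <= 17.45, "MALIG", 84.6),
    (5, lambda f: f["AREA"] <= 10663.50 and f["DD"] > 0.42
        and f["COMPACT"] > 17.45 and f["BWS"] <= 0.14
        and f["DIFF"] <= 0.05, "MALIG", 72.2),
    (6, lambda f: f["AREA"] <= 10663.50 and f["DD"] > 0.42
        and f["COMPACT"] > 17.45 and f["BWS"] <= 0.14
        and f["DIFF"] > 0.05, "BENIGN", 80.0),
    (7, lambda f: f["AREA"] <= 10663.50 and f["DD"] > 0.42
        and f["COMPACT"] > 17.45 and f["BWS"] > 0.14, "BENIGN", 80.0),
    (8, lambda f: f["AREA"] <= 1283.00, "NORMAL", 98.1),
    (9, lambda f: f["AREA"] > 1283.00 and f["GS"] <= 0.64
        and f["PHI3"] <= -43.21, "MALIG", 100.0),
    (10, lambda f: f["AREA"] > 1283.00 and f["GS"] <= 0.64
        and f["PHI3"] > -43.21, "BENIGN", 88.9),
    (11, lambda f: f["AREA"] > 1283.00 and f["GS"] > 0.64, "MALIG", 88.9),
]


def matching_rules(features: Features) -> List[Tuple[int, str, float]]:
    """Return (node, severity, accuracy) for every rule that the ROI satisfies."""
    return [(node, label, acc) for node, cond, label, acc in RULES if cond(features)]


# Hypothetical ROI feature vector, for illustration only.
example_roi: Features = {
    "AREA": 5000.0, "DD": 0.50, "FBD": 10.0, "COMPACT": 15.0,
    "BWS": 0.10, "DIFF": 0.02, "GS": 0.70, "PHI3": -50.0,
}

for node, label, acc in matching_rules(example_roi):
    print(f"rule {node}: severity = {label} ({acc}%)")
```

Each call evaluates a fixed number of threshold comparisons, which is consistent with the constant per-case cost noted above. The output is intended only as decision support; the final assessment remains with the radiologist.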
5 Conclusion
Breast cancer is a leading cause of early mortality in women. Mammography is the most effective method for the early detection of breast cancer. Early diagnosis and treatment are crucial to reducing the mortality rate and increasing patients’ lifespan. This article proposed a computer-aided diagnosis system consisting of five main steps. First, the original images are obtained from the mini-MIAS database. Second, after the ROI is selected, three distinct types of features are computed: the shape, spatial, and spectral domain features. Third, the number of features is reduced by information gain. Fourth, these data are input into an SEM to calculate the relationship between the features and the breast tissue type, lesion class, and tumor severity. Finally, DT and CART are applied to construct a mammogram-based computer-aided breast cancer diagnosis system. The proposed method generates 10 rules for predicting the classification of abnormal lesions, with an average accuracy rate of 73.64% and an average ROC area of 0.865, and 11 rules for classifying tumor severity, with an average accuracy rate of 87.14% and an average ROC area of 0.934. These rules can help clinicians detect and interpret breast cancer efficiently from mammograms and thus improve the quality of medical care.
Acknowledgments
We are grateful to the National Science Council for the research grant (NSC 102-2221-E-415-023).
Bibliography
[1] T. Arodz, M. Kurdziel, E. O. Sevre and D. A. Yuen, Pattern recognition techniques for automatic detection of suspicious-looking anomalies in mammograms, Comput. Methods Prog. Biomed. 79 (2005), 135–149. doi:10.1016/j.cmpb.2005.03.009
[2] L. Breiman, J. H. Friedman, R. A. Olshen and C. J. Stone, Classification and Regression Trees, Wadsworth, Pacific Grove, CA, 1984.
[3] Z. Chen, H. Strange, E. Denton and R. Zwiggelaar, Analysis of mammographic microcalcification clusters using topological features, Lect. Notes Comput. Sci. 8539 (2014), 620–627.
[4] H. D. Cheng, X. J. Shi, R. Min, L. M. Hu, X. P. Cai and H. N. Du, Approaches for automated detection and classification of masses in mammograms, Pattern Recognit. 39 (2006), 646–668. doi:10.1016/j.patcog.2005.07.006
[5] R. C. Gonzalez and R. E. Woods, Digital image processing, 2nd Ed., Prentice Hall, Upper Saddle River, NJ, 2002.
[6] J. Han and M. Kamber, Data mining: concepts and techniques, 3rd Ed., Morgan Kaufmann, New York, 2011.
[7] R. Haralick, M. K. Shanmugam and I. Dinstein, Texture features for image classification, IEEE Trans. Syst. Man Cybern. SMC-3 (1973), 610–621. doi:10.1109/TSMC.1973.4309314
[8] D.-C. He and L. Wang, Texture features based on texture spectrum, Pattern Recognit. 24 (1991), 391–399. doi:10.1016/0031-3203(91)90052-7
[9] M. K. Hu, Visual pattern recognition by moment invariants, IRE Trans. Inf. Theory 8 (1962), 179–187. doi:10.1109/TIT.1962.1057692
[10] K. Hu, X. Gao and F. Li, Detection of suspicious lesions by adaptive thresholding based on multiresolution analysis in mammograms, IEEE Trans. Instrum. Meas. 60 (2011), 462–472. doi:10.1109/TIM.2010.2051060
[11] S. Joseph and K. Balakrishnan, Local binary patterns, Haar wavelet features and Haralick texture features for mammogram image classification using artificial neural networks, Adv. Comput. Inf. Technol. Commun. Comput. Inf. Sci. 198 (2011), 107–114. doi:10.1007/978-3-642-22555-0_12
[12] N. Karssemeijer, J. D. Otten, H. Rijken and R. Holland, Computer aided detection of masses in mammograms as decision support, Br. J. Radiol. 79 (2006), S123–S126. doi:10.1259/bjr/37622515
[13] E. J. Kendall, M. G. Barnett and K. Chytyk-Praznik, Automatic detection of anomalies in screening mammograms, BMC Med. Imag. 13 (2013), 43. doi:10.1186/1471-2342-13-43
[14] G. Kom, A. Tiedeu and M. Kom, Automated detection of masses in mammograms by local adaptive thresholding, Comput. Biol. Med. 37 (2007), 37–48. doi:10.1016/j.compbiomed.2005.12.004
[15] D. Kopans, Breast imaging, Lippincott-Raven, Philadelphia, 1998.
[16] E. Kozegar, M. Soryani, B. Minaei and I. Domingues, Assessment of a novel mass detection algorithm in mammograms, J. Cancer Res. Ther. 9 (2013), 592–600. doi:10.4103/0973-1482.126453
[17] T.-S. Lee, C.-C. Chiu, Y.-C. Chou and C.-J. Lu, Mining the customer credit using classification and regression tree and multivariate adaptive regression splines, Comput. Stat. Data Anal. 50 (2006), 1113–1130. doi:10.1016/j.csda.2004.11.006
[18] X. Lladó, A. Oliver, J. Freixenet, R. Martí and J. Martí, A textural approach for mass false positive reduction in mammography, Comput. Med. Imag. Graph. 33 (2009), 415–422. doi:10.1016/j.compmedimag.2009.03.007
[19] T. Ojala, M. Pietikainen and D. Harwood, A comparative study of texture measures with classification based on feature distributions, Pattern Recognit. 29 (1996), 51–59. doi:10.1016/0031-3203(95)00067-4
[20] A. Oliver, J. Freixenet, J. Martí, E. Pérez, J. Pont, E. R. Denton and R. Zwiggelaar, A review of automatic mass detection and segmentation in mammographic images, Med. Image Anal. 14 (2010), 87–110. doi:10.1016/j.media.2009.12.005
[21] A. Oliver, A. Torrent, X. Lladó, M. Tortajada, L. Tortajada, M. Sentís, J. Freixenet and R. Zwiggelaar, Automatic microcalcification and cluster detection for digital and digitised mammograms, Knowl. Based Syst. 28 (2012), 68–75. doi:10.1016/j.knosys.2011.11.021
[22] A. Papadopoulos, D. I. Fotiadis and L. Costaridou, Improvement of microcalcification cluster detection in mammography utilizing image enhancement techniques, Comput. Biol. Med. 38 (2008), 1045–1055. doi:10.1016/j.compbiomed.2008.07.006
[23] Report of Department of Health, Executive Yuan, Taiwan, Statistical Outcome 2011, Available at: http://www.doh.gov.tw/CHT2006/index_populace.aspx. Accessed 15 June, 2014.
[24] M. P. Sampat, M. K. Markey and A. C. Bovik, Computer-aided detection and diagnosis in mammography, Handb. Image Video Process. 2 (2005), 1195–1217. doi:10.1016/B978-012119792-6/50130-3
[25] A. Skrondal and S. Rabe-Hesketh, Structural equation modeling: categorical variables, Entry for the Encyclopedia of Statistics in Behavioral Science, Wiley, Hoboken, NJ, 2005. doi:10.1002/0470013192.bsa596
[26] J. Suckling, J. Parker, D. Dance, S. Astley, I. Astley, I. Hutt and C. Boggis, The mammographic images analysis society digital mammogram database, Int. Congr. Ser. Exerpt. Med. 1069 (1994), 375–378.
[27] M. Suliga, R. Deklerck and E. Nyssen, Markov random field-based clustering applied to the segmentation of masses in digital mammograms, Comput. Med. Imag. Graph. 32 (2008), 502–512. doi:10.1016/j.compmedimag.2008.05.004
[28] U. K. Veena and V. Jayakrishna, CAD based system for automatic detection and classification of suspicious lesions in mammograms, Int. J. Emerg. Trends Technol. Comput. Sci. 3 (2014), 338–345.
[29] L. Vona-Davis and D. P. Rose, Adiposity and diabetes in breast and prostate cancer, in: Kolonin, M.G. (eds.), Adipose Tissue and Cancer, pp. 33–51, Springer, New York, 2013. doi:10.1007/978-1-4614-7660-3_3
[30] J.-Y. Yeh, T.-H. Wu and C.-W. Tsao, Using data mining techniques to predict hospitalization of hemodialysis patients, Decis. Support Syst. 50 (2011), 439–448. doi:10.1016/j.dss.2010.11.001
[31] J.-Y. Yeh, T.-H. Wu and W.-J. Tsai, Bleeding and ulcer detection using wireless capsule endoscopy images, J. Softw. Eng. Appl. 7 (2014), 422–432. doi:10.4236/jsea.2014.75039
[32] S. Yu and L. Guan, A CAD system for the automatic detection of clustered microcalcifications in digitized mammogram films, IEEE Trans. Med. Imaging 19 (2000), 115–126. doi:10.1109/42.836371
[33] X. Zhang and X. Gao, Twin support vector machines and subspace learning methods for microcalcification clusters detection, Eng. Appl. Artif. Intell. 25 (2012), 1062–1072. doi:10.1016/j.engappai.2012.04.003
[34] E. Zhang, F. Wang, Y. Li and X. Bai, Automatic detection of microcalcifications using mathematical morphology and a support vector machine, Bio-Med. Mater. Eng. 24 (2014), 53–59. doi:10.3233/BME-130783
[35] H. C. Zuckerman, The role of mammography in the diagnosis of breast cancer, in: I. M. Ariel and J. B. Clearly (Eds.), Breast Cancer: Diagnosis and Treatment, pp. 152–172, McGraw-Hill, New York, 1987.
©2016 by De Gruyter