An Improved Image Mining Technique For Brain Tumour Classification Using Efficient Classifier
An improved image mining technique for brain tumor classification using pruned association rules with the MARI algorithm is presented in this paper. The proposed method uses association rule mining to classify CT scan brain images into three categories: normal, benign, and malignant. It combines low-level features extracted from the images with high-level knowledge from specialists. The developed algorithm can assist physicians in efficient classification, using multiple keywords per image to improve accuracy. Experimental results on a pre-diagnosed database of brain images showed 96 percent sensitivity and 93 percent accuracy.
💡 Research Summary
The paper presents an enhanced image‑mining framework for classifying brain computed tomography (CT) images into three diagnostic categories—normal, benign, and malignant—by leveraging a pruned association‑rule mining approach built around the Multiple Association Rule Induction (MARI) algorithm. The authors argue that traditional computer‑aided diagnosis (CAD) systems either rely on black‑box classifiers such as support vector machines (SVM) or suffer from an explosion of rules when conventional association‑rule mining (ARM) is applied to medical images. To address these shortcomings, the proposed method integrates low‑level image features with high‑level expert knowledge, thereby producing a compact yet expressive rule set that can be interpreted by clinicians.
The workflow begins with preprocessing of raw CT slices (512 × 512 pixels). A 3 × 3 mean filter and histogram equalization reduce noise, while Otsu thresholding isolates brain tissue from the background. Regions of interest (ROIs) containing potential lesions are automatically extracted. From each ROI, three groups of quantitative descriptors are computed: (1) texture features derived from Gray‑Level Co‑occurrence Matrices (GLCM) at four orientations, yielding entropy, contrast, correlation, and homogeneity; (2) shape descriptors such as area, perimeter, circularity, and, when 3‑D reconstruction is available, volume; and (3) intensity statistics including mean Hounsfield Unit (HU), standard deviation, and the average of the top‑10 % histogram bins. All continuous attributes are discretized into five equal‑frequency bins, and each bin is encoded as a categorical “feature_value” item.
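The equal-frequency discretization step can be sketched in a few lines. This is a minimal illustration of binning one continuous descriptor (entropy) into five equal-frequency bins and encoding each bin as a categorical item; the function names, the `entropy_bin` item format, and the sample values are assumptions for illustration, not taken from the paper.

```python
def equal_frequency_bins(values, n_bins=5):
    """Compute interior cut points so each bin holds roughly
    the same number of samples (equal-frequency binning)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

def discretize(value, edges, feature_name):
    """Map a continuous value to a categorical 'feature_binN' item."""
    bin_idx = sum(value >= e for e in edges)  # count edges at or below value
    return f"{feature_name}_bin{bin_idx}"

# Hypothetical entropy values from 10 ROIs
entropies = [1.2, 3.4, 2.8, 0.9, 4.1, 2.2, 3.9, 1.7, 2.5, 3.1]
edges = equal_frequency_bins(entropies, n_bins=5)
items = [discretize(v, edges, "entropy") for v in entropies]
```

With ten samples and five bins, each bin receives exactly two values, which is the defining property of equal-frequency (as opposed to equal-width) discretization.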
In parallel, three radiologists annotate each image with a set of diagnostic keywords (e.g., “irregular border”, “high‑density core”, “perilesional edema”). These keywords are treated as additional items and inserted into the transaction database alongside the image‑derived items. This dual‑source transaction set enables the MARI algorithm to discover rules that simultaneously involve visual features and expert concepts.
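To make the dual-source transaction set concrete, the following sketch merges one image's discretized feature items, its radiologist keywords, and its diagnosis into a single transaction. The `kw:` and `class:` prefixes and the helper name are illustrative assumptions used to keep the three item sources distinguishable.

```python
def build_transaction(feature_items, expert_keywords, class_label):
    """Combine image-derived items, expert keywords, and the diagnosis
    into one transaction for rule mining. Prefixes keep the keyword
    and class items distinct from discretized feature items."""
    txn = set(feature_items)
    txn |= {f"kw:{k}" for k in expert_keywords}
    txn.add(f"class:{class_label}")
    return txn

txn = build_transaction(
    ["entropy_bin4", "contrast_bin3"],
    ["irregular border", "perilesional edema"],
    "malignant",
)
```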
MARI extends the classic Apriori framework by introducing two pruning mechanisms. First, candidate generation is constrained to itemsets that contain at least one expert‑keyword, dramatically reducing the search space. Second, any candidate that fails to meet both a minimum support of 0.05 and a minimum confidence of 0.70 is discarded immediately, preventing the proliferation of weak rules. After rule generation, redundant rules sharing the same antecedent are merged, retaining only the rule with the highest confidence. The resulting rule base is compact (approximately 1,200 rules for the entire dataset) and highly interpretable.
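The two pruning mechanisms and the redundancy merge can be sketched as a simplified, brute-force miner in the spirit of MARI (the real algorithm builds on Apriori's level-wise candidate generation; this toy version simply enumerates small itemsets). The item naming scheme and the toy transactions are assumptions for illustration.

```python
from itertools import combinations

MIN_SUPPORT = 0.05      # thresholds stated in the paper
MIN_CONFIDENCE = 0.70
CLASSES = {"class:normal", "class:benign", "class:malignant"}

def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def mine_rules(transactions, max_len=2):
    """Mine class rules with MARI-style pruning: antecedents must contain
    at least one expert keyword, weak candidates are dropped by support
    and confidence, and redundant rules sharing an antecedent are merged,
    keeping only the highest-confidence consequent."""
    items = sorted({i for t in transactions for i in t} - CLASSES)
    rules = {}  # antecedent (frozenset) -> (class label, confidence)
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            ante = frozenset(combo)
            if not any(i.startswith("kw:") for i in ante):
                continue  # pruning 1: require an expert keyword
            sup_a = support(ante, transactions)
            if sup_a < MIN_SUPPORT:
                continue  # pruning 2a: minimum support
            for cls in CLASSES:
                conf = support(ante | {cls}, transactions) / sup_a
                if conf < MIN_CONFIDENCE:
                    continue  # pruning 2b: minimum confidence
                if ante not in rules or conf > rules[ante][1]:
                    rules[ante] = (cls, conf)  # merge redundant rules
    return rules

# Toy transaction set (items and keywords are illustrative)
txns = [
    {"entropy_bin4", "kw:irregular border", "class:malignant"},
    {"entropy_bin4", "kw:irregular border", "class:malignant"},
    {"entropy_bin1", "kw:smooth border", "class:benign"},
    {"entropy_bin0", "kw:no lesion", "class:normal"},
]
rules = mine_rules(txns)
```

Note how `entropy_bin4` on its own never becomes an antecedent despite having adequate support: the keyword constraint removes it before support is even computed, which is what shrinks the search space.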
Classification proceeds by matching the discretized feature set of a test image against the antecedents of the stored rules. For each matched rule, the associated class label (normal, benign, malignant) contributes a weight equal to the rule’s confidence. The class with the highest cumulative weight is assigned to the image. In cases where two classes receive identical weights, a secondary tie‑breaker evaluates the degree of overlap between the image’s expert‑keyword items and the rule antecedents, favoring the class with greater semantic agreement.
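The weighted voting scheme with the keyword-overlap tie-breaker can be sketched as follows; the rule-base representation (antecedent frozenset mapped to a class label and confidence) and the sample rules are assumptions chosen for illustration.

```python
def classify(image_items, rules):
    """Assign the class whose matched rules accumulate the most
    confidence; break ties by the number of expert-keyword items
    shared between the image and the matching antecedents."""
    image_items = set(image_items)
    weight, kw_overlap = {}, {}
    for ante, (cls, conf) in rules.items():
        if ante <= image_items:  # antecedent fully matched
            weight[cls] = weight.get(cls, 0.0) + conf
            kw_overlap[cls] = kw_overlap.get(cls, 0) + sum(
                1 for i in ante if i.startswith("kw:"))
    if not weight:
        return None  # no rule fired
    # Primary key: cumulative confidence; secondary key: keyword agreement.
    return max(weight, key=lambda c: (weight[c], kw_overlap[c]))

# Hypothetical rule base: antecedent -> (class label, confidence)
rules = {
    frozenset({"kw:irregular border"}): ("class:malignant", 0.95),
    frozenset({"kw:irregular border", "entropy_bin4"}): ("class:malignant", 0.90),
    frozenset({"kw:smooth border"}): ("class:benign", 0.85),
}
pred = classify({"entropy_bin4", "kw:irregular border"}, rules)
```

Here both malignant rules match the test image, so their confidences accumulate (0.95 + 0.90) and the image is labeled malignant.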
The experimental evaluation uses a curated collection of 300 pre‑diagnosed CT slices (100 per class) obtained from a single tertiary hospital. A five‑fold cross‑validation protocol ensures that each image appears in the test set exactly once. Performance metrics include accuracy, sensitivity (recall), specificity, precision, and F1‑score. The proposed system achieves an overall classification accuracy of 93 %, with a malignant‑tumor sensitivity of 96 % and a benign‑tumor sensitivity of 91 %. Specificity for the normal class reaches 94 %. Compared with a baseline SVM classifier trained on the same texture features (overall accuracy ≈ 85 %, malignant sensitivity ≈ 88 %), the MARI‑based approach yields statistically significant improvements (p < 0.01).
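The per-class metrics reported above are one-vs-rest quantities; a minimal sketch of how they are computed from predicted labels (the toy label vectors are illustrative, not the paper's data):

```python
def per_class_metrics(y_true, y_pred, cls):
    """One-vs-rest sensitivity, specificity, precision, and F1 for `cls`."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    tn = sum(t != cls and p != cls for t, p in zip(y_true, y_pred))
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
    return sens, spec, prec, f1

# Toy labels: m = malignant, b = benign, n = normal
y_true = ["m", "m", "m", "b", "b", "n"]
y_pred = ["m", "m", "b", "b", "b", "n"]
sens, spec, prec, f1 = per_class_metrics(y_true, y_pred, "m")
```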
The authors discuss several practical implications. Because the rule set is human‑readable, clinicians can inspect which visual cues and expert concepts contributed to a particular diagnosis, facilitating trust and potential correction of misclassifications. The multi‑keyword matching strategy also allows a single image to carry multiple diagnostic hints, reducing the risk of error propagation that plagues single‑label classifiers. However, the method’s memory footprint grows with the number of distinct items, and the authors acknowledge that scaling to larger multi‑institutional datasets may require additional compression techniques such as FP‑Growth‑based rule summarization or meta‑heuristic rule selection (e.g., genetic algorithms).
Future work is outlined in three directions: (1) extending the framework to three‑dimensional MRI and PET modalities, thereby enriching shape and functional descriptors; (2) integrating deep‑learning feature extractors (e.g., convolutional neural networks) with the rule‑based reasoning layer to form a hybrid system that benefits from both high‑level abstraction and interpretability; and (3) validating the approach on multi‑center, heterogeneous datasets to assess generalizability across scanner models and patient populations.
In conclusion, the paper demonstrates that a carefully pruned association‑rule mining strategy, when combined with domain expert annotations, can deliver a highly accurate, transparent, and clinically useful tool for brain‑tumor classification from CT images. The reported 96 % sensitivity for malignant lesions suggests strong potential for early detection, while the overall 93 % accuracy positions the method as a competitive alternative to conventional black‑box machine‑learning pipelines.