Edge direction matrices-based local binary patterns descriptor for shape pattern recognition

Shape and texture image recognition is an essential branch of pattern recognition. It comprises techniques that aim to extract information from images using human knowledge and work. The Local Binary Pattern (LBP) encodes both global and local information and achieves scaling invariance by introducing a look-up table that reflects the uniformity structure of an object. Edge direction matrices (EDMS), by contrast, provide only a global invariant descriptor, which employs first- and second-order relationships. The main idea behind this methodology is the need for improved recognition capability, a goal achieved by combining these two descriptors. The combination aims to exploit the major advantages of each descriptor while letting them complement each other, so that the weak points of each are alleviated. Using multiple classifier approaches, namely random forest and a multi-layer perceptron neural network, the proposed combined descriptor is compared with state-of-the-art combined methods based on the Gray-Level Co-occurrence Matrix (GLCM with EDMS), with LBP, and with moment invariants on four benchmark datasets: MPEG-7 CE-Shape-1, KTH-TIPS image, EnglishFnt, and Arabic calligraphy. The experiments show the superiority of the introduced descriptor over GLCM with EDMS, LBP, and moment invariants, as well as over other well-known descriptors from the literature such as the Scale Invariant Feature Transform.

💡 Research Summary

The paper introduces a novel hybrid descriptor that combines Edge Direction Matrices (EDMS) – a global shape descriptor – with Local Binary Patterns (LBP) – a local texture descriptor – to improve shape and texture recognition. EDMS is obtained by applying a Sobel operator to extract edges, quantizing edge orientations into four directions (0°, 45°, 90°, 135°), and building first‑ and second‑order relationship histograms, resulting in a 16‑dimensional feature vector that captures the overall geometric layout of an image. LBP, on the other hand, encodes the binary relationship between a central pixel and its eight neighbours, and by focusing on uniform patterns the histogram is reduced to 59 dimensions, providing rotation‑ and scale‑invariant local texture information.
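As a minimal sketch of the LBP half of this description, the following NumPy code builds the uniform-pattern look-up table for an 8-neighbour LBP and produces the 59-bin histogram (58 uniform codes plus one shared bin for all non-uniform codes). The function names, the 3×3 neighbourhood without interpolation, and the `>=` comparison are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def transitions(code: int, bits: int = 8) -> int:
    """Count circular 0/1 transitions in an LBP code."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")

# A code is "uniform" when it has at most two circular transitions.
UNIFORM = [c for c in range(256) if transitions(c) <= 2]

# Look-up table: uniform codes get their own bin, all others share the last bin.
LOOKUP = np.full(256, len(UNIFORM), dtype=np.int64)
LOOKUP[UNIFORM] = np.arange(len(UNIFORM))

def lbp_uniform_histogram(img) -> np.ndarray:
    """Normalized 59-bin uniform LBP histogram of a 2-D grayscale image."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Eight neighbours, clockwise from the top-left pixel (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(LOOKUP[codes].ravel(), minlength=len(UNIFORM) + 1)
    return hist / hist.sum()

rng = np.random.default_rng(0)
hist = lbp_uniform_histogram(rng.random((32, 32)))
print(len(UNIFORM), hist.shape)
```

The look-up table is what reduces the 256 possible codes to 59 bins; border pixels are skipped here for simplicity.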

Rather than simply concatenating the two vectors, the authors propose a weighting scheme that uses the uniform‑pattern table of LBP to modulate the EDMS histogram. Patterns that share the same dominant orientation receive higher weights, while mismatched orientations are down‑weighted, thereby reducing redundancy and enhancing complementary information. The final descriptor has 75 dimensions and simultaneously represents global edge structure and local texture.
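The summary does not specify the weighting formula, so the sketch below only illustrates the dimension bookkeeping (16 + 59 = 75). The `fuse` function and its optional `edms_weights` argument are hypothetical placeholders; with no weights supplied it reduces to plain concatenation of the two normalized histograms, which is not the authors' weighting scheme.

```python
import numpy as np

def l1_normalize(v) -> np.ndarray:
    """Scale a non-negative histogram so its entries sum to 1."""
    v = np.asarray(v, dtype=np.float64)
    total = v.sum()
    return v / total if total > 0 else v

def fuse(edms_hist, lbp_hist, edms_weights=None) -> np.ndarray:
    """Combine a 16-bin EDMS vector with a 59-bin LBP histogram.

    edms_weights is a hypothetical per-bin weight vector standing in for
    the LBP-driven modulation described in the text.
    """
    edms = l1_normalize(edms_hist)
    lbp = l1_normalize(lbp_hist)
    if edms_weights is not None:
        edms = edms * np.asarray(edms_weights, dtype=np.float64)
    return np.concatenate([edms, lbp])

descriptor = fuse(np.ones(16), np.ones(59))
print(descriptor.shape)
```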

To evaluate the descriptor, four benchmark datasets were used: MPEG‑7 CE‑Shape‑1 (geometric shapes), KTH‑TIPS (material textures with varying illumination and scale), English‑fnt (complex English calligraphy), and Arabic‑calligraphy (intricate Arabic script). All images were pre‑processed with grayscale conversion, normalization, and 3×3 Sobel edge detection. The hybrid descriptor was compared against several state‑of‑the‑art methods: GLCM‑EDMS (global texture + edge), standalone LBP, moment invariants, and Scale‑Invariant Feature Transform (SIFT).
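The pre-processing steps named above can be sketched in NumPy as follows. The luminance weights, min-max normalization, edge-replicating padding, and magnitude threshold are assumptions chosen for illustration. Note that the histogram quantizes the gradient orientation into the four directions 0°, 45°, 90°, and 135°; it is a simplified stand-in and not the full EDMS construction.

```python
import numpy as np

def to_gray(rgb) -> np.ndarray:
    """Convert an H x W x 3 image to grayscale (assumed luminance weights)."""
    return np.asarray(rgb, dtype=np.float64) @ np.array([0.299, 0.587, 0.114])

def normalize(gray) -> np.ndarray:
    """Min-max scale intensities to [0, 1]."""
    lo, hi = gray.min(), gray.max()
    return np.zeros_like(gray) if hi == lo else (gray - lo) / (hi - lo)

def sobel(gray):
    """3x3 Sobel gradients with edge-replicated borders."""
    p = np.pad(gray, 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return gx, gy

def direction_histogram(gx, gy, threshold: float = 0.1) -> np.ndarray:
    """Count edge pixels per quantized gradient orientation (0, 45, 90, 135)."""
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bins = np.round(angle / 45.0).astype(int) % 4
    return np.bincount(bins[magnitude > threshold], minlength=4)

# Toy image: dark left half, bright right half (a single vertical edge).
step = np.zeros((6, 6))
step[:, 3:] = 1.0
rgb = np.stack([step] * 3, axis=-1)

gx, gy = sobel(normalize(to_gray(rgb)))
hist = direction_histogram(gx, gy)
print(hist)
```

For the toy image, all edge pixels fall into the 0° bin because the intensity changes only along the horizontal axis.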

Two classification frameworks were employed: Random Forest (RF) and a Multi‑Layer Perceptron (MLP). RF leverages an ensemble of decision trees to capture non‑linear interactions among features, while MLP uses hidden layers to map the high‑dimensional descriptor into a discriminative space. Both classifiers were trained and tested using 10‑fold cross‑validation, and performance was measured by accuracy, precision, recall, and F1‑score.
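A minimal sketch of this evaluation protocol with scikit-learn is shown below. The synthetic 75-dimensional data, the hyperparameters, the feature scaling before the MLP, and the macro averaging of precision, recall, and F1 are all assumptions made for illustration; the paper's actual settings are not given in the summary.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 75-dimensional descriptors with four classes.
X, y = make_classification(n_samples=200, n_features=75, n_informative=10,
                           n_classes=4, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "mlp": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0),
    ),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]

def evaluate(models, X, y):
    """Return the mean 10-fold score of each metric for each model."""
    results = {}
    for name, model in models.items():
        with warnings.catch_warnings():
            # The small MLP may stop before full convergence on toy data.
            warnings.simplefilter("ignore", ConvergenceWarning)
            scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
        results[name] = {m: float(scores[f"test_{m}"].mean()) for m in scoring}
    return results

results = evaluate(models, X, y)
for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

Stratified folds keep the class proportions similar in every split, which matters when some classes have few samples.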

Results show that the combined EDMS‑LBP descriptor consistently outperforms the baselines across all datasets. Accuracy improvements range from 3.5 % to 7.2 % over GLCM‑EDMS, and the hybrid descriptor surpasses standalone LBP and moment invariants by a similar margin. The most pronounced gains appear on the calligraphy datasets, where the intricate strokes benefit from the joint modeling of global edge direction and local texture, yielding up to an 8 % increase in classification accuracy. Both RF and MLP achieve comparable results, indicating that the descriptor itself is robust and not overly dependent on a specific classifier.

The authors acknowledge two main limitations. First, the descriptor’s dimensionality (75) raises computational cost for training and inference, which could be problematic for real‑time applications. Second, the Sobel‑based edge extraction is sensitive to noise; noisy images may require additional denoising steps before descriptor computation. Future work is suggested to explore dimensionality‑reduction techniques such as PCA or LDA, and to integrate the hybrid descriptor into deep‑learning pipelines where automatic feature learning could further enhance performance while reducing computational overhead.

In conclusion, the paper demonstrates that fusing global edge direction information with local binary texture patterns yields a powerful, invariant descriptor that surpasses existing methods in shape and texture recognition tasks. The approach is applicable to a broad range of image analysis problems, including object recognition, font classification, and cultural heritage digitization, and offers a solid foundation for further research into efficient, hybrid feature representations.