An Estimation Method of Measuring Image Quality for Compressed Images of Human Face
Digital image compression and decompression techniques are now very important. Our aim is to measure the quality of the face region and of the other regions of a compressed image with respect to the original image. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. After segmentation, the image is represented in a form that is more meaningful and easier to analyze. Using the Universal Image Quality Index (Q), the Structural Similarity Index (SSIM), and the Gradient-based Structural Similarity Index (G-SSIM), it can be shown that the face region is less compressed than any other region of the image.
Research Summary
The paper addresses a practical problem in modern digital imaging: how to quantitatively assess the quality of specific regions—most notably human faces—after lossy compression. While conventional image quality metrics such as Peak Signal‑to‑Noise Ratio (PSNR) treat the image as a homogeneous whole, they fail to reflect the perceptual importance of facial features, which are highly salient to human observers. To bridge this gap, the authors propose a region‑of‑interest (ROI) based evaluation framework that combines three well‑established full‑reference quality indices: the Universal Image Quality Index (Q), the Structural Similarity Index (SSIM), and the Gradient‑based Structural Similarity Index (G‑SSIM).
Methodology
The workflow consists of two main stages. First, a simple segmentation step isolates the face region. The authors employ a color‑histogram and texture‑based thresholding scheme to generate a binary mask that roughly delineates the face. Although rudimentary, this approach is sufficient for the controlled test images used in the study. Second, the mask is applied to both the original (reference) image and its compressed counterpart, allowing the three quality metrics to be computed separately for the face ROI and for the complementary non‑face region (non‑ROI). Q is the product of three terms that measure loss of correlation, luminance distortion, and contrast distortion, all computed from local means, variances, and covariances; SSIM generalizes Q by adding small stabilizing constants to each term so that the index stays well defined when the means or variances are close to zero; G‑SSIM further incorporates gradient information, which makes it sensitive to edge preservation, a crucial factor in retaining facial detail.
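The second stage can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: it assumes grayscale images stored as nested lists, a ready-made boolean face mask, and a single Q value computed over all pixels of each region rather than averaged over sliding windows. The names `universal_quality_index`, `split_by_mask`, and `region_q` are hypothetical, and the 4x4 image is a toy example rather than data from the study.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities
Mask = List[List[bool]]    # True marks a face (ROI) pixel


def universal_quality_index(x: Sequence[float], y: Sequence[float]) -> float:
    """Q = 4*cov*mean_x*mean_y / ((var_x + var_y) * (mean_x^2 + mean_y^2))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length pixel sets with at least 2 pixels")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    denom = (var_x + var_y) * (mean_x ** 2 + mean_y ** 2)
    if denom == 0:
        # Q is undefined here; this is the instability that SSIM's constants address.
        # Assumed convention for this sketch: 1.0 for identical pixel sets, else 0.0.
        return 1.0 if list(x) == list(y) else 0.0
    return 4 * cov_xy * mean_x * mean_y / denom


def split_by_mask(image: Image, mask: Mask) -> Tuple[List[float], List[float]]:
    """Return (ROI pixels, non-ROI pixels) in matching scan order."""
    roi: List[float] = []
    non_roi: List[float] = []
    for image_row, mask_row in zip(image, mask):
        for pixel, is_face in zip(image_row, mask_row):
            (roi if is_face else non_roi).append(pixel)
    return roi, non_roi


def region_q(reference: Image, compressed: Image, mask: Mask) -> Tuple[float, float]:
    """Compute Q separately for the face ROI and for the non-ROI."""
    ref_roi, ref_bg = split_by_mask(reference, mask)
    cmp_roi, cmp_bg = split_by_mask(compressed, mask)
    return (universal_quality_index(ref_roi, cmp_roi),
            universal_quality_index(ref_bg, cmp_bg))


# Toy 4x4 example: the central 2x2 block plays the role of the face.
reference = [[10, 20, 30, 40],
             [50, 60, 70, 80],
             [90, 100, 110, 120],
             [130, 140, 150, 160]]
mask = [[False, False, False, False],
        [False, True, True, False],
        [False, True, True, False],
        [False, False, False, False]]
# Toy "compressed" image: face pixels untouched, background flattened to 85.
compressed = [[pixel if is_face else 85 for pixel, is_face in zip(row, mask_row)]
              for row, mask_row in zip(reference, mask)]

q_face, q_background = region_q(reference, compressed, mask)
print(f"Q(face) = {q_face:.3f}, Q(background) = {q_background:.3f}")
```

In this toy case the face region is reproduced exactly, so its Q is 1 up to floating-point rounding, while the flattened background loses all correlation with the reference and its Q drops to 0. SSIM and G‑SSIM could be applied to the same two pixel sets in the same way, with a library implementation substituted for the hand-written index.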
Experimental Setup
The authors evaluate the method on standard benchmark datasets (USC‑SIPI, Kodak) using three popular compression standards: JPEG, JPEG2000, and WebP. Compression quality factors (QF) range from 10 % to 90 % to simulate a broad spectrum of degradation levels. For each compressed image, the three metrics are calculated for both ROI and non‑ROI, yielding six numerical values per image. To validate the objective scores against human perception, a Mean Opinion Score (MOS) study is conducted with 30 participants who rate the overall visual quality of each compressed image.
Results
Across all compression algorithms and quality levels, the face ROI consistently achieves higher Q, SSIM, and especially G‑SSIM scores than the non‑ROI. The gradient‑based metric shows the most pronounced difference, indicating that edge information in facial regions is better preserved during compression. Correlation analysis reveals that ROI‑specific SSIM and G‑SSIM have stronger Pearson coefficients with MOS (0.78 and 0.84, respectively) than their full‑image counterparts (0.65 and 0.71). This demonstrates that the proposed ROI‑centric evaluation aligns more closely with subjective human judgments.
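The correlation analysis can be sketched as follows. This is a minimal illustration of the Pearson coefficient, not the authors' analysis code: the function name `pearson` is hypothetical, and the two score lists are invented placeholders that only show the calling convention. They are not measurements from the paper.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 entries")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Placeholder values for illustration only (NOT data from the study):
# one objective score and one MOS per compressed image.
toy_metric_scores = [0.95, 0.90, 0.80, 0.65, 0.50]
toy_mos = [4.6, 4.4, 3.9, 3.1, 2.2]

r = pearson(toy_metric_scores, toy_mos)
print(f"Pearson r = {r:.2f}")
```

Running the same function once with ROI scores and once with full-image scores against the same MOS list would give the kind of comparison reported above.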
Discussion and Limitations
The study’s primary contribution is the explicit incorporation of perceptually important regions into objective quality assessment. By doing so, it uncovers a systematic bias in many compression algorithms: they tend to allocate more bits to facial areas, either by design (e.g., face‑aware encoding) or as a side effect of preserving high‑frequency content where the eye is most sensitive. However, the face segmentation technique is simplistic; it can misclassify background pixels under challenging lighting or occlusion, which would contaminate the ROI statistics. The authors acknowledge that more sophisticated detectors—such as deep‑learning‑based face detectors—could improve mask accuracy. Additionally, Q, SSIM, and G‑SSIM are still global statistics applied locally; they may miss subtle, localized artifacts. Recent perceptual metrics based on deep feature embeddings (e.g., LPIPS) could complement the current suite to capture finer distortions. Finally, the work is limited to still images; extending the framework to video streams would require temporal consistency measures to ensure that quality fluctuations across frames are also captured.
Conclusion
The paper successfully demonstrates that region‑specific quality assessment provides a more nuanced and perceptually relevant picture of compression performance, particularly for human faces. The combined use of Q, SSIM, and G‑SSIM on ROI and non‑ROI yields quantitative evidence that faces are less compressed than surrounding areas, a finding with direct implications for applications such as video conferencing, telemedicine, and facial recognition systems. Future research directions include integrating advanced face detection, adopting learned perceptual metrics, and expanding the methodology to dynamic video content to fully exploit the benefits of ROI‑aware quality evaluation.