Parallel image thinning through topological operators on shared memory parallel machines
In this paper, we present a concurrent implementation of a powerful topological thinning operator. This operator acts directly on grayscale images without modifying their topology. We introduce an adapted parallelization methodology that combines a split, distribute and merge (SDM) strategy with mixed parallelism techniques (data and thread parallelism). The proposed strategy enables efficient parallelization of a large class of topological operators, including λ-leveling, skeletonization, and crest-restoring algorithms. To achieve a good speedup, particular attention was paid to thread coordination: the work distributed during the thinning process is carried out by a variable number of threads. Tests on a 512 × 512 2-D grayscale image, using a shared-memory parallel machine (SMPM) with 8 CPU cores (2 × Xeon E5405 at 2 GHz), showed a speedup of 6.2 and a maximum throughput of 125 images/s using 8 threads.
💡 Research Summary
The paper presents a concurrent implementation of a topological thinning operator that works directly on grayscale images while preserving their topology. Traditional thinning algorithms are primarily designed for binary images; extending them to grayscale requires careful handling of connectivity, holes, and other topological features. To address this, the authors introduce a parallelization methodology that combines a Split‑Distribute‑Merge (SDM) strategy with mixed parallelism, i.e., data‑level parallelism inside each image block and thread‑level parallelism across blocks.
In the SDM approach, the input image is partitioned into equally sized blocks. Each block is processed independently by a worker thread, and an overlapping margin is added to the block borders to avoid topological inconsistencies when pixels near the edges are removed. After processing, the results from all blocks are merged, and a final consistency check ensures that the global topology remains unchanged. This block‑wise decomposition improves cache locality, reduces memory‑bandwidth contention, and limits synchronization to the relatively cheap merge phase.
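The split and merge steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the block size, margin width, and helper names (`split_with_margin`, `merge`) are assumptions, and the thinning pass itself is omitted. Each tile carries an overlap margin so that border pixels see their full neighborhood, while only the un-padded core is written back at merge time.

```python
import numpy as np

def split_with_margin(img, block, margin):
    """Split img into square blocks, each padded with an overlap margin."""
    h, w = img.shape
    tiles = []
    for y in range(0, h, block):
        for x in range(0, w, block):
            y0, x0 = max(y - margin, 0), max(x - margin, 0)
            y1, x1 = min(y + block + margin, h), min(x + block + margin, w)
            # Store the tile plus where its un-padded core sits inside it.
            tiles.append((img[y0:y1, x0:x1].copy(), (y, x, y - y0, x - x0)))
    return tiles

def merge(tiles, shape, block):
    """Write each tile's core region back into the output image."""
    out = np.empty(shape, dtype=tiles[0][0].dtype)
    for data, (y, x, oy, ox) in tiles:
        h = min(block, shape[0] - y)
        w = min(block, shape[1] - x)
        out[y:y+h, x:x+w] = data[oy:oy+h, ox:ox+w]
    return out

img = np.random.randint(0, 256, (512, 512), dtype=np.uint8)
tiles = split_with_margin(img, block=128, margin=2)
restored = merge(tiles, img.shape, 128)
```

A thinning pass would run between `split_with_margin` and `merge`; with no processing, the round trip reproduces the input exactly, which is a useful sanity check for the decomposition.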
The mixed parallelism layer further enhances performance. Within a block, SIMD‑friendly operations evaluate the thinning criteria for many pixels simultaneously (data parallelism). Across blocks, a dynamic work‑stealing scheduler assigns blocks to a pool of threads. Because the thinning process is iterative—each iteration may leave a different number of deletable pixels—the scheduler adapts the number of active threads on‑the‑fly, balancing load and preventing idle cores.
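The dynamic scheduling idea can be approximated with a shared work queue: idle threads pull the next available block, so faster threads naturally absorb more work. This is a generic sketch, assuming a per-iteration block list; the paper's actual scheduler (and the placeholder `thin_block`) are not specified in the summary.

```python
from concurrent.futures import ThreadPoolExecutor
import queue

def thin_block(tile):
    # Placeholder for one thinning pass over a block (hypothetical).
    return tile

def process_iteration(tiles, n_threads):
    """One thinning iteration: each worker repeatedly pulls the next
    pending block from a shared queue until the queue is empty,
    giving dynamic load balance across threads."""
    work = queue.Queue()
    for i, t in enumerate(tiles):
        work.put((i, t))
    results = [None] * len(tiles)

    def worker():
        while True:
            try:
                i, t = work.get_nowait()
            except queue.Empty:
                return
            results[i] = thin_block(t)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for _ in range(n_threads):
            pool.submit(worker)
    return results

done = process_iteration(list(range(20)), n_threads=4)
```

Because the number of deletable pixels shrinks between iterations, `n_threads` could be reduced on later, lighter iterations, mirroring the adaptive thread-count mechanism the summary describes.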
The core thinning operator is generic enough to support several well‑known topological algorithms, including λ‑leveling, skeletonization, and crest restoration. The operator defines deletion rules based on grayscale intensity differences while explicitly checking that each deletion does not alter connectivity, the number of holes, or other topological invariants. Consequently, the algorithm can thin grayscale images without first binarizing them, preserving more image information and producing higher‑quality results.
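A concrete flavor of such a topological deletion check is the classical binary "simple point" test: a pixel may be deleted only if both its foreground and its background stay locally connected. The sketch below is the standard binary formulation (8-connected foreground, 4-connected background), not the paper's grayscale operator; grayscale variants typically apply analogous tests to the level sets of the image. All names here are illustrative.

```python
from collections import deque

RING = [(0,0),(0,1),(0,2),(1,2),(2,2),(2,1),(2,0),(1,0)]  # the 8 neighbours
EDGE = {(0,1),(1,0),(1,2),(2,1)}                          # 4-neighbours of centre

def _components(cells, conn):
    """Group a set of (row, col) cells into connected components
    under 4- or 8-adjacency; returns a list of sets."""
    comps, seen = [], set()
    for start in cells:
        if start in seen:
            continue
        comp, q = {start}, deque([start])
        seen.add(start)
        while q:
            r, c = q.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr, dc) == (0, 0) or (conn == 4 and dr and dc):
                        continue
                    nb = (r + dr, c + dc)
                    if nb in cells and nb not in seen:
                        seen.add(nb)
                        comp.add(nb)
                        q.append(nb)
        comps.append(comp)
    return comps

def is_simple(nbhd):
    """True if deleting the centre pixel of the 3x3 binary patch `nbhd`
    preserves local topology (8-connected foreground, 4-connected background)."""
    fg = {p for p in RING if nbhd[p[0]][p[1]] == 1}
    bg = set(RING) - fg
    if not fg or not bg:
        return False  # isolated pixel or interior pixel: not deletable
    fg_ok = len(_components(fg, 8)) == 1
    bg_ok = len([c for c in _components(bg, 4) if c & EDGE]) == 1
    return fg_ok and bg_ok
```

For example, a curve end point (one foreground neighbor) passes the test, while a pixel bridging two diagonal foreground components fails it, since deleting the bridge would disconnect them.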
Experimental evaluation was performed on a 512 × 512 2‑D grayscale test image using an 8‑core shared‑memory parallel machine (2 × Xeon E5405, 2 GHz each). The parallel implementation achieved a speed‑up of 6.2× compared with a sequential baseline and reached a peak throughput of 125 images per second when eight threads were employed. The authors attribute this performance to the reduced synchronization overhead, efficient cache usage due to block partitioning, and the adaptive thread‑count mechanism that kept all cores busy throughout the iterative thinning process.
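The reported figures can be unpacked with two lines of arithmetic: a 6.2× speedup on 8 cores corresponds to a parallel efficiency of about 77.5%, and 125 images/s implies roughly 8 ms of processing per 512 × 512 frame.

```python
speedup, cores = 6.2, 8
throughput = 125.0  # images per second, as reported

efficiency = speedup / cores      # fraction of ideal linear scaling
latency_ms = 1000.0 / throughput  # average time per 512x512 image

print(f"parallel efficiency: {efficiency:.1%}")    # 77.5%
print(f"time per image:      {latency_ms:.1f} ms")  # 8.0 ms
```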
The study also discusses limitations. Block-size selection is critical: blocks that are too small increase scheduling overhead, while blocks that are too large raise memory consumption and border-handling costs. Moreover, the current implementation targets 2-D images; extending the approach to 3-D volumes would require careful management of significantly larger data sets and more complex topological checks. Future work is suggested in the areas of automatic block-size tuning, NUMA-aware memory placement, and hybrid CPU-GPU execution to further improve scalability.
In summary, the paper delivers a practical, high‑performance framework for topologically safe grayscale image thinning on shared‑memory multiprocessors. By integrating SDM with mixed parallelism, it demonstrates that a broad class of topological operators can be parallelized effectively, opening the door to real‑time applications in medical imaging, computer vision, and other domains where preserving image topology is essential.