Analysis of Different Approaches of Parallel Block Processing for K-Means Clustering Algorithm
Rashmi C (a)
Abstract: Distributed computation has been a recent trend in engineering research. Parallel computation is widely used in areas such as data mining, image processing, model simulation, and aerodynamics. One major use of parallel processing is clustering satellite images with dimensions larger than 1000x1000 on a legacy system. This paper focuses on three approaches to parallel block processing: row-shaped, column-shaped, and square-shaped blocks. These approaches are applied to a classification problem through the K-Means clustering algorithm, which is widely used for feature detection in high-resolution orthoimagery satellite images. The approaches are analyzed and shown to reduce execution time and improve measured performance compared with the sequential K-Means clustering algorithm.
Keywords: Approaches, K-Means, Parallel Processing, Satellite Images.
1. INTRODUCTION
Image classification plays a major role in processing remotely sensed images. The intent of classifying images is to group pixels based on similar properties [1]. Satellite image data sets ranging in size from kilobytes to gigabytes are common in classification work. K-Means is an unsupervised classification method widely used in image analysis, bioinformatics, pattern recognition, and statistical data analysis [2]. One major use of clustering is the classification of satellite images. Because the computational cost of sequential K-Means clustering is high for high-resolution satellite images with dimensions greater than 1000x1000, the algorithm is a natural candidate for parallel processing.
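To make the cost argument concrete, the following is a minimal sketch of sequential K-Means on scalar pixel intensities. It is written in Python rather than the Matlab environment used in this paper, and the function name `kmeans_1d`, the fixed iteration count, and the random initialisation are illustrative assumptions, not the implementation evaluated here. Each iteration compares every pixel with every centroid, so the work grows with the number of pixels times the number of clusters.

```python
import random

def kmeans_1d(pixels, k, iterations=10, seed=0):
    """Cluster scalar pixel intensities with the standard K-Means iteration.

    Hypothetical helper for illustration; assumes at least k distinct values.
    """
    rng = random.Random(seed)
    # Initialise the centroids from k distinct pixel values.
    centroids = rng.sample(sorted(set(pixels)), k)
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel is compared with every centroid,
        # which costs O(n * k) per iteration for n pixels.
        labels = [min(range(k), key=lambda c: abs(p - centroids[c]))
                  for p in pixels]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels

if __name__ == "__main__":
    centroids, labels = kmeans_1d([10, 12, 11, 200, 202, 198], k=2)
    print(sorted(centroids), labels)
```

For a 1000x1000 image the assignment step already touches one million pixels per iteration, which is the cost the block-based approaches aim to spread across cores.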
Parallel block processing follows the single program, multiple data (SPMD) [3] parallel programming model, which allows greater control over the parallelization of tasks. Tasks can be distributed and assigned to processes, labs, or sessions. In block processing, an image is divided into blocks, the same operation is applied to the blocks in parallel by distributing the operation among the workers (hence the name task parallelism), and the processed blocks are then reassembled to form the output image. Existing clustering algorithms exhibit high computation times for larger images with pixel dimensions greater than 1000. The main focus of this paper is to tackle this problem with block processing performed in parallel in the Matlab programming environment. The main advantage is efficient use of the multi-core architectures available in current commercial processors, which in turn speeds up the clustering of larger images. The presented approach therefore does not require special hardware and can run on commercially available machines.
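The split, process, and reassemble cycle described above can be sketched as follows. This is an illustrative Python analogue, not the Matlab code used in this paper: the helper names (`split_into_blocks`, `process_blocks_in_parallel`, `threshold_block`) are hypothetical, and a thread pool stands in for Matlab workers. In CPython a thread pool does not give CPU parallelism for pure-Python work, so a process pool would be the closer match on a multi-core machine; the thread pool is used here only to keep the sketch short.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_blocks(image, block_rows, block_cols):
    """Yield (top, left, block) tiles covering a 2-D list-of-lists image."""
    height, width = len(image), len(image[0])
    for top in range(0, height, block_rows):
        for left in range(0, width, block_cols):
            block = [row[left:left + block_cols]
                     for row in image[top:top + block_rows]]
            yield top, left, block

def process_blocks_in_parallel(image, block_rows, block_cols, operation,
                               workers=4):
    """Apply `operation` to every block concurrently, then reassemble."""
    height, width = len(image), len(image[0])
    output = [[None] * width for _ in range(height)]
    tiles = list(split_into_blocks(image, block_rows, block_cols))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The same operation runs on different blocks (SPMD style).
        results = pool.map(lambda tile: operation(tile[2]), tiles)
        for (top, left, _), processed in zip(tiles, results):
            # Write each processed block back at its original position.
            for r, row in enumerate(processed):
                output[top + r][left:left + len(row)] = row
    return output

def threshold_block(block, cutoff=128):
    """Toy per-block operation: binarise pixel values (assumed example)."""
    return [[1 if value >= cutoff else 0 for value in row] for row in block]

if __name__ == "__main__":
    image = [[(r * 37 + c * 53) % 256 for c in range(6)] for r in range(4)]
    print(process_blocks_in_parallel(image, 2, 3, threshold_block))
```

Because the toy operation is purely per-pixel, the reassembled output is identical to applying it to the whole image at once; a clustering step applied per block would need its own rule for combining results across blocks.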
An efficient parallel block processing algorithm is designed in this paper to reduce the processing time of satellite images with dimensions larger than 1000x1000, which leads to maximum CPU usage on standalone systems. Three block shapes are considered: row-shaped, column-shaped, and square-shaped. These approaches are analyzed and studied experimentally, and the experimental results for each approach are illustrated in the following sections. Hence the proposed algorithm is more efficient for processing satellite images.
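As a rough illustration of how the three block shapes differ, the sketch below derives a block size for each shape from the image dimensions and a worker count. The exact block dimensions used in the experiments are not stated in this section, so the rules here are assumptions: row-shaped blocks are taken to be full-width strips, column-shaped blocks full-height strips, and square-shaped blocks the tiles of a roughly square grid (which are truly square only when the image itself is square). The function names are hypothetical.

```python
import math

def block_size(height, width, workers, shape):
    """Return (block_rows, block_cols) for an assumed block-shape rule."""
    if shape == "row":
        # Full-width horizontal strips, one strip per worker.
        return math.ceil(height / workers), width
    if shape == "column":
        # Full-height vertical strips, one strip per worker.
        return height, math.ceil(width / workers)
    if shape == "square":
        # Tiles of a side x side grid, with side = ceil(sqrt(workers)).
        side = math.ceil(math.sqrt(workers))
        return math.ceil(height / side), math.ceil(width / side)
    raise ValueError(f"unknown block shape: {shape}")

def count_blocks(height, width, block_rows, block_cols):
    """Number of blocks needed to cover the whole image."""
    return math.ceil(height / block_rows) * math.ceil(width / block_cols)

if __name__ == "__main__":
    for shape in ("row", "column", "square"):
        rows, cols = block_size(1000, 1000, 4, shape)
        print(shape, (rows, cols), count_blocks(1000, 1000, rows, cols))
```

Under these assumptions all three shapes yield the same number of blocks for a given worker count, so any difference in execution time would come from how each shape interacts with memory layout and data transfer rather than from the amount of work.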
The remainder of the paper is organized as follows. Section 2 presents the literature survey. Section 3 explains the different approaches to parallel block processing (row-shaped, column-shaped, and square-shaped), followed by experimental results comparing the serial execution time with that of each parallel block processing approach, along with its performance.
2. LITERATURE SURVEY
Jerril Mathson Mathew and Jyothis Joseph [4] used Hadoop to implement the K-Means algorithm, applying the MapReduce programming model to the clustering algorithm. Clustering is one of the momentous tasks in data mining. In that paper the authors describe the basic components of the Hadoop platform [5] and the workflow of all stages of MapReduce, including the structural relationship of the HDFS framework. Manasi N. Joshi [6] exploited data parallelism for the K-Means clustering algorithm using the message-passing model. That paper mainly focuses on splitting and distributing data sets among processes for computation. Shared memory programming model a message passing interface (MPI) [7] has been employed for computation of vector clus
(a) High-Performance Computing Project, Department of Studies in Computer Science, University of Mysore, Mysuru 570 006, India. Contact: rashmi.hpc@gmail.com