Analysis of Different Approaches of Parallel Block Processing for K-Means Clustering Algorithm

📝 Abstract

Distributed computation has been a recent trend in engineering research. Parallel computation is widely used in areas such as data mining, image processing, model simulation, and aerodynamics. One major use of parallel processing is clustering satellite images with dimensions larger than 1000x1000 on a legacy system. This paper focuses on three approaches to parallel block processing: row-shaped, column-shaped, and square-shaped blocks. These approaches are applied to the K-Means clustering algorithm, a classification method widely used for feature detection in high-resolution orthoimagery satellite images. The analysis shows that the approaches reduce execution time and improve measured performance compared with the sequential K-Means clustering algorithm.

📄 Content

Rashmi C (a)

Keywords: Approaches, K-Means, Parallel Processing, Satellite Images.

1. INTRODUCTION

Image classification plays a major role in processing remotely sensed images. The intent of classifying an image is to group its pixels based on similar properties [1]. Satellite image data sets with sizes in the kilobytes, megabytes, and gigabytes are common in image classification. K-means is an unsupervised classification method that is widely used in image analysis, bioinformatics, pattern recognition, and statistical data analysis [2]. One major use of clustering is the classification of satellite images. Because the time complexity of sequential K-means clustering is high for high-resolution satellite images whose dimensions exceed 1000x1000, such images are candidates for parallel processing. Parallel block processing follows the single program, multiple data (SPMD) [3] parallel programming model, which allows greater control over the parallelization of tasks. The tasks can be distributed and assigned to processes, labs, or sessions. In block processing, an image is divided into blocks, an operation is applied to the blocks in parallel by distributing it among the workers (hence the name task parallelism), and the processed blocks are then reassembled to form the output image. Existing clustering algorithms exhibit high computational time for larger images with pixel dimensions greater than 1000. The main focus of this paper is to tackle this problem with block processing performed in parallel in the MATLAB programming environment. The main advantage is efficient use of the multi-core architectures available in current commercial processors, which in turn increases the speedup of the clustering process for larger images. The presented approach therefore does not require special hardware and can run on commercially available machines.
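
The split, process, and reassemble cycle described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB environment used in the paper. The function names, the grayscale list-of-rows image, and the fixed centroids are assumptions made for illustration, not the authors' implementation. Only the pixel-labelling step of K-means is shown, applied to bands of rows in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest_centroid(pixel, centroids):
    """Return the index of the centroid closest to a scalar pixel value."""
    return min(range(len(centroids)), key=lambda k: (pixel - centroids[k]) ** 2)


def label_block(block, centroids):
    """Label every pixel of one block (a list of rows) with its nearest centroid."""
    return [[nearest_centroid(p, centroids) for p in row] for row in block]


def split_into_row_blocks(image, n_blocks):
    """Cut the image into at most n_blocks bands of consecutive rows."""
    rows = len(image)
    size = max(1, -(-rows // n_blocks))  # ceiling division, never zero
    return [image[i:i + size] for i in range(0, rows, size)]


def parallel_label(image, centroids, n_workers=4):
    """Split the image, label the blocks concurrently, then reassemble them."""
    blocks = split_into_row_blocks(image, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves input order, so blocks come back in image order.
        labelled = list(pool.map(lambda b: label_block(b, centroids), blocks))
    # Reassembly: concatenating the row bands restores the full label image.
    return [row for block in labelled for row in block]
```

Threads are used here only to keep the sketch short. In standard CPython, a CPU-bound workload like this one would need separate processes (for example `ProcessPoolExecutor`) to obtain a real speedup on multiple cores.
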
An efficient parallel block processing algorithm is designed in this paper to reduce the processing time of satellite images with dimensions larger than 1000x1000, which leads to maximum CPU usage on standalone systems. Three block approaches are considered: row-shaped, column-shaped, and square-shaped. These approaches are analyzed and studied experimentally, and the experimental results for each are illustrated in the following sections. The proposed algorithm is therefore more efficient for processing satellite images. The remainder of the paper is organized as follows: Section 2 presents the literature survey, and Section 3 explains the different approaches to parallel block processing (row-shaped, column-shaped, and square-shaped), followed by experimental results comparing the serial execution time with that of the different parallel block processing approaches and their performance.
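
To make the three block shapes concrete, the sketch below computes block boundaries for each one. It is written in Python for illustration and is not the paper's MATLAB code. The names `split_range` and `block_bounds`, and the convention that `n` means the number of bands (or tiles per side for the square case), are assumptions, since this excerpt does not give the block dimensions the authors used.

```python
def split_range(length, parts):
    """Split range(length) into contiguous (start, stop) pieces of near-equal size."""
    parts = max(1, min(parts, length))
    base, extra = divmod(length, parts)
    bounds, start = [], 0
    for i in range(parts):
        # The first `extra` pieces take one additional element each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def block_bounds(height, width, shape, n):
    """Return (row_start, row_stop, col_start, col_stop) for every block.

    shape == "row":    n horizontal bands spanning the full width
    shape == "column": n vertical bands spanning the full height
    shape == "square": an n x n grid of tiles (exactly square only when
                       the image itself is square)
    """
    if shape == "row":
        row_parts, col_parts = n, 1
    elif shape == "column":
        row_parts, col_parts = 1, n
    elif shape == "square":
        row_parts, col_parts = n, n
    else:
        raise ValueError("shape must be 'row', 'column', or 'square'")
    return [(r0, r1, c0, c1)
            for r0, r1 in split_range(height, row_parts)
            for c0, c1 in split_range(width, col_parts)]
```

For a 1000x1000 image and `n = 4`, the row-shaped layout yields four 250x1000 bands, the column-shaped layout four 1000x250 bands, and the square-shaped layout sixteen 250x250 tiles. Each block can then be handed to a separate worker.
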
2. LITERATURE SURVEY
Jerril Mathson Mathew and Jyothis Joseph [4] exploited Hadoop for the implementation of the K-Means algorithm, using the MapReduce programming model for clustering. Clustering is one of the momentous tasks in data mining. In their paper, the authors describe the basic components of the Hadoop platform [5] and the workflow of all stages of MapReduce, including the structural relationship of the HDFS framework. Manasi N. Joshi [6] exploited data parallelism for the K-Means clustering algorithm using the message passing model. That paper mainly focuses on splitting and distributing data sets among processes for computation. The Message Passing Interface (MPI) [7] programming model has been employed for computation of vector clus

(a) High-Performance Computing Project, Department of Studies in Computer Science, University of Mysore, Mysuru 570 006, India. Contact: rashmi.hpc@gmail.com

This content is AI-processed based on ArXiv data.
