Public Cluster : parallel machine with multi-block approach


📝 Original Info

  • Title: Public Cluster : parallel machine with multi-block approach
  • ArXiv ID: 0708.0603
  • Date: 2007-08-07
  • Authors: Not stated. (Author names and affiliations are missing from the source.)

📝 Abstract

We introduce a new approach to enable an open, public parallel machine that is accessible to multiple users running multiple jobs, belonging to different blocks, at the same time. The concept is required especially for parallel machines dedicated to public use, as implemented at the LIPI Public Cluster. We have deployed the simplest technique: running multiple daemons of the parallel processing engine, with a separate configuration file for each user assigned to access the system. We have also developed an integrated system to fully control and monitor the whole system over the web. A brief performance analysis is also given for the Message Passing Interface (MPI) engine. It is shown that the proposed approach is quite reliable and affects overall performance only slightly.


📄 Full Content

Along with advances in scientific research, especially in the basic natural sciences in recent decades, the need for advanced computing is increasing exponentially. Such computing requires higher hardware specifications, which in turn leads to astronomical costs. One key solution to this problem is the parallel, or cluster, machine.

Nowadays, clustering low-spec, low-cost machines has become the mainstream way to realize an advanced computing system comparable to, or in most cases better than, a conventional mainframe-based system at significantly reduced cost [1]. Generally, a cluster is designed to perform a single (huge) computational task over a certain period. This makes cluster systems in general exclusive and not affordable for most potential users, such as young beginners and small research groups, especially in developing countries.

It is clear that, in this sense, a cluster is still costly, although there is certainly a need to educate the young generation to become the next users familiar with parallel programming. This background motivates us to develop an open and free cluster system for the public, namely the LIPI Public Cluster (LPC) [2, 3, 4].

Given those characteristics, the public cluster should be accessible and user-friendly for all users, whatever their level of knowledge of parallel programming and whatever platform they use to access the system. This can be achieved by deploying web-based interfaces in all aspects. The main issues are therefore security against anonymous users, avoiding interference among different tasks running on multiple blocks of nodes, and real-time monitoring and control over the web for both administrators and users.

In this paper we present the simplest method to fulfil these requirements. In the subsequent section we first introduce the multi-block system, followed by our concept of the public cluster. Finally, we provide a brief performance test on the current system before coming to the conclusion.

In order to run a particular task on several machines, we must implement an interface for process management and communication among the nodes. In our system we deploy the popular one, the Message-Passing Interface (MPI) [5]. MPI is widely used because of its portability: MPI programs can run on any platform through the message-passing protocol. Further, we have implemented MPICH2, the successor to the MPICH implementation of MPI developed by the Argonne National Laboratory [6]. In MPICH2, process management and communication are completely separated. The initial runtime environment therefore consists of several daemons, called MPDs, each of which is responsible for initiating communication among the nodes before the main program runs. This mechanism is crucial since it enables us to tell whether an error lies in the process or in the communication.

Each MPD daemon is configurable through its configuration file or through optional command-line parameters. Since in our system each user has their own MPD configuration and daemon, MPICH2 suits our need for a cluster system that runs different parallel computations at the same time. Further, each user who is assigned to use the cluster with a particular configuration and daemon is called a block. We therefore define multi-block as the case where several users with different blocks access the system at the same time.

MPD is a process manager made up of daemons that execute the MPI modules. To form these daemons into a ring, one node must act as the master. The master node boots the MPD on each member node. This flow is depicted in Fig. 1.

Fig. 1. The multi-block system at the LPC.

Fig. 1 shows logically the creation of two MPD rings. The first block (a) contains three nodes in addition to one MPD on the master node. The second block (b) has two nodes and also one MPD on the master node. Each node is restricted to a single running MPD, while the master node is allowed to run multiple MPDs as long as each MPD belongs to a different user. Under this mechanism, interference among the nodes can be completely avoided.
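The placement rules above can be expressed as a small consistency check. The following Python sketch is illustrative only and is not part of the LPC software; the names Block and find_conflicts, as well as the user and node names, are hypothetical. It assumes each block is described by one user and the compute nodes assigned to that user, with the shared master node left implicit.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One user's MPD ring: the shared master node (implicit) plus compute nodes."""
    user: str
    nodes: tuple  # names of the compute nodes, master excluded


def find_conflicts(blocks):
    """Return a sorted list of messages, one for every violated placement rule."""
    problems = []

    # The master may run several MPDs only if each belongs to a different user.
    for user, count in Counter(b.user for b in blocks).items():
        if count > 1:
            problems.append(f"user {user} owns {count} blocks")

    # Every other node may run a single MPD, so it may appear in one ring only.
    for node, count in Counter(n for b in blocks for n in b.nodes).items():
        if count > 1:
            problems.append(f"node {node} is assigned {count} times")

    return sorted(problems)


# Hypothetical layout mirroring Fig. 1: block (a) has three nodes, block (b) two.
block_a = Block("alice", ("node01", "node02", "node03"))
block_b = Block("bob", ("node04", "node05"))
print(find_conflicts([block_a, block_b]))
```

An empty result means the layout respects both rules. A check of this kind would naturally run before the master boots a new ring, although where the LPC enforces the rules is not stated here.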

Enabling the multi-block system requires several procedures:

• Each node, including the master node, communicates with the others using passwordless access, for instance ssh with an RSA key. This can be accomplished easily without copying the authorized keys to all assigned nodes, for example by exporting the user’s home directory through NFS.

• The names of the nodes that form the MPD ring are listed in the file mpd.hosts in the user’s home directory. In the LPC, the node names are assigned by the administrator and cannot be changed by users.

• The main configuration file for MPD, .mpd.conf, is located in the user’s home directory. The important parameter is MPD_SECRETWORD=, which is crucial for the security of the MPD ring. The shoul
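The two per-user files named in the list above can be generated with a few lines of code. The Python sketch below is a minimal illustration under stated assumptions, not the LPC's actual provisioning tool: the function name write_block_files, the node names, and the example secret are hypothetical, and restricting .mpd.conf to owner-only access is an assumption based on common MPD practice rather than something stated in the text.

```python
import os
import secrets
import stat
import tempfile
from pathlib import Path


def write_block_files(home, nodes, secret=None):
    """Create mpd.hosts and .mpd.conf for one block under the given home directory."""
    home = Path(home)
    secret = secret or secrets.token_hex(16)  # random secret word if none is given

    # mpd.hosts: one node name per line, in the order chosen by the administrator.
    hosts = home / "mpd.hosts"
    hosts.write_text("".join(f"{name}\n" for name in nodes))

    # .mpd.conf: holds the secret word that protects this user's MPD ring.
    conf = home / ".mpd.conf"
    conf.write_text(f"MPD_SECRETWORD={secret}\n")
    # Assumption: the file should be readable and writable by its owner only.
    os.chmod(conf, stat.S_IRUSR | stat.S_IWUSR)
    return hosts, conf


# Demonstration in a temporary directory standing in for a user's home.
with tempfile.TemporaryDirectory() as fake_home:
    hosts, conf = write_block_files(fake_home, ["node04", "node05"], secret="example")
    print(hosts.read_text(), end="")
    print(conf.read_text(), end="")
```

Making mpd.hosts unchangeable by the user, as the LPC requires, would additionally need the file to be owned by the administrator; that step depends on the site's setup and is not shown.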

Reference

This content is AI-processed based on open access ArXiv data.
