Predicting Efficiency in Master-Slave Grid Computing Systems
This work reports a quantitative analysis for predicting the efficiency of distributed computing running on three models of complex networks: Barabási-Albert, Erdős-Rényi, and Watts-Strogatz. A master/slave computing model is simulated: a node is selected as master and distributes tasks among the other nodes (the clients). Topological measurements associated with the master node (e.g., its degree or betweenness centrality) are extracted and considered as predictors of the total execution time. Closeness centrality is found to provide the best predictor. The effect of network size is also investigated.
💡 Research Summary
The paper investigates how the topology of a complex network influences the overall execution time of a master‑slave grid computing system. Three canonical network models are used as testbeds: Barabási‑Albert (BA), which generates scale‑free graphs with a few highly connected hubs; Erdős‑Rényi (ER), which produces random graphs with a relatively uniform degree distribution; and Watts‑Strogatz (WS), which yields small‑world graphs characterized by high clustering and short average path lengths due to a rewiring process. For each model, networks of various sizes (from 100 up to 1,000 nodes) are generated, and a standard master‑slave protocol is simulated. One node is designated as the master; it partitions a fixed workload into identical tasks, distributes them to all other nodes (clients), and collects the results. Communication cost is assumed to be proportional to the length of the shortest path between the master and each client.
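The protocol described above can be sketched in a few lines. This is a minimal illustration, not the authors' simulator: it assumes unit per-hop latency, identical task costs, and fully parallel clients, so the total time is set by the slowest client (send latency + compute + return latency). The function names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def simulate_master_slave(adj, master, task_cost=1.0, hop_latency=1.0):
    """Total time for the master to distribute one task to each client
    and collect the result, with communication cost proportional to
    the shortest-path (hop) distance to that client."""
    dist = bfs_distances(adj, master)
    # Each client finishes after: send latency + compute + return latency.
    # With clients running in parallel, the slowest one dominates.
    return max(2 * hop_latency * dist[c] + task_cost
               for c in dist if c != master)
```

On a path graph `0-1-2-3` with master `0`, the farthest client sits 3 hops away, so the total time is `2*3 + 1 = 7` under these assumptions.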
The study extracts three centrality measures for every candidate master node: degree (the number of direct connections), betweenness (the fraction of shortest paths that pass through the node), and closeness (the inverse of the average shortest‑path distance to all other nodes). These metrics are treated as predictors of the total execution time. Pearson correlation coefficients and multiple linear regression are employed to quantify the predictive power of each metric across the three network families and across different network sizes.
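The two ingredients of this analysis, closeness centrality and the Pearson correlation coefficient, are simple enough to state directly. The following sketch (pure Python, connected graphs assumed; not the paper's code) implements both from their definitions:

```python
import math
from collections import deque

def closeness(adj, node):
    """Closeness centrality: inverse of the mean shortest-path
    distance from `node` to every other node (connected graph)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(adj) - 1) / sum(dist.values())

def pearson(xs, ys):
    """Pearson correlation coefficient between two samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Collecting `closeness(adj, n)` and the simulated execution time for every candidate master `n`, then feeding the two lists to `pearson`, reproduces the kind of correlation analysis the study performs; a value near -1 indicates that higher closeness reliably predicts shorter execution time.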
Results show that closeness centrality consistently exhibits the strongest negative correlation with execution time (≈ ‑0.85), meaning that a master node that is, on average, close to all other nodes yields the fastest overall computation. This outcome aligns with intuition: shorter communication paths reduce both the distribution latency of tasks and the collection latency of results. Degree centrality proves useful only in the BA networks, where hubs are naturally present, but a high degree alone does not guarantee optimal performance because a hub situated far from the network’s geometric center can still suffer long average distances. Betweenness centrality offers modest predictive value in limited scenarios; in WS networks, the high clustering concentrates betweenness on a few rewired links, yet this does not translate into a substantial reduction of average communication distance, so its correlation with execution time is weak.
When the network size is increased, the predictive strength of closeness centrality remains stable, but the absolute execution time grows non‑linearly with the network diameter. This scaling behavior suggests that in very large grids a single master may become a bottleneck, and hierarchical or multi‑master architectures could be required to maintain efficiency.
The authors discuss practical implications. In real grid deployments, the topology can often be measured or estimated, allowing system administrators to compute closeness centrality (or approximate it) and select a master node with the highest value before launching the computation. Periodic recomputation of centralities can adapt the system to dynamic changes such as node failures or additions. The paper also outlines future research directions, including the incorporation of heterogeneous task sizes, asynchronous communication, multi‑master coordination, and more realistic physical network latency models to validate the generality of the findings.
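The master-selection rule suggested above reduces to a one-line argmax over closeness values. A minimal sketch (hypothetical `pick_master` helper, connected graph assumed):

```python
from collections import deque

def closeness(adj, node):
    """Inverse of the mean hop distance from `node` to all others."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(adj) - 1) / sum(dist.values())

def pick_master(adj):
    """Select as master the node with maximal closeness centrality."""
    return max(adj, key=lambda n: closeness(adj, n))

# On a star graph the hub is, as expected, the chosen master.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(pick_master(star))  # → 0
```

In a live deployment this selection would be rerun whenever the topology changes (node failures or additions), matching the periodic-recomputation strategy the authors describe.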
In summary, the work demonstrates that among common topological descriptors, closeness centrality is the most reliable predictor of master‑slave grid computing efficiency across diverse complex network structures, and it provides a concrete, easily computable guideline for optimizing master node placement in distributed computing environments.