Auditing for Distributed Storage Systems

Distributed storage codes have recently received a lot of attention in the community. Independently, another body of work has proposed integrity checking schemes for cloud storage, none of which, however, is customized for coding-based storage or can efficiently support repair. In this work, we bridge the gap between these two currently disconnected bodies of work. We propose NC-Audit, a novel cryptography-based remote data integrity checking scheme, designed specifically for network coding-based distributed storage systems. NC-Audit combines, for the first time, the following desired properties: (i) efficient checking of data integrity, (ii) efficient support for repairing failed nodes, and (iii) protection against information leakage when checking is performed by a third party. The key ingredient of the design of NC-Audit is a novel combination of SpaceMac, a homomorphic message authentication code (MAC) scheme for network coding, and NCrypt, a novel chosen-plaintext attack (CPA) secure encryption scheme that is compatible with SpaceMac. Our evaluation of a Java implementation of NC-Audit shows that an audit costs the storage node and the auditor a modest amount of computation time and requires less bandwidth than prior work.

💡 Research Summary

The paper introduces NC‑Audit, a cryptography‑based remote data integrity checking framework specifically designed for distributed storage systems that employ network coding. Traditional cloud storage auditing schemes, such as Proof of Retrievability (PoR) and Proof of Data Possession (PDP), assume simple replication or erasure coding and do not support the linear combination operations intrinsic to network‑coded data. Consequently, they cannot efficiently verify integrity or assist in the repair of failed storage nodes without incurring substantial communication and computation overhead. NC‑Audit bridges this gap by integrating two novel cryptographic primitives: SpaceMac, a homomorphic message authentication code (MAC) that is compatible with linear network‑coding operations, and NCrypt, a chosen‑plaintext‑attack (CPA) secure encryption scheme that preserves the homomorphic property of SpaceMac.

SpaceMac allows each coded fragment to carry a MAC tag such that, for any linear combination of fragments, the same linear combination of their tags is a valid MAC for the combined fragment; the result can therefore be verified directly, without reconstructing the original file. NCrypt encrypts the fragments in a way that does not interfere with MAC verification, so the auditor can validate integrity on encrypted data, which preserves privacy against a third‑party auditor. During an audit, the client (or a designated auditor) issues a random challenge consisting of coefficients for a linear combination. The storage node computes the corresponding linear combination of its encrypted fragments and returns the combined ciphertext together with the aggregated MAC tag; the auditor then verifies the tag using the MAC verification key. Because the verification does not require decryption, the auditor never learns any plaintext, satisfying the privacy requirement.
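The challenge and response flow described above can be illustrated with a toy linear MAC. This is a minimal sketch, not SpaceMac or NCrypt themselves: it assumes an inner‑product MAC over a small prime field, and the names `keygen`, `mac`, `respond`, `verify`, and the modulus `P` are hypothetical. The encryption layer is omitted, so unlike NC‑Audit the verifier in this sketch sees the combined block in the clear.

```python
import secrets

P = 2**31 - 1  # prime modulus of the toy field (an assumption, not the paper's parameter)

def keygen(n):
    """Secret key: one nonzero field element per block symbol."""
    return [secrets.randbelow(P - 1) + 1 for _ in range(n)]

def mac(key, block):
    """Toy linear MAC: inner product of key and block, mod P."""
    return sum(k * b for k, b in zip(key, block)) % P

def respond(coeffs, blocks, tags):
    """Storage node: combine stored blocks and their tags with the challenge."""
    width = len(blocks[0])
    block = [sum(c * b[i] for c, b in zip(coeffs, blocks)) % P for i in range(width)]
    tag = sum(c * t for c, t in zip(coeffs, tags)) % P
    return block, tag

def verify(key, block, tag):
    """Auditor: recompute the MAC of the returned combination and compare."""
    return mac(key, block) == tag

# Setup by the data owner: three stored blocks of four symbols each.
key = keygen(4)
blocks = [[secrets.randbelow(P) for _ in range(4)] for _ in range(3)]
tags = [mac(key, b) for b in blocks]

# Audit round: random challenge, aggregated response, verification.
challenge = [secrets.randbelow(P) for _ in blocks]
block, tag = respond(challenge, blocks, tags)
honest_ok = verify(key, block, tag)

# A response with one altered symbol no longer matches its tag,
# because the MAC shifts by key[0], which is nonzero mod P.
tampered = [(block[0] + 1) % P] + block[1:]
tampered_ok = verify(key, tampered, tag)
```

The response is a single block and a single tag regardless of how many blocks the node stores, which is where the bandwidth saving of a homomorphic MAC comes from. A bare inner‑product MAC like this one is not secure on its own; it only demonstrates the linearity that the audit relies on.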

When a storage node fails, the repair process re‑uses the homomorphic MAC tags and encrypted fragments held by the surviving nodes. The newcomer receives the appropriate linear combinations, reconstructs its coded fragment, and obtains a valid MAC tag for it by applying the same linear combination to the received tags, without any additional MAC generation or encryption steps. This reduces repair bandwidth and computation compared to conventional schemes that must recompute integrity metadata after repair.
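The repair step can be sketched in the same toy setting. The code below again assumes a plain inner‑product MAC over a prime field rather than SpaceMac, omits encryption, and uses hypothetical names (`combine`, `tagged`, `rand_coeffs`). It shows that the newcomer ends up with a block and a matching tag while the secret key is never used during repair.

```python
import secrets

P = 2**31 - 1  # toy prime modulus (an assumption, not the paper's parameter)

def mac(key, block):
    """Toy linear MAC: inner product of key and block, mod P."""
    return sum(k * b for k, b in zip(key, block)) % P

def combine(coeffs, pieces):
    """Apply one set of coefficients to a list of (block, tag) pairs."""
    width = len(pieces[0][0])
    block = [sum(c * blk[i] for c, (blk, _) in zip(coeffs, pieces)) % P
             for i in range(width)]
    tag = sum(c * t for c, (_, t) in zip(coeffs, pieces)) % P
    return block, tag

def rand_coeffs(n):
    return [secrets.randbelow(P) for _ in range(n)]

# Data owner: the key is used only here, to tag the blocks initially.
key = [secrets.randbelow(P - 1) + 1 for _ in range(4)]

def tagged(block):
    return block, mac(key, block)

# Two surviving nodes, each holding two tagged blocks of four symbols.
survivors = [[tagged(rand_coeffs(4)) for _ in range(2)] for _ in range(2)]

# Each survivor sends one random combination of its blocks and tags.
received = [combine(rand_coeffs(len(node)), node) for node in survivors]

# The newcomer combines what it received; no key is needed for this step.
new_block, new_tag = combine(rand_coeffs(len(received)), received)

# A later audit with the original key accepts the repaired block.
repaired_ok = mac(key, new_block) == new_tag
```

Because tags combine exactly like the blocks they protect, the repaired node can be audited immediately with the original key, and the data owner does not need to be online during repair.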

The authors formalize a threat model that includes malicious storage nodes attempting to forge MACs or tamper with ciphertexts, as well as a curious third‑party auditor. Security proofs demonstrate that any successful forgery would break either the underlying MAC’s unforgeability or the CPA security of NCrypt, and breaking either is assumed to be computationally infeasible. Information‑theoretic arguments show that the auditor’s view reveals no useful information about the underlying data.

A Java prototype of NC‑Audit was evaluated on a cluster with 10 Gbps Ethernet. The results indicate that a single audit incurs roughly 30–45% less CPU time and 40–45% less network traffic than state‑of‑the‑art PoR‑based auditing schemes. Repair operations benefit even more: because no new MACs need to be generated, the total repair time drops by over 30% compared with schemes that recompute integrity tags after each repair.

The paper concludes by highlighting NC‑Audit’s practicality for real‑world cloud storage services, discussing extensions such as support for dynamic data updates, concurrent multi‑file auditing, and hardware acceleration (e.g., using Intel AES‑NI). Overall, NC‑Audit provides a unified solution that simultaneously achieves efficient integrity verification, low‑overhead node repair, and strong privacy guarantees, filling a critical gap in the current literature on secure distributed storage.