In symmetric private information retrieval (SPIR), a user communicates with multiple servers to retrieve a message from a database, while not revealing the message index to any individual server (user privacy) and learning no additional information about the database (database privacy). We study the problem of SPIR on graph-replicated database systems, where each node of the graph represents a server and each link represents a message. Each message is replicated at exactly two servers, namely those at which the link representing the message is incident. To ensure database privacy, the servers share a set of common randomness, independent of the database and the user's desired message index. We study two cases of common randomness distribution to the servers: i) graph-replicated common randomness, and ii) fully-replicated common randomness. Given a graph-replicated database system, in i), we assign one randomness variable independently to every pair of servers sharing a message, while in ii), we assign an identical set of randomness variables to all servers, irrespective of the underlying graph. In both settings, our goal is to characterize the SPIR capacity, i.e., the maximum number of desired message symbols retrieved per downloaded symbol, and to quantify the minimum amount of common randomness required to achieve the capacity. To this end, in setting i), we derive a general lower bound on the SPIR capacity, and show it to be tight for path and regular graphs through a matching converse. Moreover, we establish that the minimum size of common randomness required for SPIR is equal to the message size. In setting ii), the SPIR capacity improves over the first, more restrictive setting. We show this through capacity lower bounds for a class of graphs, obtained by constructing SPIR schemes from PIR schemes.
Private information retrieval (PIR) [1] is a cryptographic primitive that allows a user to retrieve a message from a database without revealing which message is retrieved. Over the past decade, PIR has generated renewed interest, with one major focus being the characterization of its capacity [2], i.e., the maximum number of desired message symbols retrieved per downloaded symbol. The PIR capacity for the canonical setting of fully-replicated databases was established by Sun and Jafar [3], and was optimized for minimum message length by Tian, Sun and Chen [4]. The PIR capacity has since been studied in more realistic settings, such as PIR with colluding servers [5][6][7], coded storage [8][9][10][11][12], eavesdroppers and Byzantine servers [13,14], a single server [15][16][17], constrained server storage [18,19], and graph-replicated storage [20][21][22]; we refer the reader to [23] and the references therein for more variants of PIR and its applications in private computation and learning. In PIR, user privacy is provided at the cost of revealing parts of the database to the user. This is detrimental in applications where the database contains sensitive information and the privacy of its contents should be preserved. As a solution, symmetric PIR (SPIR) was formulated by Gertner et al. [24] with the additional requirement of database privacy, where no information beyond the desired message is revealed to the user.
SPIR is not feasible unless some private common randomness, independent of the messages, is shared among the servers [24]. The optimal trade-off between SPIR capacity and the minimum amount of common randomness was characterized in [25] in the canonical fully-replicated database setting. Other variants of SPIR were explored in follow-up works, such as SPIR with colluding servers, eavesdroppers, unresponsive and Byzantine servers [26][27][28], SPIR on MDS coded messages [29,30], SPIR with side information [31,32], SPIR with controlled privacy leakage [33,34] and SPIR to retrieve a random message [35]. These works assume the availability of all messages and randomness across servers in a coded or an uncoded form. However, the full replication of sensitive information can be limited by security constraints, e.g., only specific servers are trusted to store certain messages. Moreover, even in fully-replicated database systems, only a subset of messages may be accessible to a user, so that the effective storage is not fully replicated. For instance, in attribute-based private authentication systems [36,37], the accessible messages form a database system whose replication pattern is modeled by a non-uniform hypergraph. This motivates us to study SPIR on databases that are not fully replicated, particularly where every message in the database is replicated on two distinct servers.
In this work, we adopt the graph-based storage architecture of the PIR counterpart [20]; here, vertices represent servers, and every message, represented by an edge between two vertices, is replicated on the two servers corresponding to these vertices. Our system model naturally generalizes to multigraph-based replicated systems, where $r$ messages are replicated between the respective server pairs. We consider two replication models for the server-side common randomness. The first model assumes that every pair of servers that share a set of messages also share an independent common randomness variable. This may arise when the data distributor assigns randomness per replicated message only to the designated servers. We refer to this as the graph-replicated randomness setting. Next, we relax the constrained randomness availability to the setting where the data distributor can populate common randomness across all servers. We refer to this as the fully-replicated randomness setting. The randomness distribution in the first setting enforces a stricter privacy constraint, since the randomness is now associated with the messages through shared graph-based replication. Note that, in the canonical setting of [25], there is no distinction between the (hyper)graph-replicated and fully-replicated common randomness settings.
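To make the two randomness models concrete, the following is a minimal illustrative sketch of the storage layout only, not of any retrieval scheme. The function names, the integer labeling of servers, messages and randomness variables, and the parameter `num_variables` are assumptions introduced here for illustration.

```python
def graph_replicated_storage(num_servers, edges):
    """Return, for each server, the set of message indices it stores.

    Servers are vertices 0..num_servers-1; message k is the k-th edge
    (u, v) and is replicated on exactly the two servers u and v.
    Repeated pairs are allowed, which models the multigraph case.
    """
    storage = {v: set() for v in range(num_servers)}
    for k, (u, v) in enumerate(edges):
        if u == v:
            raise ValueError("a message must be stored on two distinct servers")
        storage[u].add(k)
        storage[v].add(k)
    return storage


def graph_replicated_randomness(num_servers, edges):
    """One randomness variable per pair of servers sharing messages.

    Parallel edges between the same pair map to a single variable,
    held only by the two servers of that pair.
    """
    pairs = sorted({tuple(sorted(e)) for e in edges})
    holdings = {v: set() for v in range(num_servers)}
    for idx, (u, v) in enumerate(pairs):
        holdings[u].add(idx)
        holdings[v].add(idx)
    return holdings


def fully_replicated_randomness(num_servers, num_variables):
    """Every server holds the same set of randomness variables.

    num_variables is a hypothetical parameter: how many variables the
    data distributor populates is not fixed by this sketch.
    """
    return {v: set(range(num_variables)) for v in range(num_servers)}


# Path graph on 3 vertices: servers 0-1 share message 0, servers 1-2 share message 1.
p3_edges = [(0, 1), (1, 2)]
p3_storage = graph_replicated_storage(3, p3_edges)
p3_graph_randomness = graph_replicated_randomness(3, p3_edges)
p3_full_randomness = fully_replicated_randomness(3, 2)
```

In this sketch, the middle server of the path holds both messages and both pairwise randomness variables under graph-replicated randomness, whereas under fully-replicated randomness every server holds every variable regardless of the edges.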
We establish bounds on the SPIR capacity and minimum common randomness size for various families of graph- and multigraph-based replicated systems. In the graph-replicated common randomness setting, we show that regular (e.g., cyclic, complete) graphs and path graphs $P_N$ on $N$ vertices have the same SPIR capacity of $\frac{1}{N}$. For general graphs, the capacity is at least $\frac{1}{N}$, and the minimum required randomness size is $1$ relative to the message size. Further, we show that the SPIR metrics remain unaffected upon increasing the number of messages to $r$ under the multigraph-based replication model. The fully-replicated common randomness setting improves the capacity, which we derive exactly for $P_3$ to be $\frac{1}{2}$. We provide capacity lower bounds and randomness size upper bounds for path, cyclic, complete a