Optimal Content Placement for Peer-to-Peer Video-on-Demand Systems

In this paper, we address the problem of content placement in peer-to-peer systems, with the objective of maximizing the utilization of peers' uplink bandwidth resources. We consider system performance under a many-user asymptotic. We distinguish two…

Authors: Bo (Rambo) Tan, Laurent Massoulie

The amount of multimedia traffic accessed via the Internet, already on the order of exabytes (10^18 bytes) per month, is expected to grow steadily in the coming years. A peer-to-peer (P2P) architecture, whereby peers contribute resources to support the service of such traffic, holds the promise of supporting this growth more cheaply than scaling up the size of data centers. More precisely, a large-scale P2P system based on the resources of individual users can absorb part of the load that would otherwise need to be served by data centers. In the present work we specifically address the Video-on-Demand (VoD) application, for which the critical resources at the peers are storage space and uplink bandwidth. Our objective is to ensure that the largest possible fraction of traffic is supported by the P2P system. More precisely, we look for content placement strategies that enable content downloaders to make maximal use of the peers' uplink bandwidth, and hence maximally offload the servers in the data centers. Such strategies must adjust to the distinct popularity of video contents, as a more popular content should be replicated more frequently. We consider the following mode of operation: video requests are first submitted to the P2P system; if they are accepted, uplink bandwidth is used to serve them at the video streaming rate (potentially via parallel substreams from different peers). They are rejected if their acceptance would require disruption of an ongoing request service. Rejected requests are then handled by the data center. Alternative modes of operation could be envisioned (e.g., enqueueing of requests, service at rates distinct from the streaming rate, joint service by peers and data center). However, the proposed model is appealing for the following reasons.

(Footnote 1: Part of the results developed in this paper have been the object of a "brief announcement" in [12] and are shown in more detail in [13].)
It ensures zero waiting time for requests, which is desirable for the VoD application; analysis is facilitated, since the system can be modeled as a loss network [7], for which powerful theoretical results are available; and finally, as our results show, simple placement strategies ensure optimal operation in the present model.

In the P2P system we are considering, there are two kinds of peers: boxes and pure users. The difference is that boxes contribute resources (storage space and uplink bandwidth) to the system, while pure users do not. This paper focuses on the following two architectures (illustrated in Figure 1):

• Distributed Server Network (DSN): Requests to download contents come only from pure users, and can be regarded as external requests.

• Pure P2P Network (PP2PN): There are no pure users in the system, and the boxes themselves generate content requests, which can be regarded as "internal".

The rest of the paper is organized as follows. We review related work in Section II and introduce our system model in Section III. For the Distributed Server Network scenario, the so-called "proportional-to-product" content placement strategy is introduced and shown to be optimal in a large system limit in Section IV, where extensive simulation results are also provided. For the Pure P2P Network scenario, a distinct placement strategy is introduced and proved optimal in Section V. These results apply to a catalogue of contents of limited size. An alternative model in which the catalogue size grows with the user population is introduced in Section VI, where it is shown that the "proportional-to-product" placement strategy remains optimal in the DSN scenario in this large-catalogue setting, for a suitably modified request management technique.

The number and location of replicas of distinct content objects in a P2P system have a strong impact on the system's performance.
Indeed, together with the strategy for handling incoming requests, they determine whether such requests must be delayed or served from an alternative, more expensive source such as a remote data center. Requests which cannot start service at once can either be enqueued (we then speak of a waiting model) or redirected (we then speak of a loss model). Previous investigations of content placement for P2P VoD systems were conducted by Suh et al. [11]. The problem tackled in [11] differs from our current perspective; in particular, no optimization of placement with respect to content popularity was attempted in that work. Performance analyses of both queueing and loss models are conducted in [11]. Valancius et al. [17] considered content placement dependent on content popularity, based on a heuristic linear program, and validated this heuristic's performance in a loss model via simulations. Tewari and Kleinrock [14], [15] advocated tuning the number of replicas in proportion to the request rate of the corresponding content, based on a simple queueing formula for a waiting model, and also from the standpoint of the load on network links. They further established via simulations that Least Recently Used (LRU) storage management policies at peers emulate their proposed allocation rather well. Wu et al. [18] considered a loss model and a specific time-slotted mode of operation whereby requests are submitted to randomly selected peers, each of which accommodates a randomly selected request. They showed that in this setup the optimal cache update strategy can be expressed as a dynamic program. Through experiments, they established that simple mechanisms such as LRU or Least Frequently Used (LFU) perform close to the optimal strategy they had previously characterized. Kangasharju et al. [6] addressed file replication in an environment where peers are intermittently available, with the aim of maximizing the probability of a requested file being present at an available peer.
This differs from our present focus in that the bandwidth limitation of peers is not taken into account, while the emphasis is on their intermittent presence. They established optimality of replicating content in proportion to the logarithm of its popularity, and identified simple heuristics approaching this allocation. Boufkhad et al. [3] considered P2P VoD from yet another viewpoint, looking at the number of contents that can be simultaneously served by a collection of peers. The content placement problem has also been addressed with other optimization objectives in mind. For example, Almeida et al. [1] aim at minimizing the total delivery cost in the network, and Zhou et al. [19] target jointly maximizing the average encoding bit rate and the average number of content replicas while minimizing the communication load imbalance of video servers. The cache dimensioning problem is considered in [9], where Laoutaris et al. optimize the storage capacity allocation for content distribution networks under a limited total cache storage budget, so as to reduce the average fetch distance for requested contents, taking into account load balancing and workload constraints on a given node. Our paper takes a different perspective, focusing on many-user asymptotics; our results show that the finite storage capacity per node is never a bottleneck (even in the "large catalogue model", where the catalogue size also grows to infinity, it does so more slowly than the system size). There are obvious similarities between our present objective and the above works. However, none of them identifies explicit content placement strategies at the level of individual peers which lead to a minimal fraction of redirected (lost) requests in a setup with dynamic request arrivals. Finally, there is a rich literature on loss networks (see in particular Kelly [7]); however, our present concern of optimizing placement to minimize the amount of rejected traffic in a corresponding loss network appears new.
We now introduce our mathematical model and related notation. Denote the set of all boxes by B, let |B| = B, and index the boxes from 1 to B. Box b has a local cache J_b that can store up to M contents, all boxes having the same storage space M. We further assume that each box can simultaneously serve U concurrent requests, where U is an integer, i.e., each box has an uplink bandwidth equal to U times the video streaming rate. In particular, we assume identical streaming rates for all contents. The set of available contents is denoted C; let |C| = C and index the contents from 1 to C. A given box b is thus able to serve requests for content c for every c ∈ J_b. In a Pure P2P Network, when box b has a request for a certain content c that happens to be already in its cache, a "local service" is provided and no download is needed; hence the service of this request consumes no bandwidth resource. The effect of local service on deriving an optimal content placement strategy is discussed in detail in Section V. In a Distributed Server Network, however, local service never occurs, since all requests are external with respect to the system resources. For a new request that needs a download service, an attempt is made to serve this request by some box holding content c, while ensuring that previously accepted requests can themselves still be assigned to adequate boxes, given the cache contents and bandwidth resources of all boxes. This potentially involves "repacking" of requests, i.e., reallocation of all the bandwidth resources in the system (the "box-serving-request" mapping) to accommodate the new download demand pattern. If such a repacking can be found, the request is accepted; otherwise, it is rejected from the P2P system. It will be useful in the sequel to characterize the concurrent numbers of requests that are amenable to such repacking. Let n = {n_c}_{c∈C} be the vector of numbers n_c of requests per content c.
Clearly, a matching of these requests to server boxes is feasible if and only if there exist nonnegative integers z_cb (the number of concurrent downloads of content c served by box b) such that z_cb = 0 whenever c ∉ J_b, Σ_b z_cb = n_c for all c ∈ C, and Σ_c z_cb ≤ U for all b ∈ B. A more compact characterization of feasibility follows by an application of Hall's theorem [2] (detailed in Appendix B), giving that n is feasible if and only if, for every subset S ⊆ C of contents,

Σ_{c∈S} n_c ≤ U · |{b : J_b ∩ S ≠ ∅}|.   (2)

We now introduce statistical assumptions on request arrivals and durations. New requests for content c occur at the instants of a Poisson process with rate ν_c. We assume that the video streaming rate is normalized to 1, and is the same for all contents. We further assume that all videos have the same duration, again normalized to 1. Under these assumptions, the amount of work per time unit brought into the system by content c equals ν_c. With the above assumptions at hand, assuming fixed cache contents, the vector n of requests under service is a particular instance of a general stochastic process known as a loss network model. Loss networks were introduced to represent ongoing calls in telephone networks, and exhibit rich structure. In particular, the corresponding stochastic process is reversible, and admits a closed-form stationary distribution. For the Distributed Server Network model, the stationary distribution reads

π(n) = (1/Z) Π_{c∈C} (ν_c^{n_c} / n_c!)  for feasible n.   (3)

In words, the numbers of requests n_c are independent Poisson random variables with parameter ν_c, conditioned on feasibility of the whole vector n. Our objective is then to determine content placement strategies so that, in the corresponding loss network model, the fraction of rejected requests is minimal. The difficulty of this analysis resides in the fact that the normalizing constant Z is cumbersome to evaluate. Nevertheless, simplifications occur under large system asymptotics, which we exploit in the next sections. We conclude this section with the following remark.
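The Hall-type feasibility condition can be checked directly by enumerating subsets of contents. The following is our own minimal Python sketch (not from the paper), exponential in the catalogue size and hence only for tiny examples; the function name `feasible` is illustrative:

```python
from itertools import combinations

def feasible(n, caches, U):
    """Check whether request vector n = {content: count} can be packed,
    using the Hall-type condition: for every subset S of contents, the
    number of requests in S must not exceed U times the number of boxes
    whose cache intersects S."""
    contents = list(n)
    for r in range(1, len(contents) + 1):
        for S in combinations(contents, r):
            demand = sum(n[c] for c in S)
            supply = U * sum(1 for J in caches if set(J) & set(S))
            if demand > supply:
                return False
    return True

# Two boxes with U = 2 connections each; box 1 holds {a, b}, box 2 holds {b, c}.
caches = [{"a", "b"}, {"b", "c"}]
print(feasible({"a": 2, "b": 2, "c": 0}, caches, U=2))  # True
print(feasible({"a": 3, "b": 0, "c": 0}, caches, U=2))  # False: only one box holds "a"
```

The second call fails because the singleton subset S = {"a"} already violates the condition: three requests, but only one box (two upload slots) holds "a".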
For simplicity we assumed in the above description that a particular content is either fully replicated at a peer or not present at all, and that a request is served from only one peer. It should however be noted that we can equally assume that contents are split into sub-units, which can be placed onto distinct peers and downloaded from these peers in parallel sub-streams in order to satisfy a request. This extension is detailed in Appendix F.

We first describe a simple adaptive cache update strategy driven by demand, and show why it converges to a "predetermined" content placement called the "proportional-to-product" strategy. We then establish the optimality of this "proportional-to-product" placement in a large system asymptotic regime. A simple method to adaptively update the caches at boxes, driven by demand, is as follows: whenever a new request for content c arrives, with probability εB (ε is chosen such that εB ≤ 1) the server picks a box b uniformly at random and attempts to push content c into this box's cache. If c is already there, do nothing; otherwise, remove a content selected uniformly at random from the cache and insert c. Since external demands for content c arrive according to a Poisson process with rate ν_c, under the above strategy content c is pushed at rate εν_c into any particular box not caching content c. Recall that each box stores M distinct contents, and let j denote a candidate "cache state", i.e., a size-M subset of the full content set C. For convenience, let J denote the collection of all such j. With the above strategy, the caches at the boxes evolve independently, each according to a continuous-time Markov process. The rate at which cache state j changes to j′, where j′ = (j ∪ {c}) \ {d} for some contents d ∈ j, c ∉ j, which we denote by q(j, j′), is easily seen to be q(j, j′) = εν_c/M. Indeed, content c is introduced at rate εν_c, while content d is the one evicted with probability 1/M.
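The demand-driven cache update above can be sketched in a few lines of simulation code. This is our own illustrative Python sketch, not the paper's implementation; the function name `cache_update` and the parameter `eps_B` (standing for the product εB) are ours:

```python
import random

def cache_update(caches, nu_bar, eps_B, steps, rng):
    """Demand-driven cache update sketch: for each incoming request
    (content drawn from normalized popularities nu_bar), with probability
    eps_B a box is picked uniformly at random; if it lacks the requested
    content, a uniformly chosen cached content is evicted and the
    requested one is inserted.  Cache sizes are preserved."""
    contents = list(range(len(nu_bar)))
    for _ in range(steps):
        c = rng.choices(contents, weights=nu_bar)[0]   # incoming request
        if rng.random() < eps_B:
            b = rng.randrange(len(caches))
            if c not in caches[b]:
                caches[b].remove(rng.choice(sorted(caches[b])))  # uniform eviction
                caches[b].add(c)
    return caches
```

Run long enough, each box's cache state follows the stationary distribution of the Markov process described above.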
It is easy to verify that the distribution p(·) given by

p(j) = (1/Z) Π_{c∈j} ν_c,  j ∈ J,

for some suitable normalizing constant Z, satisfies the following equations: p(j)q(j, j′) = p(j′)q(j′, j), j, j′ ∈ J. These relations, known as the local balance equations, readily imply that p(·) is a stationary distribution for the above Markov process; since the process is irreducible, this is the unique stationary distribution. Thus we can conclude that under this cache update strategy, the random cache state at any box eventually follows this stationary distribution. This is what we refer to as the "proportional-to-product" placement strategy, and it is the one we advocate in the Distributed Server Network scenario. The parameter ε should not be too large, otherwise the burden on the server increases due to the use of "push"; neither should it be too small, otherwise the Markov chain converges too slowly to its steady state. ⋄

Under the cache update strategy, the distribution of cache contents needs time to converge to the steady state. However, if we have a priori information about content popularity, we can use a sampling strategy as an alternative way to generate the proportional-to-product content placement directly, in one go. One method works as follows: select M contents successively at random in an i.i.d. fashion, according to the probability distribution {ν̄_c}, where ν̄_c = ν_c / Σ_{c′∈C} ν_c′ is the normalized popularity. If some content is selected more than once, re-run the procedure. It is readily seen that this yields a sample with the desired distribution. An alternative sampling strategy, which can be faster than the one described above when very popular items are present, is given in Appendix C.

We now consider the asymptotic regime called the "many-user, fixed-catalogue" scaling: the number of boxes B goes to infinity.
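Returning briefly to the sampling strategy above: the rejection-sampling construction is straightforward to implement. A minimal Python sketch of our own (the function name `sample_cache` is illustrative):

```python
import random

def sample_cache(nu_bar, M, rng):
    """Draw a size-M cache state with the proportional-to-product
    distribution: select M contents i.i.d. according to the normalized
    popularities nu_bar; if any content is drawn twice, discard the whole
    draw and start over (rejection sampling)."""
    contents = list(range(len(nu_bar)))
    while True:
        draw = rng.choices(contents, weights=nu_bar, k=M)
        if len(set(draw)) == M:        # accept only all-distinct draws
            return frozenset(draw)
```

Conditioning on all M draws being distinct yields a set j with probability proportional to the product of the ν̄_c for c ∈ j, which is exactly the proportional-to-product distribution.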
The system load, defined as

ρ = Σ_{c∈C} ν_c / (BU),   (6)

is assumed to remain fixed, which is achieved in the present section by assuming that the content collection C is kept fixed while the individual rates {ν_c} scale linearly with B. We also assume that the normalized content popularities {ν̄_c} remain fixed as B increases. It thus holds that ν_c = ν̄_c ρBU for all c ∈ C. Note that although boxes are pure resources rather than users, the scaling of {ν_c} with B to infinity indeed captures a "many-user" scenario. To analyze the performance of our proposed proportional-to-product strategy, we require that the cache contents be sampled at random according to this strategy and subsequently kept fixed. This can reflect either the situation where we use the previously introduced sampling strategy, or the situation where the cache update strategy has already made the distribution of cache states converge to the steady state, and operates at a slower time scale than that at which new requests arrive and complete. Note that, as B grows large, the right-hand side of the feasibility constraint (2) satisfies, by the strong law of large numbers,

(1/B) · U · |{b : J_b ∩ S ≠ ∅}| → U · Σ_{j∈J : j∩S≠∅} m_j  almost surely.   (7)

Here, {m_j} corresponds to a particular content placement strategy, under which each box holds a size-M content set j with probability m_j, independently over boxes. Specifically, m_j = (1/Z) Π_{c∈j} ν̄_c (where Z is a normalizing constant) corresponds to our proportional-to-product placement strategy. We now define a sequence of loss networks indexed by the large parameter B. For the B-th loss network, requests for content c ∈ C (regarded as "calls of type c") arrive at rate ν_c^(B) = (ρU ν̄_c) · B, each "virtual link" S ⊆ C has capacity U · |{b : J_b ∩ S ≠ ∅}|, and c ∈ S means that virtual link S is part of the "route" which serves calls of type c. This particular setup has been identified as the "large capacity network scaling" in Kelly [7].
There, it is shown that the loss probabilities in the limiting regime where B → ∞ can be characterized via the analysis of an associated variational problem. We now describe the corresponding results of [7] relevant to our present purpose. For the B-th loss network, consider the problem of finding the mode of the stationary distribution (3), which corresponds to maximizing Σ_{c∈C} (n_c^(B) log ν_c^(B) − log n_c^(B)!) over feasible n^(B). Then approximate log n_c^(B)! by n_c^(B) log n_c^(B) − n_c^(B) according to Stirling's formula, and replace the integer vector n^(B) by a real-valued vector x^(B). This leads to the optimization problem OPT 1. The corresponding Lagrangian involves multipliers {y_S^(B)}_{S⊆C}, one per "virtual link" constraint. The KKT conditions for this convex optimization problem comprise the original constraints together with stationarity and complementary slackness conditions, satisfied by a solution (x̄^(B), ȳ^(B)) of the optimization problem. From equation (11), we further get (12). The result that we will need from Kelly [7] is then the following: for the B-th loss network, the steady-state probability of accepting a request for c, denoted by A_c^(B), verifies

A_c^(B) = exp( − Σ_{S⊆C : c∈S} ȳ_S^(B) ) + o(1),

where the ȳ_S^(B) are the Lagrange multipliers of the above optimization problem. Note that the global acceptance probability, denoted by A_sys, which also reads A_sys = Σ_{c∈C} ν̄_c A_c, cannot exceed min(1, 1/ρ). Indeed, it is clearly no larger than 1; and it cannot exceed 1/ρ either, otherwise the system would treat more requests than its available resources allow. We now prove that the proportional-to-product content placement not only achieves the optimal global acceptance probability A_sys = min(1, 1/ρ), but also achieves fair individual acceptance probabilities, i.e., A_c = A_sys for all c. More precisely, we have the following theorem.

Theorem 1: Under the proportional-to-product placement strategy, in the limit B → ∞, the acceptance probability for every content c ∈ C satisfies A_c = min(1, 1/ρ). ⋄

Before giving the proof, we comment on the result. One point to note is that, because of (7), the above optimal acceptance rate is achieved with probability one under any random sampling which follows the proportional-to-product scheme.
Secondly, the optimality of the asymptotic acceptance probability does not depend on M, as long as M ≥ 1. Thus, for this particular scaling regime, storage space is not a bottleneck. As we shall see in the next two sections, increasing M does improve performance if either local services occur, as in the Pure P2P Network scenario (Section V), or the catalogue size C scales with the box population size B, a case not covered by the classical literature on loss networks, and to which we turn in Section VI-B.

Proof: First, we consider ρ ≥ 1. With the multipliers chosen as in (14), we have, for all c ∈ C, relation (15). Substituting equation (15) into (12), inequality (10) in OPT 1 becomes inequality (16). Since ν_c^(B) = ρBU · ν̄_c and Σ_{c∈C} ν̄_c = 1, inequality (16) further becomes, upon explicitly writing out the normalization constant Z, inequality (17). Two types of product terms (indexed by subsets K ⊆ C) appear on the two sides of (17). To show that inequality (17) holds, it suffices to prove that, for any S ⊆ C and each product term (associated with some K) appearing in the corresponding inequality, its multiplicity on the left-hand side is no more than that on the right-hand side.

1. For a product term of Type I (associated with a set K of size M + 1): on the LHS, the multiplicity of this product term equals |S ∩ K|. On the RHS: when |S ∩ K| ≥ 2, for any c′ ∈ K, the set K \ {c′} is a size-M content set whose intersection with S is nonempty, hence the multiplicity equals |K| (= M + 1). When |S ∩ K| = 1, the exception to the above case is that if c′ ∈ S ∩ K, then K \ {c′} is a size-M content set with empty intersection with S, and hence cannot appear in the second summation term (over all size-M content sets G such that G ∩ S ≠ ∅) of inequality (17); thus the multiplicity equals |K| − 1 (= M). In either case, the multiplicity of the product term on the LHS is no more than that on the RHS.

2. For a product term of Type II: K is itself already a size-M content set G with G ∩ S ≠ ∅.
Therefore, it is easy to see that the multiplicities of this product term on the two sides are both 1. We can now conclude that inequality (17) holds for all S ⊆ C, and move on to checking complementary slackness. Given ρ ≥ 1, one simple solution to equation (15) reads as in (18). Moreover, inequality (17) is tight for S = C (we do not even need to check this when ρ = 1). Therefore, complementary slackness is always satisfied with solution (18). So far we have proved that the KKT conditions hold when ρ ≥ 1. When ρ < 1, we modify (14) as in (19), and hence there is an additional factor 1/ρ > 1 on the RHS of inequality (17). Since the old version of inequality (17) was proved to hold, the new version automatically holds, though none of the inequalities is tight now. However, from (19) we have ȳ_S^(B) = 0 for all S ⊆ C, which means complementary slackness is again satisfied (similarly to the case ρ = 1). Therefore, according to equation (13), it can be concluded that by using m_j = Π_{c∈j} ν̄_c / Z for all j, we achieve the claimed acceptance probabilities.

In this subsection, we use extensive simulations to evaluate the performance of the two implementable schemes proposed in Subsection IV-A which follow the "proportional-to-product" placement strategy, namely the sampling-based preallocation scheme and the demand-driven cache update (labeled "SAMP" and "CU", respectively). We compare the results with the theoretical optimum (i.e., the loss rate for each content equals (1 − 1/ρ)^+; these curves are labeled "Optimal") and with a uniform placement strategy (labeled "UNIF") defined as follows: first, permute all the contents uniformly at random, resulting in a content sequence {c_i}, 1 ≤ i ≤ C; then, push the M contents indexed by the subsequence {c_(j mod C)}_{bM+1 ≤ j ≤ (b+1)M} into the cache of box b, for 1 ≤ b ≤ B. UNIF is also used to generate the initial content placement for CU, so that the loss rate can be reduced during the warm-up period.
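The UNIF baseline just described can be sketched directly from its definition. This is our own illustrative Python version (0-based indices, function name ours, not the paper's code):

```python
import random

def unif_placement(C, B, M, rng):
    """UNIF baseline sketch: permute all C contents uniformly at random,
    then give box b the M contents at consecutive positions b*M..(b+1)*M-1
    of the permutation, wrapping around modulo C.  Each box thus gets M
    distinct contents (assuming M <= C), irrespective of popularity."""
    perm = list(range(C))
    rng.shuffle(perm)
    return [{perm[j % C] for j in range(b * M, (b + 1) * M)}
            for b in range(B)]
```

Since consecutive windows of length M ≤ C of the permutation never repeat a content, every cache has exactly M distinct entries.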
Unless otherwise specified, the default parameter setting is as follows. The popularity of contents {ν_c} follows a Zipf-like distribution (see e.g. [4]), i.e., ν̄_c ∝ 1/(c + c_0)^α, with decay factor α > 0 and shift c_0 ≥ 0. We use α = 0.8 and c_0 = 0. The content catalogue size is C = 500 and the number of boxes is B = 4000. The duration of downloading each content is exponentially distributed with mean equal to 1 time unit. The parameter ε in the cache update algorithm is set to 1/B, so that upon a request one box will always be chosen for cache update. For every algorithm, we take the average over 10 independent repeated experiments, each observed for 10 time units. Along each sample path, the initial 1/5 of the whole period is regarded as a "warm-up" period and hence ignored in the calculation of the final statistics.

Some implementation details are not captured by our theoretical model, but should be considered in simulations. Upon a request arrival, the most idle box (i.e., the one with the largest number of free connections) among all boxes holding the requested content is chosen to provide the service, for the purpose of load balancing. If none of them is idle, we use a heuristic repacking algorithm which iteratively reallocates the ongoing services among boxes, in order to handle as many requests as possible while still respecting load balancing. One important parameter which trades off repacking complexity against performance is the maximum number of iterations t_r^max, which is set to "undefined" by default (i.e., the iterations continue until the algorithm terminates; theoretically there are at most C iterations). Other details regarding the repacking algorithm can be found in Appendix D. We will see an interesting observation about t_r^max later. Figure 2 evaluates system loss rates under different traffic loads ρ.
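The Zipf-like popularity distribution used in this setup can be generated as follows (a minimal Python sketch of ours, assuming the form ν̄_c ∝ 1/(c + c_0)^α; the function name is illustrative):

```python
def zipf_popularity(C, alpha=0.8, c0=0):
    """Normalized Zipf-like popularities nu_bar_c ∝ 1/(c + c0)^alpha
    for contents c = 1..C, as used in the simulation setup."""
    w = [1.0 / (c + c0) ** alpha for c in range(1, C + 1)]
    total = sum(w)
    return [x / total for x in w]

pop = zipf_popularity(500)
print(round(sum(pop), 6))   # 1.0  (normalized)
```

The resulting list is sorted in descending popularity, matching the paper's convention of indexing contents by popularity rank.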
Our two algorithms SAMP and CU, which target the proportional-to-product placement, both match the theoretical optimum very well. On the other hand, the UNIF algorithm, which does not use any information about content popularity, incurs a large loss even when the system is underloaded (ρ < 1). The gain of proportional-to-product placement over UNIF becomes less significant as the traffic load grows, as one would expect. In Figure 3, when the decay factor α in the Zipf-like distribution increases, the distribution of placed contents generated by UNIF has a higher discrepancy from the real content popularity distribution, so UNIF performs worse. On the other hand, the two proportional-to-product strategies are insensitive to the change of content popularity, as expected. Figure 4 shows the effect of repacking on the system loss rate. In sub-figure (a), we find that under SAMP, repacking is not necessary. Sub-figure (b) shows the performance of CU: when ρ is low, one iteration of repacking is sufficient to bring the performance close to the optimum; when ρ is high, repacking also becomes unnecessary. The main takeaway from this figure is that we can execute a repacking procedure of very small complexity without sacrificing much performance. The reason is that when the server picks a box to serve a request, it already respects the rule of load balancing. We now explain why CU still needs one iteration of repacking to improve performance when ρ is low. Note that during a cache update, it is possible that the box is currently uploading the "to-be-evicted" content to some users. If repacking is enabled, those ongoing services can be reassigned to other boxes (see details in Appendix D), but if t_r^max = 0 (no repacking), they are terminated and counted as losses.
When ρ is high, however, boxes are more likely to be busy, which leads to the failure of repacking, so repacking brings little benefit. Recall that the proportional-to-product placement is only optimal when the number of boxes B → ∞. Figures 5 and 6 then show the impact of a finite B. In Figure 5, as B decreases, the system loss rate of every algorithm increases (compared to the two proportional-to-product strategies, UNIF is less sensitive to B). In Figure 6, non-homogeneity in the individual loss rates of requests for each content also reflects a deviation from the theoretical result (when B → ∞, the loss rates of the requests for all contents are proved to be identical). As expected, increasing the number of boxes (from 4000 to 8000) brings the system closer to the limiting scenario and makes the individual loss rates more homogeneous. Another observation is that as the popularity of a content decreases (in the figure, the contents are indexed in descending order of popularity), its individual loss rate increases. However, according to Figure 2, these less popular contents do not affect the system loss rate much even if they incur high loss, since their weights {ν̄_c} are also lower. In fact, if we choose a smaller content catalogue size C or a larger cache size M, simulations show that the negative impact of a finite B is reduced (the figures are omitted here). This suggests that if C scales with B rather than being fixed, the proof of optimality under the loss network framework in Subsection IV-B is no longer valid, and M becomes a bottleneck for the performance of the optimal algorithm. We address this problem by introducing a certain type of "large catalogue model" later, in Section VI.

PURE PEER-TO-PEER NETWORKS

In the Pure P2P Network scenario, when box b has a request for content c which is currently in its own cache, a "local service" is provided and no download bandwidth in the network is consumed.
To simplify the analysis, each request for a specific content is assumed to originate from a box chosen uniformly at random (this in particular assumes identical tastes for all users). This means that the effective arrival rate of the requests for content c which actually generate download traffic equals ν_c(1 − m̄_c), where m̄_c is defined as the fraction of boxes that have cached content c. Let ρ_c ≜ ρν̄_c denote the traffic load generated by requests for content c, and let λ_c denote the fraction of the system bandwidth resources used to serve requests for content c. Obviously, Σ_{c∈C} λ_c ≤ 1. The traffic load absorbed by the P2P system, either via local services or via service from another box, is then upper-bounded by

ρ̄ = Σ_{c∈C} ( ρ_c m̄_c + [ρ_c(1 − m̄_c)] ∧ λ_c ),   (21)

where "∧" denotes the minimum operator. We will use this simple upper bound to identify an optimal placement strategy in the present Pure P2P Network scenario. To this end, we shall establish that our candidate placement strategy asymptotically achieves this performance bound, namely absorbs a fraction ρ̄ of the load in the limit where B tends to infinity. To find the optimal strategy, we introduce the variables x_c ≜ [ρ_c(1 − m̄_c)] ∧ λ_c for all c. Note further that the fraction λ_c is necessarily bounded from above by m̄_c, as only those boxes holding c can devote their bandwidth to serving c. It is then easy to see that the quantity ρ̄ in (21) is no larger than the optimal value of the linear programming problem OPT 2, which maximizes Σ_{c∈C} (ρ_c m̄_c + x_c) over nonnegative variables {m̄_c, λ_c, x_c} subject to x_c ≤ ρ_c(1 − m̄_c), x_c ≤ λ_c ≤ m̄_c ≤ 1 for all c, Σ_{c∈C} λ_c ≤ 1, and Σ_{c∈C} m̄_c ≤ M. The following theorem gives the structure of an optimal solution to OPT 2, and as a result suggests an optimal placement strategy.

Theorem 2: Assume that the {ν̄_c} are ranked in descending order. Then the "hot-warm-cold" solution described below solves OPT 2. The proof consists in checking that the KKT conditions are met for this candidate solution; details are given in Appendix E.
The above optimal solution suggests the following placement strategy.

"Hot-Warm-Cold" Content Placement Strategy. Divide the contents into three classes according to their popularity ranking (in descending order):

• Hot: the M − 1 most popular contents. At each box, M − 1 cache slots are reserved for them, ensuring that requests for these contents are always met via local service.

• Warm: the contents with indices from M to c* + 1 (or to c* if Σ_{c=M}^{c*} m̄_c = 1). For each such content c, a fraction m̄_c of the boxes store content c in their one remaining cache slot, where the value of m̄_c is given in Theorem 2. All requests for these contents (except for c* + 1, if it is classified as "warm") can be served, at the expense of all bandwidth resources.

• Cold: the remaining less popular contents are not cached at all.

The requests for the c* most popular contents ("hot" contents and "warm" contents except content c* + 1) incur zero loss, while the requests for the C − c* − 1 least popular contents incur 100% loss. There is a partial loss of the requests for content c* + 1 if Σ_{c=M}^{c*} m̄_c < 1. Note that the placement for "warm" contents resembles the "water-filling" solution to the problem of allocating transmission powers across OFDM channels so as to maximize the overall achievable channel capacity in wireless communications [16]. ⋄

Under this placement strategy, the upper bound (21) on the absorbed traffic load is maximized. We then have the following corollary.

Corollary 1: In the large system limit B → ∞, with fixed catalogue and associated normalized popularities {ν̄_c} as considered in Subsection IV-B, the proposed "hot-warm-cold" placement strategy achieves an asymptotic fraction of absorbed load equal to the above upper bound ρ̄, and is hence optimal in this sense.
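As a quick numerical illustration, the hot-warm-cold replication fractions can be computed from the popularities in a single pass. The sketch below is our own (function name and details ours, not the paper's code), using the warm-content rule m̄_c = ρ_c/(1 + ρ_c) with ρ_c = ρν̄_c, applied while the one-slot-per-box budget lasts:

```python
def hot_warm_cold(nu_bar, rho, M):
    """Sketch of hot-warm-cold replication fractions.  Contents are
    assumed sorted by descending popularity.  The M-1 most popular get
    m_c = 1 (hot); then m_c = rho_c/(1+rho_c) is assigned while the
    remaining one-slot-per-box budget allows (warm); a possible last
    partially cached content gets the leftover budget; the rest are not
    cached (cold, m_c = 0)."""
    C = len(nu_bar)
    m = [0.0] * C
    for c in range(M - 1):               # hot: popularity ranks 1..M-1
        m[c] = 1.0
    budget = 1.0                         # one non-reserved slot per box
    for c in range(M - 1, C):            # warm region starts at rank M
        rho_c = rho * nu_bar[c]
        frac = rho_c / (1.0 + rho_c)
        if frac <= budget:
            m[c] = frac
            budget -= frac
        else:                            # partially cached content c*+1
            m[c] = budget
            budget = 0.0
            break
    return m
```

The fractions over the warm region sum to at most 1, consistent with each box contributing exactly one non-reserved cache slot.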
⋄ Proof: With the proposed placement strategy, hot (respectively, cold) contents never trigger accepted requests, since all incoming requests are handled by local service (respectively, rejected). For warm contents, because each box holds only one warm content, it can only handle requests for that particular warm content. As a result, the processes of ongoing requests for distinct warm contents evolve independently of one another. For a given warm content c, the corresponding number of ongoing requests behaves as a simple one-dimensional loss network with arrival rate ν_c(1 − m_c) and service capacity m_c BU. For c = M, . . . , c*, one has m_c = ρ_c/(1 + ρ_c) where ρ_c = ν_c/(BU), so both the arrival rate and the capacity of the corresponding loss network equal m_c BU. The asymptotic acceptance probability as B → ∞ then converges to 1, and the accepted load due to both local service and service from other boxes converges to ρ_c. For content c* + 1 (if m_{c*+1} > 0), the corresponding loss network has arrival rate ν_{c*+1}(1 − m_{c*+1}) and service capacity m_{c*+1} BU. Then, in the limit B → ∞, the accepted load (due to both local service and service from other boxes) reads ρ_{c*+1} m_{c*+1} + m_{c*+1} (which is actually smaller than ρ_{c*+1}). Summing the accepted loads of all contents yields the result.

Keeping the many-user asymptotic, we now consider an alternative model of the content catalogue, which we term the "large catalogue" scenario. The set of contents C is divided into a fixed number of "content classes", indexed by i ∈ I. In class i, all the contents have the same popularity (arrival rate) ν_i. The number of contents within class i is assumed to scale in proportion to the number of boxes B, i.e., class i contains α_i B contents for some fixed scaling factor α_i. We further define α ≜ Σ_{i∈I} α_i.
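The vanishing-loss claim in this proof can be checked numerically. For a one-dimensional loss network, the blocking probability is given by the Erlang B formula; the sketch below evaluates it by the standard stable recursion (an implementation choice of this illustration) with arrival rate and capacity both equal to m_c BU, as in the proof:

```python
def erlang_b(offered_load, capacity):
    """Erlang B blocking probability E(nu, C), via the stable recursion
    E(nu, 0) = 1,  E(nu, k) = nu*E(nu, k-1) / (k + nu*E(nu, k-1))."""
    E = 1.0
    for k in range(1, capacity + 1):
        E = offered_load * E / (k + offered_load * E)
    return E

# Arrival rate = capacity = m_c * B * U, as in the proof: the blocking
# probability tends to 0 (slowly, roughly like 1/sqrt(capacity)) as B grows.
m_c, U = 0.2, 4
for B in (10, 100, 1000):
    cap = int(m_c * B * U)
    print(B, erlang_b(cap, cap))
```

The slow decay is consistent with the fact that at m_c = ρ_c/(1 + ρ_c) the loss network is exactly critically loaded.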
With the above assumptions, the system traffic load ρ in equation (6) reads

ρ = Σ_{i∈I} α_i ν_i / U.

The primary motivation for this model is mathematical convenience: by limiting the number of popularity values, we limit the "dimensionality" of the request distribution, even though we now allow for a growing number of contents. It can also be justified as an approximation that would result from batching into a single class all contents with comparable popularity. Such classes can also capture the movie type (e.g., thriller, comedy) and age (assuming popularity decreases with content age). We use ῡ_i to denote the normalized popularity of content class i ∈ I, so that Σ_{i∈I} ῡ_i = 1. It is reasonable to regard each ῡ_i as fixed. The popularity ν_i ∝ ῡ_i/(α_i B) of a specific content in class i then decreases as the number of contents α_i B in this class increases, since users now have more choices within each class. In practice, an online video provider using the Distributed Server Network architecture adds both boxes and available movies of each type to attract more user traffic, under a constraint of a maximum tolerable traffic load ρ.

Returning to the Distributed Server Network model of Section IV, we consider the following questions: What amount of storage is required to ensure that memory space is not a bottleneck? Is the proportional-to-product placement strategy still optimal under the large-catalogue scaling? We first establish that bounded storage strictly constrains the utilization of bandwidth resources. To this end we need the following lemma:

Lemma 1: Consider the system under large-catalogue scaling, with fixed weights α_i and cache size M per box. Define M′ ≜ ⌈2M/α⌉. Then (i) more than half of the contents are replicated at most M′ times, and (ii) for each of these contents, the loss probability is at least E(min_i ν_i, M′U) > 0, where E(·, ·) is the Erlang function [7] defined as

E(ν, C) ≜ (ν^C / C!) / Σ_{k=0}^{C} ν^k / k!.

⋄ Proof: We first prove part (i).
Note that the total number of content replicas in the system equals BM. Thus, denoting by f the fraction of contents replicated at least M′ + 1 times, it follows that f αB(M′ + 1) ≤ BM, which in turn yields f ≤ M/(α(M′ + 1)) < 1/2, which implies statement (i).

To prove part (ii), we establish the following general property for a loss network (equivalent to our original system) with call types j ∈ J, corresponding arrival rates ν_j, and capacity (maximal number of competing calls) C_ℓ on link ℓ for all ℓ ∈ L. We use ℓ ∈ j to indicate that the route for calls of type j comprises link ℓ. Denoting the loss probability of calls of type j in such a loss network by p_j, we then want to prove

p_j ≥ E(ν_j, C′_j),    (23)

where C′_j ≜ min_{ℓ∈j} C_ℓ, i.e., the capacity of the bottleneck link on the route for calls of type j. Note that the RHS of the above inequality is actually the loss probability of a loss network with only calls of type j and capacity C′_j. Fixing index j, we define this loss network as an auxiliary system and consider the following coupling construction, which allows us to deduce inequality (23): Let X_k be the number of active calls of type k in the original system for all k, and let X′_j denote the number of active calls of type j in the auxiliary system. Initially, X_j(0) = X′_j(0). The nonzero transition rates of the joint process ({X_k}_{k∈K}, X′_j) couple the two systems as follows: type-j arrivals occur simultaneously in both systems, each system accepting the call whenever its own capacity constraints allow; type-j call completions in the original system are matched by completions in the auxiliary system; and the auxiliary system has additional completions at rate X′_j − X_j. It follows from Theorem 8.4 in [5] that {X_k} is indeed a loss network process with the original dynamics, and that X′_j is a one-dimensional loss network with capacity C′_j and arrival rate ν_j. From the construction, we can see that all transitions preserve the inequality X_j(t) ≤ X′_j(t) for all t ≥ 0, for the following reason: once X_j increases by 1, X′_j either increases by 1 as well or already equals the capacity limit C′_j, and in the latter case the positivity of the corresponding transition rate implies that, after the transition, X_j ≤ C′_j = X′_j.
Similarly, once X′_j decreases by 1, either X_j also decreases by 1, or, if X_j does not decrease, the corresponding transition rate X′_j − X_j must be strictly positive, so that X_j < X′_j before the transition. In either case, the above inequality is preserved. We further let A_j(t), A′_j(t) denote the number of type-j external call arrivals, L_j(t), L′_j(t) the number of type-j call rejections, and D_j(t), D′_j(t) the number of type-j call completions, respectively in the original and auxiliary systems, during the time interval [0, t]. It follows from our construction that whenever the service of a type-j call completes in the original system, the service of a type-j call also completes in the auxiliary system, hence D_j(t) ≤ D′_j(t). Since moreover X_j(t) ≤ X′_j(t) and A_j(t) = A′_j(t) (arrivals are common to both systems), we get L_j(t) = A_j(t) − D_j(t) − X_j(t) ≥ A′_j(t) − D′_j(t) − X′_j(t) = L′_j(t). Upon dividing this inequality by A_j(t) and letting t tend to infinity, one retrieves the announced inequality (23) by the ergodic theorem.

Back in the context of our P2P system, for those contents that are replicated at most M′ times (i.e., the contents considered in part (i)), the loss probability is, by inequality (23), at least E(min_i ν_i, M′U), which proves part (ii). The above lemma readily implies the following corollary:

Corollary 2: Under the assumptions of Lemma 1, the overall rejection probability is at least (1/2) E(min_i ν_i, M′U). Indeed, for bounded M, M′ is also bounded, and E(min_i ν_i, M′U) is bounded away from 0. ⋄

Thus, even when the system load ρ is strictly less than 1, with bounded M there is a non-vanishing fraction of rejected requests, and hence a suboptimal use of bandwidth. We consider the following "Modified Proportional-to-Product Placement": each of the M storage slots at a given box b contains a randomly chosen content. The probability of selecting one particular content c is ν_i/(ρBU) if it belongs to class i. In addition, we assume that the selections for all such MB storage slots are done independently of one another.

Remark 3: This content placement strategy can be viewed as a "balls-and-bins" experiment.
All the MB cache slots in the system are regarded as balls, and all the |C| (= Σ_i α_i B) contents are regarded as bins. We throw each of the MB balls at random among all the |C| bins. Bin c (corresponding to content c, which belongs to class i) is chosen with probability ν_i/(ρBU). Alternatively, the resulting allocation can be viewed as a bipartite random graph connecting boxes to contents. ⋄

Note that this strategy differs from the "proportional-to-product" placement strategy proposed in Section IV, in that it allows for multiple copies of the same content at the same box. However, by the birthday paradox, we can prove the following lemma, which shows that, up to a negligible fraction of boxes, the above content placement does coincide with the proportional-to-product strategy.

Lemma 2: Under the above content placement strategy, at any given box, if M ≪ √((min_i α_i) B),

Pr(all the M cached contents are different) ≈ 1.    (24)

⋄ Proof: In the birthday paradox, if there are m people and n equally likely birthdays, the probability that all m people have different birthdays is close to 1 whenever m ≪ √n. In our problem, at a given box, the M cache slots are regarded as "people" and the |C| contents are regarded as "birthdays." Although the probability of picking one content is non-uniform, the probability of picking one content within a specific class is uniform. One can think of picking a content for a cache slot as a two-step process: with probability α_i ν_i / Σ_j α_j ν_j, a content in class i is chosen; then, conditioned on class i, a specific content is chosen uniformly at random among all the α_i B contents in class i. Contents from different classes are obviously different. When M ≪ √(α_i B), even if all the M cached contents are from class i, the probability that they are all different is close to 1. Thus, M ≪ √((min_i α_i) B) is sufficient for (24) to hold.
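The birthday-paradox estimate in Lemma 2 is easy to check numerically. The sketch below (hypothetical helper names; using uniform class weights is a simplifying assumption of this illustration, whereas the lemma allows weights α_i ν_i / Σ_j α_j ν_j) computes the exact all-distinct probability in the uniform case and a Monte-Carlo estimate for the two-step class-based placement:

```python
import random

def prob_all_distinct_uniform(M, n):
    """Exact probability that M draws from n equally likely 'birthdays'
    are all distinct: n(n-1)...(n-M+1) / n^M."""
    p = 1.0
    for k in range(M):
        p *= (n - k) / n
    return p

def prob_all_distinct_classes(M, class_sizes, trials=20000, seed=0):
    """Monte-Carlo estimate for the two-step placement of Lemma 2:
    pick a class (uniformly here, for simplicity), then a content
    uniformly within the class; check whether M picks are distinct."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        picks = set()
        distinct = True
        for _ in range(M):
            i = rng.randrange(len(class_sizes))        # choose a class
            c = (i, rng.randrange(class_sizes[i]))     # content in class i
            if c in picks:
                distinct = False
                break
            picks.add(c)
        ok += distinct
    return ok / trials
```

For example, with M = 3 slots and two classes of 1000 contents each, collisions are already rare, matching the M ≪ √((min_i α_i) B) regime.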
To prove that under this particular placement, inefficiency in bandwidth utilization vanishes as M → ∞, we shall in fact consider a slight modification of the "request repacking" strategy considered so far for determining which requests to accept: a parameter L > 0 is fixed. Each box b maintains at all times a counter Z_b of associated requests. For any content c, the following procedure is used by the server whenever a request arrives: a random set of L distinct boxes, each of which holds a replica of content c, is selected. An attempt is made to associate the newly arrived request with all L boxes, but the request is rejected if its acceptance would lead any of the corresponding box counters to exceed LU.

Remark 4: Note that under this acceptance rule, associating a request with a set of L boxes does not mean that the requested content will be downloaded from all these L boxes. In fact, as before, the download stream comes from only one of the L boxes, but here we do not specify which one is to be picked. It is readily seen that the above rule defines a loss network. Moreover, it is a stricter acceptance rule than the previously considered one. Indeed, it can be verified that whenever all ongoing requests have an associated set of L boxes whose counters are no larger than LU, there exist nonnegative integers Z_cb certifying that the original feasibility condition is satisfied.

We introduce an additional assumption, needed for technical reasons.

Assumption 1: A content that is too poorly replicated is never served. Specifically, a content must be replicated at least M^{3/4} times to be eligible for service. ⋄

Our main result in this context is the following theorem:

Theorem 3: Consider fixed M, α_i, ν_i, and corresponding load ρ < 1.
Then, for a suitable choice of the parameter L, with high probability (with respect to the placement) as B → ∞, the loss network with the above "modified proportional-to-product placement" and "counter-based acceptance rule" admits a content rejection probability of at most φ(M), for some function φ(M) decreasing to zero as M → ∞. ⋄

The interpretation of this theorem is as follows: the fraction of lost service opportunities, for an underloaded system (ρ < 1), vanishes as M increases. Thus, while Corollary 2 showed that M → ∞ is necessary for optimal performance, this theorem shows that it is also sufficient: there is no need for M to grow at some minimal speed with B (e.g., M ≥ log B) to ensure that the loss rate becomes negligible. The proof is given in Appendix A.

In peer-to-peer video-on-demand systems, information on content popularity can be exploited to design optimal content placement strategies, which minimize the fraction of rejected requests in the system, or equivalently, maximize the utilization of peers' uplink bandwidth resources. We focused on P2P systems where the number of users is large. For the limited content catalogue size scenario, we proved the optimality of proportional-to-product placement in the Distributed Server Network architecture, and the optimality of the "Hot-Warm-Cold" placement in the Pure P2P Network architecture. For the large content catalogue scenario, we also established that proportional-to-product placement leads to optimal performance in the Distributed Server Network. Many interesting questions remain. To name only two, more general popularity distributions (e.g., Zipf) for the large catalogue scenario could be investigated; the efficiency of adaptive cache update rules such as the one discussed in Section IV-A, or of classical alternatives such as LRU, in conjunction with a loss network operation, also deserves more detailed analysis.
As N_c = Σ_{i=1}^{MB} Z_i, where the Z_i ∼ Ber(p) (with p ≜ ν_i/(ρBU)) are i.i.d., the Chernoff bound gives

Pr( N_c ≥ ν_i M/(ρU) + M^{2/3} ) ≤ e^{−MB·I(a)},    (26)

where a ≜ (M^{2/3} + ν_i M/(ρU))/(MB) and I(x) ≜ sup_θ {xθ − ln(E[e^{θZ_i}])} is the Cramér transform of the Bernoulli random variable Z_i. Instead of directly deriving the RHS of inequality (26), which can be done but requires lengthy calculations (see Appendix G), we upper-bound it by a much simpler approach: for the same deviation, a classical upper bound on the Chernoff bound of a binomial random variable is provided by the Chernoff bound of a Poisson random variable with the same mean (see, e.g., [5]). Therefore, the RHS of inequality (26) can be upper-bounded by a term whose exponent involves Î(x), the Cramér transform of a unit-mean Poisson random variable, i.e., Î(x) = x log x − x + 1. By Taylor's expansion of Î(x) at x = 1, the exponent in the last expression is equivalent to −Θ(M^{1/3}). On the other hand, when M is large, ν_i M/(ρU) − M^{2/3} ≥ 0 holds, hence we have

Pr( N_c ≤ ν_i M/(ρU) − M^{2/3} ) ≤ e^{−MB·I(−â)},

where (−Ẑ_i) ∼ Ber(p) and â ≜ M^{−1/3}/B − p ∈ [−1, 0] when B is large; it is easy to check that the Cramér transform of Ẑ_i evaluated at â equals I(−â). Similarly as above, by upper-bounding e^{−MB·I(−â)}, we find that the exponent of this upper bound is also −Θ(M^{1/3}). Therefore, a given content of class i is "good" (i.e., satisfies inequality (25)) with probability at least 1 − 2e^{−Ω(M^{1/3})}.

2) The number of "good contents" in each class: Denoting by X_i the number of good contents in class i, we use a corollary of the Azuma-Hoeffding inequality (see, e.g., Section 12.5.1 in [10] or Corollary 6.4 in [5]) to bound the probability of its deviation from its mean. This corollary applies to a function f of independent variables ξ_1, . . . , ξ_n, and states that if the function changes by no more than some constant c when only one component ξ_i has its value changed, then for all t > 0,

Pr( |f(ξ) − E[f(ξ)]| ≥ t ) ≤ 2e^{−2t²/(nc²)}.

Back in our problem, each independent variable ξ_j corresponds to the choice of a content to be placed in a particular memory slot at a particular box (we index a slot by j for 1 ≤ j ≤ MB), and f(ξ) corresponds to the number of good contents in class i under the placement ξ, i.e., X_i = f(ξ).
It is easy to see that in our case c = 1, hence we have Pr(|X_i − E[X_i]| ≥ t) ≤ 2e^{−2t²/(MB)}. Taking t = (MB)^{2/3} in the above inequality, and using the estimate of E[X_i] from Stage 1, further yields inequality (29): X_i ≥ E[X_i] − (MB)^{2/3} with probability at least 1 − 2e^{−2(MB)^{1/3}}. Note that in order for this lower bound on X_i to be Θ(B), M ∼ o(B^{1/2}) is a sufficient condition.

3) The chance for a box to be "good": We call a replica "good" if it is a replica of a good content, and use C_i to denote the total number of good replicas of class i. We also call a box "good" if, for every class i, the number ζ_i of good replicas of class i held by this box lies within α_i ν_i M/(ρU) ± O(M^{2/3}). As we did for "good contents," we use the Chernoff bound to prove that a box is good with high probability. Let E_i denote the event that the number X_i of good contents within class i satisfies the lower bound of Stage 2; by inequality (29), E_i has probability at least 1 − 2e^{−Ω((MB)^{1/3})} when M ∼ o(B^{1/2}). Conditional on E_i, the lower bound in inequality (25) (i.e., the definition of "good contents") together with inequality (30) yields inequality (31); on the other hand, the upper bound in inequality (25) and the fact that X_i ≤ α_i B yield inequality (32). The quantity ζ_i can then be sandwiched, in the sense of stochastic ordering, between two binomial random variables G_i and H_i, whose second parameters are determined according to inequalities (32) and (31) respectively; this is why we need these two "binomial bounds" on ζ_i. Applying to G_i and H_i a Chernoff bounding approach similar to that used for N_c in Stage 1 of this proof yields inequality (35), and putting inequality (35) back into inequality (33) immediately results in inequality (36): a given box is good with high probability.

4) The number of "good boxes": We use a similar approach as in Stage 2 to bound the number of good boxes, say Y, which can be represented as a function g(ξ), where ξ = (ξ_1, ξ_2, . . . , ξ_MB) is the same content placement vector as defined in Stage 2.
Still, g(ξ) changes by no more than 1 when a single component ξ_i has its value changed, so for all t > 0, Pr(|Y − E[Y]| ≥ t) ≤ 2e^{−2t²/(MB)}; taking t = (MB)^{2/3} yields the corresponding deviation bound for Y. Similarly as we obtained inequality (29), we finally arrive at inequality (37): all but a vanishing fraction of the boxes are good, with probability at least 1 − 2e^{−2(MB)^{1/3}}.

Finally, consider the performance of the loss network defined by the "counter-based acceptance rule." We introduce an auxiliary system to establish an upper bound on the rejection rate. In the auxiliary system, upon arrival of a request for content c, L different requests are mapped to L distinct boxes holding a replica of c, but here they are accepted or rejected individually rather than jointly. Letting Z_b (respectively, Z′_b) denote the number of requests associated with box b in the original (respectively, auxiliary) system, one readily sees that Z_b ≤ Z′_b at all times and at all boxes, and that for each box b, the process Z′_b evolves as a one-dimensional loss network. We now want to upper-bound the overall arrival rate of requests to a good box.

(a) Non-good contents: Assume that upon a request arrival, we in fact pick L content replicas, rather than L distinct boxes holding the requested content (as specified in the acceptance rule). This entails that, if two replicas of this content are present at one box, then this box can be picked twice. However, since a vanishing fraction of boxes hold more than one replica of the same content when M ≪ √((min_i α_i) B) (as proved in Lemma 2), we can strengthen the definition of a "good" box to require that, on top of the previous properties, a good box holds replicas of M distinct contents. It is easy to see that the fraction of good boxes is still of the same order as under the original, weaker definition. With these modified definitions, consider one non-good content c of class i cached at a good box. Its unique replica will be picked with probability L/N_c when the sampling of L replicas among the N_c existing ones is performed.
Thus, since we ignore requests for any content c with N_c ≤ M^{3/4} (according to Assumption 1), the request rate is at most ν_i L M^{−3/4}. Besides, at most O(M^{2/3}) non-good content replicas are held by any one good box, for the following reason: by definition, a good box holds at least M − O(M^{2/3}) good content replicas across all classes, so the remaining slots, occupied by non-good content replicas, number at most O(M^{2/3}). Therefore, the overall arrival rate of requests for non-good contents to a good box is upper-bounded by

ν_non-good ≜ O(M^{2/3}) · max_i ν_i · L M^{−3/4} = O(L M^{−1/12}).    (39)

(b) Good contents: The rate generated by a good content c of class i is ν_i L/N_c. Now, by the definition of a good content, one has N_c ≥ ν_i M/(ρU) − M^{2/3} = (ν_i M/(ρU))(1 − O(M^{−1/3})). This entails that the rate of requests for this content is upper-bounded by (ρLU/M)(1 + O(M^{−1/3})). By the definition of a "good box," at most α_i ν_i M/(ρU) + O(M^{2/3}) good content replicas of class i are cached at this good box. Therefore, the overall arrival rate of requests for good contents to a good box is upper-bounded by

ν_good ≜ Σ_{i∈I} [α_i ν_i M/(ρU) + O(M^{2/3})] · (ρLU/M)(1 + O(M^{−1/3})) = ρLU(1 + O(M^{−1/3})).    (40)

To conclude, for any good box b, the process Z′_b evolves as a one-dimensional loss network with arrival rate no larger than ν̄ = ν_non-good + ν_good = ρLU + O(L M^{−1/12}), by combining the two results (39) and (40). Using the Chernoff bound, for a Poisson random variable S with mean λ we have Pr(S ≥ C) ≤ e^{−λ Î(C/λ)}, with Î(x) = x log x − x + 1 as before. Back in our problem, with λ = ν̄ and C = LU, Î(C/λ) = Î((ρ + O(M^{−1/12}))^{−1}); hence the loss probability of the loss network Z′_b is at most e^{−Θ(L)}, where this estimate holds under the assumption that ρ < 1 (otherwise, the exponent would no longer remain of order −Θ(L)). The number of good replicas in good boxes is, by inequality (37) and equation (38), at least MB(1 − O(M^{−1/3})), with high probability (at least 1 − 2e^{−2(MB)^{1/3}}). On the other hand, the total number of replicas of good contents is at most MB, the total number of replicas (or available cache slots). Now pick some small ǫ ∈ (0, 1/3) and let X denote the number of good contents having at least M^{2/3+ǫ} replicas outside good boxes.
Then necessarily, with probability at least 1 − 2e^{−2(MB)^{1/3}},

X · M^{2/3+ǫ} ≤ MB − MB(1 − O(M^{−1/3})) = O(B M^{2/3}),

i.e., X ≤ O(B M^{−ǫ}). According to inequality (29), the total number of good contents is Θ(B) (specifically, very close to |C| = αB) with probability at least 1 − 2|I| e^{−2(MB)^{1/3}}; hence we can conclude that, with high probability, a fraction of at least 1 − O(M^{−ǫ}) of the good contents each have at least a fraction 1 − O(M^{−1/3+ǫ}) of their replicas stored in good boxes (since a good content has ν_i M/(ρU) ± O(M^{2/3}) replicas in total, by definition). We use C̄ to denote the set of such contents. Recall that A_c was defined in Subsection IV-B as the steady-state probability of accepting a request for content c in the original system; the preceding estimates yield the announced lower bound on A_c for all c ∈ C̄.

We now consider the computational complexity of this approximation algorithm. Assuming that {ν_c} is sorted in descending order, we obtain a lower bound P* on Pr(Σ_{c∈C} X_c = M), so the computational complexity is upper-bounded by O(BC/P*). Note that the constant parameter β can be adjusted to increase Pr(Σ_{c∈C} X_c = M) and thereby reduce the computational complexity. To achieve this, we can simply choose a β that maximizes the lower bound P*, i.e., a root of equation (44). The server can use any numerical method (e.g., Newton's method) to seek a root of equation (44). In fact, this lower bound P* on Pr(Σ_{c∈C} X_c = M) is not tight, since it is just the largest term in the sum expression. When the popularity distribution is close to uniform (e.g., when the exponent α of a Zipf-like distribution is small), this largest term is no longer dominant, so the lower bound P* is quite loose, which means we actually overestimate the computational complexity by evaluating only this upper bound. However, this does not affect the real gain obtained by choosing the optimal β according to equation (44). Recall that we also proposed a simple sampling strategy in Section IV-A.
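For concreteness, the two sampling procedures being compared can be sketched as follows (hypothetical function names; `weights` stands for the per-content selection probabilities, e.g. ν_i/(ρBU) up to normalization):

```python
import random

def simple_sample(weights, M, rng):
    """'Simple sampling': draw M contents i.i.d. proportionally to
    `weights`, and resample the whole batch until all M are distinct.
    Returns (set_of_contents, number_of_rounds)."""
    contents = list(range(len(weights)))
    rounds = 0
    while True:
        rounds += 1
        batch = rng.choices(contents, weights=weights, k=M)
        if len(set(batch)) == M:
            return set(batch), rounds

def slot_by_slot_sample(weights, M, rng):
    """Per-slot sampling as in the modified proportional-to-product
    placement: each of the M slots is filled independently, and
    duplicates are allowed."""
    contents = list(range(len(weights)))
    return rng.choices(contents, weights=weights, k=M)
```

Under skewed weights, `simple_sample` needs many rounds (duplicates among the popular contents are frequent), whereas `slot_by_slot_sample` always finishes in one pass, which is the trade-off discussed next.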
It is easy to see that when some contents are much more popular than the others (e.g., when the Zipf-like exponent α is large), the probability that duplicates appear in one size-M sample is high, which greatly increases the number of resampling rounds. In that case it is faster to choose the Bernoulli sampling. However, when the popularity distribution is fairly uniform, the simple sampling works very well. An extreme case is the uniform popularity distribution, under which the probability that one size-M sample contains no duplicates equals C(C − 1) · · · (C − M + 1)/C^M, which shows that when C is large, a valid sample is obtained almost every time.

1) A Heuristic Repacking Algorithm: We first describe the concept of "repacking." When the cache size M = 1, all the bandwidth resources at a given box belong to the single content that the box caches. When M ≥ 2, however, this is not the case: all the contents cached at one box compete for the bandwidth resources of that box. Consider a simple example in which B = 2, M = 2, and U = 1: box 1, which caches contents 1 and 2, is serving a download of content 2, while box 2, which caches contents 2 and 3, is idle. When a request for content 1 arrives, the only potential candidate to serve it is box 1; but since box 1's only connection is already occupied by a download of content 2, the request for content 1 has to be rejected. However, if this ongoing download can be "forwarded" to the idle box 2, the new request can be satisfied without interrupting the old one. We call this type of forwarding "repacking." In the feasibility condition (1) and its equivalent form (2), we actually allow perfect repacking to identify a feasible {n_c}. In a real system, perfect repacking requires enumerating all possible serving patterns and choosing the best one according to some criterion, which is usually computationally infeasible. We therefore propose a heuristic repacking algorithm, which is far less complex yet achieves similar functionality and improves performance, although imperfectly.
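The toy example above (B = 2, M = 2, U = 1) can be traced in code. The sketch below implements only a single forwarding step with hypothetical data structures; the heuristic algorithm described next is more general (it can chain repacking steps and tracks orphan contents):

```python
# Minimal sketch of one repacking step on the toy example above:
# 2 boxes, 2 cache slots each, U = 1 upload connection per box.

cache = {1: {1, 2}, 2: {2, 3}}       # box -> set of cached contents
uploads = {1: [2], 2: []}            # box -> contents currently uploaded
U = 1

def serve_request(content):
    """Try to serve a request, forwarding one ongoing upload if needed."""
    # Direct service: a box caching the content with a free connection.
    for b, cached in cache.items():
        if content in cached and len(uploads[b]) < U:
            uploads[b].append(content)
            return True
    # One-step repacking: free a busy box caching the content by
    # forwarding one of its ongoing uploads to an idle box that also
    # caches that uploaded content.
    for b, cached in cache.items():
        if content not in cached:
            continue
        for ongoing in uploads[b]:
            for b2 in cache:
                if b2 != b and ongoing in cache[b2] and len(uploads[b2]) < U:
                    uploads[b2].append(ongoing)   # forward the old download
                    uploads[b].remove(ongoing)
                    uploads[b].append(content)    # accept the new request
                    return True
    return False  # reject

# A request for content 1 arrives: box 1 is busy with content 2, but
# that download can be forwarded to box 2, so the request is accepted.
```

Running `serve_request(1)` on this state returns True and leaves box 1 uploading content 1 and box 2 uploading content 2, exactly the forwarding described in the example.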
Several variables need to be defined before we describe the algorithm:
• n_c: the system-wide number of ongoing downloads of content c, not counting downloads from the server.
• u_b: a U-dimensional vector whose i-th component represents the content that box b is uploading over its i-th connection (the value 0 represents a free connection).
• c_o: the "orphan content," which is affiliated with a new request or an ongoing download but has not yet been assigned to any box.
• C_o: the set of contents that have at some point been chosen as orphan contents.
• t_R: the number of repacking operations already performed.

Note that when choosing a box to serve a request, load balancing is already taken into account, which to some extent reduces the need for repacking in later operations. However, repacking is still needed as soon as a request for content c arrives while ∪_{k>0} B_c^k = ∅, i.e., while no box holding content c has a free connection.

In fact, the external users issuing requests could keep local copies of previously accessed contents, and hence experience "local service" upon re-accessing the same content. We need not consider this, however, as it happens outside the perimeter of our system. Note that this construction in fact admits a form of fixed routing, equivalently transformed from a dynamic routing model in which each box is regarded as a link and calls of type c can use any single-link route corresponding to a box holding content c. This equivalent transformation is based on the assumption that repacking is allowed (cf. Section 3.3 in [7]); we already encountered it when converting feasibility condition (1) into (2) in Section III. We can collect enough samples during each observation period of 10 time units (for example, when ρ = 1, B = 4000 and U = 4, the average number of arrivals would be 160,000).
It has also been checked that, after the warm-up period, the distribution of cache states closely approximates the proportional-to-product placement and remains quite stable for the rest of the observation period. In fact, around ρ = 1, the strategies perform slightly worse than the optimum. The reason is that ρ = 1 is the "critical traffic load" (the separation point between the zero-loss and nonzero-loss regimes), around which the simulation results are more prone to deviate from the theoretical values.
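The sensitivity around the critical load ρ = 1 can be reproduced qualitatively even with a stripped-down simulator. The sketch below is a toy (a single content, exponential holding times, the whole system treated as one loss server with B·U connections), far simpler than the full simulation setup described above; its empirical rejection probability transitions from near 0 for ρ < 1 to roughly 1 − 1/ρ for ρ > 1:

```python
import random, heapq

def simulate_loss_system(B, U, nu_total, T, seed=0):
    """Toy M/M/(B*U)/(B*U) loss simulation: requests arrive at total
    rate nu_total, each occupies one of the B*U upload connections for
    an Exp(1) holding time; blocked requests are lost. Returns the
    empirical rejection probability over [0, T]."""
    rng = random.Random(seed)
    t, busy, arrivals, losses = 0.0, 0, 0, 0
    departures = []                      # min-heap of departure times
    capacity = B * U
    while t < T:
        t += rng.expovariate(nu_total)   # next arrival epoch
        while departures and departures[0] <= t:
            heapq.heappop(departures)    # release finished uploads
            busy -= 1
        arrivals += 1
        if busy < capacity:
            busy += 1
            heapq.heappush(departures, t + rng.expovariate(1.0))
        else:
            losses += 1
    return losses / arrivals
```

With capacity B·U = 200, offered load ρ = 0.5 gives essentially zero loss, while ρ = 2 gives a loss fraction close to 1/2, illustrating the sharp change of regime at ρ = 1.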
