Scalable Distributed Video-on-Demand: Theoretical Bounds and Practical Algorithms

We analyze a distributed system where n nodes called boxes store a large set of videos and collaborate to serve simultaneously n videos or less. We explore under which conditions such a system can be scalable while serving any sequence of demands. We…

Authors: Laurent Viennot (INRIA Rocquencourt), Yacine Boufkhad (INRIA Rocquencourt, LIAFA)

INRIA Research Report No. 6496, April 2008, 19 pages
Institut National de Recherche en Informatique et en Automatique, Unité de recherche INRIA Rocquencourt
Theme COM: Communicating systems. Project GANG.
ISSN 0249-6399, ISRN INRIA/RR--6496--FR+ENG

Laurent Viennot (INRIA Rocquencourt, France), Yacine Boufkhad (LIAFA, Paris, France), Fabien Mathieu (Orange Labs, Issy-les-Moulineaux, France), Fabien de Montgolfier (LIAFA, Paris, France), Diego Perino (Orange Labs, Issy-les-Moulineaux, France)

Supported by ANR project ALADDIN. Supported by collaborative project MARDI II between INRIA and Orange Labs.

Abstract: We analyze a distributed system where n nodes called boxes store a large set of videos and collaborate to serve simultaneously n videos or less. We explore under which conditions such a system can be scalable while serving any sequence of demands. We model this problem through a combination of two algorithms: a video allocation algorithm and a connection scheduling algorithm. The latter plays against an adversary that incrementally proposes video requests. Our main parameters are: the ratio u of the average upload bandwidth of a box to the playback rate of a video; the maximum number of connections c used for downloading a video; the number m of distinct videos stored in the system, i.e. its catalog size.

In a homogeneous system (i.e. all node capacities are equal) where a box downloads its video with no more than c equal-rate connections, we give necessary conditions for achieving scalable catalog size. In particular, we prove for that case a lower bound $u \ge \max\left(1 + \frac{1}{c}, \mu\right)$, where $\mu \ge 1$ is the maximum growth factor of any swarm of boxes viewing the same video during a period of time equivalent to the start-up delay (our model tolerates swarms growing exponentially with time). On the other hand, we prove that catalog size $\Omega(n)$ can be achieved with a centralized scheduling algorithm when $u \ge \max\left(1 + \frac{1}{c}, \mu\right)$, $c \ge 2$ and nodes are reliable.

Additionally, we propose a distributed connection scheduling algorithm associated with a random video allocation scheme for heterogeneous systems where box upload capacity is proportional to storage capacity. It achieves catalog size $\Omega(n/\log n)$ and successfully handles a sequence of $O(n)$ adversarial events with high probability as long as $u \ge \mu + \frac{1}{c}$. As a special case, it can be used to solve single video distribution with $O(1)$ reliable seed boxes, or $O(\log n)$ unreliable seed boxes, with constant capacities.

Key-words: video-on-demand, scalability, peer-to-peer
Figure 1: Generic box description, and possible Video-on-Demand architectures. Panels: (a) Box description (stream rate, upload capacity, storage capacity, caching, reception); (b) Fully centralized; (c) Cached servers; (d) Peer-assisted; (e) Fully distributed.

1 Introduction

1.1 Background

The quest for scalability has yielded a tremendous amount of work in the field of distributed systems in the last decade. Most recently, the peer-to-peer community has grown upon the extreme model where small capacity entities collaborate to form a system whose overall capacity grows proportionally to its size. Historically, the first peer-to-peer systems were devoted to collaborative storage (see, e.g., [11, 22, 13]). The academic community has proposed numerous distributed solutions to index the contents stored in such a system. Most prominently, one can mention the numerous distributed hash table proposals (see, e.g., [21, 23, 20, 24]).

Extreme attention has then been paid to content distribution. There now exist efficient schemes for single file distribution [8]. Several proposals were made to cooperatively distribute a stream of data (see, e.g., [5, 17, 26, 27, 14, 10]). The main difficulty in streaming is to obtain low delay and balanced forwarding load.

Most recently, the problem of collaborative video-on-demand has been addressed. It has mainly been studied under the single video distribution problem: how to collaboratively download a video file and view it at the same time [15, 6, 3, 2, 14, 12, 7, 16, 9]. This somehow combines both file sharing and streaming difficulties. On the one hand, participants are interested in different parts of the video.
On the other hand, an important design goal resides in achieving a small start-up delay, i.e. the delay between the request for the video and the start of playback. Most of these solutions rely on a central server for providing the primary copy of a video to the set of entities collaboratively viewing it.

Following the pioneering idea of Suh et al. [25], we propose to explore the conditions for achieving fully distributed scalable video-on-demand systems. One important goal is then to enable a large distributed catalog, i.e. a large number of distinct primary video copies stored in a distributed manner. We thus consider the entities storing the primary copies of the videos as part of the video-on-demand system. This model can encompass various architectures like a centralized system with download-only clients, a peer-assisted server as assumed in many proposed solutions, a distributed server with download-only clients or a fully distributed system as proposed in [25]. These scenarios are illustrated by Figure 1. The fully distributed architecture is mainly motivated by the existence of set-top boxes placed directly in user homes by Internet service providers. As these boxes may combine both storage and networking capacities, they become an interesting target for building a low cost distributed video-on-demand system that would be an alternative to more centralized systems.

1.2 Related Work

A significant amount of work has been done on peer-assisted video-on-demand, where there is still a server (or a server farm) which stores the whole catalog. Annapureddy et al. [3] investigate the distribution (on-demand) of a single video. They propose an algorithm that uses a combination of network coding, segment scheduling and overlay management in order to handle high stream rates and low start-up delays even under flash crowd scenarios.
This follows an approach similar to [14], which consists in grouping viewers of the same segment of the video together. Adaptations of the BitTorrent protocol to single video distribution are proposed in [16, 7]. Cheng et al. [6] propose connections to nodes at different positions in the video to enable VCR-like features (seeking, fast-forwarding, ...). A thorough analysis of single video distribution under Poisson arrivals is made in [15], where strategies for pre-fetching future content are simulated against real traces. Caching strategies are tested against real traces in [2]. It is proposed in [16] to use a distributed hash table to index the videos cached by each node. However, there is no guarantee that the videos stay in cache. All these solutions rely on a centralized server for feeding the system with primary copies of videos.

To the best of our knowledge, only a few attempts have been made so far to investigate the possibility of a server-free video-on-demand architecture. Suh et al. proposed the Push-to-Peer scheme [25], where the primary copies of the catalog are pushed onto set-top boxes that are used for video-on-demand. The paper addresses the problem of fully distributing the system (including the storage of primary copies of videos), but scalability of the catalog is not a concern. Indeed, a constant size catalog is achieved: each box stores a portion of each video. A code-based scheme is combined with a window slicing of the videos and a pre-fetching of every video. The paper is mainly dedicated to a complex analysis of queuing models to show how low start-up delay and sufficiently fast download of videos can be achieved. The system is tailored for boxes with upload capacity lower than the playback rate. As we will see, this is a reason why a scalable catalog cannot be achieved in this setting.
Finally, in a preliminary work [4], we began to analyze the conditions for catalog scalability. That work mainly focuses on the problem of serving pairwise distinct videos with a distributed system with homogeneous capacities and no node failure. Most notably, an upper bound of $n + O(1)$ is shown for catalog size when upload is too scarce. A distributed video-on-demand system is sketched, based on pairwise distinct requests and using any existing single video distribution algorithm for handling multiply requested videos.

We extend this work much further, to multiple requests, the heterogeneous case and node churn scenarios. We can now provide an upper bound of $o(n)$ for catalog size when upload is scarce and multiple requests are allowed. Secondly, we prove that the maximum flow technique proposed for pairwise distinct requests can be extended to answer any demand with possible multiplicity. This requires a much more involved proof. Additionally, we give insight on heterogeneous systems where nodes may have different capacities from one another. Finally, we propose a distributed algorithm combining both primary video copy distribution and replication of multiply requested videos. Let us now give more details about the contributions of the present paper.

1.3 Contribution

This paper mainly proposes a model for studying the conditions that enable scalable video-on-demand. Most importantly, we focus on scalable catalog size and scalable communication schemes. Our approach consists in first formulating necessary requirements for scalability and then trying to design algorithms based on these minimal assumptions.

We call boxes the entities forming the system. Most notably, we require that a box downloads a video using a limited number c of connections. This is a classical assumption for having a scalable communication maintenance cost in an overlay network.
Note that efficient n-node overlay network proposals usually try to achieve $c = O(\log n)$. Equivalently, we assume that video data and video streams cannot be divided into infinitely small units. With at most c connections, a single connection should have rate at least $\frac{1}{c}$, where 1 corresponds to the normalized playback rate of the video. Similarly, as connections have to remain steady during long periods of time with regard to the start-up delay $t_S$, a box should store portions of video data of size at least $\frac{t_S}{c}$. These assumptions of a minimal unit of data or a minimal connection rate provided by a box of the system are particularly natural when one faces the problem of distributing video data on several entities: one has to define some elementary chunk size and distribute one or more chunks per entity.

We first show that these discrete nature assumptions on connection rates and chunk size give rise to an upload bandwidth threshold. If the average upload u is no more than 1, scalable catalog size cannot be achieved; a minimal average upload of $1 + \frac{1}{c}$ is thus required. Theorem 2 states this as soon as $c = O(n^{\varepsilon})$ for any $\varepsilon < \frac{1}{2}$ (e.g., c is constant or bounded by a poly-logarithmic function of n). Moreover, a distributed video-on-demand system cannot achieve scalable catalog size if the number of arrivals for a given video increases too rapidly. We call swarm of a video the set of boxes playing it. If the swarm of a video can increase by a multiplicative factor $\mu > 1$ during a period equivalent to the start-up delay $t_S$, then it is necessary to have upload $u \ge \mu$ to replicate the video data sufficiently quickly (see Theorem 1). These lower bounds on u mainly rely on the assumption that with large catalog size, some video must be replicated on a limited number of boxes.
(This assumption may be deduced from our bound c on the number of connections or may be taken for itself.)

On the other hand, we give algorithms for enabling scalable video-on-demand. We model the algorithmic part of a video-on-demand system with two algorithms: a video allocation algorithm is responsible for placing video data on boxes, and a scheduling algorithm is responsible for managing video requests proposed by an adversary, i.e. for proposing connections for each box to download its desired video. We build two scheduling algorithms based on random allocation of video data.

Let us first remark that it is not possible to resist node failures if some video has its data on a limited number of boxes: an adversary can place node failure events on these boxes and then request the video. We thus propose a first scheduler under the assumption that no node fails and that we meet the conditions $u \ge \max\left(1 + \frac{1}{c}, \mu\right)$ and $c \ge 2$. The problem of finding suitable connections for downloading all videos reduces to a maximum flow problem for a given set of requests and a given allocation of videos. We thus propose a centralized scheduler running a maximum flow algorithm. Although a centralized tracker for orchestrating connections has already been proposed in several peer-to-peer architectures, it is not clear whether this maximum flow computation could be made in a scalable way. The benefit of this algorithm is thus mainly theoretical: it helps to understand the nature of the problem. Theorem 3 states that a random allocation enables a catalog of size $\Omega(n)$ and makes it possible to manage any infinite sequence of adversarial requests with high probability (as long as the adversary cannot propose node failures). The problem of scalable video-on-demand can thus be solved with optimal upload capacity in theory. Interestingly, this scheme shows that the best catalog size is obtained when the storage capacity of boxes is proportional to their upload capacity.

Table 1: Key parameters

  n       Number of boxes for serving videos.
  m       Number of videos stored in the system (catalog size).
  $d_i$   Storage capacity of box i (in number of videos).
  d       Average storage capacity of boxes.
  k       Number of duplicate copies of a video with random allocation ($k \approx nd/m$).
  $u_i$   Upload capacity of box i (in number of full video streams).
  u       Average upload capacity of boxes.
  c       Maximum number of connections for downloading a video.
  s       Number of stripes of videos (a video can be viewed by downloading its s stripes simultaneously).
  a       Minimum ratio of active boxes in a homogeneous system.
  $t_S$   Start-up delay: maximum delay to start playing a video.
  $v_S$   Maximum number of arrivals during $t_S$ for a video not being played.
  µ       Bound on swarm growth: if a swarm has size p at time t, it has size less than µp at time $t + t_S$.

Additionally, we propose a randomized distributed scheduler based on priority to playback caching, i.e. relying on the fact that boxes playing a video can redistribute it. Giving priority to such connections makes the scheduler resilient to exponential swarm growth. We show that with the random allocation of $\Omega(n/\log n)$ videos in a system where the average storage capacity is $d = \Omega(\log n / c)$ per box, this scheduler can manage $O(n)$ realistic adversarial events with high probability under the assumption that $u \ge \mu + \frac{1}{c}$ and the adversary is not aware of the scheduler and allocation algorithm choices (see Theorem 4). Interestingly, our use of playback caching makes it possible to build disjoint forwarding trees for video data in a way similar to Splitstream [5].
The main difference is that relaying nodes buffer data before forwarding it, and tree levels are ordered according to the playing position in the video.

The paper is organized as follows. Section 3 exposes the requirements that are needed for the catalog to be scalable. Section 4 investigates the worst case analysis of the problem with no failures, while Section 5 considers more realistic conditions. Then Section 6 confirms the results of the previous sections by means of simulations. Some proofs are given in the appendix due to space limitations. We now introduce our model for video-on-demand systems and the notations used throughout this paper.

2 Model

We first introduce the key concepts of video-on-demand systems and discuss the associated parameters. We first describe the nodes (often called boxes) of the system, then detail how they may connect to each other to exchange data. We then explain how we decompose the algorithmic part of the system and describe adversary models for testing our algorithms.

Video system. We consider a set of n boxes used to serve videos among themselves. Box i has a storage capacity of $d_i$ videos and an upload capacity equivalent to $u_i$ video streams. For instance, if $u_i = 1$, box i can upload exactly one stream (we suppose all videos are encoded at the same bitrate, normalized to 1). Such a system will be called an $(n, u, d)$-video system, where $u = \frac{1}{n}\sum_{i=1}^{n} u_i$ is the average upload capacity and $d = \frac{1}{n}\sum_{i=1}^{n} d_i$ is the average storage capacity. A system is homogeneous when $u_i = u$ and $d_i = d$ for all i. Otherwise, we say it is heterogeneous. The special case when storage capacities are proportional to upload capacities (i.e. $d_i = \frac{d}{u} u_i$ for all i) is called proportionally heterogeneous.

The box activity is defined as a state.
Box i is active when it can achieve a stable upload capacity no less than $u_i$, and inactive otherwise (e.g. when it is under failure or turned off by its user). We suppose that the ratio a of active boxes remains roughly constant. We assume that the nodes with higher capacity are not more prone to failure than the other nodes, so the average upload capacity of active boxes remains larger than u. An active box may be playing when it downloads a video, or idle otherwise. The set of boxes playing the same video v is called a swarm.

Node churn occurs as a sequence of events consisting in changing the state of a box. Swarm churn designates the events concerning a given swarm. We will see in Section 3.1 that scalability cannot be achieved when a swarm grows too rapidly. We thus assume a bounded growth factor µ: during a period of time $t_S$ ($t_S$ is defined below), the size of a swarm is multiplied by a factor µ at most. More precisely, and to remove any quantification issues, we assume that the number of events for a given swarm and a given period of time t is at most $\mu^{t/t_S}$. (For convenience, we aggregate the various types of swarm churn within the same bound.)

Connections. We assume that finding and establishing connections, and setting up a small buffer for starting video playback, takes time. We call start-up delay the maximal duration $t_S$ for a box to connect to other boxes and begin playback. We consider that the number $c_n$ of connections for downloading a video is bounded by some constant c. The reason is that with a constant swarm churn rate, a box will have to change $\Omega(c_n)$ connections per unit of time. As changing a connection has some latency $\Omega(t_S)$, this number should remain bounded or grow very slowly with n. In connection with this assumption, we suppose that the data of a video cannot be split into infinitely small pieces.
We thus consider that a connection has minimal rate $\frac{1}{c}$ (this is obviously the case when connection rates are equally balanced, and it can be modeled by aggregating unitary connections otherwise). Therefore the minimal piece of video data stored on a box is $\Omega\left(\frac{1}{c}\right)$ (a trivial lower bound of $\frac{t_S}{c}$ follows from the previous assumptions).

A peer-to-peer video system without any external video sourcing relies on the possibility to replicate a video as it becomes more popular and the number of requests for the video increases. The most straightforward way to do this is to cache in each box the video it is currently playing, which is natural if we want to provide some VCR functionalities. We call this facility playback caching: boxes of a swarm can serve as a relay for the boxes viewing an earlier part of the video. Note that in order to bring some flexibility in the swarm, the video can be split into time windows, thus making non-linear viewing possible. Time windowing also reduces the problem to the case where all videos have approximately the same duration.

Video data manipulations. We consider that all videos have the same playback rate, same size and same duration (all three equal to 1, as they are taken as reference for expressing quantities). To enable multi-source upload of a video, each video may be divided into s equal-size stripes using some balanced encoding scheme. The video can then be viewed by downloading simultaneously the s stripes at rate $1/s$. A very simple way of achieving striping consists in splitting the video file into a sequence of small packets. Stripe i is then made of the packets with number equal to i modulo s. Note that our connection number limitation imposes $s \le c$.
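The packet-based striping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_into_stripes` and `merge_stripes` and the in-memory list of packets are hypothetical choices made for the example, not part of the report.

```python
from typing import List


def split_into_stripes(packets: List[bytes], s: int) -> List[List[bytes]]:
    """Return s stripes; stripe i holds the packets whose number is i modulo s.

    Hypothetical helper: the report only states the modulo rule.
    """
    if s < 1:
        raise ValueError("the number of stripes s must be at least 1")
    # packets[i::s] selects packets i, i + s, i + 2s, ...
    return [packets[i::s] for i in range(s)]


def merge_stripes(stripes: List[List[bytes]]) -> List[bytes]:
    """Interleave the stripes back into the original packet order."""
    s = len(stripes)
    total = sum(len(stripe) for stripe in stripes)
    # Packet number j lives in stripe j mod s, at position j // s.
    return [stripes[j % s][j // s] for j in range(total)]


if __name__ == "__main__":
    video = [bytes([j]) for j in range(10)]  # ten one-byte packets
    stripes = split_into_stripes(video, 3)
    print([len(stripe) for stripe in stripes])
    print(merge_stripes(stripes) == video)
```

With ten packets and s = 3, stripe 0 holds packets 0, 3, 6 and 9, while stripes 1 and 2 hold three packets each; a box that receives all three stripes in parallel can interleave them back into playback order.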
There are two main reasons for using stripes: it makes it possible to build internal-node-disjoint trees, as discussed in Section 5, and it lets a box upload sub-streams of rate $\frac{1}{s}$ to fully use its upload capacity. Stripes may also enable redundancy through correcting codes at the cost of some upload overhead: downloading $(1 - \varepsilon)s$ stripes is then sufficient to decode the full video stream (e.g. using LT-codes [18] or rateless encoding [19]). For the sake of simplicity, we assume that s can be large enough to consider all $u_i s$ and $d_i s$ as integers.

As mentioned previously, a video can be distributed among several boxes by splitting it according to time windows. However, considering all the time windows of all videos being played at a given time, we are back to fundamentally the same problem. For that reason, we do not develop time windowing.

Video scheme. A video allocation algorithm is responsible for placing primary copies of each video in the system, respecting the storage capacity constraints of boxes. The simplest scheme consists in storing them statically: video data may be replicated but primary copies of videos are static. Video allocation only changes when new boxes are added to the system or when the catalog is updated. For instance, re-allocating primary copies under node churn would not be practical when live connections consume most of the upload capacity of the system. We assume that the catalog renewal is made at a much larger time scale. Its size and storage allocation are thus considered fixed during a period of several playback times. We may assume that the catalog remains the same during such periods. Of course, as the system evolves over a long period of time, some videos are added or removed. The catalog size is the number of distinct videos allocated.
When a box state changes, a scheduling algorithm decides how to update the connections of playing boxes so that their video is downloaded at rate greater than 1 and all box upload capacities are respected. For our theoretical bounds (Section 4), we use a centralized scheduler that has full knowledge of the system. For practical algorithms (Section 5), we consider distributed scheduling algorithms: each time a box changes its state, it runs the scheduling algorithm on its own. The scheduling algorithm succeeds if it can establish connections to download the full video stream in time less than $t_S$.

We call video scheme a combination of an allocation scheme and a scheduling algorithm. We say that a video scheme achieves catalog size m if the allocation scheme can store m videos in the system so that the scheduling algorithm succeeds in handling all requests of an adversary.

The adversary knows the list of videos in the catalog and proposes any sequence of node state changes that respects our model assumptions. In its weaker form, it is not aware of the decisions made by the allocation and scheduling algorithms. This is a realistic assumption, as there is no reason for user requests to be correlated to something the users of the system are not aware of. Worst case analysis is obtained with the strong adversary, which is the most powerful adversary possible. It is additionally aware of the choices made by the allocation and scheduling algorithms. In particular, it knows which boxes contain replicas of a given video. If not specified, the adversary is not strong.

3 Necessary Conditions for Catalog Scalability

Let us first give some trivial requirements. The total upload is at most un and, as all active boxes may be playing, the total download capacity needed may be n, so we trivially deduce the lower bound $u \ge 1$.
As the total storage space of active boxes can be as low as adn (assuming that the average storage capacity of active boxes remains d), we have $m \le adn$.

Let us first remark that if we relax the constraint on bounded connectivity, then ideal storage of adn videos can theoretically be achieved in any proportionally heterogeneous $(n, 1, d)$-video system when $c = n$. As stated in the homogeneous case [25], full striping can achieve this. It consists in splitting each video into n stripes, one per box. Viewing a video then requires connecting to all other boxes. This result can easily be generalized to the proportionally heterogeneous case with node failures using correcting codes. Such schemes are impractical for large n but give a theoretical solution.

On the other hand, we show that some upload provisioning is necessary in our more realistic model. The main hypothesis implying these results is that some video is replicated on $O\left(\frac{n}{m}\right)$ boxes at most, i.e. $o(n)$ if the catalog scales. First note that as soon as a video spans at most $o(n)$ boxes, the system cannot tolerate n strong adversarial events. Indeed, the strong adversary can propose failure events on all boxes possessing a given video and then propose a request for the video.

3.1 Maximal Swarm Churn Rate

We now state that the arrival rate in a given swarm must be lower than the average upload. This is our first nontrivial lower bound on average upload.

Theorem 1. Any homogeneous $(n, u, d)$-video system achieving catalog size m and resilient to swarm growth µ satisfies
$$u \ge \min\left\{2,\ \mu - O\left(\frac{1}{m}\right)\right\}.$$

For small start-up delay, a realistic value of µ would certainly be less than 2. Scalable catalog size is then achievable for $u \ge \mu$ only.

Proof. We consider a scenario where boxes are viewing different videos, and all of them switch to the same video, forming a swarm with growth factor µ.
The swarm of the video thus has size $v_S$ at time 0, $v_S \mu$ at time $t_S$, $v_S \mu^2$ at time $2 t_S$, and more generally size $v_S \mu^i$ at time $i t_S$. We choose a video that is replicated at most $k = O\left(\frac{n}{m}\right)$ times in the system. If this data is possessed by sufficiently many boxes, it can be replicated k times initially. Consider the number of times $x_i$ the data of the video is replicated outside the swarm at time $i t_S$. Suppose that all boxes possessing the video either serve new arrivals or pro-actively replicate it with their remaining bandwidth. We then have
$$v_S \mu^{i+1} + x_{i+1} \le v_S u \mu^i + (u - 1) x_i,$$
as the video data must be received by all boxes in the swarm and by the boxes outside the swarm that replicate it. Suppose $u \le \mu$ (otherwise the proof is already over). We get $x_{i+1} \le (u - 1) x_i$, and thus $x_i \le (u - 1)^i k$ as $x_0 \le k$. The former inequality thus gives
$$u + \frac{k}{v_S}\left(\frac{u - 1}{\mu}\right)^i \ge \mu.$$
If $u < 2$, we obtain for $i = \log_\mu n$ that $u \ge \mu - \frac{k}{v_S n} = \mu - O\left(\frac{1}{m}\right)$. □

Additionally, we can prove that a strict upload of 1 is not sufficient even under low pace arrivals.

3.2 Upload Capacity versus Catalog Size

We thus assume in this section that $c = O(n^{\varepsilon})$ for some $\varepsilon > 0$. This is for example the case when c is a poly-log of n, as often assumed in overlay networks [21, 23, 24]. (The rest of the paper assumes a constant c.) With this bound, we can establish the following trade-off between average upload capacity and achievable catalog size.

Theorem 2. For any $\varepsilon > 0$, a homogeneous $(n, u, d)$-video system with $u \le 1$ and $c = O(n^{\varepsilon})$ that can play any demand of n videos in the no-failure strong adversary model has catalog size $m = O\left(n^{1/2 + \varepsilon}\right)$.

The above result states that a video system with scarce capacity scales poorly with n.
As it is valid in the no failure strong adversary model, it remains valid in the strong adversarial model. With our discrete vision of connections, it implies that a minimal upload $u \ge 1 + \frac{1}{c}$ is necessary for scalability.

Proof. Suppose there exists $\varepsilon > 0$ with $\varepsilon < \frac{1}{2}$ such that $c < n^\varepsilon$. As discussed in Section 2, we use our assumption that a box stores no less than $\frac{t_S}{c}$ data of a given video. Suppose by contradiction that there exists a video system with catalog size $m > \frac{2d}{t_S} n^{1/2+\varepsilon} > \frac{2dc}{t_S}\sqrt{n}$. As the overall storage capacity is $dn$, there exists some video $v$ whose data is replicated at most $\frac{dn}{m} \le \frac{t_S}{2c}\sqrt{n}$ times. As useful portions of data of $v$ have size at least $\frac{t_S}{c}$, the set $E$ of boxes storing data of $v$ has size at most $\frac{1}{2}\sqrt{n}$. Let $F = \overline{E}$ be its complement. Set $p = |E| \le \frac{1}{2}\sqrt{n}$ and $q = |F|$.

Now consider the possible request sequence where all boxes $b_1, \ldots, b_q$ of $F$ successively begin to play $v$ while boxes of $E$ play videos not stored at all among boxes in $E \cup \{b_q\}$. Box $b_i$ can download $v$ from $E_i = E \cup \{b_1, \ldots, b_{i-1}\}$. Boxes of $E$ can only download from $F' = F \setminus \{b_q\}$. Suppose that data of $v$ flows from $E$ to $F'$ at rate $p'$ and from $E$ to $b_q$ at rate $p''$. We have $p' + p'' \le p$ since the overall upload capacity of $E$ is $p$. Data of $v$ flows internally to $F'$ at rate at least $q - 1 - p'$. The remaining upload capacity to serve $E$ is thus $p' - (1 - p'') \le p - 1$, as $F'$ must additionally serve $b_q$ at rate $1 - p''$. This implies that the number of videos not stored at all on $E \cup \{b_q\}$ is at most $p - 1$. (Otherwise, we have a request that cannot be satisfied.) As a box contains data of $\frac{dc}{t_S}$ distinct videos at most, we deduce $m \le \frac{dc}{t_S}(p+1) + p - 1 \le \frac{dc}{t_S}\sqrt{n} < \frac{d}{t_S} n^{1/2+\varepsilon}$. This is a contradiction and we deduce $m = O\left(n^{1/2+\varepsilon}\right)$.
□

We deduce from the previous results that $u \ge \max\left\{1 + \frac{1}{c}, \mu\right\}$ is a minimal requirement for scalability. We now show that it is indeed sufficient.

4 Strong Adversary Video Scheme

We now propose a video scheme achieving catalog size $\Omega(n)$ in the no failure strong adversary model for any video system with average upload $u \ge \max\left\{1 + \frac{1}{c}, \mu\right\}$. It is based on random allocation of video stripes using $s = c$ stripes per video and uses a maximum flow scheduler.

4.1 Random Allocation

Random allocation consists in storing $k$ copies of each stripe by choosing $k$ boxes uniformly at random. This approach was proposed by Boufkhad & al. [4] using a purely random graph with independent choices. This has the disadvantage of unbalancing the quantity of data stored in each box. We thus prefer to consider a regular bipartite graph where all storage space is used on all boxes. We could obtain the same bounds for the purely random graph. Analysis is slightly more complicated in our case. For the sake of simplicity, we assume $k = dn/m$ is an integer. A regular random allocation consists in copying each stripe in $k$ boxes such that each box contains exactly $ds$ stripe copies. We model this through a random permutation $\pi$ of the $kms$ stripe copies into the $dns$ storage slots of the $n$ boxes together: copy $i$ is stored in slot $\pi(i)$ (the $d_1 s$ first slots fall into the first box, the $d_2 s$ next slots into the second box, and so on). The best catalog size is obtained for the smallest possible value of $k$. We call random allocation scheme the video allocation algorithm consisting in selecting uniformly at random a permutation $\pi$ and in allocating videos according to $\pi$.

4.2 Maximum Flow Scheduler

We propose a connection scheduler relying on playback caching. Each time a node state changes, a centralized tracker considers the multiset of stripe requests, i.e.
the union of all the video stripes being played (some stripes may be played multiple times), and tries to match stripe requests against boxes so that box $i$ has degree at most $u_i s$. We can model this problem as a flow computation in the following bipartite graph between stripe requests and the boxes storing these stripes. An arc of capacity 1 links every stripe request to all boxes where it is stored (either through the static allocation scheme or through playback caching). The scheduling algorithm consists in running a maximum flow algorithm to find a flow from stripe requests to boxes with the following constraints: each request has an outgoing flow of 1 and box $i$ has an incoming flow of $u_i s$ at most.

We prove that a random regular graph using $s \le c$ stripes with $u \ge \max\left\{1 + \frac{1}{s}, \mu\right\}$ has the following property with high probability: for any multiset of $n$ requests at most, a flow with the desired constraints exists. The proof consists in proving that a random regular allocation graph has some expander property with high probability. A min-cut max-flow theorem then allows us to conclude and state the following theorem.

Theorem 3 Consider a proportionally heterogeneous $(n, u, d)$-video system with $u \ge \max\left\{1 + \frac{1}{c}, \mu\right\}$ and $c \ge 2$. Random regular allocation combined with the maximum flow scheduler allows to achieve catalog size $\Omega(dn/\log_u d)$ and to manage successfully any infinite sequence of strong adversarial events excepting node failures with high probability.

The proof generalizes in a non-trivial manner the proof of [4], which assumes a purely random graph allocation, pairwise distinct requests and homogeneous capacities. Due to space limitations, the proof is given in Appendix A.
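The matching step of the maximum flow scheduler can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `schedule`, `holders` and `capacity` are hypothetical, and a plain augmenting-path search (the Ford-Fulkerson idea specialized to this unit-capacity bipartite graph) stands in for whichever maximum flow algorithm a tracker would actually run.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Set


def schedule(requests: Sequence[Hashable],
             holders: Dict[Hashable, List[str]],
             capacity: Dict[str, int]) -> Optional[List[str]]:
    """Assign every stripe request to a box storing that stripe.

    requests : multiset of stripe identifiers (duplicates allowed)
    holders  : stripe -> boxes storing it (static allocation or playback cache)
    capacity : box -> maximum number of uploaded stripes (u_i * s in the text)
    Returns one box per request, or None if some request cannot be served.
    """
    assigned: List[Optional[str]] = [None] * len(requests)
    served: Dict[str, List[int]] = {box: [] for box in capacity}

    def augment(r: int, seen: Set[str]) -> bool:
        # Try to place request r, re-routing already placed requests if needed.
        for box in holders.get(requests[r], []):
            if box in seen or box not in served:
                continue
            seen.add(box)
            if len(served[box]) < capacity[box]:
                served[box].append(r)
                assigned[r] = box
                return True
            for other in list(served[box]):
                # Free one slot of this box by moving another request elsewhere.
                if augment(other, seen):
                    served[box].remove(other)
                    served[box].append(r)
                    assigned[r] = box
                    return True
        return False

    for r in range(len(requests)):
        if not augment(r, set()):
            return None  # no flow saturating all requests exists
    return assigned
```

For instance, with `holders = {"a0": ["b1", "b2"], "a1": ["b1"], "c0": ["b2"]}` and capacities `{"b1": 1, "b2": 2}`, the second request forces the first one to be re-routed from `b1` to `b2`, which is the kind of reconfiguration a flow computation performs implicitly.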
4.3 Heterogeneous Capacities

As discussed in Appendix A, in the case of heterogeneous capacities, the proof requires the following balance condition. For every set $E$ of boxes with overall upload capacity $U_E = \sum_{b \in E} u_b$ and overall storage capacity $D_E = \sum_{b \in E} d_b$, we have for some $u' \ge \mu + \frac{1}{s}$: $\frac{U_E}{D_E} \ge \frac{u'}{d}$. (The number of copies per stripe in the allocation graph is then $k = O(\log_{u'} d)$.) Note that $u' = u$ in the proportionally heterogeneous case and that $u' \le u$ in general. Having storage capacity proportional to upload capacity is thus the best situation to optimally benefit from the box capacities. In the general heterogeneous case, a possible random allocation scheme consists in using only storage $d'_b = d\,\frac{u_b}{u''}$ for each box $b$, for some $u'' \ge u$ achieving best storage capacity. If box upload capacities are within a constant ratio, this will achieve a catalog size within a constant ratio of the balanced scheme.

4.4 Poor Upload Capacity Boxes

Special care has to be taken for a heterogeneous $(n, u, d)$-video system where some boxes have upload capacity smaller than $\mu$. We say that such boxes are poor. The above connection scheduler may be defeated by downloading the same video on a large set $E$ of such poor boxes, as it may not support exponential growth. This comes from the fact that the storage space for the video coming from playback caching may get larger than $U_E$. The above condition on the balance between storage and upload is then violated by playback caching storage.

The general heterogeneous case is reduced to the case where upload capacities are all greater than or equal to $\mu$ thanks to the following lemma. (This is the last step of the proof of Theorem 3.) Due to space limitations, the proof is given in Appendix A.
Lemma 1 Consider an $(n, u, d)$-video system $A$ with $n_P$ boxes of upload less than $\mu$ having overall upload capacity $U_P$ and a video allocation scheme with $s$ stripes satisfying $u \ge \mu$. There exists an $\left(n, u, d + \frac{n_P - U_P/\mu}{n}\right)$-video system $B$ with the same video allocation and, for each box $b$, upload capacity $u'_b$ satisfying $\mu \le u'_b \le u_b$, and the same average upload $u$, that can emulate any scheme of $A$ in the no node failure strong adversary model.

The idea behind this reduction is to statically reserve some upload bandwidth of rich boxes for poor boxes. The average upload of both systems is thus the same. When a poor box $b$ with upload $u_b < \mu$ downloads a video, it directly downloads $u_b s/\mu$ stripes as in the scheduling of $A$ and downloads the others through relaying by the rich boxes it is associated to. The rich boxes also insert the stripes they forward in their playback cache. This explains why more storage capacity is required. The proof is given in Appendix A.

5 Distributed Video Scheme

5.1 Purely Random Allocation

The videos are stored in the boxes according to a purely random allocation scheme: each stripe of a video is replicated $k$ times, where $s$ still denotes the number of stripes per video. Each replica is stored in a box chosen independently at random. Box $i$ is chosen with probability $\frac{d_i}{dn}$. It is possible to add a video to the system as long as the $k$ chosen boxes have sufficient remaining storage capacity. Such an allocation scheme is qualified as purely random.

5.2 Playback Cache First Scheduler

We now propose a randomized distributed scheduling algorithm. The main idea of our scheduler is to give priority to playback cache over allocated videos to allow swarm growth $\mu$. Only one upload connection is reserved for video allocation uploading. An average upload $u \ge \mu + \frac{1}{s}$ will thus be required.
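As an illustration of the purely random allocation of Section 5.1, here is a minimal Python sketch. It assumes storage is counted in stripe slots per box (so box $i$ is drawn with probability proportional to its slots, i.e. $d_i/(dn)$) and that allocation simply stops at the first video whose chosen boxes lack room; the function name `purely_random_allocation` and its parameters are hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple


def purely_random_allocation(slots: List[int], m: int, s: int, k: int,
                             seed: Optional[int] = None
                             ) -> Tuple[Dict[Tuple[int, int], List[int]], int]:
    """Place k replicas of each of the s stripes of up to m videos.

    slots[i] is the number of stripe copies box i can store.
    Returns (placement, stored), where placement maps (video, stripe) to the
    list of chosen boxes and stored is the number of videos fully allocated.
    """
    rng = random.Random(seed)
    free = list(slots)
    boxes = range(len(slots))
    placement: Dict[Tuple[int, int], List[int]] = {}
    for video in range(m):
        # Independent draws: box i is picked with probability slots[i] / sum(slots).
        draws = [rng.choices(boxes, weights=slots, k=k) for _ in range(s)]
        need = [0] * len(slots)
        for replicas in draws:
            for box in replicas:
                need[box] += 1
        # The video is added only if every chosen box still has enough room.
        if any(need[box] > free[box] for box in boxes):
            return placement, video
        for stripe, replicas in enumerate(draws):
            placement[(video, stripe)] = replicas
        for box in boxes:
            free[box] -= need[box]
    return placement, m
```

Because the draws are independent, two replicas of one stripe may land in the same box; the sketch keeps this behaviour rather than resampling, which is a simplification rather than a claim about the original implementation.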
The s heduling algorithm is split in t w o parts: strip e sear hing and onnetion gran ting. Strip e se ar hing is the algorithm run b y a b o x for nding another b o x p ossessing a giv en strip e. This algorithm relies on a distributed hash table (or an y distributed indexing algorithm) to obtain information ab out a giv en strip e. This index allo ws a b o x to learn the omplete list of b o xes p ossessing the strip e through the video allo ation algorithm and a partial list of b o xes in the video sw arm (i.e. b o xes pla ying the video of the strip e). Strip e sear hing onsists in probing the b o xes in these lists un til a b o x aepts a onnetion for sending the strip e. A onnetion request inludes the strip e requested and the strip e p osition in the strip e le (i.e. an oset p osition indiating the next o tet of video data to b e reeiv ed). A b o x is eligible for a onnetion if it has suien tly man y video strip e data ahead that p osition and if it has suien tly man y upload. This is deided b y the onnetion gran ting algorithm of the b o x reeiving the request. T o giv e priorit y to pla yba k- a he forw arding, b o xes of the allo ation s heme are prob ed only when the sw arm size is less than v S or when a strip e is do wnloaded from a video allo ation op y less than v S times. T o balane upload, sev eral b o xes are rst prob ed at the same time, and an aepting b o x with least n um b er of upload onnetions for the requested video is seleted. Conne tion gr anting is the algorithm run b y a b o x that is prob ed for a onnetion request. Supp ose b o x x reeiv es a onnetion request from b o x y for a strip e of video v . The onnetion gran ting algorithm onsists in the follo wing steps. 1. If b o x x is not viewing v and is already uploading the strip e, it refuses. 2. If b o x x has suien t upload apait y , it aepts. 3. Otherwise, if b o x x is not pla ying v , it refuses. 4. 
Otherwise, if the stripe position of $x$ for that stripe is not sufficiently ahead of the requested stripe position, it refuses.
5. Otherwise, if two or more upload connections of box $x$ concern a stripe of a video different from $v$, $x$ selects one of them at random, closes it and accepts box $y$.
6. Otherwise, if box $x$ is uploading the same stripe to some box $z$ and the requested stripe position of $y$ is sufficiently ahead of the stripe position of $z$, it closes the connection to $z$ and accepts.
7. Otherwise, it refuses.

Note that Steps 4, 5 and 6 can be executed only if box $x$ plays $v$. Step 6 can be executed only if it uploads $us - 1$ stripes of video $v$. (One connection is always reserved to serve allocated stripes.)

A simple optimization in Steps 6 and 7 consists in connection flipping. In Step 6, box $x$ can send to box $z$ the address of box $y$ for re-connecting, as the stripe position of $y$ is sufficiently ahead of the stripe position of $z$ in that case. Box $z$ can then probe box $y$ with the same algorithm. In Step 7, box $y$ can be redirected to any box $x'$ downloading $v$ from $x$ and having a stripe position sufficiently ahead of the stripe position of $y$. Box $y$ can then probe box $x'$ with the same algorithm. This way, a box can find its right position according to stripe position in a downloading tree path of its swarm. Similarly, in Step 4, box $y$ can make a connection flipping with the box from which $x$ is downloading, and go up the downloading chain until it finds its right position.

Note that this algorithm works in a similar manner as SplitStream [5] builds parallel multicast trees for each stripe. The main difference is that each internal node of a tree receives fresh data in a buffer and forwards data which is at least $t_S$ old. That way, a performance blip within one node will not percolate to all nodes behind it in the sub-tree.
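The seven connection granting steps listed above can be sketched in Python as follows. This is only a reading of the steps under stated assumptions: the `Box` and `Upload` classes, the `grant` function and the `margin` parameter (standing for "sufficiently ahead") are hypothetical names, and neither connection flipping nor the reserved seed connection is modelled.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Upload:
    peer: str       # box receiving the stripe
    video: str
    stripe: int
    position: int   # stripe position requested by the peer


@dataclass
class Box:
    name: str
    slots: int                                   # upload connections available
    playing: Optional[str] = None                # video currently played, if any
    positions: Dict[int, int] = field(default_factory=dict)  # stripe -> own position
    uploads: List[Upload] = field(default_factory=list)


def grant(x: Box, y: str, video: str, stripe: int, position: int,
          margin: int = 1, rng: Optional[random.Random] = None) -> bool:
    """Return True if box x accepts the request of box y (steps 1 to 7)."""
    rng = rng or random.Random()
    new = Upload(y, video, stripe, position)
    uploading = any(c.video == video and c.stripe == stripe for c in x.uploads)
    if x.playing != video and uploading:                 # step 1
        return False
    if len(x.uploads) < x.slots:                         # step 2
        x.uploads.append(new)
        return True
    if x.playing != video:                               # step 3
        return False
    if x.positions.get(stripe, 0) < position + margin:   # step 4
        return False
    others = [c for c in x.uploads if c.video != video]
    if len(others) >= 2:                                 # step 5
        x.uploads.remove(rng.choice(others))
        x.uploads.append(new)
        return True
    for c in x.uploads:                                  # step 6
        if (c.video == video and c.stripe == stripe
                and position >= c.position + margin):
            x.uploads.remove(c)
            x.uploads.append(new)
            return True
    return False                                         # step 7
```

In this sketch a refusal simply returns `False`; a fuller version would return a redirection hint so that the requester can apply connection flipping.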
Moreover, this ensures that a node has sufficient time to recover from a parent failure. In addition, trees are ordered according to stripe position: boxes with the foremost playing position in the video get closer to the root, whereas newcomers in the swarm tend to be in lower tree levels. Another interesting point is that nodes downloading from a box with a spare number of connections benefit from this free upload capacity and download at a rate faster than needed, allowing them to fill their buffer.

5.3 Correctness

We cannot prove the resilience of our video scheme against any sequence of adversarial events. The following technical assumption is necessary for our proof and appears as a realistic hypothesis. We assume that a given stripe is searched at most $O(\log r)$ times on boxes storing it through the video allocation scheme. This requirement is met when the sequence of adversarial events respects the two following conditions. First, a constant number of swarms are started on a given video (a realistic assumption if we consider a period of a few playback durations). (There is no restriction on swarm size.) Second, node failures are randomly chosen and a given box is chosen with probability $p_f < 1/v_S$. A sequence of $r$ requests is said to be stress-less if it satisfies these conditions.

Theorem 4 Consider a proportionally heterogeneous $(n, u, d)$-video system with $u \ge \mu + \frac{1}{c}$ and $\frac{dc}{u} = \Omega(\log n)$. For any bound $r = O(n)$, it is possible to allocate $\Omega(n/\log n)$ videos and successfully manage $r$ adversarial stress-less events with high probability.

To prove this theorem, we analyze a simpler unitary video system which can be emulated by any proportionally heterogeneous system with the same overall capacities. Again, we choose to use $s = c$ stripes per video and assume $u \ge \mu + \frac{1}{s}$.
W e view ea h b o x i as the union of u i s unitary b o xes with upload apait y 1 /s (one strip e) and storage apait y d i u i = d u . This redution is indeed p enalizing. Consider t w o unitary b o xes that are part of the same real b o x. In the mo del, strip es stored on one unitary b o x an not b e uploaded b y the other whereas the real b o x ould use t w o uploads slots for an y om bination of t w o strip es of an y of the unitary b o xes. F or some parameter k made expliit later on, a random allo ation of k replias p er strip e is made aording to the purely random allo ation s heme desrib ed previously . This is equiv alen t to supp ose that ea h replia is stored in a unitary b o x  hosen uniformly at random sine the system is prop ortionally heterogeneous. As ea h unitary b o x has a storage apait y of ds u strip es, Cherno 's upp er b ound allo ws to onlude that purely random allo ation of Ω( dsn/u ) strip e replias is p ossible with high probabilit y when ds u = Ω(log n ) . As w e will use k = O (lo g n ) , this a hiev es the required atalog size. Seond, w e simplify the s heduler to an algorithm where t w o s hedulers omp ete. One is allo ating  ahe strip e requests within a sw arm (i.e. the strip e will b e do wnloaded from a pla yba k a hing op y), the other is allo ating se e d strip e requests from the video allo ation p o ol (i.e. the strip e will b e do wnloaded from a unitary b o x p ossessing it through the random allo ation s heme). W e onsider that b oth s heduler op erate indep enden tly . This is a p enalt y with regard to pratial s heduling, where simple heuristis ma y redue onsiderably the n um b er of onits, but it simplies the sto  hasti analysis of the system. The a he s heduler allo ates sw arm strip es and has priorit y: it op erates at real b o x lev el aording to the ab o v e algorithm. 
From the unitary box point of view of the seed scheduler, the cache scheduler disables some unitary boxes. If the unitary box was uploading some allocated stripe, it is canceled and a seed stripe search is triggered. This is where the reservation of one seed stripe per real box is useful in our analysis. A stripe upload connection is canceled when the real box has at least two of them. As the cache scheduler cancels one box uniformly at random, a given seed stripe is searched at most $O(\log n)$ times with high probability. The seed scheduler scans the list of unitary boxes possessing the stripe until a free one is found.

Note that a video request in the real system triggers at most $s$ cache stripe requests and/or $s$ seed stripe requests. A node failure on a box uploading $us - 1$ cache stripes results in $us - 1$ cache stripe requests. Each of them may incur a seed request. The worst event is a video zapping, which is equivalent to both events at the same time¹. $r$ adversarial requests thus result in $(u+1)sr$ seed stripe searches at most.

¹ In the video zapping event from video $v$ to video $v'$, the box can indeed continue to upload the data of $v$, but it cannot continue to download more data. In the worst case, the buffered data may be scarce for all boxes downloading from the box.

Claim 1 All seed stripe searches succeed with probability greater than $1 - O\left(\frac{1}{n}\right)$.

Proof. We take the point of view of the seed scheduler: a unitary box is free if its real box is active and the cache scheduler is not using it. As the adversary and the cache scheduler operate independently from stripe allocation, we make the analysis as if the random choices used for stripe allocation were discovered as seed requests arrive.
In our ase, the purely random s heme onsists in allo ating ea h replia in a unitary b o x  hosen uniformly at random. W e sho w that for k = O (lo g n ) , ea h replia is onsidered at most one with high probabilit y . F or instane, onsider a seed request for a strip e i . Its list of allo ated replias is sanned forw ard. Ea h strip e replia falling in an o upied b o x is disarded un til a replia falls in a free unitary b o x. As observ ed b efore, the set X of unitary b o xes that are either under failure or pla yba k a he forw arding is  hosen indep enden tly from the replia p osition. The set Y of unitary b o xes uploading seeding strip es dep ends from indep enden t  hoies for other strip e replias. The probabilit y p that a replia of strip e i falls in one of the t = | X ∪ Y | o upied unitary b o xes is p = | X ∪ Y | usn . Considering that the n um b er of ativ e b o xes is n a ≥ an and that a v erage upload of ativ e b o xes remains u at least, w e obtain that the n um b er of failed unitary b o xes is at most usn − usn a . As the n um b er of urren t seed onnetions is | Y | , the n um b er of a he onnetions is at most sn a − | Y | . W e th us ha v e | X | ≤ u ( n − n a ) s + sn a − | Y | and | X ∪ Y | ≤ u ( n − n a ) s + sn a ≤ usn (1 − (1 − 1 u ) a ) . W e th us ha v e p ≤ 1 − (1 − 1 u ) a . As r = O ( n ) , the n um b er of strip e requests is at most λn for some λ > 0 . As disussed previously , the reserv ation of one strip e for seed onnetions in real b o xes ensures that a giv en seed onnetion is disarded with probabilit y at most 1 2 . A giv en strip e is th us disarded at most log 2 λn 2 times b y the a he s heduler with probabilit y 1 − 1 λn 2 at least. Similarly , for a giv en strip e, the ev en t that a b o x uploading it with a seed onnetion fails happ ens at most O (log n ) times with high probabilit y at least aording to our stress-less ev en ts h yp othesis. 
Every stripe is thus discarded at most $O(\log n)$ times with high probability. There may be up to $v_S$ seed connections for a given stripe, and stress-less events start at most $\lambda' \log n$ swarms on the video of the stripe for some $\lambda' > 0$. This results in $v_S \lambda' \log n$ stripe searches at most. Finally, we note that, with high probability, every stripe is searched at most $\lambda'' \log n$ times for some constant $\lambda'' > 0$.

The list of replicas of a stripe can thus be seen as a sequence of zeros (when the replica falls in an occupied unitary box) and ones (when the replica is found). A zero occurs with probability less than $p$ and a one with probability more than $1 - p$. We can conclude the proof if the number of ones in all stripe lists is greater than $\lambda'' \log n$ with high probability. As random choices for each replica are independent, we conclude using Chernoff's upper bound that a sequence of $k = O(\log n/(1 - p))$ replicas contains the required number of ones with high probability. (Including the parameters of the model, we use $k = O\left(\frac{v_S}{a}\,\frac{u}{u-1}\log n\right)$.) □

Of course, this vision of consuming the list of replicas of a stripe is particular to our proof. In practice, one can loop back to the beginning of the list when the end is reached.

Claim 2 All cache stripe searches succeed with probability greater than $1 - O\left(\frac{1}{n}\right)$.

Proof. We assume a choice of $s$ such that $u \ge \mu + \frac{1}{s}$. As in Lemma 1, we suppose that a box $i$ with poor upload capacity $u_i < \mu + \frac{1}{s}$ reserves an upload $\mu + \frac{1}{s} - u_i$ on some richer boxes. (Note that this augments the probability of failure for the node, a problem we do not try to analyze here.) A rich box forwarding $i$ stripes to a poor box preferentially accepts connections for these stripes (as for stripes of the video it is playing) up to an upload bandwidth of $\mu i$.
First onsider the ase where a b o x b is en tering the sw arm (i.e. it requests p osition 0 in the strip e le). The sw arm Z of v an b e deomp osed in the set X of b o xes arriv ed in Z b efore time t − t S and the set Y of b o xes arriv ed in Z later on. W e th us ha v e Z = X ⊎ Y and b ∈ Y . If X = ∅ then | Y | ≤ v S aording to the arriv al b ound of our mo del and ea h seed strip e sear h sueeds with high probabilit y as disussed ab o v e. On the other hand, if X 6 = ∅ , w e ha v e | Z | ≤ µ | X | aording to the exp onen tial b ound on sw arm  h urn in our mo del. The b o xes in X ha v e o v erall upload apait y ( u − 1 s ) | X | (inluding the apait y reserv ed on ri her b o xes) and serv e at most | Z | − 1 s times the video (b o x b is still sear hing for a strip e). As u − 1 s ≥ µ , some onnetion slot is free for aepting the strip e onnetion of b o x b . It an alw a ys b e found if b has the full list of b o xes in the sw arm. The fration of b o xes with exeed apait y for the video is th us at most µ u − 1 /s . Note that a sligh tly higher v alue of u ≥ µ + 2 s w ould result in a onstan t fration of no des with exeeded apait y for their video. This w ould allo w to nd one with high probabilit y if the list of random no des in the sw arm has length O (log n ) . No w onsider the ase where a b o x b is reonneting in its sw arm due to some zapping or no de failure ev en t. W e an pro v e similarly that the onnetion ipping algorithm allo ws to nd a no de in the sw arm to onnet to. This relies on the h yp othesis that the n um b er of reonneting no des at p osition t in a video inreases b y a fator µ at most during a p erio d of time t S as assumed b y our mo del (all t yp es of sw arm  h urn are aggregated INRIA S alable Distribute d Vide o-on-Demand 13 in the b ound µ ).  6 Sim ulations In this setion w e ev aluate the p erformane of a pratial allo ation s heme b y the din t of sim ulations. 
This s heme is similar to the one desrib ed in setion 5 but it presen ts t w o main dierenes. Firstly , the storage allo ation is based on a random regular graph obtained b y a p erm utation π of the k ms strip es in to the dns storage slots. This  hoie is motiv ated b y the more pratial asp et of regular random allo ation that allo ws to ompletely ll-in b o xes. Seondly , one the onnetions are established, they annot b e re-negotiated when a new video-request is p erformed. The goal is to test the basi funtionning of the algorithm to understand where onnetion renego iations b eome neessary . W e assume that ev ery no de has a a he of size 1 where it stores all the strip es of the video it is w at hing. W e supp ose video requests arriv e at a onstan t rate, and t S = 2 min utes. As stated in previous setions, the eieny of an allo ation s heme dep ends on the requests pattern. In the follo wing, w e use v e kind of adv ersarial s hedulers to generate video request sequenes:  Greedy adv ersarial . The greedy adv ersarial s heduler  ho oses the request for whi h the system will selet a no de with minimal remaining upload bandwidth (among the set of no des that an b e seleted b y a request in the urren t onguration). This adv ersary mak e greedy deisions. It is strong in the sense that it is a w are of video allo ation and urren t onnetions.  Random . The random s heduler selets a video uniformly at random in the atalog.  Netix . m videos are randomly seleted from the Netix Prize dataset [ 1 ℄ as atalog for our sim ulated system. Requests are p erformed follo wing the real p opularit y distribution observ ed in the dataset.  Netix2 . The m most p opular videos of the Netix Prize dataset are seleted as atalog for our sim ulated system. Requests are p erformed follo wing the real p opularit y distribution of these m videos.  Zipf . 
The s heduler selets videos follo wing a Zipf 's la w p opularit y distribution with γ = 2 . The p eers that p erform a request follo w a sequene of random p erm utations of the n p eers. All our sim ulations are p erformed with n = 1 00 no des, and the results are a v eraged o v er m ultiple runs. 6.1 Impat of the n um b er of opies p er video W e study the maxim um n um b er of requests the system is able to satisfy as a funtion of the n um b er k of opies p er video ( k ≈ nd m ). W e supp ose that no des ma y w at h more than one video (for instane if m ultiple pla yba k devies dep end on a single b o x) so the total n um b er of requests an b e larger than n , ev en if n is the t ypial desired target. W e set s = 1 5 , u = 1 + 1 s and d = 32 . Figure 2 sho ws that the system is able to satisfy at least one request p er no de if k ≥ 6 , indep enden tly from the requests pattern. Moreo v er, for the Random, Netix and Netix2 s hedulers, k ≥ 3 is enough. W e indiate as referene the maxim um n um b er of requests the system an satisfy onsidering the global a v ailable upload bandwidth. Note, that for k ≥ 10 , no des almost fully utilize their upload bandwidth and the system asymptotially attains the maxim um p ossible n um b er of requests. 6.2 V arying the n um b er of strip es W e study the impat of the n um b er of strip es in to whi h videos are split. F or this purp ose, w e set k = 10 , d = 32 and u = 1 + 1 s . Figure 3 sho ws that the system an satisfy n requests or more for all s hedulers but the adv ersarial. With few strip es, the greedy s heduler ma y nd blo  king situations w ere re-onguration of onnetions w ould indeed b e neessary . F or lo w s , more requests are serv ed with other s hedulers. This is not surprising, onsidering that a redution of the n um b er of strip es leads to an inrease of the system global bandwidth. As s inreases, u tends to w ard 1 and the n um b er of satised requests to n . 
Figure 2: Requests satisfied as a function of $k$ (x-axis: number of copies per video $k$; y-axis: number of requests satisfied; curves: Adversarial, Random, Netflix, Netflix2, Zipf, Max). $n = 100$, $d = 32$, $s = 15$, $u = 1 + \frac{1}{s}$.

Figure 3: Requests satisfied as a function of $s$ (x-axis: number of stripes per video; y-axis: number of requests satisfied; curves: Adversarial, Random, Netflix, Netflix2, Zipf, Max). $n = 100$, $d = 32$, $k = 10$, $u = 1 + \frac{1}{s}$.

6.3 Heterogeneous capacities

We analyze the impact on the number of video requests satisfied in the presence of nodes with different upload capacities. The node capacity distribution is a bounded Gaussian distribution with $u = 1 + \frac{1}{s}$ and different variance values. We set $k = 10$, $s = 15$ and $d = 32$. Figure 4 shows the results. Schedulers can satisfy at least $n$ requests for small or large values of upload variance, with a slight loss of efficiency in between. This may come from the fact that we do not use a proportional allocation scheme here.

6.4 Node failures

We evaluate the impact of off-line peers on the number of video requests the system can satisfy. We set $k = 10$, $s = 15$, $d = 32$ and $u = 1 + \frac{1}{s}$. We then randomly select some nodes and we set them inactive for the simulation. Figure 5 shows that the system can satisfy video requests for at least all the active nodes in the system up to 40% failures ($a = 0.6$). Then, a drastic decrease in the performance occurs. As soon as there are 10% of boxes off-line, the adversarial scheduler is able to block the system.

Figure 4: Requests satisfied with heterogeneous capacities (x-axis: upload capacity variance; y-axis: number of requests satisfied; curves: Adversarial, Random, Netflix, Netflix2, Zipf).
$n = 100$, $d = 32$, $k = 10$, $s = 15$, $\bar{u} = 1 + \frac{1}{s}$.

Figure 5: Number of requests satisfied with static off-line peers (x-axis: percentage of off-line nodes; y-axis: number of requests satisfied; curves: Adversarial, Random, Netflix, Netflix2, Zipf). $n = 100$, $d = 32$, $k = 10$, $s = 15$, $u = 1 + \frac{1}{s}$.

7 Conclusion

In this paper, we show an average upload bandwidth threshold for enabling a scalable fully distributed video-on-demand system. Under that threshold, a scalable catalog cannot be achieved. Above the threshold, linear catalog size is possible and the problem of connecting nodes to serve demands reduces to a maximum flow problem. A slight upload provisioning allows to build distributed algorithms achieving scalability.

References

[1] The Netflix prize. http://www.netflixprize.com/.
[2] Matthew S. Allen, Ben Y. Zhao, and Rich Wolski. Deploying video-on-demand services on cable networks. In Proc. of the 27th Int. Conf. on Distributed Computing Systems (ICDCS), pages 63-71, Washington, DC, USA, 2007. IEEE Computer Society.
[3] Siddhartha Annapureddy, Saikat Guha, Christos Gkantsidis, Dinan Gunawardena, and Pablo Rodriguez. Exploring VoD in P2P swarming systems. In INFOCOM, pages 2571-2575, 2007.
[4] Y. Boufkhad, F. Mathieu, F. de Montgolfier, D. Perino, and L. Viennot. Achievable catalog size in peer-to-peer video-on-demand systems. In Proc. of the 7th Int. Workshop on Peer-to-Peer Systems (IPTPS), pages 1-6, 2008.
[5] M. Castro, P. Druschel, A. Kermarrec, A. Nandi, A. Rowstron, and A. Singh. SplitStream: High-bandwidth multicast in cooperative environments. In Proc. of the 19th ACM Symp. on Operating Systems Principles (SOSP), 2003.
[6] Bin Cheng, Xuezheng Liu, Zheng Zhang, and Hai Jin. A measurement study of a Peer-to-Peer Video-on-Demand system. In Sixth International Workshop on Peer-to-Peer Systems (IPTPS), pages 1-6, 2007.
[7] Yung Ryn Choe, Derek L. Schuff, Jagadeesh M. Dyaberi, and Vijay S. Pai. Improving VoD server efficiency with BitTorrent. In MULTIMEDIA '07: Proceedings of the 15th international conference on Multimedia, pages 117-126, New York, NY, USA, 2007. ACM.

[8] B. Cohen. Incentives build robustness in BitTorrent. In Workshop on Economics of Peer-to-Peer Systems, 2003.

[9] Tai Do, Kien A. Hua, and Mounir Tantaoui. P2VoD: Providing fault tolerant video-on-demand streaming in peer-to-peer environment. In Proc. of the IEEE Int. Conf. on Communications (ICC 2004), June 2004.

[10] A.T. Gai and L. Viennot. Incentive, resilience and load balancing in multicasting through clustered de Bruijn overlay network (PrefixStream). In Proceedings of the 14th IEEE International Conference on Networks (ICON), volume 2, pages 1-6. IEEE Computer Society, September 2006.

[11] P. Krishna Gummadi, Stefan Saroiu, and Steven D. Gribble. A measurement study of Napster and Gnutella as examples of peer-to-peer file sharing systems. Computer Communication Review, 32(1):82, 2002.

[12] Yang Guo, Kyoungwon Suh, Jim Kurose, and Don Towsley. P2Cast: peer-to-peer patching scheme for VoD service. In WWW '03: Proceedings of the 12th international conference on World Wide Web, pages 301-309, New York, NY, USA, 2003. ACM.

[13] S. B. Handurukande, A.-M. Kermarrec, F. Le Fessant, L. Massoulié, and S. Patarin. Peer sharing behaviour in the eDonkey network, and implications for the design of server-less file sharing systems. SIGOPS Oper. Syst. Rev., 40(4):359-371, 2006.

[14] X. Hei, C. Liang, J. Liang, Y. Liu, and K. W. Ross. Insights into PPLive: A measurement study of a large-scale P2P IPTV system. In Proc. of IPTV Workshop, International World Wide Web Conference, 2006.

[15] Cheng Huang, Jin Li, and Keith W. Ross. Can internet video-on-demand be profitable? SIGCOMM Comput. Commun. Rev., 37(4):133-144, 2007.

[16] Vaishnav Janardhan and Henning Schulzrinne. Peer assisted VoD for set-top box based IP network. In Peer-to-Peer Streaming and IP-TV Workshop (P2P-TV), pages 1-5, 2007.

[17] D. Kostic, A. Rodriguez, J. Albrecht, and A. Vahdat. Bullet: High bandwidth data dissemination using an overlay mesh, 2003.

[18] Michael Luby. LT codes. In The 43rd Annual IEEE Symposium on Foundations of Computer Science, pages 271-282, 2002.

[19] P. Maymounkov and D. Mazieres. Rateless codes and big downloads, 2003.

[20] Petar Maymounkov and David Mazières. Kademlia: A peer-to-peer information system based on the XOR metric. In IPTPS '01: First International Workshop on Peer-to-Peer Systems, pages 53-65, London, UK, 2002. Springer-Verlag.

[21] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker. A scalable content-addressable network. In SIGCOMM, pages 161-172, New York, NY, USA, 2001. ACM.

[22] M. Ripeanu. Peer-to-peer architecture case study: Gnutella network. In Proc. of the 1st IEEE Int. Conf. on Peer-to-Peer (P2P 2001). IEEE Computer Society, 2001.

[23] Antony I. T. Rowstron and Peter Druschel. Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In Middleware, pages 329-350, 2001.

[24] Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, and Hari Balakrishnan. Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans. Netw., 11(1):17-32, 2003.

[25] Kyoungwon Suh, Christophe Diot, James F. Kurose, Laurent Massoulié, Christoph Neumann, Don Towsley, and Matteo Varvello. Push-to-Peer Video-on-Demand system: design and evaluation. IEEE Journal on Selected Areas in Communications, special issue on Advances in Peer-to-Peer Streaming Systems, 25(9):1706-1716, December 2007.

[26] D. Tran, K. Hua, and T. Do. Zigzag: An efficient peer-to-peer scheme for media streaming, 2003.

[27] D. Xu, M. Hefeeda, S. Hambrusch, and B. Bhargava. On peer-to-peer media streaming. In Proc. of the 22nd Int. Conf. on Distributed Computing Systems (ICDCS), pages 363-371, 2002.

Appendix

A Maximum flow scheduler

We prove Theorem 3 using the two following lemmas. For the sake of clarity, the proof is written for the homogeneous case. We discuss afterwards how it generalizes to heterogeneous capacities.

Lemma 2 (Min-cut max-flow) Consider a bipartite graph from $U$ to $V$ and an integer $b > 0$. There exists a $b$-matching where each node of $U$ has degree 1 and each node of $V$ has degree at most $b$ iff each subset $U' \subseteq U$ has at least $|U'|/b$ neighbors in $V$ (i.e., the graph is a $1/b$-expander).

Proof. The $1/b$-expander property is clearly necessary. We prove it is sufficient by considering the flow network obtained by adding a source node $a$ and a sink node $z$ to the bipartite graph. An edge with capacity 1 is added from $a$ to each node in $U$. Edges of the bipartite graph are directed from $U$ to $V$ and have capacity 1. An edge with capacity $b$ is added from each node in $V$ to $z$. The $1/b$-expander property implies that every cut has capacity at least $|U|$. The well-known min-cut max-flow theorem then allows us to conclude. □

Lemma 3 Consider a random regular permutation graph of $kms = dns$ stripe copies into the $dns$ memory slots of $n$ boxes. The probability that $ki$ given copies fall into $p$ given boxes with $pds \ge ki$ is less than $\left(\frac{p}{n}\right)^{ki}$.

Proof.
Drawing uniformly at random a permutation of the $kms = dns$ stripes amounts to choosing uniformly at random a slot for the first stripe, then a slot for the second among the remaining slots, and so on. The $ki$ stripes are ordered. Let $E_a$ denote the event that the $a$-th copy of stripe falls into one of the $pds$ slots of the $p$ boxes. Then

$$P\Big(\bigcap_{a \le ki} E_a\Big) = P(E_1)\, P(E_2 \mid E_1) \cdots P(E_a \mid E_1 \cap E_2 \cap \cdots \cap E_{a-1}) \cdots = \frac{pds}{nds} \cdot \frac{pds - 1}{nds - 1} \cdots \frac{pds - a + 1}{nds - a + 1} \cdots \le \left(\frac{p}{n}\right)^{ki}$$

(since $\frac{pds - i}{nds - i} \le \frac{pds}{nds}$ for $p \le n$). □

Proof. [of Theorem 3] We assume that $s \le c$ is sufficiently large to ensure $u \ge 1 + \frac{1}{s}$. We suppose $u \ge \mu$ and $s \ge 2$. Consider the multiset of stripe requests at some time $t$. Its size is at most $ns$ as there are no more than $n$ videos played. Let $S$ be a sub-multiset of size $i$ among the requested stripes. Let $i_1$ be the number of pairwise distinct requests in $S$ and $i_2 = i - i_1$ be the number of duplicated requests in $S$. As swarm growth is bounded by $\mu$, there are at least $\alpha i_2$ nodes where duplicate requests can be downloaded, with $\alpha = \frac{1}{\mu}$. Let $B(S)$ denote the set of boxes from which any stripe of $S$ may be downloaded. From Lemma 2, a connection matching for serving the requests can always be found if no multiset $S$ of at most $rs$ requested stripes verifies $|B(S)| < j$ with $j = \frac{i}{us}$. Note that $B(S)$ includes at least the given boxes where duplicate requests may be downloaded thanks to playback caching. This represents at least $\alpha i_2$ boxes, and $|B(S)| \ge j$ for $\alpha i_2 \ge j$. We may thus consider only $\alpha i_2 < i/us$ (implying $i_1 > (1 - 1/\alpha us)\, i$). By summing over all sets of $j = i/us$ boxes and using Lemma 3, we get the following bound relying only on the stripe copies placed according to the video allocation graph (this probability is 0 for $i \le us$):

$$P(|B(S)| < j) \le \binom{n}{j} \left(\frac{j}{n}\right)^{k i_1} \le \left(\frac{unse}{i}\right)^{i/us} \left(\frac{i}{uns}\right)^{k i_1}.$$

The last inequality is obtained by using the standard upper bound of the binomial coefficient ($\binom{b}{a} \le \left(\frac{be}{a}\right)^{a}$). Using Markov's inequality, the probability that some obstruction multiset $S$ for some request exists is bounded by the expected number of such obstructions. By summing the above inequality over all multisets $S$ of at most $ns$ stripes, we get the following bound on the probability $p$ that the graph cannot satisfy all possible requests:

$$p \le \sum_{i = us}^{ns} \; \sum_{i_1 = i(1 - 1/\alpha us)}^{i} M(i, i_1) \left(\frac{unse}{i}\right)^{i/us} \left(\frac{i}{uns}\right)^{k i_1}$$

where $M(i, i_1)$ is the number of multisets of cardinality $i$ taken from sets of stripes of cardinality $i_1$. $M(i, i_1)$ is at most

$$M(i, i_1) \le \binom{\lfloor nds/k \rfloor}{i_1} \binom{i + i_1 - 1}{i_1 - 1} \le \left(\frac{ndse}{ki}\right)^{i} \binom{2i}{i} \le \left(\frac{4ndse}{i}\right)^{i}$$

since $i_1 \le i$ and considering that $k \ge 1$. Notice also that $\left(\frac{i}{uns}\right)^{k i_1} \le \left(\frac{i}{uns}\right)^{ki - ki/\alpha us}$. The probability is then at most:

$$p \le \sum_{i = us}^{ns} \frac{i}{\alpha us} \left(\frac{i}{uns}\right)^{\kappa i} \delta^{i} \le \frac{n}{\alpha u} \sum_{i = us}^{ns} \left(\frac{i}{uns}\right)^{\kappa i} \delta^{i}$$

where $\delta = 4de^{1 + 1/us}/u$ and $\kappa = k - k/\alpha us - 1/us - 1$. It is easy to check that, as a function of $i$, the terms of the sum $\varphi(i) = \left(\frac{i}{uns}\right)^{\kappa i} \delta^{i}$ decrease from $\varphi(us)$, reach a minimum at $\varphi(i^\star) = \varphi\!\left(\frac{uns}{\delta^{1/\kappa} e}\right)$, then increase to $\varphi(ns)$. Using this fact, we bound $p$ by considering separately the sum for $i < i^\star$ and $i > i^\star$ and by replacing each term with the maximum term on its side. On one hand,

$$\frac{n}{\alpha u} \sum_{i = us}^{\lfloor i^\star \rfloor} \left(\frac{i}{uns}\right)^{\kappa i} \delta^{i} \le \frac{n}{\alpha u} \cdot ns \cdot \varphi(us) = \frac{n}{\alpha u} \cdot ns \cdot \frac{1}{n^{\kappa us}}\, \delta^{us} \le O\!\left(\frac{1}{n^{\kappa us}}\right).$$

On the other hand, the sum of the terms of rank greater than $i^\star$ gives

$$\frac{n}{\alpha u} \sum_{i = \lfloor i^\star \rfloor + 1}^{ns} \left(\frac{i}{uns}\right)^{\kappa i} \delta^{i} \le \frac{n}{\alpha u} \cdot ns \cdot u^{-\kappa ns}\, \delta^{ns} \le O\!\left(n^{2} \left(u^{-\kappa} \delta\right)^{ns}\right).$$

Finally, $p \le O\!\left(\frac{1}{n^{\kappa us}}\right) + O\!\left(\left(u^{-\kappa} \delta\right)^{ns}\right)$. For the second term to vanish, we need $u^{-\kappa} \delta < 1$, i.e. $\kappa > \log_u(\delta)$. For this, each stripe must be replicated $k$ times with

$$k > \frac{\alpha us}{\alpha us - 1} \log_u(d) + \frac{\alpha + \alpha us \log_u(4e^{2})}{\alpha us - 1}.$$
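As a numerical illustration of this replication condition, the following Python sketch evaluates $\delta$, $\kappa$ and the lower bound on $k$, and checks that a value of $k$ just above the bound does satisfy $\kappa > \log_u(\delta)$. The function names and the parameter values used in the example ($d = 32$, $u = 1.5$, $s = 15$, $\mu = 1.2$) are assumptions chosen for illustration only; they are not the settings of the simulations.

```python
import math


def log_u(x, u):
    """Logarithm of x in base u (u > 1)."""
    return math.log(x) / math.log(u)


def delta(d, u, s):
    """delta = 4 d e^(1 + 1/(u s)) / u, as defined in the proof."""
    return 4 * d * math.exp(1 + 1 / (u * s)) / u


def kappa(k, u, s, mu):
    """kappa = k - k/(alpha u s) - 1/(u s) - 1, with alpha = 1/mu."""
    alpha = 1 / mu
    return k - k / (alpha * u * s) - 1 / (u * s) - 1


def replication_lower_bound(d, u, s, mu):
    """Right-hand side of the condition
    k > aus/(aus - 1) * log_u(d) + (alpha + aus * log_u(4 e^2)) / (aus - 1),
    where aus = alpha * u * s and alpha = 1/mu.
    """
    alpha = 1 / mu
    aus = alpha * u * s
    if u <= 1 or aus <= 1:
        raise ValueError("the bound requires u > 1 and alpha*u*s > 1")
    return (aus / (aus - 1)) * log_u(d, u) + \
        (alpha + aus * log_u(4 * math.e ** 2, u)) / (aus - 1)


if __name__ == "__main__":
    # Illustrative (assumed) parameters, not taken from the paper's experiments.
    d, u, s, mu = 32, 1.5, 15, 1.2
    bound = replication_lower_bound(d, u, s, mu)
    k = math.floor(bound) + 1  # smallest integer strictly above the bound
    print("lower bound on k:", bound)
    print("kappa > log_u(delta):", kappa(k, u, s, mu) > log_u(delta(d, u, s), u))
```

The bound is conservative: it replaces $e^{1 + 1/us}$ by $e^{2}$, so any $k$ above it satisfies $\kappa > \log_u(\delta)$, but smaller values of $k$ may satisfy it as well.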
For the sake of simplicity, consider $s \ge 2$: the lower bound on the number of replicates is then $k > 2\log_u(d) + 2\log_u(4e^{2}) + 1$. In this case the probability of failure is at most $p \le O\!\left(\frac{1}{n^{\kappa us}}\right)$ (note that $\kappa us > 0$), and the bipartite graph can then satisfy all possible requests with high probability. Since the number of videos that can be stored is $nd/k$ and given the condition on $k$, the storage capacity is $\Omega(nd/\log_u(d))$. □

Now consider the heterogeneous case. Lemma 2 and the above proof may be generalized. Recall that box $b$ has storage capacity $d_b$ and upload capacity $u_b$. The condition for an obstruction then becomes $\sum_{b \in B(S)} s u_b < |S| = i$. We can then consider any subset $E$ of boxes with overall capacities $U_E = \sum_{b \in E} u_b$ and $D_E = \sum_{b \in E} d_b$ such that $U_E < i/s$. As boxes are chosen according to their capacity, the probability to put a stripe in $E$ with random allocation is thus

$$\frac{D_E}{nd} = \frac{D_E U_E}{nd\, U_E} < \frac{D_E}{d\, U_E} \cdot \frac{i}{ns}$$

for an obstruction. Assuming $\frac{d\, U_E}{D_E} \ge u'$ for all $E$, we can follow the same lines as in the proof above, with the probability of obstruction being less than $\frac{j}{n}$ with $j = \frac{i}{u's}$. With a smallest upload capacity of 1 stripe, the total number of such sets $E$ is bounded by $\binom{usn}{i}$ instead of $\binom{n}{j}$ in the above proof. This larger factor in the sum over all multisets of requests is not a problem when taking a slightly larger value of $k$. We can thus get similar bounds as long as $\frac{d\, U_E}{D_E} \ge u'$ for some $u' \ge \max\left\{1 + \frac{1}{c}, \mu\right\}$. Boxes with capacity lower than $u'$ can be grouped with high upload capacity boxes to obtain the desired property, as proposed in Lemma 1.

Poor Upload Capacity Boxes

Proof. [of Lemma 1] Boxes $b$ with upload capacity $u_b < \mu$ are said to be poor. Boxes with upload capacity exactly $\mu$ are said to be medium.
Boxes $b$ with upload capacity $u_b > \mu$ are said to be rich. Let $P$, $M$ and $R$ denote the sets of poor, medium and rich boxes respectively. We set $n_P = |P|$, $n_M = |M|$, and $n_R = |R|$ (note that $n = n_P + n_M + n_R$). Let $U_P$ and $U_R$ be the overall upload capacities of poor and rich boxes respectively, and let $u_P = \frac{U_P}{n_P}$ and $u_R = \frac{U_R}{n_R}$ be the corresponding means.

We construct $B$ from $A$ with the same video allocation. For the sake of simplicity, we assume that $s/\mu$ is an integral value, as well as $u_b s$ for each box $b$. In a pre-processing step, for each poor box $b$, we reserve $\mu\left(1 - \frac{u_b}{\mu}\right)s = \mu s - u_b s$ upload slots from the rich boxes. This upload will be used to forward $\left(1 - \frac{u_b}{\mu}\right)s$ stripes to the poor box and to serve new arrivals in the swarm for up to $(\mu - 1)\left(1 - \frac{u_b}{\mu}\right)s$ stripes. This assignment should balance the overall number $s_b$ of slots reserved on a rich box $b$ such that its remaining upload capacity $u'_b = u_b - \frac{s_b}{s}$ remains no less than $\mu$. (In a proportionally heterogeneous system, one would typically choose $s_b$ proportional to $u_b$.) A corresponding space of $\frac{s_b}{\mu s}$ should additionally be reserved for playback caching. We thus set $d'_j = d_j + \frac{s_j}{\mu s}$. This assignment is possible when $U_R - \mu n_R \ge \mu n_P - U_P$, i.e. $u \ge \mu$.

Now we use a video allocation scheme for capacities $u'_b$, $d'_b$ (where $u'_b = u_b$ and $d'_b = d_b$ for each poor or medium box $b$). The connection scheduler works as previously, except for the download connections of poor boxes. When a poor box $b$ requests a video, the $s$ stripes are downloaded from the boxes decided by the previous scheme. However, $b$ downloads only $\frac{u_b s}{\mu}$ stripes directly; the $\left(1 - \frac{u_b}{\mu}\right)s$ others are downloaded via the rich boxes with upload slots reserved for box $b$. These rich boxes participate in the caching of the stripes they forward instead of $b$. This scheme increases the overall upload capacity of the set $E$ of all boxes caching some stripe requested by $p$ boxes, so that $U_E \ge \mu p$. □
