Unraveling BitTorrent's File Unavailability: Measurements, Analysis and Solution Exploration

BitTorrent suffers from one fundamental problem: the long-term availability of content. This occurs on a massive-scale with 38% of torrents becoming unavailable within the first month. In this paper we explore this problem by performing two large-sca…

Authors: Sebastian Kaune, Ruben Cuevas Rumin, Gareth Tyson

Sebastian Kaune (1), Rubén Cuevas Rumín (2), Gareth Tyson (3), Andreas Mauthe (3), Carmen Guerrero (2), and Ralf Steinmetz (1)

(1) Technische Universität Darmstadt, (2) Universidad Carlos III de Madrid, (3) Lancaster University

Abstract

BitTorrent suffers from one fundamental problem: the long-term availability of content. This occurs on a massive scale, with 38% of torrents becoming unavailable within the first month. In this paper we explore this problem by performing two large-scale measurement studies including 46K torrents and 29M users. The studies go significantly beyond any previous work by combining per-node, per-torrent and system-wide observations to ascertain the causes, characteristics and repercussions of file unavailability. The study confirms the conclusion from previous works that seeders have a significant impact on both performance and availability. However, we also present some crucial new findings: (i) the presence of seeders is not the sole factor involved in file availability, (ii) 23.5% of nodes that operate in seedless torrents can finish their downloads, and (iii) BitTorrent availability is discontinuous, operating in cycles of temporary unavailability. Due to our new findings, we consider it important to revisit the solution space; to this end, we perform large-scale trace-based simulations to explore the potential of two abstract approaches.

1 Introduction

BitTorrent [3] has become a de facto standard for scalable content distribution over the Internet. The reason for its success is its ability to efficiently leverage the uplink capacity of nodes whilst achieving high scalability during peak demands [10, 18]. This efficiency is largely attributable to BitTorrent's tit-for-tat mechanism, which encourages users to share their resources whilst downloading files.
Despite the success of BitTorrent, it still suffers from a significant problem: the long-term availability of content. More specifically, content that is distributed using BitTorrent often becomes unavailable after a relatively short period of time. For example, [7] found that the available lifespan of most torrents is between 30 and 300 hours, whilst 10% of all users fail to successfully download their desired content.

A file can be considered unavailable if one or more of its data pieces are inaccessible to users wishing to download it. The most intuitive reason for this occurrence is that previously successful users in possession of the entire file (seeders) have left the system, leaving only users that possess a subset of the file (leechers). Subsequently, unavailability occurs when these leechers cannot collectively rebuild the complete file from their remaining pieces. Previous research (such as [8][7][13]) has promoted the importance of seeders in regard to availability and concluded that a seedless torrent is unable to reconstruct the file. However, this conclusion is challenged by the observation that some torrents continue to effectively serve files despite lacking any seeders.

In this paper, we devote our attention to understanding and characterizing BitTorrent's file unavailability problem. We strive to discover the scale, causes and repercussions of the problem alongside investigating the possible solution space. To achieve this we have performed two large-scale measurement studies. The first investigates BitTorrent on a macroscopic level by periodically probing over 46K torrents to ascertain their high-level characteristics, such as swarm size and seeder/leecher ratio. The second study investigates BitTorrent on a microscopic level by contacting over 700,000 individual peers in 832 torrents to discover relevant properties such as their download rates and piece availability.
To the best of the authors' knowledge, this is the largest dataset, in terms of both size and collected information, used to investigate file availability in BitTorrent. This allows us to extend previous works to obtain far more accurate results; through this we make a number of interesting findings:

• In 86% of cases, leechers are unable to reconstruct files in the absence of seeders. However, in 14% of cases, leechers can reconstruct the file without any seeders present. We therefore discover that seeders are not the sole factor involved in BitTorrent's unavailability problem. Such torrents achieve this through the possession of large and stable populations as well as high aggregate download rates that enable leechers to quickly replicate rare chunks.
• In 64% of torrents, unavailability is not immutable and, instead, occurs in cyclic periods followed by reoccurring availability. This is due to old seeders returning to swarms in which they previously participated.
• The combination of the two previous observations results in 23.5% of users affected by a lack of seeders actually being able to complete their downloads.
• Users often become frustrated with unavailable torrents that exhibit poor download rates. We observe a chain reaction in which such users abort their downloads, thereby exacerbating unavailability and resulting in further abortions.

These new findings make it crucial to revisit the solution space to investigate behaviour under the new, accurate workload defined by our large-scale dataset. As such, we perform trace-based simulations looking at both traditional single-torrent and cross-torrent approaches to solving the file unavailability problem; our primary results are:

• Single-torrent incentive mechanisms must encourage users to increase the average seeding time to 10 times more than the current average to achieve 99% availability.
• Cross-torrent incentive mechanisms can easily achieve 99% availability but with a performance decrease of 22% for 56% of the users.

The rest of the paper is structured as follows. Section 2 provides related work. Section 3 then details the problem and our measurement methodology. Following this, Section 4 characterises the causes and impact of unavailability. We discover that a primary cause is a lack of seeders, and therefore Section 5 investigates seedless states in BitTorrent. Next, we utilise our measurement study data to explore the potential solution space with trace-based simulations in Section 6. Finally, we conclude the paper in Section 7.

2 Related Work

BitTorrent Measurements: BitTorrent measurement studies can be classified into two different groups. The first type uses log traces from trackers [10, 8, 7, 1], whereas the second type relies on crawling techniques to retrieve the information from the system [18, 19, 21, 13, 17]. The first type of measurement is less intrusive since it does not actively interfere with the system. However, such traces are often problematic to obtain since they require the agreement of content providers. The crawling techniques, on the other hand, can be divided into two categories. In its simplest form, a crawler exploits the BitTorrent protocol to periodically request from the tracker the IP addresses of the clients participating in the torrent [21]. This makes it possible to study the demographics and dynamics of the torrents under analysis. We name this macroscopic crawling. More sophisticated crawlers also contact the clients and retrieve detailed information such as the client ID and their piece bitmap. We name this microscopic crawling. Although microscopic crawling gives more detailed information, it is noticeably less scalable and only allows a few thousand torrents to be studied in parallel [18, 19].
Each approach is effective for addressing particular needs; however, these have not yet been combined to investigate BitTorrent in a holistic way.

BitTorrent's File Availability Analysis: There are only a few works investigating availability issues in BitTorrent systems [7, 15, 13]. Neglia et al. mainly study the tracker/DHT availability of 22,000 torrents obtained from two torrent indexing sites [15]. Guo et al. [7] extended this to model the lifespan of torrents by analyzing a limited number of tracker traces from [15]; it was found that most torrents are short-lived because of an exponentially decreasing peer arrival rate. This model starts from the basis that content is unavailable when there are no seeders present in the swarm. This is, so far, an unverified hypothesis that is important to investigate. Similarly, Menasche et al. also use this hypothesis to investigate the availability of seeders in 45,000 torrents obtained from the Mininova website [13], finding that 40% of swarms lack seeders for more than 15 days in the first month after the torrent's birth.

Improving BitTorrent File Availability: Surprisingly, little research has been performed into addressing file availability in BitTorrent [8, 13, 20]. The most recent work addresses the file availability problem in BitTorrent through file bundling, which extends the online times of the users. Using a queuing-theoretic model and controlled experiments on PlanetLab, the authors show that this approach can reduce waiting time for peers in torrents with highly unavailable seeders. However, their results assume that peers arrive in a constant Poisson process, which is a strong assumption given the measurement results presented in [18, 8] and also in this paper. Guo et al. were the first to propose intriguing ideas and results for cross-torrent collaboration.
Amongst other things, the authors sketch an abstract mechanism for instant inter-torrent collaboration; following this they also evaluate the principles. Yang et al. propose a variation of these ideas by designing a cross-torrent tit-for-tat strategy that assumes repeated interactions of the users. However, this method suffers because, as Piatek et al. show through extensive measurements, 91.5% of peer pairs that occur in a single swarm will never meet again at any later point in time [17]. Piatek et al. subsequently propose an alternative protocol that enables long-term incentives in BitTorrent with the aid of one-hop intermediaries.

3 Problem Background and Methodology

3.1 Defining File Availability

To study and understand the availability of files in BitTorrent we first present a simple model. Let us assume that we have a torrent T, formed by N nodes, managing the download of a file composed of P pieces. Thus, we can define the vector $V_i = [V_{i1}, V_{i2}, \ldots, V_{iP}]$ that contains the information about the pieces stored by peer i: $V_{ij} = 1$ if node i has piece j; $V_{ij} = 0$ if node i does not have piece j. $V_i$ is typically known as the bitfield of node i. We define the Percentage of Available Pieces of torrent T at a time instant t as

$$ U(T) = \frac{\sum_{j=1}^{P} OR(V_{ij})}{P} \qquad (1) $$

where $OR(V_{ij})$ represents the logical OR operation over piece j across all the nodes in torrent T.

3.2 The Circumstances of Unavailability

It is important to understand in which circumstances a file becomes unavailable, based on our definition. A file is considered unavailable if at least one of its pieces is not accessible within a swarm. This situation arises if there are no peers in the swarm that possess a given piece or, alternatively, if the peer(s) that possess the piece are inaccessible (e.g. due to firewalls, NAT or overlay graph disconnection).
It is intuitive to consider the former as a far more likely circumstance: most BitTorrent clients implement techniques such as NAT traversal [14] and, moreover, include neighbor discovery techniques such as the Peer Exchange Protocol (PEX) and periodic tracker polling that prevent graph disconnection. Therefore, given this assumption, a file can be considered available if (i) there is at least one seeder or (ii) there is no seeder but the bitfields of the leechers collectively fit the condition U(T) = 1. Without detailed analysis, we can therefore currently state that:

• With an accessible seeder, a file is available
• Without an accessible seeder, a file may be available

This paper uses these two observations as a starting point to investigate unavailability in BitTorrent. In the following sections, we denote time periods in a torrent's lifecycle in which no seeder is online as a seedless state. Accordingly, the file is unavailable if torrent T is in a seedless state and U(T) < 1.

3.3 Measurement Methodology

To study the unavailability problem and specifically the seeders' role in it, we have performed two large-scale measurement studies using microscopic and macroscopic crawling. To the best of our knowledge, this paper is the first to combine both microscopic and macroscopic crawling techniques to better understand BitTorrent (specifically BitTorrent's file availability).

Microscopic Crawling: To truly understand unavailability in BitTorrent, it is necessary to be able to view the microscopic characteristics of any given swarm, e.g. piece distribution or nodes' download rates. Without this, one can only get a rough estimation of availability using metrics such as the number of seeders. The information regarding the behaviour of individual peers provides the necessary data to make new, more accurate findings.
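Given per-peer bitfields of the kind a microscopic crawl collects, the definitions of Sections 3.1 and 3.2 could be evaluated roughly as follows. This is a minimal sketch, not the crawler's actual code: the function names, the representation of a bitfield as a sequence of 0/1 integers, and the rule that a peer counts as a seeder when its bitfield is all ones are assumptions made for illustration.

```python
from typing import Sequence

# Assumed representation: one 0/1 entry per piece (the bitfield V_i of peer i).
Bitfield = Sequence[int]


def available_fraction(bitfields: Sequence[Bitfield], num_pieces: int) -> float:
    """U(T) from Eq. 1: fraction of pieces held by at least one observed peer."""
    if num_pieces <= 0:
        raise ValueError("num_pieces must be positive")
    covered = 0
    for j in range(num_pieces):
        # Logical OR over piece j across all peers in the torrent.
        if any(bf[j] for bf in bitfields):
            covered += 1
    return covered / num_pieces


def is_seeder(bitfield: Bitfield) -> bool:
    """Assumption: a seeder is a peer whose bitfield contains every piece."""
    return len(bitfield) > 0 and all(bitfield)


def is_unavailable(bitfields: Sequence[Bitfield], num_pieces: int) -> bool:
    """Section 3.2: unavailable if the torrent is seedless and U(T) < 1."""
    seedless = not any(is_seeder(bf) for bf in bitfields)
    return seedless and available_fraction(bitfields, num_pieces) < 1.0


if __name__ == "__main__":
    # Two leechers that jointly hold all four pieces: seedless but available.
    complementary = [[1, 1, 0, 0], [0, 0, 1, 1]]
    print(available_fraction(complementary, 4), is_unavailable(complementary, 4))
```

The sketch only sees peers whose bitfields were actually collected, so peers hidden behind NAT would lower the computed U(T).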
To gain this information we developed and deployed a distributed BitTorrent crawler that can investigate swarms on a microscopic level, using 20 nodes in the Emulab testbed [5]. The crawler operated from July 18, 2009 to July 29, 2009 (micros-1) and then again from August 19, 2009 to September 5, 2009 (micros-2). To discover all the online users in a torrent it periodically contacted the torrent's tracker and also used the Peer Exchange Protocol (PEX). Every 10 minutes, it requested the piece bitmap of every peer to discover the real-time distribution of pieces. For the micros-1 study, the crawler followed 255 torrents appearing on Mininova (footnote 1) after the first measurement hour; in these torrents, we observed 246,750 users. The micros-2 dataset contains information from 577 torrents and 531,089 users.

Macroscopic Crawling: The microscopic measurements provide detailed insight into the distribution of pieces and download rates within the swarm, as well as between different peers. However, due to scalability issues it is difficult to perform such detailed measurements on a very large scale (e.g. several thousand torrents). To complement these results we therefore also implemented a higher-level crawler that followed every torrent published on the Mininova website after December 09, 2008 for a period of 38 days. This crawler periodically requested, from multiple sites in Europe, tracker information regarding each torrent's number of seeders and leechers alongside the members' IP addresses (we were able to systematically collect 98% of all the IP addresses from within the swarms). This study allowed us to gain an extremely large number of measurements regarding details such as peer arrival patterns, seeder/leecher ratios and torrent sizes.
This information can subsequently be correlated with our smaller-scale microscopic measurements to derive such things as the scale of seedless states and the causes for seedless states occurring. Our final macroscopic dataset consisted of reports from 46,227 torrents and 29,066,139 users.

Footnote 1: The largest BitTorrent community based on Alexa ranking.

4 Characterising Unavailability: Causes and Impact

In this section, we first investigate the role that seeders play in file unavailability. Following this we study the exceptions and variations we discovered. Lastly, we investigate the real-time impact that a lack of seeders has on client performance and the subsequent reactions that can be observed.

4.1 Investigating the Role of Seeders in File Unavailability

It is intuitive to think that U(T) < 1 in a torrent without any seeder (that is, leechers are unable to reconstruct the file). However, this is, so far, an unverified assumption that must be investigated (and quantified). To ascertain this, we inspect (i) the nodes' bitfields and (ii) the nodes' download rates in all the torrents of our microscopic traces affected by seedless states.

4.1.1 Bitfield Analysis

We have collected every node's bitfield for all the torrents in our microscopic measurements as they have evolved over time. For each torrent we have computed U(T) every 10 minutes during any period in which the torrent is without any seeders (i.e. it is in a seedless state). This allows us to ascertain whether a full copy of the file exists in the torrent at any given time. Fig. 1 shows the CDF of max(U(T)) observed in the seedless state for each torrent that we studied. From this data, we can extract two pieces of information. First, in the majority of cases (86%) our hypothesis is confirmed and the contacted leechers are unable to collectively reconstruct the file once a seeder has left (i.e. max(U(T)) < 1).
Clearly, this means that seeders do have a significant impact on the availability of files in BitTorrent. Importantly, however, we also find that a notable proportion of torrents (14%) actually remain available even without a seeder. Collectively, these torrents account for 24% of all leechers that operate in seedless swarms. This is a crucial finding that has not been observed before; it is therefore in contrast with previous models [13, 8] that consider all seedless torrents to be unavailable.

Figure 1. Piece availability in torrents affected by seedless states. (x-axis: piece availability max(U(T)); y-axis: cumulative fraction of seedless states.)

4.1.2 Download Rate Analysis

A limitation of the bitfield analysis is that not all nodes are accessible due to NATs. To address this, we also inspect the aggregate torrent download rates. Through this, we can infer that a file is unavailable when the download rate of all the peers participating in a specific torrent drops close to 0 KBps. From this we can derive that the nodes cannot find any new pieces to download.

To highlight our findings, we first inspect a representative torrent from our microscopic trace (footnote 2), shown in Fig. 2. The figure shows the median instant download rate of the online leechers over time, sampled every 10 minutes. It also plots the number of seeders and leechers, as well as the number of copies of the least replicated piece. Note that when the number of seeders becomes 0, the torrent enters a seedless state.

The torrent can be observed to enter a seedless state after the middle of day 3, remaining in this state for roughly two days. When the final seed departs, the download rate of the leechers drops to approximately 0-3 KBps after only a few minutes. This also coincides with the number of copies of the least replicated piece dropping to zero.
It can therefore be confidently inferred that the file is, indeed, unavailable during this period due to the departure of the last seeder.

Interestingly, it can also be seen that the torrent becomes available again during day 5. As the seeders return, the download rate increases and the file becomes available again. In contrast to past assumptions, it is therefore evident that unavailability is not continuous. This important phenomenon will be investigated further in Section 5.3.

Footnote 2: We have observed the same behaviour in most of the torrents affected by seedless states.

Figure 2. Snapshot from a torrent in our microscopic trace. (Panels: number of seeders and leechers versus time after torrent birth in days; median download speed over all leechers in KBps versus time after torrent birth in days.)

The above analysis has inspected a representative torrent. To validate its widespread applicability we also look at the download rate degradation in all torrents. To achieve this, we have taken all the users that have been affected by a seedless state and separated their downloading time into two periods: (i) periods in which they have suffered from a seedless state and (ii) periods in which they have not. Fig. 3 presents the download rate distribution for both periods. First, we can observe that the download rate in a non-seedless state is much higher than in a seedless state. 80-85% of the nodes experience an average download rate lower than 1 KBps when in a seedless torrent, indicating that the peers cannot locate any required pieces and the file is, indeed, unavailable. Second, however, we also observe that 15-20% of users, in fact, maintain a reasonable level of performance even without any seeders.
This can be attributed to two reasons: (i) the aforementioned 14% of torrents are capable of reconstructing their file without a seeder at an average rate of 21.3 KBps; and (ii) newly joined peers can download the subset of available pieces at an effective rate. This can be observed in the representative torrent (cf. Fig. 2): between days 4 and 5 there is a peak in the number of leechers, which results in a short peak in the download rate as newcomers download the available pieces.

Figure 3. Interval download rates for nodes affected by the lack of seeders. (x-axis: download rate in KBps; y-axis: cumulative fraction of hosts; curves: seedless and non-seedless state for micros-1 and micros-2.)

4.2 Investigating the Causes of Swarm Resilience

The previous section has identified a notable percentage (14%) of torrents that can maintain availability even without any seeders; this represents 24% of all leechers that encounter seedless states.

To investigate this, we separate torrents into those that survive in the absence of seeders (resilient torrents) and those that do not (susceptible torrents). We then investigate quantitative properties of these two groups to ascertain how they differ at various points in their lifecycles. All of the identified metrics have been calculated for each group (across all member torrents) every 10 minutes using information from the microscopic traces and the tracker reports. These values have then been averaged over each time period investigated.

Table 1 gives an overview of all metrics used in this analysis. We calculate these over two time periods: the beginning of the torrents' lifecycle and just before the last seeder goes offline. Although not included in the table, we also investigated the effects of file size and content type without ascertaining any correlation.
Most metrics are straightforward; however, two require some explanation: Distribution Entropy (E(T)) and the Churn Factor (CF).

E(T) captures the distribution of pieces within the swarm; its purpose is to determine whether torrents that can survive achieve a superior distribution of pieces. We therefore characterize the distribution entropy in a torrent T at a given time t by introducing the following Entropy Index:

$$ E(T) = \frac{\left( \sum_{j=1}^{P} \sum_{i=1}^{N} V_{ij} \right)^{2}}{P \cdot \sum_{j=1}^{P} \left( \sum_{i=1}^{N} V_{ij} \right)^{2}} \qquad (2) $$

Recall that N defines the number of nodes in the swarm, P is the number of pieces a file is composed of and $V_i$ is the bitfield of node i. This index is similar to Jain's Fairness Index [11] and achieves a value of 1 if all pieces are equally distributed among the peers.

The Churn Factor CF captures whether torrents that can survive have more stable populations. This factor is defined by $N_{disc} / N_{all}$, where $N_{disc}$ is the number of users that have left the swarm during a given time period (t) and $N_{all}$ is the total number of users observed during this same period. A factor of 0 indicates that no user disconnected within t; by default t = 10 mins.

Table 1. Characteristics of resilient torrents (those that maintain availability in seedless state) and susceptible torrents (those that cannot reconstruct the file). Res. = resilient, Sus. = susceptible.

Metric                                Time before seedless state      Time after torrent's birth
                                      1 hour          6 hours         6 hours         24 hours
                                      Res.    Sus.    Res.    Sus.    Res.    Sus.    Res.    Sus.
Swarm speed (in KBps)                 58.83   23.88   68.58   24.99   95.72   53.50   62.99   44.82
Seeder/Leecher ratio                  0.15    0.03    0.14    0.04    0.19    0.19    0.32    0.43
Firewalled/NATed peers (in %)         70.86   61.09   62.30   60.67   51.30   56.05   54.94   58.44
Distribution Entropy E(T)             0.93    0.94    0.92    0.93    0.91    0.90    0.92    0.91
Least replicated piece (# of copies)  9.21    1.61    8.38    2.82    15.34   14.21   21.33   23.55
Churn factor CF                       0.03    0.21    0.08    0.15    0.07    0.11    0.08    0.07
Online leechers                       281.34  111.61  250.48  101.55  134.05  82.71   163.12  89.55
Online seeders                        5.28    1.51    6.11    1.80    12.05   11.25   18.17   21.34

From the data in Table 1, we can make the following important observations:

• Torrent Popularity: From the beginning, resilient torrents exhibit higher leecher population sizes. Larger torrents possess an increased probability of replicating rare pieces before the loss of seeders.
• Low Churn Factor: High churn in small torrents creates a greater risk of losing vital pieces; if this coincides with the loss of a seeder then it becomes impossible to recover these pieces again until a seeder returns. Resilient torrents have significantly lower churn factors than susceptible torrents.
• Seeder/Leecher Ratio: Resilient torrents exhibit a higher seeder/leecher ratio and, as a consequence, experience download rates that are over twice as high as those of susceptible torrents. This superior performance is highly beneficial for the survival of piece replicas as it allows the quick duplication of rare pieces. Before a seedless state occurs, resilient torrents therefore have many more replicas of the rarest piece when compared to susceptible torrents.

In summary, these results show that swarm resilience is a product of large, stable populations that can achieve higher download rates due to beneficial seeder/leecher ratios. The combination of these factors results in rarest-piece replication rates that are over 5 times greater than those of their susceptible counterparts. This makes such swarms highly resilient to the loss of any seeders. Importantly, it can also be concluded that unavailability cannot be addressed by modifying any of BitTorrent's algorithms (e.g. piece selection) but, instead, must be solved by incentivising users to modify their behaviour. This is exemplified by the lack of any correlation between resilience and distribution entropy.
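The two less obvious metrics of this section, the Entropy Index and the Churn Factor, can be illustrated with a short sketch. It assumes that Eq. 2 is read in the form of Jain's Fairness Index applied to the per-piece replica counts (the number of peers holding each piece), which matches the stated property that the index equals 1 when all pieces are equally replicated. The function names and the use of peer-identifier sets for the churn computation are hypothetical choices, not taken from the measurement tooling.

```python
from typing import Sequence, Set


def entropy_index(bitfields: Sequence[Sequence[int]], num_pieces: int) -> float:
    """E(T): Jain-style fairness of the per-piece replica counts (assumed form)."""
    # counts[j] = number of peers that hold piece j
    counts = [sum(bf[j] for bf in bitfields) for j in range(num_pieces)]
    total = sum(counts)
    sum_of_squares = sum(c * c for c in counts)
    if sum_of_squares == 0:
        raise ValueError("no piece is held by any peer")
    return (total * total) / (num_pieces * sum_of_squares)


def churn_factor(seen: Set[str], departed: Set[str]) -> float:
    """CF = N_disc / N_all for one observation period (10 minutes by default)."""
    if not seen:
        raise ValueError("no users observed in this period")
    # Only count departures of users that were actually observed in the period.
    return len(departed & seen) / len(seen)


if __name__ == "__main__":
    even = [[1, 1], [1, 1]]      # both pieces replicated twice
    skewed = [[1, 0], [1, 0]]    # piece 0 replicated twice, piece 1 missing
    print(entropy_index(even, 2), entropy_index(skewed, 2))
    print(churn_factor({"a", "b", "c", "d"}, {"a"}))
```

Under this reading the index depends only on how evenly replicas are spread across pieces, not on how many replicas exist, which is one way to see why it can stay near 0.9 for both groups in Table 1 while the least-replicated-piece counts differ widely.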
4.3 Effects and Trends of Seed Departure

Despite a notable percentage of torrents surviving without seeders, it is evident that the loss of all seeders can often result in unavailability. This section now investigates the effects that this has on both individual users and wider system performance. Three stages can be identified, which we now discuss.

The first repercussion of the loss of seeders in susceptible torrents is a significant and rapid drop in download rates. To extend the earlier analysis, we now compare the average download rate of users that suffer from a seedless state at some point during their download against the average download rate of users that always find content available. We first categorise users into two groups: affected vs. non-affected. The first group of users consists of leechers that are (at some point) affected by a lack of seeders. The non-affected users, on the other hand, have at least one seeder available during their entire download. Fig. 4 gives the download rate distribution of both user groups as obtained from the two microscopic crawls. Whereas the median download rate for the non-affected users is 36 KBps in micros-1 and 48 KBps in micros-2, the performance for peers attempting to download unavailable content is only 0.06 KBps and 3.8 KBps, respectively.

Figure 4. Comparison of download performance between peers affected and not affected by seedless states. (x-axis: download rate in KBps; y-axis: cumulative fraction of hosts; curves: affected and non-affected for micros-1 and micros-2.)

The second observable stage is a direct consequence of the decrease in download performance. Specifically, we observe a large increase in download abortions. To study this we examine the session times in our microscopic traces. We observe that 89% of users affected by file unavailability (i.e.
participating in susceptible torrents) abort their downloads due to the bad performance. Sadly, this is an unnecessary action as we have found that seeders often return, making files available again. In contrast to these results, users operating in resilient torrents only have an abortion rate of 34.47%. Although this seems initially high, we also find that many users operating in other torrents that do not suffer from unavailability also abort their downloads. On closer inspection, these 'unnecessary' abortions occur in torrents that have particularly low download rates that are under a third of the average.

The third stage in this process is the worrying emergence of a chain reaction. We find that as the number of abortions increases, the number of available chunks decreases. This results in an exacerbation of the torrent's unavailability and a further drop in download rates for those trying to access the remaining chunks. As other users witness this trend, they too abort their downloads. This process results in fewer users becoming seeders and therefore greater unavailability and more abortions. Frequently, the two repercussions described above, together with this chain reaction, spell the end for a torrent.

From these findings we derive that users are highly sensitive to their perceived instant quality of service and therefore any solutions must maintain an acceptable download rate whilst also improving file availability.

5 Characterising Seedless States

The previous section has validated and quantified the importance of seeders in regard to file availability in BitTorrent and discussed under which circumstances a file of a seedless torrent becomes unavailable. It has been found that in the majority of torrents (86%), the loss of all seeders results in unavailability.
In this section we therefore investigate the behaviour of seeders and characterise the nature of seedless states using our large-scale dataset. We first look at the frequency of seedless states in BitTorrent. Following this we investigate the causes of seedless states before, finally, investigating why torrents can become revived again after extended periods of unavailability.

5.1 How Prevalent are Seedless States?

To quantify how prevalent seedless states are in BitTorrent, we ask the following question: how many torrents are affected by seedless states, and to what extent? To answer this, we use the logs from our macroscopic trace that give us a large-scale view of the system comprising 46K torrents.

The measurements show that more than 38% of torrents (17,568 out of 46,227) lose their seeders within the first month, out of which 72% lack seeders after only 5 days. Similarly, we find that more than 45% of the torrents suffer from a lack of seeders for half of their monitoring time. To exemplify the scale of this, in 50% of the torrents observed for periods longer than 30 days, no seeder was available for more than 16 days.

Finally, in our study, more than 9.68 million users (33% of all users seen) participated in torrents with highly unavailable seeders, suggesting that this is not only a long-tail problem. Out of these users, more than 1.59 million were directly affected by seedless states.

Figure 5. Illustration of a seedless state. (Timeline of users 1, ..., n, n+1 over time instants t0 to t5, marking the availability period and the unavailability period.)

5.2 Why do Seedless States Occur?

Since seedless states are highly prevalent in real swarms, an intuitive question is: why do they occur in the wild? In this section, we first identify and then further investigate the influencing factors responsible for triggering seedless states.
5.2.1 Identifying Influencing Factors

There are two main factors that directly influence the existence of seedless states: (i) the session time of seeders and (ii) the inter-arrival rate of the users. To illustrate the influencing factors, we use a simple example shown in Fig. 5. In this figure, each horizontal line represents the lifetime of a user; these users can either be in a leecher state (thin lines) or a seeding state (thick lines).

It seems straightforward that the longer a seeder serves content, the more leechers are able to finish their downloads. Unfortunately (as demonstrated later on), the seeding time is typically quite short, contributing significantly to the frequency and length of seedless states.

Let us now assume that user $n$ is the last available seeder in our example torrent and none of the previous seeders return to the torrent. In this case, a seedless state occurs when the time required for leechers to download the file exceeds the online time of the last seeder. For example, Fig. 5 shows that after the last available seeder leaves the swarm at time $t_3$, none of the remaining leechers were able to finish the download. If we focus on the $n$-th node and its subsequent successor in the torrent (the $(n+1)$-th), the inter-arrival time between both users is given by $\tau_{n+1}$ $(= t_2 - t_1)$ whereas the seeding time of node $n$ is given by $\mu_n$. Assume that both users $n$ and $n+1$ download a file of size $F_s$ with rate $D_n$ and $D_{n+1}$ respectively. Thus, the swarm enters a seedless state when Eq. 3 is fulfilled.

\[ \frac{F_s}{D_n} + \mu_n < \tau_{n+1} + \frac{F_s}{D_{n+1}} \qquad (3) \]

To simplify the analysis, we assume that $D_n = D_{n+1}$ (see footnote 3). In this case, the two download-time terms cancel and the seedless state is reached if the inter-arrival time is larger than the seeding time, i.e. $\mu_n < \tau_{n+1}$.

To summarise, seeding times as well as inter-arrival times play an important role in the generation of seedless states and subsequently in the long-term availability of content.
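The condition in Eq. 3 can be checked numerically. The following is a minimal Python sketch under stated assumptions: the function name, argument names and units (MB, MB per hour, hours) are illustrative and do not come from the measurement tooling. Both sides of the inequality are measured from the arrival time of user n.

```python
def enters_seedless_state(file_size, rate_n, rate_next, seeding_time, inter_arrival):
    """Return True if Eq. 3 holds for users n and n+1.

    Hypothetical units: file_size in MB, rates in MB/h, times in hours.
    Any consistent set of units works.
    """
    if rate_n <= 0 or rate_next <= 0:
        raise ValueError("download rates must be positive")
    # Left-hand side: user n downloads, then seeds: F_s / D_n + mu_n
    departure_of_n = file_size / rate_n + seeding_time
    # Right-hand side: user n+1 arrives, then downloads: tau_{n+1} + F_s / D_{n+1}
    completion_of_next = inter_arrival + file_size / rate_next
    return departure_of_n < completion_of_next


# Equal rates: the download terms cancel, so only mu_n < tau_{n+1} matters.
print(enters_seedless_state(700, 350, 350, seeding_time=4, inter_arrival=10))  # True
print(enters_seedless_state(700, 350, 350, seeding_time=4, inter_arrival=3))   # False
```

With equal rates the outcome depends only on whether the seeding time is shorter than the inter-arrival time, which matches the simplification made in the text. A slower successor (smaller `rate_next`) makes a seedless state more likely for the same seeding time.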
Since both parameters are not directly correlated, we analyse each of them individually in the following.

5.2.2 Arrival Behaviour of Users

The first behavioural characteristic that is paramount to seedless state generation is the inter-arrival time of users. In this regard, intuitive questions are: (i) what inter-arrival times do we expect in reality and (ii) how do inter-arrival times evolve over time?

By analysing a few hundred torrents in a small community, previous work [7] has shown that user inter-arrival times are exponentially increasing. Our goal is to generalize this finding for 'open' communities such as Mininova.org that are orders of magnitude larger. For our analysis, we use similar techniques as applied in [7]. We consider all torrents in our macroscopic trace. We use linear regression to fit the logarithm of the complementary number of node arrivals (see footnote 4) of each torrent over time. Let $X_t$ denote the complementary number of node arrivals at time epoch $t$ and $Y_t$ be the fitting result. We define the relative deviation of the actual node arrivals from an ideally exponentially increasing function by $\frac{\log X_t - \log Y_t}{\log X_t}$. Thus, a relative deviation of 0% indicates that both curves overlap.

Footnote 3: Our microscopic measurements show that the download rate of users that finish downloads ($D_n$ in the example) is higher than the download rate of those that do not ($D_{n+1}$), validating our assumption.

Footnote 4: We use the complementary number of node arrivals to avoid domains in which the logarithm is undefined, e.g., epochs with no peer arrivals.

Fig. 6 shows the deviation for each torrent of our macroscopic trace. The x-axis depicts the torrents ordered by ascending population size while the y-axis shows the relative deviation. For most of the torrents, the relative deviation is less than 10% whereas
the deviation tends to decrease with increasing torrent popularity. Altogether, the average relative deviation of all torrents is 4.8%. Therefore, we conclude that the inter-arrival time of the nodes exponentially increases with time.

[Figure 6. Deviation from linear regression. x-axis: torrents sorted by ascending population; y-axis: relative deviation.]

Notably, we observed especially high inter-arrival times in torrents affected by seedless states; this is in line with our analysis in the previous section. For instance, Fig. 7 plots the maximum inter-arrival time observed in these torrents with unavailable seeders. More than 45% of the torrents exhibit inter-arrival times far beyond 10 hours.

[Figure 7. Maximum inter-arrival times of torrents with highly unavailable seeders. x-axis: torrents affected by seedless states; y-axis: max. inter-arrival time (in h), log scale.]

5.2.3 Seeding Times of Users

The second behavioural characteristic that is paramount to the creation of seedless states is the seeding time of a node, i.e. how long seeds stay online. As already shown in our example torrent (cf. Fig. 5), to maintain file availability it is necessary for seeders to remain online long enough for new seeds to be generated. Fig. 8 shows the cumulative distribution of the seeding times of the nodes obtained from the two microscopic measurements. It can be seen that seeding times are generally short, with 75% of the seeders staying online for less than 4 hours. When this data is compared to the inter-arrival time of users, it becomes clear that the current seeding times in BitTorrent are not sufficient to avoid seedless states, thus preventing long-term file availability in BitTorrent.

5.3 How long are Seedless States?
The representative snapshot presented in Fig. 2 has highlighted that torrents can become available again after extended periods of unavailability. In this section we validate (using both the macroscopic and microscopic datasets) that file unavailability is, in fact, discontinuous with reoccurring periods of temporary availability. Through our measurement studies we can state that this occurs because seeders often return to swarms that they have previously participated in. This allows the 11% of users that choose to remain online during periods of unavailability (i.e. in susceptible torrents) to eventually complete their downloads. Alongside the existence of resilient torrents, this means that 23.5% of all leechers affected by seedless states can actually still gain access to the file.

[Figure 8. Seeding time distribution. x-axis: session seeding time (in hours); y-axis: cumulative fraction of hosts; traces: micros-1 and micros-2.]

The reoccurrence of seeders happens in over 64% of torrents that suffer from seedless states in our macroscopic study. To investigate this, Fig. 9 shows the CDFs of both the duration of seedless states and the duration of the subsequent periods in which content becomes available again, computed over all torrents exhibiting this phenomenon. Note that the x-axis is in log scale. It can be observed that seedless periods are typically long-lasting with an average of 43.19 hours whereas the subsequent availability periods only last 12.56 hours on average.

The primary reason for the (seemingly) altruistic return of seeders is likely to be the default settings of many BitTorrent clients (e.g. Vuze, µTorrent) that automatically rejoin torrents at their start-up even after a user has completed a file download.
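The two duration distributions discussed for Fig. 9 can be derived from a time series of seeder counts per torrent. The sketch below is a hypothetical illustration, not the procedure used in the study: it assumes periodic samples of `(time_in_hours, seeder_count)`, it treats "available" as "at least one seeder present" (which ignores resilient torrents that stay available without seeders), and it drops the initial seeded period and any period still open at the end of the trace.

```python
def period_durations(samples):
    """Split a seeder-count time series into seedless and re-availability periods.

    samples: list of (time_in_hours, seeder_count), sorted by time (assumed format).
    Returns (seedless_durations, available_durations) in hours.
    """
    seedless, available = [], []
    if not samples:
        return seedless, available
    seen_seedless = False
    start_time, has_seeder = samples[0][0], samples[0][1] > 0
    for time, count in samples[1:]:
        now_has_seeder = count > 0
        if now_has_seeder == has_seeder:
            continue  # still in the same period
        duration = time - start_time
        if not has_seeder:
            seedless.append(duration)
            seen_seedless = True
        elif seen_seedless:
            # Only count availability periods that follow a seedless state.
            available.append(duration)
        start_time, has_seeder = time, now_has_seeder
    # The period still open at the last sample is right-censored and not reported.
    return seedless, available


trace = [(0, 3), (10, 0), (50, 1), (62, 0), (100, 0)]
print(period_durations(trace))  # ([40], [12])
```

Durations obtained this way are bounded by the sampling interval, so short seedless gaps between two samples would go unnoticed.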
Unfortunately, BitTorrent users do not have permanent identifiers and thus we cannot make quantitative statements on exactly how many unique seeders rejoin a swarm and over what time period. However, the length of the seedless periods as depicted in Fig. 9 offers a conservative bound for the inter-seeding time distribution of such users. This is obviously very coarse grained and therefore we expect the inter-seeding times actually to be higher.

[Figure 9. Length of cycles of temporary availability and unavailability periods. x-axis: duration (in hours), log scale; y-axis: cumulative fraction of periods; curves: temp. availability and seedless state.]

The reoccurrence of seeders is obviously in contrast to previous work that has assumed unavailability is continuous and immutable. To investigate the impact that this finding has on previous work, we briefly look at the relative deviation that the assumption has when compared against our dataset. We define the relative deviation as

\[ \text{relative deviation} = \frac{\text{measured avail. time} - \text{assumed avail. time}}{\text{assumed avail. time}} \qquad (4) \]

We find that in approximately 35% of the torrents in our macroscopic dataset, the assumption works well. In these torrents, we did not observe any temporary availability period after the torrent first enters a seedless state. However, in 50% of the torrents, the content is actually available for at least twice as long as assumed when considering immutable unavailability.

6 Improving File Availability

The previous sections have outlined the file availability problem and highlighted the significant impact that seeders have on this. We therefore deduce that a solution must find some way to encourage users to provide content even after they have obtained it themselves, i.e. to prevent seeders from leaving torrents. To do this we highlight two possible approaches: single-torrent and cross-torrent mechanisms.
The principles of these two approaches are first abstractly outlined to show how each might improve seeding times. Following this, the two approaches are evaluated considering the key findings described in the previous sections (e.g. reoccurrence of seeders). For this purpose, we run trace-based simulations using the workload from our large-scale dataset.

6.1 Potential Solution Approaches for Extending Seeding Times

This section briefly outlines the two generic approaches that can be taken for improving seeding in BitTorrent. The first uses traditional single-torrent principles whilst the second exploits the concept of cross-torrent collaboration (originally outlined in [7]). Note that we do not offer concrete implementation details; instead, we provide a brief outline of the principles behind each mechanism.

6.1.1 Single-Torrent Solution

A single-torrent solution involves incentivising users to remain within a torrent to seed, based on certain properties related to that individual torrent. As yet we do not know of any successful mechanisms to achieve this, due to the difficulty of enforcing incentives once a peer has already obtained the file it desires. We therefore consider a simple framework of encrypted pieces that may work. Such a solution would involve encrypting the file before it is distributed within the swarm. The tracker would be responsible for managing this encryption and, as such, would be the source of the keys. Subsequently, once a peer has downloaded the file it would be required to remain seeding for a length of time determined by the tracker before the encryption keys are released to it.

6.1.2 Cross-Torrent Solution

A cross-torrent solution involves incentivising users to cooperate with the system as opposed to individual torrents. This approach is motivated by observations from our macroscopic trace that show 51% of the users join multiple torrents (4.98 on average).
We have further found that seeders frequently rejoin swarms after they have left, therefore providing conclusive evidence that the same peers rejoin the BitTorrent system multiple times whilst still possessing their previously downloaded files. To highlight the principles of a cross-torrent solution, imagine a user who joins torrent X at some point in time and completes the download; this user may very well join another torrent Y at a later point in time. When the node comes online again to download torrent Y it could then theoretically persist as a replica for torrent X.

The incentives behind this could be managed in a number of ways (e.g. [17]). The following example highlights how the system could work using persistent contribution histories. Through this approach, the system would maintain a history of the contributions made by each user (agnostic to which torrent the contribution is made). Subsequently, peers would show preference to piece requests from users with higher contribution ratios. This would therefore replace BitTorrent's current rate-based tit-for-tat mechanism so that incentives were based on the entire system as opposed to individual users and torrents.

6.2 Experimental Methodology

To evaluate the two possible solution approaches, the BitTorrent simulator of Bharambe et al. [2] is used and extended to enable the simulation of multiple torrents existing in parallel.

6.2.1 Evaluative Aims

We do not aim to perform an implementation-level comparison between vanilla BitTorrent and the proposed approaches, e.g., regarding protocol overhead and technical aspects to realize either approach. This is out of the scope of this paper. The goal of our evaluation is to shed light on the feasibility and potential of the two approaches based on the newly discovered observations from our studies.
For both approaches we wish to discover (i) whether the approach increases file availability in torrents with ordinarily unavailable seeders, and (ii) what the implications of this are in regard to download performance. We aim to investigate these factors on both a per-torrent and system-wide basis to explore how the effects of the approaches impact both perspectives on BitTorrent.

6.2.2 Input to the experiments

Selecting the Torrents: Our trace data encompasses tens of thousands of torrents over a period of several weeks, far more than the simulator is able to handle. Hence, we chose a random subset of 100 torrents from the set of torrents affected by seedless states, with file sizes varying between 3 and 1500 MB and a per-torrent monitoring period of at least four weeks (see footnote 5). The logs of these torrents contain data of more than 235,000 downloads.

User behavior: To model the access pattern of torrents, we do not use any artificial peer arrival function. Instead, we bring up new peers as well as reoccurring seeders according to the trace logs. To model the number of swarms that a peer joins we calculate the probability distribution over our entire data set. Any user that cannot download the file within 36 hours aborts the download (see footnote 6). Finally, after finishing their downloads, users stay online as seeders based on the measurements from our microscopic crawlings (cf. Fig. 8).

Speed distributions: To have a representative bandwidth distribution, we first associate each IP address with a country, using a freely available geolocation database [12]. Based on the country of origin, the Ookla database [16] provides us with the median down/uplink capacity of each user (see footnote 7).

Failures in contribution histories: To represent information inconsistencies in the distribution of contribution histories (e.g.
due to churn), when encountering a new user in the cross-torrent approach, the contribution history is only known with a probability of 0.9. This represents a worst-case scenario, as the literature has reported a superior accuracy of 0.96 [17].

Footnote 5: We have also experimented with larger and smaller numbers of torrents. Due to space constraints, we opt for presenting only a representative sample.

Footnote 6: We find through simulations that 36 hours is enough time to get a download success ratio over 99% in the presence of seeders for all access links and file sizes used in our experiments.

Footnote 7: We have also experimented with other datasets [9, 4] and obtained similar results.

6.2.3 Performance metrics

We utilise two performance metrics to evaluate the effectiveness of the approaches. The first is the average downloading rate of successful users ($D$) and the second is the fraction of download abortions ($F$). Both metrics are calculated on a per-torrent ($D_T$, $F_T$) and system-wide basis ($D_S$, $F_S$).

6.3 Comparative Results

Table 2 gives an overview of the results, including the three single-torrent variants in which the measured seeding times are lengthened by a factor of either 2, 5, or 10. For comparability reasons, we assumed for the cross-torrent approach that users remain online after downloading for as long as they do in vanilla BitTorrent. The results presented in this table summarize the fraction of download abortions ($F_S$) and the average downloading rate ($D_S$) on a system-wide level.

Table 2. Overview of system-level results.

Protocol                  Avg. seeding time (in hours)   D_S (in KBps)   F_S (in %)
BT: Vanilla               3.44                           137.84          20.25
ST: 2x seeding            6.88                           158.11          13.65
ST: 5x seeding            17.20                          179.41          4.39
ST: 10x seeding           34.40                          190.81          0.66
CT: Persistent history    3.44                           138.75          0.13

Some points worth noting:

• In the chosen set of torrents, 20% of downloads were not successful in vanilla BitTorrent.

• To maintain persistent file availability in the single-torrent approach, e.g.
to ensure a system-wide success rate for downloads > 99%, the users must stay on average 10 times longer after downloading. This induces an average seeding time of more than 34 hours.

• The cross-torrent approach achieves a similar downloading failure ratio to the single-torrent variant that lengthens the seeding times by a factor of 10. However, this is achieved without having to increase the seeding times beyond those currently observed in BitTorrent. With regard to download rates, the cross-torrent approach also performs similarly to vanilla BitTorrent.

In addition to this table, Fig. 10 and Fig. 11 separately plot the fraction of aborted downloads and the average downloading rate, respectively, on a per-torrent basis. It can be observed that, in a few torrents, even a 10-fold increase in seeding times only has a marginal effect on reducing the abortion ratio. In contrast, the cross-torrent approach clearly benefits from the available file replicas, as the abortion ratio never exceeds the 2% threshold.

[Figure 10. Overview of the per-torrent abortion ratios. Two panels: BT: Vanilla vs. ST: 10x seeding, and BT: Vanilla vs. CT: Persistent history. x-axis: torrents; y-axis: abortion ratio $F_T$.]

[Figure 11. Average download rates on a per-torrent basis. Curves: BT: Vanilla and CT: Persistent history. x-axis: torrents; y-axis: download rate $D_T$ (in KBps).]

When examining the download performance on a per-torrent basis in Fig. 11, it can be observed that users applying the cross-torrent solution actually increase their download rate when compared to vanilla BitTorrent. For instance, the average downloading rate ($\bar{D}_T$) over all torrents is 102.26 KBps for vanilla BitTorrent and 121.65 KBps for the cross-torrent variant.
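The per-torrent mean rate reported here and the system-wide rate in Table 2 aggregate the same downloads differently, which is one plausible reason the two figures differ. The following Python sketch illustrates the two aggregation levels of the metrics from Section 6.2.3. The record format `(torrent_id, finished, rate_kbps)` and the function names are assumptions made for illustration, and pooling all downloads for the system-wide values is our reading of the metric definitions, not a confirmed detail of the simulator.

```python
from collections import defaultdict


def per_torrent_metrics(downloads):
    """Return {torrent_id: (D_T, F_T)} from (torrent_id, finished, rate_kbps) records."""
    rates = defaultdict(list)
    aborted = defaultdict(int)
    total = defaultdict(int)
    for torrent, finished, rate in downloads:
        total[torrent] += 1
        if finished:
            rates[torrent].append(rate)
        else:
            aborted[torrent] += 1
    metrics = {}
    for torrent in total:
        finished_rates = rates[torrent]
        # D_T is undefined (None) if nobody finished in this torrent.
        d_t = sum(finished_rates) / len(finished_rates) if finished_rates else None
        metrics[torrent] = (d_t, aborted[torrent] / total[torrent])
    return metrics


def system_metrics(downloads):
    """Return (D_S, F_S), pooling all downloads regardless of torrent."""
    downloads = list(downloads)
    if not downloads:
        return None, None
    finished_rates = [rate for _, finished, rate in downloads if finished]
    d_s = sum(finished_rates) / len(finished_rates) if finished_rates else None
    f_s = sum(1 for _, finished, _ in downloads if not finished) / len(downloads)
    return d_s, f_s


# Hypothetical records: a popular fast torrent "A" and a small slow torrent "B".
records = [("A", True, 200.0), ("A", True, 200.0), ("A", True, 200.0),
           ("A", False, 0.0), ("B", True, 50.0), ("B", False, 0.0)]
print(per_torrent_metrics(records))  # {'A': (200.0, 0.25), 'B': (50.0, 0.5)}
print(system_metrics(records))
```

In this toy input the mean of the per-torrent rates is 125 KBps while the pooled rate is 162.5 KBps, because the pooled value weights the popular torrent by its larger number of finished downloads.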
As this is an average, it does not highlight the disparities between different torrents' performance based on popularity. Whilst not shown in the plots, the cross-torrent approach reallocates upload capacity from particularly popular torrents to other torrents. Therefore, 43.55% of the users that finish in vanilla BitTorrent gain a performance increase of 88% when applying cross-torrent collaboration. However, the remaining 56.44% of these users suffer from an average performance decrease of 22%.

6.4 Summary

To conclude, even when considering the seeding discontinuity of users, the single-torrent approach emerges as highly impracticable: to ensure a system-wide file availability of > 99%, average seeding times of more than 34 hours are required. In contrast, the cross-torrent approach using persistent histories achieves this level of file availability easily. It therefore allows the 20% of users that could not complete their downloads in vanilla BitTorrent to download the file successfully. The download performance of the cross-torrent approach is, on a system level, equivalent to vanilla BitTorrent, and even improves the average download rate on a per-torrent basis. Although this finding is at first glance surprising, the cross-torrent approach benefits from the significant increase in nodes (> 20%) which now find content available. This, in turn, allows users that previously could not access the content to download at high rates; this compensates for the inherent performance disadvantages of peer selection policies that are optimised for fairness [6].

However, it must also be noted that the increase of file availability due to cross-torrent collaboration is achieved by a notable trade-off. That is, the download rate of more than half of the users that finish downloads with vanilla BitTorrent degrades by 22% on average. This can be contrasted with an 88% improvement for the remainder of peers.
7 Conclusions

This paper has investigated BitTorrent's unavailability problem in the wild and explored the feasibility of the potential solution space. To achieve this, two large-scale measurement studies were performed to ascertain the characteristics, causes and repercussions of file unavailability in BitTorrent. Based on this, we made a number of interesting findings that offer the most accurate study of file availability in BitTorrent so far. Most notably, it was found that (i) a lack of seeders often, but not always, results in unavailability, (ii) the churn level, the fast replication of rare chunks and the population size largely define a swarm's ability to survive without a seeder, (iii) unavailability usually occurs in cyclic periods with intermittent availability, and (iv) unavailability often results in a chain effect that leads to future download failures.

Due to these new findings, the solution space was also investigated to see how they affect both single-torrent and cross-torrent solutions. It was found that the continuance of BitTorrent's single-torrent mechanisms can only address the problem with a 10-fold increase in seeding times. In contrast, great potential has been found in using the cross-torrent approach, which maintains current performance levels whilst also achieving over 99% availability.

References

[1] A. Bellissimo, B. N. Levine, and P. Shenoy. Exploring the use of bittorrent as the basis for a large trace repository. Technical report, UMASS Amherst, 2004.
[2] A. R. Bharambe, C. Herley, and V. N. Padmanabhan. Analyzing and improving a bittorrent network's performance mechanisms. In INFOCOM, pages 1–12. IEEE, 2006.
[3] B. Cohen. Incentives build robustness in bittorrent. In 1st Workshop on Economics of Peer-to-Peer Systems, 2003.
[4] M. Dischinger, A. Haeberlen, K. P. Gummadi, and S. Saroiu. Characterizing Residential Broadband Networks. In IMC. ACM Press, 2007.
[5] Emulab – Network Emulation Testbed. https://www.emulab.net.
[6] B. Fan, D. M. Chiu, and J. Lui. The delicate tradeoffs in bittorrent-like file sharing protocol design. In ICNP, pages 239–248. IEEE Computer Society, 2006.
[7] L. Guo, S. Chen, Z. Xiao, E. Tan, X. Ding, and X. Zhang. Measurements, analysis, and modeling of bittorrent-like systems. In IMC. ACM Press, 2005.
[8] L. Guo, S. Chen, Z. Xiao, E. Tan, X. Ding, and X. Zhang. Measurements, analysis, and modeling of bittorrent-like systems. IEEE Journal on Selected Areas in Communications, 25, Issue: 1, January 2007.
[9] iPlane Project. http://iplane.cs.washington.edu.
[10] M. Izal, G. Urvoy-Keller, E. Biersack, P. A. Felber, A. A. Hamra, and L. Garces-Erice. Dissecting bittorrent: Five months in torrent's lifetime. In Passive and Active Measurements, pages 1–11. Springer LNCS, 2004.
[11] R. K. Jain, D.-M. W. Chiu, and W. R. Hawe. A quantitative measure of fairness and discrimination for resource allocation in shared computer systems. Technical report, Digital Equipment Corporation, September 1984.
[12] Maxmind, Free Geolite Database. http://www.maxmind.com/app/geolitecountry.
[13] D. S. Menasche, A. A. A. Rocha, B. Li, D. Towsley, and A. Venkataramani. Content availability and bundling in swarming systems. In CoNEXT, 2009.
[14] J. J.-D. Mol, J. A. Pouwelse, D. H. J. Epema, and H. J. Sips. Free-riding, fairness, and firewalls in p2p file-sharing. In Peer-to-Peer Computing, pages 301–310. IEEE Computer Society, 2008.
[15] G. Neglia, G. Reina, H. Zhang, D. Towsley, A. Venkataramani, and J. Danaher. Availability in bittorrent systems. In INFOCOM, pages 2216–2224. IEEE, 2007.
[16] Ookla's Speedtest Throughput Measures.
[17] M. Piatek, T. Isdal, A. Krishnamurthy, and T. Anderson. One hop reputations for peer to peer file sharing workloads. In NSDI, pages 1–14. USENIX Association, 2008.
[18] J. A. Pouwelse, P. Garbacki, D. H. J. Epema, and H. J. Sips.
The bittorrent p2p file-sharing system: Measurements and analysis. In IPTPS, 2005.
[19] G. Siganos, J. Pujol, and P. Rodriguez. Monitoring the bittorrent monitors: A bird's eye view. In PAM, pages 175–184, 2009.
[20] L. G. Yan Yang, Alix L.H. Chow. Multi-torrent: a performance study. In MASCOTS, September 2008.
[21] C. Zhang, P. Dhungel, D. Wu, and K. W. Ross. Unraveling the bittorrent ecosystem, submitted. In Unpublished, 2009.
