A Reliable Replication Strategy for VoD System using Markov Chain

R. Ashok Kumar, Research Scholar, School of Computing Sciences, VIT University, Vellore 632 014, Tamil Nadu, India.
K. GANESAN, Senior Professor, School of Computing Sciences, VIT University, Vellore 632 014, Tamil Nadu, India.

Abstract - In this paper we investigate the reliability of streams in a VoD system. The objective is to maximize the availability of streams for the peers in the VoD system, which we achieve by replicating data in the peers. We propose a new data replication technique that optimally stores the videos in the peers. The new technique generates more replicas than existing techniques such as random, minimum request and maximize hit. We also apply a CTMC model to study the reliability of the replicas during peer failures. Our results show that the mean lifetime of the replicas is longer under various circumstances. We address the practical issues of efficient utilization of the overall bandwidth and buffer in the VoD system, and we achieve a higher probability of successful playback than the existing techniques.

Keywords - Reliability; Availability; Video on Demand; Peer; Proxy Server; Replication; Bandwidth; Buffer.

I. INTRODUCTION

VoD (Video on Demand) is one of the most popular services on the Internet. Applications of the VoD service include digitally transmitted movies, live streaming videos and distance learning. VoD systems are based on a client-server architecture. The video server stores the video objects, their titles, their popularity and the Quality of Service (QoS) parameters for streaming. In this architecture, a client requests a video from the server and the server transmits the video streams to the client for playback. The load on the server increases as the number of client requests increases.
To balance the load on the server, multiple servers are added to the existing VoD system. These systems are popularly known as parallel [3] and distributed [4] VoD servers. Each of these servers serves only a group of users rather than all users in the VoD system. The major drawback of these systems is the cost of upgrading the video server, which requires high-end servers with additional bandwidth and long-term working memory capacity. The following approaches are used to reduce the cost of upgrading video servers: a) Content Delivery Network (CDN), b) proxy based system and c) peer to peer system.

In the CDN approach [2], videos for nearby clients are cached and stored in the Points of Presence (PoP). This approach reduces the load on the server and the number of network hops, but it is not cost effective because of the additional hardware and software required at the PoP to cache the videos. In the proxy based approach [5][6], videos are cached in the proxy server, which is cost effective but suffers from a scalability problem. In the peer to peer approach [7][8], video streaming relies on an application protocol, and the peers assist the video servers by distributing the streams over the network. In this architecture the bandwidth is shared among the peers, which reduces the load on the server and on the network bandwidth. However, not all the peers in the network are involved in streaming the videos, which creates a fairness problem among the peers [9].

To further reduce the load on the VoD server, a client side caching scheme known as earthworm was proposed [10]. In this scheme the client not only plays the video but also forwards the streams to another client with adequate buffer and delay, which is known as basic chaining. This scheme was further extended to forward, backward, adaptive and optimal chaining, which exploit client resources such as buffer and uplink bandwidth [10][11][12][13].
However, demand for high quality and longer duration videos is expected in the near future [14]. For such applications the existing chaining schemes fall short of the scalability requirements for bandwidth and buffer. In our previous work we defined a novel chaining technique known as M-Chaining [1], which is based on the CTMC model. In this technique the number of chains increases until it reaches a steady state. Once the steady state was reached, we found that movies streamed smoothly to the clients.

The chaining schemes mentioned above mainly focus on the server load. To further reduce the load on the server, many researchers have investigated the utilization of client resources by combining a peer to peer system with a chaining mechanism. Peers have smaller storage capacity and less bandwidth than a dedicated central server; therefore redundant copies of the movies are stored in the peers to improve the reliability of the movies. Redundancy can be implemented in two ways: replication and erasure correction codes. In the replication technique multiple copies of a movie are stored in different peers, whereas in the erasure correction code technique a movie is divided into smaller segments and multiple copies of these segments are stored in different peers.

(IJCSIS) International Journal of Computer Science and Information Security, Vol.6, No. 2, 2009 281 http://sites.google.com/site/ijcsis/ ISSN 1947-5500

In our work, we use an architecture that combines a proxy based system and a peer to peer system to reduce the load on the server. We use the M-Chaining technique [1] to transmit movies among the peers. The problem with this architecture is that peers are unreliable and may frequently break the chain during the transmission of movies between the peers.
Hence we propose a new data replication technique on peers to improve the reliability of the movies. In this paper data replication means that multiple copies of the same video are stored in different peers. We propose a new data replication algorithm that generates an optimal number of replicas. We follow four major steps for the overall optimality of the video streams. Firstly, the video is replicated into a number of replicas, and these replicas must be stored in different peers. Secondly, a data placement policy decides how many replicas should be placed on the peers. Thirdly, a selection policy selects a peer that contains a replica of the movie for optimal streaming. Finally, we use a CTMC (Continuous Time Markov Chain) model to analyze the performance of the replication in case of peer failures.

The rest of the paper is organized as follows: Section 2 reviews previous related work; Section 3 presents an overview of the proposed VoD architecture; Section 4 evaluates the success playback probability and reliability using simulation; and Section 5 concludes the paper.

II. RELATED WORK

The work in [15] proposes a replication technique that maximizes the availability of a video based on prior information about the availability of the movie in the peers and the popularity of the movie. The replicas are generated by maximizing the hit rate, a technique known as Max Hit. The authors formulate an optimization problem and use dynamic programming to maximize the availability of the movie in the peers. The downside of this approach is that the bandwidth of the peer containing the replica is completely ignored. The movie may be available in a serving peer, but the requesting peers are completely blocked because no bandwidth is available in the serving peers.
In our replication technique, we consider both the popularity of the movie and the serving peer bandwidth when calculating the number of replicas, so that the requesting peers are not completely blocked. The work in [15] proposes another replication technique that minimizes the requests for a movie based on the currently available bandwidth. The replicas are generated by minimizing the request rate, a technique known as Min Req. The problem is formulated using dynamic programming to minimize the requests for the movie in the loaded peer. The drawback of this technique is that replication is done for the most popular movies; as the popularity of a movie decreases, its replication is also reduced. In our replication technique, popularity is one of the important parameters: it increases the replicas for the most popular movies and reduces the replicas for the least popular movies. We have streamlined the generation of replicas in such a way that even the least popular movies have a significant number of replicas, unlike Min Req.

The work in [16] formulates a replication technique based on the availability of the file and proposes a bi-weighted model to find the optimal resource allocation scheme among the files. In this technique, preference is given to the files with the maximum weight. The drawback is that, to calculate the weight of a file, the authors rely on partial and limited information about the file, which is located in neighboring unreliable peers. In our approach, we calculate the weight of each movie based on its popularity, request arrival rate and number of replicas. The movies with the highest weights are given preference during placement in the serving peers.
In our case the weight is calculated periodically and updated accordingly in the proxy server, so that a non serving peer does not always depend on neighboring unreliable peers for the movies. In this way updated information about the movies available in the neighboring serving peers is obtained frequently from the proxy server, so that the non serving peers can switch among the serving peers in case of serving peer failures.

The work in [17] proposes a replication strategy for a super peer that keeps a request rate table containing the total number of requests and the request rate of the resources for a particular category. The number of replicas is calculated by multiplying the request rate by a factor K, where K reflects an aggressive replication strategy. The replicas are generated in proportion to the request rate and the file size, and they are stored uniformly across all heterogeneous peers. The major issue with this strategy is how to find the K factor, which is not addressed properly. In our approach, we focus on the request arrival rate, the down time of the serving peers and the currently available bandwidth and buffer of the serving peer to replicate the movies. We also have a larger number of serving peers, because a user relinquishes the peer resource information to qualify as a serving peer, provided that the minimum resource requirement is satisfied. Replicas are generated based on the available resources and the request arrival rate at that instant, instead of multiplying the request rate by some K factor.

The authors of [18] assume that all the movies have the same popularity and base their approach on the K-center problem, which is a well known NP-complete problem.
When a new request arrives to place a new movie in the super peer, the new movie is always placed there. If the storage capacity of the super peer is less than the size of the new movie, some existing movies in the super peer are forcefully removed. In our approach, existing movies are removed based on the LFU (Least Frequently Used) algorithm: if the popularity of an existing movie is less than the popularity of the new movie, the least popular movie is removed from the super peer, and if the popularity of the new movie is less than that of all the existing movies in the super peer, the request to place the new movie is rejected. We give preference to popular movies over non popular movies, so that the popular movies are always available in the super peers.

Durability of the replicas in peers is studied in [19]. The authors conclude that, for durability, the peers should have a longer lifetime because of the longer repair time. They define a lower bound threshold for each super peer as a function of the system capacity and bandwidth. Whenever the system fails, they assume a long repair time. This is not the case in our proposal: we achieve a shorter repair time for failures such as stop fail and VCR functionality, other than hardware and software failures, and we also achieve greater durability of the replicas when some peers fail.

The work in [20] develops a simple Markov model to evaluate the parameters of a replication system. The parameters of the model are based on replica loss and replica repair. The analysis is coarse because the replication is aggressively maintained with a small number of replicas in the super peers. The reliability of the replication suffers from the small number of replicas, and recovery from replica failure is not addressed.
Hence in our replication approach, we recover from replica failures by using a Markov chain. The work in [21] discusses a birth and death process model, using a CTMC for the number of replicas, that assumes independent and exponentially distributed failure and repair of the peers. Availability and durability are evaluated separately in order to find a more accurate model for predicting the durability of the system. This approach seems to produce inaccurate results. In our approach, we maintain a stronger correlation among the peers to achieve durability, availability and better performance of the system. The work in [22] estimates data durability using a CTMC model and provides an empirical expression that yields a good approximation of the sub-linear parameter values. However, the results of the model show some inaccurate predictions of the probability of data loss. In our approach, we focus on accurately predicting the probability of data loss during the replication of the videos.

III. SYSTEM MODEL

We propose a model that combines a proxy based architecture and a peer to peer system, as depicted in Fig. 1. The model contains a main multimedia server, a number of proxy servers and peer to peer systems. The main multimedia server contains the movie files along with the following information: index, size, duration, popularity, minimum buffer and maximum bandwidth of the movies. The overall system load is equally divided among the clusters. A cluster contains a group of peers connected to a proxy server. The proxy server contains the streamed movies and the currently streaming movies. It also maintains a database of all the peers currently available in the cluster. In the peer to peer system a peer can be either a serving peer or a non serving peer.
A serving peer stores a number of movies, and these movies are served to non serving peers using a chaining mechanism.

The first problem we address is the identification of a serving peer. Identification is difficult because a peer does not have powerful processing capability compared to a server. This problem can be solved by the user voluntarily relinquishing the peer resource information, such as the storage capacity, CPU speed and network bandwidth, to the proxy server. The peer resource information is stored in the database maintained by the proxy server, and the peer is designated as a serving peer. Additional details of the up time and down time of the serving peers are frequently obtained by the proxy server to maintain the system reliability. This approach has two major benefits: the overall load on the main multimedia server is reduced significantly by including the additional bandwidth and buffer of the serving peers in the cluster, and the user of the serving peer benefits from a reduction in subscription fees.

The second problem we address is the placement of movies in the serving peers. Placing a movie in a serving peer is difficult because of the high unreliability and unpredictability of the peer. This difficulty can be eliminated by storing more than one copy of each movie in different serving peers, with the general rule that no serving peer holds multiple copies of the same movie. If a serving peer fails, the movies of the failed serving peer can be obtained from other serving peers. In a traditional VoD system the duplicate copy of a movie is obtained using either a replication strategy or an erasure coding technique. In the replication strategy the number of replicas for the movie is calculated and these replicas are stored in different serving peers.
In the Reed-Solomon erasure correction code technique, for (n - h) movies we need to calculate h redundant movies out of n movies, and these h redundant movies are stored in different serving peers. We have observed that the erasure code technique generates more duplicate copies and consumes more storage space in the serving peers than the replication strategy. Hence, we propose a new replication strategy in our model, and the replicas generated by this strategy are stored in different serving peers using the Smallest Load First algorithm.

The third problem we address is the selection of the serving peers. Because of the dynamicity of the serving peers, it is difficult to select a predefined serving peer to serve a non serving peer. In our approach we use the Least Load First algorithm to select a serving peer.

The fourth problem we address is measuring the availability of the replicas in the serving peers. The behavior of the serving peers is unpredictable because of the dynamicity of their up time and down time during the streaming session. We therefore apply a Continuous Time Markov Chain (CTMC) model to measure the performance of our new replication strategy. The problems addressed in our model are discussed in detail in the following subsections.

A. Initialization of the system and identification of serving peers

Let N be the total number of peers and G the total number of serving peers in the cluster, so that (N - G) is the total number of non serving peers in the cluster. M is the set of movies $\{m_1, m_2, m_3, \ldots, m_m\}$ served in the cluster, $D_m$ is the duration of the m-th movie and $S_m$ is the size of the m-th movie.
Each movie m is equally divided into V video blocks such that $m = \sum_{i=1}^{V} S_m^{i}$. The entire movie follows Variable Bit Rate (VBR) for the transmission of video blocks, and $C_m$ is the minimum number of channels required for the efficient transmission of the m-th movie. C is the sum of all $C_m$ channels required to transmit the movies in the cluster. Requests for the m-th movie arrive with exponentially distributed inter-arrival times at a mean rate $\Lambda_m$, and $q_m$ is the popularity of the m-th movie derived from Zipf's law.

Initially, the proxy server maintains a list of serving peers in its database. This list is created based on sharing parameters such as requested movie, storage capacity, CPU speed, bandwidth, up time and down time of the serving peer. Whenever a peer requests a new movie from the main server, the request is redirected to the proxy server to which the peer belongs. The proxy server replies to the requesting peer with a dynamic list of serving peers that contain the requested movie. If the movie is available in the proxy server, the initial portion of the video blocks is streamed directly from the proxy server and the later portion of the movie is streamed from one of the peers selected from the list of serving peers. If the movie is not available in the proxy server or in any of the serving peers, the initial portion of the video blocks is streamed directly from the main server, and the later portion is first downloaded and buffered in the proxy server and then streamed from the proxy server to the requesting peer.

Fig. 1. Proposed Model of VoD System

The intuition behind this idea is that all the customers and service providers profit from the optimal utilization of the system resources. Firstly, the load on the main multimedia server is reduced because of the load sharing among the peers.
We have also found that peers are neither overloaded nor exploited by the chaining mechanism. Secondly, the users of the system benefit by paying less in their monthly subscription fees. The subscription fee is reduced based on the serving peer's resource utilization and availability. The users who relinquish their peers as serving peers benefit the most from this scheme. The users of the non serving peers also benefit by paying only a small percentage of the actual subscription fees. Finally, the overall system resources are utilized very well because only the residual bandwidth and buffer of the serving peers are used to serve non serving peers.

B. Replication Strategy

Once the serving peers are identified, we need to calculate the number of replicas for the movies and place these replicas in different serving peers. The objective of the replication strategy is to maximize the availability of the movies in the cluster in case of serving peer failures. This is achieved by replicating copies of the movies in the serving peers. A copy of the movie is always stored in the proxy server, and duplicate copies are also stored in different serving peers while the movie is transmitted in the cluster, which maximizes the availability of the movies in the cluster. We identify the serving peers, as described in Section III A, that have sufficient bandwidth and buffer to hold additional movies. Before placing the movies in the serving peers, the requested movies are downloaded from the main multimedia server and stored in the proxy server.

We now define the parameters required for the replication strategy. Let G be the total number of serving peers, where G also denotes the set of currently available serving peers in the cluster. Let $P_{up}$ and $P_{dn}$ be the durations of up time and down time, respectively, of a serving peer in the cluster. The probability of peer availability in the cluster is defined as

\[ A_{up} = \frac{P_{up}}{P_{up} + P_{dn}} \]

and the probability of peer unavailability in the cluster is defined as

\[ A_{dn} = \frac{P_{dn}}{P_{up} + P_{dn}} \]

A flag is associated with each serving peer based on the values of $A_{up}$ and $A_{dn}$. If the flag is 1, the serving peer is available and can store a replica of the movie. Otherwise the serving peer is not available and cannot store a replica of the movie. The reasons for the unavailability of a serving peer can be software or hardware failure, stop fail, network failure, insufficient network bandwidth or insufficient storage space to store replicas. We have assumed that peer unavailability is much larger than peer availability in our model. Each serving peer g shares $B_g$ bytes of storage space and $\beta_g$ channels of uplink bandwidth. Let $R_m$ be the number of replicas of the m-th movie and R the total number of replicas in the cluster. A valid replication should not exceed the aggregate storage capacity of all the serving peers available in the cluster,

\[ \sum_{m=1}^{M} S_m R_m \le \sum_{g=1}^{G} B_g \]

and should also satisfy the channel requirement for the transmission of the movies,

\[ \sum_{g=1}^{G} \beta_g \ge C \]

We now propose a new replication algorithm for the proxy servers to generate an optimal number of replicas for a cluster.

Replication Algorithm in Proxy Server
Step 1: Wait in batch for the arrival of new requests for movies.
Step 2: Sort the movies in the batch based on the popularity q.
Step 3: Measure the current request rate Λ for each of the newly requested movies.
Step 4: Get the size of each newly requested movie file from the main multimedia server and store it in a new list S_m'.
Step 5: Get the list of serving peers with the A_up flag set to 1 from G and store it in a new list G'.
Step 6: Filter out the serving peers from G' for which B_g < S_m' and store the remaining serving peers in a new list G''.
Step 7: Total number of available serving peers T_R = count(G'').
Step 8: Ω = avg(Λ, q)
Step 9: Replicas R = Ω * T_R
Step 10: Choose R serving peers from G'' and store them in a new list G'''.
Step 11: Placement(R, G''')
Step 12: Repeat from Step 1.

In our proposed replication algorithm, we consider other important factors such as peer up time for reliability, the popularity of the movie and the inter-arrival request rate of the movie. The replication algorithm increases the number of replicas for the most popular movies and balances the load across all the serving peers.

C. Placement of replicas in the serving peers

After determining the number of replicas R, we must place these replicas in the serving peers without overloading them. We could use either a Random technique or a Round Robin technique to place the replicas in the serving peers. These placement techniques assume that all movies have the same popularity, which is not the case in current VoD systems. We have to place more emphasis on the popularity of the movie because of the demand it creates in the system. Hence, we use the Smallest Load First algorithm to place the replicas of the movies in the serving peers. In this algorithm we calculate the weight of each movie, so that the most popular movies have larger weights than the least popular movies.

Smallest Load First algorithm
Step 1: Accept the number of replicas R and the list of active serving peers G''' from the Replication Algorithm.
Step 2: For each movie in R, calculate the weight based on the request arrival rate, the popularity and the number of replicas. The weight of each movie is calculated as $W_m = \frac{\Lambda_m \times q_m}{R_m}$. Map each movie to its weight and store the pair in the hash table ṀẆ.
Step 3: Sort all movies stored in ṀẆ by weight in non-ascending order.
Step 4: In each iteration, select the Q replicas with the highest weights from ṀẆ (Q is the number of replicas that a serving peer can store) and place them in Q different serving peers, such that the highest weighted movies are placed in the serving peers with the smallest load and no serving peer stores multiple copies of the same movie.

D. Selection of a Serving Peer

After placing the replicas of the movies in different serving peers, the next step is to select a serving peer to serve a non serving peer. Whenever a non serving peer sends a request for a movie, together with its existing resource information, to the proxy server, the proxy server replies with the list of all active serving peers that contain the requested movie. The non serving peer must then decide which serving peer to select from the list. One method is to select a serving peer randomly. The drawback of this method is that it fosters load imbalance among the serving peers. Hence the non serving peer must look for the least loaded serving peer to receive the requested movie. The non serving peer requests the currently available resource information from all the serving peers in the list. It then receives this information from all of them and sorts the serving peers in non-increasing order of available resources. Finally the non serving peer selects the serving peer with the highest resource availability and the least load for the reception of the movie. During the transmission session the serving peers can fail at any time. If the serving peer fails, the non serving peer executes the same procedure again to select the serving peer with the highest available resources and the least load. In the worst case, if none of the serving peers is available, the movies are transmitted directly to the non serving peer from the proxy server.

E. Reliability Model

In this section we formulate a reliability model for the replicas of the movies. The objective is to maximize the availability of popular movies in the serving peers. As discussed in the replication strategy of Section III B, more replicas are generated for the most popular movies. We now face the problem of the availability and durability of the replicas in the serving peers, which raises certain questions. How long can the serving peers participate in the replication process? What is the reliability of the serving peer up time during replication? What happens if a serving peer fails abruptly? How often do peers join and leave the system as serving peers? To solve this problem we apply the CTMC model to normalize the number of replicas during the transmission of movies. We consider n states of serving peers in the CTMC model, as shown in Fig. 2, where k is the number of functioning replicas of the m-th movie. State 0 is the absorbing state, beyond which there is no replication.
We relate our problem to the Gambler's ruin problem: a state containing a certain number of replicas during transmission can fail, and in the long run the system can be ruined through the non-availability of replicas. Hence we evaluate the state duration $T_p$, which is assumed to be exponentially distributed with mean $1/\lambda$, where $\lambda$ is the failure rate. Thus the reliability is defined as $P[T_p > t] = e^{-\lambda t}$ at any given time t. Over time, peers fail and the number of replicas in the system decreases. To address this attrition, we need a repair mechanism that creates new copies of replicas when a peer fails. The repair must identify the lost replica and copy it to an existing serving peer. This process takes some duration $T_r$, which we assume is also exponentially distributed with mean $1/\mu$, where $\mu$ is the repair rate. Thus $P[T_r > t] = e^{-\mu t}$ at any given time t. To balance fast failure against fast creation of new replicas, we define the normalized repair rate $\gamma$ as $\mu/\lambda$. The repair time $T_r$ must be at least the time it takes to detect the lost replicas and copy the new replicas to another serving peer.

F. Markov chain

The above model is analyzed and reduced to a Markov chain. The system has k functioning replicas at any given point of time and the remaining (n - k) replicas are being repaired; this system can be modeled as a Markov chain with (n+1) states. In state k, any one of the k functioning replicas can fail, in which case the system moves to state (k - 1), or any one of the (n - k) non-functioning replicas is repaired, in which case the system moves to state (k+1). According to the CTMC model, the transitions are defined as follows: from state k, a transition occurs either to state (k - 1) with rate $k\lambda$ or to state (k+1) with rate $(n - k)\mu$.
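The transition structure just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the numeric rates used in the demo are assumptions made for the example and do not come from the paper.

```python
import math


def reliability(rate, t):
    """P[T > t] for an exponentially distributed duration with the given rate."""
    return math.exp(-rate * t)


def transition_rates(n, lam, mu):
    """Map each state k to (rate to state k-1, rate to state k+1).

    State k has k functioning replicas, so failures occur at rate k*lam
    and repairs at rate (n-k)*mu. State 0 is absorbing: both rates are 0.
    """
    rates = {0: (0.0, 0.0)}
    for k in range(1, n + 1):
        rates[k] = (k * lam, (n - k) * mu)
    return rates


def jump_probabilities(k, n, lam, mu):
    """Probabilities that the next transition from state k >= 1 is a failure or a repair."""
    fail, repair = k * lam, (n - k) * mu
    total = fail + repair
    return fail / total, repair / total


if __name__ == "__main__":
    lam, mu = 0.5, 2.0  # hypothetical failure and repair rates
    n = 4               # hypothetical number of replicas
    print("normalized repair rate gamma =", mu / lam)
    for k, (down, up) in transition_rates(n, lam, mu).items():
        print(f"state {k}: to {k - 1} at rate {down}, to {k + 1} at rate {up}")
```

Dividing each rate by the total outgoing rate of a state, as `jump_probabilities` does, gives the embedded jump chain of the CTMC.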
Note that state 0 is the absorbing state, beyond which there is no replication.

Fig. 2. CTMC Model

G. Mean time to failure

Because of the dynamics of peers in the cluster, we need to calculate the lifetime of the replicas before they are permanently lost. The relevant metric in this model is the time to failure, i.e. the time taken to go from a state with $n$ replicas to the absorbing state 0, in which no replicas are left. We define the lifetime of the replicas $T_s$ as the product of $n_e$ and $t_e$, where $n_e$ is the expected number of states traversed and $t_e$ is the expected total duration spent in each state:

$$T_s = n_e \, t_e$$

We derive the values of $n_e$ and $t_e$ as follows. Let $Q_k$ be the probability that the system, starting from state $k$, reaches the absorbing state 0 before state $n$. $Q_k$ satisfies the following recurrence relation for $0 < k < n$:

$$Q_k = \frac{(n-k)\mu}{k\lambda + (n-k)\mu}\, Q_{k+1} + \frac{k\lambda}{k\lambda + (n-k)\mu}\, Q_{k-1}$$

that is,

$$Q_k = \alpha_k Q_{k-1} + \beta_k Q_{k+1}, \qquad \alpha_k + \beta_k = 1,$$

where

$$\alpha_k = \frac{k\lambda}{k\lambda + (n-k)\mu}, \qquad \beta_k = \frac{(n-k)\mu}{k\lambda + (n-k)\mu}.$$

Then

$$(\alpha_k + \beta_k)\, Q_k = \alpha_k Q_{k-1} + \beta_k Q_{k+1}$$

$$Q_{k-1} - Q_k = \frac{\beta_k}{\alpha_k}\,(Q_k - Q_{k+1})$$

$$Q_{k-1} - Q_k = \frac{n-k}{k}\left(\frac{\mu}{\lambda}\right)(Q_k - Q_{k+1})$$

Starting from state $(n - 1)$, the probability $Q^*$ of reaching the absorbing state 0 is then given by

$$Q^* = \frac{1}{\sum_{i=0}^{n-1} \binom{n-1}{i}^{-1} \left(\frac{\mu}{\lambda}\right)^{i}}$$

Therefore, the expected number of states traversed is $n_e = 1/Q^*$. Let $T_k$ be the expected time before the system reaches the absorbing state 0 starting from state $k$ before state $n$.
$T_k$ satisfies the following recurrence relations, where $0 < k < n$:

$$T_n = \frac{1}{n\lambda} + T_{n-1}$$

$$T_{n-1} = \frac{1}{(n-1)\lambda + \mu} + \frac{(n-1)\lambda}{(n-1)\lambda + \mu}\, T_{n-2} + \frac{\mu}{(n-1)\lambda + \mu}\, T_n$$

$$T_{n-2} = \frac{1}{(n-2)\lambda + 2\mu} + \frac{(n-2)\lambda}{(n-2)\lambda + 2\mu}\, T_{n-3} + \frac{2\mu}{(n-2)\lambda + 2\mu}\, T_{n-1}$$

$$\vdots$$

$$T_k = \frac{1}{k\lambda + (n-k)\mu} + \frac{k\lambda}{k\lambda + (n-k)\mu}\, T_{k-1} + \frac{(n-k)\mu}{k\lambda + (n-k)\mu}\, T_{k+1}$$

which yields

$$T_{n-1} = \frac{1}{(n-1)\lambda} + T_{n-2} + \frac{\mu}{n(n-1)\lambda^{2}}$$

$$T_{n-2} = \frac{1}{(n-2)\lambda} + T_{n-3} + \frac{2\mu}{(n-1)(n-2)\lambda^{2}} + \frac{2\mu^{2}}{n(n-1)(n-2)\lambda^{3}}$$

$$T_k = \frac{1}{k\lambda} + T_{k-1} + \frac{(n-k)\mu}{k\lambda}\,(T_{k+1} - T_k)$$

We now start from state $(n - 1)$ to calculate the expected time $T^*$ to reach the absorbing state 0.

[Equation: closed-form expression for $T^*$; not recoverable from the source text.]

Therefore, the total time spent in each state is given by $t_e = 1/T^*$.

To maximize $T_s$, the system should make the repair ratio or the number of replicas as large as possible. The problem is how to make the repair ratio $\gamma$, or the number of replicas $\eta$ permitted by the storage capacity, relatively large. We therefore consider the effect of several factors on the choice of $\eta$ and $\gamma$ that maximizes $T_s$. The following factors are considered in our system.

a) Storage capacity: The number of replicas created for a movie is limited by the total storage capacity of the system, such that $\eta \le \eta_{max}$, where $\eta_{max}$ is the upper limit on the number of replicas imposed by the storage limit.

b) Detecting replica loss: This factor directly affects the replica repair time $T_r$, such that $T_r \le \tau_{max}$, where $\tau_{max}$ is the upper limit on the normalized repair rate.
c) Repair bandwidth: The final factor we consider is the bandwidth constraint on creating new replicas in the system. A replica exists in a peer for a duration $T_p$, after which the peer departs. The repair mechanism then creates a new replica after time $T_r$. A new replica is therefore created, on average, every $E[T_p + T_r] = 1/\lambda + 1/\mu$. Starting with $\eta$ replicas of $B$ bytes each, the average bandwidth is defined by $\varphi = \eta/(\gamma + 1/\gamma)$, such that $\varphi \le \varphi_{max}$, where $\varphi_{max}$ is the bandwidth constraint in bytes per second.

IV. SIMULATION AND RESULTS

In this section we use simulation to evaluate the performance of the proposed technique and compare the results with the existing techniques. We used MATLAB to evaluate the performance of the system. The simulation results are evaluated in terms of the success playback probability within the cluster. The success playback probability for a video of length $t$ is defined as the probability of successful reception throughout the entire duration of the video, given that the VoD request is admitted. We also evaluated the mean lifetime of the replicas to observe the availability of movies in the cluster.

The topology used in the simulation is a network with a single media server and 5 clusters. Each cluster consists of a proxy server and 1500 peers, including both serving and non-serving peers. The proxy server has bandwidth ranging from 30 MB to 120 MB and a buffer ranging from 1000 MB to 5000 GB. The proxy server maintains a database of currently streamed or streaming movies and a list of the serving peers within the cluster. The total number of movies requested in a cluster is less than 300 per hour. Each movie is 7200 s long and is transmitted at a variable bit rate. The request arrivals follow a Poisson distribution.
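The Poisson request arrivals can be illustrated as follows. This is a minimal Python sketch rather than the MATLAB code used in the study; the function name `poisson_arrivals`, the fixed seed, and the use of 300 requests per hour (the stated upper bound) as the arrival rate are assumptions made for illustration.

```python
import random


def poisson_arrivals(rate_per_hour, horizon_s, seed=0):
    """Generate request arrival times (in seconds) of a Poisson process.

    The gaps between arrivals of a Poisson process are exponentially
    distributed, so arrival times are cumulative sums of exponential samples.
    """
    rng = random.Random(seed)
    rate_per_s = rate_per_hour / 3600.0
    arrivals = []
    t = rng.expovariate(rate_per_s)
    while t < horizon_s:
        arrivals.append(t)
        t += rng.expovariate(rate_per_s)
    return arrivals


if __name__ == "__main__":
    # Assumed values: 300 requests/hour over one 7200 s movie length.
    times = poisson_arrivals(300, 7200)
    print(len(times), "requests generated")
```

Seeding a private `random.Random` instance keeps each run reproducible, which makes it easier to average results over several trials with different seeds.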
Each serving peer has twice the uplink bandwidth of a non-serving peer and can store up to 10 movies. The popularity of the movies follows Zipf's distribution with a skew factor of 0.271. Peer uptime and downtime follow exponential distributions with means of 3600 s and 32400 s, respectively. The simulated model was evaluated several times, and the results shown are averages over all simulation trials carried out in all clusters. We simulated different replication strategies, namely random, minimize request (MinReq) and maximize hit ratio (MaxHit), and compared them with our proposed technique.

Fig. 3 shows the number of replicas for each movie. Movies with a lower index are more popular, and movies with a higher index are less popular. In the random replication technique the replicas of the movies are randomly distributed. In the maximize hit rate technique the replicas are distributed almost equally across the movies. In the MinReq technique, more replicas are created for the most popular videos, and the number decreases linearly as the popularity of the video decreases. Our proposed technique generates more replicas for the most popular videos than the other techniques, and the number decreases linearly towards the least popular videos without removing many of their replicas. Compared with the other techniques, it therefore generates more replicas for the most popular videos and fewer replicas for the least popular videos.

Fig. 3. Comparison of Replication Strategy (number of replicas vs. video index; Proposed, MaxHit, MinReq, Random)

Fig. 4 shows the success playback probability of movies with respect to the Poisson arrival rate. Initially, when the arrival rate is low, we get 100% successful playback. As the arrival rate increases, the success playback probability decreases linearly. Our proposed technique achieves a success playback probability 8% higher than minimize request, 15% higher than maximize hit and 18% higher than the random technique.

Fig. 4. Success Playback Probability of Movies (success playback probability vs. arrival rate; Proposed, MinReq, MaxHit, Random)

Fig. 5 shows the success playback probability with respect to the availability of serving peers. We plot the success playback probability for serving peer availability ranging from 0.05 to 0.25. As the availability of serving peers decreases, the success playback probability also decreases; we get a 99.8% success playback probability when the availability of the serving peers reaches 0.24 out of all G serving peers. Our proposed technique achieves a success playback probability 4% higher than minimize request, 16% higher than maximize hit rate and 24% higher than the random technique.

Fig. 5. Availability of Serving Peers (success playback probability vs. availability of peers; Proposed, MinReq, MaxHit, Random)

Fig. 6 shows the mean lifetime for the number of replicas available in the system. Using our proposed replication technique, we varied the repair ratio $\gamma$ as discussed in Section III E and calculated the mean lifetime of the replicas. With $\gamma = 0.1$ the lifetime of the replicas is shorter than with $\gamma = 10.0$. Conversely, we also evaluated the mean lifetime for different numbers of replicas. As observed in Fig. 7, when the repair ratio $\gamma$ decreases, the lifetime of the replicas also decreases, and when $\gamma$ increases, the lifetime of the replicas increases exponentially.

The success rate of the VoD system is shown in Fig. 8. Most of the requested movies are served immediately, either from the proxy server or from the serving peers. If a movie is served from a serving peer, there is some latency due to chaining, as shown in Fig. 8. Very few movie requests are rejected. Fig. 9 and Fig. 10 show the average buffer and bandwidth utilization in the proxy servers. Initially, a large amount of buffer and bandwidth is used. As time proceeds, the utilization of the proxy server's bandwidth and buffer decreases enormously because of chaining among the peers in the cluster.

Fig. 6. Mean Lifetime of the replicas (mean lifetime in minutes vs. number of replicas; gamma = 10.0, 1.0, 0.1)

Fig. 7. Mean Lifetime of the replicas based on repair ratio (mean lifetime in minutes vs. gamma; 300, 200, 100 and 50 replicas)

V. CONCLUSION

In this paper we studied different types of data replication techniques and proposed a new replication technique that generates more replicas. We addressed the issues of identifying a peer, generating the replicas, placing the replicas in the peers, selecting a serving peer and the reliability of the replicas. In particular, we investigated the reliability of replicas in the VoD system in depth, applying the CTMC model to analyze it. Our simulation shows promising results in generating more replicas and achieving a greater success playback rate than the existing techniques. We have also shown a greater lifetime for the availability of replicas in the peers, at the cost of a short latency in copying the replicas of a failed peer to a new serving peer.
Finally, we have achieved highly efficient utilization of the overall bandwidth and buffer in the VoD system. The work can be extended by comparing the proposed technique with the RSE technique.

Fig. 8. Success rate of the VoD System (number of movies vs. time in minutes; Arrival, Served, Rejected)

Fig. 9. Average Buffer Utilization in proxy servers (buffer in GB vs. time in minutes)

Fig. 10. Average Bandwidth Utilization in proxy servers

AUTHORS PROFILE

R. Ashok Kumar is a Research Scholar in the School of Computing Science at VIT University, Vellore, India.
He received his BE degree in Information Science & Engineering from Bangalore University, and his MTech in Computer Science & Engineering from Visvesvaraya Technological University, Belgaum, India. His areas of interest are Multimedia Applications, Database Management Systems, Internet Technologies and Quality of Service. His current research includes Video on Demand systems and Bandwidth Management.

Dr. K. Ganesan is currently working as a Senior Professor in the School of Computing Sciences, VIT University, Vellore, India. He also heads the Centre of Relevance and Excellence in "Automotive Infotronics" at VIT University, Vellore, and is working on a Defense Research project related to "Cryptography". He has published about 50 papers in international journals and conferences (national and international level). He was honored as the International Scientist of the Year 2008 by the International Biographical Centre of Cambridge, England. He has been chosen as a candidate for inclusion in the 10th Anniversary Edition of Marquis Who's Who in Science and Engineering. His areas of research include Media Processing (Image and Video), Mobile Computing and Data Security.

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 6, No. 2, 2009. http://sites.google.com/site/ijcsis/ ISSN 1947-5500
