Non real time simulation and results

We start with a synthetic scenario in which we simulate a 20-minute video using the bandwidth profile shown in Fig. 1. We compare our online SVC algorithm ($`W=120`$, $`\alpha=60`$, $`PE=20\%`$) with the Netflix BBA-0 algorithm described in [2].
The figure shows that online SVC delivers frames at a higher average playback rate than BBA-0: 1.48 Mbps for SVC versus 1.0 Mbps for BBA-0.
BBA-0 is a buffer-based algorithm. It does not predict the bandwidth, which is why it does not start delivering frames at the $`1^{st}`$ enhancement layer at the beginning. It takes more than 8 minutes for the buffer to reach the level at which the algorithm decides to fetch frames at a higher layer. However, because the bandwidth follows a peak-trough-peak pattern, even after the algorithm starts fetching the higher layer it keeps oscillating between the base layer and the $`1^{st}`$ enhancement layer. SVC, on the other hand, starts this oscillation very early thanks to the prediction. It uses each peak to fetch frames at the highest possible quality, subject to not running into stalls for the frames that correspond to the trough.

Non real time toy example, (a) Bandwidth profile, and (b) Playback rate

Non real time real scenarios

We compared our online algorithm ($`W=120`$, $`\alpha=20`$, $`PE=20\%`$) with the BBA-0 algorithm on several real collected bandwidth profiles. We used bandwidth profile 3 with an average bandwidth of 2.5 Mbps, and bandwidth profile 4 with average bandwidths of 3 Mbps and 2 Mbps. An average of 3 Mbps looks high, but the high-bandwidth regime actually comes very late, at the end of the video. In the case of bandwidth profile 3 with an average of 2.5 Mbps, as shown in Fig. 2, SVC starts fetching frames at the quality of the highest enhancement layer immediately and continues delivering frames at this quality level, while BBA-0 took about 250s to reach the highest enhancement layer. The reason is that, although the average bandwidth is high, BBA-0 waits to fill the buffer up to $`B \geq B_{min}`$ $`(90)`$ before it starts fetching frames at the quality of a higher enhancement layer, and up to $`B \geq B_m`$ $`(120)`$ before it delivers frames at the highest quality level. The average delivered quality is 2.03 Mbps for SVC and 1.6 Mbps for BBA-0.

Playback rate of non-real-time SVC vs. Netflix BBA-0, for bandwidth profile 3 with average BW of 2.5Mbps

In the case of bandwidth profile 3 with an average of 1.5 Mbps, as shown in Fig. 3, SVC starts fetching frames at the quality of the highest enhancement layer immediately and continues delivering frames at this quality level, while BBA-0 took about 250s to reach the highest enhancement layer. However, close to the end the bandwidth runs very low, so the SVC algorithm runs into many stalls and takes 74 more seconds to finish delivering all frames. Such stalls are expected when the window size is small, the average bandwidth is low, and the good regime comes first, so that the bandwidth runs low later. The average delivered quality is 1.7 Mbps for SVC and 1.45 Mbps for BBA-0.

Playback rate of non-real-time SVC vs. Netflix BBA-0, for bandwidth profile 3 with average BW of 1.5Mbps

In profile 4 with an average bandwidth of 3 Mbps, the bandwidth starts very low and gets much higher at the end. As Fig. 4 shows, since the bandwidth is very low at the beginning, both algorithms keep fetching frames at base layer quality. Close to the end, however, the bandwidth improves significantly. SVC immediately takes advantage of this improvement and delivers the remaining frames at the highest possible quality. In contrast, BBA-0 continues buffering frames and does not move to fetching higher layers until very late, when the video has almost reached its end. The average quality is 1.7 Mbps for SVC and 1.12 Mbps for BBA-0.

Playback rate of non-real-time SVC vs. Netflix BBA-0, for bandwidth profile 4 with average BW of 3Mbps

In profile 4 with an average bandwidth of 1.9 Mbps, as Fig. 5 shows, the Netflix algorithm (BBA-0) finishes fetching frames sooner than SVC, at the cost of lower quality. Moreover, both algorithms run into playback stalls because of the low bandwidth at the beginning. The total interruption time is slightly higher for SVC than for BBA-0, with values of 41 and 30, respectively. However, SVC was able to deliver about $`40\%`$ of the frames at the quality of the $`1^{st}`$ enhancement layer or higher, while the Netflix algorithm fetches all frames at base layer quality.

Playback rate of non-real-time SVC vs. Netflix BBA-0, for bandwidth profile 4 with average BW of 2Mbps

EVALUATION

In this section, we evaluate our algorithms (LBP) using both simulation and emulation. Simulation allows us to explore a wide spectrum of the parameter space. We then implemented a TCP/IP-based emulation testbed to compare its performance with simulation and to measure the runtime overhead in §69.1.

Simulation Parameters

playback layer	BL	EL1	EL2	EL3
nominal Cumulative rate (Mbps)	0.6	0.99	1.5	2.075

SVC encoding bitrates used in our evaluation

Simulation Setup. To make our simulation realistic, we use the SVC encoding rates of the SVC-encoded video “Big Buck Bunny”, which is published in . It consists of 299 chunks (14315 frames), and the chunk duration is 2 seconds (48 frames at a frame rate of 24fps). The video is SVC encoded into one base layer and three enhancement layers. Table 1 shows the cumulative nominal rates of each of the layers. The exact rate of each chunk may differ since the video is VBR encoded. In the table, “BL” and “EL$`_i`$” refer to the base layer and the cumulative (up to the) $`i`$th enhancement layer size, respectively. For example, the exact size of the $`i`$th enhancement layer alone is equal to EL$`_i`$-EL$`_{(i-1)}`$, where EL$`_0`$ denotes BL.
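The relation between cumulative and per-layer rates can be illustrated with a minimal sketch that uses the nominal values from Table 1. The function name `per_layer_rates` and the rounding to three decimals are illustrative choices, not part of the original evaluation code.

```python
# Cumulative nominal rates from Table 1, in Mbps (BL, EL1, EL2, EL3).
CUMULATIVE_MBPS = [0.6, 0.99, 1.5, 2.075]


def per_layer_rates(cumulative):
    """Return the rate contributed by each layer alone.

    The base layer keeps its own rate; layer i contributes
    cumulative[i] - cumulative[i - 1].
    """
    rates = [cumulative[0]]
    for prev, cur in zip(cumulative, cumulative[1:]):
        # Rounding only hides floating-point noise in the subtraction.
        rates.append(round(cur - prev, 3))
    return rates


layer_rates = per_layer_rates(CUMULATIVE_MBPS)
print(layer_rates)
```

Summing the per-layer rates up to layer $`i`$ recovers the cumulative value EL$`_i`$ reported in the table.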

For all schemes (both the baseline approaches and our algorithms), we assume a playback buffer of 10 seconds ($`B_{m}=10s`$) for the skip version and 2 minutes for the No-Skip version, and a startup delay of 5 seconds. We systematically study the impact of different algorithm parameters, including prediction accuracy, prediction window size, and playback buffer size, in Appendix 66. Finally, for all variants of our algorithms with short prediction ($`W \leq 20s`$), we set the lower buffer threshold to half of the maximum buffer occupancy ($`B_{min}=B_{m}/2`$). When the buffer occupancy is below $`B_{min}`$, we drop the highest layer that was selected for fetching (unless the decision is to fetch only the base layer). That is, we still run the optimization problem and collect the layer size decisions, but we decrement the number of layers by 1 if enhancement layers were selected. This keeps the algorithm from being overly optimistic when the buffer is running low, since the algorithm with short prediction has limited knowledge of the bandwidth ahead. All reported results are based on the 50 diverse bandwidth traces described next.
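The low-buffer rule described above can be sketched as a small post-processing step on the optimizer's decision. This is a minimal illustration under stated assumptions: the function name `apply_low_buffer_rule` and the convention that `decided_layers` counts layers with 1 meaning base layer only are hypothetical, and the optimization itself is not shown.

```python
def apply_low_buffer_rule(decided_layers, buffer_s, b_max_s=10.0):
    """Drop the highest selected layer when the buffer is below B_min.

    decided_layers: number of layers chosen by the optimizer
                    (1 = base layer only; assumed convention).
    buffer_s:       current buffer occupancy in seconds.
    b_max_s:        maximum buffer occupancy B_m in seconds.
    """
    b_min_s = b_max_s / 2  # B_min = B_m / 2
    if buffer_s < b_min_s and decided_layers > 1:
        return decided_layers - 1
    return decided_layers


# With B_m = 10 s the threshold is 5 s.
print(apply_low_buffer_rule(3, buffer_s=4.0))
print(apply_low_buffer_rule(1, buffer_s=4.0))
print(apply_low_buffer_rule(3, buffer_s=6.0))
```

A base-layer-only decision is never reduced further, which matches the exception stated in the text.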

Bandwidth traces. For bandwidth traces, we used the dataset in , which consists of continuous 1-second measurements of the video streaming throughput of a moving device in Telenor’s 3G/HSDPA mobile network in Norway. The dataset contains 86 bandwidth profiles (traces) for different transportation types, including bus, car, train, metro, tram, and ferry. We exclude traces with either very high or very low bandwidth, since in both cases the streaming strategy is trivial (fetching all layers and only the base layer, respectively). This leaves 50 traces, whose key statistics are plotted in Fig. 6. Overall, the traces are highly diverse, with lengths varying from 3 to 30 minutes. Since “Big Buck Bunny” is 598s long, the video is restarted for long traces and cut at the end of the trace for short traces.

The average throughput across the traces varies from 0.7Mbps to 2.7 Mbps, with the median being 1.6 Mbps. In each trace, the instantaneous throughput is also highly variable, with the average standard deviation across traces being 0.9 Mbps.

Statistics of the bandwidth traces: (a) mean and standard deviation of each trace’s throughput, and (b) trace length, across the 50 traces.

Bandwidth Prediction. We consider two different techniques for bandwidth prediction. The first is a harmonic mean based prediction, in which the harmonic mean of the bandwidth over the last 5 seconds is used as a predictor of the bandwidth for the next 20 seconds. We refer to our algorithm with harmonic mean based prediction as HM. For the second, we assume crowdsourced prediction and consider combinations of prediction window size and prediction error percentage. A longer prediction window comes at the cost of a higher prediction error. For example, we use $`(10,25\%)`$ to refer to a prediction window ($`W`$) of 10 seconds and a prediction error $`pe`$ of 25%. In our simulation, the predicted bandwidth is computed by multiplying the actual value in the bandwidth trace (the ground truth) by $`1+e`$, where $`e`$ is drawn uniformly from $`[-pe, pe]`$ (based on our findings in Appendix 70, the prediction error tends to have a mean of 0 in the long run). For the skip version (real-time streaming), we evaluate our algorithm with $`(10,25\%)`$ and $`(20,50\%)`$, since chunks beyond 20 seconds ahead might not be available yet. For the No-Skip version (non-real-time streaming), we consider $`(20,50\%)`$ and $`(100,60\%)`$. We also include the offline scheme $`(\infty,0)`$ for comparison. It corresponds to the performance upper bound for an online algorithm, which is given by our offline algorithm.
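Both predictors can be sketched in a few lines. The sketch below assumes one bandwidth sample per second, so that "the last 5 seconds" corresponds to the last 5 samples; the function names, the handling of zero samples, and the optional `rng` argument are illustrative assumptions rather than the implementation used in the evaluation.

```python
import random


def harmonic_mean_prediction(history_mbps, lookback=5):
    """Harmonic mean of the last `lookback` samples (assumed 1 sample/s).

    The returned value would be used as the predicted bandwidth for
    the next 20 seconds.
    """
    recent = history_mbps[-lookback:]
    if not recent or any(s <= 0 for s in recent):
        # Assumption: a zero sample drives the harmonic mean to zero.
        return 0.0
    return len(recent) / sum(1.0 / s for s in recent)


def noisy_prediction(actual_mbps, pe, rng=random):
    """Scale each ground-truth sample by (1 + e), with e ~ U[-pe, pe]."""
    return [bw * (1.0 + rng.uniform(-pe, pe)) for bw in actual_mbps]


trace = [1.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0]
print(harmonic_mean_prediction(trace))
print(noisy_prediction(trace[:3], pe=0.25, rng=random.Random(0)))
```

Because $`e`$ is symmetric around zero, the noisy predictor is unbiased with respect to the ground truth, which is the property the text relies on.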

Skip Based Streaming

We compare our skip-based streaming algorithm (§32.3) with three baseline algorithms with different aggressiveness levels. Baseline 1 is a conservative algorithm that performs a “horizontal scan” by first trying to fetch the base layer of all chunks until the buffer is full. If there is spare bandwidth and the playout buffer is not full, the algorithm fetches the first enhancement layer of those buffered chunks that can receive it before their playback deadline. If the bandwidth still permits, the algorithm fetches the second enhancement layer in the same manner. Baseline 2 instead aggressively performs a “vertical scan”: it fetches all layers of the next chunk before fetching any future chunks. Baseline 3 is a hybrid approach combining Baseline 1 and 2. It first (vertically) fetches all layers of the next chunk and, if there is still available bandwidth, subsequently (horizontally) fetches the base layer of all later chunks before proceeding to their higher layers.
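The difference between the three baselines is easiest to see in the order in which they request (chunk, layer) pairs. The following is a simplified sketch that deliberately ignores bandwidth, playback deadlines, and buffer limits, all of which the actual baselines take into account; the function names are hypothetical.

```python
def horizontal_scan(num_chunks, num_layers):
    """Baseline 1 order: one layer across all chunks, then the next layer."""
    return [(c, l) for l in range(num_layers) for c in range(num_chunks)]


def vertical_scan(num_chunks, num_layers):
    """Baseline 2 order: all layers of a chunk before the next chunk."""
    return [(c, l) for c in range(num_chunks) for l in range(num_layers)]


def hybrid_scan(num_chunks, num_layers):
    """Baseline 3 order: next chunk vertically, later chunks horizontally."""
    first = [(0, l) for l in range(num_layers)]
    rest = [(c, l) for l in range(num_layers) for c in range(1, num_chunks)]
    return first + rest


# Pairs are (chunk index, layer index), with layer 0 being the base layer.
print(horizontal_scan(3, 2))
print(vertical_scan(3, 2))
print(hybrid_scan(3, 2))
```

In this idealized view, the three orders cover the same set of pairs and differ only in which pairs are lost first when bandwidth runs out.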

Skip based streaming results for different schemes: (a) layer distribution, (b) average playback rate, and (c) layer switching rate.

We compare the above three baseline approaches with three representative configurations of our proposed online LBP algorithm. They are referred to as HM (harmonic mean based prediction), $`(10,25\%)`$, and $`(20,50\%)`$. Moreover, we include our offline algorithm which has a perfect bandwidth prediction for the whole period of the video.

The results are shown in the three subplots of Fig. 7. Fig. 7-a plots the breakdown of the highest fetched layers of each chunk (“S” refers to skipped chunks). For example, for Baseline 1, $`26.5`$% of chunks are fetched only at the base layer quality (shown in light blue). The average playback rate (across all 50 traces) for each scheme is also marked in the plot. As shown, our schemes significantly outperform the three baseline algorithms by fetching more chunks at higher layers with fewer skips. Even when the prediction window is as short as 10 seconds, our scheme incurs negligible skips compared to Baseline 2 and 3, and yields an average playback bitrate that is $`\sim`$25% higher than Baseline 1. As the prediction window increases ($`W=20s`$ and $`pe=50\%`$), the layer distribution becomes very close to the offline scheme.

Fig. 7-b plots the CDF of the average playback rate of all the schemes across all traces. As shown, even with a prediction window as short as 10 seconds, our online scheme achieves playback rates that are the closest to those achieved by the offline scheme across the 50 traces. Another interesting observation from Fig. 7-b is that both variants of our algorithm (HM and ($`10`$,$`25\%`$)) outperform Baseline 1 in terms of average playback rate on every bandwidth trace. Also note that although Baseline 2 and 3 achieve higher playback rates than Baseline 1, they suffer from a large number of skips, as shown in Fig. 7-a.

Fig. 7-c plots, for each algorithm, the distribution of the layer switching rate (LSR), which is defined as $`\frac{1}{C*L}\sum_{i=2}^C |X(i)-X(i-1)|`$, where $`C`$ is the number of chunks, $`L`$ is the chunk duration, and $`X(i)`$ is the size of chunk $`i`$ (up to its fetched layer). Intuitively, LSR quantifies how frequently the playback rate changes, and ideally it should be minimized. Baseline 1 behaves very conservatively by first fetching the base layer of all chunks until the buffer is full. It therefore has lower layer switching rates, at the cost of lower playback rates. Our algorithms instead achieve reasonably low layer switching rates while being able to stream at the highest possible rate with no skips.
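The LSR definition translates directly into code. The sketch below is a minimal illustration; the function name and the example chunk sizes are hypothetical, and the unit of $`X(i)`$ is whatever unit the chunk sizes are given in.

```python
def layer_switching_rate(chunk_sizes, chunk_duration_s):
    """LSR = 1/(C*L) * sum_{i=2..C} |X(i) - X(i-1)|.

    chunk_sizes:      X(1..C), size of each chunk up to its fetched layer.
    chunk_duration_s: L, the chunk duration in seconds.
    """
    num_chunks = len(chunk_sizes)
    if num_chunks == 0:
        return 0.0
    total_change = sum(
        abs(cur - prev) for prev, cur in zip(chunk_sizes, chunk_sizes[1:])
    )
    return total_change / (num_chunks * chunk_duration_s)


# Four 2-second chunks with a single upward switch between chunks 2 and 3.
print(layer_switching_rate([1.2, 1.2, 3.0, 3.0], chunk_duration_s=2.0))
```

A schedule that never changes layer has an LSR of zero, and each switch contributes in proportion to the size jump it causes.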

We note that larger prediction windows can lead to better decisions even if the prediction has higher error. As long as the bandwidth prediction is unbiased, we see that higher prediction errors can be tolerated. Appendix 70 shows that crowdsourcing-based prediction is an unbiased predictor of the future bandwidth. Moreover, more results about the effect of the prediction error on the proposed algorithm are described in Appendix 66. Further, we show that the computational overhead of the proposed approach is low, as described in Appendix 54.

No-Skip Based Streaming

No-Skip based streaming results for different schemes: (a) layer distribution, (b) average playback rate, (c) total rebuffering time, and (d) layer switching rate.

We now evaluate the No-Skip based algorithm. We compare it with three state-of-the-art algorithms: the buffer-based algorithm (BBA) proposed by Netflix, a naive port of Microsoft’s Smooth Streaming algorithm for SVC, and a state-of-the-art slope-based SVC streaming approach. To ensure apples-to-apples comparisons, we adopt the same parameter configuration (2-minute buffer size and 1-second chunk size) and apply the algorithms to all our 50 traces. Before describing the results, we first provide an overview of the three algorithms we compare our approach with.

Netflix Buffer-based Approach (BBA) adjusts the streaming quality based on the playout buffer occupancy. Specifically, it is configured with lower and upper buffer thresholds. If the buffer occupancy is lower (higher) than the lower (upper) threshold, chunks are fetched at the lowest (highest) quality; if the buffer occupancy lies in between, the buffer-rate relationship is determined by a pre-defined step function. We use 40 and 80 seconds as the lower and upper thresholds. The quality levels are specified in terms of the SVC layers (“the highest quality” means up to the highest layer).
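The buffer-to-quality mapping can be sketched as follows. The text only states that a pre-defined step function is used between the two thresholds, so the evenly spaced steps over the intermediate layers below are an assumption made for illustration, as are the function name `bba_layer` and the zero-based layer index.

```python
def bba_layer(buffer_s, num_layers=4, lower_s=40.0, upper_s=80.0):
    """Map buffer occupancy to the highest layer to fetch (0 = base layer).

    Below `lower_s` the lowest layer is used and above `upper_s` the
    highest. In between, the intermediate layers are assigned in evenly
    spaced steps (an assumed shape for the step function).
    """
    if buffer_s < lower_s:
        return 0
    if buffer_s > upper_s:
        return num_layers - 1
    intermediate = num_layers - 2
    if intermediate <= 0:
        # Assumption: with no intermediate layer, stay at the base layer.
        return 0
    step_s = (upper_s - lower_s) / intermediate
    index = min(int((buffer_s - lower_s) / step_s), intermediate - 1)
    return 1 + index


for occupancy in (10, 45, 65, 90):
    print(occupancy, bba_layer(occupancy))
```

With four layers and thresholds of 40 and 80 seconds, this sketch selects EL1 for occupancies from 40 up to 60 seconds and EL2 from 60 to 80 seconds.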

Naive port of Microsoft Smooth Streaming for SVC (NMS) employs a combination of buffer occupancy and instantaneous bandwidth estimation for rate adaptation. NMS is similar to BBA in that it also uses the buffer occupancy level to determine the strategy. The difference, however, is that it also uses the instantaneous bandwidth estimate (as opposed to the long-term network quality prediction we use) to guide rate adaptation. As a result, it can, for example, fetch high-layer chunks without waiting for the buffer level to reach the threshold, as BBA must.

**Slope-based SVC Streaming** exploits an advantage of SVC over AVC: it can either download the base layer of a new chunk or increase the quality of a previously downloaded (but not yet played) chunk by downloading its enhancement layers. This is achieved by defining a slope function: the steeper the slope, the more backfilling is chosen over prefetching. Following the original paper’s recommendations, we empirically choose 2 slope levels (SB1: -7%, and SB2: -40%). We verified that these two settings provide good results compared to other slope configurations (going steeper than SB1 causes longer stall durations, and going flatter than SB2 lowers the playback rate).

The results are shown in the four subplots of Fig. 8. Fig. 8-a plots the layer breakdown. The average playback rate and the total rebuffering time (across all 50 traces) for each scheme are also marked. In terms of rebuffering time, our online schemes with crowdsourced bandwidth prediction achieve the lowest stall duration, even when the prediction window is as short as 20 seconds. On the other hand, NMS performs poorly at avoiding stalls: it runs into almost an hour of stalls (53 minutes). Moreover, all variants of our online algorithm, including HM, significantly outperform the other algorithms in fetching higher layers. For example, (20,50%) fetches only $`16\%`$ of the chunks at BL quality, which is 57%, 70%, 62%, and 58% fewer than BBA0, SB1, SB2, and NMS, respectively. Also, as the prediction window increases, the layer distribution becomes closer to that of the offline scheme, which incurs the shortest stall duration. Fig. 8-b and Fig. 8-c plot, for each algorithm, the distribution of the (per-trace) average playback rate and of the stall duration across all traces. The results are consistent with our findings from Fig. 8-a: our scheme achieves a high playback rate that is the closest to that of the very optimistic algorithm (NMS) while incurring stalls that are as infrequent as those of the very conservative algorithms (SB3 and BBA). Our algorithm therefore maintains a good trade-off between minimizing the stall duration and maximizing the average playback rate. Fig. 8-d plots, for each algorithm, the distribution of the layer switching rate (LSR, defined in §54.1). As in the skip based scenario, our schemes achieve a much lower LSR than the aggressive approach (NMS). The LSR can be reduced further, but at the cost of a reduced playback rate.

To conclude this section, we would like to point out the key points behind achieving better performance for our algorithm as compared to the baselines. First, incorporating chunk deadlines, bandwidth prediction, and buffer constraint into the optimization problem yields a better decision per chunk. Moreover, favoring the later chunks helps the algorithm avoid being overly optimistic now at the cost of running into skips later on. Finally, re-considering the decisions after the download of every chunk with the new updated bandwidth prediction helps make the algorithm self-adaptive and more dynamically adjustable to the network changes. The low complexity of the algorithm allows for re-running the algorithm and changing decisions on the fly.

Emulation over TCP/IP Network

To complement our simulation results, we have built an emulation testbed using C++ (about 1000 LoC) on Linux. The testbed consists of a client and a server. All streaming logic described in §32 is implemented on the client side, which fetches synthetic chunks from the server over a persistent TCP connection. We deploy our emulation testbed between a commodity laptop and a server interconnected using high-speed Ethernet (1Gbps link and 1ms RTT). We use Dummynet on the client side to replay a bandwidth profile by dynamically changing the available bandwidth every second. We also use the Linux tc tool to inject additional latency between the client and server.

Trace No.	1	2	3	4	5	6
Average rate (Mbps)	5.05	6.95	5.9	6.14	5.3	6.8
Standard deviation (Mbps)	4.3	6.65	4.7	5.25	3.84	7.02

LTE bandwidth traces

We next run the emulation experiment using six bandwidth traces, each 15 minutes long. These traces were collected on an LTE network on different drive routes (as described in Appendix 70). Table 2 shows the statistics of the bandwidth traces. Since the bandwidth of these traces is high, we used the following cumulative SVC rates: $`1.5Mbps`$ (BL), $`2.75Mbps`$ ($`EL_1`$), $`4.8Mbps`$ ($`EL_2`$), and $`7.8Mbps`$ ($`EL_3`$). We configure the end-to-end RTT to be 60ms, which roughly corresponds to the last-mile latency in today’s LTE networks. Meanwhile, we run the same bandwidth traces under identical settings using the simulation approach. Since all traces exhibit similar behavior, we explain the results of one bandwidth trace, so that we can show both the quality CDF and the playback quality over time.

Fig. 9-a compares the simulation and emulation results in terms of the qualities of fetched chunks, and Fig. 9-b compares the chunk quality distribution. As shown, the simulation and emulation results well cross-validate each other. Their slight difference in Fig. 9-a is mainly caused by the TCP behavior (slow start after idle) that may underutilize the available bandwidth.

Emulation vs simulation: (a) playback bitrate over time, (b) chunk quality distribution.

Real time algorithm simulation and results

To evaluate the algorithm, we first created an example scenario in which a user is moving along a road with a tunnel ahead. The user is downloading a 3-minute video. The moving scenario and the real bandwidth are shown in Fig. [fig:tunnelEx]. As seen from the figure, the user experiences 3 zones with an average bandwidth of about 1.8 Mbps. We considered the base layer and $`3`$ enhancement layers, with a startup delay of $`6`$ seconds. We compared our algorithm with $`3`$ naive strategies. Naive 1 works as follows: it fetches the base layer of all frames before it starts fetching the $`1^{st}`$ enhancement layer of those frames that can still receive it before their deadlines. Once it has fetched the $`1^{st}`$ enhancement layer, it moves on to the $`2^{nd}`$ enhancement layer, and so on. Naive 2 and naive 3 fetch all layers of frame $`i`$ before they start fetching any layer of frame $`j`$ when $`i < j`$.

Algo	#stalls	#b	#e1	#e2	#e3	APBR
naive1	0	117	13	4	41	1.375
naive2	40	0	0	0	135	1.645
naive3	29	31	0	0	115	1.59
SVC	0	8	107	0	60	1.76

Tunnel scenario results

Playback rate of the tunnel scenario

As Table 3 and the playback rates in Fig. 10 show, both naive 2 and naive 3 run into many stalls: 40 for naive 2 and 29 for naive 3. The reason is the very optimistic behavior of both algorithms, which deliver the current frames at the highest possible enhancement layer without considering the future. Delivering the current frame at the highest enhancement layer may use up bandwidth that is needed to deliver some of the next frames even at the base layer. Initially (during the $`1^{st}`$ minute) the available bandwidth was high and, in addition, there was a startup delay of $`6s`$, so naive 2 and naive 3 fetched all frames corresponding to the 1st minute at the highest enhancement layer (the $`3^{rd}`$ enhancement layer). Naive 2 and 3 work in such a way that, whenever bandwidth is available, they fetch the $`3^{rd}`$ enhancement layer of frame $`i`$ before the base layer of frame $`j`$, where $`i < j`$. However, with an average bandwidth of 1.8 Mbps, at which frames can on average be delivered at the $`2^{nd}`$ enhancement layer, delivering one frame at the $`3^{rd}`$ enhancement layer makes it highly likely that another frame will not be delivered at all. This is exactly what happened in the given scenario. After the user entered the tunnel, naive 2 and naive 3 started running into stalls, and most of the frames whose fetching time fell within the time spent in the tunnel were not delivered.

In contrast, naive 1 is so conservative that many frames are delivered at low quality even though the bandwidth allows for higher quality. Naive 1 does not run into stalls, but it delivers frames at the lowest average quality of $`1.375Mbps`$. It does not start fetching the $`1^{st}`$ enhancement layer of the remaining frames until it has fetched the base layer of all frames, and it does the same when it moves on to the $`2^{nd}`$ enhancement layer, if the bandwidth allows. As the playback figure shows, naive 1 played back the first 120s (2/3 of the video) at base layer quality even though the average bandwidth would allow for a higher playback quality.

SVC is less conservative than naive 1 and less optimistic than naive 2 and 3. It operates in between: it delivers the frames at the best possible quality while avoiding future stalls. It optimizes which frame should be delivered at which quality, considering the whole bandwidth profile and all frames. As the table shows, SVC has the highest average playback rate of $`1.76`$, which reflects the best utilization of the bandwidth, with $`0`$ stalls. Even though the bandwidth is high at the beginning, SVC does not fetch the frames at the highest quality level; instead, it smooths the quality over all frames. The playback figure shows that even a minute after the user has entered the tunnel, the playback quality of SVC is still stable and the user is still watching the frames at $`1^{st}`$ enhancement layer quality. Moreover, the quality jumps to the $`3^{rd}`$ enhancement layer when the user leaves the tunnel. We notice that naive 2 and 3 do not deliver about $`23\%`$ and $`18\%`$ of the frames, respectively, and naive 1 delivers about $`68\%`$ of the frames at base layer quality. SVC, however, delivers more than $`95\%`$ of the frames at a quality $`\geq`$ the $`1^{st}`$ enhancement layer.

Let us now consider a more realistic scenario in which the bandwidth is predicted with a prediction error of $`20\%`$, so that we can see how the prediction error affects the results. To do so, we simulate a video using the real and predicted bandwidth profiles shown in Fig. 11. We again compare the performance of our algorithm with naive 1, 2, and 3.

Real and predicted bandwidth for the tunnel scenario

We see from the results in Table 4 and Fig. 12 that, even with a $`20\%`$ prediction error, SVC still fetches the frames at the highest average quality, with a value of 1.755. In the online case, algorithm x with RED, described in section (), is used to help decide what to fetch in each time slot when the real bandwidth differs from the predicted one. A fetching policy is pre-decided according to the predicted bandwidth. In every time slot, however, algorithm x with RED modifies the decision if the real bandwidth is not equal to the predicted one: if $`BR(i) < BP(i)`$, it defers fetching frames, with a random probability of a quality drop, and if $`BR(i) \geq BP(i)`$, it fetches as much of the later frames as $`BR(i)`$ allows. Therefore, in real scenarios, the delivered quality might be lower than the pre-decided one because of the prediction error. For example, the playback figure shows a drop in the quality of frames $`140`$ and $`142`$. According to the predicted bandwidth, a decision was made to deliver all frames beyond the $`115^{th}`$ frame at the quality of the highest enhancement layer. However, because the real bandwidth was lower than the predicted one and there was not enough bandwidth to fetch frames $`140`$ and $`142`$ at the pre-decided quality before their deadlines, the decision was changed and both frames were fetched at $`2^{nd}`$ enhancement layer quality.

Algo	#stalls	#b	#e1	#e2	#e3	APBR
naive1	0	115	12	5	43	1.388
naive2	33	5	0	0	137	1.70
naive3	24	20	4	6	121	1.697
$`SVC`$	0	8	107	2	58	1.7553

Tunnel scenario with 20% prediction error results

Playback rate of the tunnel scenario with 20% prediction error

Real tunnel like scenario:
For this section, we collected a real trace that already exhibits a tunnel-like pattern, and we compared our algorithm’s performance with all 3 naive strategies. We assumed a prediction error of $`20\%`$; the real and predicted bandwidths are shown in Fig. 13. The figure shows 3 zones: 2 zones with fairly good average bandwidth and a zone in between where the bandwidth runs low (the tunnel-like scenario). Table 5 shows that SVC was able to deliver frames at the highest average quality (1.69) without running into stalls. Naive 2 and 3 come second in terms of average quality (1.47 each), but both run into stalls, with 24 stalls each. Naive 1 does not run into stalls, but it delivers the frames at the lowest average quality (1.37). From Fig. 14, we see that because naive 2 and 3 do not account for the future, they deliver the first few frames at the highest quality level, at the cost that most of the frames corresponding to the trough are not delivered. Naive 2 and 3 fail to deliver almost $`20\%`$ of the frames. In contrast, naive 1 is so conservative that it starts fetching frames at the $`1^{st}`$ enhancement layer very late (after about 1 minute). Moreover, it delivers about $`60\%`$ of the frames at the base layer. SVC operates in between: it jumps to the $`1^{st}`$ enhancement layer very early (from the $`1^{st}`$ frame). It does not go as high as naive 2 and 3 do because it accounts for later frames, and it is not as conservative as naive 1. It optimizes for the highest quality at which the current frame can be delivered without hurting later frames.

Real tunnel like bandwidth profile

Algo	#stalls	#b	#e1	#e2	#e3	APBR
naive1	0	76	10	5	24	1.37
naive2	24	20	3	7	61	1.47
naive3	24	19	5	8	59	1.47
$`SVC`$	0	1	84	19	11	1.69

Real tunnel like scenario with 20% prediction error results

Playback rate for real tunnel like scenario with 20% prediction error

We now consider a scenario in which the bandwidth oscillates between a level high enough to deliver frames at the highest quality (e.g., 2.5 Mbps) and a level too low to fetch frames even at base layer quality (e.g., 0.5 Mbps), which is a practical scenario for mobile users. To show how SVC compares with the naive strategies, we used the bandwidth profile shown in Fig. 15, with an average bandwidth of $`1.5`$, and we assumed a video of 2 minutes in length.

Oscillation bandwidth profile

Because of the oscillation, naive 2 and 3 are expected to run into many stalls, and that is exactly what Table 6 and Fig. 16 show. Naive 2 fails to deliver 37 frames, and naive 3 fails to deliver 36 frames. The reason is again the same: both naive 2 and 3 fetch frames at the highest quality level whenever possible (in the region around 2.5 Mbps), and when the bandwidth runs low (around 0.5 Mbps) they do not find enough bandwidth to deliver frames even at the base layer. Even though they deliver about $`50\%`$ of the frames at the quality of the $`1^{st}`$ enhancement layer or higher, they fail to deliver $`30\%`$ of the frames. Naive 1 is able to deliver all frames, but more than 70 seconds of the 120s video are played back at the base layer, and about $`60\%`$ of the frames are delivered at base layer quality. SVC does the best job among all strategies: it delivers frames at the highest average quality ($`1.544`$Mbps). Although it does not jump immediately to fetching frames at enhancement layers, it does not wait as long as naive 1 does. As shown in Fig. 16, about 20 seconds after the start, exactly where the bandwidth is very low, SVC starts playing back frames at $`1^{st}`$ enhancement layer quality. This is because the enhancement layers of the frames corresponding to this time were fetched early on. About $`80\%`$ of the frames are delivered at the quality of the $`1^{st}`$ enhancement layer or higher.

Algorithm	#stalls	#b	#e1	#e2	#e3	APBR
naive1	0	71	28	3	13	1.338
naive2	37	13	5	6	54	1.289
naive3	36	13	8	9	49	1.286
$`SVC`$	0	24	77	4	10	1.544

Oscillation scenario results

PlayBack rate for oscillation scenario

The last example was a short video assuming full knowledge of the bandwidth with no prediction error. To bring the scenario closer to reality, we now assume the same kind of oscillating bandwidth for a video of $`10`$ minutes length, using the bandwidth profile shown in Fig. 17. Moreover, we use the online version of the algorithm, in which the bandwidth can be predicted only for a window of $`W`$ seconds. We assume $`W=120`$ seconds (2 minutes) and recalculate the fetching policy every 30 seconds ($`\alpha=30`$). We also assume a prediction error (PE) that increases with the distance from the recalculation point $`k\alpha`$: the prediction error of the first 20 seconds, from $`k\alpha+1`$ to $`k\alpha+20`$, is $`10\%`$, the prediction error of the next 20 seconds is $`15\%`$, and so on, i.e., PE=[10 15 20 25 30 35].
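As a minimal sketch of the stepped prediction error described above, the following function maps each second of a prediction window to its error level in percent. The function and parameter names are hypothetical. How the error perturbs the predicted bandwidth (for example, its sign or distribution) is not specified in this section, so the sketch only builds the schedule.

```python
def prediction_error_profile(window=120, step=20, start=10, increment=5):
    """Prediction error (in percent) for offsets 1..window after k*alpha.

    The error is constant within each block of `step` seconds and grows by
    `increment` percentage points from one block to the next.
    """
    return [start + increment * ((offset - 1) // step)
            for offset in range(1, window + 1)]


pe = prediction_error_profile()
print(sorted(set(pe)))  # distinct error levels across the window
```

With the defaults, seconds 1 to 20 of the window carry a 10% error, seconds 21 to 40 carry 15%, and the last block (seconds 101 to 120) carries 35%.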

Long video, oscillating bandwidth profile scenario with prediction error

Table 7 compares the online SVC with $`W=120`$ and $`\alpha=30`$ ($`SVC\_30`$) with the 3 naive strategies. We see from table 7 and the playback rate in Fig. 18 that even though SVC does not have a prediction of the bandwidth for the full video duration up front, it still has the best average play rate (1.468Mbps), and it does not run into stalls. To reduce the effect of early decisions on later frames, $`\alpha`$ is chosen in practice to be much less than the window size $`W`$, so at every $`k*\alpha`$ the fetching policy is recalculated, and the decisions that were made for fetching frames from $`k*\alpha+1`$ to $`(k-1)*\alpha+W`$ can be changed at the recalculation point $`k*\alpha`$. We see from the play rate and the CDF figures that SVC delivers about $`75\%`$ of the frames in the $`1^{st}`$ enhancement layer. The performance of the online algorithm is expected to be lower than that of the offline version, but we still see better performance than the naive strategies in the oscillating scenario. From Fig. 18, we see that naive 2 and 3 run into many stalls; they do not deliver about $`30\%`$ of the frames. Naive 1 fetches most of the frames in the base layer, and only when the video reaches about 420 seconds of the 600 second video, which is about $`70\%`$ of the video time, does the quality change to the $`1^{st}`$ enhancement layer. This means that about $`70\%`$ of the frames are delivered in the base layer, and by the time naive 1 starts fetching higher layers, the video is reaching its end.

Algorithm	#stalls	#b	#e1	#e2	#e3	APBR
naive1	0	414	64	19	98	1.325
naive2	179	102	24	20	270	1.278
naive3	177	104	25	26	263	1.278
$`SVC`$	0	159	418	13	5	1.468

Long video, oscillating scenario results

PlayBack rate for long video oscillating scenario

In reality, scenarios might not match any of those described above. Therefore, in the next section we compare SVC with the naive strategies for real collected bandwidth profiles. We will see that even when running the online algorithm with prediction error, SVC still outperforms all naive strategies in many scenarios, especially when the bandwidth on average allows frames to be fetched in the $`1^{st}`$ enhancement layer (moderate bandwidth regime).

Real scenario simulation

We start with short videos and the offline algorithm (full non-causal knowledge of the bandwidth profile).

Offline algorithm

We simulated a video of $`2`$ minutes length at 4 average bandwidth levels for real collected bandwidth traces: a very good regime ($`Avg BW=2.5`$), a good regime ($`Avg BW=1.80`$), a moderate regime ($`Avg BW=1.52`$), and a bad regime ($`Avg BW=1.1`$). We used 2 bandwidth profiles. In the first profile, the bandwidth on average starts in the good regime and decreases with time; the other profile is the flip of the $`1^{st}`$ one. Both bandwidth profiles are shown in Fig. 19 (a) and (b).

Real Bandwidth Profiles

Bandwidth profile 1

In the very good regime (Avg BW=2.5Mbps), as table 8 and Fig. 20 (a) show, our algorithm does as well as the algorithm that fetches the higher layers of the current frame before fetching the base layer of the next frame. Because the average bandwidth is high, all algorithms deliver all frames before their deadlines and none of them stalls. However, the average delivered quality of SVC and naive 2 is the highest, with a value of 2.133. Both SVC and naive 2 deliver all $`115`$ frames at the $`3^{rd}`$ enhancement layer quality (the highest). As expected, the average quality of naive 1 is the lowest, with a value of 1.834, which means that the average delivered quality is the $`2^{nd}`$ enhancement layer. Naive 1 delivers about $`50\%`$ of the frames in a quality $`\leq 2^{nd}`$ enhancement layer. As previously mentioned, naive 1 does not start fetching the $`1^{st}`$ enhancement layer until it has fetched the base layer of all frames. By that time, many frames have already been played back in base layer quality, so their $`1^{st}`$ enhancement layer will not be fetched; instead, the algorithm starts fetching the $`1^{st}`$ enhancement layer of the frames that have not been played back yet.

In the good regime (Avg BW=1.8Mbps), where on average frames can be delivered in the $`2^{nd}`$ enhancement layer, naive 2 and 3 start running into stalls because they are very optimistic in delivering frames in higher enhancement layers, while our algorithm keeps balancing between moving to higher layers and avoiding stalls. Delivering the current frame in the highest enhancement layer may leave too little bandwidth to deliver the next frame even in the base layer. Whenever bandwidth is available, naive 2 and 3 give priority to fetching the $`3^{rd}`$ enhancement layer of frame $`i`$ over the base layer of frame $`j`$ when $`i < j`$. With an average bandwidth of 1.8Mbps, frames can on average be delivered in the $`2^{nd}`$ enhancement layer; therefore, if one frame is delivered in the $`3^{rd}`$ enhancement layer, there is a high chance that another one is not delivered at all. In contrast, naive 1 is so conservative that many frames are delivered in low quality even though the bandwidth allows for higher quality. SVC is less conservative than naive 1 and less optimistic than naive 2 and 3: it delivers the frames in the best possible average quality subject to avoiding stalls, optimizing over which frame should be delivered in which quality given the whole bandwidth profile and all frames. As table 8 and Fig. 20 (b) show, in the good regime SVC has the highest average quality, $`1.883`$, with $`0`$ stalls, and 108 of the $`115`$ frames are delivered in the $`2^{nd}`$ enhancement layer. Naive 1 does not run into stalls, but it has the lowest average quality, $`1.597`$, and more than $`50\%`$ of its frames are delivered in quality $`\leq 1^{st}`$ enhancement layer. Naive 2 and naive 3 have better average quality than naive 1 and deliver about $`50\%`$ of the frames in the highest enhancement layer, but they miss the deadlines of some frames, run into stalls, and show very high variability in the playback quality.

When on average frames can be delivered in the $`1^{st}`$ enhancement layer (Avg BW=1.5Mbps), as table 8 and Fig. 20 (c) show, SVC delivers frames in the highest average quality, $`1.593`$, and does not run into stalls. Naive 2 and 3 miss the deadlines of about $`27\%`$ and $`21\%`$ of the frames respectively. Naive 1 has no stalls but lower quality per frame compared with SVC: SVC delivers more than $`90\%`$ of the frames in $`1^{st}`$ enhancement layer quality, while naive 1 delivers more than $`50\%`$ of the frames in base layer quality. Naive 2 and 3 deliver some frames in the highest quality level, but at the cost of not delivering some other frames.

In conclusion, offline SVC does as well as the optimistic algorithms in the high bandwidth regime because of its prior knowledge of the full bandwidth profile and its optimization over what to fetch at what time. It also does well in the good regime, since it balances between being too optimistic and too conservative. The strength of the introduced algorithm shows in the regime where very optimistic algorithms run into many stalls and very conservative algorithms do not fully utilize the available bandwidth.
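The contrast between the conservative and the optimistic strategies comes down to the order in which (frame, layer) pairs are requested. The sketch below illustrates only that ordering, as described in the text: naive 1 fetches layer by layer, naive 2 frame by frame. It is a simplification under stated assumptions: deadlines, bandwidth, and the skipping of already played frames are not modelled, naive 3 is omitted because its exact rule is not given in this section, and the function names are hypothetical.

```python
def naive1_order(num_frames, num_layers):
    """Layer-major order: the base layer of every frame first,
    then each enhancement layer in turn (conservative)."""
    return [(frame, layer)
            for layer in range(num_layers)
            for frame in range(1, num_frames + 1)]


def naive2_order(num_frames, num_layers):
    """Frame-major order: every layer of frame i before the
    base layer of frame i+1 (optimistic)."""
    return [(frame, layer)
            for frame in range(1, num_frames + 1)
            for layer in range(num_layers)]


# Two frames, two layers (0 = base layer, 1 = first enhancement layer).
print(naive1_order(2, 2))
print(naive2_order(2, 2))
```

In the frame-major order, the top layer of frame 1 is requested before the base layer of frame 2, which is why a drop in bandwidth can leave later frames without even their base layer.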

Avg_BW	Algo	#stalls	#b	#e1	#e2	#e3	APBR
2.5Mbps	naive1	0	16	23	19	57	1.834
	naive2	0	0	0	0	115	2.133
	naive3	0	0	0	5	110	2.12
	SVC	0	0	0	0	115	2.133
1.8Mbps	naive1	0	35	42	7	31	1.597
	naive2	14	8	4	1	88	1.778
	naive3	7	15	11	18	64	1.77
	SVC	0	0	0	108	7	1.883
1.5Mbps	naive1	0	60	26	3	26	1.449
	naive2	33	13	5	0	64	1.377
	naive3	26	22	13	9	45	1.365
	SVC	0	2	112	1	0	1.593
1.25Mbps	naive1	0	86	11	6	12	1.27
	naive2	55	12	2	0	46	0.99
	naive3	50	20	8	4	33	0.97
	SVC	0	66	49	0	0	1.294

Offline algorithm results for Bandwidth Profile 1

Playback rate, offline, Bandwidth Profile 1

Bandwidth profile 2

In the case of bandwidth profile 2, shown in Fig. 19 (b), the low bandwidth regime comes first and the bandwidth improves on average as time goes on. As shown in table 9 and Fig. 21 (a), at high average bandwidth (2.5Mbps) SVC and naive 1 do not run into stalls. SVC has the highest average quality, $`1.91`$, and most of its frames are delivered in the $`2^{nd}`$ enhancement layer: 80 out of 115 frames are delivered in the $`2^{nd}`$ enhancement layer and 27 in the highest quality level. Even though the average bandwidth is high, naive 2 and naive 3 run into stalls because of the low bandwidth at the beginning. Naive 1 delivers about $`50\%`$ of the frames in base layer quality. Naive 2 and naive 3 deliver 86 and 74 of the 115 frames in the highest quality level, but at the cost of not delivering some other frames.

In the case of an average Bw of $`1.825 Mb/s`$, table 9 and Fig. 21 (b) show that, because the low bandwidth comes first, naive 2 and naive 3 fail to deliver more than $`25\%`$ of the frames even though the average bandwidth is as high as 1.825 Mb/s. Naive 1 delivers all frames at an average quality of $`1.28`$, with more than $`75\%`$ of the frames delivered in base layer quality. SVC has the highest average quality, $`1.523`$, and more than $`60\%`$ of its frames are delivered in quality $`\geq 1^{st}`$ enhancement layer.

In the case of an average Bw of $`1.544 Mb/s`$, naive 2 and naive 3 fail to deliver about $`40\%`$ of the frames. Naive 1 delivers all frames at an average quality of $`1.24`$, with more than $`80\%`$ of the frames delivered in base layer quality. SVC again has the highest average quality, $`1.359`$, and more than $`30\%`$ of its frames are delivered in quality $`\geq 1^{st}`$ enhancement layer.

For the very low regime (average Bw of $`1.25 Mb/s`$), all of the algorithms run into stalls. However, SVC has the lowest number of stalls, with 21 non-delivered frames, which is less than half the number of stalls of naive 1 and about 1/3 of that of naive 2 and naive 3. SVC also delivers frames in the highest average quality, $`1.09`$. In conclusion, in this scenario naive 2 and 3 experience more stalls in the beginning, even at high average bandwidth, because of their optimistic behavior. At high average bandwidth, SVC and naive 1 do not run into stalls, but at low average bandwidth naive 1 may keep fetching frames that will not meet their deadlines, while SVC skips fetching those frames and thereby reduces the number of non-delivered frames.

Avg_BW	Algo	#stalls	#b	#e1	#e2	#e3	APBR
2.5Mbps	naive1	0	59	23	5	28	1.467
	naive2	11	7	4	7	86	1.829
	naive3	8	11	8	14	74	1.813
	SVC	0	0	8	80	27	1.91
1.8Mbps	naive1	0	90	4	0	21	1.280
	naive2	33	22	3	4	53	1.293
	naive3	31	26	3	8	47	1.284
	SVC	0	43	45	1	26	1.523
1.5Mbps	naive1	0	95	2	1	17	1.240
	naive2	46	19	4	6	40	1.071
	naive3	44	23	5	7	36	1.064
	SVC	0	77	12	2	24	1.359
1.25Mbps	naive1	29	69	4	1	12	0.934
	naive2	63	14	3	3	32	1.813
	naive3	62	16	4	6	27	0.802
	SVC	12	77	2	1	23	1.185

Offline algorithm results for Bandwidth Profile 2

Playback rate, offline, Bandwidth Profile 2

Online algorithm

We start with evaluating the effect of $`\alpha`$ and the prediction error on the performance of our algorithm, and then we compare our algorithm with the 3 naive strategies.
Effect of $`\alpha`$: The optimal policy is recalculated every $`\alpha`$ seconds, where $`\alpha \leq W`$. Choosing $`\alpha \ll W`$ leads to a better fetching policy: at every $`k*\alpha`$ the fetching policy is recalculated, and the quality decisions made for the frames in the overlap between the current and previous windows (frames from $`k*\alpha+1`$ to $`(k-1)*\alpha+W`$) can be changed at the recalculation point $`k*\alpha`$. To evaluate the effect of $`\alpha`$, we simulated the online algorithm downloading a video of 600 seconds length. We assumed a prediction window of 2 minutes ($`W=120`$) and an average bandwidth of $`avgBw=1.2`$. Moreover, we used bandwidth profile 3, in which the bandwidth starts at higher values and decreases over time. With $`\alpha=10`$ all frames are delivered, and as $`\alpha`$ increases we start to see more stalls. The reason is that with a larger $`\alpha`$ more frames of the current window are delivered in high quality before the next window is taken into account, so the algorithm is more optimistic about current frames and less conservative about the future. After the frames of the current predicted window have been scheduled, algorithm 3 is used to fetch later frames in the highest quality level if bandwidth is still available.
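To make the overlap zone concrete, the sketch below computes, for recalculation index $`k`$, the seconds covered by the current window and the seconds that were already planned at $`(k-1)\alpha`$ and can therefore be revised at $`k\alpha`$. The function names are illustrative only; the index arithmetic follows the ranges stated in the text.

```python
def window(k, alpha, W):
    """Seconds covered by the plan computed at recalculation point k*alpha."""
    return (k * alpha + 1, k * alpha + W)


def overlap(k, alpha, W):
    """Seconds planned at (k-1)*alpha that can be revised at k*alpha.

    Returns an inclusive (start, end) range, or None when consecutive
    windows do not overlap (alpha >= W).
    """
    start, end = k * alpha + 1, (k - 1) * alpha + W
    return (start, end) if start <= end else None


print(window(1, 30, 120))   # plan made at t = 30
print(overlap(1, 30, 120))  # part of the previous plan that can change
print(overlap(1, 120, 120)) # no overlap when alpha equals W
```

The overlap spans $`W-\alpha`$ seconds, so a smaller $`\alpha`$ leaves a larger share of each window open to revision once newer predictions arrive.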

Number of stalls vs α

Effect of prediction error: To evaluate the effect of the prediction error on the online algorithm, we simulated a video of 10 minutes length. We assumed a prediction window of 2 minutes ($`W=120`$), recalculation every 30 seconds ($`\alpha=30`$), and an average bandwidth of $`avgBw=1.2`$. Moreover, we used bandwidth profile 3, in which the bandwidth starts at higher values and decreases over time. We tested the algorithm for prediction errors of $`10\%`$, $`20\%`$, $`30\%`$, and $`40\%`$.

To compare the performance of our algorithm with the 3 naive strategies, we simulated a video of $`10`$ minutes length using bandwidth profiles 3 and 4, which are shown in Fig. 23. We assume a bandwidth prediction of $`2`$ minutes ahead (window size of $`120`$ seconds), and we chose $`\alpha=20`$ seconds. We considered a prediction error (PE) that increases with the distance from the recalculation point $`k\alpha`$: the prediction error of the first 20 seconds, from $`k\alpha+1`$ to $`k\alpha+20`$, is $`10\%`$, the prediction error of the next 20 seconds is $`15\%`$, and so on, i.e., PE=[10 15 20 25 30 35].

Real Bandwidth Profiles 3 and 4

Bandwidth profile 3

In the very good regime (average Bw of $`2.5 Mb/s`$), as table 11 and Fig. 25 (a) show, our algorithm does as well as naive 2 and 3, even though the low bandwidth comes later and even with the prediction error specified above. Because the average bandwidth is high, all algorithms deliver all frames and none of them stalls. SVC and naive 2 have the highest average play rate; both deliver all the frames at the highest quality level. As expected, the average quality of naive 1 is the lowest, with a value of 1.79, which means that the average delivered quality is the $`2^{nd}`$ enhancement layer. With naive 1, 90% of the frames are delivered in quality $`Q \geq 1.067`$, but $`50\%`$ are delivered in quality $`\leq 1.867`$, which means that more than $`50\%`$ of the frames are delivered in quality $`\leq 2^{nd}`$ enhancement layer. In contrast, all frames with SVC and naive 2, and almost all frames with naive 3, are delivered in the highest ($`3^{rd}`$) enhancement layer.

In the case of an average Bw of $`1.846 Mb/s`$, as expected, once the average bandwidth goes below the point at which all frames can be delivered in the highest quality level, naive 2 and naive 3 start to run into stalls. As table 11 and Fig. 25 (b) show, SVC has the highest average quality, $`1.854`$, and $`90\%`$ of its frames are delivered in quality $`\geq 1^{st}`$ enhancement layer. Naive 1 does not run into stalls, but it has the lowest average quality, $`1.554`$, and it delivers more than $`50\%`$ of the frames in quality $`\leq 1^{st}`$ enhancement layer. Naive 2 and naive 3 have better average quality than naive 1, but they fail to deliver some frames.

When the average Bw is $`1.562 Mb/s`$, where on average frames can be delivered in the $`1^{st}`$ enhancement layer, table 11 and Fig. 25 (c) show that SVC delivers frames in the highest average quality, $`1.564`$, without running into stalls. Naive 2 and 3 fail to deliver more than $`25\%`$ of the frames. Naive 1 has no stalls but lower quality per frame compared with SVC: SVC delivers more than $`50\%`$ of the frames in quality $`\geq 1^{st}`$ enhancement layer, while naive 1 delivers more than $`50\%`$ of the frames in base layer quality. Naive 2 and 3 deliver some frames in the highest quality level, but at the cost of not delivering a high percentage of the frames.

At very low average bandwidth (average Bw of $`1.248 Mb/s`$), and specifically when the very low bandwidth comes later, the online SVC optimizes over the current window without knowing that the bandwidth is going to run low in the next few windows. It therefore prioritizes fetching higher enhancement layers of current frames over the base layer of later frames, so when the bandwidth runs low later, later frames might not be completely fetched within their window. Therefore, as shown in table 11 and Fig. 25 (d), SVC cannot beat naive 1 in terms of fetching all frames at this very low average bandwidth under this bandwidth profile. However, if it is known up front that the average bandwidth is very low, the greedy part of our algorithm can do better than naive 1 in terms of avoiding stalls. Intuitively, the OptBLSchedul algorithm uses predictability to skip frames that would not meet their deadlines, so the number of non-fetched frames is minimized. Table 11 and Fig. 25 show that naive 1 delivers frames in the highest average quality, 1.25, and does not run into stalls. Naive 2 and 3 fail to deliver about $`50\%`$ of the frames. Online SVC with $`W=120`$ and $`\alpha=20`$ does not completely deliver about $`18\%`$ of the frames.

Avg_BW	Algo	#stalls	#b	#e1	#e2	#e3	APBR
2.5Mbps	naive1	0	106	133	81	275	1.79
	naive2	0	0	0	0	595	2.133
	naive3	0	6	7	27	555	2.12
	SVC	0	0	0	0	595	2.133
1.8Mbps	naive1	0	220	187	38	150	1.554
	naive2	84	36	11	4	460	1.755
	naive3	45	73	70	73	334	1.745
	SVC	0	41	220	18	316	1.854
1.5Mbps	naive1	0	330	124	20	121	1.421
	naive2	186	64	11	6	328	1.339
	naive3	152	104	56	39	244	1.334
	SVC	0	263	108	1	223	1.564
1.25Mbps	naive1	0	456	59	24	56	1.25
	naive2	301	43	12	10	229	0.96
	naive3	282	70	32	26	185	0.95
	SVC	40	374	32	26	123	1.27

Online algorithm results for Bandwidth Profile 3

Playback rate, online algorithm, and bandwidth profile (3)

Bandwidth profile 4

In the case of an average Bw of $`2.5 Mb/s`$, as shown in the results table and playback rate figure for bandwidth profile 4, online SVC and naive 1 do not run into stalls. SVC has the highest average quality, $`1.8`$, and about $`90\%`$ of its frames are delivered in quality $`\geq 1^{st}`$ enhancement layer. Naive 1 delivers more than $`50\%`$ of the frames in base layer quality. Naive 2 and naive 3 fail to deliver about $`15\%`$ of the frames.

In the case of an average Bw of $`1.8 Mb/s`$, where the bandwidth starts very low, all algorithms run into stalls even though the average bandwidth is as high as 1.8 Mb/s. As the same table and figure show, naive 2 and 3 fail to deliver about $`30\%`$ of the frames. Online SVC and naive 1 have the lowest numbers of stalls, and on average SVC delivers frames in higher quality than naive 1, with values of 1.382 and 1.23 for SVC and naive 1 respectively.

Avg_BW	Algo	#stalls	#b	#e1	#e2	#e3	APBR
2.5Mbps	naive1	0	356	83	21	135	1.41
	naive2	93	48	21	17	416	1.687
	naive3	86	43	35	64	367	1.687
	SVC	0	70	169	120	236	1.802
1.8Mbps	naive1	19	458	14	3	101	1.230
	naive2	200	104	26	13	252	1.20
	naive3	193	109	41	23	229	1.198
	SVC	12	345	91	17	130	1.382
1.5Mbps	naive1	250	263	26	5	51	0.74
	naive2	362	66	17	15	135	0.695
	naive3	361	68	17	21	128	0.692
	SVC	165	301	6	22	101	0.987
1.25Mbps	naive1	250	263	26	5	51	0.74
	naive2	362	66	17	15	135	0.695
	naive3	361	68	17	21	128	0.692
	SVC	165	301	6	22	101	0.987

Online algorithm results for Bandwidth Profile 4

Playback rate, online algorithm, and bandwidth profile (4)

Introduction

Mobile video has emerged as a dominant contributor to cellular traffic. It already accounts for around $`40-55`$ percent of all cellular traffic and is forecast to grow by around 55 percent annually through 2021. While its popularity is on the rise, delivering high quality streaming video over cellular networks remains extremely challenging. In particular, the video quality under challenging conditions such as mobility and poor wireless channel is sometimes unacceptably poor. Almost every viewer at some point in time can relate to experiences of choppy videos and stalls.

Not surprisingly, a lot of attention from both research and industry in the past decade has focused on the development of adaptive streaming techniques for video on demand that can dynamically adjust the quality of the video being streamed to the changes in network conditions. Such a scheme has 2 main components. On the server side, the video is divided into multiple chunks (segments), each containing data corresponding to some playback time (e.g., 4 sec), and each chunk is then encoded at multiple resolutions/quality levels (each with different bandwidth requirements). During playtime, an entity (typically the player) dynamically switches between the different available quality levels as it requests the video over the network. The adaptation is based on many factors such as the network condition, its variability, and the client buffer occupancy. This results in a viewing experience where different chunks of the video might be streamed at different quality levels.

In the predominant adaptive coding technique in use today, each video chunk is stored as $`L`$ independently encoded versions. An example of such a technique is H.264/MPEG-4 AVC (Advanced Video Coding), which was standardized in 2003. During playback, when fetching a chunk, an Adaptive Bit Rate (ABR) streaming technique such as MPEG-DASH (Dynamic Adaptive Streaming over HTTP) needs to select one of the $`L`$ versions based on its judgement of the network condition and the other aforementioned factors.

AVC vs SVC Encoding

An alternative encoding scheme is Scalable Video Coding (SVC), which was standardized in 2007 as an extension to H.264. In SVC, a chunk is encoded into ordered layers: one base layer (Layer 0) with the lowest playable quality, and multiple enhancement layers (Layer $`i>0`$) that further improve the chunk quality based on layer $`i-1`$. When downloading a chunk, an Adaptive-SVC streaming logic must fetch all layers from 0 to $`i-1`$ if it decides to fetch layer $`i`$. In contrast, in AVC, different versions (qualities) of chunks are independent, as illustrated in Fig. 28.

There are three typical modes of scalability, namely temporal (frame rate), spatial (spatial resolution), and quality (fidelity, or signal-to-noise ratio). SVC encoding, however, incurs an additional overhead that depends on the mode of scalability. For example, it has been shown that there is minimal or no loss in coding efficiency with temporal scalability. Temporal scalability is also backward compatible with existing H.264 decoders and is simple to implement compared to other forms of scalability. However, temporal scalability has limitations, such as being visually unpleasing at low base layer rates, which motivates the use of other scalability modes that require more overhead. Appendix 55 describes some common scenarios where Adaptive-SVC streaming can be beneficial.

Motivating example: network condition prediction can improve streaming quality under mobility.

To motivate our problem, imagine a scenario where a mobile user starts a trip from point A to point B (see Fig. 27, anonymized with randomly chosen locations). As the user enters the destination into the GPS application, she gets the route information, and the video player obtains estimates of the bandwidth availability along the chosen path. The bandwidth estimates can be obtained from crowd-sourced measurements of other users who recently travelled the same route, as we will show in Appendix 70. We will demonstrate that access to such information can help the player make significantly better informed decisions in its adaptation logic. For example, if the player is aware that it is about to traverse a region with low bandwidth, it can switch to fetching the video at a lower quality to minimize the possibility of stalling. Another method of predicting the future bandwidth that has been widely used in the literature is harmonic mean based prediction, which uses the harmonic mean of the throughput over the past few seconds to predict the bandwidth for the next few seconds.
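Harmonic mean based prediction can be sketched in a few lines. The example below is only an illustration: the function name, the number of past samples, and the sample values are assumptions, not values taken from the paper.

```python
def harmonic_mean_prediction(samples):
    """Predict near-future bandwidth as the harmonic mean of recent
    throughput samples (all in the same unit, e.g. Mbps)."""
    if not samples or any(s <= 0 for s in samples):
        raise ValueError("need at least one sample, all strictly positive")
    return len(samples) / sum(1.0 / s for s in samples)


# Hypothetical throughput measured over the last few seconds, in Mbps.
recent = [2.4, 0.6, 2.5, 0.5, 2.3]
print(harmonic_mean_prediction(recent))
```

Because the harmonic mean is dominated by the small samples, a few seconds of poor throughput pull the prediction down more than they would pull down an arithmetic mean, which makes the estimate conservative when bandwidth fluctuates.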

In this paper, we first theoretically formulate the problem of adaptive-SVC video streaming with the knowledge of future bandwidth. We consider two streaming schemes: skip based and no-skip based streaming. The former is usually for real-time streaming, in which there is a playback deadline for each of the chunks, and chunks not received by their respective deadlines are skipped. For no-skip based streaming, if a chunk cannot be downloaded by its deadline, it will not be skipped; instead, a stall (re-buffering) will occur and the video will pause until the chunk is fully downloaded. In both variants, the goal of the proposed scheduling algorithm is to determine up to which layer we need to fetch for each chunk (except for those skipped in real-time streaming), such that the overall quality-of-experience (QoE) is maximized and the number of stalls or skipped chunks is minimized. The key contributions of the paper are described as follows.

A novel metric of QoE is proposed for SVC streaming in both scenarios (skip and no-skip). The metric is a weighted sum of the layer sizes for each chunk. Since the user’s QoE is concave in the playback rate, the higher layers contribute less to the QoE than the lower layers. Thus, the weights decrease with the layer index, modeling the diminishing returns of higher layers.

We show that even though the proposed problem is a non-convex optimization problem with integer constraints, it can be solved optimally using an algorithm with a complexity that is linear in the number of chunks. The proposed algorithm, “Layered Bin Packing" (LBP) Adaptive Algorithm, proceeds layer-by-layer, tries to efficiently bin-pack all chunks at a layer and provides maximum bandwidth to the next layer’s decisions given the decisions of the lower layers of all the chunks.

We propose an online robust adaptive-SVC streaming algorithm (Online LBP). This algorithm exploits the prediction of the network bandwidth for some time ahead, solves the proposed optimization problem to find the quality decisions for $`W`$ chunks ahead, and re-runs every $`\alpha`$ seconds to adjust to prediction errors and find quality decisions for more chunks ahead.

We considered two techniques of bandwidth prediction. The first is harmonic mean based prediction, which has been widely used in the literature, where the harmonic mean of the past few seconds is used to predict the bandwidth for a few seconds ahead (typically 20 seconds ahead). The second is crowd-sourced erroneous bandwidth prediction, where bandwidth profiles experienced by people who recently travelled the same road are used to predict the bandwidth for the current user.

Trace-driven simulation using datasets collected from commercial cellular networks demonstrates that our approach is robust to prediction errors, and works well with short prediction windows ( seconds). The proposed approach is compared with a number of adaptation strategies including slope based SVC streaming , Microsoft’s smooth streaming algorithm (adapted to streaming SVC content), and Netflix’s buffer-based streaming algorithm (BBA-0) (adapted to SVC).

The results demonstrate that our algorithm outperforms the state-of-the-art by improving key quality-of-experience (QoE) metrics such as the playback quality, the number of layer switches, and the number of skips or stalls.

In addition to the simulations, we built a testbed that streams synthetic SVC content over TCP/IP networks using real LTE traces. We then implemented our streaming algorithm on the testbed and evaluated it under challenging network conditions. The emulation outcome is very close to the simulation results and incurs very low run-time overhead, further confirming that our algorithm can be practically implemented and deployed on today’s mobile devices.

System Model

We consider the problem of adaptively streaming an SVC video. An SVC encoded video is divided into $`C`$ chunks (segments) and stored at a server. Every chunk is of length $`L`$ seconds, and is encoded in Base Layer (BL) with rate $`r_0`$ and $`N`$ enhancement layers ($`E_1, \cdots, E_{N}`$) with rates $`r_1, \cdots, r_{N}`$ $`\in`$ $`{\mathcal R} \triangleq \{0,r_0,r_1,\cdots, r_{N}\}`$. We assume that each layer is encoded at constant bit rate (CBR). In other words, all chunks have the same $`n`$th layer size. Let the size of the $`n`$-th layer of chunk $`i`$ be $`Z_{n,i} \in {\mathcal Z}_n \triangleq \{0,Y_n\}`$, where $`Y_n=L \times r_n`$. Let the size of a chunk that is delivered at the $`n`$-th layer quality be $`X_n(i)`$, where $`X_n(i)=\sum_{m=0}^n Y_m`$.
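The size definitions above can be checked with a small numerical sketch. The chunk length and layer rates below are hypothetical values chosen for illustration; the function names are likewise not from the paper. The code computes the per-layer sizes $`Y_n = L \times r_n`$ and the cumulative sizes $`X_n = \sum_{m=0}^n Y_m`$.

```python
from itertools import accumulate


def layer_sizes(chunk_seconds, layer_rates):
    """Y_n = L * r_n: size of the n-th layer of any chunk (CBR assumption)."""
    return [chunk_seconds * rate for rate in layer_rates]


def cumulative_sizes(chunk_seconds, layer_rates):
    """X_n = Y_0 + ... + Y_n: size of a chunk delivered up to layer n."""
    return list(accumulate(layer_sizes(chunk_seconds, layer_rates)))


# Hypothetical example: 2-second chunks, base layer plus two enhancement layers.
L = 2
rates = [1.0, 0.5, 0.25]  # r_0, r_1, r_2 in Mbps (illustrative values)
print(layer_sizes(L, rates))       # Y_0, Y_1, Y_2 in Mb
print(cumulative_sizes(L, rates))  # X_0, X_1, X_2 in Mb
```

The cumulative sizes reflect the layer dependency of SVC: delivering a chunk at layer $`n`$ costs the sum of all layers up to $`n`$, not only the size of layer $`n`$ itself.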

Let $`z_n(i,j)`$ be the size of layer $`n`$ of chunk $`i`$ that is fetched at time slot $`j`$, and $`x(i,j)`$ be what is fetched of all layers of chunk $`i`$ at time slot $`j`$, i.e., $`x(i,j)=\sum_{n=0}^N z_n(i,j)`$. Further, let $`B(j)`$ be the available bandwidth at time $`j`$. For the offline algorithm, we assume the bandwidth can be perfectly predicted. Also let $`s`$ be the startup delay and $`B_m`$ be the playback buffer size in time units (the playout buffer can hold up to $`B_m`$ seconds of video content). We assume all time units are discrete and the discretization time unit is 1 second (which can be scaled based on the time granularity). Since the chunk size is $`L`$ seconds, the buffer occupancy increases by $`L`$ seconds when chunk $`i`$ starts downloading (we reserve the buffer as soon as the chunk starts downloading).

The optimization framework can run at either the client or the server side as long as the required inputs are available. A setup where the algorithm is run at the client side is depicted in Fig. 28. The algorithm takes as input the predicted bandwidth for the time corresponding to the next $`C`$ chunks, the layer sizes ($`Y_0,...,Y_N`$), the startup delay ($`s`$), and the maximum buffer size $`B_m`$, and outputs the layers that can be requested for the next $`C`$ chunks ($`Z_{n,i}, i \in \{1,..C\}, n \in \{0,...,N\}`$). The video chunks will be fetched according to the requested policy and in order. For the online algorithm, this process repeats every $`\alpha`$ seconds, and decisions can be changed on the fly since the proposed algorithm adapts to the prediction error.

System Model

We consider two scenarios: skip based streaming and no-skip based streaming. For skip based streaming, the video is played with an initial start-up (buffering) delay of $`s`$ seconds, and each chunk has a playback deadline: chunk $`i`$ needs to be downloaded by time $`deadline(i)`$. Chunks not received by their respective deadlines are skipped. No-skip based streaming also has a start-up delay. However, if a chunk cannot be downloaded by its deadline, it is not skipped. Instead, a stall (rebuffering) occurs and the video pauses until the chunk is fully downloaded. In both scenarios, the goal of the scheduling algorithm, detailed next, is to determine up to which layer to fetch for each chunk (except for those skipped) such that the number of stalls or skipped chunks is minimized as the first priority, the overall playback bitrate is maximized as the second priority, and the number of quality switches between neighboring chunks is minimized as the third priority. Similar to many other studies on DASH video streaming, this paper does not consider the mean opinion score (MOS) metric, since MOS ratings are video-dependent and obtaining them is time-consuming and expensive, as it requires recruiting human assessors. A table of the notation used in this paper is included in Appendix 37.

Adaptive SVC Streaming

We describe the algorithms for adaptively streaming SVC videos, with qualities decided based on the predicted bandwidth. We consider two scenarios: skip based streaming and no-skip based streaming. In skip based streaming, the video is played after an initial start-up (buffering) delay of $`s`$ seconds, and each chunk has a playback deadline: chunk $`i`$ needs to be downloaded by time $`deadline(i)`$. Chunks not received by their respective deadlines are skipped. No-skip streaming also has a start-up delay. However, if a chunk cannot be downloaded by its deadline, it is not skipped. Instead, a stall (rebuffering) occurs: the video pauses until the chunk is fully downloaded. In both scenarios, the goal of the scheduling algorithm is to determine up to which layer to fetch each chunk (except for those skipped), such that the number of stalls or skipped chunks is minimized as the first priority, the overall playback bitrate is maximized as the second priority, and the number of quality switches between neighboring chunks is minimized as the third priority. This paper does not consider the mean opinion score (MOS) metric, since MOS ratings are video-dependent and are time-consuming and expensive to obtain, as they require recruiting human assessors.

We now detail the adaptive SVC streaming algorithms. We describe the basic formulation for skip-based streaming in §60.1. We then identify the particular problem structure in our formulation and strategically leverage that to design a linear-time solution in §60.2 and §32.3. We prove the optimality of our solution in §32.4. An example of the algorithm is given in Appendix 65, and detailed proofs are in Appendix 40. We then extend the basic scheme to its online version in §32.6 and to no-skip based streaming in §32.7 (with detailed algorithm in Appendix 60, example in Appendix 67, and proofs in Appendix 68).

Skip Based Streaming: Offline Problem Formulation

Given the settings described in §31, we first formulate an offline optimization problem. It jointly (i) minimizes the number of skipped chunks, (ii) maximizes the average playback rate of the video, and (iii) minimizes the quality changes between neighboring chunks to ensure that the perceived quality is smooth. We give a higher priority to (i) than to (ii), since skips cause more quality-of-experience (QoE) degradation than playing back at a lower quality. Further, (iii) has the lowest priority among the three objectives. The proposed formulation maximizes a weighted sum of the layer sizes. The weights act along two directions. The first is across time, where the layers of later chunks are weighted higher using a factor $`\beta\ge 1`$. The second is across layers, where fetching the $`n`$-th layer of a chunk achieves a utility that is $`0<\gamma<1`$ times the utility achieved by fetching the $`(n-1)`$-th layer. Thus, the objective is given as $`\sum_{n=0}^{N}\gamma^n\sum_{i=1}^{C}\beta^i Z_{n,i}`$. We further assume that

\begin{equation}
\gamma^a r_a > \sum_{k=a+1}^N\gamma^kr_k \sum_{i=1}^C \beta^i \  \text{  for } a= 0, \cdots, N-1. \label{basic_gamma_1}
\end{equation}

This choice of $`\gamma`$ implies that all the layers above layer $`a`$, summed over all chunks, have lower utility than layer $`a`$ of a single chunk, for all $`a`$. For $`a=0`$, this implies that all the enhancement layers together have less utility than a chunk at the base layer. Thus, the avoidance of skips is the highest priority. The use of $`\gamma`$ helps prioritize lower layers over higher layers and models the concavity of user QoE with playback rate. Due to this weight, the proposed algorithm avoids skips as the first priority and does not use bandwidth to fetch higher layers at the expense of the base layer. The same holds at the higher layers. The combination of the two weights helps minimize multi-layer quality switches between neighboring chunks, since the use of $`\gamma`$ discourages getting higher layers at the expense of lower layers. We assume $`\beta=1+\epsilon`$, where $`\epsilon > 0`$ is a very small number. The use of $`\beta= 1+\epsilon`$ helps in three respects: (i) it makes the optimal layer decisions for different chunks unique, (ii) it gives better adaptability to bandwidth fluctuations by preferring to fetch higher layers of later chunks, and (iii) it reduces quality variations. Indeed, if the playback buffer is not limited, there will ideally be a few jumps of quality increase and no quality decrease in the playback of the chunks using this metric. An example to further explain the objective and the above-mentioned points for $`\gamma`$ and $`\beta`$ is provided in Appendix 63.
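The weighted objective and the assumption on $`\gamma`$ can be checked numerically. The following is a minimal sketch, not part of the proposed algorithm: the function names `objective` and `gamma_condition_holds` are hypothetical, `Z[n][i-1]` holds $`Z_{n,i}`$, and `r[n]` holds the layer rate $`r_n`$.

```python
def objective(Z, gamma, beta):
    """Weighted sum: sum_n gamma^n * sum_i beta^i * Z[n][i-1] (chunks are 1-indexed)."""
    return sum(
        gamma ** n * sum(beta ** i * z for i, z in enumerate(layer, start=1))
        for n, layer in enumerate(Z)
    )


def gamma_condition_holds(r, gamma, beta, C):
    """Check gamma^a * r_a > (sum_{k>a} gamma^k * r_k) * (sum_{i=1}^C beta^i) for a = 0..N-1."""
    beta_sum = sum(beta ** i for i in range(1, C + 1))
    N = len(r) - 1
    return all(
        gamma ** a * r[a]
        > sum(gamma ** k * r[k] for k in range(a + 1, N + 1)) * beta_sum
        for a in range(N)
    )


# Illustrative values only: three chunks, one base and one enhancement layer.
all_base = [[1, 1, 1], [0, 0, 0]]   # every base layer fetched, no enhancement
one_skip = [[0, 1, 1], [0, 1, 1]]   # chunk 1 skipped, enhancement on the rest
print(objective(all_base, 0.1, 1.0) > objective(one_skip, 0.1, 1.0))
```

With a $`\gamma`$ that satisfies the condition, the schedule without skips scores higher than the one that trades a base layer for enhancement layers, which is the priority ordering described above.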

Overall, the SVC layer scheduling problem with knowledge of the future bandwidth can be formulated as follows, where $`{\bf I}(.)`$ is an indicator function that takes the value $`1`$ if the expression inside it holds and $`0`$ otherwise.

\begin{eqnarray}
\textbf{Maximize: }\Bigg(\sum_{n=0}^{N}\gamma^n\sum_{i=1}^{C}\beta^i Z_{n,i}\Bigg)
\label{equ:eq1}
\end{eqnarray}

subject to

\begin{eqnarray}
\sum_{j=1}^{(i-1)L+s} z_0(i,j) = Z_{0,i}   \forall i = 1, \cdots, C
\label{equ:c1eq1}
\end{eqnarray}

\begin{eqnarray}
\sum_{j=1}^{(i-1)L+s} z_n(i,j) = Z_{n,i},\quad  \forall i,  n
\label{equ:c2eq1}
\end{eqnarray}

\begin{eqnarray}
 Z_{n,i}\le \frac{Y_n}{Y_{n-1}}Z_{n-1,i},\quad  \forall i,  n
\label{equ:c2eq11}
\end{eqnarray}

\begin{eqnarray}
\sum_{n=0}^N\sum_{i=1}^{C} z_n(i,j)  \leq B(j) \  \   \forall j=1, \cdots, (C-1)L+s,
\label{equ:c3eq1}
\end{eqnarray}

\begin{eqnarray}
\sum_{i, (i-1)L+s > t} {\bf I}\Bigg(\sum_{j=1}^t\bigg(\sum_{n=0}^Nz_n(i,j)\bigg)> 0\Bigg) L \leq B_m \   \forall t
\label{equ:c4eq1}
\end{eqnarray}

\begin{equation}
z_n(i,j) \geq 0\   \forall i = 1, \cdots, C
\label{equ:c5eq1}
\end{equation}

\begin{equation}
z_n(i,j) = 0\   \forall i, j > (i-1)L+s
\label{equ:c6eq1}
\end{equation}

\begin{equation}
Z_{n,i} \in {\mathcal Z}_n \quad  \forall i = 1, \cdots, C, \text{ and } \forall n = 1, \cdots, N
\label{equ:c7eq1}
\end{equation}

\begin{eqnarray}
\text{Variables:}&& z_n(i,j), Z_{n,i} \ \ \  \forall   i = 1, \cdots, C,  \nonumber \\
&&j = 1, \cdots, (C-1)L+s, \  n = 0, \cdots, N \nonumber
\end{eqnarray}

Constraints [equ:c2eq1] and [equ:c7eq1] ensure that the amount fetched for any layer $`n`$ of a chunk $`i`$ over all time slots is either zero or the $`n`$-th layer size. The decoder constraint [equ:c2eq11] enforces that the $`n`$th layer of a chunk cannot be fetched if the lower layer is not fetched, since this layer could not be decoded because of the layer dependency. Constraint [equ:c3eq1] imposes the available bandwidth limit at each time slot $`j`$, and [equ:c4eq1] imposes the playback buffer constraint so that the content in the buffer at any time does not exceed the buffer capacity $`B_m`$. Constraint [equ:c5eq1] imposes the non-negativity of the chunk download sizes, and [equ:c6eq1] ensures that no chunk is fetched after its deadline. The deadline of chunk $`i\in \{1, \cdots, C\}`$ is $`deadline(i)=(i-1)L+s`$.
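To make the deadline and bandwidth constraints concrete, the sketch below checks whether a given list of per-chunk sizes can be fetched in order so that every chunk finishes by $`deadline(i)=(i-1)L+s`$. It is an illustration under simplifying assumptions: the buffer constraint is ignored, time slots are integers starting at 1, and the helper names `deadline` and `in_order_feasible` are hypothetical.

```python
from itertools import accumulate


def deadline(i, L, s):
    """Playback deadline (time slot) of chunk i, with chunks indexed from 1."""
    return (i - 1) * L + s


def in_order_feasible(sizes, B, L, s):
    """True if chunks of the given sizes, fetched in order, all meet their deadlines.

    B[j-1] is the bandwidth available in slot j. The buffer limit is ignored here.
    """
    c = [0] + list(accumulate(B))          # c[j] = B(1) + ... + B(j)
    total = 0
    for i, x in enumerate(sizes, start=1):
        total += x                          # data needed by the deadline of chunk i
        d = min(deadline(i, L, s), len(B))  # do not read past the bandwidth trace
        if total > c[d]:
            return False
    return True


print(in_order_feasible([1, 1, 1], [1, 1, 1, 1], L=1, s=1))
print(in_order_feasible([2, 1, 1], [1, 1, 1, 1], L=1, s=1))
```

A size of 0 in `sizes` corresponds to a skipped chunk, so the same check can be used to compare candidate skip patterns.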

Optimization Problem Structure

The problem defined in §60.1 has integer constraints and an indicator function in one of its constraints. It therefore belongs to the class of combinatorial optimization problems. Well-known problems in this class include the knapsack problem, the cutting stock problem, the bin packing problem, and the travelling salesman problem, all of which are known to be NP-hard. Very few problems in this class are known to be solvable in polynomial time; typical examples are shortest path trees, flows and circulations, spanning trees, matching, and matroid problems. The knapsack problem optimizes a linear function subject to a single linear constraint (for integer variables), and is known to be NP-hard. The optimization problem defined in this paper has multiple constraints and, to the best of our knowledge, does not lie in any known class of combinatorial problems that are solvable in polynomial time. In this paper, we show that this combinatorial optimization problem can nevertheless be solved optimally in polynomial time.

Optimal Linear-time Solution

We now show that the proposed problem in ([equ:eq1]-[equ:c7eq1]) can be solved optimally with a complexity of $`O(CN)`$. We call our proposed algorithm the “Layered Bin Packing Adaptive Algorithm” (LBP), which is summarized in Algorithm 1. At a high level, our algorithm works from the lowest (base) layer to the highest enhancement layer, and processes each layer separately. It performs backward and forward scans (explained below) at each layer given the decisions of the previous layers.

Running the backward scan at the $`n`$th layer (Algorithm 2) finds the maximum number of chunks that can be fetched up to the $`n`$th layer quality given the decisions of the previous layers. Then, running the forward scan (Algorithm 3) simulates fetching chunks in sequence as early as possible, so that the start time of downloading chunk $`i`$ (the lower deadline $`t(i)`$) is found. The lower and upper deadlines ($`t(i)`$ and $`deadline(i)`$) are then used to make the next layer's decisions (as explained below).

Backward Algorithm for Base Layer: Given the bandwidth prediction, chunk deadlines, and the buffer size, the algorithm simulates fetching the chunks at base layer quality starting from the last towards the first chunk. The deadline of the last chunk is the starting time slot of the backward algorithm scan. The goal is to have chunks fetched closer to their deadlines. For every chunk $`i`$, the backward algorithm checks the bandwidth and the buffer; if there is enough bandwidth and the buffer is not full, then chunk $`i`$ is selected to be fetched (line 18-22). The algorithm keeps checking this feasibility to select chunks to be fetched. If a chunk $`i^\prime`$ is not selected to be fetched, one of the following two scenarios could have happened. The first scenario is the violation of the buffer capacity, where selecting the chunk to be fetched would violate the playback buffer constraint. The second is the bandwidth constraint violation where the remaining available bandwidth is not enough for fetching a chunk. This scenario also means that the chunk could not be fetched by its deadline, so it can also be called deadline violation.

For a buffer capacity violation, we first note that there could be a chunk $`i^{\prime\prime} > i^\prime`$ such that, if it is skipped, chunk $`i^{\prime}`$ can still be fetched. However, the backward algorithm decides to skip downloading chunk $`i^\prime`$ (line 8). Since there is a buffer capacity violation, one of the chunks must be skipped. The reason for skipping chunk $`i^\prime`$ rather than one with a higher index is that $`i^\prime`$ is the closest to its deadline. Therefore, $`i^\prime`$ is not a better candidate for the next layer than any of the later ones. In the second case of a deadline/bandwidth violation, the backward algorithm decides to skip chunks up to $`i^\prime`$ since there is not enough bandwidth. As before, since an equal number of chunks needs to be skipped anyway, skipping the earlier ones is better because it increases the potential for getting higher layers of the later chunks.
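The backward scan for the base layer can be illustrated with a deliberately simplified sketch. It is not the listing of Algorithm 2: the buffer limit $`B_m`$ is ignored, all base layers have the same integer size `Y0`, and the function name `backward_base_layer` is hypothetical. It only shows the two ideas described above, namely filling each chunk as close to its deadline as possible and skipping the earliest chunks when bandwidth runs out.

```python
def backward_base_layer(B, C, L, s, Y0):
    """Return the sorted indices of chunks whose base layer can be fetched.

    B[j-1] is the bandwidth in slot j (integers), deadline(i) = (i-1)*L + s.
    Simplification: the playback buffer limit is not modeled.
    """
    rem = list(B)                        # rem[j-1]: unused bandwidth in slot j
    fetched = []
    j = min((C - 1) * L + s, len(B))     # start at the deadline of the last chunk
    for i in range(C, 0, -1):
        j = min(j, (i - 1) * L + s)      # chunk i may only use slots <= deadline(i)
        # Plain sum for clarity; a cumulative-bandwidth array would avoid this scan.
        if sum(rem[:j]) < Y0:            # bandwidth violation: skip chunks 1..i
            break
        need = Y0
        while need > 0:                  # consume bandwidth from slot j backwards
            take = min(rem[j - 1], need)
            rem[j - 1] -= take
            need -= take
            if rem[j - 1] == 0:
                j -= 1
        fetched.append(i)
    return sorted(fetched)


# No bandwidth in the first two slots: the earliest chunk is the one skipped.
print(backward_base_layer([0, 0, 1, 1], C=3, L=1, s=2, Y0=1))
```

Because later chunks are served first and as late as possible, any bandwidth left over sits in the early slots, where it remains usable by every chunk when higher layers are considered.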

Forward Algorithm for Base Layer: The forward algorithm takes the chunk size decisions from the backward step, which gives the base layer size decision of every chunk $`i`$ as either 0 or the BL size. Then, the forward algorithm simulates fetching the chunks in sequence starting from the first one. Chunks are fetched as early as possible, subject to the deadline, buffer, and bandwidth constraints. The chunks that the backward algorithm decided not to fetch are skipped (any chunk $`i \notin I_0`$, line 6). The forward algorithm provides the earliest time slot at which chunk $`i`$ can be fetched ($`t(i)`$, line 10). This time is used as a lower deadline on the time allowed to fetch chunk $`i`$ when the backward algorithm is run for the next layer. Therefore, the backward size decisions for the base layer of earlier chunks cannot be violated when the backward algorithm is re-run to decide the first enhancement layer sizes ($`E1`$ decisions). Moreover, it provides the portion of chunk $`i`$ that can be fetched at its lower deadline $`t(i)`$ ($`a(i)`$, line 11) and the remaining bandwidth at every time slot $`j`$ after all non-skipped chunks are fetched ($`e(j)`$, line 12).
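A simplified sketch of the forward scan is given below. It is an illustration rather than the listing of Algorithm 3: the buffer limit and the deadline checks are omitted (the sizes are assumed to come from a feasible backward decision), sizes and bandwidth are integers, and the function name `forward_scan` is hypothetical. It returns the three quantities described above: $`t(i)`$, $`a(i)`$, and $`e(j)`$.

```python
def forward_scan(B, X):
    """Fetch chunks in order as early as possible.

    B[j-1]: bandwidth in slot j. X[i-1]: decided size of chunk i (0 if skipped).
    Returns (t, a, e): t[i] is the first slot in which chunk i receives data,
    a[i] is the amount of chunk i fetched in slot t[i], and e[j-1] is the
    bandwidth left in slot j after all non-skipped chunks are fetched.
    """
    rem = list(B)
    t, a = {}, {}
    j = 1
    for i, size in enumerate(X, start=1):
        need = size                       # skipped chunks have size 0
        while need > 0:
            if j > len(rem):
                raise ValueError("sizes exceed the available bandwidth")
            got = min(rem[j - 1], need)
            if got > 0 and i not in t:
                t[i], a[i] = j, got       # lower deadline and amount fetched there
            rem[j - 1] -= got
            need -= got
            if need > 0:                  # slot j is exhausted, move on
                j += 1
    return t, a, rem


t, a, e = forward_scan([2, 1, 3], [1, 0, 3])
print(t, a, e)
```

The leftover bandwidth `e` and the lower deadlines `t` are exactly the inputs a subsequent backward pass for the next layer would need.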

Modifications for Higher Layers: The same backward and forward steps are used for each layer given the backward-forward decisions of the previous one on the chunk sizes and lower deadlines. The key difference when the algorithm is run for the enhancement layer decisions as compared to that for the base layer is that the higher layer of the chunk is skipped if the previous layer is not decided to be fetched. When running the backward algorithm for $`E1`$ decisions, for every chunk $`i`$, we consider the bandwidth starting from the lower deadline of that chunk $`t(i)`$, so previous layer decisions (base layer decisions) of early chunks can’t be violated. The same procedure is used to give higher layer decisions when all of the lower layer decisions have already been made. An example to illustrate the algorithm is given in Appendix 67.

Input: $`Y_n`$, $`deadline(i)`$, $`s`$, $`B_m`$, $`C`$, $`B(j)`$: available bandwidth at time $`j`$.

Output: $`X(i)\ \forall i`$: the maximum size at which chunk $`i`$ can be fetched; $`I_n`$: the set of indices of the chunks that can be fetched up to layer $`n`$ quality.

Initialization:

$`X_n=\sum_{m=0}^n Y_m`$: cumulative size up to layer $`n`$.

$`c(j)=\sum_{j^\prime=1}^{j} B(j^\prime)`$, $`\forall j`$: cumulative bandwidth up to time $`j`$.

$`t(i) = 0`$, $`\forall i`$: first time slot in which chunk $`i`$ can be fetched.

$`a(i) = 0`$, $`\forall i`$: lower layer decision of the fetched amount of chunk $`i`$ at its lower deadline time $`t(i)`$.

$`e(j) = B(j)`$, $`\forall j`$: remaining bandwidth at time $`j`$ after all non-skipped chunks are fetched according to the lower layer size decisions.

$`X(i) = 0`$, $`deadline(i) = (i-1)L + s`$, $`\forall i`$.

$`bf(j) = 0`$, $`\forall j`$: buffer length at time $`j`$.

For each layer $`n = 0, \cdots, N`$:

[X, I_n] = backwardAlgo(B, X, X_n, C, L, deadline, B_m, bf, t, c, a, e)

[t, a, e] = forwardAlgo(B, X, C, deadline, B_m, bf, I_n)

Layered Bin Packing Adaptive Algorithm

Input: B, X, X_n, C, L, deadline, B_m, bf, t, c, a, e

Output: X(i): size of chunk i; I_n: set of chunks that can be fetched in quality up to the n-th layer.

Initialization: i = C, j = deadline(C); initialize bf(j) to zero ∀j.

if (bf(deadline(i)) = B_m) then i = i − 1

rem1 = c(j) − c(1) + e(1), rem2 = rem1

rem2 = c(j) − c(t(i)), rem1 = rem2 + e(t(i)) + a(i)

if (X(i) > 0) then X_n(i) = X(i) else i = i − 1

e(t(i)) = e(t(i)) + rem1 − X_n

X(i) = X_n(i), I_n ← I_n ∪ i

fetched = min(B(j), X_n(i)), B(j) = B(j) − fetched

X_n(i) = X_n(i) − fetched

if (X_n(i) > 0) then bf(j) = bf(j) + L

if (X_n(i) = 0) then i = i − 1

if (B(j) = 0) then j = j − 1

j = j − 1

Backward Algorithm

Input: B, X, C, deadline, B_m, bf, I_n

Output: t(i): first time slot in which chunk i can be fetched (lower deadline of chunk i); a(i): decision of the fetched amount of chunk i at its lower deadline time slot t(i); e(j): remaining bandwidth at time j after all non-skipped chunks are fetched according to the decided layer size.

Initialization: j = 1, k = 1

i = I(k) then k = k + 1

if (bf(j) = B_m) then j = j + 1

fetched = min(B(j), X(i))

t(i) = j, a(i) = fetched

B(j) = B(j) − fetched

e(j) = B(j), X(i) = X(i) − fetched

if X(i) > 0 then bf(j) = bf(j) + L

if X(i) = 0 then k = k + 1

if B(j) = 0 then j = j + 1

k = k + 1

Forward Algorithm

Complexity Analysis: The initialization computes cumulative sums of the variables over time, and has complexity at most $`O(C)`$. At each layer, one backward and one forward scan are performed. Both have a while loop, and each step within the loop is $`O(1)`$. Thus, the complexity depends on the number of iterations of this loop. For the backward algorithm, each iteration decreases either $`i`$ or $`j`$, so the while loop runs at most $`C+deadline(C)+1`$ times. Similarly, the forward algorithm's while loop runs at most $`C+deadline(C)+1`$ times. Since $`deadline(C)=(C-1)L+s`$ grows linearly in $`C`$, each scan is $`O(C)`$, which gives $`O(CN)`$ over all layers. To keep each step $`O(1)`$, the cumulative bandwidth $`c(j)`$ up to every time slot $`j`$ is used to avoid summing over the bandwidth inside the backward and forward loops.
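The role of the cumulative bandwidth can be sketched as follows. This is a generic prefix-sum illustration with hypothetical helper names; it shows why a quantity such as $`c(j) - c(t(i))`$ can be evaluated in constant time inside the loops once the array is built.

```python
from itertools import accumulate


def cumulative_bandwidth(B):
    """c[j] = B(1) + ... + B(j), with c[0] = 0; built once in linear time."""
    return [0] + list(accumulate(B))


def bandwidth_between(c, j1, j2):
    """Total bandwidth in slots j1+1 .. j2, i.e. c(j2) - c(j1), in O(1)."""
    return c[j2] - c[j1]


c = cumulative_bandwidth([3, 1, 4, 1, 5])
print(bandwidth_between(c, 1, 4))  # bandwidth available in slots 2, 3 and 4
```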

Adaptation to ABR Streaming: We note that the proposed algorithm selects quality levels for every chunk and can therefore also be used for ABR streaming. For a given set of available ABR rates, the difference between the rates of the coded chunk at quality level $`n+1`$ and quality level $`n`$ can be treated as the $`n`$th layer SVC rate for all $`n`$.
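As a small illustration of this mapping, the sketch below converts a ladder of ABR rates into per-layer rates by taking successive differences. The function name and the example rates are hypothetical, and the sketch assumes that the lowest ABR rate plays the role of the base layer.

```python
def abr_rates_to_layer_rates(abr_rates):
    """Treat successive differences of the sorted ABR rates as SVC layer rates.

    The lowest ABR rate is used as the base layer (an assumption of this sketch).
    """
    rates = sorted(abr_rates)
    return [rates[0]] + [hi - lo for lo, hi in zip(rates, rates[1:])]


layers = abr_rates_to_layer_rates([1500, 2750, 4800])  # illustrative rates in kbps
print(layers)
```

Summing the first $`n+1`$ entries of the result recovers the ABR rate at quality level $`n`$, so a decision to fetch up to layer $`n`$ corresponds to selecting that ABR representation.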

Optimality of the Proposed Algorithm

In this subsection, we prove the optimality of the Layered Bin Packing Adaptive Algorithm in solving the optimization problem ([equ:eq1]-[equ:c7eq1]). We first note that it is enough to prove that the algorithm is the best among all in-order scheduling algorithms (those that fetch chunks in order of their deadlines). This is because any other feasible fetching algorithm can be converted to an in-order fetching algorithm with the same bandwidth utilization for each chunk. Fetching in order helps satisfy the buffer and the other constraints. Thus, we obtain the same objective while still satisfying the constraints. The following lemma states that, given the lower and upper deadlines ($`t(i)`$ and $`deadline(i)`$) of every chunk $`i`$ and the $`(n-1)`$th layer quality decisions, running the backward algorithm for the $`n`$th layer maximizes the number of chunks that can have their $`n`$th layer fetched.

Given size decisions up to ($`n-1`$)th layer, and lower and upper deadlines ($`t(i)`$, and $`deadline(i)`$) for every chunk $`i`$, the backward algorithm achieves the minimum number of the $`n`$th layer skips as compared to any feasible algorithm which fetches the same layers to every chunk up to the layer $`n-1`$.

Proof. Proof is provided in Appendix 40. ◻

The above lemma shows that the backward algorithm minimizes the $`n`$th layer skips given the lower and upper deadlines of every chunk. However, it does not tell us whether that lower deadline is optimal. The following proposition shows that, for any quality decisions, the forward algorithm finds the optimal lower deadline on the fetching time of any chunk.

If $`t_f(i)`$ is the earliest time to start fetching chunk $`i`$ using the forward algorithm (the lower deadline), and $`t_x(i)`$ is the earliest time to fetch it using any other in-sequence fetching algorithm, then the following holds:

\begin{equation}
t_f(i) \leq t_x(i).
\end{equation}

The above proposition states that $`t_f(i)`$ is the lower deadline of chunk $`i`$, so chunk $`i`$ cannot be fetched earlier without violating the size decisions of the lower layers of earlier chunks. Therefore, at any layer $`n`$, we are allowed to increase the size of chunk $`i`$ as long as we can fully fetch it within the period between its lower and upper deadlines. If increasing its size to the $`n`$-th layer quality level requires us to start fetching it before its lower deadline, then we should not consider fetching the $`n`$-th layer of this chunk. Fetching the $`n`$-th layer of this chunk in this case would affect the lower layer decisions and cause lower layers of some earlier chunks to be dropped. Since our objective prioritizes lower layers over higher layers ($`0<\gamma < 1`$ and [basic_gamma_1]), the lower deadline must not be violated. As a simple extension of Lemma [lem:skip:beta1], we can consider any $`\beta\ge 1`$.

Given the optimal solution of layer sizes up to the ($`n-1`$)th layer, and the lower and upper deadlines ($`t(i)`$ and $`deadline(i)`$) of every chunk $`i`$, if $`{\mathcal Z}_n^* = (Z_{n,i}^* \forall n, i )`$ is the $`n`$-th layer solution found by running the backward algorithm for the $`n`$th layer sizes, and $`{\mathcal Z}^\prime_n = (Z_{n,i}^\prime \forall n, i )`$ is a feasible solution found by running any other algorithm, then the following holds for any $`\beta \geq 1`$.

\begin{equation}
\sum_{i=1}^{C}\beta^i Z_{n,i}^\prime \leq \sum_{i=1}^{C}\beta^i Z_{n,i}^*
\label{equ:eq1_lemma1}
\end{equation}

Proof. Proof is provided in Appendix 41. ◻

We note that Lemma [lem:skip:beta1] is a corollary of Lemma [lem:skip:betage1], which can be obtained when $`\beta=1`$.

Using Lemma [lem:skip:beta1], Proposition [noSkipPro], and Lemma [lem:skip:betage1], we are ready to show the optimality of the Layered Bin Packing Adaptive Algorithm in solving problem ([equ:eq1]-[equ:c7eq1]), as stated in the following theorem.

Up to a given enhancement layer $`M, M \geq 0`$, if $`Z_{m,i}^*`$ is the size of every layer $`m \leq M`$ of chunk $`i`$ found by running the Layered Bin Packing Adaptive Algorithm, and $`Z_{m,i}^\prime`$ is the size found by running any other feasible algorithm, then the following holds for any $`0<\gamma < 1`$ that satisfies ([basic_gamma_1]) and any $`\beta \ge 1`$.

\begin{equation}
 \sum_{m=0}^M \gamma^m\sum_{i=1}^{C}\beta^i Z_{m,i}^\prime \leq \sum_{m=0}^{M} \gamma^m\sum_{i=1}^{C}\beta^i Z_{m,i}^*.
\label{equ:thm}
\end{equation}

In other words, the Layered Bin Packing Adaptive Algorithm achieves the optimal solution of the optimization problem ([equ:eq1]-[equ:c7eq1]) when $`0<\gamma < 1`$ satisfies ([basic_gamma_1]) and $`\beta \ge 1`$.

Proof. Proof is provided in the Appendix 43. ◻

Layered Bin Packing Adaptive Algorithm finds the optimal solution to the optimization problem [equ:eq1]-[equ:cl].

ABR (MGC for SVC)

In this case, the tricky point is how to come up with a function $`q(.)`$ that maps chunk size to QoE. With VBR, chunks encoded at the same layer may have different sizes. Moreover, fetching the enhancement layer of a dynamic chunk is more important than fetching it for a static chunk. However, we make the following assumptions:

Priority is given to lower layers over higher layers among all chunks (i.e., $`\gamma < 1`$).
For a given layer, priority is given to dynamic chunks over static chunks: $`q(.)`$ is a sum of chunk sizes scaled by different scaling factors, where the scaling factors of dynamic chunks are much higher than those of static chunks.
There is a finite number of rates associated with every layer, from the lowest to the highest, where most static chunks are encoded at the lowest rate and most dynamic chunks are encoded at the highest rate, and the rate of any chunk encoded at that layer can be described as a function of the nominal rate of that layer as $`r_{n,i}=\tau_i r_n`$.

Given the above assumptions, we can still claim the optimality of the offline algorithm when we use the same backward-forward algorithm as for CBR with one change. To describe the change, assume that at layer $`l`$ there are $`M`$ rates ($`1,\ldots,M`$), where $`1`$ is the lowest and $`M`$ is the highest. Chunks encoded at this layer may be encoded at any of these rates, depending on how dynamic the chunk is. Therefore, we first run the backward algorithm on the highest sub-level of that layer (the level with rate $`M`$) given the results of the lower layers, including only the most dynamic chunks (the chunks that are encoded at that sub-level). Once we find the candidates among the most dynamic chunks, we move to the next sub-level $`M-1`$, and we continue until rate $`1`$. After that, we run the forward algorithm.

Online Algorithm: Dealing with Short and Inaccurate BW Prediction

We face two issues in reality. First, the bandwidth information for the distant future may not always be available. Second, even for the near future, the estimated bandwidth may have errors. To address both of these challenges, we design an online algorithm (Algorithm 29). The algorithm works as follows. Every $`\alpha`$ seconds, we predict the bandwidth for $`W`$ seconds ahead (lines 9-10). Typically $`\alpha`$ is much smaller than $`W`$ ($`\alpha \ll W`$). We find the last chunk to consider in this run of the algorithm (line 11). The online algorithm thus computes the scheduling decision only for the chunks corresponding to the next $`W`$ seconds ahead. We re-compute the quality decisions periodically (every $`\alpha`$ seconds) in order to adjust to any changes in the prediction. We can also run the computation after the download of every chunk (or layer) due to the low complexity of our algorithm.

Moreover, to handle inaccurate bandwidth estimation, we set a lower buffer threshold $`(B_{min})`$: if the buffer runs lower than this threshold, we reduce the layer decision by 1 (unless the chunk is already at base layer quality) (lines 15-16). During the actual chunk download, if we are within a certain threshold of the deadline of the current chunk and it is not yet fully downloaded, we stop fetching the remainder of the chunk, provided that the base layer has been fetched, and play it at the quality fetched so far.
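Two small pieces of the online procedure can be sketched directly from the description above: selecting the last chunk covered by the prediction window, and the buffer-threshold safeguard. The function names are hypothetical, the chunk deadline is taken as $`(i-1)L+s`$, and the bandwidth prediction itself is outside the scope of this sketch.

```python
def last_chunk_in_window(st, W, L, s, C):
    """Index of the first chunk whose deadline is >= st + W, capped at C."""
    for i in range(1, C + 1):
        if (i - 1) * L + s >= st + W:
            return i
    return C


def apply_buffer_safeguard(layer, buffer_level, B_min):
    """Drop the decided layer by one when the buffer is below B_min.

    A chunk already at base layer quality (layer 0) is left unchanged.
    """
    if buffer_level < B_min and layer > 0:
        return layer - 1
    return layer


ec = last_chunk_in_window(st=1, W=10, L=2, s=5, C=20)
print(ec, apply_buffer_safeguard(layer=2, buffer_level=5, B_min=10))
```

Each time the decisions are reconsidered (every $`\alpha`$ seconds), the backward and forward scans would be run only over the chunks between the next unfetched chunk and `ec`, and the safeguard would then be applied to the resulting layer decisions.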

Input: Y_n, deadline(i), s, B_m, C, B(j), W: the prediction window size, α: the decision reconsideration period.

Output: X(i) ∀i: the maximum size at which chunk i can be fetched; I_n: the set of indices of the chunks that can be fetched up to layer n quality.

Initialization: same as Algorithm 1 (offline version), plus the following:

sc = 1, the index of the chunk to start with.

ec = 1, the index of the last chunk to consider.

st = 1, the current time slot.

Collect user position and speed.

Predict the bandwidth for W seconds ahead.

ec = the index of the first chunk whose deadline is ≥ st + W

For each layer n = 0, ⋯, N:

[X, I_n] = backwardAlgo(B, X, X_n, sc, ec, L, deadline, B_m, bf, t, c, a, e)

[t, a, e] = forwardAlgo(B, X, sc, ec, deadline, B_m, bf, I_n)

sc = last fetched chunk + 1

st = current time slot

Online Layered Bin Packing Adaptive Algorithm

No-Skip Based Streaming Algorithm

In No-Skip streaming (watching a pre-recorded video), when the deadline of a chunk cannot be met, rather than skipping it, the player stalls the video and continues downloading the chunk. The objective here is to maximize the weighted sum of the layer sizes while minimizing the stall duration (the rebuffering time). The objective function is slightly different from equation [equ:eq1], since we do not allow skipping base layers. However, we still allow skipping higher layers. All constraints are the same as in the skip based optimization problem, except that we introduce constraint [equ:c1eq2] to force $`Z_{0,i}`$ for every chunk $`i`$ to equal the BL size ($`Y_0`$). We define the total stall (re-buffering) duration from the start until the play-time of chunk $`i`$ as $`d(i)`$. Therefore, the deadline of any chunk $`i`$ is $`(i-1)L+s+d(i)`$. The No-Skip formulation can thus be written as:

\begin{eqnarray}
\textbf{Maximize: } \sum_{n=1}^{N}\gamma^n\sum_{i=1}^{C}\beta^i Z_{n,i}-\lambda d(C)
\label{equ:eq2}
\end{eqnarray}

subject to,

\begin{eqnarray}
\sum_{j=1}^{(i-1)L+s+d(i)} z_0(i,j) = Y_{0}   \forall i = 1, \cdots, C
\label{equ:c1eq2}
\end{eqnarray}

\begin{eqnarray}
\sum_{j=1}^{(i-1)L+s+d(i)} z_n(i,j) = Z_{n,i}, \quad  \forall i,  n>0
\label{equ:c2eq2}
\end{eqnarray}

\begin{eqnarray}
 Z_{n,i} \le \frac{Y_n}{Y_{n-1}}Z_{n-1,i}, \quad  \forall i,  n
\label{equ:c2eq21}
\end{eqnarray}

\begin{eqnarray}
\sum_{n=0}^N\sum_{i=1}^{C} z_n(i,j)  \leq B(j) \  \   \forall 1\le j\le (C-1)L+s+d(C),
\label{equ:c3eq2}
\end{eqnarray}

\begin{eqnarray}
\sum_{i, (i-1)L+s+d(i) > t}{\bf I}\Bigg(\sum_{j=1}^t\bigg(\sum_{n=0}^N z_n(i,j)\bigg)> 0\Bigg) L \leq B_m \   \forall t
\label{equ:c4eq2}
\end{eqnarray}
\begin{equation}
z_n(i,j) \geq 0\   \forall i = 1, \cdots, C
\label{equ:c5eq2}
\end{equation}

\begin{equation}
z_n(i,j) = 0\   \forall i, j > (i-1)L+s+d(i)
\label{equ:c6eq2}
\end{equation}

\begin{equation}
d(i+1) \geq d(i)\geq 0\   \forall i = 1, \cdots, C-1 \label{deq}
\end{equation}

\begin{equation}
Z_{n,i} \in {\mathcal Z}_n \quad  \forall i = 1, \cdots, C, \text{ and } \forall n = 1, \cdots, N
\label{equ:c7eq2}
\end{equation}

\begin{eqnarray}
\text{Variables:}&& z_n(i,j), Z_{n,i}, d(i) \forall   i = 1, \cdots, C,  \nonumber \\
&& 1\le j \le (C-1)L+s+d(C), n = 0, \cdots, N \nonumber
\end{eqnarray}

This formulation converts a multi-objective optimization problem, with the stall duration and the weighted quality as the two objectives, into a single-objective problem using a tradeoff parameter $`\lambda`$. We choose $`\lambda`$ such that avoiding one stall is preferred over fetching all the layers of all chunks, since users tend to care more about avoiding rebuffering than about better quality. Specifically, $`\lambda`$ satisfies the following condition.

\begin{equation}
\lambda >  \sum_{n=0}^N \gamma^n Y_n\sum_{i=1}^C\beta^i
\label{lambda_cond}
\end{equation}

With this assumption, we can solve the optimization problem optimally with a slight modification to the algorithm proposed for the skip based streaming version. The proposed algorithm for the No-Skip version is referred to as the “No-Skip Layered Bin Packing Adaptive Algorithm” (No-Skip LBP, Algorithm 5 in Appendix 60). There are a few key differences in this algorithm compared to the skip version, and we explain them below.

One difference compared to the skip version is that the first step is to determine the minimum stall time, since that is the first priority. To do this, we simulate fetching chunks in order at BL quality (base layer forward algorithm, Algorithm 6 in Appendix 60). We first let $`d(1)=\cdots = d(C)=0`$ and start to fetch chunks in order. If chunk $`i`$ can be fetched by its deadline $`((i-1)L+s+d(i))`$, we move to the next chunk (lines 20-21). If chunk $`i`$ cannot be fetched by its deadline, we continue fetching it until it is completely fetched, and the additional time spent fetching this chunk is added to $`d(k)`$ for every $`k\ge i`$, since there has to be an additional stall in order to fetch these chunks (lines 22-24). Using this, we obtain the total stall and the deadline of the last chunk ($`d(C)`$ and $`deadline(C)`$). The stall duration of the last chunk (chunk $`C`$) gives the total stall duration for the algorithm.
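The stall computation described in this step can be sketched as a short simulation. It is a simplified illustration, not the listing of Algorithm 6: the buffer limit is ignored, sizes and bandwidth are integers, and the function name `min_stall_forward` is hypothetical. It returns the cumulative stall $`d(i)`$ for every chunk, so the last entry is the total stall $`d(C)`$.

```python
def min_stall_forward(B, C, L, s, Y0):
    """Fetch C base-layer chunks of size Y0 in order and accumulate stalls.

    B[j-1]: bandwidth in slot j. Returns [d(1), ..., d(C)], where d(i) is the
    total stall up to the play time of chunk i. Buffer limit not modeled.
    """
    rem = list(B)
    d = 0
    stalls = []
    j = 1
    for i in range(1, C + 1):
        need = Y0
        while need > 0:
            if j > len(rem):
                raise ValueError("bandwidth trace too short")
            got = min(rem[j - 1], need)
            rem[j - 1] -= got
            need -= got
            if need > 0:                  # slot j exhausted, keep fetching
                j += 1
        chunk_deadline = (i - 1) * L + s + d
        if j > chunk_deadline:            # finished late: the excess is a stall
            d += j - chunk_deadline       # shifts the deadlines of chunks k >= i
        stalls.append(d)
    return stalls


print(min_stall_forward([1, 0, 0, 1, 1], C=3, L=1, s=1, Y0=1))
```

In this example the second chunk completes two slots after its deadline, and that delay carries over to the deadline of the third chunk, which is then met without any further stall.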

The other difference is in running the backward algorithm for the base layer decisions (see the base layer backward algorithm, Algorithm 7 in Appendix 60). The key difference in running the backward algorithm for the base layer compared with the skip version is that there must be no BL skips. With the backward algorithm, we work on moving stalls as early as possible. We run the base layer backward algorithm starting at time slot $`j=deadline(C)= (C-1)L+s+d(C)`$. A deadline violation cannot happen because of the forward step that precedes this one. Thus, only the possibility of a buffer constraint violation must be managed. If we reach a chunk at which there is a buffer constraint violation, we decrement its deadline by 1 and check whether the violation is removed. This decrement continues until the buffer constraint violation is avoided (lines 11, 28-29). This provides the deadlines of the different chunks such that the stall duration is at its minimum and stalls are brought to the earliest possible time, so we get the minimum number of stalls and the optimal stall pattern. When stalls are brought to their earliest possible time, all chunks have more time to get their higher layers without violating any of the constraints. Therefore, we have a higher chance of getting higher layers of later chunks. The forward algorithm (Algorithm 3) is run after that to simulate fetching chunks in order and to provide the lower deadlines of chunks for the E1 backward run. For enhancement layer decisions, the backward-forward scan is run as in the skip version, since skips are allowed for the enhancement layers. The main algorithm that calls the forward and backward scans in the sequence we described is the “No-Skip Layered Bin Packing Adaptive Algorithm” (Algorithm 5). An illustrative example of the algorithm is described in Appendix 67.

If $`d^*(C)`$ is the total stall duration that is found by No-Skip base layer forward algorithm and $`d^{\prime}(C)`$ is the total stall duration that is found by running any other feasible algorithm, then the following holds true:

\begin{equation*}
d^\prime(C) \geq d^*(C)
\end{equation*}

In other words, the No-Skip base layer forward algorithm achieves the minimum stall duration.

Proof. Proof is provided in Appendix 68. ◻

From Lemma [lemma: noSkipLemma1], we note that the No-Skip forward algorithm finishes playing all chunks at their earliest possible time. Since all the chunks are obtained at base layer quality and there is a minimum number of stalls, the objective function is optimized for any $`\beta\ge 1`$ when only the base layer is considered. When running the base layer backward algorithm, the deadlines of the chunks are shifted to their latest possible values, which gives the maximum flexibility for obtaining higher layers of chunks before their deadlines.

Having shown the result for the base layer and having determined the deadline for the last chunk, the rest of the algorithm is similar to the skip version, where only the weighted quality needs to be considered (the stall time is already found). Thus, the optimality result described in the following theorem holds, and its proof follows the same lines as the proof of the skip version theorem.

If $`Z_{m,i}^*`$ is the feasible size of every layer $`m \leq M`$ of chunk $`i`$ that is found by running the No-Skip Layered Bin Packing Adaptive Algorithm, and $`Z_{m,i}^\prime`$ is a feasible size that is found by any other feasible algorithm for the same stall duration, then the following holds for $`0 <\gamma < 1`$, ([basic_gamma_1]), $`\beta \ge 1`$, and ([lambda_cond]):

\begin{align*}
& \sum_{m=0}^M \gamma^m\sum_{i=1}^{C}\beta^i Z_{m,i}^\prime \leq \sum_{m=0}^{M} \gamma^m\sum_{i=1}^{C}\beta^i Z_{m,i}^*
\label{equ:eq1_lemma1}
\end{align*}

In other words, No-Skip Layered Bin Packing Adaptive Algorithm achieves the optimal solution of the optimization problem ([equ:eq2]-[equ:c7eq2]).

Proof. Proof is provided in Appendix 69. ◻

The No-Skip scheme faces the same challenges described in §32.6, namely short bandwidth prediction in the distant future and inaccurate bandwidth prediction, and they are handled in the same way as described in §32.6.

Among any class of algorithms, the above greedy algorithm provides the maximum number of base layer chunks.

The proof follows from the greedy nature of the algorithm. The fact that the first chunk is skipped at some time implies that there is not enough bandwidth available to obtain all the chunks up to that time. Thus, at least one chunk has to be skipped. Extending this argument shows that no strategy can skip fewer chunks.

We assume that the optimization is performed every $`\alpha`$ time-instants and that the prediction for a window of length $`W`$ is known. We first assume that the bandwidth is known perfectly. Given the bandwidth for the next $`W`$ time-instants, the proposed non-causal optimization algorithm is performed (including finding the start-up delay needed, if any) for all the chunks that have not been downloaded so far and are within the first $`W-d`$ time-instants of this window. If these chunks are downloaded earlier, the rest of the chunks are downloaded in the highest quality.

In the presence of prediction error, similar modifications are made as in the previous subsection. More precisely, the part of the chunk not downloaded by its deadline is skipped. If the base layer is not downloaded by its deadline, we restart the $`W`$-length interval at this point and thus recompute the decisions for the next $`W`$ time-instants. The random drop and random increase policies are also utilized for the schedule up to the $`W-d`$ chunks.

Layered Bin Packing Adaptive Algorithm

Backward Algorithm

Input: B, X, X_n, C, L, deadline, B_m, bf, t, c, a, e

Output: X(i), the size of chunk i; I_n, the set of chunks that can be fetched in quality up to the n-th layer

Initialization: i = C, j = deadline(C), bf(j) = 0 ∀j

if (bf(deadline(i)) = B_m) then i = i − 1

rem1 = c(j) − c(1) + e(1), rem2 = rem1

rem2 = c(j) − c(t(i)), rem1 = rem2 + e(t(i)) + a(i)

if (X(i) > 0) then X_n(i) = X(i) else i = i − 1

e(t(i)) = e(t(i)) + rem1 − X_n

X(i) = X_n(i), I_n ← I_n ∪ i

fetched = min(B(j), X_n(i)), B(j) = B(j) − fetched

X_n(i) = X_n(i) − fetched

if (X_n(i) > 0) then bf(j) = bf(j) + L

if (X_n(i) = 0) then i = i − 1

if (B(j) = 0) then j = j − 1

j = j − 1

Forward Algorithm

Input: B, X, C, deadline, B_m, bf, I_n

Initialization: j = 1, k = 1

i = I(k)

if (…) then k = k + 1

if (bf(j) = B_m) then j = j + 1

fetched = min(B(j), X(i))

t(i) = j, a(i) = fetched

B(j) = B(j) − fetched

e(j) = B(j), X(i) = X(i) − fetched

if (X(i) > 0) then bf(j) = bf(j) + L

if (X(i) = 0) then k = k + 1

if (B(j) = 0) then j = j + 1

k = k + 1

Non real time streaming algorithm

In this section, we will describe the algorithm for non-real time streaming assuming that we non-causally know the future bandwidth. We will further relax the assumption to consider prediction errors, and give an online algorithm.

Non-causal Algorithm

The main difference between real-time and non-real-time streaming is that, rather than skipping a frame, we continue downloading it until it is completely downloaded. Since the main objective is to minimize the number of stalls, we consider the algorithm where there is no stall and all the delay is in the form of a start-up delay. Using the greedy policy for a given start-up delay, we can determine whether the set of frames for which the base layer is downloaded, $`I`$, contains all the frames. The start-up delay is chosen as the minimum time for which all the frames can be downloaded at at least the base layer. This start-up delay can be found by binary search. Having found it, the real-time non-causal algorithm is used to find the quality of each frame.
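The binary search for the start-up delay can be sketched as follows. This is only an illustration under simplifying assumptions that are not stated in the text: time is slotted, `bandwidth[j-1]` is the data that can be fetched in slot $`j`$, all base layer frames have the same size, frame $`i`$ must be complete by slot $`(i-1)L+s`$, no bandwidth is available after the trace ends, and buffer limits are ignored. Under these assumptions feasibility is monotone in the start-up delay, which is what makes binary search applicable. All names are hypothetical.

```python
from itertools import accumulate


def min_startup_delay(bandwidth, num_frames, frame_len, base_size):
    """Smallest start-up delay s with no stall at base layer, or None."""
    # cum[t] = total data that can be fetched in slots 1..t
    cum = [0] + list(accumulate(bandwidth))

    def feasible(s):
        # In-order fetching meets every deadline iff the cumulative
        # bandwidth by each deadline covers all frames due by then.
        for i in range(1, num_frames + 1):
            deadline = min((i - 1) * frame_len + s, len(bandwidth))
            if cum[deadline] < i * base_size:
                return False
        return True

    lo, hi = 0, len(bandwidth)
    if not feasible(hi):
        return None               # even the largest delay is not enough
    while lo < hi:                # invariant: feasible(hi) is True
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For example, with the trace `[0, 0, 2, 2, 2]`, three frames of size 2, and one slot per frame, the first frame cannot be complete before slot 3, so the sketch returns a start-up delay of 3.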

Non-Real Time Scheduling with Prediction Error

In the presence of prediction error (assuming that the error has zero mean), we use the start-up delay and the schedule decided above and start fetching the frames. If a frame is not completed by its deadline at the required quality but at least its base layer was received, the frame is played at the highest quality available and we continue forward. However, if even the base layer of the current frame is not downloaded by its deadline, a re-optimization is performed at that point to find the start-up delay for the future time bin such that there is no expected stall. We note that the random drop and random increase adjustments mentioned for real-time streaming can also be used in this case.

Non-Real Time Online Scheduling Algorithm

We assume that the optimization is performed every $`\alpha`$ time-instants and that the prediction for a window of length $`W`$ is known. We first assume that the bandwidth is known perfectly. Given the bandwidth for the next $`W`$ time-instants, the proposed non-causal optimization algorithm is performed (including finding the start-up delay needed, if any) for all the frames that have not been downloaded so far and are within the first $`W-d`$ time-instants of this window. If these frames are downloaded earlier, the rest of the frames are downloaded in the highest quality.

In the presence of prediction error, similar modifications are made as in the previous subsection. More precisely, the part of the frame not downloaded by its deadline is skipped. If the base layer is not downloaded by its deadline, we restart the $`W`$-length interval at this point and thus recompute the decisions for the next $`W`$ time-instants. The random drop and random increase policies are also utilized for the schedule up to the $`W-d`$ frames.

No-Skip Based Streaming Algorithm

\begin{eqnarray}
\textbf{Maximize: } \sum_{n=1}^{N}\gamma^n\sum_{i=1}^{C}\beta^i Z_{n,i}-\lambda d(C)
\label{equ:eq2}
\end{eqnarray}

subject to,

\begin{eqnarray}
\sum_{j=1}^{(i-1)L+s+d(i)} z_0(i,j) = Y_{0}   \forall i = 1, \cdots, C
\label{equ:c1eq2}
\end{eqnarray}

\begin{eqnarray}
\sum_{j=1}^{(i-1)L+s+d(i)} z_n(i,j) = Z_{n,i}, \quad  \forall i,  n>0
\label{equ:c2eq2}
\end{eqnarray}

\begin{eqnarray}
 Z_{n,i} \le \frac{Y_n}{Y_{n-1}}Z_{n-1,i}, \quad  \forall i,  n
\label{equ:c2eq21}
\end{eqnarray}

\begin{eqnarray}
\sum_{n=0}^N\sum_{i=1}^{C} z_n(i,j)  \leq B(j) \  \   \forall 1\le j\le (C-1)L+s+d(C),
\label{equ:c3eq2}
\end{eqnarray}

\begin{eqnarray}
\sum_{n=0}^N\sum_{i, (i-1)L+s+d(i) > t}{\bf I}\Bigg(\sum_{j=1}^t\bigg(z_n(i,j)\bigg)> 0\Bigg) L \leq B_m \   \forall t
\label{equ:c4eq2}
\end{eqnarray}

\begin{equation}
z_n(i,j) \geq 0\   \forall i = 1, \cdots, C
\label{equ:c5eq2}
\end{equation}

\begin{equation}
z_n(i,j) = 0\   \forall i, j > (i-1)L+s+d(i)
\label{equ:c6eq2}
\end{equation}

\begin{equation}
d(i+1) \geq d(i)\geq 0\   \forall i = 1, \cdots, C-1 \label{deq}
\end{equation}

\begin{equation}
Z_{n,i} \in {\mathcal Z}_n \quad  \forall i = 1, \cdots, C, \text{ and } \forall n = 1, \cdots, N
\label{equ:c7eq2}
\end{equation}

\begin{eqnarray}
\text{Variables:}&& z_n(i,j), Z_{n,i}, d(i) \forall   i = 1, \cdots, C,  \nonumber \\
&& 1\le j \le (C-1)L+s+d(C), n = 0, \cdots, N \nonumber
\end{eqnarray}

\begin{equation}
\lambda >  \sum_{n=0}^N \gamma^n Y_n\sum_{i=1}^C\beta^i
\label{lambda_cond}
\end{equation}
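Condition ([lambda_cond]) can be read as requiring the stall penalty $`\lambda`$ to exceed the largest weighted quality that the objective can accumulate, so that one unit of stall always outweighs any quality gain. The sketch below computes the right-hand side of the condition; it assumes that `layer_sizes[n]` holds $`Y_n`$ for $`n = 0, \cdots, N`$, and the function name is hypothetical.

```python
def lambda_lower_bound(layer_sizes, gamma, beta, num_chunks):
    """Right-hand side of the lambda condition.

    layer_sizes[n] is Y_n for n = 0..N (assumption); lambda must be
    strictly greater than the returned value.
    """
    layer_term = sum(gamma ** n * y_n for n, y_n in enumerate(layer_sizes))
    chunk_term = sum(beta ** i for i in range(1, num_chunks + 1))
    return layer_term * chunk_term
```

For instance, with $`Y_0 = 2`$, $`Y_1 = 1`$, $`\gamma = 0.5`$, $`\beta = 1`$, and $`C = 4`$, the bound is $`(2 + 0.5) \cdot 4 = 10`$, so any $`\lambda > 10`$ satisfies the condition.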


Conclusions and Future work

We formulated the SVC rate adaptation problem as a non-convex optimization problem whose objective is to minimize the skip/stall duration as the first priority, maximize the average playback rate as the second priority, and minimize the quality switching rate as the last priority. We developed LBP (Layered Bin Packing Adaptive Algorithm), a low-complexity algorithm that is shown to solve the problem optimally in polynomial time. The offline LBP algorithm, which uses perfect prediction of the bandwidth for the whole duration of the video, therefore provides a theoretical upper bound. Moreover, an online LBP that is based on a sliding window and solves the optimization problem for a few chunks ahead was proposed for the more practical scenarios in which the bandwidth is predicted only a short time ahead and the prediction has errors. The results indicate that LBP is robust to prediction errors and works well with short prediction windows. It outperforms existing streaming approaches by improving key QoE metrics. Finally, LBP incurs low runtime overhead due to its linear complexity.

Extending the results to consider streaming over multiple paths with link preferences is an interesting problem, and is being considered by the authors in .

This paper proposes a streaming algorithm for videos whose chunks are encoded using a layered Scalable Video Coding. We addressed numerous challenges including strategically leveraging network bandwidth prediction, adapting the algorithm to both skip and no-skip streaming scenarios, and reducing the computational overhead. Extensive simulation and emulation evaluations reveal the performance and robustness of our scheme, based on which we are currently implementing a full SVC-based video streaming system.

Optimal Linear-time Solution


Optimality of the Proposed Algorithm

Proof. Proof is provided in Appendix 40. ◻

\begin{equation*}
t_f(i) \leq t_x(i).
\end{equation*}

\begin{equation}
\sum_{i=1}^{C}\beta^i Z_{n,i}^\prime \leq \sum_{i=1}^{C}\beta^i Z_{n,i}^*
\label{equ:eq1_lemma1}
\end{equation}

Proof. Proof is provided in Appendix 41. ◻

We note that Lemma [lem:skip:beta1] is a corollary of Lemma [lem:skip:betage1], which can be obtained when $`\beta=1`$.

\begin{equation}
 \sum_{m=0}^M \gamma^m\sum_{i=1}^{C}\beta^i Z_{m,i}^\prime \leq \sum_{m=0}^{M} \gamma^m\sum_{i=1}^{C}\beta^i Z_{m,i}^*.
\label{equ:thm}
\end{equation}

Proof. Proof is provided in Appendix 43. ◻

Layered Bin Packing Adaptive Algorithm finds the optimal solution to the optimization problem [equ:eq1]-[equ:cl].

ABR (MGC for SVC)

Priority is given to lower layers over higher layers among all chunks (i.e., $`\gamma < 1`$).
For a given layer, priority is given to dynamic chunks over static chunks. $`q(\cdot)`$ is the sum of chunk sizes scaled by different scaling factors, where the scaling factors of dynamic chunks are much higher than those of static chunks.
A finite number of rates is associated with every layer, from the lowest up to the highest, where most static chunks are encoded at the lowest rate and most dynamic chunks at the highest; the rate of any chunk encoded at that layer can be described as a function of the nominal rate of that layer as $`r_{n,i}=\tau_i r_n`$.
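As a small illustration of the relation $`r_{n,i}=\tau_i r_n`$, the sketch below builds the per-chunk rates from the nominal layer rates and the per-chunk factors. It assumes that `nominal_rates[n]` holds $`r_n`$ and `taus[i]` holds $`\tau_i`$; the function name and the sample values are hypothetical.

```python
def chunk_rates(nominal_rates, taus):
    """Return rates with rates[n][i] = taus[i] * nominal_rates[n]."""
    return [[tau * r_n for tau in taus] for r_n in nominal_rates]


# Two layers with nominal rates 1.0 and 2.0, one static chunk
# (tau = 0.5) and one dynamic chunk (tau = 1.5).
rates = chunk_rates([1.0, 2.0], [0.5, 1.5])
```

Here a dynamic chunk with a larger $`\tau_i`$ receives a proportionally higher rate than a static chunk in every layer.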