System Stability Under Adversarial Injection of Dependent Tasks

A PREPRINT

Vicent Cholvi
Departament de Llenguatges i Sistemes Informàtics, Universitat Jaume I, Castelló, Spain

Juan Echagüe
Departament de Llenguatges i Sistemes Informàtics, Universitat Jaume I, Castelló, Spain

Antonio Fernández Anta
IMDEA Networks Institute, Madrid, Spain

Christopher Thraves Caro
Depto. Ing. Mat., Facultad de Ciencias Físicas y Matemáticas, Universidad de Concepción, Chile

October 7, 2019

ABSTRACT

In this work, we consider a computational model of a distributed system formed by a set of servers in which jobs, that are continuously arriving, have to be executed. Every job is formed by a set of dependent tasks (i.e., each task may have to wait for others to be completed before it can be started), each of which has to be executed in one of the servers. The arrival of jobs and their properties is assumed to be controlled by a bounded adversary, whose only restriction is that it cannot overload any server. This model is a non-trivial generalization of the Adversarial Queuing Theory model of Borodin et al., and, like that model, focuses on the stability of the system: whether the number of jobs pending to be completed is bounded at all times. We show multiple results of stability and instability for this adversarial model under different combinations of the scheduling policy used at the servers, the arrival rate, and the dependence between tasks in the jobs.

Keywords: Task scheduling · task queuing · adversarial queuing · dependent tasks · stability

1 Introduction

In this work, we consider a model of jobs formed by dependent tasks that have to be executed in a set of servers. The dependencies among the tasks of a job restrict the order and time of their execution.
For instance, a task $q$ may need some information from another task $p$, so that the latter must complete before $q$ can be executed. This model embodies, for instance, the dynamics of Network Function Virtualization (NFV) systems [13, 23] or Osmotic Computing (OC) [22]. In an NFV system, network services (which are job types) are specified as service chains, obtained by the concatenation of network functions. These network functions are dependent computational tasks to be executed in the NFV Infrastructure (e.g., servers distributed over the network). In an OC system, an application is divided into microservices that are distributed and deployed on an edge/cloud server infrastructure. The user requests (jobs) involve processing (tasks) in several of these microservices, as defined by an orchestrator that takes into account the dependencies between the microservices. In that line, it also encompasses a number of features of Orchestration Languages (see, for instance, [16]), which propose a way to relate concurrent tasks to each other in a controlled fashion: the invocation of tasks to achieve a goal, the synchronization between tasks, managing priorities, etc.

In our model, we consider a dynamic system in which job requests (or jobs for short) are continuously arriving. Each job contains the whole specification of its dependent tasks: the collection of tasks to be executed, the server that must execute each task, the time the execution incurs, the dependencies among tasks, etc. Instead of assuming stochastic job arrivals into the system, in our model we assume the existence of an adversary that has full control of the job request arrivals and the specification of their tasks. The only restriction on the adversary is that no server can be overloaded in the long run (some burstiness in the load is allowed).
In this adversarial framework, the objective is to achieve stability in the system. This means that the system is able to cope with the adversarial arrivals, maintaining the number of pending job requests in the system bounded at all times. (This usually also implies that all the job requests are eventually completed.) Observe that the framework assumes that the resource allocation is done by the adversary (since it chooses where tasks have to be executed and in which order). Hence, the only tool we have to achieve stability is the scheduling of tasks in the different servers.

The study of the quality of service that can be provided under worst-case assumptions in a given system (NFV or OC, for instance) is important in order to be able to honor Service Level Agreements (SLA). The positive results we obtain in this paper show that it is possible to guarantee a certain level of service even under pessimistic assumptions. These results can also be used to separate resource allocation and scheduling as long as the resource allocation guarantees that servers are not overloaded, since we prove that it is possible to guarantee stability in this case.

1.1 Related Work

For many years, the common belief was that only overloaded queues¹ could generate instability, while underloaded ones could only induce delays that are longer than desired, but always remain stable. This general wisdom goes back to the models of networks originally developed by Kleinrock [17], and based on Jackson queuing networks [14]. Stability results for more general classes of queuing networks [2, 15] also confirmed that only overload generates instability. This belief was shown to be wrong when it was observed that, in some networks, the backlogs in specific queues could grow indefinitely even when such queues were not overloaded [18, 19]. Motivated by this fact, there has been an effort to understand the factors that can affect the stability of a queueing network.
In [8], the authors provide conditions that yield bounded queue lengths both for a single queue and for feedforward networks. In [7], the authors present a class of networks in which queues are served substantially more quickly than the rate at which tasks are injected, and whose mean service times are as small as desired. By using a simple queueing network, in [10] it was shown that conditions on the mean interarrival and service times are not enough to determine its stability under a particular policy. It was later shown that instability could also arise in some types of Kelly networks [1, 6] (a network is said to be of the Kelly type [15] when servers have the same service rates). These results in the Adversarial Queuing Theory (AQT) model aroused an interest in understanding the stability properties of packet-switched networks. This has attracted the attention of many researchers in recent years (see, for instance, the results in [4, 9, 20, 21]).

1.2 Our Work

In this paper, we introduce a model to analyze queuing systems of computational jobs formed by dependent tasks. We call this model Adversarial Job Queueing (AJQ). The main contribution of the AJQ model is a novel approach for modeling tasks and their dependencies in the computational job system, which is much richer than the modelling capabilities of AQT. As mentioned, a job is composed of a set of tasks, each described by some parameters, like the server in which the task must be executed or the time that the task needs to be completed. Additionally, each task depends on other tasks of the same job (i.e., subsets of tasks that must be completed before the given task starts). The rich variety of task dependencies that we allow is, as far as we know, unique to our formalism, and makes AJQ well suited to model a variety of complex scenarios (including AQT as a special case).
For instance, our model allows imposing that a task $q$ cannot start until a set $P$ of other tasks of the same job are completed. This expresses a scenario in which task $q$ aggregates the results obtained by the tasks in $P$. One example of this configuration is a MapReduce computation [11], in which the reduce task has to wait for all the map tasks to complete. This dependence, in which the task $q$ needs all the tasks in set $P$ to complete, is called an AND dependence. However, our formalism allows for more expressiveness by means of the OR dependence, in which several AND dependencies are combined. In this case, a task $q$ has several sets $P_1, P_2, \ldots, P_l$, and it waits for any set $P_i$ to be completed. This configuration appears, for instance, when several redundant tasks are used, so that the output of any of them is equally valid as input for $q$ [12].

As mentioned, tasks are processed in servers. When tasks are active (ready to be executed) at a server but not being processed yet, they are maintained in a queue at the server. It is assumed that each server has an infinite buffer to store its own queue of active tasks.

We use a bounded adversarial setting in our model. In this setting, we assume that an adversary injects jobs in the system, choosing the time and the characteristics of each injected job, with certain limits. This leads to worst-case system analyses. A desirable property under this model is that each server's service rate matches its injection rate for an arbitrarily long period of time, which implies the stability of the system, since the number of jobs at any time is bounded.

¹ A server queue is considered to be overloaded when the total arrival rate at the server is greater than the service rate.
In the rest of this paper, we define the model more formally and provide some results regarding both stability and instability under different assumptions. From the point of view of the dependencies between tasks, we show that if they are feed-forward (see below) then the system is stable. From the point of view of the scheduling policies (i.e., how a server decides which task to execute next), we observe that, since AJQ is more general than AQT (once we do the appropriate matching between jobs and packets, and between links and servers), unstable scheduling policies in AQT are easily translated into policies that are unstable in AJQ. On the other hand, we show that some stable scheduling policies in AQT remain stable in AJQ. For instance, we prove that LIS, which gives priority to older tasks/packets (and is stable in AQT for any rate below 1), is stable in AJQ if the injection rate of jobs is below a certain value that depends on the tasks' processing time and activation delay. Finally, we show that there are other policies that are stable in AQT but unstable in AJQ.

2 Model

In this section, we define the Adversarial Job Queueing (AJQ) model. The AJQ model is designed to analyze systems of queueing jobs. The three main components of an AJQ system $(S, P, A)$ are:

• a set $S = \{s_1, s_2, \ldots, s_n\}$ of $n$ servers,
• an adversary $A$ who injects jobs in the system, and
• a scheduling policy $P$, which is the criterion used by servers to decide which task to serve next among the tasks waiting in their queues.

The system evolves over time continuously (unlike AQT, which assumes discrete time). At each moment, the adversary may inject jobs into the system while the servers process those jobs. At each moment as well, some tasks may be waiting to be executed, others may be in process, and others may be completed. A job is considered completed when all its tasks are completed. When a job is completed, all its tasks disappear from the system.
Each job $\langle K, f_K \rangle$ consists of a finite set $K$ of tasks and a function $f_K$ that determines dependencies among the tasks. (For simplicity we will denote the job $\langle K, f_K \rangle$ by its task set $K$.) Let $K = \{k_1, k_2, k_3, \ldots, k_{l_K}\}$ be a job, where each $k_i$ is a task of $K$. The integer $l_K$ denotes the number of tasks of $K$. Each task $k_i$ is defined by three parameters $\langle s_i^K, d_i^K, t_i^K \rangle$. The parameter $s_i^K \in S$ is the server in which $k_i$ must be executed. The parameter $d_i^K \ge 0$ is the activation delay of $k_i$. The parameter $t_i^K > 0$ is the processing time of $k_i$, i.e., the time server $s_i^K$ takes to execute task $k_i$.

Let $(S, P, A)$ be an AJQ system. Let $T_{\max} := \max_{i,K}\{t_i^K\}$ and $T_{\min} := \min_{i,K}\{t_i^K\}$ be the maximum and minimum time, respectively, required to complete a task of any job $K$ injected in the system. We assume that these two quantities are bounded and do not depend on the time. Let $D_{\min} := \min_{i,K}\{d_i^K\}$ and $D_{\max} := \max_{i,K}\{d_i^K\}$ be the minimum and maximum activation delay, respectively, among all tasks of any job injected in the system. Since $d_i^K \ge 0$, it follows that $D_{\min} \ge 0$. On the other hand, we assume that $D_{\max}$ is a constant that may depend on the parameters of the system, but it does not change over time. Finally, we use $L = \max_K\{l_K\}$ to denote the maximum number of tasks (length) of a job, which we assume is also a constant that does not depend on the time.

Feasibility. Let $\mathcal{P}(K)$ be the power set of $K$, i.e., the set of all subsets of $K$. Furthermore, let $\mathcal{P}^2(K)$ be the second power set of $K$, i.e., the set of all subsets of $\mathcal{P}(K)$. Given a job $K$, a feasibility function $f_K : K \to \mathcal{P}^2(K)$ determines which tasks of $K$ are feasible, which means that they are ready to be executed once the activation delay has passed. Let $f_K(k_i)$ be equal to $\{A_1, A_2, \ldots, A_{\ell_i}\}$. The sets $A_x$ for $1 \le x \le \ell_i$ are called feasibility sets for $k_i$.
Then, the task $k_i$ is feasible at a time $t$ if there exists a feasibility set $A_x$ for $k_i$ such that all tasks in $A_x$ have been completed by time $t$. Otherwise, $k_i$ is blocked, and still has to wait for some other tasks of $K$ to complete before becoming feasible.

The activation delay $d_i^K$ of a task $k_i$ represents a setup cost, expressed in time, that $k_i$ must incur once it becomes feasible and before it can start to be processed. If $t$ is the time instant at which $k_i$ becomes feasible, then $k_i$ will incur its activation delay during the time interval $[t, t + d_i^K]$. Hence, it cannot be executed during such interval, in which we say that task $k_i$ is a delayed feasible task (or simply a delayed task). When $k_i$ completes its activation delay at time $t + d_i^K$, it can be served, and from that moment on it will be referred to as an active feasible task, or simply an active task. Equivalently, a feasible task is active if it has been feasible for at least $d_i^K$ time. A job with at least one feasible (resp., active) task will be referred to as a feasible (resp., active) job.

Table 1: Feasibility functions of two different jobs $J$ and $K$ defined over the same set of tasks $\{1, 2, 3, 4, 5\}$. Column $f_J(\cdot)$ shows the feasibility function of job $J$, while column $f_K(\cdot)$ shows the feasibility function of job $K$. Job $J$ is not doable, since tasks 2, 3, 4 and 5 cannot be assigned a layer. On the other hand, job $K$ is doable. Indeed, the number of each task in job $K$ corresponds to its layer.

| Task | $f_J(\cdot)$ | $f_K(\cdot)$ |
|---|---|---|
| 1 | $\{\emptyset\}$ | $\{\emptyset\}$ |
| 2 | $\{\{1, 5\}\}$ | $\{\{1\}, \{5\}\}$ |
| 3 | $\{\{2\}\}$ | $\{\{2\}\}$ |
| 4 | $\{\{3\}\}$ | $\{\{3\}\}$ |
| 5 | $\{\{4\}\}$ | $\{\{4\}\}$ |

Figure 2: Skeleton of jobs $J$ and $K$ presented in Table 1, which are the same.
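The feasibility and activation rules can be made concrete with a small sketch. The code below is only an illustration under our own assumptions: the names `F_K`, `is_feasible` and `is_active` are hypothetical and do not come from the paper, and the feasibility function is encoded as a dictionary from task number to a set of feasibility sets, using job $K$ of Table 1.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical encoding: task number -> set of feasibility sets (Table 1, job K).
FeasibilityFn = Dict[int, Set[FrozenSet[int]]]

F_K: FeasibilityFn = {
    1: {frozenset()},                    # initial task: the empty set is a feasibility set
    2: {frozenset({1}), frozenset({5})}, # OR dependence: task 1 alone or task 5 alone
    3: {frozenset({2})},
    4: {frozenset({3})},
    5: {frozenset({4})},
}


def is_feasible(task: int, completed: Set[int], f: FeasibilityFn) -> bool:
    """A task is feasible if at least one of its feasibility sets is fully completed."""
    return any(a <= completed for a in f[task])


def is_active(now: float, feasible_since: float, delay: float) -> bool:
    """A feasible task is active once it has been feasible for at least its activation delay."""
    return now - feasible_since >= delay
```

For example, task 2 of job $K$ is blocked while nothing has completed, and becomes feasible as soon as task 1 completes; with an activation delay of 3 it is a delayed task until 3 time units later.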
With the feasibility function, a task cannot start being served until some given state of the tasks in the same job holds. Hence, the feasibility function can be used, for instance, to force the execution sequence of the tasks of a job. It enhances the modeling capabilities of the AJQ model by allowing the coexistence of AND dependencies and OR dependencies, as mentioned.

Doability. Let $K$ be a job and $k_i$ be a task of $K$. We say that $k_i$ is an initial task of $K$ if $\emptyset \in f_K(k_i)$. Observe that all initial tasks $k_i$ are automatically feasible at the time the job $K$ is injected, and they become active $d_i^K$ time later. We assign a layer $\lambda(K, i)$ to the tasks $k_i$ of a job $K$ as follows. All initial tasks have layer $\lambda(K, i) = 1$. For any $j > 1$, a task $k_i$ is assigned layer $\lambda(K, i) = j$ if it is not feasible when all tasks of layers $1, \ldots, j-2$ are completed, but it becomes feasible when additionally the tasks of layer $j-1$ are completed. Let $\lambda_K \le l_K$ denote the number of layers of job $K$. If a task $k_i$ has layer $\lambda(K, i) = \ell$, then there is a feasibility set $A_x \in f_K(k_i)$ for $k_i$ such that $A_x \subseteq \{k_j \in K : \lambda(K, j) < \ell\}$.

Observe that the above definition does not guarantee that all tasks of a job will be assigned a layer. In fact, it is not hard to create jobs that have task dependencies (e.g., cyclic dependencies) that prevent some tasks from being assigned a layer. Table 1 shows an example of a job whose tasks get layer numbers and an example with tasks that cannot be assigned a layer number. We want every job to be potentially completed. Therefore, we impose some restrictions over every feasibility function.

Definition 1. Let $K$ be a job and $f_K : K \to \mathcal{P}^2(K)$ be its feasibility function. We say that $K$ is doable if every task $k_i$ of $K$ can be assigned a layer.
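As an illustration of the layer assignment and of Definition 1, the following sketch computes layers iteratively and reports a job as doable when every task receives one. It is a minimal sketch under our own assumptions: `assign_layers` and `is_doable` are hypothetical names, and the two feasibility functions are those of jobs $J$ and $K$ in Table 1.

```python
from typing import Dict, FrozenSet, Set

FeasibilityFn = Dict[int, Set[FrozenSet[int]]]

# Feasibility functions of jobs J and K from Table 1.
F_J: FeasibilityFn = {
    1: {frozenset()},
    2: {frozenset({1, 5})},              # AND dependence on tasks 1 and 5
    3: {frozenset({2})},
    4: {frozenset({3})},
    5: {frozenset({4})},
}
F_K: FeasibilityFn = {
    1: {frozenset()},
    2: {frozenset({1}), frozenset({5})}, # OR dependence on task 1 or task 5
    3: {frozenset({2})},
    4: {frozenset({3})},
    5: {frozenset({4})},
}


def assign_layers(f: FeasibilityFn) -> Dict[int, int]:
    """Return the layer of every task that can be assigned one."""
    layers: Dict[int, int] = {}
    completed: Set[int] = set()  # tasks of all layers assigned so far
    layer = 1
    while len(layers) < len(f):
        # Tasks without a layer that are feasible once all previous layers complete.
        newly = [k for k in f
                 if k not in layers and any(a <= completed for a in f[k])]
        if not newly:
            break  # remaining tasks can never become feasible
        for k in newly:
            layers[k] = layer
        completed.update(newly)
        layer += 1
    return layers


def is_doable(f: FeasibilityFn) -> bool:
    """A job is doable if every one of its tasks is assigned a layer."""
    return len(assign_layers(f)) == len(f)
```

Each pass scans every task and every feasibility set once, and there are at most $l_K$ passes, so the running time is polynomial in the size of the job.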
It is worth mentioning that whether a job is doable can be decided in polynomial time with respect to the size of the job (which takes into account the number of tasks and the size of the feasibility function). Indeed, layer 1 can be computed by checking which tasks have the empty set as a feasibility set. Then, a simple recursive algorithm computes all tasks in layer $i$ using all the tasks in layers $1, 2, \ldots, i-1$.

We show in the next proposition that being doable is necessary for a job to be completed, and that it is also sufficient if it is the only job injected in a system and the scheduling policy is work conserving.

Proposition 1. Let $(S, P, A)$ be a system where the adversary $A$ injects only one job $K$ and $P$ is work conserving. Then, $K$ can be completed if and only if $K$ is doable.

Proof. On one hand, if $K$ is doable, $K$ can be completed, since until that happens, there will always be at least one feasible task not completed. To see this, assume by contradiction that there is a moment before $K$ is completed such that no task is feasible. Consider any task $k_i$ among those that have not been completed yet with the smallest layer (since $K$ is doable, all tasks have a layer). Therefore, $k_i$ has a feasibility set that is a subset of the completed tasks. Hence, $k_i$ is feasible, which is a contradiction. Then, since there are always feasible tasks, their activation time is bounded, there are no other jobs in the system, and $P$ is work conserving, eventually all tasks of $K$ will become active, be scheduled and processed, and complete.

On the other hand, assume that job $K$ is not doable but completes all its tasks in system $(S, P, A)$. Then, all tasks in $K$ become feasible at some point in time, even those that are not assigned a layer. Consider the first task $k_i$ that becomes feasible among those that have no layer (break ties randomly).
If this happens at time $t$, let $U$ be the set of tasks that completed by time $t$, and let $\ell = \max_{k_j \in U} \lambda(K, j)$. Then, from the procedure to assign layers to tasks, $k_i$ would have been assigned a layer $\lambda(K, i) \le \ell + 1$, which is a contradiction.

Topologies. Let $K$ be a job, and $k_i$ and $k_j$ be two tasks of $K$. We say that $k_i$ depends on $k_j$ if there exists a feasibility set $A_x \in f_K(k_i)$ for $k_i$ such that $k_j \in A_x$.

Definition 2. The skeleton of a job $K$ is the directed graph $H_K = (V, E)$, where $V(H_K) := \{k_1, k_2, \ldots, k_{l_K}\}$ and $E(H_K) := \{(k_j, k_i) : k_i \text{ depends on } k_j\}$.

It is worthwhile to mention that a skeleton does not define the feasibility function of a job. The two jobs presented in Table 1 are different jobs on the same set of tasks and with the same skeleton (see Figure 2). Nevertheless, one of the two jobs in Table 1 is doable and the other is not. Hence, the skeleton does not even differentiate between doable and not doable jobs.

The topology of a job $K$ is the directed graph obtained by mapping the skeleton of $K$ into the set of servers, where each task $k_i$ is mapped into its corresponding server $s_i^K$.

Definition 3. Given a system $(S, P, A)$, the topology of the system is the directed graph obtained by overlapping the topology of all jobs injected by $A$ in the system.

Figures 3 and 5 show the skeleton of two jobs whose feasibility functions are described in Tables 4 and 6. Figures 3 and 5 also show the layers of the jobs. The topology of a system in which only those two jobs are injected is shown in Figure 7.

Scheduling policy. We assume that each server has an infinite buffer to store its own queue of tasks. Every active task waits in the queue of its corresponding server. In each server, a scheduling policy $P$ specifies which task of all active tasks in its queue to serve next.
We assume that scheduling policies are greedy/work conserving (i.e., a server always decides to serve if there is at least one active task in its queue). Examples of policies are First-In-First-Out (FIFO), which gives priority to the task that first came in the queue, or Last-In-First-Out (LIFO), which gives priority to the task that came last in the queue. Other policies will be defined later in the document.

Adversary. We assume that there is a malicious adversary $A$ who injects doable jobs into the system. In order to avoid trivial overloads, the adversary is bounded in the following way. Let $N_s(I)$ be the total load injected by the adversary during time interval $I$ in server $s$ (i.e., $N_s(I) = \sum t_i^K$ over all jobs $K$ injected during $I$ and tasks $k_i$ such that $s_i^K = s$). Then, for every server $s$ and interval $I$ the adversary is bounded by:

$$N_s(I) \le r|I| + b, \qquad (1)$$

where $0 < r \le 1$ is called the injection rate, and $b > 1$ is called the burstiness allowed to the adversary. Observe that (1) implies $\max_{i,K}\{t_i^K\} \le b$, since jobs are injected instantaneously. An adversary that satisfies (1) is called a bounded $(r, b)$-adversary, or simply an $(r, b)$-adversary.

As mentioned, the system formed by an $(r, b)$-adversary $A$ injecting doable jobs in the set of servers $S$ using the scheduling policy $P$ is called an AJQ system $(S, P, A)$. The number of active tasks in the queue of server $s$ at time $t$ is denoted $Q_s(t)$.

Definition 4. Let $(S, P, A)$ be an AJQ system. We say that the system $(S, P, A)$ is stable if there exists a value $M$ such that $Q_s(t) \le M$ for all $t$ and for all $s \in S$, where $M$ may depend on the system parameters (adversary, servers, and jobs characteristics) but not on the time.

Definition 5. Let $P$ be a policy. If a system $(S, P, A)$ is stable against any $(r, b)$-adversary $A$ with rate $r < 1$, then we say that the policy $P$ is universally stable.
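Restriction (1) can be checked on a finite injection trace. The sketch below is an illustration under our own assumptions: the trace format (a list of `(time, server, processing_time)` triples, one per injected task) and the name `respects_bound` are hypothetical. Because jobs are injected instantaneously, it is enough to test closed intervals whose endpoints are injection times, since shrinking any other interval to its first and last injection keeps the load and reduces $|I|$.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical trace format: one (injection time, server, processing time) per task.
Injection = Tuple[float, str, float]


def respects_bound(injections: List[Injection], r: float, b: float) -> bool:
    """Check N_s(I) <= r*|I| + b for every server s and every interval I."""
    per_server: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for time, server, load in injections:
        per_server[server].append((time, load))

    eps = 1e-12  # tolerance for floating-point rounding
    for events in per_server.values():
        events.sort()
        for i in range(len(events)):
            start = events[i][0]
            load = 0.0
            # Grow the interval [start, events[j].time] one injection at a time.
            for j in range(i, len(events)):
                load += events[j][1]
                if load > r * (events[j][0] - start) + b + eps:
                    return False
    return True
```

The check is quadratic in the number of injections per server, which is adequate for small traces used to sanity-check a candidate adversary.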
In the next sections, we provide some results regarding both stability and instability in the AJQ model.

3 Stability and Instability of Scheduling Policies

From the point of view of the scheduling policies (i.e., how a server decides which task to choose from the set of active tasks pending to be executed), in this section we show stability of the policy that gives priority to the task (job) that has been in the system for the longest period of time. On the other hand, we show that other well-known scheduling policies are not stable.

Figure 3: Skeleton of job $R$ with set of tasks $\{1, 2, 3, 4, 5, 6\}$. The layers of the job (Layer 1 to Layer 4) are circled in red.

Table 4: Feasibility function of job $R$ and the servers to which the tasks of job $R$ are assigned.

| Task $i$ of job $R$ | $f_R(i)$ | $s_i^R$ |
|---|---|---|
| 1 | $\{\emptyset\}$ | $s_1$ |
| 2 | $\{\emptyset\}$ | $s_1$ |
| 3 | $\{\{1\}\}$ | $s_1$ |
| 4 | $\{\{1, 2, 3\}\}$ | $s_2$ |
| 5 | $\{\{4\}\}$ | $s_3$ |
| 6 | $\{\{3\}, \{5\}\}$ | $s_4$ |

Figure 5: Skeleton of job $M$ with set of tasks $\{1, 2, 3, 4, 5\}$. The layers of the job (Layer 1 to Layer 3) are circled in red.

Table 6: Feasibility function of job $M$ and the servers to which the tasks of job $M$ are assigned.

| Task $i$ of job $M$ | $f_M(i)$ | $s_i^M$ |
|---|---|---|
| 1 | $\{\emptyset\}$ | $s_4$ |
| 2 | $\{\{1\}\}$ | $s_3$ |
| 3 | $\{\{1\}\}$ | $s_3$ |
| 4 | $\{\{1, 2\}\}$ | $s_2$ |
| 5 | $\{\{1, 3\}, \{4\}\}$ | $s_1$ |

Figure 7: Topology over servers $s_1, s_2, s_3, s_4$. The topology of job $R$ (described in Figure 3 and Table 4) is drawn in solid lines, and the topology of job $M$ (described in Figure 5 and Table 6) in dashed lines. All the lines together form the topology of a system in which only these two jobs are injected by the adversary.
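The skeleton and topology definitions of Section 2 can be illustrated on jobs $R$ and $M$. The sketch below encodes Tables 4 and 6 and derives the arcs of each skeleton, of each job topology, and of the system topology. The encoding (a dictionary from task number to a pair of feasibility sets and server name) and the names `JOB_R`, `JOB_M`, `skeleton` and `topology` are our own assumptions, used only for illustration.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Hypothetical encoding: task -> (set of feasibility sets, server).
Job = Dict[int, Tuple[Set[FrozenSet[int]], str]]

JOB_R: Job = {  # Table 4
    1: ({frozenset()}, "s1"),
    2: ({frozenset()}, "s1"),
    3: ({frozenset({1})}, "s1"),
    4: ({frozenset({1, 2, 3})}, "s2"),
    5: ({frozenset({4})}, "s3"),
    6: ({frozenset({3}), frozenset({5})}, "s4"),
}
JOB_M: Job = {  # Table 6
    1: ({frozenset()}, "s4"),
    2: ({frozenset({1})}, "s3"),
    3: ({frozenset({1})}, "s3"),
    4: ({frozenset({1, 2})}, "s2"),
    5: ({frozenset({1, 3}), frozenset({4})}, "s1"),
}


def skeleton(job: Job) -> Set[Tuple[int, int]]:
    """Arcs (k_j, k_i) such that k_i depends on k_j (Definition 2)."""
    return {(kj, ki)
            for ki, (fsets, _) in job.items()
            for a in fsets
            for kj in a}


def topology(job: Job) -> Set[Tuple[str, str]]:
    """Skeleton arcs mapped onto the servers that execute the tasks."""
    return {(job[kj][1], job[ki][1]) for kj, ki in skeleton(job)}


# Topology of a system in which only jobs R and M are injected (Definition 3).
SYSTEM_TOPOLOGY = topology(JOB_R) | topology(JOB_M)
```

In this encoding the system topology contains both the arc from `s1` to `s2` (from job $R$) and the arc from `s2` to `s1` (from job $M$), so no enumeration of the servers can make every arc point from a smaller to a larger label.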
3.1 Stability of LIS

The LIS (Longest-In-System) scheduling policy gives priority to the task (and hence the job) which has been in the system for the longest time. In this subsection, we show that any system $(S, \text{LIS}, A)$ is stable, for any $(r, b)$-adversary $A$ with $r < T_{\min}/(T_{\max} + D_{\max})$.

We start by showing a bound on the time that a job spends in the system until it is done. Consider a job $K = \{k_1, k_2, \ldots, k_{l_K}\}$ injected at time $T_0$. Let $T_i$ be the first time in which all tasks in the $i$-th layer of $K$ are completed. The time $T_{\lambda_K}$ is the time when $K$ is done. Let $T$ be some time in the interval $[T_0, T_{\lambda_K}]$. We denote by $g_T$ the injection time of the oldest job that is still in the system at time $T$. We define $c := \max_{T \in [T_0, T_{\lambda_K}]} \{T - g_T\}$.

Lemma 1. Let $(S, \text{LIS}, A)$ be an AJQ system where $A$ is an $(r, b)$-adversary with $r < T_{\min}/(T_{\max} + D_{\max})$. Then,

$$T_{\lambda_K} - T_0 \le D_{\max} + \frac{r(c + b)}{T_{\min}}(T_{\max} + D_{\max}).$$

Proof. Let $K$ be a job. Let $k^*$ be the last task to be processed in the $i$-th layer of $K$. Hence, $k^*$ is complete at time $T_i$. All tasks in the $i$-th layer of $K$ become feasible by time $T_{i-1}$, including $k^*$. From the definition of $c$, only tasks injected in the interval $[T_{i-1} - c, T_0]$ can block $k^*$ in its server. The tasks injected in this interval, including all the tasks in the $i$-th layer of $K$, are at most $r(T_0 - T_{i-1} + c + b)/T_{\min}$. All these tasks are processed in at most $r(T_0 - T_{i-1} + c + b)(T_{\max} + D_{\max})/T_{\min}$ time. Hence:

$$T_i \le T_{i-1} + D_{\max} + \frac{r(T_0 - T_{i-1} + c + b)}{T_{\min}}(T_{\max} + D_{\max}) = T_{i-1}\left(1 - \frac{r(T_{\max} + D_{\max})}{T_{\min}}\right) + D_{\max} + \frac{r(T_0 + c + b)}{T_{\min}}(T_{\max} + D_{\max}).$$

Let $\varepsilon := 1 - r(T_{\max} + D_{\max})/T_{\min}$.
Solving the recurrence, we obtain:

$$T_{\lambda_K} \le \varepsilon^{\lambda_K} T_0 + \left(D_{\max} + \frac{r(T_0 + c + b)}{T_{\min}}(T_{\max} + D_{\max})\right) \sum_{i=0}^{\lambda_K - 1} \varepsilon^i = \varepsilon^{\lambda_K} T_0 + \left(D_{\max} + \frac{r(T_0 + c + b)}{T_{\min}}(T_{\max} + D_{\max})\right)\left(\frac{1 - \varepsilon^{\lambda_K}}{1 - \varepsilon}\right) = \left(D_{\max} + \frac{r(c + b)}{T_{\min}}(T_{\max} + D_{\max})\right) + T_0,$$

which proves the lemma.

Since we are considering a case where $r < T_{\min}/(T_{\max} + D_{\max})$, it holds that $r(T_{\max} + D_{\max})/T_{\min} = 1 - \varepsilon < 1$. Hence, we rewrite the lemma as follows:

$$T_{\lambda_K} - T_0 \le (1 - \varepsilon)c + \left(D_{\max} + \frac{rb}{T_{\min}}(T_{\max} + D_{\max})\right).$$

Theorem 1. Let $(S, \text{LIS}, A)$ be an AJQ system where $A$ is an $(r, b)$-adversary with $r < T_{\min}/(T_{\max} + D_{\max})$. Then, all jobs spend less than

$$\frac{D_{\max} T_{\min} + rb(T_{\max} + D_{\max})}{T_{\min} - r(T_{\max} + D_{\max})}$$

time in the system.

Proof. It is worth mentioning that $c$ is the only time-dependent parameter in the bound given by the previous lemma. Hence, if we show that $c$ actually does not depend on time, we will have proved the theorem. We prove it by contradiction. Assume that there is a moment in which $c$ is strictly larger than:

$$\frac{D_{\max} T_{\min} + rb(T_{\max} + D_{\max})}{T_{\min} - r(T_{\max} + D_{\max})}.$$

Hence, there has been a job in the system for a period of time strictly longer than:

$$\frac{D_{\max} T_{\min} + rb(T_{\max} + D_{\max})}{T_{\min} - r(T_{\max} + D_{\max})}.$$

If we apply the previous lemma to this job, it should have been absorbed in at most:

$$(1 - \varepsilon)c + \left(D_{\max} + \frac{rb}{T_{\min}}(T_{\max} + D_{\max})\right) < c - \varepsilon\left(\frac{D_{\max} T_{\min} + rb(T_{\max} + D_{\max})}{T_{\min} - r(T_{\max} + D_{\max})}\right) + \left(D_{\max} + \frac{rb}{T_{\min}}(T_{\max} + D_{\max})\right) = c$$

time, which is a contradiction. Here the inequality uses the assumed lower bound on $c$, and the final equality uses $\varepsilon = (T_{\min} - r(T_{\max} + D_{\max}))/T_{\min}$.

3.2 Scheduling Policies that are Unstable

Here, we show that a number of well-known policies such as First-In-First-Out (FIFO), Nearest-To-Go (NTG), Furthest-From-Source (FFS), and Last-In-First-Out (LIFO), are unstable, even for arbitrarily small injection rates.
While the meaning of FIFO and LIFO in the context of AJQ is clear (and similar to that in AQT), we need to define NTG and FFS. For a task $k_i$ of job $K$, the distance from source is the distance between the layer of $k_i$ and layer one ($\lambda(K, i) - 1$), and the distance to go is the distance between the number of layers of $K$ and the layer of $k_i$ ($\lambda_K - \lambda(K, i)$). Hence, FFS gives priority to the task with the largest distance from source and NTG gives priority to the task with the smallest distance to go.

Theorem 2. FIFO, NTG, FFS, and LIFO are unstable for every $r > 0$.

Proof. First, we highlight that, given a system $(G, P, A)$ in AQT, it can be modeled as a system $(S, P, A')$ in AJQ as follows:

• For each link $l$ in $G$, there is a unique server $s_l$ in $S$, which we call its equivalent server.
• The scheduling policy $P$ is the same both in AQT and in AJQ.
• For each packet $p$ injected by $A$, the adversary $A'$ injects a job $K$ such that:
  – For each link $l$ in the path of packet $p$, there is a task $k_l$ in $K$ to be executed in server $s_l$.
  – If $l$ is the first link in the path of $p$, then $k_l$ is the initial task of job $K$.
  – If the path of packet $p$ traverses link $l$ immediately before it traverses link $l'$, then task $k_{l'}$ only depends on task $k_l$.
  – The processing time of each task is 1 and its activation delay is 0.

Clearly, if $(G, P, A)$ is unstable for a given injection rate, then $(S, P, A')$ will also be unstable for the same injection rate (i.e., all the unstable scheduling policies in AQT are also unstable in AJQ). By using the results in [6], we have that NTG, FFS, and LIFO are unstable (in AQT) for every $r > 0$, and by using the result in [3] we have that FIFO is also unstable (in AQT) for every $r > 0$. Therefore, the theorem directly follows.

4 Topological Stability

In this section we show stability for systems with feed-forward topology.
We say that a system has feed-forward topology if it is possible to enumerate the servers from 1 to $n$, so that every directed arc in the topology of the system goes from a server with a smaller label to a server with a larger label.

Theorem 3. Let $(S, P, A)$ be an AJQ system with feed-forward topology. Then, for any policy $P$ and any $(r, b)$-adversary $A$ with injection rate $r \le 1$, the system $(S, P, A)$ is stable.

Proof. Let $(S, P, A)$ be an AJQ system with feed-forward topology. Without loss of generality, assume that the ordering of the set of servers that makes the system feed-forward is $s_1, s_2, \ldots, s_n$. For simplicity, we only use the position $j$ to denote server $s_j$. Let $\tau_j(t)$ be the time that server $j$ would need to completely serve (drain) all its pending tasks present in the system at time $t$ if they were all active and no new task were injected:

$$\tau_j(t) := \sum_{(K, i) \in \mathcal{K}(j, t)} t_i^K,$$

where $\mathcal{K}(j, t)$ is the set of pairs $(K, i)$ such that $K$ was injected by time $t$, $s_i^K = j$, and task $k_i$ has not been completed in server $j$. We define a potential function $\Phi(\cdot)$ as follows:

$$\Phi(0) := D_{\max} + b; \qquad \Phi(1) := \tau_1(0) + \Phi(0) + b,$$

where $\tau_j(0)$ denotes the time server $j$ requires to process all its tasks present in the system before the adversary starts injecting jobs in the system. For $2 \le j \le n$, $\Phi(j)$ is defined as:

$$\Phi(j) := \tau_j(0) + \Phi(0) + \frac{\sum_{i=1}^{j-1} \Phi(i)}{T_{\min}} \cdot L \cdot (T_{\max} + D_{\max}) + b.$$

To prove this theorem, we show that for all $1 \le j \le n$ and for all $t$, $\tau_j(t) \le \Phi(j)$. We use induction on $j$, the position of the servers in the ordering of $S$.

Case $j = 1$: Consider some time $T$. We prove that $\tau_1(T) \le \Phi(1)$. First, assume that for all times $t \in [0, T]$ there is at least one active task in the queue of server 1. Then, server 1 has been continuously working during the interval $[0, T]$.
The time required to process all its queue at time $T$ is the time it would need to process the tasks present at time 0 in its queue, plus the time required to process the load injected by the adversary during that interval, minus the load processed during that interval. In the form of an equation, the previous amount of time is:

$$\tau_1(T) \le \tau_1(0) + T + b - T = \tau_1(0) + b \le \Phi(1).$$

Otherwise, there exists some time $t \in [0, T]$ such that there is no active task in the queue of server 1 at time $t$. Let $t^*$ be the largest of such times. Note that, in that case, all tasks injected in server 1 before time $t^*$, and present at time $t^*$, were injected after time $t^* - D_{\max}$, since every task injected before that time is active at time $t^*$. Therefore, by restriction (1), it holds that $\tau_1(t^*) \le D_{\max} + b$. Then, the time required to process all its queue at time $T$ is the time it would need to process the tasks present at time $t^*$, plus the time required to process the load injected by the adversary during the interval $[t^*, T]$, minus the load processed during the same interval of time. In the form of an equation, the previous amount of time is:

$$\tau_1(T) \le \tau_1(t^*) + (T - t^*) + b - (T - t^*) \le D_{\max} + b + b = \Phi(0) + b \le \Phi(1).$$

Case $j > 1$: The inductive hypothesis is $\tau_i(t) \le \Phi(i)$ for all $t$ and for all $1 \le i < j$. By the inductive hypothesis, the number of tasks in servers $i < j$ is at most $\frac{\sum_{i=1}^{j-1} \Phi(i)}{T_{\min}}$, the number of tasks they can trigger in server $j$ is at most $\frac{\sum_{i=1}^{j-1} \Phi(i)}{T_{\min}} \cdot L$, and the processing time for all those tasks is at most $\frac{\sum_{i=1}^{j-1} \Phi(i)}{T_{\min}} \cdot L \cdot (T_{\max} + D_{\max})$.

Consider again some time $T$. We consider two cases equivalent to those considered in the case $j = 1$. First, assume that server $j$ has at least one active task in its queue during the whole interval $[0, T]$. Therefore, server $j$ has processed tasks during all that time.
In that case, server $j$ would need all the time required to process the tasks present at time 0 in its queue, plus all the time required to process the tasks triggered by tasks in previous servers, plus all the time required to process the load injected by the adversary during the interval $[0, T]$, minus the load processed during that interval. In the form of an equation, this is:

$$\tau_j(T) \le \tau_j(0) + \frac{\sum_{i=1}^{j-1} \Phi(i)}{T_{\min}} \cdot L \cdot (T_{\max} + D_{\max}) + T + b - T = \tau_j(0) + \frac{\sum_{i=1}^{j-1} \Phi(i)}{T_{\min}} \cdot L \cdot (T_{\max} + D_{\max}) + b \le \Phi(j).$$

Assume now that there is some time $t \in [0, T]$ such that there is no active task in the queue of server $j$ at time $t$. Let $t^*$ be the largest such time. An analysis equivalent to the one presented in the case $j = 1$ shows that $\tau_j(t^*) \le D_{\max} + b$. Therefore, if we compute $\tau_j(T)$ as in the previous cases, we obtain:

$$\tau_j(T) \le \tau_j(t^*) + \frac{\sum_{i=1}^{j-1} \Phi(i)}{T_{\min}} \cdot L \cdot (T_{\max} + D_{\max}) + (T - t^*) + b - (T - t^*) \le D_{\max} + b + \frac{\sum_{i=1}^{j-1} \Phi(i)}{T_{\min}} \cdot L \cdot (T_{\max} + D_{\max}) + b \le \Phi(j).$$

Hence, the time required by server $j$ to drain its queue is bounded by $\Phi(j)$, a function that does not depend on time. In conclusion, at any time, there are at most $\Phi(j)/T_{\min}$ tasks in the queue of server $j$, and the system is stable.

We showed that a feed-forward topology is a sufficient condition for stability in a system. Nevertheless, this condition is not necessary. Indeed, as we have shown before, for any system $(G, P, A)$ in the AQT model, there is an equivalent system $(S, P, A')$ in the AJQ model. We know that, in the AQT model, any system with a ring network (i.e., a directed cycle) is stable with any scheduling policy and against any adversary. Therefore, the equivalent system in the AJQ model will also be stable. Nevertheless, such an AJQ system has a topology that is not feed-forward.
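As an illustration of the definitions in this section, the following Python sketch checks whether a server topology is feed-forward and evaluates the bounds $\Phi(0), \ldots, \Phi(n)$ used in the proof of Theorem 3. It is only a sketch under stated assumptions: the function names, the 0-based server labels, and the representation of the topology as a list of arcs are hypothetical choices made here, and the parameter `L` is simply taken as the factor that appears in the definition of $\Phi(j)$.

```python
from collections import deque


def feed_forward_order(n, arcs):
    """Return a server ordering in which every arc goes forward, or None.

    Servers are labeled 0..n-1 (an assumption of this sketch) and `arcs` is an
    iterable of (u, v) pairs. Kahn's algorithm is used: an ordering exists
    exactly when the directed graph has no cycle.
    """
    successors = [[] for _ in range(n)]
    indegree = [0] * n
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    ready = deque(s for s in range(n) if indegree[s] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return order if len(order) == n else None


def potential_bounds(tau0, b, d_max, t_min, t_max, L):
    """Evaluate [Phi(0), Phi(1), ..., Phi(n)] as defined in the proof.

    tau0[j - 1] plays the role of tau_j(0) for the server at position j of the
    feed-forward ordering.
    """
    phi = [d_max + b]  # Phi(0) = D_max + b
    for j, tau in enumerate(tau0, start=1):
        if j == 1:
            phi.append(tau + phi[0] + b)
        else:
            # Processing time of the tasks triggered by servers 1..j-1.
            triggered = sum(phi[1:j]) / t_min * L * (t_max + d_max)
            phi.append(tau + phi[0] + triggered + b)
    return phi
```

For example, `feed_forward_order(3, [(0, 1), (1, 2), (2, 0)])` returns `None`, reflecting that a ring is not feed-forward even though, as argued above, the corresponding system may still be stable.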
5 Job Properties that can Affect the Stability of the System

In this section, we show that some of the features of the injected jobs can play a key role regarding the stability of the system. Namely, we show that both the tasks' processing times and activation delays are factors that, individually, can cause instability. We also show that the feasibility function can lead, by itself, to instability.

5.1 Tasks' Processing Times

We show that the processing times of the tasks can affect the stability of the system. Namely, a stable system can be made unstable by varying the processing times of some of its tasks, even if the adversary has the same rate $r$ in both systems. Let LCT-LIS be the scheduling policy that gives priority to the task with the longest processing time at the current server, breaking ties according to the longest-in-system policy.

Proposition 2. There exists a server set $S$ and an adversary $A$ with injection rate $r > 1/\sqrt{2}$ such that the system ($S$, LCT-LIS, $A$) is unstable.

Proof. The proof is inspired by the proof of instability by difference in packet length in the continuous AQT (CAQT) model [5]. Let ($G$, LPL-LIS, $A'$) be the system used in Theorem 26 in [5] (LPL-LIS denotes the scheduling policy that gives priority to the packets with the longest length, breaking ties according to the longest-in-system policy). Note that ($G$, LPL-LIS, $A'$) can be seen as an AQT system, except that two different packet lengths (1 and 2) are taken into account. Let us now consider a system ($S$, LCT-LIS, $A$) in AJQ such that:

• The scheduling policy is LCT-LIS.
• For each packet $p$ injected by $A'$, the adversary $A$ injects a job such that all its tasks have a processing time (1 or 2) equal to the length of the injected packet.
• The rest of the system is modeled in the same fashion as in the proof of Theorem 2.

Theorem 26 in [5] shows that ($G$, LPL-LIS, $A'$) is unstable for an injection rate $r > 1/\sqrt{2}$.
Therefore, it is not hard to derive that ($S$, LCT-LIS, $A$) is also unstable for the same rate.

Figure 1 illustrates the system $S$ used in the proof of the previous proposition and provides some details about its unstable behavior.

Observe that if all tasks have the same processing time $T$, then the LCT-LIS scheduling policy becomes LIS. As shown in Theorem 1, LIS is stable for any $r < T_{\min}/(T_{\max} + D_{\max}) = T/(T + D_{\max})$. Hence, for small $D_{\max}$ (e.g., $D_{\max} = 0$), we have a rate $r > 1/\sqrt{2}$ for which LCT-LIS is stable if all tasks have the same processing time. Therefore, we have shown that an unstable system can be transformed into a stable one by only varying the processing times of some of its tasks.

Figure 1: System used in the proof of Proposition 2. Subfigure (a) shows the system's topology and subfigure (b) shows the number of queued packets at each time instant for an injection rate of 0.8 (which is slightly higher than $1/\sqrt{2}$): dashed lines correspond to servers $s_1$ and $s_4$, and the solid line corresponds to the overall system. As can be seen, the numbers of queued tasks at $s_1$ and $s_4$ oscillate in an alternating and increasing fashion, which provokes a continuous increase in the system's number of queued tasks.

5.2 Tasks' Activation Delays

As in the previous subsection, here we show that the activation delays of the tasks can affect the stability of the system. Let SAD-NFS be the scheduling policy that gives priority to the task with the smallest activation delay at the queue of the current server, breaking ties according to the nearest-from-source policy with respect to an initial task in the job's skeleton.

Proposition 3. There exists a server set $S$ and an adversary $A$ with injection rate $r > 1/\sqrt{2}$ such that the system ($S$, SAD-NFS, $A$) is unstable.

Proof. The proof follows the lines of the one in Proposition 2.
Let ($G$, SPP-NFS, $A'$) be the system used in Theorem 28 in [5] (SPP-NFS denotes the scheduling policy that gives priority to the packets whose previously traversed link had the smallest propagation delay, breaking ties according to the nearest-from-source policy). Note that in this system the transmission time of every packet is the same in every link. Hence, ($G$, SPP-NFS, $A'$) can be seen as an AQT system, except that some links have a positive fixed propagation delay. Let us now consider a system ($S$, SAD-NFS, $A$) in AJQ such that:

• The scheduling policy is SAD-NFS.
• For each link $l$ in $G$ with a propagation delay $d_l$, all tasks executed in its equivalent server have an activation delay equal to $d_l$. That is, the activation delays are seen as the delays taken by packets to traverse the links (besides the time spent at the queues).
• The rest of the system is modeled in the same fashion as in Theorem 2.

Clearly, if the system ($G$, SPP-NFS, $A'$) is unstable for a given injection rate, then ($S$, SAD-NFS, $A$) will also be unstable for the same injection rate. Moreover, by Theorem 28 in [5], ($G$, SPP-NFS, $A'$) is unstable for an injection rate $r > 1/\sqrt{2}$. Therefore, ($S$, SAD-NFS, $A$) is also unstable for the same rate.

Note that if all links in the system ($G$, SPP-NFS, $A'$) of the previous proof have zero delay, it becomes an AQT system, and the ($S$, SAD-NFS, $A$) system obtained has only tasks with an activation delay of 0. In that case, both SPP-NFS and SAD-NFS behave as NFS in their respective systems. Moreover, since NFS is universally stable in AQT, as shown in [1], both systems ($G$, SPP-NFS, $A'$) and ($S$, SAD-NFS, $A$) are stable. Hence, we have shown that an unstable system can be transformed into a stable one by only varying the activation delays of some of its tasks.
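The reductions used in Theorem 2 and in Propositions 2 and 3 share one construction: each packet becomes a job whose tasks form a chain along the packet's path. The following Python sketch makes that construction concrete. It is a minimal illustration, not part of the model's formal definition: the `Task` record, the function `packet_to_job`, the `s_<link>` naming of equivalent servers, and the dictionary of link delays are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    server: str              # equivalent server s_l of link l
    processing_time: float   # 1 in Theorem 2; packet length in Proposition 2
    activation_delay: float  # 0 in Theorem 2; link delay d_l in Proposition 3
    depends_on: list = field(default_factory=list)  # indices of prerequisites


def packet_to_job(path, link_delay=None, packet_length=1):
    """Translate the path of one packet into a chain-shaped job.

    `path` is the ordered list of links traversed by the packet and
    `link_delay` optionally maps a link to its propagation delay
    (links that are absent are treated as having delay 0).
    """
    link_delay = link_delay or {}
    job = []
    for position, link in enumerate(path):
        job.append(Task(
            server=f"s_{link}",
            processing_time=packet_length,
            activation_delay=link_delay.get(link, 0),
            # The first task is initial; every other task depends only on
            # the task of the link traversed immediately before.
            depends_on=[] if position == 0 else [position - 1],
        ))
    return job
```

Calling `packet_to_job(["a", "b"])` yields the unit-time, zero-delay job of Theorem 2; passing `packet_length=2` or a non-empty `link_delay` gives the variants used in Propositions 2 and 3, respectively.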
5.3 Feasibility Function Among Tasks

Now, we show that the feasibility function is a factor that, by itself, can also induce instability. We say that a feasibility function is fully independent if no task in any job depends on any other task (i.e., all tasks are initial). In this case, we also say that the tasks are fully independent.

Proposition 4. Let $(S, P, A)$ be an AJQ system such that all the tasks are fully independent. Then, for any set of servers $S$, any policy $P$ and any $(r, b)$-adversary $A$ with injection rate $r \le 1$, the system $(S, P, A)$ is stable.

Proof. Direct, from the injection bound of Equation (1) and the fact that $P$ is work conserving.

Then, it is clear that if we take an unstable system and make all tasks fully independent, it will become stable.

6 Future Work

The AJQ model opens interesting research questions. Regarding scheduling policies, it is still unknown whether there exists a universally stable policy (i.e., a policy stable under any adversary with $r < 1$). Indeed, the many parameters of the model make it difficult to determine whether a universally stable policy exists. Regarding systems' topology, a full characterization of the topologies that produce a stable system against any bounded adversary is still open. For instance, while we argue in Section 4 that the universal stability of the ring in AQT can be propagated to AJQ, this holds only for jobs that mimic the dependencies and topology of AQT. It would be interesting to know whether all AJQ systems with a ring topology are stable under bounded adversaries.

On the other hand, the AJQ model can be extended by transferring the resource allocation decision from the adversary to the scheduling policy. In that case, the adversary could provide, for each task, a set of servers in which it can be processed (instead of a single server, as is done in our model).
In that extended model, we would be able to study the impact of resource allocation on the stability of a system.

References

[1] Matthew Andrews, Baruch Awerbuch, Antonio Fernández, Frank Thomson Leighton, Zhiyong Liu, and Jon M. Kleinberg. Universal-stability results and performance bounds for greedy contention-resolution protocols. Journal of the ACM, 48(1):39–69, 2001.
[2] Forest Baskett, K. Mani Chandy, Richard R. Muntz, and Fernando G. Palacios. Open, closed, and mixed networks of queues with different classes of customers. Journal of the ACM, 22(2):248–260, 1975.
[3] Rajat Bhattacharjee, Ashish Goel, and Zvi Lotker. Instability of FIFO at arbitrarily low rates in the adversarial queueing model. SIAM Journal on Computing, 34(2):318–332, 2004. Earlier version appeared in Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 2003.
[4] Maria J. Blesa. Stability in communication networks under adversarial models. PhD thesis, Universitat Politècnica de Catalunya, 2006.
[5] Maria J. Blesa, Daniel Calzada, Antonio Fernández, Luis López, Andrés L. Martínez, Agustín Santos, Maria J. Serna, and Christopher Thraves. Adversarial queueing model for continuous network dynamics. Theory of Computing Systems, 44(3):304–331, 2009.
[6] Allan Borodin, Jon M. Kleinberg, Prabhakar Raghavan, Madhu Sudan, and David P. Williamson. Adversarial queuing theory. Journal of the ACM, 48(1):13–38, 2001.
[7] Maury Bramson. Instability of FIFO queueing networks. The Annals of Applied Probability, 4(2):414–431, 1994.
[8] Cheng-Shang Chang. Stability, queue length and delay of deterministic and stochastic queueing networks. IEEE Transactions on Automatic Control, 39:913–931, 1994.
[9] Vicent Cholvi and Juan Echagüe. Stability of FIFO networks under adversarial models: State of the art. Computer Networks, 51(15):4460–4474, 2007.
[10] J. G. Dai, John J. Hasenbein, and John H. Vande Vate. Stability and instability of a two-station queueing network. The Annals of Applied Probability, 14(1):326–377, 2004.
[11] Jeffrey Dean and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
[12] Carla P. Gomes and Bart Selman. Algorithm portfolios. Artificial Intelligence, 126(1-2):43–62, 2001.
[13] Juliver Gil Herrera and Juan Felipe Botero. Resource allocation in NFV: A comprehensive survey. IEEE Transactions on Network and Service Management, 13(3):518–532, 2016.
[14] James R. Jackson. Jobshop-like queueing systems. Management Science, 10(1):131–142, 1963.
[15] Frank P. Kelly. Reversibility and Stochastic Networks. Wiley, 1979.
[16] David Kitchin, Adrian Quark, William R. Cook, and Jayadev Misra. The Orc programming language. In David Lee, Antónia Lopes, and Arnd Poetzsch-Heffter, editors, Proceedings of FMOODS/FORTE 2009, volume 5522 of Lecture Notes in Computer Science, pages 1–25. Springer, 2009.
[17] Leonard Kleinrock. Queueing Systems Volume I: Theory, volume 1. John Wiley & Sons, 1975.
[18] Steve H. Lu and P. R. Kumar. Distributed scheduling based on due dates and buffer priorities. IEEE Transactions on Automatic Control, 36(12):1406–1416, 1991.
[19] Aleksandr Nikolaevich Rybko and Alexander L. Stolyar. Ergodicity of stochastic processes describing the operation of open queuing networks. Problems of Information Transmission, 28:199–220, 1992.
[20] Christopher Thraves Caro. Performance of scheduling policies and networks in generalized adversarial queueing models. PhD thesis, Universidad Rey Juan Carlos, 2008.
[21] Panagiotis Tsaparas. Stability in adversarial queueing theory. Master's thesis, University of Toronto, Toronto, Canada, 1999.
[22] Massimo Villari, Maria Fazio, Schahram Dustdar, Omer Rana, and Rajiv Ranjan. Osmotic computing: A new paradigm for edge/cloud integration. IEEE Cloud Computing, 3(6):76–83, 2016.
[23] Bo Yi, Xingwei Wang, Keqin Li, Min Huang, et al. A comprehensive survey of network function virtualization. Computer Networks, 133:212–262, 2018.
