Efficiency Fairness Tradeoff in Battery Sharing
The increasing presence of decentralized renewable generation in the power grid has motivated consumers to install batteries to save excess energy for future use. The high price of energy storage calls for a shared storage system, but careful battery…
Authors: Karan N. Chadha, Ankur A. Kulkarni, Jayakrishnan Nair
Abstract: The increasing presence of decentralized renewable generation in the power grid has motivated consumers to install batteries to save excess energy for future use. The high price of energy storage calls for a shared storage system, but careful battery management is required so that the battery is operated in a manner that is fair to all and as efficiently as possible. In this paper, we study the tradeoffs between efficiency and fairness in operating a shared battery. We develop a framework based on constrained Markov decision processes to study both regimes, namely, optimizing efficiency under a hard fairness constraint and optimizing fairness under a hard efficiency constraint. Our results show that there are fundamental limits to efficiency under fairness and vice versa, and, in general, the two cannot be achieved simultaneously. We characterize these fundamental limits via absolute bounds on these quantities, and via the notion of price of fairness that we introduce in this paper.

Index Terms: Smart grids, Battery management systems, Battery sharing

I. INTRODUCTION

The falling cost of solar panels and incentives for renewable generation have created hope for massively decentralized power generation. Islanded microgrids can install wind generators; individual households, housing societies, and industries can generate their own energy via solar panels and reduce both their reliance on the grid and their carbon footprint. However, the unpredictability and unreliability of renewable energy also necessitate battery backup to store excess energy, so that it can be utilized at times when generation is low. With adequate battery backup, one can envision a future where there is widespread deployment of renewable generation, leading to a higher penetration of clean energy sources.
Author note: Karan N. Chadha is an undergraduate at the Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India, 400076 (karanchadhaiitb@gmail.com). Ankur A. Kulkarni is a faculty member at Systems and Control Engineering, Indian Institute of Technology Bombay, Mumbai, India, 400076 (kulkarni.ankur@iitb.ac.in). Jayakrishnan Nair is a faculty member at the Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India, 400076 (jayakrishnan.nair@ee.iitb.ac.in). The authors acknowledge support from the DST grant DST/CERI/MI/SG/2017/077.

A key component in realizing this dream is the cost of the battery. While solar panels have become inexpensive in the recent past, battery prices remain high. The high cost of battery storage has motivated communities to look for sharing contracts that provide users access to a common battery, thereby sharing the overall costs. The shared battery is expected to serve as collective storage, accessible by multiple users, that can be utilized to store excess generation, which users may want to withdraw at times of deficit. However, as anyone who has used a shared refrigerator in a dorm would realize, sharing a storage space brings with it concerns about inequity of access and the possibility of other users 'stealing' one's content. Clearly, gate-keeping is required so that users access the battery in a manner that is fair to all, and yet the battery is operated in an efficient manner that fully exploits the investment made in installing it.

We are motivated by this question of fair and efficient operation of a shared battery. For the sharing arrangement to succeed, it should ideally allow each user to access the common battery as if it were wholly owned by that user, without jeopardizing the access of other users.
But if users have disparate generation characteristics, excessive charging by one user may leave another user with less opportunity to deposit his energy, diminishing equity of access. At the other extreme, it is evident that without a formal algorithm to do the gate-keeping, some users could draw far more energy than they injected, thereby cannibalizing the energy of other users. Moreover, it seems plausible that gate-keeping may compel the battery operator to decline requests for injection even while there is room in the battery, or requests for withdrawal of energy even when there is energy in the battery. As a consequence, maintaining fairness may come in the way of efficient battery operation.

In this paper, we develop a framework to quantify the tradeoffs between efficiency and fairness in the operation of a shared battery. Our work shows that there is a precise sense in which the requirement of fairness fundamentally compromises efficiency, and in general, the two cannot be achieved simultaneously. We also quantify the tradeoff by characterizing the maximum efficiency achievable under fairness, and the maximum fairness achievable under efficiency.

We consider a setting where $n$ users with individual stochastic net generation access a shared battery. The net generation may be positive (in which case the user wishes to inject energy) or negative (in which case the user wishes to draw energy). We posit that a battery management algorithm decides the amount of energy to accept from injecting users and to provide to demanding users. We adopt a general principle of fairness that no user should draw more energy from the battery than it has injected into it. However, since energy units added by users are fungible, in the sense that they are indistinguishable from those added by other users, the above principle need not be applied sequentially.
The battery management algorithm can allow users an 'overdraft', where users can, for some intervals, draw more energy from the battery than they have injected thus far, so long as the energy is 'returned' at a later stage. Specifically, in our notion of fairness, we ask that for each user, the long-run time average of the energy drawn from the battery by the user be no more than the long-run time average of the energy injected. We measure efficiency by the loss of load rate (LLR). The LLR of user $i$ is its long-run time average of unmet demand.

We first study the problem of maximizing efficiency subject to hard constraints on fairness and derive fundamental limits on the minimum total LLR attainable under fairness constraints. We show that when there is at least one user that is net demanding, i.e., has negative steady-state net generation on average, the total LLR remains bounded away from zero for any battery size. Remarkably, this is true even when the total steady-state average net generation of all users is positive. In contrast, in the absence of the fairness constraint, the total LLR in this case would decrease exponentially with battery size. When all users are net generative (i.e., have positive steady-state net generation on average), the LLR decreases to zero exponentially with the battery size even with the fairness constraint.

Given that the fairness constraint limits the efficient use of the battery, we then ask what the price of fairness is. This is defined as the ratio of the minimum attainable total LLR with the fairness constraint to that attained without the fairness constraint. Our fundamental limits already show that this ratio approaches infinity for large battery size when at least one user is net demanding but the system is net generative on the whole. However, remarkably, we show that the price of fairness can be arbitrarily large even when all users are net generative.
Finally, we study the maximum fairness attainable under efficient battery use. We find numerically that efficient battery operation often compromises fairness for at least one user. Moreover, this departure from utopian fairness persists even when one increases the battery size. This shows that the conflict between efficiency and fairness is, in a sense, fundamental and that a larger battery does not remedy it. These results provide a natural motivation for devising a market for energy wherein a user can inject its excess energy into the battery, which can then be sold to other users. The precise formulation of such a market is a point for future study.

Literature Survey

There is recent literature on the effects of energy/battery sharing in power systems. The effects of sharing energy are studied from a (non-cooperative) game-theoretic perspective in [2]. The strategic decision is to choose the investment in individual battery storage systems. On the other hand, [3] views this problem from a coalition game perspective. Investment in a shared battery and the corresponding allocation scheme is studied in [4]. All these models assume the net generation observed by the battery to be independent and identically distributed (i.i.d.) random variables across time. This seriously limits the applicability of these models, as the net generation is generally dependent across time. In contrast, we consider Markov models for net generation, which can better model the behaviour of renewable energy sources.

Markov energy generation models for scheduling of energy storage have been considered in [5], [6]. [5] models the system as a finite-horizon Markov decision process (MDP) where the decision of buying or selling electricity is made by an operator. They provide threshold-based heuristic policies and quantify their suboptimality. [7] studies the problem of scheduling and operation of energy storage for wind power plants.
[8] considers an infinite-horizon discounted MDP-based model to determine the optimal amount of battery to be bought from the grid to minimize the cost given a battery of fixed capacity. These papers focus on providing structural results for optimal policies and/or heuristic algorithms. On the other hand, we consider a model in which we minimize the expected loss of load when all demand is to be satisfied by only the renewable generation and the common battery. Our main distinction is the introduction of a notion of fairness, which, to the best of our knowledge, has not been done before in the scheduling of energy storage. Using this notion, we study the tradeoff between efficiency and fairness in the operation of shared energy storage systems.

II. MODEL AND PRELIMINARIES

A. Notation

In general, we denote random variables by capital letters, and the value taken by a random variable is denoted by the corresponding lower-case letter. $\prod_{i \in I} V_i$ denotes the Cartesian product of the sets $V_i$, $i \in I$. $\{u, \ldots, v\}$ is the set of integers from $u$ to $v$. $[n]$ denotes the set $\{1, \ldots, n\}$.

B. Model

Consider $n$ users $\{1, \ldots, n\}$ equipped with stochastic net generation evolving in discrete time. Let the net energy generation of user $i$ at time $t$ be denoted by $X_i(t) \in S_i$. A positive value of $X_i(t)$ indicates a net surplus at time $t$ (i.e., user $i$ generated more energy than she consumed), whereas a negative value of $X_i(t)$ indicates a net deficit at time $t$ (i.e., user $i$ demanded more energy than she generated). We assume an arbitrary positive granularity with which energy generation/demand is measured; this granularity is taken to be unity without loss of generality, so that $S_i \subset \mathbb{Z}$. Let $s_i^+$ denote the maximum energy user $i$ injects and $-s_i^-$ denote the maximum energy user $i$ demands, i.e., $s_i^+ = \max_{s \in S_i} s$ and $s_i^- = \min_{s \in S_i} s$. To avoid degenerate scenarios, we assume that $s_i^+ > 0$ and $s_i^- < 0$.
Let $X(t) = (X_1(t), \ldots, X_n(t))$. We assume that $X(\cdot)$ is an irreducible discrete-time Markov chain (DTMC) over a finite state space $S \subseteq \prod_{i \in [n]} S_i$. Note that we do not assume here that the net generation processes associated with the individual users are independent (see Footnote 1). We make the assumption that $\mathbb{P}[X(t+1) = s \mid X(t) = s] > 0$ for all $s \in S$. Note that this ensures aperiodicity of $X(\cdot)$. Let $\pi = (\pi(s), s \in S)$ denote the stationary distribution of the DTMC $X(\cdot)$. We define the drift $\Delta_i$ associated with user $i$ as her steady-state average net generation:

$$\Delta_i = \sum_{s \in S} s_i \, \pi(s),$$

where $s = (s_1, \ldots, s_n)$. User $i$ is said to be net generative if $\Delta_i > 0$ and net demanding if $\Delta_i < 0$. The system drift $\Delta$ is defined as the sum of the user drifts, i.e., $\Delta = \sum_{i \in [n]} \Delta_i$.

Footnote 1: It is straightforward to generalize our results to the more general setting where the vector of net generations is itself a function of an abstract background Markov process. This generalization allows for an arbitrary state space description that might, for example, incorporate history and/or weather information.

A common battery with capacity $b_{\max} \in \mathbb{N}$ is shared between the users. The battery occupancy at time $t$ is a random variable denoted by $B(t)$. $(X(t), B(t))$ is a controlled Markov process evolving over

$$\mathcal{S} = S \times [b_{\max}]. \tag{1}$$

Battery dynamics are given by

$$B(t+1) = B(t) + \sum_{i \in [n]} A_i(t), \tag{2}$$

where $A_i(t)$ denotes the energy accepted from user $i$ (when $A_i(t) \geq 0$) or the energy supplied to user $i$ (when $A_i(t) < 0$). We assume that the actions $A_i(t)$, $i \in [n]$, at each time instant $t$ are chosen by the battery operator as a function of the state history $(X(s), B(s))$, $s \leq t$.

The battery management algorithm is constrained as follows. In state $(x, b) \in \mathcal{S}$, the space of allowable actions, denoted by $A(x, b)$, is given by:

$$A(x, b) = \left\{ (a_1, \ldots, a_n) \;\middle|\; a_i \in \begin{cases} \{0, \ldots, x_i\}, & \text{if } x_i \geq 0 \\ \{x_i, \ldots, 0\}, & \text{if } x_i < 0 \end{cases},\quad 0 \leq b + \sum_{i \in [n]} a_i \leq b_{\max} \right\} \tag{3}$$

$$A = \bigcup_{(x, b) \in \mathcal{S}} A(x, b)$$

The above constraints restrict the amount of energy supplied to a user to at most the amount demanded, and similarly restrict the amount of energy accepted from a user to at most her net surplus. In addition, the actions must respect the capacity constraint of the battery, and the battery cannot be discharged below 0.

C. Loss of Load Rate (LLR)

Our notion of efficiency is defined by the loss of load rate. The loss of load rate (LLR) for user $i$, denoted by $\mathrm{LLR}_i$, is defined as:

$$\mathrm{LLR}_i := \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}\left[ \mathbb{I}\{X_i(t) < 0\} \, (A_i(t) - X_i(t)) \right] \tag{4}$$

Note that $\mathrm{LLR}_i \geq 0$, since $A_i(t) \geq X_i(t)$ when $X_i(t) < 0$ from (3). $\mathrm{LLR}_i$ captures the long-run average rate of unmet demand for user $i$. For a system of $n$ users, we define the LLR of the system to be the sum of the LLRs of the individual users, i.e., $\mathrm{LLR}_{sys} = \sum_{i \in [n]} \mathrm{LLR}_i$.

III. MAXIMIZING EFFICIENCY WITH HARD CONSTRAINTS ON FAIRNESS

In this section, we study the tradeoff between efficiency and fairness by putting a hard constraint on fairness and optimizing efficiency under this constraint. The notion of fairness we consider is that for each user, the time-averaged amount of energy drawn from the battery is at most the time-averaged amount of energy injected into the battery by that user. Efficiency is measured using the LLR. We formulate this problem as a constrained Markov decision process, and based on this formulation, derive fundamental limits on the efficiency achievable under fairness constraints.

A. Constrained Markov decision process formulation

In this section, we impose a set of service constraints that the battery management algorithm must satisfy in order to impart fairness in battery scheduling.

$$\mathrm{FC}_i: \quad \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[A_i(t)] \geq 0.$$
(5)

$\mathrm{FC}_i$ asks that, in the long-run average, the amount of energy we provide to each user is at most the amount of energy the user injects into the battery. We call $\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[A_i(t)]$ the net contribution ($C_i$) of source $i$. We impose that the battery management algorithm satisfy $\mathrm{FC}_i$ for all $i \in [n]$; these are collectively referred to as fairness constraints.

We consider the problem of operating the system under fairness constraints so that $\mathrm{LLR}_{sys}$ is minimized. This problem is an instance of a constrained Markov decision process (CMDP) with the average cost criterion; we refer the reader to [9] for a survey. The state space is given by (1), and the set of allowable actions for each state is given by (3). The problem, for a fixed initial distribution on the state space, is summarized below:

$$\text{(P)} \quad \begin{aligned} \underset{\phi}{\text{minimize}} \quad & \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \sum_{i \in [n]} \mathbb{E}\left[ \mathbb{I}\{X_i(t) < 0\} \, (A_i(t) - X_i(t)) \right] \\ \text{subject to} \quad & \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[A_i(t)] \geq 0 \quad \forall i \in [n]. \end{aligned}$$

Here the decision variable is $\phi$, a randomized, history-dependent policy. A randomized policy is a sequence of functions $\phi(t)$, $t \in \mathbb{N}$, that map the set of histories until time $t$ to probability distributions on the set of available actions at time $t$. The set of available actions in state $(x, b) \in \mathcal{S}$ is given by (3). A deterministic policy maps the set of histories to a specific available action at each time $t \in \mathbb{N}$. A policy is Markov if it depends only on the current state $(X(t), B(t))$, and it is stationary if $\phi(t)$ does not depend on $t$.

The CMDP (P) is tractable since it can be solved by solving an equivalent linear program (LP) (see [9], [10]); we do not provide the details of this reduction here due to space constraints, though we do present the LP reduction for the CMDP posed in Section V for optimization of fairness given a hard constraint on efficiency.
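As a concrete illustration of the allowable-action set (3) that defines the feasible actions of (P), the following minimal Python sketch enumerates $A(x, b)$ by brute force. The function name `allowable_actions` is hypothetical, and the sketch assumes integer net generations and battery levels ranging over $0, \ldots, b_{\max}$, in line with the capacity constraint in (3); it is not taken from the paper.

```python
from itertools import product


def allowable_actions(x, b, b_max):
    """Enumerate A(x, b) as in (3) by brute force (illustrative helper).

    x     : tuple of integer net generations, one entry per user
    b     : current battery level (assumed to lie in 0..b_max)
    b_max : battery capacity
    """
    # Each user's action lies between 0 and her net generation x_i:
    # {0, ..., x_i} for a surplus, {x_i, ..., 0} for a deficit.
    per_user = []
    for xi in x:
        if xi >= 0:
            per_user.append(range(0, xi + 1))
        else:
            per_user.append(range(xi, 1))

    # Keep only joint actions that respect the battery limits.
    actions = []
    for a in product(*per_user):
        if 0 <= b + sum(a) <= b_max:
            actions.append(a)
    return actions


# Two users, one with a unit surplus and one with a unit deficit,
# sharing an empty battery of capacity 2.
print(allowable_actions((1, -1), 0, 2))
```

In this example the action that serves the deficit without accepting the surplus is excluded, because it would discharge an empty battery.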
Note also that the CMDP (P) is defined for a given initial distribution on $\mathcal{S}$; as such, its optimal value depends on this initial distribution. However, the choice of this initial distribution does not affect the results we derive in this paper.

B. Fundamental limits

Denote the sets of net demanding and net generating sources by

$$D := \{ i \in [n] \mid \Delta_i < 0 \}, \qquad G := \{ i \in [n] \mid \Delta_i > 0 \}.$$

The following theorem provides a lower bound on the objective of (P) under any policy.

Theorem 3.1: $\mathrm{LLR}_{sys} \geq \sum_{i \in D} (-\Delta_i)$.

Proof: It suffices to prove the stated lower bound on $\mathrm{LLR}_{sys}$ under any feasible policy for (P).

$$\begin{aligned}
\mathrm{LLR}_i &= \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}\left[ \mathbb{I}\{X_i(t) < 0\} (A_i(t) - X_i(t)) \right] \\
&= \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}\left[ \mathbb{I}\{X_i(t) < 0\} A_i(t) \right] + \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}\left[ \mathbb{I}\{X_i(t) > 0\} X_i(t) \right] - \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[X_i(t)] \\
&\overset{(a)}{\geq} \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}\left[ \mathbb{I}\{X_i(t) < 0\} A_i(t) \right] + \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}\left[ \mathbb{I}\{X_i(t) > 0\} A_i(t) \right] - \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[X_i(t)] \\
&= \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[A_i(t)] - \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[X_i(t)] \\
&\overset{(b)}{\geq} - \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[X_i(t)] = -\Delta_i.
\end{aligned}$$

The inequality (a) follows since $A_i(t) \leq X_i(t)$ when $X_i(t) > 0$. The inequality (b) is a consequence of our fairness constraint. The statement of the theorem now follows by summing over all net demanding sources.

The above bound shows that $\mathrm{LLR}_{sys}$ remains bounded away from zero when $D \neq \emptyset$, for any battery size. We next consider the case when all sources are net generative, i.e., $G = [n]$. We show that in this case the optimal $\mathrm{LLR}_{sys}$ goes down to zero exponentially with increasing battery size. To show this, we upper bound the optimal $\mathrm{LLR}_{sys}$ by an expression that decays exponentially to 0. For this, we extensively use Theorem A.1 from Appendix A. It shows that the optimal LLR for a single net generating user decays exponentially in battery size.
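Returning briefly to Theorem 3.1, the lower bound is easy to evaluate once the user drifts are known. The sketch below is only an illustration under stated assumptions: each user is taken to have two states, $+1$ (generate one unit) and $-1$ (demand one unit), with transition matrix rows $(p, 1-p)$ and $(1-q, q)$ in that state order. The function names and the numerical values are hypothetical and do not come from the paper.

```python
def stationary_two_state(p, q):
    """Stationary distribution of the chain with rows (p, 1-p) and (1-q, q).

    Assumes p and q are not both equal to 1, so the chain has a
    unique stationary distribution.
    """
    denom = 2.0 - p - q
    return ((1.0 - q) / denom, (1.0 - p) / denom)


def drift(p, q, states=(1, -1)):
    """Steady-state average net generation of one two-state user.

    `states` gives the net generation in the first and second state;
    the ordering (+1, -1) is an assumption of this sketch.
    """
    pi = stationary_two_state(p, q)
    return states[0] * pi[0] + states[1] * pi[1]


def fairness_lower_bound(drifts):
    """Right-hand side of Theorem 3.1: sum of -drift over net demanding users."""
    return sum(-d for d in drifts if d < 0)


# Hypothetical example: user 1 is strongly net generative, user 2 is
# mildly net demanding, and the system drift is positive.
drifts = [drift(0.9, 0.1), drift(0.4, 0.6)]
print("user drifts:", drifts)
print("system drift:", sum(drifts))
print("lower bound on LLR_sys under fairness:", fairness_lower_bound(drifts))
```

The example is chosen so that the system as a whole is net generative while the bound is still strictly positive, which is the situation highlighted after Theorem 3.1.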
Using Theorem A.1, we now show that the exponential decay of the optimal LLR holds even for multiple sources, if all of them are net generating.

Theorem 3.2: If $G = [n]$, we have

$$\limsup_{b_{\max} \to \infty} \frac{\log(\mathrm{LLR}_o)}{b_{\max}} = -c,$$

where $\mathrm{LLR}_o$ denotes the optimal LLR, $b_{\max}$ is the battery size, and $c$ is a positive constant.

Proof: We upper bound the solution of (P) by considering a particular policy. We divide the battery into $n$ equal chunks of size $\frac{b_{\max}}{n}$, allocating one chunk to each user. Consider the policy that optimally schedules each user using only the battery chunk allocated to her; as we mention in Appendix A, simple greedy operation is optimal when a battery (chunk) is used by a single user. Moreover, it is easy to see that this policy is fair.

Under the above policy, let $\mathrm{LLR}_i$ denote the loss of load rate of source $i$. Then, the $\mathrm{LLR}_{sys}$ of the combined system under this policy is $\sum_{i=1}^{n} \mathrm{LLR}_i$. We know from Theorem A.1 that

$$\lim_{b_{\max} \to \infty} \frac{\log(\mathrm{LLR}_i)}{b_{\max}} = -c_i, \tag{6}$$

for some positive constant $c_i$. Without loss of generality, let $c_1 \leq c_2 \leq \ldots \leq c_n$. It is not hard to see now that

$$\limsup_{b_{\max} \to \infty} \frac{\log(\mathrm{LLR}_{sys})}{b_{\max}} = -c_1,$$

which implies the statement of the theorem.

IV. PRICE OF FAIRNESS

In this section, we analyse the efficiency implications of the fairness constraint (5) in the LLR optimization formulation (P) introduced in the previous section. We introduce the notion of Price of Fairness (PoF), which is the ratio of the optimal LLR subject to the fairness constraint to the optimal LLR without this constraint. The main conclusion of this section is that the PoF can be arbitrarily large, even when all users are net generative. This means that imposing a strong fairness constraint in battery sharing can result in a substantial loss of efficiency. In the following section, we address this issue by proposing an alternative formulation that seeks to maximize fairness subject to maximal efficiency.
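As a numerical aside on the single-user greedy operation used in the proof of Theorem 3.2, the sketch below computes the loss of load rate of one user operating a battery alone and shows how it shrinks as the battery grows. It is an illustration only: the user is assumed to have two states, $+1$ and $-1$, with transition matrix rows $(p, 1-p)$ and $(1-q, q)$ in that order, battery levels are assumed to range over $0, \ldots, b_{\max}$, and the stationary distribution is approximated by power iteration rather than by the analysis of Appendix A. The function name `greedy_llr` and the parameter values are hypothetical.

```python
def greedy_llr(p, q, b_max, iters=5000):
    """Approximate LLR of a single two-state user under greedy operation.

    State index 0 means net generation +1, index 1 means -1.
    Greedy operation: store the unit surplus if there is room, and
    serve the unit demand if the battery is non-empty.
    """
    P = [[p, 1.0 - p], [1.0 - q, q]]
    n_levels = b_max + 1

    # Start from the uniform distribution over (generation state, level).
    dist = [[1.0 / (2 * n_levels)] * n_levels for _ in range(2)]

    for _ in range(iters):
        new = [[0.0] * n_levels for _ in range(2)]
        for xi in range(2):
            for b in range(n_levels):
                mass = dist[xi][b]
                if mass == 0.0:
                    continue
                # Next battery level under the greedy action.
                nb = min(b + 1, b_max) if xi == 0 else max(b - 1, 0)
                for xj in range(2):
                    new[xj][nb] += mass * P[xi][xj]
        dist = new

    # Demand goes unmet only when the user demands and the battery is empty;
    # the unmet amount is then exactly one unit.
    return dist[1][0]


# Hypothetical net generative user (generates with probability 0.7 each slot).
for b_max in range(1, 7):
    print(b_max, greedy_llr(0.7, 0.3, b_max))
```

Under these assumptions the printed values fall by a roughly constant factor per unit of battery capacity, which is the qualitative behaviour that (6) describes.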
To define PoF, let $\mathrm{LLR}_e$ denote the optimal loss of load rate without the fairness constraint, i.e.,

$$\mathrm{LLR}_e := \min_{\phi} \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \sum_{i=1}^{n} \mathbb{E}\left[ \mathbb{I}\{X_i(t) < 0\} \, (A_i(t) - X_i(t)) \right].$$

We refer to policies that solve the above optimization as efficient policies. It is not hard to see that the optimal loss of load rate is achieved by any greedy policy that (i) always charges the battery whenever energy is available, and (ii) always meets user demand whenever feasible. Formally, any policy that chooses actions satisfying the following conditions is efficient.

E1. If $0 \leq b + \sum_{i \in [n]} x_i \leq b_{\max}$, then $a = x$.
E2. If $b + \sum_{i \in [n]} x_i > b_{\max}$, then $a_i = x_i$ for all $i$ satisfying $x_i < 0$, and $b + \sum_{i \in [n]} a_i = b_{\max}$.
E3. If $b + \sum_{i \in [n]} x_i < 0$, then $a_i = x_i$ for all $i$ satisfying $x_i > 0$, and $b + \sum_{i \in [n]} a_i = 0$.

Let the space of actions $(a_1, \ldots, a_n)$ that satisfy the above efficiency conditions in state $(x, b)$ be denoted by $A_e(x, b)$, and let the history-dependent policies that satisfy these action constraints be denoted by $\Phi_e$. The price of fairness (PoF) is defined as

$$\mathrm{PoF} = \frac{\mathrm{LLR}_o}{\mathrm{LLR}_e}. \tag{7}$$

To analyse the PoF, we first consider the case where there is at least one net demanding source. In this case, if the system drift is itself negative, it is easy to see that $\mathrm{LLR}_o$ is bounded away from zero (using the same argument as in the proof of Theorem 3.1). Thus, the more interesting case is $\Delta > 0$. In this case, the following lemma shows that the PoF grows to infinity as $b_{\max} \to \infty$.

Lemma 4.1: Suppose that $|D| \geq 1$ (i.e., $\Delta_i < 0$ for some $i$) but the system is net generative (i.e., $\Delta > 0$). Then $\mathrm{PoF} \to \infty$ as $b_{\max} \to \infty$.

Proof: By Theorem 3.1, $\mathrm{LLR}_o$ is bounded away from zero for any value of $b_{\max}$.
However, $\mathrm{LLR}_e$ decays exponentially with $b_{\max}$; this is because under an efficient policy, one can think of the resulting LLR as arising from a single user with net generation process $\sum_{i \in [n]} X_i(\cdot)$. The exponential decay of $\mathrm{LLR}_e$ with $b_{\max}$ now follows from Theorem A.1 in the appendix.

Next, we consider the case where all users are net generating. In this case, one might expect that a strict fairness constraint does not affect efficiency. However, as the following lemma shows, the PoF can be arbitrarily large even when all sources are net generating.

Lemma 4.2: Given any $M > 0$, one can construct a system instance where $G = [n]$, such that $\mathrm{PoF} \geq M$.

Proof: We will construct an instance with 2 users that satisfies $\mathrm{PoF} \geq M$ for any arbitrary positive threshold $M$. The full details of the construction are rather cumbersome; we provide a sketch of the argument below.

For users 1 and 2, we construct two independent net generation processes such that the LLR decay rates corresponding to each user operating alone (the framework analysed in the appendix) are distinct, i.e., $\lambda_1 < \lambda_2$. In this case, it can be shown (using the precise characterization of the decay rate from large deviations theory) that the decay rate $\lambda_e$ associated with battery sharing between the users under an efficient policy satisfies $\lambda_e \in (\lambda_1, \lambda_2)$. Denoting the standalone LLR of user $i$ by $\mathrm{LLR}_{o,i}$, this implies

$$\lim_{b_{\max} \to \infty} \frac{\mathrm{LLR}_{o,1}}{\mathrm{LLR}_e} = \infty.$$

Thus, there exists $\bar{b}_{\max} > 0$ such that

$$\frac{\mathrm{LLR}_{o,1}(\bar{b}_{\max})}{\mathrm{LLR}_e(\bar{b}_{\max})} > 2M.$$

From here on, we freeze the battery size to this value $\bar{b}_{\max}$, so the dependence of LLR on $\bar{b}_{\max}$ will be suppressed. Let $\delta = \mathrm{LLR}_{o,1}$. We now perturb the net generation process of user 2 as follows: $\tilde{X}_2(s) = X_2(s) + a$, for $a \geq 0$ such that

$$\sum_{s \in S} s \, \pi(s) \, \mathbb{I}\{\tilde{X}(s) < 0\} \in (-\delta/2, 0).$$

Note that quantities in the perturbed system are represented with a tilde accent.
In the perturbed system, user 2 is more generating than in the original system, and the rate at which it demands energy from the system is at most $\delta/2$. It is easy to see that $\widetilde{\mathrm{LLR}}_e \leq \mathrm{LLR}_e$. Moreover, in the perturbed system operated under the fairness constraint, the rate at which energy is accepted from user 2 is at most $\delta/2$. This means that the reduction in the LLR of user 1 is at most $\delta/2$ relative to standalone operation, i.e.,

$$\widetilde{\mathrm{LLR}}_o \geq \mathrm{LLR}_{o,1} - \frac{\delta}{2} \geq \frac{1}{2} \mathrm{LLR}_{o,1}.$$

Thus,

$$\widetilde{\mathrm{PoF}} = \frac{\widetilde{\mathrm{LLR}}_o}{\widetilde{\mathrm{LLR}}_e} \geq \frac{1}{2} \frac{\mathrm{LLR}_{o,1}}{\mathrm{LLR}_e} > M,$$

where the last inequality holds because the battery size was chosen so that $\mathrm{LLR}_{o,1} / \mathrm{LLR}_e > 2M$. This completes the proof.

An example of the phenomenon of unbounded PoF when all sources are net generating is shown in Figure 1a. Note that Lemma 4.2 does not show that the PoF for a given net generation profile grows to infinity as $b_{\max} \to \infty$; see Figure 1b, where the PoF seems to saturate with increasing $b_{\max}$.

[Figure 1: two panels plotting Price of Fairness against Battery Level. Panel (a) uses a logarithmic vertical axis, with $P_1 = \begin{bmatrix} 0.95 & 0.05 \\ 0.95 & 0.05 \end{bmatrix}$, $P_2 = \begin{bmatrix} 0.51 & 0.49 \\ 0.51 & 0.49 \end{bmatrix}$. Panel (b) uses $P_1 = \begin{bmatrix} 0.9 & 0.1 \\ 0.9 & 0.1 \end{bmatrix}$, $P_2 = \begin{bmatrix} 0.51 & 0.49 \\ 0.51 & 0.49 \end{bmatrix}$, $P_3 = \begin{bmatrix} 0.55 & 0.45 \\ 0.55 & 0.45 \end{bmatrix}$.]

Fig. 1: These figures show the change in price of fairness as battery size increases when the sources are independent. $S = \{-1, 1\}^n$ and $P_i$ denotes the transition matrix of source $i$.

V. OPTIMIZING FAIRNESS SUBJECT TO EFFICIENCY

It is clear from Section IV that there exist cases in which the Price of Fairness becomes unbounded as battery size increases. In this section, we formulate a problem in which we try to maximize fairness subject to efficiency, i.e., we pose being efficient as a hard constraint and ask how fair we can get.

We first define the objective that captures the fairness in the system. Our fairness objective is to minimize the maximum, over all users, of the amount a user draws minus the amount it supplies to the battery.
Recall the net contribution $C_i = \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[A_i(t)]$ of user $i$ defined in Section III. Now, by work conservation, we have $\sum_{i \in [n]} C_i = 0$ under any policy. The system is said to be completely fair when $C_i = 0$ for all $i \in [n]$. In general, this may not be the case. Hence, to maximize fairness, we maximize the minimum of the $C_i$.

Here, by being efficient, we mean being in the space of policies that minimize the LLR without the hard fairness constraint. In general, the optimal value of an objective, in this case fairness, over an arbitrary set of policies is not easily computable. However, an observation we made in Section IV allows us to formulate this as a tractable problem. Recall from Section IV that efficient policies are those that take actions in the space $A_e(x, b)$ when the state is $(x, b) \in \mathcal{S}$. As a consequence, the space of efficient policies can be described by specifying only the space of allowable actions. This allows us to formulate this problem as a CMDP.

Thus, we define the problem of maximizing fairness subject to efficiency as follows:

$$\max_{\phi \in \Phi_e} \; \min_{i \in [n]} \; \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[A_i(t)], \tag{8}$$

where the maximization is over all $\phi$ that are efficient, i.e., that are constrained to take actions in $A_e(x, b)$ in every state $(x, b)$. This problem can also be formulated as a CMDP as follows:

$$\text{(F)} \quad \begin{aligned} \underset{\phi, \theta}{\text{maximize}} \quad & \theta \\ \text{subject to} \quad & \theta \leq \lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T} \mathbb{E}[A_i(t)], \quad \forall i \in [n], \\ & \phi \in \Phi_e. \end{aligned}$$

A. Reduction to a Linear Program

The optimization problem (F) above is extremely complex, since the class of all history-dependent policies is a highly unstructured space. This makes it extremely hard to derive a policy or a battery management algorithm. However, a remarkable simplification is possible, which allows for the computation of such policies in an efficient manner.
A Markov decision process is said to be unichain if, under any stationary deterministic policy, the corresponding Markov chain contains a single (aperiodic) ergodic class. This ensures the existence of a unique stationary distribution, independent of the initial distribution. Notice that under any policy $\phi \in \Phi_e$ the battery dynamics are identical and deterministic. Consequently, under any policy $\phi \in \Phi_e$, the Markov chain $(X(t), B(t))$ has a single ergodic class, making this CMDP unichain.

A well-known fact about unichain CMDPs is that stationary randomized policies dominate (see, e.g., Theorem 4.1 of [9]); thanks to this, we can limit our search to stationary randomized policies. Moreover, the CMDP admits an elegant equivalent linear programming formulation (Theorem 4.3 of [9]), where the optimization is not over policies, but rather over occupation measures (or probability distributions) on the product of state and action spaces. We introduce this formulation below.

An occupation measure $\rho$ is a probability distribution on $\mathcal{S} \times A_e$, where $A_e = \bigcup_{(x, b) \in \mathcal{S}} A_e(x, b)$. In the LP formulation, the occupation measure is required to satisfy

$$\sum_{(x, b) \in \mathcal{S}} \sum_{a \in A_e(x, b)} \rho(x, b, a) = 1, \qquad \rho(x', b', a') \geq 0, \tag{9}$$

$$\sum_{(x, b) \in \mathcal{S}} \sum_{a \in A_e(x, b)} \rho(x, b, a) \left( \delta_{(x', b')}(x, b) - P(x', b' \mid a, x, b) \right) = 0 \tag{10}$$

for all $(x', b') \in \mathcal{S}$, $a' \in A_e$, where

$$\delta_{(u, v)}(u', v') = \begin{cases} 1, & \text{if } u = u' \text{ and } v = v', \\ 0, & \text{otherwise}, \end{cases}$$

and $P(x', b' \mid a, x, b)$ is the probability of transition from state $(x, b)$ to state $(x', b')$ under action $a \in A_e(x, b)$. In our case, these dynamics are simple:

$$P(x', b' \mid a, x, b) = P(x' \mid x) \, \mathbb{I}\Big\{ b' = b + \sum_{i \in [n]} a_i \Big\},$$

and $P(x' \mid x)$ is the probability that the background process $X(t)$ transitions from $x$ to $x'$. (9) encodes that $\rho$ is a probability distribution on $\mathcal{S} \times A_e$.
(10) imposes that $\rho$ is an invariant distribution for the controlled Markov chain. The equivalent LP formulation of (F) is then given by [9]:

$$\text{(LP)} \quad \begin{aligned} \underset{\rho, \theta}{\text{maximize}} \quad & \theta \\ \text{subject to} \quad & \theta \leq \sum_{u = s_i^-}^{s_i^+} \rho(a_i = u) \, u, \quad \forall i \in [n], \\ & \rho \in \rho_e. \end{aligned}$$

Here, $\rho_e$ is the set of $\rho$ that satisfy (9)-(10), and with a slight abuse of notation, we define

$$\rho(a_i = u) = \sum_{(x, b, a) \in \mathcal{S} \times A_e \,:\, a_i = u} \rho(x, b, a).$$

[Figure 2: plot of $\max_{\phi} \min_i C_i$ against battery size, with one curve each for $n = 1$, $n = 2$, $n = 3$ and $n = 4$.]

Fig. 2: This figure shows how the maxmin fairness varies with increasing battery size. $S = \{-1, 1\}^n$ and $P_i$ denotes the transition matrix of source $i$. We have the following sources, and $n = k$ implies a system which has the first $k$ of these sources (see Footnote 2): $P_1 = \begin{bmatrix} 0.5 & 0.5 \\ 0.6 & 0.4 \end{bmatrix}$, $P_2 = \begin{bmatrix} 0.5 & 0.5 \\ 0.8 & 0.2 \end{bmatrix}$, $P_3 = \begin{bmatrix} 0.7 & 0.3 \\ 0.7 & 0.3 \end{bmatrix}$, $P_4 = \begin{bmatrix} 0.8 & 0.2 \\ 0.4 & 0.6 \end{bmatrix}$.

The first constraint in (LP) arises from the fairness definition, restated using occupation measures. This formulation allows for efficient computation of the optimal fairness. We next prove a result giving structural properties of efficient policies.

Proposition 5.1: Under any efficient policy, the battery dynamics remain the same. Moreover, the underlying marginal distribution over states ($\rho(x, b)$ for all $(x, b) \in \mathcal{S}$) is the same across efficient policies.

Proof: Using E1, E2 and E3, we note that irrespective of what actions we choose for each agent, the sum total of all actions in $A_e(x, b)$ is the same for each pair $(x, b) \in \mathcal{S}$. Since the battery dynamics are $b(t+1) = b(t) + \sum_i a_i(t)$, the battery dynamics stay the same across efficient policies. Using the first claim, we can see that the transition matrix for $(x, b)$ is the same across all efficient policies. Thus, $\rho(x, b)$ is the same across all efficient policies.
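The first claim of Proposition 5.1 can be checked by brute force on small instances. The following sketch enumerates the efficient action set $A_e(x, b)$ from conditions E1 to E3 and verifies that every efficient action in a given state moves the battery by the same amount. The helper names are hypothetical, and the check assumes unit generation and demand ($S = \{-1, 1\}^n$) with battery levels $0, \ldots, b_{\max}$; it is an illustration, not a substitute for the proof.

```python
from itertools import product


def efficient_actions(x, b, b_max):
    """Enumerate A_e(x, b) according to conditions E1-E3 (illustrative)."""
    x = tuple(x)
    total = b + sum(x)
    ranges = [range(0, xi + 1) if xi >= 0 else range(xi, 1) for xi in x]

    out = []
    for a in product(*ranges):
        if total > b_max:
            # E2: serve every demand fully and fill the battery.
            ok = all(ai == xi for ai, xi in zip(a, x) if xi < 0)
            ok = ok and b + sum(a) == b_max
        elif total < 0:
            # E3: accept every surplus fully and empty the battery.
            ok = all(ai == xi for ai, xi in zip(a, x) if xi > 0)
            ok = ok and b + sum(a) == 0
        else:
            # E1: accept and serve everything.
            ok = a == x
        if ok:
            out.append(a)
    return out


def total_flow_is_policy_independent(n, b_max):
    """Check that all efficient actions in a state have the same sum."""
    for x in product((-1, 1), repeat=n):
        for b in range(b_max + 1):
            sums = {sum(a) for a in efficient_actions(x, b, b_max)}
            if len(sums) != 1:
                return False
    return True


# A full battery, two generating users and one demanding user: the
# operator may accept the spare unit from either generator.
print(efficient_actions((1, 1, -1), 2, 2))
print(total_flow_is_policy_independent(n=3, b_max=3))
```

The example state has more than one efficient action, which is exactly the freedom that problem (F) exploits: efficient policies differ only in how a fixed total flow is split among users.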
We have computed the solution to (LP), and the plot of optimal fairness with increasing battery size for different user configurations is given in Figure 2. We have taken the background processes of the users to be independent for these evaluations. From this plot, we can observe that the fairness tends to saturate with increasing battery size, i.e., after a certain threshold, the fairness value approaches a negative constant (except for the case $n = 1$, where the efficient policy is also fair). This illustrates that increasing the battery size does not necessarily improve the fairness of efficient policies.

Figure 2 indicates that even a relaxation of the fairness constraint in (P) to $C_i \ge -\delta$ for some $\delta > 0$ is not guaranteed to result in an efficient battery operation, even as $b_{\max} \to \infty$. To illustrate this explicitly, in Figure 3a and Figure 3b, we plot the fairness efficiency frontier for 3 and 5 users, respectively. These plots show how close we get to efficient operation by relaxing the fairness constraint from $C_i \ge 0$ to $C_i \ge -\delta$ for all $i$. The shaded region shows all the points that are achievable, and the frontier is the boundary of the achievable region. For the case when an efficient policy is fair, the frontier would coincide with the axes.

Footnote 2: The irregularities for an even number of users occur due to integer effects.

[Figure 3(a): $\varepsilon$ plotted against $\delta$ for three users with
$P_1 = \begin{pmatrix} 0.6 & 0.4 \\ 0.6 & 0.4 \end{pmatrix}$, $P_2 = \begin{pmatrix} 0.9 & 0.1 \\ 0.9 & 0.1 \end{pmatrix}$, $P_3 = \begin{pmatrix} 0.6 & 0.4 \\ 0.5 & 0.5 \end{pmatrix}$.]

[Figure 3(b): $\varepsilon$ plotted against $\delta$ for five users with
$P_1 = \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix}$, $P_2 = \begin{pmatrix} 0.6 & 0.4 \\ 0.9 & 0.1 \end{pmatrix}$, $P_3 = \begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix}$, $P_4 = \begin{pmatrix} 0.8 & 0.2 \\ 0.8 & 0.2 \end{pmatrix}$, $P_5 = \begin{pmatrix} 0.51 & 0.49 \\ 0.51 & 0.49 \end{pmatrix}$.]

Fig. 3: These figures show the fairness efficiency frontier for 3 and 5 users, respectively, with $b_{\max} = 12$ and $S = \{-1, 1\}^n$; $P_i$ denotes the transition matrix of source $i$. This optimal loss of load is denoted by $\mathrm{LLR}_\delta$. The plot is of $\varepsilon$ versus $\delta$, where $\varepsilon = \frac{\mathrm{LLR}_\delta}{\mathrm{LLR}_e} - 1$
and the fairness constraints are $C_i \ge -\delta$.

This reveals a fundamental conflict between efficiency and fairness: a small relaxation of the fairness constraint does not necessarily improve efficiency, and a small relaxation of the efficiency requirement does not necessarily imply fairness for all users (this follows from our previous observation of unbounded PoF).

To concretize our claim regarding the fundamental conflict between efficiency and fairness, we show in the following example that, irrespective of the battery size, any efficient policy always has a non-zero maxmin fair value. We consider a setup of two users, wherein each user either generates unit power or demands unit power at each time instant. Let the transition probability matrix of user $i$ be given by

$$P_i = \begin{pmatrix} p_i & 1 - p_i \\ 1 - q_i & q_i \end{pmatrix}. \tag{11}$$

Then, we define the generation probability of user $i$ by

$$\alpha_i = \frac{1 - q_i}{2 - p_i - q_i}.$$

Using this setup, we give the following proposition, which provides a class of examples where the fairness remains bounded away from zero under efficient policies. We denote the maxmin fair value under efficient policies by $F_e$.

Proposition 5.2: Consider two users with $S = \{-1, 1\}^2$ and the transition probability matrix of user $i$ being $P_i$ (as in (11)). If $\alpha_1 > \alpha_2$, we have $F_e = \alpha_2 - \alpha_1$ for all even battery sizes.

Proof: First, we note that $\rho(b = k) = 0$ for $k$ odd. This is because, for an even battery size, in the long run the battery spends only a finite amount of time in the odd battery levels and an infinite amount of time in the even battery levels. To see this, note that at each even battery level, we either accept two units of power, accept one unit and give one unit, or supply two units of power. Thus the battery level does not change its parity unless it hits a boundary.
If the battery level is initially odd, it will hit either the upper boundary ($b_{\max}$) or the lower boundary ($0$) in finite time, both of which are even. After this, the battery level stays even.

Recall that $\alpha_i$ denotes the generation probability of user $i$, and let $\alpha'_i = 1 - \alpha_i$. Let $\rho(\cdot, \cdot)$ denote the occupation measure of states of any efficient policy (Prop. 5.1). Given that the policy is efficient, from E1, E2 and E3, the only freedom in the action arises when the background state is $x = (-1, -1)$ and the battery level is $b = 1$, and when the background state is $x = (1, 1)$ and the battery level is $b = b_{\max} - 1$. Let us assume that when $x = (-1, -1)$ and $b = 1$, we choose action $a = (-1, 0)$ with probability $s$ and action $a = (0, -1)$ with probability $1 - s$. Also, assume that when $x = (1, 1)$ and $b = b_{\max} - 1$, we choose action $a = (1, 0)$ with probability $t$ and action $a = (0, 1)$ with probability $1 - t$. Thus, we have

$$C_1 - C_2 = 2(\alpha_1 - \alpha_2) + (2t - 1)\, \rho\bigl(x = (1,1),\, b = b_{\max} - 1\bigr) - (2s - 1)\, \rho\bigl(x = (-1,-1),\, b = 1\bigr).$$

Since $\rho(b = k) = 0$ for $k$ odd, and both $1$ and $b_{\max} - 1$ are odd, we have $C_1 - C_2 = 2(\alpha_1 - \alpha_2)$. By energy conservation (the battery level is bounded, so the long-run average of $a_1 + a_2$ is zero), $C_1 + C_2 = 0$. Thus, we have $C_1 = \alpha_1 - \alpha_2$ and $C_2 = \alpha_2 - \alpha_1$. For $\alpha_1 > \alpha_2$, $C_2 < C_1$ and hence $F_e = \alpha_2 - \alpha_1$ for all even battery sizes.

VI. CONCLUDING REMARKS AND FUTURE WORK

We studied the problem of scheduling a shared energy storage among multiple users by concentrating on the tradeoff between fairness of use across users and the efficiency of use of the battery. Our results show that there are hard fundamental limits to fairness under efficiency constraints and to efficiency under fairness constraints. In particular, a larger battery size does not help in achieving both fairness and efficiency simultaneously.
Our effort at characterizing these tradeoffs was via optimization of efficiency subject to fairness (and vice-versa). We have also simulated the fairness-efficiency frontier of levels of efficiency and fairness that cannot be jointly improved upon under any policy or battery size. Our results also indicate that sharing contracts and costing of batteries should be done with care, since fair operations imply suboptimal battery utilization. Devising an equitable sharing of costs and a market for transacting excess energy are fascinating directions of future research.

APPENDIX A
LARGE BATTERY ASYMPTOTICS FOR A SINGLE USER

In this section, we consider the problem of minimizing the LLR associated with a single user $i$ operating a battery of size $b_{\max}$ alone. We make the following remarks:

1) The net generation process $X_i(\cdot)$ corresponding to user $i$ is a functional of the DTMC $X(\cdot)$.
2) An elementary energy conservation argument shows that any policy of battery operation for user $i$ is fair with respect to the fairness notion (5).
3) The LLR of user $i$ is minimized by a simple greedy policy: Suppose that the net generation is $x_i$ and the battery occupancy equals $b$. If $0 \le b + x_i \le b_{\max}$, set action $a = x_i$. If $b + x_i < 0$, set $a = -b$, and if $b + x_i > b_{\max}$, set $a = b_{\max} - b$. Under this policy, the battery evolution is given by

$$B(t+1) = [B(t) + X_i(t)]_{[0, b_{\max}]}, \tag{12}$$

where $[z]_{[0, b_{\max}]} = \min(\max(z, 0), b_{\max})$.

The main result of this section is that if $\Delta_i > 0$, then $\mathrm{LLR}_{o,i}$, the optimal LLR for user $i$, decays exponentially with the battery size $b_{\max}$.

Theorem A.1: For user $i$ operating a battery of size $b_{\max}$ in a standalone fashion, if $\Delta_i > 0$, then

$$\lim_{b_{\max} \to \infty} \frac{\log(\mathrm{LLR}_{o,i})}{b_{\max}} = -\lambda_i,$$

where $\lambda_i \in (0, \infty)$.

Theorem A.1 is a consequence of the following lemmas.

Lemma A.1: Let the setting of Theorem A.1 hold.
Under the battery evolution given by (12),

$$\lim_{b_{\max} \to \infty} \frac{\log(\mathrm{LLR}_{o,i})}{b_{\max}} = \lim_{b_{\max} \to \infty} \frac{\log P[B = 0]}{b_{\max}},$$

where $B$ is distributed according to the steady state distribution of $B(\cdot)$.

Proof: It is easy to see that $\mathrm{LLR}_{o,i} \le P[B = 0]$. It therefore suffices to show that $\mathrm{LLR}_{o,i} \ge c\, P[B = 0]$ for some $c \in (0, 1)$. This is a direct consequence of our assumption that $P[X(t+1) = s \mid X(t) = s] > 0$ for all $s \in S$, which ensures that each time the battery drains to zero, there is a positive probability of a loss of load. This argument can be easily formalized using the renewal reward theorem.

Lemma A.2: Let the setting of Theorem A.1 hold. Under the battery evolution given by (12),

$$\lim_{b_{\max} \to \infty} \frac{\log P[B = 0]}{b_{\max}} = -\lambda_i,$$

where $\lambda_i \in (0, \infty)$.

Proof: We analyse the asymptotics of $P[B = 0]$ using the reversed system [11], which is obtained by interchanging the roles of generation and demand. In other words, $X_i^r(t) = -X_i(t)$ (we use the superscript $r$ to represent quantities in the reversed system). Thus, $\Delta_i^r = -\Delta_i < 0$. It is not hard to see that $P[B = 0] = P[B^r = b_{\max}]$. The battery evolution in the reversed system is equivalent to the buffer evolution in a finite buffer Markov modulated queue with negative drift, for which logarithmic asymptotics of the form

$$\lim_{b_{\max} \to \infty} \frac{\log P[B^r = b_{\max}]}{b_{\max}} = -\lambda$$

are known; see [12, Section 6.5] and [13].

REFERENCES

[1] K. N. Chadha, A. A. Kulkarni, and J. Nair, “Efficiency fairness tradeoff in battery sharing,” in Decision and Control (CDC), 2019 IEEE 58th Annual Conference on. IEEE, 2019, p. pages.
[2] D. Kalathil, C. Wu, K. Poolla, and P. Varaiya, “The sharing economy for the electricity storage,” IEEE Transactions on Smart Grid, vol. 10, no. 1, pp. 556–567, Jan. 2019. [Online]. Available: https://doi.org/10.1109/tsg.2017.2748519
[3] P. Chakraborty, E. Baeyens, K. Poolla, P. P. Khargonekar, and P.
Varaiya, “Sharing storage in a smart grid: A coalitional game approach,” IEEE Transactions on Smart Grid, pp. 1–1, 2018. [Online]. Available: https://doi.org/10.1109/tsg.2018.2858206
[4] C. Wu, J. Porter, and K. Poolla, “Community storage for firming,” in 2016 IEEE International Conference on Smart Grid Communications (SmartGridComm). IEEE, Nov. 2016. [Online]. Available: https://doi.org/10.1109/smartgridcomm.2016.7778822
[5] Y. H. Zhou, A. Scheller-Wolf, N. Secomandi, and S. Smith, “Managing wind-based electricity generation in the presence of storage and transmission capacity,” Production and Operations Management, Nov. 2018. [Online]. Available: https://doi.org/10.1111/poms.12946
[6] J. H. Kim and W. B. Powell, “Optimal energy commitments with storage and intermittent supply,” Operations Research, vol. 59, no. 6, pp. 1347–1360, Dec. 2011. [Online]. Available: https://doi.org/10.1287/opre.1110.0971
[7] M. Korpaas, A. T. Holen, and R. Hildrum, “Operation and sizing of energy storage for wind power plants in a market system,” International Journal of Electrical Power & Energy Systems, vol. 25, no. 8, pp. 599–606, Oct. 2003. [Online]. Available: https://doi.org/10.1016/s0142-0615(03)00016-4
[8] P. M. van de Ven, N. Hegde, L. Massoulié, and T. Salonidis, “Optimal control of end-user energy storage,” IEEE Transactions on Smart Grid, vol. 4, 03 2012.
[9] E. Altman, Constrained Markov Decision Processes, 1999.
[10] M. L. Puterman, Markov decision processes: discrete stochastic dynamic programming. John Wiley & Sons, 2014.
[11] D. Mitra, “Stochastic theory of a fluid model of producers and consumers coupled by a buffer,” Advances in Applied Probability, vol. 20, no. 3, pp. 646–676, 1988.
[12] A. J. Ganesh, N. O’Connell, and D. J. Wischik, Big queues. Springer, 2004.
[13] F. Toomey, “Bursty traffic and finite capacity queues,” Annals of Operations Research, vol. 79, pp. 45–62, 1998.

Karan N.
Chadha Karan is a dual degree student of Electrical Engineering at Indian Institute of Technology Bombay (IITB). His research interests include applied probability, game theory, optimization and learning theory.

Ankur A. Kulkarni Ankur is an Associate Professor with the Systems and Control Engineering group at Indian Institute of Technology Bombay (IITB). He received his B.Tech. from IITB in 2006, M.S. in 2008 and Ph.D. in 2010, both from the University of Illinois at Urbana-Champaign (UIUC). From 2010 to 2012 he was a post-doctoral researcher at the Coordinated Science Laboratory at UIUC. His research interests include information theory, stochastic control, game theory, combinatorial coding theory problems, optimization and variational inequalities, and operations research. He was an Associate (from 2015–2018) of the Indian Academy of Sciences, Bangalore, a recipient of the INSPIRE Faculty Award of the Department of Science and Technology, Government of India, 2013, Best paper awards at the National Conference on Communications, 2017, Indian Control Conference, 2018 and International Conference on Signal Processing and Communications (SPCOM) 2018, Excellence in Teaching Award 2018 at IITB and the William A. Chittenden Award, 2008 at UIUC. He is a consultant to the Securities and Exchange Board of India on regulation of high frequency trading.

Jayakrishnan Nair Jayakrishnan received his BTech and MTech in Electrical Engg. (EE) from IIT Bombay (2007) and Ph.D. in EE from California Inst. of Tech. (2012). He has held post-doctoral positions at California Inst. of Tech. and Centrum Wiskunde & Informatica. He is currently an Assistant Professor in EE at IIT Bombay. His research focuses on modeling, performance evaluation, and design issues in queueing systems and communication networks.