A Random Search Framework for Convergence Analysis of Distributed Beamforming with Feedback
The focus of this work is on the analysis of transmit beamforming schemes with a low-rate feedback link in wireless sensor/relay networks, where nodes in the network need to implement beamforming in a distributed manner. Specifically, the problem of …
Authors: C. Lin, V. V. Veeravalli, S. Meyn
Che Lin, Member, IEEE, Venugopal V. Veeravalli, Fellow, IEEE, and Sean Meyn, Fellow, IEEE

Abstract: The focus of this work is on the analysis of transmit beamforming schemes with a low-rate feedback link in wireless sensor/relay networks, where nodes in the network need to implement beamforming in a distributed manner. Specifically, the problem of distributed phase alignment is considered, where neither the transmitters nor the receiver has perfect channel state information, but there is a low-rate feedback link from the receiver to the transmitters. In this setting, a framework is proposed for systematically analyzing the performance of distributed beamforming schemes. To illustrate the advantage of this framework, a simple adaptive distributed beamforming scheme that was recently proposed by Mudambai et al. is studied. Two important properties for the received signal magnitude function are derived. Using these properties and the systematic framework, it is shown that the adaptive distributed beamforming scheme converges both in probability and in mean. Furthermore, it is established that the time required for the adaptive scheme to converge in mean scales linearly with respect to the number of sensor/relay nodes.

Index Terms: Array signal processing, convergence of numerical methods, detectors, distributed algorithms, feedback communication, networks, relays.

I. INTRODUCTION

The problem of distributed beamforming arises quite naturally in wireless sensor/relay networks. In a sensor network, sensors make estimates of a common observed phenomenon and reach a consensus using a local message passing algorithm. In a relay network, a source node intends to communicate with the destination node by passing the message to all relay nodes.
In both settings, the sensor/relay nodes then serve as distributed transmitters and seek to convey a common message to the intended receiver. To preserve energy in this stage, transmit beamforming has emerged as a promising scheme due to its potential array gain and low complexity. However, perfect channel state information (CSI) at the transmitter is required by conventional transmit beamforming schemes to generate beamforming coefficients and achieve phase alignment at the receiver end. This requirement and the distributed nature of wireless sensor/relay networks make it difficult to implement transmit beamforming schemes in practice.

This research was supported in part by the NSF awards #CCF 0431088 and #CNS 0831670, and ITMANET DARPA #RK 2006-07284 through the University of Illinois, and by a Vodafone Foundation Graduate Fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF or DARPA. Che Lin is with the Institute of Communication Engineering, National Tsing Hua University, Hsinchu, 30013, Taiwan (e-mail: clin@ee.nthu.edu.tw). Venugopal V. Veeravalli and Sean Meyn are with the Coordinated Science Laboratory, University of Illinois, Urbana-Champaign, Urbana, IL 61801, USA (e-mail: {vvv, meyn}@illinois.edu).

Although obtaining perfect CSI may be too expensive from a practical point of view, partial CSI can be made available via a low-rate feedback link from the receiver to the transmitters. As a consequence, there has been increased interest in designing efficient schemes that achieve distributed phase alignment in the presence of a low-rate feedback link [1], [2], [4], [5]. In this work, our goal is to provide a framework for systematically analyzing the performance of a general set of distributed beamforming schemes with such low-rate feedback.
To illustrate the advantages of our framework, we focus on the analysis of a recently proposed training scheme for distributed beamforming [1], [2]. The proposed scheme is a simple adaptive algorithm using one bit of feedback information, and is attractive in practice since it is simple to implement. Naturally, one would expect a tradeoff in energy consumption due to possible slow convergence of distributed beamforming, but surprisingly, the scheme proposed in [1] converges rapidly and hence utilizes energy efficiently. The scheme adjusts its phases for all sensors simultaneously in each time slot to achieve phase alignment. This reduces the overhead significantly compared with direct channel estimation between each source node and the destination node. In fact, the convergence time of the scheme scales linearly with the number of nodes.

Although the scheme of [1] has many desirable features, the fundamental reasons behind the effectiveness of the scheme are unclear from previous work. In [2], the analyses of the convergence and linear scalability of distributed beamforming schemes have been based on model approximations, which may be loose for some cases. Assuming the stepsize approaches zero, stochastic approximation is used in [3] to show the convergence of the one-bit scheme in distribution. Furthermore, the authors proposed two more algorithms, the signed algorithm and the ρ% solution algorithm, and proved the convergence of both algorithms via the same technique. A discrete version of the problem has been solved in [4], [5] by considering a simplified model with a binary channel and binary signaling.
In this work, instead of focusing on the convergence of a particular algorithm for a particular function, we seek a fundamental understanding of the convergence of distributed beamforming schemes more generally by studying them within the framework of local random search algorithms. Through this framework, we are able to provide a more comprehensive analysis of the fast convergence and linear scalability of the scheme proposed in [1]. In particular, our analysis does not involve approximation of any sort and hence makes statements on convergence and linear scalability in [2], [3] more rigorous. Our result is also stronger than that in [3] in the sense that convergence in probability is proved instead of convergence in distribution. Further, we show that due to the special structure of the objective function considered in this problem, any adaptive distributed beamforming scheme that can be reformulated as a random search algorithm converges in probability. This broad set of algorithms also includes the signed and the ρ% solution algorithms proposed in [3] and makes our analysis more general and rigorous than existing work in the literature.

We organize the paper as follows: In Section II, we introduce the system model and the received signal magnitude function, which is used as our metric to measure the beamforming array gain throughout the paper. In Section III, we propose a framework that allows for a systematic analysis of a general set of adaptive distributed beamforming schemes. Specifically, we reformulate this set of adaptive distributed beamforming schemes as random search algorithms via a general framework. This reformulation provides insights into the necessary condition for the convergence of the scheme proposed in [1]. These insights lead us to investigate the properties of the received signal magnitude function in Section IV.
We further use these properties to prove the convergence of the local random search algorithm in probability and in mean, and provide simulations to validate our analysis. In Section V, we show that the time required for the algorithm to converge in mean scales linearly with the number of nodes. We also provide numerical results that validate our analysis. Finally, we conclude the paper in Section VI and suggest directions for future research.

II. SYSTEM SETUP

We consider the problem of distributed beamforming, where $n_s$ transmitters seek to beamform a common message to one receiver in a distributed manner. We assume that each transmitter and the receiver is equipped with one antenna, and that the channels from the transmitters to the receiver experience frequency-flat, slow fading. The discrete-time, complex baseband system model over a coherence interval is given by

$$y[t] = \sum_{i=1}^{n_s} h_i\, g_i[t]\, s[t] + w[t] = \sum_{i=1}^{n_s} a_i\, b_i[t]\, e^{j(\phi_i + \psi_i[t])}\, s[t] + w[t] \tag{1}$$

where $s[t] \in \mathbb{C}$ is the transmitted common message, $y[t] \in \mathbb{C}$ is the received signal, and $w[t] \sim \mathcal{CN}(0, \sigma^2)$ corresponds to the additive white Gaussian noise. For transmitter $i$, we denote the channel fading gains by $h_i = a_i e^{j\phi_i} \in \mathbb{C}$ and beamforming coefficients by $g_i[t] = b_i[t] e^{j\psi_i[t]} \in \mathbb{C}$. Note that $a_i \geq 0$, $b_i[t] \geq 0$, and $\phi_i \in [0, 2\pi]$, $\psi_i[t] \in [0, 2\pi]$ for all $i$ and $t$ since they are the corresponding magnitudes and phases of $h_i$ and $g_i$, respectively. Moreover, $a_i$ and $\phi_i$ are considered to be constant with time over the coherence interval due to the slow fading assumption. We assume an average power constraint on $s[t]$ given by $E[|s[t]|^2] \leq P$ for all $t$.

We assume a noncoherent communication model, where the realization of the channel is unknown at both the transmitters and receiver.
There is, however, an error-free, zero-delay feedback link of finite capacity from the receiver to all transmitters conveying low-rate partial channel state information (CSI) in each time step.

The goal of distributed beamforming is to pick the beamforming coefficients $\{g_i[t] = b_i[t] e^{j\psi_i[t]}\}$ to maximize the received SNR. In a noncoherent setting and with a low-rate feedback link, beamforming can only be achieved adaptively through training. Without loss of generality, we assume that the signal $s[t]$ is constant during the training stage. Furthermore, we make the following two simplifications. First, we assume that each transmitter utilizes the same amount of energy for each transmission, i.e., $b_i[t] = 1$ for all $i$ and $t$; that is, we do not optimize the beamforming gains, and we therefore set $s[t] = \sqrt{P}$. This assumption is justified for situations where the transmitters rely on a limited energy source (battery) and allowing them to use different amounts of energy would cause some nodes to use up their energy before others. Secondly, we assume that the receiver can estimate the magnitude of the signal component¹ at the receiver (without the noise term $w[t]$ in (1)). We therefore use received signal magnitude as the metric for optimizing the beamforming phases. The received signal magnitude can be expressed as

$$\mathrm{Mag}(\theta_1[t], \cdots, \theta_{n_s}[t]) = \left| \sqrt{P} \sum_{i=1}^{n_s} a_i e^{j\theta_i[t]} \right| \tag{2}$$

where $\theta_i[t] = \phi_i + \psi_i[t]$ is the total received phase for sensor $i$. It is easy to see that $\mathrm{Mag}(\cdot)$ is maximized when the phases $\{\theta_i[t]\}$ are aligned, i.e., they are equal to each other (modulo $2\pi$). Our goal is to study adaptive distributed beamforming schemes that achieve this phase alignment through the use of a low-rate feedback link from the receiver.

III.
A FRAMEWORK FOR SYSTEMATICALLY ANALYZING ADAPTIVE DISTRIBUTED BEAMFORMING SCHEMES

In this section, we introduce a framework for analyzing a general class of adaptive distributed beamforming schemes that can be reformulated as random search algorithms. Random search algorithms are well studied in the literature [6], [7], [8] as methods to maximize an unknown function via random sampling. Once an adaptive distributed beamforming scheme can be successfully reformulated as a random search algorithm, a systematic study of the convergence of such an adaptive scheme is possible.

A. Reformulation of Adaptive Distributed Beamforming Schemes as Random Search Algorithms

Adaptive distributed beamforming algorithms introduced in Section II seek to maximize $\mathrm{Mag}(\cdot)$ given in (2) with the help of a low-rate feedback link. At each step of the adaptation, the signal magnitude at the receiver is a sample of the function $\mathrm{Mag}(\cdot)$. Thus, from the receiver point of view, the problem of distributed phase alignment can be considered under the setting of the following problem:

Problem 1: Given an unknown function $f : \Theta \to \mathbb{R}$, $\Theta \subseteq \mathbb{R}^n$, where only samples of $f(\boldsymbol{\theta})$ are available for arbitrary $\boldsymbol{\theta} \in \Theta$, find the global maxima of $f$.

¹ A good estimate of the received signal magnitude can be obtained directly when the noise is small, or by averaging over several time slots when the noise is not negligible.

It is important to note that Problem 1 is a global maximization problem in general if no special structure is assumed for the objective function $f$. To solve the maximization in Problem 1, one may be tempted to use gradient-based algorithms that are well-developed in the literature. Since it is possible for $f$ to possess local maxima, conventional gradient-ascent methods would fail in general.
Besides, acquiring the gradient of the function $f$ may be infeasible especially when the function itself is unknown. Hence, random search techniques [6], [7], [8] are more appropriate in this setting and can be described as follows:

A Random Search Algorithm:
• Step zero: Initialize the algorithm by choosing $\boldsymbol{\theta}[0] \in \Theta$.
• Step one: Generate a random perturbation $\boldsymbol{\delta}[t]$ from the sample space $(\mathbb{R}^n, \mathcal{B}, \mu_t)$, where $\mathcal{B}$ is the Borel σ-algebra on $\mathbb{R}^n$ and $\mu_t$ is a probability measure that could be time-varying.
• Step two: Update the search point by $\boldsymbol{\theta}[t] = D(\boldsymbol{\theta}[t-1], \boldsymbol{\delta}[t])$, where the map $D$ satisfies the condition $f(D(\boldsymbol{\theta}[t-1], \boldsymbol{\delta}[t])) \geq f(\boldsymbol{\theta}[t-1])$.

Clearly, for a random search algorithm, we require only function evaluations and control over the probability measure $\mu_t$, which is used to sample the function. Any adaptive distributed beamforming scheme can be reformulated as a random search algorithm if each distributed transmitter initializes its phase as in Step zero, generates a random perturbation of phase as in Step one, and updates its new phase by the map $D$ as in Step two. The low-rate feedback link is used to guarantee the condition $f(D(\boldsymbol{\theta}[t-1], \boldsymbol{\delta}[t])) \geq f(\boldsymbol{\theta}[t-1])$. Note that the unknown function $f$ can be any objective function that we find fit for the distributed transmitters to optimize. This suggests that our framework can be used to analyze a more general function optimization problem over distributed networks. Note further that the probability measure $\mu_t$ for the sampling can be time-varying in general. The time-varying nature of the probability measure can be thought of as an "adaptive stepsize" for distributed algorithms in the most general sense. In this sense, our framework can be used to analyze a large set of adaptive distributed algorithms.

B.
One-bit Adaptive Distributed Beamforming Scheme

To illustrate the advantage of our framework, we now analyze a one-bit adaptive distributed beamforming scheme recently proposed in [1]. Specifically, we reformulate this scheme as a local random search algorithm, which allows for its systematic analysis. We begin by describing the one-bit adaptive distributed beamforming scheme as follows:

A One-bit Adaptive Distributed Beamforming Scheme [1]:
• Step zero: Referring to (2) and noting that the $i$-th transmitter controls its beamforming phase $\psi_i[t]$, the algorithm is initialized by setting $\psi_i[0] = 0$, and hence $\theta_i[0] = \phi_i$ for transmitter $i$.
• Step one: In this step, a random perturbation $\delta_i[t]$ is generated at each distributed transmitter such that $\{\delta_i[t]\}_{i=1}^{n_s}$ are i.i.d. uniform random variables in $[-\delta_0, \delta_0]$ across time and transmitters, where $\delta_0$ is a constant parameter. The random perturbation is added to the total phase of each transmitter. The distributed transmitters then use the perturbed total phases as their new total phases to transmit the training symbol.
• Step two: After receiving the training symbols, the receiver measures the received signal magnitude and compares it with the signal magnitude received in the previous time slot. If the newly received signal magnitude is larger, the receiver feeds back a "keep" beacon to the transmitters. Otherwise, a "discard" beacon is sent to the transmitters. Note that the beacon is a broadcast from the receiver to all transmitters.

Clearly, this feedback scheme only requires one bit of feedback information per time step. When a "keep" is received at the transmitters, each transmitter selects and keeps its newly updated total phase. Otherwise, the old phase is selected and the new phase discarded.
This selection process is determined by whether the random perturbation increases or decreases the array gain for the adaptive distributed beamforming scheme. Specifically, the evolution of $\boldsymbol{\theta}[t]$ is given by

$$\boldsymbol{\theta}[t] = \begin{cases} \boldsymbol{\theta}[t-1] + \boldsymbol{\delta}[t], & \text{if } \boldsymbol{\delta}[t] \in \mathcal{K} \\ \boldsymbol{\theta}[t-1], & \text{if } \boldsymbol{\delta}[t] \notin \mathcal{K} \end{cases} \tag{3}$$

where $\boldsymbol{\theta}[t] = [\theta_1[t], \cdots, \theta_{n_s}[t]]^T$, $\boldsymbol{\delta}[t] = [\delta_1[t], \cdots, \delta_{n_s}[t]]^T$, and $\mathcal{K} = \{\boldsymbol{\delta}[t] \mid \mathrm{Mag}(\boldsymbol{\theta}[t-1] + \boldsymbol{\delta}[t]) > \mathrm{Mag}(\boldsymbol{\theta}[t-1])\}$.

Matching the steps of the above one-bit adaptive scheme and those of a random search algorithm introduced in Section III-A, it is clear that the one-bit adaptive distributed beamforming algorithm can be regarded as a special case of the random search algorithm by setting

$$\begin{align}
f &= \mathrm{Mag}(\cdot) \tag{4} \\
n &= n_s \tag{5} \\
\Theta &= [0, 2\pi]^{n_s} \tag{6} \\
\mu_t &= \mu \tag{7} \\
D(\boldsymbol{\theta}[t-1], \boldsymbol{\delta}[t]) &= \boldsymbol{\theta}[t-1] + \mathbb{1}_{\{\boldsymbol{\delta}[t] \in \mathcal{K}\}}\, \boldsymbol{\delta}[t] \tag{8}
\end{align}$$

where $\mathbb{1}_{\{\cdot\}}$ is the indicator function and $\mu$ is uniform on $[-\delta_0, \delta_0]^{n_s}$, which is an $n_s$-dimensional hypercube. Note that (8) is the same as the evolution described by (3). Since the probability measure $\mu$ is non-zero only within a hypercube, with sides of length $2\delta_0$ and centered around $\boldsymbol{\theta}[t-1]$, the one-bit adaptive distributed beamforming scheme can be reformulated as a local random search algorithm.

We emphasize again that we can use this framework to study more general adaptive distributed beamforming schemes. For example, the probability measure for sampling may be time-varying and with a support that spans the entire space $\Theta$. We can also study adaptive distributed beamforming schemes with more than one bit of feedback information.

It is also interesting to note the connection between this local random search algorithm and simulated annealing [9]. Simulated annealing is a generic probabilistic algorithm that approximates the global optimal solution of a given function in a large search space.
The algorithm uses a parameter $T$ called the temperature to control the acceptance probability, i.e., the probability that the current state of the algorithm transitions to a new state. If we let $T \to 0$ and assume that the current state is only allowed to move to neighboring states, the simulated annealing procedure reduces to a local random search algorithm.

A local random search algorithm, however, does not necessarily converge in general. For example, if the unknown function possesses local maxima (that are not global maxima), the sequence $\{\boldsymbol{\theta}[t]\}_{t=0}^{\infty}$ is likely to be trapped in a local maximum if the local perturbation $\delta_0$ is not large enough. Thus, a necessary condition for the convergence of local random search algorithms for arbitrary $\delta_0$ is that $\mathrm{Mag}(\cdot)$ has no local maximum point that is not also a global maximum. With these in mind, two questions arise naturally: a) Does the reformulated local random search algorithm even converge? b) If it does, is there a fundamental reason behind the convergence? In the following section, we investigate properties of the function $\mathrm{Mag}(\cdot)$ towards the goal of addressing these questions.

IV. CONVERGENCE OF THE DISTRIBUTED BEAMFORMING SCHEME

A. Properties of Received Signal Magnitude Function

The properties of the received signal magnitude function $\mathrm{Mag}(\cdot)$ do not depend on the time evolution of its arguments. We hence ignore the time dependence of $\boldsymbol{\theta}[t]$ in this section. The following proposition states the first property of $\mathrm{Mag}(\cdot)$.

Proposition 1: For the received signal magnitude function $\mathrm{Mag}(\cdot)$ defined in (2), all local maxima are global maxima.

Proof: To facilitate analysis, we introduce a change of variables

$$\mathbf{x}_i := \begin{bmatrix} x_i^R \\ x_i^I \end{bmatrix} = \begin{bmatrix} \cos\theta_i \\ \sin\theta_i \end{bmatrix}$$

Eqn. (2) can be rewritten as

$$\mathrm{Mag}(\mathbf{x}_1, \cdots, \mathbf{x}_{n_s}) = \sqrt{P} \left\| \sum_{i=1}^{n_s} a_i \mathbf{x}_i \right\|$$

where $\|\mathbf{x}_i\|^2 = 1$ for all $i = 1, \cdots, n_s$.
The maximization of $\mathrm{Mag}(\cdot)$ can be rewritten as

$$\max_{\|\mathbf{x}_i\|^2 = 1,\ i = 1, \cdots, n_s} \left\| \sum_{i=1}^{n_s} a_i \mathbf{x}_i \right\|^2 \tag{9}$$

In the following, we will show that all local maxima of this objective function correspond to complete phase alignment for all transmitters. That is, all local maximum points are global maximum points.

By relaxing the equality constraints to inequality constraints, the optimization problem in (9) is equivalent to

$$\max_{\|\mathbf{x}_i\|^2 \leq 1,\ i = 1, \cdots, n_s} \left\| \sum_{i=1}^{n_s} a_i \mathbf{x}_i \right\|^2 \tag{10}$$

This equivalence can be seen as follows: if $\mathbf{x}^*$ is a local maximum with an inactive constraint $\|\mathbf{x}_k^*\|^2 < 1$, by fixing all other variables $\{\mathbf{x}_j^*\}_{j \neq k}$, we obtain

$$\left\| \sum_{i=1}^{n_s} a_i \mathbf{x}_i^* \right\|^2 = \|a_k \mathbf{x}_k^* + \mathbf{c}\|^2 = (a_k x_k^{*R} + c^R)^2 + (a_k x_k^{*I} + c^I)^2$$

where $\mathbf{c} = [c^R\ c^I]^T$ is a constant vector depending on $\{\mathbf{x}_j^*\}_{j \neq k}$. Obviously, the above function can be improved by appropriately perturbing $\|\mathbf{x}_k^*\|$ according to the signs of $c^R$ and $c^I$. This contradicts the fact that $\mathbf{x}^*$ is a maximum. Thus, all constraints are active if $\mathbf{x}^*$ is a maximum point. This shows that the optimization problems (9) and (10) are equivalent.

Focusing on the optimization problem with relaxed constraints, the Lagrangian of (10) reads

$$L(\mathbf{x}, \boldsymbol{\lambda}) = -\|\mathbf{w}\|^2 + \sum_{i=1}^{n_s} \lambda_i \left( \|\mathbf{x}_i\|^2 - 1 \right)$$

where $\mathbf{x} = [\mathbf{x}_1^T, \cdots, \mathbf{x}_{n_s}^T]^T$, $\boldsymbol{\lambda} = [\lambda_1, \cdots, \lambda_{n_s}]^T$, $\lambda_i \geq 0$ for all $i = 1, \cdots, n_s$, and $\mathbf{w} = \sum_{i=1}^{n_s} a_i \mathbf{x}_i$. By the Lagrange Multiplier Theorem, all local maxima satisfy

$$\begin{align}
\nabla_{\mathbf{x}_i} L(\mathbf{x}, \boldsymbol{\lambda}) = -2 a_i \mathbf{w}^T + 2 \lambda_i \mathbf{x}_i^T &= \mathbf{0}^T \tag{11} \\
\sum_{i=1}^{n_s} \lambda_i \left( \|\mathbf{x}_i\|^2 - 1 \right) &= 0 \tag{12} \\
\|\mathbf{x}_i\|^2 - 1 &\leq 0 \tag{13}
\end{align}$$

for all $i = 1, \cdots, n_s$. Let $\mathbf{x}^*$ be a local maximum and $\boldsymbol{\lambda}^*$ be the corresponding Lagrange multipliers. If $\lambda_i^* = 0$, Eqn. (11) implies that $\mathbf{w} = \mathbf{0}$ since $a_i > 0$.²
In this case, $\mathrm{Mag}(\mathbf{x}^*) = 0$ and this contradicts the fact that $\mathbf{x}^*$ is a local maximum, since we can always improve $\mathrm{Mag}(\cdot)$ by letting $\mathbf{x}_i^* = [\xi\ 0]^T$, $\xi \leq 1$, and $\mathbf{x}_j = \mathbf{0}$ for all $j \neq i$. This leads to $\lambda_i > 0$ for all $i$. We hence have

$$\begin{align}
\mathbf{x}_i^* &= \frac{a_i}{\lambda_i^*} \mathbf{w} \tag{14} \\
\lambda_i^* &= a_i \|\mathbf{w}\| \tag{15}
\end{align}$$

The optimal solutions described by (14) and (15), however, also satisfy

$$\mathrm{Mag}(\mathbf{x}^*) = \sqrt{P} \left\| \sum_{i=1}^{n_s} a_i \frac{\mathbf{w}}{\|\mathbf{w}\|} \right\| = \sqrt{P} \sum_{i=1}^{n_s} a_i$$

and hence are global maxima. This completes our proof.

² Note that the case where $a_i = 0$ is not interesting since we can always reduce the dimension of the problem by ignoring $\mathbf{x}_i$.

Proposition 1 implies that the local random search algorithm cannot be trapped in a suboptimal local maximum since all local maxima are global maxima. Furthermore, it also suggests that the necessary condition for the convergence of random search algorithms is satisfied. While it is intuitively clear that the local random search algorithm should converge according to Proposition 1, it is to be noted that the condition is only necessary and may not be sufficient. We will provide a rigorous proof of the convergence of the local random search algorithm later. Now, we explore an additional property of $\mathrm{Mag}(\cdot)$ that explains the efficiency of the algorithm.

Another interesting property of $\mathrm{Mag}(\cdot)$ is that it is invariant under a common phase shift to all transmitters. That is,

$$\mathrm{Mag}(\boldsymbol{\theta} + \theta_c \mathbf{e}) = \left| \sqrt{P} \sum_{i=1}^{n_s} a_i e^{j(\theta_i + \theta_c)} \right| = \left| \sqrt{P}\, e^{j\theta_c} \sum_{i=1}^{n_s} a_i e^{j\theta_i} \right| = \mathrm{Mag}(\boldsymbol{\theta})$$

where $\mathbf{e}$ is an $n_s \times 1$ vector with all elements equal to one, and $\theta_c$ is a common phase shift that can depend on $\{\theta_i\}_{i=1}^{n_s}$.
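The shift invariance stated above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper name `mag`, the number of transmitters, the range of the channel magnitudes, and the value of the common shift are all illustrative assumptions, and $P = 1$ is used by default.

```python
import cmath
import math
import random

def mag(theta, a, power=1.0):
    """Received signal magnitude of Eq. (2): |sqrt(P) * sum_i a_i * exp(j*theta_i)|."""
    total = sum(ai * cmath.exp(1j * ti) for ai, ti in zip(a, theta))
    return math.sqrt(power) * abs(total)

rng = random.Random(0)                            # fixed seed, for repeatability only
n_s = 8                                           # assumed number of transmitters
a = [rng.uniform(0.1, 2.0) for _ in range(n_s)]   # assumed channel magnitudes a_i
theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_s)]
theta_c = 1.234                                   # arbitrary common phase shift

original = mag(theta, a)
shifted = mag([ti + theta_c for ti in theta], a)  # Mag(theta + theta_c * e)
aligned = mag([theta_c] * n_s, a)                 # all total phases equal

print(abs(shifted - original) < 1e-9)             # invariance under a common shift
print(abs(aligned - sum(a)) < 1e-9)               # aligned phases give sqrt(P) * sum_i a_i
```

The second check illustrates that any fully aligned phase vector attains $\sqrt{P}\sum_i a_i$, regardless of the common value of the phases.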
One possible choice for the common phase shift is to let $\theta_c(\theta_1, \cdots, \theta_{n_s})$ be such that the imaginary part within the modulus function is canceled, i.e.,

$$\mathrm{Mag}(\boldsymbol{\theta}) = \mathrm{Mag}(\boldsymbol{\theta} + \theta_c(\theta_1, \cdots, \theta_{n_s})\, \mathbf{e}) = \left| \sqrt{P} \sum_{i=1}^{n_s} a_i \cos\left(\theta_i + \theta_c(\theta_1, \cdots, \theta_{n_s})\right) \right| = \left| \sqrt{P} \sum_{i=1}^{n_s} a_i \cos\theta_i' \right| = \mathrm{Mag}(\boldsymbol{\theta}')$$

where $\boldsymbol{\theta}' = [\theta_1', \cdots, \theta_{n_s}']^T$. Note that in the shifted $\boldsymbol{\theta}'$ domain, the global maxima occur only when $\theta_i' = 0$ or $2k\pi$ for all $i$, where $k$ is any integer.

The shift-invariant property results in multiple global maxima for the function $\mathrm{Mag}(\cdot)$. In fact, all global maxima form a one-dimensional "ridge" since if $\boldsymbol{\theta}^*$ is a global maximum, $\bar{\boldsymbol{\theta}}$ with $\bar{\theta}_i = \theta_i^* + \theta_c$ is also a global maximum. This property leads to the rapid convergence of the local random search algorithm since converging to any of these global maximum points is adequate.

We conclude this section by summarizing these two important properties of $\mathrm{Mag}(\cdot)$ as follows:
1) all local maxima are global maxima, and
2) a common shift to its arguments does not change its value.

B. Proof of Convergence

Intuitively, Property 1 guarantees the convergence of any local random search algorithm. To make this precise, we introduce an $\epsilon$-convergence region

$$R_\epsilon = \{\boldsymbol{\theta} \in \Theta : \mathrm{Mag}(\boldsymbol{\theta}) > \mathrm{Mag}(\boldsymbol{\theta}^*) - \epsilon\} \tag{16}$$

where $\boldsymbol{\theta}^*$ is the optimal total phase and satisfies $\mathrm{Mag}(\boldsymbol{\theta}^*) = \sqrt{P} \sum_{i=1}^{n_s} a_i$. We define the convergence of a random search algorithm in probability as follows:

Definition 1: A sequence $\{\boldsymbol{\theta}[t]\}_{t=0}^{\infty}$ generated by a random search algorithm is said to be convergent in probability if, given $\epsilon > 0$,

$$\lim_{t \to \infty} \Pr[\boldsymbol{\theta}[t] \in R_\epsilon] = 1$$

In other words, $\mathrm{Mag}(\boldsymbol{\theta}[t])$ converges to $\mathrm{Mag}(\boldsymbol{\theta}^*)$ in probability.
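To make the region in (16) and Definition 1 concrete, the sketch below tests membership in $R_\epsilon$ and estimates $\Pr[\boldsymbol{\theta} \in R_\epsilon]$ as a fraction over a set of phase vectors. It is an illustration under stated assumptions: the function names `in_eps_region` and `empirical_probability`, the channel magnitudes, and the value of $\epsilon$ are hypothetical choices, and $P = 1$ by default.

```python
import cmath
import math

def mag(theta, a, power=1.0):
    """Received signal magnitude of Eq. (2)."""
    total = sum(ai * cmath.exp(1j * ti) for ai, ti in zip(a, theta))
    return math.sqrt(power) * abs(total)

def in_eps_region(theta, a, eps, power=1.0):
    """True if theta is in R_eps of Eq. (16), using Mag(theta*) = sqrt(P) * sum_i a_i."""
    return mag(theta, a, power) > math.sqrt(power) * sum(a) - eps

def empirical_probability(thetas, a, eps):
    """Fraction of the given phase vectors that lie in R_eps."""
    return sum(in_eps_region(th, a, eps) for th in thetas) / len(thetas)

a = [0.5, 1.0, 1.5, 2.0]   # assumed channel magnitudes
eps = 0.1                  # assumed tolerance

# Every fully aligned phase vector (a point on the "ridge") lies in R_eps.
ridge_hits = [in_eps_region([c] * len(a), a, eps) for c in (0.0, 1.0, 2.5, 4.0, 6.0)]

# Phases spread evenly around the circle are far from alignment.
spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
spread_hit = in_eps_region(spread, a, eps)

half = empirical_probability([[0.0] * len(a), spread], a, eps)
```

In a simulation, `empirical_probability` would be applied to the phase vectors $\boldsymbol{\theta}[t]$ collected at a fixed time $t$ over many independent runs; Definition 1 requires this fraction to approach one as $t$ grows.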
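For the proof of convergence, we further derive a proposition stating that for any $\boldsymbol{\theta}$ outside of $R_\epsilon$, there is a non-zero probability of improving $\mathrm{Mag}(\cdot)$ by applying a local perturbation to $\boldsymbol{\theta}$.

Proposition 2: For any given $\boldsymbol{\theta} \in \Theta \setminus R_\epsilon$ and $\delta_0 > 0$, there correspond $\gamma > 0$ and $0 < \eta \leq 1$ such that

$$\Pr[\mathrm{Mag}(\boldsymbol{\theta} + \boldsymbol{\delta}) - \mathrm{Mag}(\boldsymbol{\theta}) \geq \gamma] \geq \eta$$

where $\boldsymbol{\delta}$ is a random vector with i.i.d. elements uniformly distributed over $[-\delta_0, \delta_0]$.

Proof: From Proposition 1, all local maxima are global maxima for the function $\mathrm{Mag}(\cdot)$. This implies that for all $\boldsymbol{\theta} \notin R_\epsilon$ and all $\delta_0 > 0$, there exists a point $\boldsymbol{\theta}_u \in S_{\boldsymbol{\theta}}$ and a constant $\gamma(\boldsymbol{\theta}) > 0$ such that

$$\mathrm{Mag}(\boldsymbol{\theta}_u) - \mathrm{Mag}(\boldsymbol{\theta}) \geq 2\gamma(\boldsymbol{\theta}) \tag{17}$$

where the set $S_{\boldsymbol{\theta}}$ is a hypercube with sides of length $2\delta_0$ centered around $\boldsymbol{\theta}$ given by

$$S_{\boldsymbol{\theta}} = \{\boldsymbol{\omega} \in \Theta : \boldsymbol{\omega} = \boldsymbol{\theta} + \boldsymbol{\delta},\ \boldsymbol{\delta} \in [-\delta_0, \delta_0]^{n_s}\}$$

The continuity of $\mathrm{Mag}(\cdot)$ implies that there exists $\sigma(\boldsymbol{\theta}_u) > 0$ such that for all $\boldsymbol{\xi} \in T := \{\boldsymbol{\omega} \in \Theta : \|\boldsymbol{\omega}\| \leq \sigma(\boldsymbol{\theta}_u)\}$, we have

$$|\mathrm{Mag}(\boldsymbol{\theta}_u + \boldsymbol{\xi}) - \mathrm{Mag}(\boldsymbol{\theta}_u)| \leq \gamma(\boldsymbol{\theta}) \tag{18}$$

Combining (17) and (18), we arrive at a lower bound

$$\mathrm{Mag}(\boldsymbol{\theta}_u + \boldsymbol{\xi}) - \mathrm{Mag}(\boldsymbol{\theta}) = \mathrm{Mag}(\boldsymbol{\theta}_u + \boldsymbol{\xi}) - \mathrm{Mag}(\boldsymbol{\theta}_u) + \mathrm{Mag}(\boldsymbol{\theta}_u) - \mathrm{Mag}(\boldsymbol{\theta}) \geq -\gamma(\boldsymbol{\theta}) + 2\gamma(\boldsymbol{\theta}) = \gamma(\boldsymbol{\theta})$$

Referring to (7) for the definition of $\mu$, the above lower bound leads to

$$\Pr[\mathrm{Mag}(\boldsymbol{\theta} + \boldsymbol{\delta}) - \mathrm{Mag}(\boldsymbol{\theta}) \geq \gamma(\boldsymbol{\theta})] \geq \mu(T) =: \eta(\boldsymbol{\theta})$$

Note that $\mu(T)$ is a function of $\boldsymbol{\theta}$, since $\boldsymbol{\theta}_u$ is a function of $\boldsymbol{\theta}$. We complete the proof of the proposition by letting

$$\gamma = \inf_{\boldsymbol{\theta} \in \Theta \setminus R_\epsilon} \gamma(\boldsymbol{\theta}), \qquad \eta = \inf_{\boldsymbol{\theta} \in \Theta \setminus R_\epsilon} \eta(\boldsymbol{\theta})$$

Note that the proof of this proposition can easily be generalized for any local random perturbation $\boldsymbol{\delta}$. Since before the sequence reaches the $\epsilon$-convergence region, there is always a non-zero probability of improving $\mathrm{Mag}(\cdot)$ for each time step, the convergence of the sequence is to be expected. A simple deterministic analogue is the convergence of a monotonically non-decreasing function.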
The probabilistic nature of the algorithm complicates the proof. This will become clear in the proof of our next theorem.

Theorem 1: For the function $\mathrm{Mag}(\cdot)$ defined in (2), let $\{\boldsymbol{\theta}[t]\}_{t=1}^{\infty}$ be a sequence generated by the local random search algorithm described in Eqn. (4)-(8). Then the resulting sequence converges in probability, i.e., given $\epsilon > 0$,

$$\lim_{t \to \infty} \Pr[\boldsymbol{\theta}[t] \in R_\epsilon] = 1$$

Proof: By Proposition 2, we know that given any time $t$

$$\Pr\left[\{\mathrm{Mag}(\boldsymbol{\theta}[t-1] + \boldsymbol{\delta}[t]) - \mathrm{Mag}(\boldsymbol{\theta}[t-1]) \geq \gamma\} \text{ or } \{\boldsymbol{\theta} \in R_\epsilon\}\right] \geq \bar{\eta}$$

where $\bar{\eta} = \min\{\Pr[\boldsymbol{\theta} \in R_\epsilon], \eta\}$. Since $\Theta$ is compact and $\mathrm{Mag}(\cdot)$ is continuous, there always exists a positive integer $p$ such that

$$p\gamma > \mathrm{Mag}(\boldsymbol{\theta}_1) - \mathrm{Mag}(\boldsymbol{\theta}_2), \quad \forall\, \boldsymbol{\theta}_1, \boldsymbol{\theta}_2 \in \Theta$$

The probability that the sequence lies in $R_\epsilon$ after $p$ time steps is hence lower bounded by

$$\Pr[\boldsymbol{\theta}[p] \in R_\epsilon] \geq \bar{\eta}^p$$

since $\{\boldsymbol{\delta}[t]\}_{t=0}^{\infty}$ are independent across time. This leads to

$$\Pr[\boldsymbol{\theta}[p] \notin R_\epsilon] \leq 1 - \bar{\eta}^p$$

and

$$\Pr[\boldsymbol{\theta}[pm] \in R_\epsilon] = 1 - \Pr[\boldsymbol{\theta}[pm] \notin R_\epsilon] \geq 1 - (1 - \bar{\eta}^p)^m$$

for all $m = 1, 2, \cdots$. The lower bound is still valid if we let the sequence progress $\ell$ time steps further, i.e.,

$$\Pr[\boldsymbol{\theta}[pm + \ell] \in R_\epsilon] \geq 1 - (1 - \bar{\eta}^p)^m$$

for all $m = 1, 2, \cdots$, $\ell = 0, \cdots, p - 1$. We complete the proof by letting $m \to \infty$.

Theorem 1 states that the local random search algorithm in (4)-(8) converges in probability, and hence also provides a proof of convergence for the one-bit adaptive distributed beamforming scheme in (3). In particular, Theorem 1 implies the convergence of the sequence $\{\mathrm{Mag}(\boldsymbol{\theta}[t])\}_{t=0}^{\infty}$ in probability. Since the sequence is non-negative and monotonically non-decreasing, we can conclude that $\{\mathrm{Mag}(\boldsymbol{\theta}[t])\}_{t=0}^{\infty}$ also converges in mean by the Monotone Convergence Theorem [10].
Further, by properly generalizing Proposition 2, it is straightforward to show that any adaptive distributed beamforming scheme that can be reformulated as a local random search algorithm and seeks to maximize any objective function that satisfies Property 1 converges in probability.

[Fig. 1. Evolutions of sequences generated by the adaptive distributed beamforming scheme. Horizontal axis: time; vertical axis: received signal magnitude.]

In Fig. 1, we illustrate the evolution of the sequences generated by the local random search algorithm from different initial points. The initial points are generated randomly from a uniform distribution over $\Theta$. Only three sample paths of the sequence are included in the figure since similar behaviors can be observed for other sample paths. For each iteration, the random perturbation $\delta_i$ for the $i$th transmitter is a uniform random variable over $[-\delta_0, \delta_0]$, where $\delta_0 = \pi/30$. Note that we use the same channel coefficients to generate these sequences since the focus here is on the effect of different initial points. In particular, the channel coefficients are randomly generated from i.i.d. $\mathcal{CN}(0, 1)$ in the beginning of the simulation, and remain fixed afterwards.

From the figure, we observe the rapid convergence of the local random search algorithm, irrespective of where it is initialized. We emphasize again that the fast convergence results follow from the two important properties for the function $\mathrm{Mag}(\cdot)$ as discussed in Section IV-A. Property 1 guarantees the convergence of the local search algorithm; Property 2 results in multiple global maxima for the function $\mathrm{Mag}(\cdot)$ and hence the fast convergence of the algorithm. The simulations provide a partial validation of our proof since we would expect the convergence to fail from some initial points if there were non-optimal local maxima for $\mathrm{Mag}(\cdot)$.
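The simulation setup described above can be sketched in a few lines. This is an illustrative reconstruction under assumptions, not the authors' simulation code: the function name `one_bit_path`, the number of transmitters `n_s = 100`, the horizon of 400 time slots, the seed, and $P = 1$ are choices made here, while $\delta_0 = \pi/30$, the $\mathcal{CN}(0,1)$ channel coefficients, the uniform initial points, and the keep/discard rule of (3) follow the text.

```python
import cmath
import math
import random

def mag(theta, a, power=1.0):
    """Received signal magnitude of Eq. (2)."""
    total = sum(ai * cmath.exp(1j * ti) for ai, ti in zip(a, theta))
    return math.sqrt(power) * abs(total)

def one_bit_path(a, theta0, delta0, steps, rng, power=1.0):
    """Run the update of Eq. (3) and return Mag(theta[t]) for t = 0..steps."""
    theta = list(theta0)
    best = mag(theta, a, power)
    history = [best]
    for _ in range(steps):
        # Step one: every transmitter perturbs its total phase independently.
        trial = [ti + rng.uniform(-delta0, delta0) for ti in theta]
        trial_mag = mag(trial, a, power)
        # Step two: one feedback bit; "keep" only if the magnitude increased.
        if trial_mag > best:
            theta, best = trial, trial_mag
        history.append(best)
    return history

rng = random.Random(1)   # fixed seed, for repeatability only
n_s = 100                # assumed number of transmitters
# Magnitudes of i.i.d. CN(0, 1) coefficients (variance 1/2 per real dimension),
# drawn once and reused for every sample path.
a = [abs(complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5))))
     for _ in range(n_s)]

paths = []
for _ in range(3):       # three sample paths from different initial points
    theta0 = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_s)]
    paths.append(one_bit_path(a, theta0, math.pi / 30, 400, rng))
```

Each entry of `paths` is a non-decreasing sequence bounded above by $\sum_i a_i$, which is the quantity one would plot against time to obtain curves of the kind shown in Fig. 1.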
Note that the convergence of the local random search algorithm does not guarantee that it is the most efficient scheme in terms of the number of function evaluations, and hence in terms of energy. However, the algorithm does have a desirable scaling property: the time required for the algorithm to converge in mean scales linearly with the number of transmitters. This is the topic of the following section.

V. SCALING LAW

Due to the probabilistic nature of the local random search algorithm, we defined convergence in probability in Section IV-B and showed that the local random search algorithm converges. For the analysis of the scaling law, however, we can only show convergence in mean, which is defined as follows:

Definition 2: A sequence $\{\theta[t]\}$ generated by a random search algorithm is said to converge in mean if there exists $t_N \ge 0$ such that

$$\mathbb{E}_{\{\delta[\tau]\}_{\tau=0}^{t} \mid \mathbf{a}, \theta[0]}\big[\mathrm{Mag}(\theta[t])\big] > \mathrm{Mag}(\theta^*) - \epsilon = \sqrt{P}\sum_{i=1}^{n_s} a_i - \epsilon$$

for all $t \ge t_N$, where $\mathbf{a} = [a_1, \cdots, a_{n_s}]^T$. That is, $\mathrm{Mag}(\theta[t])$ converges to $\mathrm{Mag}(\theta^*)$ in mean.

In this section, our goal is to find the time required for the local random search algorithm to converge in mean, starting from any initial point. In other words, we are interested in finding the hitting time^3 of the random search algorithm, and determining its behavior as a function of the number of transmitters. Specifically, we derive an upper bound on the hitting time of the local random search algorithm as a function of $n_s$. Note that the study of the hitting time makes sense only if the sequence indeed converges in mean, which we established in Section IV-B.

^3 The hitting time in this work is defined as the time required for the algorithm to converge in mean.
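The hitting time defined above can be estimated numerically by replacing the expectation over the perturbations with an average over independent runs that share the same channel and the same initial point. The sketch below is a hedged illustration rather than the authors' procedure. It assumes the signal model $\mathrm{Mag}(\theta) = \sqrt{P}\,|\sum_i h_i e^{j\theta_i}|$ and measures convergence against a relative target `alpha * Mag(theta*)`. The function name `estimate_hitting_time`, the number of runs, the iteration cap, and the values of `n_s` are all hypothetical choices for this example.

```python
import numpy as np

def mag(theta, h, power=1.0):
    """Assumed signal model, evaluated along the last axis of theta."""
    return np.sqrt(power) * np.abs(np.sum(h * np.exp(1j * theta), axis=-1))

def estimate_hitting_time(h, delta0, alpha, n_runs, max_iter, rng, power=1.0):
    """First t at which the run-averaged Mag(theta[t]) reaches
    alpha * Mag(theta*); returns None if this never happens by max_iter."""
    target = alpha * np.sqrt(power) * np.sum(np.abs(h))
    theta = np.zeros((n_runs, h.size))  # every run starts at the origin
    current = mag(theta, h, power)      # shape (n_runs,)
    for t in range(max_iter + 1):
        if current.mean() >= target:    # empirical stand-in for E[Mag(theta[t])]
            return t
        delta = rng.uniform(-delta0, delta0, size=theta.shape)
        candidate = mag(theta + delta, h, power)
        improved = candidate > current  # one feedback bit per run
        theta[improved] += delta[improved]
        current = np.where(improved, candidate, current)
    return None

rng = np.random.default_rng(1)          # illustrative seed
for n_s in (10, 20, 40):                # illustrative network sizes
    h = (rng.standard_normal(n_s) + 1j * rng.standard_normal(n_s)) / np.sqrt(2)
    print(n_s, estimate_hitting_time(h, np.pi / 90, 0.5, 50, 5000, rng))
```

Because each value of `n_s` here uses a single channel realization and a finite number of runs, the printed values are noisy estimates; checking a scaling trend would require larger networks and averaging over more runs.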
To facilitate analysis, we define the increment function of $\mathrm{Mag}(\cdot)$ at time $\tau$ as

$$I[\tau] = \big[\mathrm{Mag}(\theta[\tau]) - \mathrm{Mag}(\theta[\tau-1])\big]^+ = \big[\mathrm{Mag}(\theta[\tau-1] + \delta[\tau]) - \mathrm{Mag}(\theta[\tau-1])\big]^+ \qquad (19)$$

where $[x]^+ = \max(x, 0)$. We then rewrite the received signal magnitude function at any given time $k_0 n_s$ as

$$\mathrm{Mag}(\theta[k_0 n_s]) = \sum_{\tau=1}^{k_0 n_s} I[\tau] + \mathrm{Mag}(\theta[0]) =: \sum_{\tau=1}^{k_0 n_s} I[\tau] + c_0 \qquad (20)$$

where $k_0$ is a positive integer and $c_0 \ge 0$. From Proposition 2 we have that for any given $\tau$ such that $\theta[\tau-1] \notin R_\epsilon$ and any local random perturbation $\delta[\tau]$, there correspond $\gamma > 0$ and $0 < \eta \le 1$ such that

$$\Pr\big[\mathrm{Mag}(\theta[\tau-1] + \delta[\tau]) - \mathrm{Mag}(\theta[\tau-1]) \ge \gamma\big] \ge \eta.$$

Thus, we have

$$\mathbb{E}_{\delta[\tau] \mid \mathbf{a}, \theta[\tau-1]}\big[I[\tau]\big] \ge \gamma \Pr\big[\mathrm{Mag}(\theta[\tau-1] + \delta[\tau]) - \mathrm{Mag}(\theta[\tau-1]) \ge \gamma\big] \ge \gamma\eta > 0$$

for any $\tau$ such that $\theta[\tau-1] \notin R_\epsilon$. Referring to (19)-(20), we obtain

$$\mathbb{E}_{\{\delta[\tau]\}_{\tau=0}^{k_0 n_s} \mid \mathbf{a}, \theta[0]}\big[\mathrm{Mag}(\theta[k_0 n_s])\big] = \sum_{\tau=1}^{k_0 n_s} \mathbb{E}_{\delta[\tau] \mid \mathbf{a}, \theta[\tau-1]}\big[I[\tau]\big] + c_0 \ge k_0 n_s \gamma\eta + c_0 \ge \sqrt{P}\sum_{i=1}^{n_s} a_i$$

where the last inequality follows by choosing $k_0 = \left\lceil \frac{\sqrt{P}\max_i\{a_i\}}{\gamma\eta} \right\rceil$: with this choice, $k_0 n_s \gamma\eta \ge \sqrt{P}\, n_s \max_i\{a_i\} \ge \sqrt{P}\sum_{i=1}^{n_s} a_i$, and $c_0 \ge 0$. This implies that the hitting time for the local random search algorithm is at most $k_0 n_s$, from any initial point. Hence, the hitting time for the algorithm scales linearly with the number of transmitters.

[Fig. 2. Hitting time for the adaptive distributed beamforming scheme with different values of $\alpha$. Horizontal axis: $n_s$; vertical axis: hitting time; curves for $\alpha = 0.5$, $0.7$, and $0.9$.]

[Fig. 3. Average convergence time for the adaptive distributed beamforming scheme with different values of $\alpha$. Horizontal axis: $n_s$; vertical axis: average convergence time; curves for $\alpha = 0.5$, $0.7$, and $0.9$.]

In our simulations, we say that the sequence converges to the $\alpha$ fraction of the global maxima if $\mathrm{Mag}(\theta[t]) \ge \alpha\,\mathrm{Mag}(\theta^*)$. We assume that channel coefficients are i.i.d. complex Gaussian variables $\mathcal{CN}(0,1)$, and use the origin as our initial point. We set $\delta_0 = \pi/90$ for all our simulations. Fig. 2 demonstrates the hitting time required for the adaptive distributed beamforming scheme to converge in a relative sense when $\alpha = 0.5$, $0.7$, and $0.9$. It is clear that the hitting time increases as $\alpha$ increases. The scaling law for the hitting time with respect to $n_s$, however, is the same for all values of $\alpha$. Indeed, we observe linear scaling for all values of $\alpha$. This observation confirms our theoretical analysis.

Fig. 3 shows the average time required for the adaptive distributed beamforming scheme to converge to within a fraction of the global maximum value, $\alpha\,\mathrm{Mag}(\theta^*)$, for different values of $\alpha$. It is important to note the difference between the hitting time and the average convergence time. Since our algorithm is probabilistic in nature, the convergence time is essentially a random variable, and each run of the algorithm provides a sample of this random variable. Fixing the number of transmitters $n_s$, we obtain the average convergence time by averaging over a hundred samples of this random variable, while the hitting time is obtained by comparing $\mathbb{E}[\mathrm{Mag}(\theta[t])]$ with $\alpha\,\mathrm{Mag}(\theta^*)$. From Fig. 3, we observe the same linear scaling behavior for the average convergence time. We expect that this property of the average convergence time can be shown in a similar manner.

VI. CONCLUDING REMARKS AND FUTURE WORK

In this work, we have proposed a framework that allows for a systematic analysis of adaptive distributed beamforming schemes in sensor/relay networks. We used this framework to study the convergence and scaling law of a recently proposed one-bit adaptive distributed beamforming scheme [1]. We first reformulated the one-bit adaptive scheme as a local random search algorithm.
This reformulation provided insights into the convergence of the one-bit adaptive scheme, and led us to investigate the fundamental properties of the received signal magnitude function $\mathrm{Mag}(\cdot)$. We identified two important properties of the function that contribute to the rapid convergence of the algorithm. First, all local maxima are global maxima. This prevents any local random search algorithm from being trapped in non-optimal local maximum points. Second, the $\mathrm{Mag}(\cdot)$ function is invariant under a common shift to its arguments. This property results in multiple global maximum points for $\mathrm{Mag}(\cdot)$ and hence the rapid convergence of the algorithm. Based on these properties, we have shown the convergence of the algorithm, both in probability and in mean. We further provided an upper bound on the hitting time of the algorithm, and demonstrated that the hitting time scales linearly with the number of sensor/relay nodes. This linear scaling is desirable, especially when the network is densely populated. We have also provided simulations that validate our analysis.

It is important to note that the effectiveness of the one-bit adaptive distributed beamforming scheme depends critically on the properties of the function $\mathrm{Mag}(\cdot)$. Maximizing $\mathrm{Mag}(\cdot)$ is equivalent to maximizing the received SNR if there is no error in obtaining the common message, which is true in the training stage since the common message is simply fixed and known to the receiver. On the other hand, if adaptation is being performed blindly (without training), it would be necessary to consider the possibility of errors in the common message. The corresponding objective function may then not possess the same desirable properties as $\mathrm{Mag}(\cdot)$; e.g., the objective function may possess local maxima that are not global maxima.
Much work needs to be done to understand how our results can be applied in this more complicated scenario. One thing that is clear, however, is that we will need to develop new algorithms that exploit the global structure of the new objective function, since local algorithms can be trapped in local maxima. Our general framework for studying adaptive beamforming algorithms is even more useful in this context, since it connects the problem to the well-studied field of global optimization algorithms.

REFERENCES

[1] R. Mudumbai, B. Wild, U. Madhow, and K. Ramchandran, "Distributed Beamforming using 1 Bit Feedback: from Concept to Realization," Allerton Conf. Commun. Cont. and Comp., 2006.
[2] R. Mudumbai, "Energy Efficient Wireless Communication using Distributed Beamforming," Ph.D. thesis, Santa Barbara, Dec. 2007.
[3] J. Bucklew and W. Sethares, "Convergence of a Class of Decentralized Beamforming Algorithms," IEEE Trans. on Sig. Processing, vol. 56, no. 6, pp. 2280–2288, June 2008.
[4] J. Thukral and H. Bolcskei, "Distributed Spatial Multiplexing with 1-bit Feedback," Allerton Conf. Commun. Cont. and Comp., 2007.
[5] M. Johnson, M. Mitzenmacher, and K. Ramchandran, "Distributed Beamforming with Binary Signaling," IEEE Intl. Symp. on Info. Theory, 2008.
[6] F. J. Solis and R. J.-B. Wets, "Minimization by Random Search Techniques," Journal of Mathematics of Operations Research, vol. 6, pp. 19–30, Feb. 1981.
[7] L. Shi and S. Olafsson, "Nested Partitions Method for Global Optimization," Journal of Operations Research, vol. 48, pp. 390–407, May 2000.
[8] Z. B. Tang, "Adaptive Partitioned Random Search to Global Optimization," IEEE Trans. Automatic Control, vol. 39, pp. 2235–2244, Nov. 1994.
[9] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by Simulated Annealing," Science, vol. 220, no. 4598, pp. 671–680, 1983.
[10] R. A. Durrett, Probability: Theory and Examples. Duxbury Press, 2nd ed., 1995.

Che Lin (S'02–M'08) received the B.S. degree in Electrical Engineering from National Taiwan University, Taipei, Taiwan, in 1999. He received the M.S. degree in Electrical and Computer Engineering in 2003, the M.S. degree in Math in 2008, and the Ph.D. degree in Electrical and Computer Engineering in 2008, all from the University of Illinois at Urbana-Champaign, IL. Since 2008, he has been at National Tsing Hua University, where he is currently an assistant professor. Dr. Lin received a two-year Vodafone graduate fellowship in 2006, the E. A. Reid fellowship award in 2008, and holds a U.S. patent, which has been included in the 3GPP LTE standard. His research interests include feedback systems, distributed algorithms in networks, practical MIMO code design in general networks, optimization theory, and information theory.

Venugopal V. Veeravalli (S'86–M'92–SM'98–F'06) received the Ph.D. degree in 1992 from the University of Illinois at Urbana-Champaign, the M.S. degree in 1987 from Carnegie-Mellon University, Pittsburgh, PA, and the B.Tech degree in 1985 from the Indian Institute of Technology, Bombay (Silver Medal Honors), all in Electrical Engineering. He joined the University of Illinois at Urbana-Champaign in 2000, where he is currently a Professor in the Department of Electrical and Computer Engineering, and a Research Professor in the Coordinated Science Laboratory. He served as a program director for communications research at the U.S. National Science Foundation in Arlington, VA from 2003 to 2005. He has previously held academic positions at Harvard University, Rice University, and Cornell University. His research interests include distributed sensor systems and networks, wireless communications, detection and estimation theory, and information theory.
He is a Fellow of the IEEE and was on the Board of Governors of the IEEE Information Theory Society from 2004 to 2007. He was an Associate Editor for Detection and Estimation for the IEEE Transactions on Information Theory from 2000 to 2003, and an Associate Editor for the IEEE Transactions on Wireless Communications from 1999 to 2000. Among the awards he has received for research and teaching are the IEEE Browder J. Thompson Best Paper Award, the National Science Foundation CAREER Award, and the Presidential Early Career Award for Scientists and Engineers (PECASE). He is a Distinguished Lecturer for the IEEE Signal Processing Society for 2010-2011.

Sean P. Meyn received the B.A. degree in Mathematics Summa Cum Laude from UCLA in 1982, and the PhD degree in Electrical Engineering from McGill University in 1987. After a two-year postdoctoral fellowship at the Australian National University in Canberra, Dr. Meyn and his family moved to the Midwest. He is now a Professor in the Department of Electrical and Computer Engineering, and a Research Professor in the Coordinated Science Laboratory at the University of Illinois, where he is director of the Decision and Control Lab. He is an IEEE Fellow. He is coauthor with Richard Tweedie of the monograph Markov Chains and Stochastic Stability, Springer-Verlag, London, 1993, and received jointly with Tweedie the 1994 ORSA/TIMS Best Publication in Applied Probability Award. The 2009 edition is published in the Cambridge Mathematical Library. His new book, Control Techniques for Complex Networks, is published by Cambridge University Press. He has held visiting positions at universities all over the world, including the Indian Institute of Science, Bangalore during 1997-1998, where he was a Fulbright Research Scholar.
During his latest sabbatical during the 2006-2007 academic year he was a visiting professor at MIT and United Technologies Research Center (UTRC). His research interests include stochastic processes, control and optimization, complex networks, and information theory. Current funding is provided by NSF, Dept. of Energy, AFOSR, and DARPA.