Majorization: Here, There and Everywhere

Statistical Science 2007, Vol. 22, No. 3, 407–413. DOI: 10.1214/0883423060000000097. © Institute of Mathematical Statistics, 2007

Barry C. Arnold

Barry C. Arnold is Professor, Department of Statistics, University of California, Riverside, California 92521, USA (e-mail: barry.arnold@ucr.edu). This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in Statistical Science, 2007, Vol. 22, No. 3, 407–413. This reprint differs from the original in pagination and typographic detail.

Dedication. This article is written for Ingram Olkin on the occasion of his 80th birthday. Ingram has provided inspiration for me over the last 40 years and continues to inspire. I am indebted to him for his encouragement and support throughout my career. I am contributing this humbly in the sure knowledge that he could have written it better than I.

Abstract. The appearance of Marshall and Olkin's 1979 book on inequalities with special emphasis on majorization generated a surge of interest in potential applications of majorization and Schur convexity in a broad spectrum of fields. After 25 years this continues to be the case. The present article presents a sampling of the diverse areas in which majorization has been found to be useful in the past 25 years.

Key words and phrases: Inequalities, Schur convex, covering, waiting time, paired comparisons, phase type, catchability, disease transmission, apportionment, statistical mechanics, random graph.

1. INTRODUCTION

Prior to the appearance of the celebrated volume Inequalities: Theory of Majorization and Its Applications (Marshall and Olkin, 1979), many researchers were unaware of the rich body of literature related to majorization that was scattered in journals in a wide variety of fields. Indeed, many majorization concepts had been reinvented and often rechristened in different research areas (e.g., as Lorenz or dominance ordering in economics), complicating the difficulties for the researcher when trying to relate current research to the extant corpus. Of course, the appearance of the Marshall and Olkin volume changed all that. They heroically had sifted the literature and endeavored to arrange ideas in order, often providing references to multiple proofs and multiple viewpoints on key results, with reference to a variety of applied fields. Many of the key ideas relating to majorization were already discussed in the (also justly celebrated) volume entitled Inequalities by Hardy, Littlewood and Pólya (1934). Indeed, this slim volume still merits occasional revisits since there remain in it many "seedlings for further research" (to borrow Kingman's apt descriptive phrase). Of course the Hardy, Littlewood and Pólya volume, though slim and printed on small pages, was all meat and no gravy: more like a series of insightful telegrams. Only a relatively small number of researchers were inspired by it to work on questions relating to majorization.

But things were different after 1979. Marshall and Olkin sold the product much more effectively. Whenever a situation was encountered in which a solution or an extreme case involved a discrete uniform distribution, the possibility of a majorization proof was now apparent if not to all, certainly to many, and certainly in many different areas of research. Moreover, if a uniform allocation or distribution was in a sense optimal, then the concept of majorization frequently could be used to order competing allocations or distributions.

Naturally, extensions of the majorization concept were possible and indeed many have been fruitfully introduced. The focus of the present article is, however, on classical majorization. The goal is to provide a hint (via selected examples from the post-1979 literature) of the vast array of settings in which majorization provides a useful and interpretable ordering. In no sense can such a survey be complete. I apologize, in advance, to researchers who, quite legitimately, can point to papers of their own which they feel would be even better illustrations of the theme: Majorization, here, there and everywhere. Nevertheless it is my hope that the examples selected will be found to be interesting, to be sufficiently diverse in order to illustrate the potential ubiquity of dispersion ordering (a.k.a. majorization) concepts and, perhaps, to inspire researchers to seek even more research niches in which majorization and Schur convexity will play a useful role.

2. SOME NEEDED DEFINITIONS

We will say that a vector $x \in \mathbb{R}^n$ majorizes another vector $y \in \mathbb{R}^n$ and write $x \succ y$ if for each $k = 1, 2, \ldots, n-1$ we have

$$\sum_{i=1}^{k} x_{i:n} \le \sum_{i=1}^{k} y_{i:n}$$

and

$$\sum_{i=1}^{n} x_{i:n} = \sum_{i=1}^{n} y_{i:n}.$$

In the above we denote the ordered coordinates of a vector $x \in \mathbb{R}^n$ by $x_{1:n} \le x_{2:n} \le \cdots \le x_{n:n}$.

A function $g : \mathbb{R}^n \to \mathbb{R}$ is said to be Schur convex if $x \succ y$ implies $g(x) \ge g(y)$. For additional details and alternative characterizations of majorization and Schur convexity, we naturally refer to Marshall and Olkin (1979).

In short, the vector $x$ majorizes $y$ if the coordinates of $x$ are more dispersed than are the coordinates of $y$, subject to the constraint that the sum of the coordinates of $x$ and of $y$ is the same.
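
The definition of majorization translates directly into a comparison of partial sums of the ordered coordinates. The following is a minimal sketch of that check; the function name `majorizes` and the tolerance parameter `tol` are illustrative choices, not notation from the literature.

```python
def majorizes(x, y, tol=1e-12):
    """Return True if x majorizes y (x > y in the majorization order).

    Uses the partial-sum definition on the increasingly ordered
    coordinates x_{1:n} <= ... <= x_{n:n}. The tolerance `tol` is an
    assumption added to absorb floating point rounding.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    xs, ys = sorted(x), sorted(y)
    # The coordinate totals must agree.
    if abs(sum(xs) - sum(ys)) > tol:
        return False
    # Each partial sum of the k smallest coordinates of x must not
    # exceed the corresponding partial sum for y, k = 1, ..., n - 1.
    partial_x = partial_y = 0.0
    for a, b in zip(xs[:-1], ys[:-1]):
        partial_x += a
        partial_y += b
        if partial_x > partial_y + tol:
            return False
    return True


# The concentrated vector majorizes the uniform one, not conversely.
print(majorizes([3, 0, 0], [1, 1, 1]))
print(majorizes([1, 1, 1], [3, 0, 0]))
```

Since majorization is only a partial order, both `majorizes(x, y)` and `majorizes(y, x)` can be false for the same pair of vectors.
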
A Schur convex function, then, is one that increases as dispersion increases (where the concept of dispersion used is specifically linked to the majorization order). The extremal case under the majorization order corresponds to the choice $x_i = (\sum_{j=1}^{n} x_j)/n$. In particular then, a Schur convex function will take on a larger value when there is some variability in $x$ than it does when there is no variability [i.e., when $x_i = \bar{x} = (\sum_{j=1}^{n} x_j)/n$, $i = 1, 2, \ldots, n$].

Many examples of Schur convex functions can of course be found in the literature. Perhaps the simplest example is what is called a separable convex function. It is of the form

$$g(x) = \sum_{i=1}^{n} h(x_i),$$

where $h$ is a convex function.

We now begin our tour of examples in the literature in which majorization makes cameo and/or starring appearances.

One can even consider a variation of the children's game "Where's Waldo?". In that game a very complicated picture is provided in which, hidden away, is a picture of the hero Waldo. He is always there, but he is often hard to find. Similarly we can view various areas of statistical research and/or applications as being rather complicated scenes in which perhaps Waldo, a.k.a. majorization, may well be lurking. The search begins.

3. COVERING A CIRCLE WITH RANDOMLY PLACED ARCS

Suppose that $n$ arcs of lengths $\ell_1, \ell_2, \ldots, \ell_n$ are placed independently and uniformly on the unit circle (a circle with unit circumference). Let $P(\ell)$ denote the probability that the unit circle is completely covered by these arcs. The problem is only interesting when the total length of the arcs $L = \sum_{i=1}^{n} \ell_i$ exceeds 1, the circumference of the circle. We therefore assume that $L > 1$. In the special case in which the arcs are of equal lengths (say $\bar{\ell} = L/n$), the required probability was provided by Stevens (1939).
Specifically we have

$$P(\bar{\ell}\,\mathbf{1}) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (1 - k\bar{\ell})_{+}^{n-1}. \tag{3.1}$$

At the other extreme, if one arc is of length $L$ and the others of length 0, coverage is certain. It would appear then that, in this situation, increasing the variability among the $\ell_i$'s, subject to the sum being equal to $L$, might well be associated with an increase in the coverage probability. Proschan conjectured that $P(\ell)$ is a Schur convex function. It is indeed Schur convex but it is not that easy to verify. Details were provided by Huffer and Shepp (1987). Not surprisingly, the argument is based on studying the effect on $P(\ell)$ of making a small change in two unequal $\ell_i$'s (to make them more alike) holding the other lengths fixed. Waldo is here, but he is not easily unmasked.

4. WAITING FOR A PATTERN

If we seat a monkey at a keyboard and have him type letters, spaces and punctuation marks at random, it is common knowledge that eventually he will produce a perfectly typed version of the Gettysburg Address and, for that matter, the entire contents of the 2004 edition of the Encyclopedia Britannica. But we would have to wait a rather long time to see this.

The mathematical formulation of the monkey's activities involves observing a sequence $X_1, X_2, \ldots$ of independent identically distributed random variables with possible values $1, 2, \ldots, k$ and associated positive probabilities $p_1, p_2, \ldots, p_k$. Let $N$ denote the waiting time until a particular consecutive string of outcomes is observed, or one of a particular set of outcome strings is observed. If we are waiting for the string $t_1, t_2, \ldots, t_\ell$ where each $t_j$ is a number chosen from the set $1, 2, \ldots, k$, there are several ways in which variability can affect the waiting time random variable $N$.
The random variable will be affected by variability among the $p_i$'s, the probabilities of the individual possible values of the $X$'s. It will be also affected by the variability among the $t_j$'s appearing in the string whose appearance we are awaiting. For example, we might expect to have to wait longer for a string of $\ell$ consecutive like outcomes than for a string of $\ell$ distinct outcomes. Possibilities for a role for majorization abound here.

In particular, Ross (1999) considers the waiting time $N$ until we observe a run of $k$ observed values of the $X_i$'s that includes all $k$ of the possible values of the $X_i$'s, as a function of $p = (p_1, \ldots, p_k)$. Here indeed it is possible to verify that for every $n$, $P(N > n)$ is a Schur convex function of $p$, and consequently that $E(N)$ is also Schur convex as a function of $p$. The shortest waiting time is thus associated with the case in which the $p_j$'s are all equal to $1/k$.

5. PAIRED COMPARISONS

The theory of paired comparisons has found considerable application in the study of professional sporting contests. At the end of a typical season each of the $k$ teams in the league will have played each other team a given number, say $p$, of times. For simplicity, we ignore such factors as home field advantage and we assume that the rules of the league exclude the possibility of ties. Similar analysis might well be applied to taste-testing experiments and other paired comparison scenarios, but we will follow Joe (1988) and focus on the sports setting.

In modeling this scenario, it is convenient to consider a $k \times k$ matrix $P = (p_{ij})$ in which, for $i \ne j$, $p_{ij}$ denotes the probability that team $i$ will beat team $j$ in a particular game. Of course we have $p_{ij} + p_{ji} = 1$, recalling our assumption that ties do not occur.
We leave the diagonal elements of $P$ empty so that $P$ has $n(n-1)$ nonnegative elements. The strength of a particular team, say team $i$, is to some extent measured by its corresponding row total $p_i = \sum_{j \ne i} p_{ij}$. For a given vector $p$ of team strengths, we can consider the class $\mathcal{P}(p)$ of all probability matrices $P$ with only off-diagonal elements defined and with row totals given by $p$.

It is reasonable to assume that if team $i$ is better than team $j$ (i.e., if $p_{ij} \ge 0.5$) and if team $j$ is better than team $k$, then team $i$ should be better than team $k$. Joe calls the matrix $P$ weakly transitive if $p_{ij} \ge 0.5$ and $p_{jk} \ge 0.5$ imply $p_{ik} \ge 0.5$. A stronger condition is also plausible. He defines $P$ to be strongly transitive if $p_{ij} \ge 0.5$ and $p_{jk} \ge 0.5$ imply $p_{ik} \ge \max(p_{ij}, p_{jk})$.

Where does majorization come into this picture? Each matrix $P$ in $\mathcal{P}(p)$ can be rearranged as an $n \times (n-1)$-dimensional row vector denoted by $P^*$. We will write $P \prec Q$ iff $P^* \prec Q^*$ in the usual sense of majorization. A matrix $P \in \mathcal{P}(p)$ is said to be minimal if $Q \prec P$ implies $Q^* = P^*$ up to rearrangement. Joe (1988) verifies that any strongly transitive $P$ is minimal. Variations in which ties and home field advantage are considered are also discussed in Joe (1988).

6. PHASE TYPE DISTRIBUTIONS

In a continuous-time Markov chain with $(n + 1)$ states, of which $n$ states $(1, 2, \ldots, n)$ are transient and state $n + 1$ is absorbing, the time $T$ until absorption in state $n + 1$ is said to have a phase type distribution (Neuts, 1975). Such distributions are parameterized by an initial distribution vector for the chain, $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ (we assume that the probability of beginning in state $n + 1$ is 0), and a matrix $Q$ of intensities of transitions among the $n$ transient states. The elements of $Q$ satisfy $q_{ii} < 0$, $i = 1, 2, \ldots, n$, and $q_{ij} \ge 0$, $j \ne i$.
In such a setting $T$ is said to have a phase type distribution with parameters $\alpha$ and $Q$ and we write $T \sim PH(\alpha, Q)$.

A very simple example is the one in which $\alpha = \alpha^* = (1, 0, 0, 0, \ldots, 0)$ and $Q = Q^*$ where $q^*_{ii} = -\delta$, $\forall i$, and $q^*_{ij} = \delta$ for $j = i + 1$ while $q^*_{ij} = 0$ otherwise. In this situation the chain begins in state 1, and then successively moves through states $2, 3, \ldots, n$, spending an exponential($\delta$) time in each state. Consequently the time to absorption, say $T^*$, will be a sum of $n$ i.i.d. exponential random variables and so $T^* \sim \text{gamma}(n, \delta)$ (in queueing contexts this is often called the Erlang distribution rather than the gamma distribution).

We say that a phase type distribution is of order $n$ if $n$ is the smallest integer such that the distribution can be identified with the absorption time of a chain with $n$ transient states and one absorbing state. It appears that, in some sense, $T^*$ exhibits the most regular behavior of any phase type distribution of order $n$. This can be made precise in terms of what is called the Lorenz order, a natural extension of majorization. Let $\mathcal{L}$ denote the class of nonnegative random variables with finite positive expectations. (This can be extended to allow the random variables to assume negative values, but for our present purposes this is not needed.) For $X$ and $Y$ in $\mathcal{L}$, we will write $X \le_L Y$ iff $E(g(X/E(X))) \le E(g(Y/E(Y)))$ for every continuous convex function $g$. Majorization can be identified as a special case here by choosing $X$ and $Y$ to each have $n$ equally likely values $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$, respectively, with $E(X) = E(Y)$. More detailed discussion of the Lorenz order on $\mathcal{L}$ may be found in Arnold (1987).

Aldous and Shepp (1987) showed that $T^*$ [with its gamma($n, \delta$) distribution] has the smallest coefficient of variation among phase type distributions of order $n$, that is, it minimizes $E((T/E(T))^2)$. More generally, O'Cinneide (1991) verified that $T^* \le_L T$ for any variable $T$ that is phase type of order $n$, thus confirming the fact that $T^*$ exhibits the least "variability" (as measured by the Lorenz order).

7. CATCHABILITY

An island community contains an unknown number $\nu$ of species of butterflies. Butterflies are sequentially trapped until $n$ individuals have been captured. Denote by $r$ the number of distinct species represented among the captured butterflies. We may well use $r$ (and $n$) to help us estimate $\nu$. A typical stochastic model for this problem is based on the assumption that butterflies from species $j$, $j = 1, 2, \ldots, \nu$, enter the trap according to a Poisson($\lambda_j$) process and that these Poisson processes are independent. Define $p_j = \lambda_j / \sum_{i=1}^{\nu} \lambda_i$. The probability that a particular butterfly trapped is from species $j$ is then given by $p_j$, $j = 1, 2, \ldots, \nu$. The $p_j$'s can be interpreted as measures of "catchability" of the various species. The simplest model is that of equal catchability (i.e., $p_j = 1/\nu$, $j = 1, 2, \ldots, \nu$). If we assume that $\nu \le n$, then, under the equal catchability model, a minimum variance unbiased estimate of $\nu$, based on $r$, exists. It is given by

$$\hat{\nu} = S(n + 1, r)/S(n, r) \tag{7.1}$$

where $S(n, x)$ is a Stirling number of the second kind.

What happens when the species vary in catchability? In an extreme case in which one particular species is easily trapped and the others are extremely difficult to trap, we will usually observe $r = 1$ and will consequently badly underestimate $\nu$.
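
The estimate (7.1) is easy to evaluate exactly. The following is a minimal sketch; the function names `stirling2` and `estimate_species` are illustrative, and the Stirling numbers are computed with the standard recurrence $S(n, r) = r\,S(n-1, r) + S(n-1, r-1)$, which is a textbook identity rather than something taken from the papers cited here.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, r):
    """Stirling number of the second kind S(n, r), via the standard recurrence."""
    if n == r:
        return 1  # includes the convention S(0, 0) = 1
    if r == 0 or r > n:
        return 0
    return r * stirling2(n - 1, r) + stirling2(n - 1, r - 1)


def estimate_species(n, r):
    """Evaluate (7.1): S(n + 1, r) / S(n, r), as an exact fraction.

    n is the number of trapped individuals and r the number of distinct
    species observed; the estimate presumes equal catchability.
    """
    if not 1 <= r <= n:
        raise ValueError("need 1 <= r <= n")
    return Fraction(stirling2(n + 1, r), stirling2(n, r))


# When only one species is ever seen, the estimate is 1 whatever n is.
print(estimate_species(10, 1))
print(estimate_species(4, 2))
```

With $r = 1$ the estimate equals 1 for every $n$, which is the underestimation described in the extreme case above.
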
Indeed, as Nayak and Christman (1992) observe, the random number $R$ of species captured has a distribution which is a Schur convex function of $p$. Thus the estimate (7.1) and other estimates which are sensible under equal catchability will be negatively biased, with the bias increasing as the catchability becomes more variable.

8. DISEASE TRANSMISSION

Tong (1997) identifies an interesting majorization feature of a disease transmission model due to Eisenberg (1991). Consider a closed population of $n + 1$ individuals. One individual (number $n + 1$) is susceptible to the disease but as yet is uninfected. The other $n$ individuals are carriers of the disease. If individual $n + 1$ has a single contact with individual $i$, we denote the probability of avoiding infection by $p_i$, $i = 1, 2, \ldots, n$.

It is assumed that individual $n + 1$ makes a total of $J$ contacts with individuals in the population in accordance with a preference vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, where $\alpha_i > 0$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} \alpha_i = 1$. In addition, individual $n + 1$ has a lifestyle vector $k = (k_1, k_2, \ldots, k_J)$ where the $k_i$'s are nonnegative integers summing to $J$. For given vectors $\alpha$ and $k$, the individual $n + 1$ proceeds as follows. He/she first picks a partner from among the $n$ carriers according to the preference vector $\alpha$. Thus he/she will select individual 1 with probability $\alpha_1$, individual 2 with probability $\alpha_2$, and so on. He/she then makes $k_1$ contacts with this partner. Then he/she selects a second partner (it could be the same one) according to the preference vector $\alpha$ and has $k_2$ contacts with this partner. The process terminates after all $J = \sum_{i=1}^{J} k_i$ contacts have been made.
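
The contact procedure just described can be explored numerically. The sketch below rests on an assumption that is not spelled out in the text: successive partner selections are independent, and each contact with carrier $i$ is survived independently with probability $p_i$. Under that assumption the probability of escaping infection factors over the rounds as $\prod_{j} \sum_{i} \alpha_i p_i^{k_j}$. The function name `escape_probability` is illustrative.

```python
from math import prod  # available in Python 3.8 and later


def escape_probability(k, alpha, p):
    """Probability of escaping infection for lifestyle vector k.

    Assumes independent partner selections and independent contacts,
    so that round j (k_j contacts with one randomly chosen partner)
    contributes the factor sum_i alpha_i * p_i ** k_j.
    A round with k_j = 0 contributes sum_i alpha_i = 1.
    """
    return prod(
        sum(a * p_i ** k_j for a, p_i in zip(alpha, p))
        for k_j in k
    )


alpha = (0.5, 0.5)   # preference vector (illustrative values)
p = (0.9, 0.1)       # per-contact escape probabilities (illustrative values)

# Three lifestyle vectors with J = 3 contacts, from most to least concentrated.
for k in [(3, 0, 0), (2, 1, 0), (1, 1, 1)]:
    print(k, escape_probability(k, alpha, p))
```

For these illustrative values the escape probability decreases as the contacts are spread more evenly across rounds.
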
Denote the probability of escaping infection by $H(k, \alpha, p)$, depending as it does on lifestyle ($k$), preference ($\alpha$) and variable nontransmission probabilities ($p$).

There are several possible roles for majorization here. Variability among the coordinates of $k$, $\alpha$ and/or $p$ can be expected to affect $H(k, \alpha, p)$. Tong (1997) focuses on the lifestyle vector $k$. Two extreme lifestyles are readily identified. The first one corresponds to $k = (J, 0, 0, \ldots, 0)$, which could be called a monogamous style. Here a partner is randomly chosen according to the preference vector $\alpha$ and all contacts are made with this individual. The second extreme lifestyle has $k = (1, 1, 1, \ldots, 1)$. In this case each contact is made with a randomly chosen individual. The probability of escaping infection with $k = (J, 0, \ldots, 0)$ is clearly $\sum_{i=1}^{n} \alpha_i p_i^J$ while the probability of escaping infection using the lifestyle $(1, 1, 1, \ldots, 1)$ is $(\sum_{i=1}^{n} \alpha_i p_i)^J$. It follows via Jensen's inequality that one has a larger probability of escaping infection with the "monogamous" lifestyle $(J, 0, \ldots, 0)$ than with the "random" lifestyle $(1, 1, 1, \ldots, 1)$. This holds for every $\alpha$ and every $p$. But of course these two lifestyles are extreme cases with regard to majorization. It is then quite plausible that the probability of escaping infection is a Schur convex function of the lifestyle vector $k$. Indeed, Tong (1997) confirms this conjecture. He also is able to get some results when the number $J$ of contacts is a random variable. Several interesting aspects of this problem remain open.

9. APPORTIONMENT IN PROPORTIONAL REPRESENTATION

The ideal of one man–one vote is often approached by the device of proportional representation.
Thus if there are $N$ seats available and if a political party received $100q\%$ of the votes, then ideally that party should be assigned $Nq$ seats. But fractional seats cannot be assigned (or better yet are not assigned, since there seems to be no reason why they could not be assigned, except perhaps for aesthetic considerations). Which method of rounding should be used to arrive at an assignment of integer-valued numbers of seats to every party in a manner essentially reflecting proportional representation? This is not a new problem. Several very well-known American politicians have proposed methods of rounding for use in this situation. Balinski and Young (2001) provide a good survey of the methods usually considered. Marshall, Olkin and Pukelsheim (2002) highlight the role of majorization in comparing the various candidate rounding methods. John Quincy Adams proposed a method that was kind to small parties (rounding up their representation), while at the other extreme Thomas Jefferson urged rounding down, which favors large parties. Other popular intermediate strategies are associated with the names Dean, Hill and Webster.

It is easiest to describe all of these apportionment methods in terms of a sequence of signposts which determine rounding decisions. The signposts $s(k)$ are numbers in the interval $[k, k + 1]$ such that $s(k)$ is a strictly increasing function of $k$. The corresponding rounding rule is that a number in the interval $[k, k + 1]$ is rounded down if it is less than $s(k)$ and is rounded up if it is greater than $s(k)$. If the number is exactly equal to $s(k)$, then we may round up or down. So-called power-mean signpost sequences have been popular. They are of the form

$$s_p(k) = \left( \frac{k^p}{2} + \frac{(k + 1)^p}{2} \right)^{1/p}, \qquad -\infty \le p \le \infty. \tag{9.1}$$

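
The signpost rounding rule can be sketched in a few lines. The code below evaluates (9.1), treating $p = -\infty$, $p = 0$ and $p = \infty$ as the usual limiting cases of a power mean (minimum, geometric mean and maximum); the function names and the decision to round down on an exact tie are illustrative choices. It covers only the rounding step, not the search for a divisor that makes the rounded seats sum to $N$.

```python
import math


def signpost(k, p):
    """Power-mean signpost s_p(k) in [k, k + 1], following (9.1).

    The cases p = -inf, p = 0 and p = +inf are handled as limits of
    the power mean. Very large finite |p| may overflow and is not
    handled in this sketch.
    """
    lo, hi = float(k), float(k + 1)
    if p == -math.inf:
        return lo                      # minimum of k and k + 1
    if p == math.inf:
        return hi                      # maximum of k and k + 1
    if p == 0:
        return math.sqrt(lo * hi)      # geometric mean
    if lo == 0.0 and p < 0:
        return 0.0                     # limiting value when k = 0
    return ((lo ** p + hi ** p) / 2.0) ** (1.0 / p)


def round_with_signpost(x, p):
    """Round x >= 0 using the signpost s_p(floor(x)).

    Values above the signpost go up, values below go down; an exact
    tie is rounded down here (the rule permits either choice).
    """
    k = math.floor(x)
    return k + 1 if x > signpost(k, p) else k


# The same quota, 1.4 seats, under several values of p.
for p in (-math.inf, -1, 0, 1, math.inf):
    print(p, round_with_signpost(1.4, p))
```

Because $s_p(k)$ is nondecreasing in $p$ for fixed $k$, raising $p$ makes it harder for a given fractional quota to be rounded up.
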
The five most popular apportionment methods can all be viewed as having been based on a particular power-mean signpost sequence. The Adams rule (rounding up) corresponds to $p = -\infty$, the Dean rule corresponds to $p = -1$, the Hill rule corresponds to $p = 0$, the Webster rule to $p = 1$ and finally the Jefferson rule (rounding down) corresponds to $p = \infty$.

Marshall, Olkin and Pukelsheim (2002) show that the seating vector produced by a power-mean rounding rule of order $p$ will always be majorized by the seating vector produced by a power-mean rounding rule of order $p'$ if and only if $p \le p'$. Consequently, among the five popular apportionment rules, the change when moving from the Adams rule toward the Jefferson rule is a change in favor of large parties in a majorization sense. The move from an Adams apportionment toward a Jefferson apportionment can actually be accomplished by a series of single seat reassignments from a poorer party (with fewer votes) to a richer party (with more votes) [paralleling reverse Robin Hood (a.k.a. Pigou–Dalton) income transfers in an economic setting].

10. MAJORIZATION IN STATISTICAL MECHANICS

The state space of a physical system, $S_n$, can be identified with the set of all probability vectors $p = (p_1, p_2, \ldots, p_n)'$ where $p_i \ge 0$ and $\sum_{i=1}^{n} p_i = 1$. A useful partial order in this context is related to the information content of the states. For two states $p$ and $q$, it is prescribed that $p \prec q$ iff there exists a doubly stochastic matrix $T$ with $p = Tq$. But of course, appealing to the classical result of Hardy, Littlewood and Pólya (1929), this is in fact the majorization partial order (and the notation is thus consistent with our usage in earlier sections of this paper). In this context separable concave functions are called generalized entropies.

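
The doubly stochastic characterization can be checked on a small example. In the sketch below the matrix `T`, the state `q`, and the use of Shannon entropy as one particular generalized entropy (a separable concave function) are all illustrative choices; the helper `majorizes` implements the partial-sum definition of Section 2.

```python
import math


def majorizes(x, y, tol=1e-12):
    """True if x majorizes y, via partial sums of increasingly ordered coordinates."""
    xs, ys = sorted(x), sorted(y)
    if len(xs) != len(ys) or abs(sum(xs) - sum(ys)) > tol:
        return False
    partial_x = partial_y = 0.0
    for a, b in zip(xs[:-1], ys[:-1]):
        partial_x += a
        partial_y += b
        if partial_x > partial_y + tol:
            return False
    return True


def apply_matrix(T, q):
    """Matrix-vector product T q for a matrix given as a list of rows."""
    return [sum(t * q_j for t, q_j in zip(row, q)) for row in T]


def shannon_entropy(p):
    """Shannon entropy, one example of a separable concave function."""
    return -sum(p_i * math.log(p_i) for p_i in p if p_i > 0)


# A doubly stochastic matrix: every row and every column sums to 1.
T = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
q = [0.7, 0.2, 0.1]
p = apply_matrix(T, q)   # averages the first two coordinates of q

print(p)
print(majorizes(q, p))                           # q majorizes p = T q
print(shannon_entropy(p) >= shannon_entropy(q))  # entropy does not decrease
```

Averaging two coordinates, as this `T` does, is the elementary smoothing step from which general doubly stochastic maps can be built.
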
A related partial order is defined on $k$-tuples of states. For two $k$-tuples $(p_1, p_2, \ldots, p_k)$ and $(q_1, q_2, \ldots, q_k)$ we define $(p_1, p_2, \ldots, p_k) \prec^{(k)} (q_1, q_2, \ldots, q_k)$ iff there exists a stochastic matrix $T$ such that $p_i = T q_i$, $i = 1, 2, \ldots, k$. In particular when $k = 2$, a partial ordering defined with respect to a reference state $s$ becomes of interest. The partial order relative to $s$ is defined by

$$p \prec_s q \quad \text{iff} \quad (p, s) \prec^{(2)} (q, s). \tag{10.1}$$

It may be noted that if $s$ is chosen to be equal to $e = (\frac{1}{n}, \ldots, \frac{1}{n})$, then the corresponding partial order (relative to $e$) coincides with the usual majorization order. Thus the partial ordering $\prec_s$ is a genuine extension of the classical majorization order.

Dynamic processes in the state space $S_n$ can be identified with indexed families of stochastic matrices. Such processes which preserve the $s$-partial order have been studied in some detail. A convenient introductory reference is Zylka (1985).

Schur convex functions and analogous $s$-Schur convex functions turn out to have useful thermodynamic interpretation in this context.

11. CONNECTED COMPONENTS IN A RANDOM GRAPH

Ross (1981) considers a random graph with nodes numbered $1, 2, \ldots, n$. Suppose that $X(1), X(2), \ldots, X(n)$ are independent identically distributed random variables each with possible values $1, 2, \ldots, n$ and with common distribution defined by

$$P(X(i) = j) = p_j, \qquad j = 1, 2, \ldots, n, \tag{11.1}$$

where $p_j \ge 0$, $\forall j$, and $\sum_{j=1}^{n} p_j = 1$. We construct the random graph by drawing the $n$ random arcs $(i, X(i))$, $i = 1, 2, \ldots, n$. In this manner, one arc emanates from each node. However, of course, several arcs can terminate at the same node. The resulting graph will have a random number of connected components.
A connected component of the graph is a set of nodes such that any pair of them is linked by an arc in the graph, and there are no arcs joining any nodes in the set with any node outside the set. Let us denote the random number of such connected subsets by $M$. The distribution of $M$ will of course be influenced by the probability vector $p$, appearing in (11.1), which governs the distribution of the random arcs $X(1), X(2), \ldots, X(n)$.

For example, if $p = (1, 0, 0, \ldots, 0)$, then all arcs will terminate at node 1 and there will be a single connected subset of nodes in the random graph, that is, $M = 1$. The following expression for the expected value of $M$ is provided by Ross:

$$E(M) = \sum_{S} (|S| - 1)! \prod_{j \in S} p_j \tag{11.2}$$

where the summation extends over all nonempty subsets $S$ of $\{1, 2, \ldots, n\}$. It is then possible, using this expression, to verify that $E(M)$ is a Schur concave function of $p$. Consequently the expected number of connected components of the graph is maximized if $p_j = 1/n$, $j = 1, 2, \ldots, n$.

12. A STOCHASTIC RELATION BETWEEN THE SUM OF TWO RANDOM VARIABLES AND THEIR MAXIMUM

Suppose that $X = (X_1, X_2)$ is a random vector with nonnegative coordinate random variables $X_1, X_2$. It is often of interest to compare the tail behavior of $X_1 + X_2$ with that of $\max(X_1, X_2)$. In the context of construction of confidence intervals for the difference between normal means with unequal variances (a Behrens–Fisher setting), Dalal and Fortini (1982) identified a sufficient condition for stochastic ordering between $X_1 + X_2$ and $\sqrt{2} \max(X_1, X_2)$ that involves Schur convexity.
Specifically they prove that a sufficient condition for

$$P(X_1 + X_2 \le c) \ge P(\sqrt{2} \max(X_1, X_2) \le c)$$

for any $c \ge 0$ is that the joint density of $(X_1, X_2)$, say $f(x_1, x_2)$, is such that $f(\sqrt{x_1}, \sqrt{x_2})$ is a Schur convex function of $x$. The proof involves conditioning on $X_1^2 + X_2^2$ and observing that on any curve $x_1^2 + x_2^2 = t$, the joint density $f(x_1, x_2)$ increases as one moves away from the line $x_1 = x_2$.

An important special case in which the hypotheses are satisfied is the situation in which $(X_1, X_2) = (|Y_1|, |Y_2|)$ where

$$Y \sim N^{(2)}\left(0, \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right).$$

A related $n$-dimensional result is also provided by Dalal and Fortini (1982). They show that if $X_1, X_2, \ldots, X_n$ are i.i.d. positive random variables with common density $f$ and if $\log f(\sqrt{x})$ is concave and $f(x)/x$ is nonincreasing, then

$$\sum_{i=1}^{n} X_i \le_{\mathrm{st}} \sqrt{n} \max(X_1, X_2, \ldots, X_n).$$

13. FURTHER EXAMPLES

The list could be continued. Schur convexity and majorization can be found in many other settings. To conclude our short survey, we will merely mention briefly a few more interesting settings in which Waldo appears:

(i) the study of peakedness of univariate and multivariate distributions,
(ii) admissibility of tests in multivariate analysis of variance,
(iii) probability content of regions for a Schur concave joint density,
(iv) the study of diversity in ecological environments,
(v) income and wealth inequality measurement (with multivariate extensions).

As observed in the Introduction, there are many more examples in the literature and there is no reason to believe that the search for new applications of majorization and Schur convexity will falter in the next 25 years.
When the Inequalities volume celebrates its golden jubilee, an even more extensive and fascinating array of appearances can be confidently predicted. The search for Waldo will continue apace.

REFERENCES

Aldous, D. and Shepp, L. (1987). The least variable phase type distribution is Erlang. Comm. Statist. Stochastic Models 3 467–473. MR0925936
Arnold, B. C. (1987). Majorization and the Lorenz Order: A Brief Introduction. Springer, Berlin. MR0915159
Balinski, M. L. and Young, H. P. (2001). Fair Representation: Meeting the Ideal of One Man, One Vote, 2nd ed. Brookings Institute Press, Washington, DC. MR0649246
Dalal, S. R. and Fortini, P. (1982). An inequality comparing sums and maxima with application to Behrens–Fisher type problem. Ann. Statist. 10 297–301. MR0642741
Eisenberg, B. (1991). The effect of variable infectivity on the risk of HIV infection. Statist. Medicine 9 131–139.
Hardy, G. H., Littlewood, J. E. and Pólya, G. (1929). Some simple inequalities satisfied by convex functions. Messenger of Mathematics 58 145–152.
Hardy, G. H., Littlewood, J. E. and Pólya, G. (1934). Inequalities. Cambridge Univ. Press.
Huffer, F. W. and Shepp, L. A. (1987). On the probability of covering the circle by random arcs. J. Appl. Probab. 24 422–429. MR0889806
Joe, H. (1988). Majorization, entropy and paired comparisons. Ann. Statist. 16 915–925. MR0947585
Marshall, A. W. and Olkin, I. (1979). Inequalities: Theory of Majorization and Its Applications. Academic Press, New York. MR0552278
Marshall, A. W., Olkin, I. and Pukelsheim, F. (2002). A majorization comparison of apportionment methods in proportional representation. Soc. Choice Welf. 19 885–900. MR1935010
Nayak, T. K. and Christman, M. C. (1992). Effect of unequal catchability on estimates of the number of classes in a population. Scand. J. Statist. 19 281–287. MR1183202
Neuts, M. F. (1975). Computational uses of the method of phases in the theory of queues. Comput. Math. Appl. 1 151–166. MR0386055
O'Cinneide, C. A. (1991). Phase-type distributions and majorization. Ann. Appl. Probab. 1 219–227. MR1102318
Ross, S. (1981). A random graph. J. Appl. Probab. 18 309–315. MR0598950
Ross, S. (1999). The mean waiting time for a pattern. Probab. Engrg. Inform. Sci. 13 1–9. MR1666364
Stevens, W. L. (1939). Solution to a geometrical problem in probability. Ann. Eugenics 9 315–320. MR0001479
Tong, Y. L. (1997). Some majorization orderings of heterogeneity in a class of epidemics. J. Appl. Probab. 34 84–93. MR1429057
Zylka, C. (1985). A note on the attainability of states by equalizing processes. Theor. Chim. Acta 68 363–377.
