A penalized inference approach to stochastic block modelling of community structure in the Italian Parliament

Mirko Signorelli and Ernst C. Wit

Citation info: Signorelli, M. and Wit, E. C. (2017), A penalized inference approach to stochastic block modelling of community structure in the Italian Parliament. Journal of the Royal Statistical Society: Series C. DOI: 10.1111/rssc.12234

The published version of this manuscript is available with Open Access from the website of the Journal of the Royal Statistical Society: Series C at http://onlinelibrary.wiley.com/doi/10.1111/rssc.12234/full

© 2017 The Authors. Journal of the Royal Statistical Society: Series C (Applied Statistics) published by John Wiley & Sons Ltd on behalf of the Royal Statistical Society. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.

0035–9254/17/67000 Appl. Statist. (2017)

Mirko Signorelli, Leiden University Medical Center and University of Groningen, The Netherlands, and University of Padova, Italy

Ernst C. Wit, University of Groningen, The Netherlands

[Received July 2016. Revised May 2017]

Summary. We analyse bill cosponsorship networks in the Italian Chamber of Deputies. In comparison with other parliaments, a distinguishing feature of the Chamber is the large number of political groups. Our analysis aims to infer the pattern of collaborations between these groups from data on bill cosponsorships. We propose an extension of stochastic block models for edge-valued graphs and derive measures of group productivity and of collaboration between political parties.
As the model proposed encloses a large number of parameters, we pursue a penalized likelihood approach that enables us to infer a sparse reduced graph displaying collaborations between political parties.

Keywords: Adaptive lasso; Bill cosponsorship; Community structure; Network; Penalized likelihood; Stochastic block model

Address for correspondence: Mirko Signorelli, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands. E-mail: m.signorelli@lumc.nl

1. Introduction

The legislative process in modern democracies typically involves three fundamental steps: the proposal of a bill, a discussion on its contents and a final vote on it. Throughout this process, many interactions and collaborations can arise between different political actors, who join their efforts to support, change or oppose a proposed legislation. The analysis of these interactions can, then, provide insight into the features and the mode of operation of different parliaments, and on the way and the extent to which these interactions can influence the legislative process.

Two types of data are often considered in this context. The first is represented by bill cosponsorship networks (Fowler, 2006; Rocca and Sanchez, 2007; Parigi and Sartori, 2014). A parliamentarian can sponsor a bill individually, or cosponsor it together with other parliamentarians. In the latter case, bill cosponsorship implies a formal collaboration between its proponents, who officially state their agreement and support of the legislation proposed. The second is given by roll-call votes (Kirkland, 2014; Dal Maso et al., 2014), in which parliamentarians express their final decision on a bill.

In this paper we study bill cosponsorship in the Italian Chamber of Deputies over the last four legislatures, covering the period 2001–2015.
We represent bill cosponsorships by means of an undirected graph, where a weighted edge displays the number of bills that two deputies have cosponsored together. Compared with other parliaments, such as the American Congress or the German Bundestag, a distinguishing feature in the history of the Italian Parliament is the presence of a large number of political factions. Our primary aim is to infer a graph that summarizes collaborations within and between parties from the network of bill cosponsorships, whose actors are the deputies.

We tackle this issue by viewing edges $e_{ij}$ in the graph as a result of a Poisson process that explicitly depends on group memberships of nodes $i$ and $j$, as well as on their individual attributes. The model that we propose builds on the stochastic block models that are employed in social network analysis, which we review in Section 1.1. We resort to generalized linear models and derive measures of group relevance, and of collaboration or repulsion between groups. Finally, we propose a penalized inference approach for stochastic block models that enables us to reduce model complexity. We show that, with the use of penalized likelihood methods, a sparse reduced graph representing collaborations (and repulsions) between political parties can be obtained directly from the signs of the model parameters.

Our analysis demonstrates the evolution of the Italian political system from a highly polarized political arena, in which deputies base collaborations on their identification with left- or right-wing values, towards an increasingly fragmented parliament, where a rigid separation of parties into coalitions does not hold any more, and collaborations beyond the perimeter of coalitions have become possible.

1.1. Stochastic block models

Community membership can play an important role in shaping social interactions.
Social networks are often characterized by the presence of clusters of units that are strongly linked between themselves and weakly connected to individuals that fall outside their cluster, so that ignoring the preferential attachment of units based on community membership can lead to misleading interpretations of the determinants of network ties. Thus, cluster identification and assessment of the relationship between groups of nodes in a network have been active topics of research in the analysis of social networks.

Stochastic block models were first introduced as a modification of the $p_1$-class of models for unweighted digraphs that was proposed by Holland and Leinhardt (1981). If we denote by $X_{ij}$ a Bernoulli random variable that takes value 1 if an arrow from node $i$ to node $j$ is present, and is 0 otherwise, then the $p_1$-model assumes that pairs of edges or dyads $Y_{ij} = (X_{ij}, X_{ji})$ are stochastically independent and expresses the probability of observing the arrow $X_{ij}$ as a function of four parameters, representing the density of the graph, $\theta$, the tendency of arrows to be reciprocated, $\rho$, expansiveness, $\alpha_i$, and popularity, $\beta_j$, of nodes $i$ and $j$. Fienberg and Wasserman (1981) considered a situation in which a partition of units into $p$ groups, also called blocks, is available, proposing a more parsimonious representation where $\alpha_i$ and $\beta_j$ are replaced by $p$ expansiveness group effects $\alpha_r$, such that $\alpha_i = \alpha_{i'}$ for every $i$ and $i'$ belonging to block $r$, and $p$ popularity group effects $\beta_s$.

The definition of a stochastic block model was proposed by Holland et al. (1983). According to their definition, a probability distribution for a graph defines a stochastic block model if the random variables $X_{ij}$ are independent, and the random vectors $X_{ij}$ and $X_{kl}$ are identically distributed if nodes $i$ and $k$ are members of the same block $r$, and $j$ and $l$ are in the same block $s$.
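As an illustration of this definition, the following minimal sketch draws a binary directed graph in which arrows are independent and the probability of an arrow depends only on the blocks of its two endpoints. The function name, the block labels and the probability matrix `pi` are hypothetical choices made for the example; this is a simple special case and not the parameterization of any of the models cited above.

```python
import random


def sample_block_digraph(blocks, pi, seed=0):
    """Draw a binary directed graph with independent arrows.

    blocks[i] is the block label of node i; pi[r][s] is the (assumed)
    probability of an arrow from a node in block r to a node in block s.
    """
    rng = random.Random(seed)
    n = len(blocks)
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops
                x[i][j] = int(rng.random() < pi[blocks[i]][blocks[j]])
    return x


# Hypothetical example: two blocks of three nodes, dense within blocks.
blocks = [0, 0, 0, 1, 1, 1]
pi = [[0.9, 0.1],
      [0.2, 0.8]]
x = sample_block_digraph(blocks, pi)
```

Because `x[i][j]` and `x[k][l]` share the same success probability whenever `i`, `k` are in one block and `j`, `l` are in another, interchanging two nodes of the same block leaves the distribution of the graph unchanged.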
Stochastic block models imply that nodes within a block are stochastically equivalent, in the sense that, if nodes $i$ and $k$ belong to the same block $r$, any probability statement on the graph is left unchanged by interchanging them. Holland et al. (1983) criticized the model that was proposed by Fienberg and Wasserman (1981), deeming it too restrictive, and advocated that the parameters $\theta$, $\alpha_r$ and $\beta_s$ should be replaced by one parameter $\theta_{rs}$ for each pair of blocks $(r, s)$. Later, Wang and Wong (1987) proposed a network model that retains the original formulation of the $p_1$-model with individual effects $\alpha_i$ and $\beta_j$ but also includes a set of block interaction parameters $\phi_{rs}$: one for each pair of blocks $(r, s)$.

Anderson et al. (1992) elaborated on the idea of stochastic block models, viewing them as ‘a mapping of approximately equivalent actors into blocks or positions and a statement regarding the relations between the positions’. They considered the $p_1$-class of models, and they proposed to represent relational ties between blocks of units by means of a reduced graph. They obtained such a graph setting a cut-off $c$ on the predicted probability of observing an arrow from nodes in group $r$ to nodes in group $s$, $\hat{\pi}_{rs}$, and drawing an arrow from $r$ to $s$ if $\hat{\pi}_{rs} > c$.

Stochastic block models have also been employed for community detection in networks, rather than to describe relationships between blocks of nodes that are known a priori. This type of block modelling aims to find clusters of highly interconnected nodes and it is referred to as a posteriori block modelling (Wasserman and Anderson, 1987; Nowicki and Snijders, 2001).

2. Bill cosponsorship in the Italian Parliament

The Italian Parliament is based on a bicameral system in which two separate assemblies, the Chamber of Deputies and the Senate, play similar roles in the legislative process. Legislations can be proposed by different actors (including deputies, senators, the government, regions and groups of electors); here, we focus on legislation proposed by deputies. Each bill can be proposed by a single deputy, or cosponsored by a group of deputies. In the second case, bill cosponsorship defines a symmetric relationship between deputies, who formally state their agreement on the content of the proposed legislation by cosponsoring it. Thus, cosponsorship can be taken as a measure of proximity or collaboration between deputies.

Bill cosponsorship can be represented as an undirected network where nodes represent parliamentarians, and the presence of an edge $e_{ij}$ indicates that parliamentarians $i$ and $j$ have cosponsored at least one legislation. We associate with each edge a weight equal to the number of bills that the two parliamentarians have sponsored together in a given time course (typically, one legislature).

In the Italian Chamber, each deputy is required to express their affiliation to one and only one parliamentary group, which typically corresponds to a political party or to a coalition of parties. As a consequence, membership of parliamentary groups generates a partition of deputies into political groups, which we use to assess the patterns of collaboration between political parties.

Data on bill cosponsorship in 27 parliamentary chambers of 20 European countries have been recently collected by Briatte (2016), who has created and published the corresponding cosponsorship networks.
Here we consider the cosponsorship networks for the Italian Chamber of Deputies between the XIVth and the XVIIth legislatures (2001–2015) and we integrate these data with personal details on deputies retrieved from the Web site of the Chamber of Deputies (http://dati.camera.it).

The data that are analysed in the paper and the programs that were used to analyse them can be obtained from http://wileyonlinelibrary.com/journal/rss-datasets

3. Poisson process model of bill cosponsorship

A graph is a pair $G = (V, E)$, which consists of a set of nodes $V = \{1, \ldots, n\}$ connected by a set of edges $E \subseteq V \times V$. Edges represent relationships between nodes, and they can be directed or undirected, as well as weighted or unweighted. In bill cosponsorship networks, each node represents a parliamentarian and a weighted undirected edge between two parliamentarians displays the number of bills that they have cosponsored together. Thus, hereafter we consider the case of an undirected graph, where a discrete weight is associated with each edge. Such a graph can be conveniently represented by means of a symmetric adjacency matrix $Y$, where we set $y_{ij} = 0$ if deputies $i$ and $j$ are not connected, and $y_{ij}$ equal to the number of cosponsorships between deputies $i$ and $j$ otherwise. We assume absence of self-loops, i.e. $y_{ii} = 0$.

3.1. Data-generating process

We view the process of creation of edges in the graph as the result of a multivariate Poisson process in a given time course $T$. To wit, we can associate a Poisson process $N_{ij}(t)$ with rate $\lambda_{ij}$ with each pair of deputies $(i, j)$ in the graph. At the beginning of the legislature, i.e. $t = 0$, no cosponsorship has occurred yet, so $N_{ij}(0) = 0$. If after some time $t_1$ a first cosponsorship takes place between deputies $i$ and $j$, we set $N_{ij}(t_1) = 1$. If a second interaction occurs at $t_2$, we set $N_{ij}(t_2) = 2$, and so on.
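The counting process just described can be sketched for a single pair of deputies as follows. The sketch assumes a homogeneous Poisson process, simulated through exponential waiting times; the function names and the numerical values of the rate and of the time horizon are illustrative assumptions, not quantities taken from the data.

```python
import random


def simulate_counts(rate, horizon, seed=0):
    """Return the event times of a homogeneous Poisson process on (0, horizon]."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # exponential waiting time with mean 1 / rate
        if t > horizon:
            return times
        times.append(t)


def n_at(times, t):
    """N_ij(t): number of cosponsorships that have occurred up to time t."""
    return sum(1 for u in times if u <= t)


# Hypothetical pair of deputies: rate 0.4 per year over a 5-year legislature.
times = simulate_counts(rate=0.4, horizon=5.0, seed=1)
y_ij = n_at(times, 5.0)  # observed edge weight at the end of the legislature
```

Averaged over many replications, the final count is close to the product of the rate and the horizon (here 0.4 × 5 = 2), which is the mean of the corresponding Poisson distribution.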
Thus, $N_{ij}(t)$ denotes the number of bill cosponsorships that have occurred between $i$ and $j$ at a given time point $t$. If we stop the process at $t = T$, the number of cosponsorships $N_{ij}(T)$ that are observed until $T$ between each pair $(i, j)$ of deputies is a realization from a Poisson distribution with mean $\mu_{ij} = \lambda_{ij} T$ and it defines a weighted graph, where $y_{ij} = N_{ij}(T)$.

Now, suppose that a partition $\mathcal{P}$ of deputies into $p$ groups or blocks is available, and that block membership determines the rates of each Poisson process, so that we can assume that the interaction rates $\lambda_{ij}$ are homogeneous within each pair of blocks $(r, s)$:

$$\lambda_{ij} = \zeta_{rs} \quad \forall i \in \text{group } r, \ \forall j \in \text{group } s, \quad r, s \in \{1, \ldots, p\}. \qquad (1)$$

Under the assumption of independence between the univariate processes, equation (1) defines a stochastic block model, because $N_{ij}(t)$ and $N_{kj}(t)$ are independent, and they are also identically distributed if $i$ and $k$ belong to the same block.

Our primary interest is to understand which groups are more active in the network, and how members from different groups interact with each other. Thus, we would like to decompose $\mu_{rs} = \zeta_{rs} T$ into a baseline parameter $\theta_0$ that controls the overall bill cosponsorship activity in the network, two main effects $\alpha_r$ and $\alpha_s$ that account for the relative importance (productivity or popularity) of political parties $r$ and $s$, and an interaction term $\phi_{rs}$ that accounts for collaboration (if positive), indifference (if null) or repulsion (if negative), between pairs of parties. Since a linear relationship between $\mu_{rs}$ and $\theta_0$, $\alpha_r$, $\alpha_s$ and $\phi_{rs}$ is impossible for the range $\mathbb{R}^+$ of $\mu_{rs}$, we consider a monotone transformation $g: \mathbb{R}^+ \to \mathbb{R}$ of $\mu_{rs}$ to be linear in the parameters, i.e.

$$g(\mu_{rs}) = \theta_0 + \alpha_r + \alpha_s + \phi_{rs}. \qquad (2)$$

A convenient choice for $g$ is represented by the logarithm, but alternative choices for $g$ can be considered as well.
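With the logarithmic choice of $g$, the decomposition in equation (2) gives the block means in closed form. The short sketch below computes them for three blocks; all parameter values are hypothetical and serve only to show how the sign of $\phi_{rs}$ moves the expected number of cosponsorships.

```python
import math


def block_mean(theta0, alpha, phi, r, s):
    """Expected count mu_rs = exp(theta0 + alpha_r + alpha_s + phi_rs)."""
    return math.exp(theta0 + alpha[r] + alpha[s] + phi[r][s])


# Hypothetical parameter values for p = 3 blocks (phi is symmetric).
theta0 = -2.0
alpha = [0.3, -0.1, -0.2]
phi = [[0.8, -0.5, -0.3],
       [-0.5, 0.5, 0.0],
       [-0.3, 0.0, 0.3]]

mu = [[block_mean(theta0, alpha, phi, r, s) for s in range(3)] for r in range(3)]
```

In this example blocks 1 and 2 (zero-based) are indifferent to each other, so their mean is driven by the baseline and the main effects alone, whereas the negative interaction between blocks 0 and 1 lowers their expected count.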
The stochastic block model in equation (2) implies stochastic equivalence of nodes within each block. As already noted by Wang and Wong (1987), stochastic equivalence is often an unrealistic and restrictive assumption. First, it is reasonable to imagine that deputies from the same party might behave differently. Furthermore, factors other than bill cosponsorship could also play a role in the choice to cosponsor bills. Therefore, we extend model (2) to let the mean of each univariate process depend also on a set of node- or edge-specific covariates $x_{ij}$, with an associated parameter vector $\beta$:

$$y_{ij} \mid (i \in r, j \in s, x_{ij}) \sim \mathrm{Poi}(\mu_{ij} = \lambda_{ij} T), \qquad g(\mu_{ij}) = \theta_0 + \alpha_r + \alpha_s + \phi_{rs} + x_{ij} \beta. \qquad (3)$$

Model (3) is not a proper stochastic block model, because it allows $\mu_{ij} \neq \mu_{kj}$ for two units $i$ and $k$ belonging to the same group $r$. Nevertheless, it retains its focus on the role that is played by blocks in shaping the network, including specific sets of parameters $\alpha_r$ for block relevance and $\phi_{rs}$ for interactions within and between blocks. Clearly, model (2) can be derived as a particular case of model (3) by setting $\beta = 0$.

Model estimation can be performed by specifying a suitable generalized linear model (Nelder and Wedderburn, 1972; McCullagh and Nelder, 1989). We model the data-generating process in equation (3) with

$$\log(\mu_{ij}) = \theta_0 + \sum_{r=1}^{p} \alpha_r D_r(i) + \sum_{r=1}^{p} \alpha_r D_r(j) + \sum_{r \le s}^{p} \phi_{rs} D_{rs}(i, j) + x_{ij} \beta, \qquad (4)$$

where $D_r(i) = I(i \in r)$ and $D_{rs}(i, j) = I(i \in r, j \in s \vee i \in s, j \in r)$ for $r \le s = 1, \ldots, p$ are dummy variables that indicate whether a unit $i$ belongs to group $r$, or whether the pair of nodes $(i, j)$ implies an interaction between blocks $r$ and $s$. However, model (4) is not identifiable without further constraints.
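One way to see the lack of identifiability is to evaluate the linear predictor of equation (4) under two different parameter vectors. In the sketch below the dummy variables are not built explicitly: since exactly one $D_r(i)$ and one $D_{rs}(i, j)$ equal 1 for any pair, the function simply picks the corresponding coefficients. The function name, the dictionary layout for $\phi$ and all numerical values are assumptions made for the illustration.

```python
def linear_predictor(theta0, alpha, phi, r, s, x=(), beta=()):
    """theta0 + alpha_r + alpha_s + phi_rs + x'beta for a pair in blocks (r, s)."""
    lo, hi = min(r, s), max(r, s)  # phi is stored for r <= s only
    covariate_part = sum(xk * bk for xk, bk in zip(x, beta))
    return theta0 + alpha[r] + alpha[s] + phi[(lo, hi)] + covariate_part


# Hypothetical unconstrained parameters for p = 2 blocks.
theta0 = -2.0
alpha = [0.4, 0.1]
phi = {(0, 0): 0.5, (0, 1): -0.2, (1, 1): 0.3}

# Shift every alpha_r by c and compensate in theta0:
# the predictor is the same for every pair of blocks.
c = 1.7
alpha_shifted = [a + c for a in alpha]
theta0_shifted = theta0 - 2 * c

same = all(
    abs(linear_predictor(theta0, alpha, phi, r, s)
        - linear_predictor(theta0_shifted, alpha_shifted, phi, r, s)) < 1e-12
    for r in range(2) for s in range(2)
)
```

Two distinct parameter vectors therefore produce identical means, so the likelihood alone cannot distinguish between them.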
Typically the way in which identifiability constraints are specified is not particularly important, as each parameterization is equivalent; however, as we shall be penalizing some parameters in later sections, the parameterization will be important. Thus, we introduce the following identifiability conditions:

$$\sum_{r=1}^{p} \alpha_r = 0 \quad \text{and} \quad \sum_{s=1}^{p} \phi_{rs} = 0 \quad \forall r = 1, \ldots, p, \qquad (5)$$

where for ease of notation we write $\phi_{sr} = \phi_{rs}$. If we incorporate these constraints in equation (4) by letting $\alpha_1 = -\sum_{r=2}^{p} \alpha_r$ and $\phi_{rr} = -\sum_{s \neq r} \phi_{rs}$, $\forall r = 1, \ldots, p$, the model can be rewritten as

$$\log(\mu_{ij}) = \theta_0 + \sum_{r=2}^{p} \alpha_r T_r(i) + \sum_{r=2}^{p} \alpha_r T_r(j) + \cdots$$

[…] 0. Since $T_A = k^{-1} T_B$, the block means $\mu_{rs}$ are not affected by the change of time system:

$$\mu^A_{rs} = T_A \zeta^A_{rs} = k^{-1} T_B \, k \, \zeta^B_{rs} = T_B \zeta^B_{rs} = \mu^B_{rs}.$$

This result implies that the parameters $\theta_0$, $\alpha_r$ and $\phi_{rs}$ in model (2) are left unchanged, so that the model is invariant with respect to changes of timescale measurement.

4. Inference

4.1. Parameter estimation

The parameter vector $\theta = (\theta_0, \alpha_2, \ldots, \alpha_p, \phi_{12}, \phi_{13}, \ldots, \phi_{p-1,p}, \beta)$ that is associated with model (6) has dimension $q = \dim(\theta) = p(p + 1)/2 + \dim(\beta)$. In principle, it could be estimated with maximum likelihood. However, the number of model parameters $q$ increases quadratically with the number of blocks $p$. In such cases, maximum likelihood estimation could yield solutions with an extremely large number of parameters, making interpretation cumbersome. Instead, we advocate the use of penalized likelihood methods to achieve a parsimonious solution.

Besides enhancing model interpretability, penalized likelihood methods enable us to detect potentially sparse block-model-generating mechanisms. In stochastic block models, the block interaction parameter $\phi_{rs}$ indicates an attraction ($\phi_{rs} > 0$) or repulsion ($\phi_{rs} < 0$) between the pair of blocks $(r, s)$, but it can also indicate indifference between some pairs of blocks, a situation that translates into $\phi_{rs} = 0$ in model (3). Whereas maximum likelihood is unlikely to produce model estimates $\hat{\phi}_{rs}$ that are exactly null, penalized likelihood is capable of distinguishing these cases of indifference by shrinking to 0 some of the block interaction parameters.

Since the introduction of the lasso (Tibshirani, 1996), penalized inference has become a popular choice for variable selection and the solution of high dimensional problems. Many methods in this field have been introduced (see Bühlmann and van de Geer (2011) and Fan and Li (2001) for an overview). In this paper we use the adaptive lasso (Zou, 2006), which is a weighted extension of the least absolute shrinkage and selection operator (the lasso) that was introduced by Tibshirani (1996), because it has good consistency properties. The adaptive lasso aims for a sparse model solution by maximizing a penalized likelihood that incorporates the log-likelihood of the model, and a weighted $l_1$-penalty on the parameters that are included in the model. This penalty is multiplied by a tuning parameter $\delta \ge 0$, which determines the amount of regularization that is imposed on the parameters. The adaptive lasso problem for model (6) is

$$\max_{\theta} \; \log\{L(\theta)\} - \delta \sum_{j=1}^{q} w_j |\theta_j|, \qquad (8)$$

where $L(\theta)$ denotes the likelihood of the model and $w_j$ is the weight that is associated with the $j$th element $\theta_j$ of $\theta$. The tuning parameter $\delta$ is typically chosen either by cross-validation, or by minimizing a suitably defined information criterion. We discuss this issue in more detail in Section 4.2.
Denote by $\theta^*$ a consistent estimator of $\theta$ and by $N = n(n - 1)/2$ the total number of pairs of nodes in the network. The attractive feature of the adaptive lasso is that if the weight vector is defined as $w = 1/|\theta^*|^{\gamma}$, and if $\delta/\sqrt{N} \to 0$ and $\delta N^{(\gamma - 1)/2} \to \infty$, then the adaptive lasso estimator $\hat{\theta}$ is consistent in variable selection (see theorem 4 in Zou (2006)).

The choice of the parameters that are subject to the $l_1$-penalty mostly depends on the role and the meaning that we associate with them. In our view, the block interaction parameter $\phi_{rs}$ expresses the presence of a collaboration or repulsion between deputies in parties $r$ and $s$ after we have accounted for both the overall density of the network, $\theta_0$, and the relevance of the groups, $\alpha_r$ and $\alpha_s$. To retain this interpretation, we do not penalize $\theta_0$ nor $\alpha_r$, $r = 1, \ldots, p$, i.e. we set $w_j = 0$ if $j \in \{1, \ldots, p\}$. However, we aim to achieve some sparsity in the representation of relationships between groups by penalizing the $\phi_{rs}$-coefficients ($r \neq s$), as well as $\beta$. For the penalty weights, we compute the maximum likelihood estimate $\hat{\theta}$ and set $w_j = 1/|\hat{\theta}_j|^{\gamma}$, with $\gamma = 2$, for $j > p$.

Because of the identifiability conditions in equation (5), the parameters $\phi_{rr}$ ($r \in \{1, \ldots, p\}$) that control interactions within each block do not explicitly appear in model (6) and, thus, they cannot be penalized. This implies that only the $p(p - 1)/2$ interactions between different blocks can be penalized, whereas the $p$ parameters for within-block interactions are subsequently derived as $\hat{\phi}_{rr} = -\sum_{s \neq r} \hat{\phi}_{rs}$, $\forall r = 1, \ldots, p$. In practice, in real networks with community structure those parameters are typically strongly positive and, thus, unlikely to be shrunk to 0.
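The two bookkeeping steps described above, the construction of the penalty weights and the derivation of the within-block interactions, can be sketched as follows. The function names, the zero-based indexing, the dictionary layout for the between-block estimates and the handling of an initial estimate that is exactly 0 are assumptions of this illustration.

```python
def adaptive_weights(theta_hat, n_unpenalized, gamma=2.0):
    """w_j = 0 for the first n_unpenalized entries, 1 / |theta_hat_j|^gamma after."""
    weights = []
    for j, t in enumerate(theta_hat):
        if j < n_unpenalized:
            weights.append(0.0)  # theta_0 and the alpha_r are not penalized
        elif t == 0.0:
            weights.append(float("inf"))  # assumed convention: infinite penalty
        else:
            weights.append(1.0 / abs(t) ** gamma)
    return weights


def within_block_interactions(phi_between, p):
    """phi_rr = -sum_{s != r} phi_rs from a dict keyed by (r, s) with r < s."""
    phi_rr = [0.0] * p
    for (r, s), value in phi_between.items():
        phi_rr[r] -= value
        phi_rr[s] -= value
    return phi_rr


# Hypothetical estimates: two unpenalized entries followed by two penalized ones.
w = adaptive_weights([-2.7, 0.3, 0.5, -0.25], n_unpenalized=2)

# Hypothetical between-block estimates for p = 3 blocks.
phi_rr = within_block_interactions({(0, 1): -0.5, (0, 2): -0.3, (1, 2): 0.0}, p=3)
```

Small initial estimates receive large weights and are therefore shrunk more aggressively, while repulsions between different blocks translate into positive within-block interactions through the sum-to-zero constraint.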
For this reason, we believe that the parameterization in equation (6) represents the best compromise between identifiability and the need to penalize as many interaction terms as possible.

4.2. Model selection

In a penalized likelihood framework, the tuning parameter $\delta$ determines the amount of regularization that is imposed on the parameters and, eventually, the level of sparsity of the solution. Two main approaches are typically employed for the selection of an optimal tuning parameter $\delta^*$: cross-validation, or minimization of model information criteria. In the latter case, we seek

$$\delta^* = \arg\min_{\delta} \, (D_{\delta} + a_m h_{\delta}), \qquad (9)$$

where $D_{\delta}$ denotes the deviance of the model, $m$ the number of observations and $h_{\delta}$ the dimensionality of the model. Various choices have been proposed for $a_m$. Alongside Akaike's information criterion AIC, which sets $a_m = 2$, and the Bayesian information criterion BIC, which takes $a_m = \log(m)$, recent proposals include the generalized information criterion GIC of Fan and Tang (2013), where $a_m = \log\{\log(m)\} \log(h_{\delta})$, and the modified BIC MBIC of Chand (2012), where $a_m = \sqrt{m}/h_{\delta}$.

Here, we consider four simulations to assess the performance of these criteria in the selection of $\delta$. In each simulation, we generate a sequence of networks with increasing number of nodes $n = 50, 100, 150, \ldots, 500$, following the block model that is defined by equation (2). We set $\theta_0 = 0.7$ and draw $\alpha_r \sim U(-0.3, 0.3)$, $r > 1$. Moreover, we set some $\phi_{rs}$-coefficients, $r \neq s$, equal to 0 and draw the remaining coefficients in such a way that $|\phi_{rs}| \sim U(c_{\min}, c_{\max})$, with $c_{\max} = 0.5$. Coefficients $\alpha_1$ and $\phi_{rr}$, $r = 1, \ldots, p$, are subsequently derived from equation (5).
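The parameter-drawing scheme just described can be sketched as follows. The text specifies only the distribution of $|\phi_{rs}|$, so the random choice of sign, the random choice of which pairs are set to 0, the function name and the zero-based indexing (block 0 plays the role of block 1) are assumptions of this illustration and may differ from what was done in the original simulations.

```python
import random


def draw_parameters(p, n_null, c_min, c_max=0.5, seed=0):
    """Draw theta0, alpha and phi for p blocks with n_null null interactions."""
    rng = random.Random(seed)
    theta0 = 0.7
    alpha = [0.0] + [rng.uniform(-0.3, 0.3) for _ in range(p - 1)]
    alpha[0] = -sum(alpha[1:])  # sum-to-zero constraint on alpha
    pairs = [(r, s) for r in range(p) for s in range(r + 1, p)]
    null_pairs = set(rng.sample(pairs, n_null))  # interactions fixed at 0
    phi = [[0.0] * p for _ in range(p)]
    for r, s in pairs:
        if (r, s) not in null_pairs:
            sign = rng.choice((-1.0, 1.0))  # assumption: sign drawn at random
            phi[r][s] = phi[s][r] = sign * rng.uniform(c_min, c_max)
    for r in range(p):  # within-block terms make each row of phi sum to zero
        phi[r][r] = -sum(phi[r][s] for s in range(p) if s != r)
    return theta0, alpha, phi


# Hypothetical setting: 5 blocks, 4 null interactions, |phi_rs| >= 0.1.
theta0, alpha, phi = draw_parameters(p=5, n_null=4, c_min=0.1)
```

Raising `n_null` produces sparser block structures, and lowering `c_min` brings the non-null interactions closer to 0, which makes them harder to detect.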
The simulations differ for the number $h$ of null $\phi_{rs}$-coefficients ($r \neq s$) and for the betamin condition ($|\phi_{rs}| \ge c_{\min}$) that is imposed on the non-null $\phi_{rs}$-coefficients; Table 1 in the on-line supplementary material summarizes the different settings in each simulation.

We perform model selection over a grid of 100 $\delta$-values. Each selection criterion leads to an optimal $\delta$ and corresponding model estimates. To compare the performance of each criterion in the selection of models that are capable of correctly distinguishing signals ($\phi_{rs} \neq 0$) and non-signals ($\phi_{rs} = 0$), we compute the accuracy of each solution, i.e.

$$\text{accuracy} = \frac{\text{true positives} + \text{true negatives}}{p(p - 1)/2},$$

and we compare it with the maximum achievable accuracy for the set of 100 models that are considered. As shown in Fig. 1 of the supplementary material, every criterion quickly achieves the maximum accuracy when a dense model is considered (simulation A), but the accuracy of cross-validation, AIC and MBIC is often lower when sparser models are considered (simulations B and C), or when signal detection is complicated by the imposition of a milder betamin condition (simulation D). Overall, BIC and GIC outperform the competing methods and, thus, they appear to be the best information criteria for variable selection.

Fig. 1. (a) An unweighted graph with 50 nodes, partitioned into five groups (sets 1 to 5), and (b) a simplified representation of relationships between groups

4.3. The reduced graph

A focal aspect of stochastic block models is the description of the relationships between blocks of individuals. Anderson et al. (1992) proposed to represent relational ties between blocks of units by means of a reduced graph, whose nodes are the blocks.
The idea behind this reduced graph is quite simple: summarize the original graph by visualizing relationships between blocks directly, to achieve a simpler and clearer representation. As an example, consider the graph in Fig. 1(a). Three groups of nodes (sets 1, 4 and 5) appear to be characterized by a strong internal connectivity; besides, nodes within each group tend to be preferentially linked to nodes belonging to one or two other groups; for example, it appears that nodes in set 3 tend to prefer nodes in sets 1 and 2 to nodes in sets 4 and 5. On the basis of similar observations, we can attempt to draw a reduced graph that summarizes our intuition: the graph in Fig. 1(b) provides an example.

Different strategies to derive a reduced graph from a statistical model can be considered. Anderson et al. (1992) obtained such a graph setting a cut-off $c$ on the predicted probability of observing an arrow from nodes in a group $r$ to nodes in a group $s$, $\hat{\pi}_{rs}$, and drawing an arrow from $r$ to $s$ if $\hat{\pi}_{rs} > c$. The resulting reduced graph links blocks that are highly connected, but edges therein do not necessarily display attraction between groups. For example, nodes in a group $r$ could have overall higher degrees: if this is so, block $r$ would be connected to any group, just as a result of the high average degree of nodes in the block. Moreover, their approach cannot be easily generalized to edge-valued graphs.

Therefore, we propose an alternative strategy to derive a reduced graph displaying collaborations between parties, which is based on the parameter estimates $\hat{\phi}_{rs}$ in model (6) rather than on $\hat{\mu}_{rs}$ (or $\hat{\pi}_{rs}$). By doing so, we control for the average degree of blocks $r$ and $s$, as well as for the effect of individual covariates. Since an estimate $\hat{\phi}_{rs} > 0$ entails evidence of collaboration between deputies in parties $r$ and $s$, we draw an edge between blocks $r$ and $s$ if $\hat{\phi}_{rs} > 0$.
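The sign rule can be sketched in a few lines. The function below takes a set of estimated between-block interactions and returns the edges with a positive estimate; pairs with a negative estimate are collected separately, and pairs whose estimate is exactly 0 appear in neither list. Party labels, estimates and the function name are hypothetical.

```python
def reduced_graphs(phi_hat, labels):
    """Split estimated between-block interactions by sign.

    phi_hat maps a pair (r, s) with r < s to the estimate of phi_rs.
    """
    collaborations, repulsions = [], []
    for (r, s), value in sorted(phi_hat.items()):
        if value > 0:
            collaborations.append((labels[r], labels[s]))
        elif value < 0:
            repulsions.append((labels[r], labels[s]))
    return collaborations, repulsions


labels = ["A", "B", "C", "D"]  # hypothetical party names
phi_hat = {(0, 1): 0.42, (0, 2): 0.0, (0, 3): -0.31,
           (1, 2): 0.0, (1, 3): 0.0, (2, 3): 0.18}
collaborations, repulsions = reduced_graphs(phi_hat, labels)
```

With these illustrative values, the edges with a positive estimate join A with B and C with D, the only negative estimate is between A and D, and the three remaining pairs are left unconnected.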
Furthermore, it is also possible to derive a reduced graph that displays repulsions by connecting blocks such that $\hat{\phi}_{rs} < 0$. In an unpenalized likelihood framework, however, such a reduced graph is uninteresting, as it is simply the complement of the reduced graph of collaborations. Instead, as discussed in Section 4.1, penalized inference enables us to distinguish collaborations and repulsions from situations of indifference between parties. In a penalized likelihood setting, then, the reduced graph of repulsions is not just the complement of the reduced graph of collaborations, but it becomes an interesting outcome of model estimation that can highlight those pairs of parties whose members avoid working with each other.

5. Analysis of bill cosponsorship networks of the Italian Chamber of Deputies

We consider now the networks representing bill cosponsorship in the Italian Chamber of Deputies, which we have described in Section 2. We focus our attention on the cosponsorship networks of the four legislatures XIV–XVII, covering the period 2001–2015. During this period, the number of parliamentary groups ranged from 8 (legislatures XIV and XVI) to 10 (XVII) and 13 (legislature XV); in each legislature, a mixed group has always been present, gathering deputies from small political groups with different political orientation, which did not meet the requirements (defined in the Chamber's regulations) for the creation of a parliamentary group.

We study the dependence between bill cosponsorship and parliamentary groups, controlling for some individual attributes of the deputies. In particular, we consider gender, education level (undergraduate versus graduate), age, seniority and the electoral constituency of each deputy. Gender can give rise to edges involving two male, MM, two female, FF, and a female and a male, FM, deputies; we take MM as reference. Likewise, we take interactions between two undergraduate deputies, UU, as reference and introduce dummies for graduate–undergraduate, GU, and graduate–graduate, GG, interactions. We distinguish senior deputies, S, who had already been parliamentarians before their election in a given legislature, from junior deputies, J, who were first experiencing being deputies. We set interactions between junior deputies, JJ, as reference mode and introduce two dummies for junior–senior, JS, and senior–senior, SS, interactions. Furthermore, we consider the age difference of the two deputies. We take Lombardia as reference electoral constituency, and we introduce 20 fixed effects for the remaining constituencies (19 regions plus the constituency for electors living abroad). We also consider a dummy indicating whether two deputies have been elected in the same constituency.

A commonly observed feature of social networks is the presence of triadic effects. For binary graphs, these triadic effects correspond to the fact that the probability of observing an edge between two individuals increases with the number of common neighbours that they share. This idea is the basis of exponential random graph models for binary graphs (Frank and Strauss, 1986), whose estimation relies on Markov chain Monte Carlo simulation techniques (Snijders, 2002) and is typically unfeasible for networks featuring more than a few hundred nodes. Extensions of exponential random graph models for edge-valued graphs have been recently proposed (Desmarais and Cranmer, 2012; Krivitsky, 2012), but the estimation of the transitivity effect for large networks remains an open issue.
To account for triadic effects, we consider for each pair of nodes $(i, j)$ the statistic $TR_{ij} = \sum_{k \neq i,j} y_{ik} y_{jk}$, whose value increases with the number of shared cosponsors, as well as with the frequency of cosponsorships undertaken with them. We include $TR_{ij}$ in model (6) and estimate its parameter with a penalized pseudolikelihood approach. We remark that the performance of penalized pseudolikelihood in the estimation of the transitivity term of exponential random graph models has not been investigated yet, and the inclusion of $TR_{ij}$ in the model should be regarded just as an attempt to account for transitivity effects on bill cosponsorship.

For each legislature, we estimate model (6) with the adaptive lasso, using BIC to select the tuning parameter $\delta$. Table 1 shows the estimates of $\theta_0$ and $\beta$ (except for the regional effects, which are reported in Table 2 of the on-line supplementary material). The estimate of the intercept $\theta_0$ is lower for legislatures XV and XVII, consistent with the fact that the networks for those legislatures refer to shorter timeframes (less than 3 years versus the 5 years of legislatures XIV and XVI).

Table 1. Effect of individual attributes on bill cosponsorship†

| Covariate | XIV | XV | XVI | XVII |
|---|---|---|---|---|
| Intercept $\theta_0$ | −2.693 | −3.184 | −2.767 | −3.598 |
| Female–male FM | 0.139 | 0.155 | 0.208 | 0.211 |
| Female–female FF | 0.604 | 0.714 | 0.689 | 0.642 |
| Graduate–undergraduate GU | 0.155 | 0.000 | 0.011 | 0.000 |
| Graduate–graduate GG | 0.158 | 0.000 | 0.000 | −0.157 |
| Same electoral constituency | 0.527 | 0.516 | 0.537 | 0.535 |
| Junior–senior JS | −0.045 | 0.043 | 0.000 | 0.231 |
| Senior–senior SS | −0.004 | 0.127 | 0.176 | 0.571 |
| Age difference | −0.020 | 0.000 | −0.061 | −0.040 |
| Transitivity | 0.189 | 0.131 | 0.058 | 0.067 |

†The table displays the estimates of $\theta_0$ (unpenalized) and $\beta$ (penalized) in model (6) for the following legislatures: XIV (2001–2006), XV (2006–2008), XVI (2008–2013) and XVII (2013–2015).

Bill cosponsorships turn out to be more frequent between female deputies (FF) and, in general, they are more likely to take place if at least one of the sponsors is female (FM). The effect of education, instead, is not stable over time. The positive estimates that are associated with pairs of deputies who were elected in the same electoral constituency clearly point out that deputies tend to collaborate on the basis of geographic proximity. Whereas in legislature XIV junior deputies were slightly more productive than their senior colleagues, from legislature XV onwards cosponsorships involve more senior than junior deputies. Moreover, cosponsorships are more frequent between deputies of similar age. Finally, we find evidence of transitivity effects. The effects that are associated with each constituency (Table 2 of the supplementary material) are mostly shrunk to 0 and they do not point to any peculiar temporal pattern.

The pattern of interactions between political parties can be reconstructed by inspecting the reduced graphs of collaborations in Fig. 2, where an edge displays collaborations ($\hat{\phi}_{rs} > 0$) between two parliamentary groups, a self-loop indicates that there is a tendency of deputies to cosponsor with deputies from the same parliamentary group and node size is proportional to the relative frequency of cosponsorship, $\hat{\alpha}_r$, of deputies in each group. Conversely, the reduced graphs representing repulsions ($\hat{\phi}_{rs} < 0$) between parties are shown in Fig. 2 of the on-line supplementary material.

The first interesting conclusion is that cosponsorships during legislatures XIV and XV reflected collaborations within each party, and between parties that belonged to the same political coalition. In fact, both legislatures featured strong competition between two coalitions, one of which (the right wing in the first case, and the left wing in the latter) held the majority in the parliament and could, thus, govern on its own. This situation seems to have generated a strong ideological polarization, which is evident from the pattern of collaborations (and repulsions) between the parliamentary groups.

The division of the Chamber into two coalitions ended with legislature XVI, as a centrist party (the Unione di Centro, UDC) that was not part of any coalition entered the Chamber. For 3 years, the majority was held by the right-wing coalition, whereas the UDC and the left-wing coalition were in opposition. After 2 years, a group of right-wing deputies formed the Futuro e Libertà per l'Italia party, FLI, a new political group that abandoned the right-wing coalition and entered a centrist coalition with the UDC. 1 year later, the right-wing government resigned and a coalition government, supported by a heterogeneous coalition of parties, took its place.
Besides cosponsorships within each parliamentary group, our model detects collaborations between the main right-wing party (the Popolo della Libertà party, PDL) and FLI, between two opposition parties (Partito Democratico, PD, and UDC) and between a left-wing party (the Italia dei Valori party, IDV) and a right-wing group (the Popolo e Territorio party, P&T). It is also interesting to consider the reduced graph displaying repulsions: most of the edges therein indicate (not surprisingly) the absence of collaborations between parties from different coalitions, but also between the UDC and FLI, which allied towards the end of the legislature. In short, cosponsorships in this legislature seem to reflect mostly the division between the right-wing majority (FLI, the Lega Nord party, LN, PDL and P&T) and the opposition (PD, IDV, UDC) of the first half of the legislature, despite the fact that the analysis considers cosponsorships over the whole legislature span. A possible explanation for this result is that cosponsorship events are more likely to take place in the first years of each legislature: owing to the long time that is typically necessary for a bill of parliamentary initiative to be discussed and approved, a bill that is proposed towards the end of the legislature is extremely unlikely to be approved, and this can in turn discourage deputies from proposing bills in the last years of their mandate.

The fragmentation in the composition of the Chamber has become even stronger in the current legislature (XVII). Since none of the four coalitions now represented in the Parliament (left wing, right wing, the centrist Scelta Civica, SC, and the Movimento 5 Stelle, M5S) could form a government alone, alliances between parties belonging to different coalitions had to be sought, giving rise to heterogeneous parliamentary majorities. In this case, the reduced graph in Fig. 2 shows that, besides self-loops accounting for a tendency towards within-group cosponsorship, deputies from different right-wing parties collaborate with each other. Moreover, deputies from the centrist party SC collaborate with deputies belonging to the Centro Democratico party, CD, a left-wing party which is ideologically close to SC but belongs to a different political coalition. Further collaborations are detected between two left-wing parties (PD and the Sinistra Ecologia Libertà party, SEL) and between the mixed group and various parties. Apart from a collaboration with the mixed group, deputies from M5S do not seem to collaborate with any other party.

Fig. 2. Reduced graphs representing collaborations between parliamentary groups based on bill cosponsorship (the graphs display collaborations based on model (6), i.e. $\hat{\phi}_{rs} > 0$; marker symbols distinguish right-wing parliamentary groups, left-wing groups, centrist groups, the mixed group and M5S; node size is proportional to the productivity of each parliamentary group, $\hat{\alpha}_r$): (a) legislature XIV (2001–2006), with nodes AN, UDC, FI, LN, DS, Margh, mixed and RC; (b) legislature XV (2006–2008), with nodes AN, DC, FI, IDV, LN, mixed, PCI, RC, PD, Udeur, RNP, UDC and Verdi; (c) legislature XVI (2008–2013), with nodes FLI, PDL, IDV, LN, mixed, PD, P&T and UDC; (d) legislature XVII (2013–2015), with nodes AN, AP, FI, LN, mixed, M5S, PD, CD, SC and SEL

In short, our analysis of bill cosponsorship networks indicates the evolution from a highly polarized political arena, in which deputies based collaborations on their identification with left- or right-wing values, towards an increasingly fragmented parliament, where a rigid separation of political groups into coalitions does not seem to hold any more, and collaborations beyond the perimeter of coalitions have become possible.
One of the drivers of this change is probably a change of electoral law in 2005, which made it more difficult for coalitions of parties to obtain a majority in the Senate. This resulted in the premature end of legislature XV and in less stable parliamentary majorities in legislatures XVI and XVII. Our analysis of bill cosponsorships suggests that the pattern of collaborations between parliamentarians was also affected, inducing deputies to collaborate more frequently with deputies from different political coalitions.

6. Conclusion and discussion

Community affiliation can deeply affect social behaviour and the formation of relationships between individuals. In social network analysis, stochastic block models represent a popular approach to account for the effect of community membership on the creation of ties and to assess community structures.

In this paper, we have developed an extended stochastic block model for the analysis of bill cosponsorships in the Italian Parliament. This model retains the focus on relationships between pairs of blocks that characterize pure stochastic block models by including parameters for group productivity, $\alpha_r$, and interactions between pairs of groups, $\phi_{rs}$, but it also allows heterogeneity of units within a block. Because the number of parameters increases quadratically with the number of groups, we advocate the use of a penalized estimation approach to select a parsimonious model that displays relevant collaborations and repulsions between pairs of blocks. We represent these preferential relationships by means of reduced graphs displaying the relationships that exist between blocks.
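The construction of the two reduced graphs can be sketched as follows. This is an illustrative example rather than the authors' implementation: it assumes that the penalized estimates $\hat{\phi}_{rs}$ are available as a symmetric matrix in which coefficients shrunk to 0 are exactly zero, and the function name, group labels and numerical values are made up.

```python
import numpy as np


def reduced_graphs(phi_hat, groups):
    """Split estimated block interactions into collaboration and repulsion edges."""
    phi_hat = np.asarray(phi_hat, dtype=float)
    collaborations, repulsions = [], []
    for r in range(len(groups)):
        for s in range(r, len(groups)):  # s == r corresponds to a self-loop
            if phi_hat[r, s] > 0:
                collaborations.append((groups[r], groups[s]))
            elif phi_hat[r, s] < 0:
                repulsions.append((groups[r], groups[s]))
            # An estimate equal to 0 means indifference: no edge in either graph.
    return collaborations, repulsions


# Hypothetical groups and made-up estimates, for illustration only.
groups = ["G1", "G2", "G3"]
phi_hat = [
    [0.8, 0.3, -0.5],
    [0.3, 0.6, 0.0],
    [-0.5, 0.0, 0.0],
]
collaborations, repulsions = reduced_graphs(phi_hat, groups)
print(collaborations)
print(repulsions)
```

With unpenalized estimates almost no coefficient would be exactly 0, so the repulsion graph would reduce to the complement of the collaboration graph; the zeros produced by the penalty are what make the two graphs carry different information.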
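The two generalizations can be illustrated with a small sketch that evaluates the linear predictor of the directed model and maps it to an expected edge value under different response distributions. All names and numbers here are hypothetical, and the Gaussian case is only one possible choice of continuous distribution, picked for illustration.

```python
import math


def linear_predictor(theta0, rho, alpha_r, gamma_s, phi_rs, x_ij, beta):
    """eta_ij = theta0 + rho + alpha_r + gamma_s + phi_rs + x_ij . beta"""
    return (theta0 + rho + alpha_r + gamma_s + phi_rs
            + sum(x * b for x, b in zip(x_ij, beta)))


# Inverse link functions g^{-1} mapping the linear predictor to the mean.
INVERSE_LINKS = {
    "poisson": math.exp,                                    # log link, count-valued edges
    "bernoulli": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # logit link, binary edges
    "gaussian": lambda eta: eta,                            # identity link, real-valued edges
}

# Made-up parameter values for one ordered pair (i in r, j in s).
eta = linear_predictor(theta0=-1.0, rho=0.1, alpha_r=0.2, gamma_s=0.3,
                       phi_rs=0.4, x_ij=[1.0], beta=[0.5])
means = {family: inverse(eta) for family, inverse in INVERSE_LINKS.items()}
print(eta, means)
```

Only the last step changes across the three cases, which is why the block structure ($\alpha_r$, $\gamma_s$, $\phi_{rs}$) and the penalized estimation strategy carry over unchanged.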
We remark that our data analysis relies on bill cosponsorship networks that are aggregated over the span of each legislature (Briatte, 2016). This does not allow us to take into account possible changes in membership of parliamentary groups within a legislature, a practice (known as trasformismo) that is quite frequent in the Italian Parliament. For this reason, we have relied on the group memberships of each deputy as reported on the Web site of the Italian Chamber of Deputies (http://dati.camera.it).

In principle, our model is capable of handling this situation. If, for example, deputy $i$ has been a member of party $q$ for a timespan equal to $t_1$ and of party $r$ for $t_2$, the number of bills that they have cosponsored with deputy $j \in s$ is still a Poisson process:

$$N_{ij}(t_1 + t_2) = N_{ij}(t_1) + N_{ij}(t_2) \sim \mathrm{Poi}(\lambda_{qs} t_1 + \lambda_{rs} t_2).$$

Thus, availability of data disaggregated over time would allow us to cope with these changes in party membership, providing a more realistic account of this phenomenon. Furthermore, this would also enable us to model directly the interaction rates $\lambda_{ij}$ between deputies, which (as we pointed out in our comment on the results for legislature XVI) are unlikely to be constant across the legislature (because of both procedural issues and the changing political environment). In particular, it would make it possible to verify the hypothesis that most cosponsorships take place at the beginning of the legislature.

Acknowledgements

We acknowledge funding from the European Cooperation for Statistics of Network Data Science ('COST action CA15109'), supported by the European Cooperation in Science and Technology. We also thank a reviewer, whose suggestions and remarks have contributed to improve the paper.

References

Anderson, C. J., Wasserman, S. and Faust, K. (1992) Building stochastic blockmodels. Socl Netwrks, 14, 137–161.
Briatte, F. (2016) Network patterns of legislative collaboration in twenty parliaments. Netwrk Sci., 4, 266–271.
Bühlmann, P. and van de Geer, S. (2011) Statistics for High-dimensional Data: Methods, Theory and Applications. New York: Springer.
Chand, S. (2012) On tuning parameter selection of lasso-type methods - a Monte Carlo study. In Proc. 9th Int. Bhurban Conf. Applied Sciences and Technology, pp. 120–129. New York: Institute of Electrical and Electronics Engineers.
Dal Maso, C., Pompa, G., Puliga, M., Riotta, G. and Chessa, A. (2014) Voting behavior, coalitions and government strength through a complex network analysis. PLOS ONE, 9, no. 12, article e116046.
Desmarais, B. A. and Cranmer, S. J. (2012) Statistical inference for valued-edge networks: the generalized exponential random graph model. PLOS ONE, 7, no. 1, article e30136.
Fan, J. and Li, R. (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Statist. Ass., 96, 1348–1360.
Fan, Y. and Tang, C. Y. (2013) Tuning parameter selection in high dimensional penalized likelihood. J. R. Statist. Soc. B, 75, 531–552.
Fienberg, S. E. and Wasserman, S. (1981) Categorical data analysis of single sociometric relations. Sociol. Methodol., 12, 156–192.
Fowler, J. H. (2006) Connecting the Congress: a study of cosponsorship networks. Polit. Anal., 14, 456–487.
Frank, O. and Strauss, D. (1986) Markov graphs. J. Am. Statist. Ass., 81, 832–842.
Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983) Stochastic blockmodels: first steps. Socl Netwrks, 5, 109–137.
Holland, P. W. and Leinhardt, S. (1981) An exponential family of probability distributions for directed graphs. J. Am. Statist. Ass., 76, 33–50.
Kirkland, J. H. (2014) Ideological heterogeneity and legislative polarization in the United States. Polit. Res. Q., 67, 533–546.
Krivitsky, P. N. (2012) Exponential-family random graph models for valued networks. Electron. J. Statist., 6, 1100–1128.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd edn. London: Chapman and Hall.
Nelder, J. A. and Wedderburn, R. W. M. (1972) Generalized linear models. J. R. Statist. Soc. A, 135, 370–384.
Nowicki, K. and Snijders, T. A. B. (2001) Estimation and prediction for stochastic blockstructures. J. Am. Statist. Ass., 96, 1077–1087.
Parigi, P. and Sartori, L. (2014) The political party as a network of cleavages: disclosing the inner structure of Italian political parties in the seventies. Socl Netwrks, 36, 54–65.
Rocca, M. S. and Sanchez, G. R. (2007) The effect of race and ethnicity on bill sponsorship and cosponsorship in Congress. Am. Polit. Res., 36, 130–152.
Snijders, T. A. (2002) Markov chain Monte Carlo estimation of exponential random graph models. J. Socl Struct., 3, no. 2, 1–40.
Tibshirani, R. (1996) Regression shrinkage and selection via the lasso. J. R. Statist. Soc. B, 58, 267–288.
Wang, Y. J. and Wong, G. Y. (1987) Stochastic blockmodels for directed graphs. J. Am. Statist. Ass., 82, 8–19.
Wasserman, S. and Anderson, C. (1987) Stochastic a posteriori blockmodels: construction and assessment. Socl Netwrks, 9, 1–36.
Wit, E. and McCullagh, P. (2001) The extendibility of statistical models. Contemp. Math., 287, 327–340.
Zou, H. (2006) The adaptive lasso and its oracle properties. J. Am. Statist. Ass., 101, 1418–1429.

Supporting information

Additional 'supporting information' may be found in the on-line version of this article: 'Web-based supporting materials for: "A penalized inference approach to stochastic blockmodelling of community structure in the Italian Parliament"'.
Web-based supporting materials for: "A penalized inference approach to stochastic blockmodelling of community structure in the Italian Parliament"

Mirko Signorelli (1,2,3) and Ernst C. Wit (1)

1 Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen (NL)
2 Department of Statistical Sciences, University of Padova (IT)
3 Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre (NL)

Table 1: An overview of Simulations A-D. In Simulation A, we consider a dense model (i.e., with high dimensionality $h$) with a moderate betamin condition imposed on the non-null $\phi_{rs}$ coefficients ($|\phi_{rs}| \geq c_{\min}$). We progressively increase the sparsity of the model in Simulations B and C. In Simulation D we consider a model with medium sparsity level (like the one in Simulation B), but we make signal detection harder by imposing a milder betamin condition.

| Simulation | $\#(\phi_{rs} = 0)$ | $h$ | Betamin condition |
|---|---|---|---|
| A | 10 | 45 (dense) | $c_{\min} = 0.2$ (moderate) |
| B | 20 | 35 (medium) | $c_{\min} = 0.2$ (moderate) |
| C | 30 | 25 (sparse) | $c_{\min} = 0.2$ (moderate) |
| D | 20 | 35 (medium) | $c_{\min} = 0.1$ (mild) |

[Figure 1: four panels (Simulations A, B, C and D), each plotting accuracy against n (from 0 to 500) for max, CV, AIC, BIC, GIC and MBIC.]

Figure 1: Results of Simulations A-D. Comparison of the accuracy of models chosen by 10-fold cross-validation (CV), Akaike's Information Criterion (AIC), Bayesian Information Criterion (BIC), the Generalized Information Criterion (GIC) of Fan and Tang (2013) and the modified BIC (MBIC) of Chand (2012) with the maximum achievable accuracy (MAX). Every criterion quickly achieves the maximum accuracy in Simulation A, where we consider a model with few null $\phi_{rs}$. In Simulations B, C and D, instead, BIC and GIC outperform CV, AIC and MBIC: this is particularly apparent when a sparser model is considered (Simulation C), or when signal detection is made harder by the imposition of a milder betamin condition (Simulation D).

Table 2: Parameter estimates of regional effects (reference mode: Lombardia). Note that most of the effects are shrunk to zero and that, overall, the effects are not constant over time.

| Constituency | XIV | XV | XVI | XVII |
|---|---|---|---|---|
| Abruzzo | 0 | 0 | 0.240 | 0 |
| Valle d'Aosta | 0.750 | -0.262 | -0.274 | 0.373 |
| Basilicata | 0.146 | 0 | 0 | 0.313 |
| Calabria | -0.058 | 0 | 0.302 | 0 |
| Campania | 0 | 0 | 0.020 | -0.289 |
| Emilia Romagna | 0 | 0 | 0.103 | -0.173 |
| Friuli Venezia Giulia | 0 | 0 | 0 | 0 |
| Lazio | 0.113 | 0 | 0 | 0 |
| Liguria | -0.303 | 0 | 0 | -0.358 |
| Marche | 0.029 | 0 | -0.185 | 0 |
| Molise | -0.426 | 0.382 | 0.531 | 0 |
| Piemonte | 0.115 | 0 | -0.016 | 0 |
| Puglia | 0 | 0 | 0.221 | 0 |
| Sardegna | 0 | 0 | 0 | 0 |
| Sicilia | 0 | 0 | 0.132 | -0.240 |
| Toscana | 0 | 0.098 | 0.019 | -0.283 |
| Trentino Alto Adige | -0.028 | 0 | 0.332 | -0.260 |
| Umbria | 0 | -0.167 | 0.398 | 0 |
| Veneto | 0.126 | 0 | -0.281 | 0 |
| Residents abroad | - | 0.390 | 0.705 | 0 |

[Figure 2: four panels. XIV legislature (2001–2006): AN, UDC, FI, LN, DS, Margh, mixed, RC. XV legislature (2006–2008): AN, DC, FI, IDV, LN, mixed, PCI, RC, PD, Udeur, RNP, UDC, Verdi. XVI legislature (2008–2013): FLI, PDL, IDV, LN, mixed, PD, P&T, UDC. XVII legislature (2013–2015): AN, AP, FI, LN, mixed, M5S, PD, CD, SC, SEL.]

Figure 2: Reduced graphs representing repulsions between parliamentary groups based on bill cosponsorship. The graphs display repulsions (i.e., $\hat{\phi}_{rs} < 0$) between parliamentary groups. White squares denote right-wing parliamentary groups, white circles left-wing groups and dark grey squares centrist groups. A dark grey circle denotes the mixed group, whereas a light grey circle denotes the Movimento 5 Stelle. Node size is proportional to the productivity of each parliamentary group ($\hat{\alpha}_r$).
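The information criteria compared in Figure 1 trade goodness of fit against model size. As a minimal sketch of how BIC can pick a tuning parameter along a penalization path, the example below uses the standard definition BIC = -2 log-likelihood + log(n) x df. It assumes that the degrees of freedom are approximated by the number of non-zero coefficients, a common convention for lasso-type estimators that is not stated in the text, and all numbers are made up.

```python
import math


def select_by_bic(log_likelihoods, n_nonzero, n_obs):
    """Return the index of the candidate minimising BIC, together with all BIC values."""
    bics = [
        -2.0 * loglik + math.log(n_obs) * df  # fit term plus complexity penalty
        for loglik, df in zip(log_likelihoods, n_nonzero)
    ]
    best = min(range(len(bics)), key=bics.__getitem__)
    return best, bics


# Made-up path of three candidate models, from heavily to lightly penalized.
log_likelihoods = [-520.0, -500.0, -498.0]
n_nonzero = [5, 8, 15]  # assumed degrees of freedom of each candidate
best, bics = select_by_bic(log_likelihoods, n_nonzero, n_obs=1000)
print(best, bics)
```

In this toy path the densest candidate improves the log-likelihood only marginally, so the log(n) penalty leads BIC to prefer the intermediate model.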
