Evolution of Ego-networks in Social Media with Link Recommendations

Luca Maria Aiello
Nokia Bell Labs, Cambridge, United Kingdom
luca.aiello@nokia-bell-labs.com

Nicola Barbieri
Tumblr, New York, NY, USA
barbieri@yahoo-inc.com

ABSTRACT

Ego-networks are fundamental structures in social graphs, yet the process of their evolution is still widely unexplored. In an online context, a key question is how link recommender systems may skew the growth of these networks, possibly restraining diversity. To shed light on this matter, we analyze the complete temporal evolution of 170M ego-networks extracted from Flickr and Tumblr, comparing links that are created spontaneously with those that have been algorithmically recommended. We find that the evolution of ego-networks is bursty, community-driven, and characterized by subsequent phases of explosive diameter increase, slight shrinking, and stabilization. Recommendations favor popular and well-connected nodes, limiting the diameter expansion. With a matching experiment aimed at detecting causal relationships from observational data, we find that the bias introduced by the recommendations fosters global diversity in the process of neighbor selection. Last, with two link prediction experiments, we show how insights from our analysis can be used to improve the effectiveness of social recommender systems.

Keywords

Network evolution, ego-networks, link recommendation, groups, communities, social media, Tumblr, Flickr

1. INTRODUCTION

Ego-centric social networks (ego-networks) map the interactions that occur between the social contacts of individual people. Because they provide the view of the social world from a personal perspective, these structures are fundamental information blocks to understand how individual behaviour is linked to group life and societal dynamics.
Despite the growing availability of interaction data from online social media, little research has been conducted to unveil the structure and evolutionary dynamics of ego-networks [4, 30]. In an online context, people expand their social circles also as a result of automatic recommendations that are offered to them, which makes it harder to disentangle spontaneous user behavior from algorithmically-induced actions.

WSDM 2017, February 06 - 10, 2017, Cambridge, United Kingdom. © 2017 Copyright held by the owner/author(s). Publication rights licensed to ACM. ISBN 978-1-4503-4675-7/17/02. DOI: http://dx.doi.org/10.1145/3018661.3018733

We aim to provide an all-round description of how ego-networks are formed and how automated contact recommendations might bias their growth. We do so by analyzing the full longitudinal traces of 170M ego-networks from Flickr and Tumblr (§3), answering several open research questions about the shape of their boundaries, their community structure, and the process of neighbor selection in time (§4). The richness of the data we study allows for the identification of those Tumblr links that have been created as a result of recommendations served by the platform, which puts us in a unique position to investigate the impact of link recommender systems on the process of network growth.
Some of our key findings are:

• The backbone of a typical ego-network is shaped within the initial month of a node's activity and within the first ∼50-100 links created. In that period, new contacts are added in larger batches and the main communities emerge. Unlike global social networks, whose diameter shrinks in time, the average distance between nodes in ego-networks expands rapidly and then stabilizes.

• The selection criteria of new neighbors change as new contacts are added, with popular contacts being more frequently followed in earlier stages of the ego's life, and friends-of-friends being selected in later stages. The neighbor selection is also heavily driven by the ego-network's community structure, as people tend to grow different sub-groups sequentially, with an in-depth exploration strategy.

• The link recommender system skews the process of ego-network construction towards more popular contacts while at the same time restraining the growth of its diameter, compared to spontaneous behavior. With a matching experiment aimed at detecting causal relationships from observational data, we find that the bias introduced by the recommendations fosters diversity: people exposed to recommendations end up creating pools of contacts that are more different from each other compared to those who were not exposed.

The outcomes of our analysis have theoretical implications in network science and find direct application in link recommendation and prediction tasks. We run a prediction experiment (§5) to show that simple temporal signals could be crucial features to improve link prediction performance, as the criteria of ego-network expansion vary as the ego grows older. In a second experiment, we test the algorithmic capability to tell apart spontaneous links from recommendation-induced ones.
This ability opens up the way to train link predictors that mitigate existing algorithmic biases by suggesting links whose properties better adhere to the natural criteria that people follow when connecting to others.

2. RELATED WORK

Structure and dynamics of social networks. For decades, network science research has explored extensively the structural and evolutionary properties of online social graphs and of the communities they encompass [24, 11, 31, 7, 41, 39, 54, 55, 49], unveiling universal patterns of their dynamics. Individual connectivity and activity are broadly distributed [39, 51]; the creation of new links is driven by reciprocation, preferential attachment [38], triangle closure [33], and homophily [2, 58]. Globally, the number of edges in a social network grows superlinearly with its number of nodes, and the average path length shrinks with the addition of new nodes [34], after an initial expansion phase [1]. The regular patterns that drive the link creation process have enabled the development of accurate methods for link prediction and recommendation [28] based on either local [35] or global structural information [9, 8, 45]. Fine-grained temporal traces of user activity in online social platforms opened up new avenues to investigate in detail the impact of time on network growth [60]. For example, the relationship between the node age and its connectivity has been measured in several online social graphs including Flickr [33, 57].

Ego-networks. To date, not much research has been conducted on how nodes build their local social neighborhoods in time. Research done by Arnaboldi et al. has looked into ego-networks of online-mediated relationships including the Facebook friendship network [4] and the Twitter follow graph [5, 3], as well as professional relationships such as Google Scholar's co-authorship network [6].
Using community detection, hierarchical clusters are discovered, in agreement with Robin Dunbar's theory on the hierarchical arrangement of social ego-circles [59]. Similar findings have been confirmed by independent studies on the Facebook network [19]. In the attempt to compare the properties of the global network with those of ego-networks, recent studies found that local structural attributes are characterized by local biases [27] that are direct implications of the friendship paradox [21]. Multiple techniques have been proposed to discover social or topical sub-groups within ego-networks [53, 37, 40, 13], but with little attention to the dynamics of their growth. Kikas et al. conducted one of the few studies touching upon the temporal evolution of ego-networks, using a dataset of Skype contacts [30]. They find that most edges are added in short bursts separated by long inactivity intervals.

Figure 1: Tumblr's contact recommender system.

Effect of social recommender systems. In the past years, computer scientists developed increasingly effective contact recommender systems for online social media [26]. Only recently has the community adopted a more critical standpoint with respect to the effects that those recommendations may have on the collective user dynamics. Algorithms based on network proximity are better suited to find contacts that are already known by the user, whereas algorithms based on similarity of user-generated content are stronger at discovering new friends [16]. Surveys administered to members of corporate social networks revealed that contact recommendations with a high number of common neighbors are usually well-received [17].
Recommendation-induced link creations have a substantial effect on the growth of the social graph; for example, the introduction of the "people you may know" service in Facebook increased considerably the number of links created and the ratio of triangle closures [60]. A recent study on Twitter compared the link creation activity before and after the introduction of the "Who To Follow" service [48], showing that popular nodes are those who most benefit from recommendations.

On a wider perspective, the debate around recommender systems fostering or limiting access to novel information is still open. On one hand, recommenders may originate a filter bubble effect by providing information that increasingly reinforces existing viewpoints [42]. On the other hand, recent research has pointed out that individual choices, more than the effect of algorithms, limit exposure to cross-cutting content [10]. It has been argued that recommender systems have a limited effect in influencing people's free will. Observational studies on Amazon found that 75% of click-throughs on recommended products would likely have occurred also in the absence of recommendations [44]. In the context of link recommendation systems, a key open question is how they affect ego-network diversity.

3. DATASET AND PRELIMINARIES

We study two social media platforms that differ in both scope and usage. The data includes only interactions between users who voluntarily opted-in for research studies. All the analysis has been performed in aggregate and on anonymized data.

3.1 Tumblr

Tumblr is a popular social blogging platform. The types of user-generated content range from simple textual messages to multimedia advertising campaigns [15, 25]. Users might own multiple blogs, but for the purpose of this study we consider blogs as users, and we will use the two terms interchangeably.
Users receive updates from the blogs they follow; the following relationship is directional and might not be reciprocated. In Tumblr, 326 million blogs and 143 billion posts have been created¹ since the release of the platform to the public in July 2007. We extracted a large random sample of the social network in October 2015, which includes almost 7B follow links created between 130M public blogs over approximately 8 years. All the social links are marked with the exact timestamps of their creation, which allows for a fine-grained longitudinal analysis of the network evolution.

¹ https://www.tumblr.com/about (Dec 2016)

Figure 2: Degree distributions (in and out) in Tumblr ($\mu_{in} = 108$, $\mu_{out} = 58$) and Flickr ($\mu_{in} = 19$, $\mu_{out} = 21$).

In October 2012, Tumblr launched a new version of its recommended blogs feature. On the web interface, a shortlist of four recommended blogs is displayed in a panel next to the user's feed (Figure 1). Users can get more recommendations by clicking on "explore". At every page refresh, the shortlist may change according to a randomized reshuffling strategy that surfaces new recommended contacts from the larger pool. Tumblr's link recommendation algorithm is not publicly disclosed, but it considers a mixture of two signals: topical preferences and network structure. The user's tastes are estimated since the onboarding phase, in which registrants are asked to indicate their preference on a set of pre-determined topics organized in a taxonomy (e.g., sports, football). The topical profile helps to overcome the cold-start problem. As the number of contacts grows, new blogs are recommended following the triangle closure (friend-of-a-friend) principle.
For all the links created after January 2015, we can reliably estimate if they have been created as an effect of a recommendation. This information is inferred by combining the log of recommendation impressions (i.e., when recommendations are visualized by the user) with the log of link creations. When the link is created shortly after the recommendation is displayed, we count the link creation as triggered by a recommendation.

3.2 Flickr

Flickr is a popular photo-sharing platform in which users can upload a large amount (up to 1 TB) of pictures and share them with friends. Users can establish directed social links by following other users and get updates on their activity. Since its release in February 2004, the platform has gathered almost 90 million registered members who upload more than 3.5 million new images daily². We collected a sample of the follower network composed of the nearly 40M public Flickr profiles that are opted-in for research studies and of the 500M+ links that connect them. Links carry the timestamp of their creation and they span approximately 12 years ending March 2016. Similar to Tumblr, Flickr has a contact recommendation module. However, we do not have access to recommendation data and we cannot measure their effect on the link creation process.

² http://www.theverge.com/2013/3/20/4121574/flickr-chief-markus-spiering-talks-photos-and-marissa-mayer

Figure 3: Number of links, triangle-closing links, and recommended links (Tumblr only) created each day, by the nodes in our sample, during the whole lifespan of the platforms.

3.3 Concepts and notation

Graph and ego-network. Consider a follower graph $G$ composed of a set of nodes $N$ and a set of directed edges³ $E \subseteq N \times N$.
When building the follower graph, we draw an edge from node $i$ to node $j$ if $i$ has followed $j$ at any time. The ego-network $G_i$ of node $i \in N$ is the subgraph induced by $i$'s out-neighbors $\Gamma_{out}(i)$ [22]. Formally: $G_i = (N_i, E_i)$, where $N_i = \Gamma_{out}(i)$, and $E_i = \{(j, l) \in E \mid j \in \Gamma_{out}(i) \wedge l \in \Gamma_{out}(i)\}$. Note that the ego-network does not include the links between the ego $i$ and its neighbors.

Structural graph metrics in time. The temporal trace of link creations allows us to build a time graph [32] and to recover the structural properties of nodes and links at any point in time. The superscript $t$ applied to any indicator means that the metric refers to a snapshot of the graph at time $t$. For example, the neighbor set of node $i$ at time $t$ is denoted as $\Gamma^t(i)$ and its degree as $k^t(i)$. When studying the evolution of ego-networks in isolation, we will consider time on a discrete scale where each event corresponds to the $n$-th node being added to the ego-network. We will use the letter $n$ to denote time passing on this discrete scale. All the graphs we consider are directed, so we use the definition of triangle closure adapted to directed graphs [43]: a new link created between $i$ and $j$ at time $t$ closes a directed triangle if $\exists\, l \in N \mid l \in \Gamma^t_{out}(i) \wedge l \in \Gamma^t_{in}(j)$.

Spontaneous vs. recommended links. We distinguish links that are created as an effect of a recommendation from those that are not. We call the links in the first group recommended and the ones in the latter spontaneous.

³ We will use the terms directed edges, edges, and links interchangeably to indicate a directional connection between nodes.

3.4 Data overview

The (in/out)degree distributions together with their average values ($\mu$) are shown in Figure 2. As expected, all distributions are broad, with values spanning several orders of magnitude. The out-degree distribution in Tumblr is capped
at 5000 because the platform imposes an upper bound on the number of blogs a user can follow. On average, Tumblr users are more connected than Flickr users, with average in- and out-degree roughly 5 and 3 times larger than Flickr, respectively.

Figure 4: Left: average number of ego-network links vs. number of nodes; the best fitting power-law exponent ($\gamma = 1.87$) is reported as reference. Right: average network distance after the $n$-th node is added; the black dotted line is obtained considering only Tumblr ego-networks with at least 1000 nodes.

Figure 3 plots the number of links created over the course of the platforms' life. Both networks have experienced a noticeable growth. For Tumblr, we can plot the time series of recommended link creations that occurred after January 2015. We also calculate the set of links that close at least one triangle. In Tumblr, the first sharp increase in the number of triangle-closing links is found between 2012 and 2013. That is determined by the introduction of a new link recommender system. A similar pattern has been observed in Facebook after the introduction of the "people you may know" module [60]. About 27% of recommended links do not close any triangle: those are recommendations based on the user's topical profile only.

4. EVOLUTION OF EGO-NETWORKS

4.1 Diameter and connected components

The growth of social networks is associated with three changes in their macroscopic structure: densification, diameter shrinking, and inclusion of almost all nodes in a single giant connected component [34]. It is unknown whether the same properties hold at ego-network level.

Q1: How do density, diameter, and component structure evolve as the ego-network grows?
Like global networks, ego-networks become denser in time. Ego-networks obey a densification power law, for which the number of links scales superlinearly with the number of nodes: $|E_i| \sim |N_i|^{\gamma}$ (Figure 4, left). The exponent that best defines the scaling in both platforms is $\gamma = 1.87$.

More surprisingly, densification does not always lead to the emergence of a single giant connected component covering the whole graph. On average⁴, the largest component's size relative to the network size grows as new nodes join (Figure 5, left), but it stabilizes around 0.8 for networks of 200 nodes or more. The number of components grows sublinearly with the number of nodes (not shown). More notably, the diameter shows little sign of shrinking. The network distance, computed as the average distance between all pairs of nodes⁵, experiences a three-phase evolution (Figure 4, right). First, at the beginning of the ego-network life, it expands rapidly (exploration); then, it starts shrinking slightly (consolidation) before asymptotically converging to a stable value (stabilization). This trend is very different from the sharp diameter decline that characterizes social graphs.

⁴ Results are qualitatively similar when considering the median.

⁵ Computed on an undirected version of the graph. Similar results are obtained using diameter or effective diameter.

Figure 5: Left: average ratio of nodes in the giant weakly connected component (GCC) as new nodes are added to the ego-network. Right: probability that the $n$-th node included in the ego-network spawns a new disconnected component, computed for spontaneous and recommended nodes.
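The quantities tracked in this subsection (ego-network links, ratio of nodes in the largest weakly connected component, and average network distance) can be illustrated with a minimal sketch. The function names and the toy edge list below are hypothetical and not taken from the paper. The sketch also assumes that the average distance is taken over connected pairs only, since the text does not state how unreachable pairs are handled.

```python
from collections import deque


def ego_network(edges, ego):
    """Subgraph induced by the ego's out-neighbors (the ego itself is excluded)."""
    out = {}
    for src, dst in edges:
        out.setdefault(src, set()).add(dst)
    nodes = out.get(ego, set()) - {ego}
    links = {(j, l) for j in nodes for l in out.get(j, ()) if l in nodes and l != j}
    return nodes, links


def undirected_adjacency(nodes, links):
    """Drop edge direction, as done for the distance and component analysis."""
    adj = {n: set() for n in nodes}
    for j, l in links:
        adj[j].add(l)
        adj[l].add(j)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def gcc_ratio(adj):
    """Fraction of nodes in the largest (weakly) connected component."""
    seen, largest = set(), 0
    for n in adj:
        if n not in seen:
            component = bfs_distances(adj, n)
            seen.update(component)
            largest = max(largest, len(component))
    return largest / len(adj) if adj else 0.0


def average_distance(adj):
    """Mean hop distance over connected pairs only (an assumption of this sketch)."""
    total = pairs = 0
    for n in adj:
        for m, d in bfs_distances(adj, n).items():
            if m != n:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy follower graph: "ego" follows a, b, c, d.
edges = [("ego", "a"), ("ego", "b"), ("ego", "c"), ("ego", "d"),
         ("a", "b"), ("b", "c"), ("d", "ego"), ("a", "x")]
nodes, links = ego_network(edges, "ego")
adj = undirected_adjacency(nodes, links)
ratio = gcc_ratio(adj)          # 0.75: a, b, c are connected, d is isolated
avg_dist = average_distance(adj)
```

Repeating these measurements after each node addition, and averaging across ego-networks of the same size, would give curves of the kind summarized in Figures 4 and 5. The all-pairs breadth-first search is quadratic in the ego-network size, so it is only meant for small illustrative inputs.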
We speculate that the consolidation phase might be connected with the intrinsic human limitation to maintain large social groups, as theorized by Robin Dunbar [20]. When the ego's social neighborhood exceeds the size that is cognitively manageable by a person (roughly, 150 to 200 individuals), a compensation effect might be triggered: new contacts are no longer sought further away from the social circles that have been already established, putting an end to the exploration phase. This happens at $n = 140$ in Flickr and $n = 190$ in Tumblr, values that are compatible with Dunbar's theory.

The addition of recommended nodes has a different effect on the ego-network expansion, compared to spontaneous ones. New recommended contacts tend to be closer to existing ego-network members. At fixed network size, the addition of a new recommended node increases the network distance 5% less, on average, than a spontaneous node addition. Ego-networks with at least one recommended node have smaller network distance than ego-networks that grew fully spontaneously; this difference varies with the size, from only 2% smaller for networks under 50 nodes up to 10% smaller for networks with 200 nodes or more. Also, spontaneous nodes have far higher chances, compared to recommended ones, to spawn a new component disconnected from the rest of the ego-network (Figure 5, right).

Accounting for amalgamation effects. To explore evolutionary trends of ego-networks, we rely on aggregate analysis: an indicator is measured on an ego-network when its $n$-th node is added and then it is averaged across all ego-networks. Trends are discovered as $n$ grows (e.g., diameter in Figure 4, right). This approach may yield misleading results because averages computed at different values of $n$ are obtained from different sample sets. This problem is known as Simpson's Paradox [46] and it is usually addressed by fixing the sample set [12].
To account for it, every time we perform an evolutionary analysis as $n$ varies in $[1, n_{max}]$, we compare results obtained in two settings: the first using the full dataset and the latter considering only the subset of ego-networks that reached at least size $n_{max}$. The results are only slightly different across the two settings, for all the indicators analyzed. For the sake of brevity, we report just one example of such comparison. In Figure 4 right, the diameter evolution for Tumblr ego-networks that reached at least size 1000 is very similar to the trend found when all ego-networks are considered.

Figure 6: Distribution of nodes' popularity (indegree) and similarity with ego (common neighbors) at the time of their inclusion in the ego-network, for Tumblr, Tumblr Recs, and Flickr.

Figure 7: Average similarity and popularity indicators of the $n$-th node added to the ego-network: common neighbors between the ego and the newly added nodes, Jaccard similarity between their neighbor sets, preferential attachment indicator ($k_{out}(ego) \cdot k_{in}(newnode)$), and indegree of the new node.

4.2 Popularity vs. similarity

The process of link creation in online social networks is driven by two main factors: popularity (that leads to preferential attachment [38]) and similarity (that leads to homophily [2]). At network scale, their relative weight in predicting the creation of new links might vary depending on the type of social network [28].
At microscopic scale, it is still unclear how popularity and similarity impact the selection of new nodes in ego-networks, and how their relative importance varies in time.

Q2: How do the criteria of neighbor selection change as the ego-network grows?

We select two simple (yet widely-used) proxies of popularity and similarity. Given an ego $i$ who has added $j$ as its neighbor at time $t$, we consider the alter's indegree $k^t_{in}(j)$ as an indicator of its popularity and the number of common neighbors between the ego $i$ and the alter $j$, $CN^t(i, j) = |\Gamma^t_{out}(i) \cap \Gamma^t_{in}(j)|$, as a measure of similarity. Drawing the distributions of $k^t_{in}$ and $CN^t(i, j)$ (Figure 6), we observe that the range of values is very broad in both platforms. The $CN$ distributions suffer from cut-offs (around $CN = 200$) caused by the scarcity of nodes with hundreds of common neighbors or more. Recommended Tumblr nodes yield distributions that are skewed towards higher values because the recommender picks by design those profiles that are popular and well-connected to the ego's neighbors.

Figure 8: Distributions of batch size ($s_b$) and batch interarrival time ($\tau_b$, in hours) for Tumblr, Tumblr Recs, and Flickr. Best fitting power law exponents are reported as reference ($\gamma = 2.2$ and $\gamma = 2.45$ for batch size, $\gamma = 0.4$ for interarrival time).

Figure 9: Batch size as a function of time, measured as the number of batches created or as the ego's age measured in number of days. The Tumblr Recs curves summarize the trend for batches containing at least one recommended node.
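The popularity and similarity proxies defined in this subsection, $k^t_{in}(j)$ and $CN^t(i, j)$, can be sketched as follows. The function names and the toy link trace are hypothetical. The sketch assumes that the snapshot at time $t$ contains the links created strictly before $t$, a boundary choice the text does not specify. With the definitions of Section 3.3, a link closes a directed triangle exactly when $CN^t(i, j) > 0$, and the preferential attachment indicator follows the product form given in the caption of Figure 7.

```python
def snapshot(timed_edges, t):
    """Out- and in-neighbor sets built from links created strictly before time t.

    Using a strict inequality is an assumption of this sketch.
    """
    out_nb, in_nb = {}, {}
    for src, dst, ts in timed_edges:
        if ts < t:
            out_nb.setdefault(src, set()).add(dst)
            in_nb.setdefault(dst, set()).add(src)
    return out_nb, in_nb


def selection_indicators(timed_edges, ego, alter, t):
    """Indicators of the alter at the moment the ego follows it."""
    out_nb, in_nb = snapshot(timed_edges, t)
    ego_out = out_nb.get(ego, set())
    alter_in = in_nb.get(alter, set())
    common = ego_out & alter_in  # nodes the ego follows that also follow the alter
    return {
        "indegree": len(alter_in),                        # popularity proxy
        "common_neighbors": len(common),                  # similarity proxy CN
        "closes_triangle": bool(common),                  # directed triangle closure
        "pref_attachment": len(ego_out) * len(alter_in),  # k_out(ego) * k_in(alter)
    }


# Toy trace of (follower, followed, timestamp); "ego" follows "j" at time 6.
trace = [("ego", "a", 1), ("ego", "b", 2), ("a", "j", 3),
         ("b", "j", 4), ("x", "j", 5), ("ego", "j", 6)]
indicators = selection_indicators(trace, "ego", "j", t=6)
# {'indegree': 3, 'common_neighbors': 2, 'closes_triangle': True, 'pref_attachment': 6}
```

Rebuilding the snapshot for every link is wasteful on real traces; an incremental pass over time-sorted links would be the natural choice at scale, but the per-call version keeps the definitions easy to read.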
As new nodes are added to the ego-network, the number of their common connections with the ego naturally increases (and so does the preferential attachment indicator, as expected). The Jaccard similarity between their neighbor sets oscillates, increasing when the ego-network's size is in the interval $[100, 1000]$ and decreasing otherwise. The popularity of new ego-network members, computed as their indegree in the social network, decreases as the ego-network grows. A summary of all the indicators is given in Figure 7.

All the trends are similar, yet shifted towards higher values, when considering recommended nodes only. In short, recommended nodes tend to share more contacts with the ego and to be more popular, which corroborates previous observations about link recommendations being beneficial mostly to popular nodes [48]. The indicator that differs the most is the Jaccard similarity, which increases monotonically with $n$ for recommended contacts.

4.3 Temporal activity

Creation of links is not uniform in time. Previous literature found evidence that, globally, the creation of links happens in bursts [32, 30]. At a local level, we aim to learn how often and in which phases of the ego-network life users select new neighbors.

Q3: When do ego-networks expand?

To measure how much node additions to an ego-network are concentrated in short periods of time, we resort to basic session analysis to group together temporally-contiguous events. As is standard practice in the analysis of browsing behaviour [47], we split user sessions by timeout: a session starts when a new node is added to the ego-network and ends when no other node has been added for 25 minutes. We call batch a set of nodes added in a single session. We compute the average batch size $s_b$ and the session interarrival time $\tau_b$, namely the time (hours) elapsed from the session's end to the next session's start.

Figure 10: Average portion of links created in the first 100 days of the ego's life, relative to the final ego-network size, for Tumblr, Tumblr Recs, and Flickr. Only nodes who have created links for at least 6 months are considered.

The process of batch creation is bursty when i) there are strong temporal heterogeneities in the interarrival time, and ii) consecutive link creations are not independent events. A standard practice to assess those conditions is to measure the decay of the probability density functions for $s_b$ and $\tau_b$: power law decays in the form $P(x) \sim x^{-\gamma}$ indicate burstiness [29]. The distribution of batch size $s_b$ follows a power-law trend, with exponents 2.2 and 2.45 in Tumblr and Flickr, respectively (Figure 8, left). In Flickr, the size scales freely as there are no boundaries preventing the addition of any number of contacts. In Tumblr, we observe a sharp cutoff at 200 as the service policy enforces a maximum limit of 200 link creations per user per day. The decay of $\tau_b$ is similar on both platforms, with initial intervals fitting a power-law with exponent $\gamma = 0.4$, followed by exponential cutoffs due to the finite time window (Figure 8, right).

If we consider only sequences of batches containing at least one recommended link, we see that recommendations are associated with the creation of fewer links per session, but at a higher rate. The average batch size is 2.18 in Flickr and 2.67 in Tumblr; Tumblr batches with recommended links are 12% smaller (average size 2.35). The median interarrival time is relatively high in both platforms (12 days in Tumblr, 2 weeks in Flickr) but only 5 days for pairs of consecutive batches containing recommended links.
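The timeout-based session splitting described in this subsection can be sketched in a few lines. The function names and the sample timestamps are hypothetical. Two boundary choices are assumptions of the sketch rather than details stated in the text: a gap of exactly 25 minutes opens a new session, and a session's end is taken to be the time of its last link creation.

```python
from datetime import datetime, timedelta


def split_into_batches(timestamps, timeout=timedelta(minutes=25)):
    """Group link-creation times into sessions separated by gaps >= timeout."""
    batches = []
    for ts in sorted(timestamps):
        if batches and ts - batches[-1][-1] < timeout:
            batches[-1].append(ts)   # still within the current session
        else:
            batches.append([ts])     # inactivity gap reached: new session
    return batches


def interarrival_hours(batches):
    """Hours between the last event of a batch and the first event of the next."""
    return [(nxt[0] - cur[-1]).total_seconds() / 3600
            for cur, nxt in zip(batches, batches[1:])]


# Link creations of one ego, as offsets in minutes from an arbitrary start.
start = datetime(2015, 1, 1, 12, 0)
events = [start + timedelta(minutes=m) for m in (0, 10, 30, 180, 185)]
batches = split_into_batches(events)
sizes = [len(b) for b in batches]   # [3, 2]
gaps = interarrival_hours(batches)  # [2.5]
```

Collecting `sizes` and `gaps` over all egos would give the empirical distributions of $s_b$ and $\tau_b$ whose decay is inspected to assess burstiness.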
No causal claim connecting recommendations and rate of link creation can be made, as a number of confounding factors could influence this trend (e.g., users who are more active might more naturally engage with recommendations). However, this result provides partial evidence that recommendations might contribute to altering the natural time scale of link creation.

The average batch size decreases as time passes and the ego-network grows (Figure 9). After the first 30 days (or the first 20-30 batches created), the batch size stabilizes around 2. A similar decreasing trend is also found for the interarrival time $\tau_b$ (not shown). This suggests that nodes tend to build most of their ego-network in the first stages of their life. To confirm that, we compute, for each of the first 100 days of the ego's life, the average ratio of nodes added on that day relative to the final ego-network size. We only consider users whose link creation activity spans at least 6 months, to avoid biases introduced by users with short lifespan. As expected, a large share of nodes is typically added in the first days of activity (Figure 10). This finding adds nuance to previous work on temporal graphs. Studies on Flickr using a coarser temporal granularity found that the raw number of new links created by the ego is uniform over time [33]. Here we find that the uniform trend starts only after an initial spike of link creations.

Figure 11: Left: distribution of the number of communities in ego-networks. Right: average size of communities as they appear in the ego-network; 95% confidence intervals are shown.
4.4 Community formation

Ego-networks have a clear community structure because people tend to interact with multiple social circles (e.g., school friends, family members) that are typically weakly connected to one another [37]. We ask about the role of these communities in the graph evolution.

Q4: Is the ego-network growth driven by the boundaries of its communities?

The ego-network may follow a depth-first expansion pattern with respect to communities, in which the ego preferentially connects to nodes belonging to one community before exploring others. In Flickr, for example, a person could first follow all the accounts of family members and then those of a photography club. Alternatively, the ego-network may expand either breadth-first, picking new nodes in a round-robin fashion, or regardless of the community structure. In social network analysis research, we know of little evidence in support of any of these scenarios. In the context of web navigation and search, in-depth exploration of content is often most effective and cognitively more natural [18, 50]; we hypothesize that the same holds for community exploration.

Measuring the extent to which any of these scenarios reflects people's behaviour is challenging, as in reality communities can overlap and change their boundaries as the graph grows. Leaving more advanced measurements for future work, we assume a static, hard partitioning of nodes into communities. For all ego-networks, we compute non-overlapping communities (see footnote 6) at time $t = T_{end}$ (the most recent snapshot in our data), heuristically filtering out small ego-networks with fewer than 5 members. Ego-networks are often composed of few communities, rarely more than 10 (Figure 11, left). To assess to what extent communities emerge over time in an orderly fashion, we rank them by the time the ego has created connections with their nodes.
Specifically, we first sort all the nodes by the time they are added to the ego-network (e.g., $[j_1, j_2, j_3, j_4]$). We then replace nodes with the communities they belong to (e.g., $[c_1, c_1, c_2, c_1]$), rank communities by the median position $p$ of their occurrences in that vector (e.g., $p(c_1) = 2$, $p(c_2) = 3$; $c_1$ is ranked first, $c_2$ is ranked last), and finally replace the communities with their respective ranks, thus obtaining a sequence of community ranks $R$ (e.g., $[1, 1, 2, 1]$). The intuition is that a community half of whose members have been added to the ego-network by time $t$ comes temporally before any other community whose majority of nodes is still outside the ego-network at the same time $t$. If the exploration of communities is purely in-depth, $R$ is fully sorted.

Footnote 6: We use the community detection algorithm by Waltman and Eck [52], which is an optimization of the Louvain method [14].

Figure 12: Average inversion score of ego-network communities as new nodes are added, compared to a randomized null model. (Panels: Tumblr, Flickr; x-axis: n; y-axis: inversion score; series: Real, Shuffled.)

Figure 13: Average likelihood (over random chance) that the $n$-th node in an ego-network belongs to the $k$-th community. (Panels: Tumblr, Flickr; x-axis: n; y-axis: likelihood over random; series: Comm. 1 to Comm. 5, random.)

The sortedness of a list $L = \{x_1, ..., x_m\}$ can be measured by its inversion score:

\[ \mathrm{inv}(L) = 1 - 2 \cdot \frac{|\{(x_i, x_j) \mid i < j \wedge x_i > x_j\}|}{\binom{|L|}{2}} \in [-1, 1] \]

A value of $\mathrm{inv} = 1$ indicates sortedness, $\mathrm{inv} = -1$ inverse ordering, and $\mathrm{inv} = 0$ randomness. In Figure 12 we plot the average inversion score of $R$ against the ego-network size.
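The ranking procedure and the inversion score can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, ties between median positions are broken by first appearance (an assumption), and lists with fewer than two elements are treated as sorted (also an assumption).

```python
from itertools import combinations
from statistics import median


def community_rank_sequence(communities_in_order):
    """Turn the time-ordered list of community labels into the rank sequence R."""
    positions = {}
    for pos, label in enumerate(communities_in_order, start=1):
        positions.setdefault(label, []).append(pos)
    # Rank communities by the median position of their occurrences.
    # Assumption: ties are broken by the position of the first occurrence.
    ordered = sorted(positions, key=lambda c: (median(positions[c]), positions[c][0]))
    rank = {label: r for r, label in enumerate(ordered, start=1)}
    return [rank[label] for label in communities_in_order]


def inversion_score(seq):
    """inv(L) = 1 - 2 * (#pairs i<j with x_i > x_j) / C(|L|, 2)."""
    n = len(seq)
    if n < 2:
        return 1.0  # assumption: trivially sorted
    inversions = sum(1 for a, b in combinations(seq, 2) if a > b)
    return 1 - 2 * inversions / (n * (n - 1) / 2)


R = community_rank_sequence(["c1", "c1", "c2", "c1"])
score = inversion_score(R)
```

On the example from the text, `R` is `[1, 1, 2, 1]`, which contains a single inversion out of six pairs, so the score is 1 - 2/6, roughly 0.67.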
To account for the community size heterogeneity, we compare it with a null model where the elements in $R$ are randomly reshuffled. The inversion score quickly stabilizes as the network grows, and its values are consistently higher (double or more) than the null model's, supporting the hypothesis that communities tend to be explored in depth, one after the other.

Further evidence can be provided by measuring the probability that the $n$-th node added to the ego-network belongs to the $k$-th community in the ranking, normalized by the probability in the null model; values higher than 1 indicate an above-chance likelihood of a node being in a given community. Figure 13 shows the average normalized likelihood of the first 50 nodes to belong to the first 5 communities in the ranking. Curves for increasing values of $k$ emerge above the randomness threshold one after the other, which backs the hypothesis of communities being explored in depth.

The size of a community (measured at time $T_{end}$) varies with its temporal rank (Figure 11, right). On average, people create increasingly larger communities up to the fifth one; from the sixth one on, newly added communities become smaller and smaller.

Figure 14: Matching experiment. Average entropy of the neighbor sets at step $k+1$ for groups sharing a matching sequence of length $k$. 95% confidence intervals are shown. (x-axis: matching sequence length (k); y-axis: entropy(neighbor set); series: Spontaneous, Recommended.)

4.5 Recommendations diversity

Our analysis shows that the statistical properties of recommended links are different from those of spontaneous ones. It is known that recommender systems positively affect user engagement, in terms of time spent, content consumption, and user contribution [23]. In agreement with established knowledge, we have found that users exposed to recommendations create more links, more frequently.
It is harder to assess whether recommendations foster or limit access to diverse types of content. The academic debate about recommendations being the bane or boon of social media is still very lively [42, 10, 44], with evidence brought in support of both views. We aim to provide further evidence to shed light on this point in the context of link recommenders.

Q5: Do link recommendations foster diversity?

It is hard to infer causality from observational data. Matching is a statistical technique that is used to evaluate the effect of a treatment on a dependent variable by comparing individuals who have received the treatment with others with similar observable features who did not receive it. The more similar the paired individuals and the higher the number of pairs, the higher the confidence in the estimated effect of the treatment on the dependent variable.

We conduct a matching experiment to measure whether people who follow recommendations end up having ego-networks more similar to one another than if they were to ignore recommendations. We arrange people in matching groups containing users who are nearly identical in terms of their local connectivity. We assign to the same matching group users i) who registered to Tumblr less than 30 days apart, and ii) whose very first $k$ neighbors are the same and have been added in the same order to their ego-networks. Each matching group is then split into two subgroups: a treatment group of users whose $(k+1)$-th contact is a recommended one, and a control group of all the remaining users, whose $(k+1)$-th contact has been created spontaneously. The variety of contacts created at step $k+1$ is our dependent variable. If the variety measured in one group is significantly different from the other, we can attribute the divergence to the effect of the recommendation, as the initial conditions of the two groups are virtually identical.
For example, if we found that the variety of nodes in the treatment group is lower than the one measured in the control group, we would conclude that recommendations make the process of link creation more uniform by inducing users to follow a more restricted set of accounts compared to what would happen through spontaneous user behavior. We measure the diversity of contacts through their entropy. To ensure a fair comparison that accounts for size heterogeneity, we use the normalized entropy $\hat{H}$. Given a bag of nodes $X$ of size $N$, where $p(x)$ is the number of occurrences of node $x \in X$ divided by $N$, the normalized entropy is defined as:

\[ \hat{H}(X) = -\sum_{x \in X} \frac{p(x) \cdot \log_2(p(x))}{\log_2(N)} \in [0, 1]. \]

Table 1: Link prediction results.

Model       AUC     F-Score
Baseline    0.893   0.813
+ age       0.938   0.864
+ k_out     0.897   0.817
All         0.943   0.87

Figure 15: Feature importance. (x-axis: mean decrease Gini; features shown: k_in, PA, Jaccard, k_out, days, CN.)

Figure 14 shows the results for a total of approximately 25K matching groups, for $k \in [1, 5]$; requiring $k$ identical links is too strong a requirement for larger $k$. Every point is the average of all the matching groups for a given $k$. First, we observe that the higher the $k$, the lower the entropy. That is expected: the higher the number of common neighbors, the more likely the next selected neighbor will be the same. Last, and most importantly, the variety of the treatment group is always higher; this indicates that recommendations foster diversity. Although it is difficult to pin down the exact reasons why this happens, we provide a possible interpretation. Even if the list of recommended contacts was the same for all the users in the treatment group, the reshuffling of the top recommended contacts that Tumblr implements in the link recommender widget introduces asymmetries across users.
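The normalized entropy used as the dependent variable in the matching experiment can be sketched as follows. The sketch assumes the standard Shannon form, with a leading minus sign so that the value falls in [0, 1], and returns 0 for bags with fewer than two nodes, where the normalization is undefined; the function name and the toy node identifiers are hypothetical.

```python
from collections import Counter
from math import log2


def normalized_entropy(nodes):
    """Shannon entropy of a bag of nodes, divided by log2 of the bag size."""
    n = len(nodes)
    if n < 2:
        return 0.0  # assumption: log2(N) is zero or undefined, so report no diversity
    counts = Counter(nodes)
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())
    return entropy / log2(n)


# Toy matching group: the (k+1)-th contact chosen by each user in the two subgroups.
treatment_choices = ["a", "b", "c", "d"]   # every user picked a different contact
control_choices = ["a", "a", "b", "b"]     # users concentrated on two contacts

h_treatment = normalized_entropy(treatment_choices)
h_control = normalized_entropy(control_choices)
```

In this toy case the treatment subgroup reaches the maximum value of 1, while the control subgroup scores 0.5; in the experiment such values would be averaged over all matching groups that share the same $k$.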
More generally, we could hypothesize that the recommender system exposes users to a wider set of potential contacts than the ones they would be exposed to by browsing or searching on the site, thus providing a wider spectrum of options and, in turn, a more diverse set of individual choices.

To test the robustness of the results, we explored some possible alternatives in the setup of the matching experiment. Specifically, we: i) measured the entropy at $k+2$ instead of $k+1$ (we leave the $k+n$ generalization for future work); ii) selected only control and treatment groups with at least $m \in [2, 15]$ members (the results reported are for $m = 5$); iii) randomly downsampled the larger group to match the size of the smaller one, to balance the size of the two; iv) ran two independent experiments including in the treatment group only users whose recommended contact had 1) at least one common neighbor (i.e., the recommendation is provided based on network topology features) or 2) no common neighbors (i.e., the recommendation is provided on a topical basis). The absolute values vary slightly across setups, but the qualitative results remain the same.

5. IMPACT ON LINK PREDICTION

Our analytical results have direct implications for how link recommender systems can be enhanced to provide more effective suggestions. Next, we discuss two prediction experiments that aim to answer two research questions.

Q6: To what extent do temporal features improve the ability to predict new links?

The previous analysis showed that egos add new neighbors to their network with criteria that change in time.
The resulting hypothesis is that link recommendations that adapt to the current evolutionary stage of the ego-network could gain effectiveness. To test this hypothesis, we run a prediction experiment.

Figure 16: Distribution of structural features of spontaneous and recommended links. Given directed links in the form $(i, j)$, we show the boxplots of the distributions of $k_{in}(j)$, $PA(i, j)$, $CN(i, j)$, and $\mathrm{Jaccard}(i, j)$. (Panels: spontaneous links, recommended links; y-axis: normalized feature value.)

We consider a snapshot of the Tumblr social network at an arbitrary time $t$ (January 1st, 2015). To build a training set, we sample 200K node pairs $(i, j)$ that are not directly connected but have at least one directed common neighbor (i.e., there is a directed path of length 2 from $i$ to $j$). Half of the pairs will be directly connected by a link from $i$ to $j$ before $T_{end}$ (positive examples); the remaining half will remain disconnected (negative examples). For every pair, we extract six simple features: $i$'s outdegree ($k_{out}(i)$), $j$'s indegree ($k_{in}(j)$), preferential attachment ($PA = k_{out}(i) \cdot k_{in}(j)$), common neighbors ($CN = |\Gamma^t_{out}(i) \cap \Gamma^t_{in}(j)|$), Jaccard similarity between neighbor sets ($\mathrm{Jaccard} = \frac{|\Gamma^t_{out}(i) \cap \Gamma^t_{in}(j)|}{|\Gamma^t_{out}(i) \cup \Gamma^t_{in}(j)|}$), and $i$'s age, measured as the number of days elapsed from $i$'s profile creation to $t$. Age and $i$'s outdegree are the two temporal features whose effectiveness we want to investigate: one measures time on a continuous scale, the other on the discrete scale of link creation events.

The simple features above can predict the node pair class very accurately, and a summary of this evaluation (10-fold cross-validation using random forest) is given in Table 1.
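The six pair features can be computed directly from the out-neighbor and in-neighbor sets of the snapshot. The sketch below is illustrative only: the helper names, the edge-list input format, and the toy graph are assumptions, and the Jaccard value is set to 0 when both neighbor sets are empty.

```python
from collections import defaultdict


def build_neighbor_sets(edges):
    """Build out-neighbor and in-neighbor sets from directed (source, target) edges."""
    out_neighbors = defaultdict(set)
    in_neighbors = defaultdict(set)
    for source, target in edges:
        out_neighbors[source].add(target)
        in_neighbors[target].add(source)
    return out_neighbors, in_neighbors


def pair_features(i, j, out_neighbors, in_neighbors, age_days):
    """Features for a candidate directed link (i, j) at snapshot time t."""
    out_i = out_neighbors.get(i, set())
    in_j = in_neighbors.get(j, set())
    common = out_i & in_j          # nodes on a directed path i -> x -> j
    union = out_i | in_j
    return {
        "k_out": len(out_i),
        "k_in": len(in_j),
        "PA": len(out_i) * len(in_j),
        "CN": len(common),
        "Jaccard": len(common) / len(union) if union else 0.0,
        "age": age_days,           # days from i's profile creation to t
    }


# Toy snapshot: i follows a and b; a, b and c follow j.
edges = [("i", "a"), ("i", "b"), ("a", "j"), ("b", "j"), ("c", "j")]
out_n, in_n = build_neighbor_sets(edges)
features = pair_features("i", "j", out_n, in_n, age_days=120)
```

A feature dictionary of this kind, computed for every sampled pair, is what a classifier such as a random forest would be trained on.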
The model trained on the full set of features yields an AUC of 0.943 and an F-measure of 0.87, which is an improvement of 5.6%/7% in terms of AUC/F-measure over a baseline model that does not consider temporal features. Adding the feature $k_{out}$ to the baseline model yields only a slight improvement in accuracy (0.45%/0.49%), while considering age improves the accuracy more substantially (5.04%/6.27%). Figure 15 summarizes the importance of the features in this link prediction setting by measuring the mean decrease Gini, the average gain of purity achieved when splitting on a given variable (the higher, the better) [36]. In line with previous work [56], this analysis further confirms the importance of the temporal features and shows that time matters when recommending new contacts.

Q7: Is it possible to limit the bias of the recommender system while keeping its high accuracy?

In Section 4.2 we observed that the distributions of nodes' popularity and of their structural similarity with the ego are statistically different when considering recommended vs. spontaneous links. To further investigate this difference, we analyze the distribution of all structural features over a sample (200k) of recommended and spontaneous links, equally represented. For the sake of presentation, we normalize the value of each feature on a scale from 0 to 1, by dividing the raw feature value by its maximum value in the whole 200k sample. As shown in Figure 16, spontaneous links tend to exhibit a lower degree of structural overlap than recommended links (the median Jaccard value on recommended links is 4 times larger than the value recorded on spontaneous ones). The same observation holds when analyzing the node popularity: the median in-degree of the target node on recommended links is one order of magnitude higher than the corresponding value computed on spontaneous links.
Learning to what extent it is possible to automatically tell recommended links and spontaneous links apart would allow us to train new recommender systems to suggest links whose properties better adhere to the natural criteria that people follow when adding new contacts. To gauge this possibility, we run a second prediction experiment with the same features and setup as the previous one but with a different selection of positive and negative examples. We pick 100K pairs that will be connected in the future through a spontaneous link as positive examples, and as many pairs that will be connected through a recommended link as negative examples. A random forest classifier is able to distinguish the two classes fairly accurately (AUC = 0.795, F-measure = 0.721). In a more realistic scenario, the recommender should learn to recognize the space between recommended links and truly negative examples (links that are never formed). To model that, we add to the training set 100K negative pairs that will not be connected at any time in the future. This setting achieves a better performance (AUC = 0.823, F-measure = 0.771), with around 90% accuracy on the negative class and 55% on the positive one. In short, even if very basic structural and temporal features are used, it is possible to effectively use the output of current link recommenders to train new recommenders that smooth the algorithmic bias and produce suggestions that better simulate the spontaneous process of link selection.

6. CONCLUSIONS

We have provided a large-scale analysis of ego-network evolution on two online platforms, exposing the dynamics of their bursty evolution, community-driven growth, diameter expansion, and selection of new nodes based on a time-varying interplay between similarity and popularity.
By studying the set of Tumblr links created as a result of algorithmic suggestions, we find that recommended links have different statistical properties than spontaneously generated ones. We also find evidence that link recommendations foster network diversity by leading nodes that are structurally similar to choose different sets of new neighbors.

Our work has some limitations. Flickr and Tumblr are mainly interest networks, where people follow each other based on topical tastes. Some of the results we report here might not generalize to social networks that aim mostly at connecting people who know each other in real life (e.g., Facebook, LinkedIn). Also, we only consider the network structure and disregard any notion of node profile, including posting activity in time and user-generated content; that information would help to further detail the dynamics of ego-network expansion with respect to other dimensions such as topical similarity between profiles.

Our results have a number of practical implications. We provide further evidence that in online social networks not all links are created equal; network analysts who produce network growth models based on the observation of online social networks' longitudinal traces should consider weighting links that emerge spontaneously differently from those that are created algorithmically. Through our prediction experiments, we also provide a hint about how link recommender systems could incorporate signals on the ego-network's evolutionary stage to improve the quality of suggestions. We hope our work provides yet another step towards a better understanding of the evolutionary dynamics of social networks.

7. ACKNOWLEDGMENTS

We would like to thank Martin Saveski for his valuable suggestions and the anonymous reviewers for helpful comments.

8. REFERENCES

[1] Y. Y. Ahn, S. Han, H. Kwak, S. Moon, and H. Jeong.
Analysis of topological characteristics of huge online social networking services. In WWW, 2007.
[2] L. M. Aiello, A. Barrat, R. Schifanella, C. Cattuto, B. Markines, and F. Menczer. Friendship prediction and homophily in social media. ACM TWEB, 2012.
[3] V. Arnaboldi, M. Conti, M. La Gala, A. Passarella, and F. Pezzoni. Ego network structure in online social networks and its impact on information diffusion. Comp. Comm., 2016.
[4] V. Arnaboldi, M. Conti, A. Passarella, and F. Pezzoni. Analysis of ego network structure in online social networks. In SocialCom, 2012.
[5] V. Arnaboldi, M. Conti, A. Passarella, and F. Pezzoni. Ego networks in twitter: an experimental analysis. In InfoCom, 2013.
[6] V. Arnaboldi, R. Dunbar, A. Passarella, and M. Conti. Analysis of co-authorship ego networks. In Network Science. Springer, 2016.
[7] L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: membership, growth, and evolution. In KDD. ACM, 2006.
[8] L. Backstrom and J. Leskovec. Supervised random walks: Predicting and recommending links in social networks. In WSDM, 2011.
[9] B. Bahmani, A. Chowdhury, and A. Goel. Fast incremental and personalized pagerank. Proc. VLDB Endow., 2010.
[10] E. Bakshy, S. Messing, and L. A. Adamic. Exposure to ideologically diverse news and opinion on facebook. Science, 2015.
[11] A.-L. Barabási, H. Jeong, Z. Néda, E. Ravasz, A. Schubert, and T. Vicsek. Evolution of the social network of scientific collaborations. Physica A: Statistical mechanics and its applications, 311(3), 2002.
[12] S. Barbosa, D. Cosley, A. Sharma, and R. M. Cesar Jr. Averaging gone wrong: Using time-aware analyses to better understand behavior. In WWW. ACM, 2016.
[13] A. Biswas and B. Biswas. Investigating community structure in perspective of ego network. Expert Syst. Appl., 2015.
[14] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre.
Fast unfolding of communities in large networks. Journ. of Stat. Mech., 2008.
[15] Y. Chang, L. Tang, Y. Inagaki, and Y. Liu. What is tumblr: A statistical overview and comparison. SIGKDD Explor. Newsl., 2014.
[16] J. Chen, W. Geyer, C. Dugan, M. Muller, and I. Guy. Make new friends, but keep the old: recommending people on social networking sites. 2009.
[17] E. M. Daly, W. Geyer, and D. R. Millen. The network effects of recommending social connections. In RecSys, 2010.
[18] P. De Bra, G.-J. Houben, Y. Kornatzky, and R. Post. Information retrieval in distributed hypertexts. In RIAO, 1994.
[19] A. De Salve, M. Dondio, B. Guidi, and L. Ricci. The impact of user's availability on on-line ego networks: a facebook analysis. Comp. Comm., 2016.
[20] R. Dunbar. Grooming, gossip, and the evolution of language. Harvard University Press, 1998.
[21] S. L. Feld. Why your friends have more friends than you do. Am. Jour. of Soc., 1991.
[22] L. C. Freeman. Centered graphs and the structure of ego networks. Math. Soc. Sci., 1982.
[23] J. Freyne, M. Jacovi, I. Guy, and W. Geyer. Increasing engagement through early recommender intervention. In RecSys, 2009.
[24] L. Garton, C. Haythornthwaite, and B. Wellman. Studying online social networks. Journ. of Com. Med. Comm., 1997.
[25] M. Grbovic, V. Radosavljevic, N. Djuric, N. Bhamidipati, and A. Nagarajan. Gender and interest targeting for sponsored post advertising at tumblr. In KDD, 2015.
[26] P. Gupta, A. Goel, J. Lin, A. Sharma, D. Wang, and R. Zadeh. Wtf: The who to follow service at twitter. In WWW, 2013.
[27] S. Gupta, X. Yan, and K. Lerman. Structural properties of ego networks. In SBP-BRiMS. Springer, 2015.
[28] M. A. Hasan and M. J. Zaki. Social Network Data Analytics, chapter A Survey of Link Prediction in Social Networks. Springer US, Boston, MA, 2011.
[29] M. Karsai, K. Kaski, A.-L. Barabási, and J. Kertész.
Universal features of correlated bursty behaviour. Scientific reports, 2012.
[30] R. Kikas, M. Dumas, and M. Karsai. Bursty egocentric network evolution in skype. Soc. Net. An. and Min., 2013.
[31] G. Kossinets and D. J. Watts. Empirical analysis of an evolving social network. Science, 2006.
[32] R. Kumar, J. Novak, P. Raghavan, and A. Tomkins. On the bursty evolution of blogspace. In WWW, 2003.
[33] J. Leskovec, L. Backstrom, R. Kumar, and A. Tomkins. Microscopic evolution of social networks. In KDD, 2008.
[34] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: Densification laws, shrinking diameters and possible explanations. In KDD, 2005.
[35] D. Liben-Nowell and J. Kleinberg. The link prediction problem for social networks. In CIKM.
[36] G. Louppe, L. Wehenkel, A. Sutera, and P. Geurts. Understanding variable importances in forests of randomized trees. In NIPS. Curran Associates Inc., 2013.
[37] J. Mcauley and J. Leskovec. Discovering social circles in ego networks. TKDD, 2014.
[38] A. Mislove, H. S. Koppula, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Growth of the flickr social network. In WOSN, 2008.
[39] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and analysis of online social networks. In IMC, 2007.
[40] S. A. Muhammad and K. Van Laerhoven. Duke: A solution for discovering neighborhood patterns in ego networks. In ICWSM, 2015.
[41] G. Palla, A.-L. Barabási, and T. Vicsek. Quantifying social group evolution. Nature, 446(7136), 2007.
[42] E. Pariser. The Filter Bubble: How the New Personalized Web is Changing what We Read and how We Think. Penguin Books, 2012.
[43] D. M. Romero and J. M. Kleinberg. The directed closure process in hybrid social-information networks, with an analysis of link formation on twitter. In ICWSM, 2010.
[44] A. Sharma, J. M. Hofman, and D. J. Watts.
Estimating the causal impact of recommendation systems from observational data. In EC, 2015.
[45] D. Shin, S. Cetintas, K.-C. Lee, and I. S. Dhillon. Tumblr blog recommendation with boosted inductive matrix completion. In CIKM, 2015.
[46] E. H. Simpson. The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society. Series B, 1951.
[47] A. Spink, M. Park, B. J. Jansen, and J. Pedersen. Multitasking during web search sessions. Inf. Proc. and Man., 2006.
[48] J. Su, A. Sharma, and S. Goel. The effect of recommendations on network structure. In WWW, 2016.
[49] C. Tan and L. Lee. All who wander: On the prevalence and characteristics of multi-community engagement. In WWW. ACM, 2015.
[50] L. Tauscher and S. Greenberg. How people revisit web pages. Int. J. Hum. Comput. Stud., 1997.
[51] J. Ugander, B. Karrer, L. Backstrom, and C. Marlow. The anatomy of the facebook social graph. CoRR, abs/1111.4503, 2011.
[52] L. Waltman and N. J. Eck. A smart local moving algorithm for large-scale modularity-based community detection. Eur. Phys. J. B, 2013.
[53] L. Weng and T. M. Lento. Topic-based clusters in egocentric networks on facebook. In ICWSM, 2014.
[54] C. Wilson, B. Boe, A. Sala, K. P. Puttaswamy, and B. Y. Zhao. User interactions in social networks and their implications. In EuroSys, 2009.
[55] J. Yang and J. Leskovec. Patterns of temporal variation in online media. In WSDM, 2011.
[56] Y. Yang, N. Chawla, Y. Sun, and J. Hani. Predicting links in multi-relational and heterogeneous networks. In ICDM. IEEE, 2012.
[57] D. Yin, L. Hong, X. Xiong, and B. D. Davison. Link formation analysis in microblogs. In SIGIR, 2011.
[58] G. Yuan, P. K. Murukannaiah, Z. Zhang, and M. P. Singh. Exploiting sentiment homophily for link prediction. In RecSys, 2014.
[59] W.-X. Zhou, D. Sornette, R. A. Hill, and R. I. Dunbar. Discrete hierarchical organization of social group sizes.
Royal Society of London B, 2005.
[60] M. Zignani, S. Gaito, G. P. Rossi, X. Zhao, H. Zheng, and B. Y. Zhao. Link and triadic closure delay: Temporal metrics for social network dynamics. In ICWSM, 2014.
