Social groups are fundamental building blocks of human societies. While our social interactions have always been constrained by geography, it has been impossible, due to practical difficulties, to evaluate the nature of this restriction on social group structure. We construct a social network of individuals whose most frequent geographical locations are also known. We also classify the individuals into groups according to a community detection algorithm. We study the variation of geographical span for social groups of varying sizes, and explore the relationship between topological positions and geographic positions of their members. We find that small social groups are geographically very tight, but become much more clumped when the group size exceeds about 30 members. Also, we find no correlation between the topological positions and geographic positions of individuals within network communities. These results suggest that spreading processes face distinct structural and spatial constraints.
Social groups are common among animals and humans [1][2][3][4][5]. In humans, they reflect friendship, kinship, and work relationships, and can also be seen as social networks. From an evolutionary and historical perspective, the formation of such network groups, consisting of agglomerations of dyadic interactions, has been constrained by geography. In contrast, larger social units, enabled by modern technology and political organization, offer drastically different opportunities for social interactions and for group assembly over larger geographic ranges. This raises two sorts of questions. First, is the structure of "old-fashioned" groups similar to the large-scale groups possible in modern society? And second, what role does geography play in group formation?
If we represent the social relationships among a population of people as a network, then groups can be seen as “communities” within the population that consist of sets of nodes that are relatively densely connected to each other but sparsely connected to other nodes in the network [6,7]. While social communities have been studied for a long time [8], it has recently become feasible, with mobile phone data, to monitor the social interactions and geographic positions of millions of individuals [9,10], and to apply algorithmic detection of communities on a large scale [6,7]. The structure of dyadic social interactions is known to depend on geography, for example, as shown by the decay of friendship probability with distance, based on voluntary self-reports of hometown and US state, in a blog community [11], and the decrease in communication probability with distance based on the zip codes of cell phone billing addresses [12]. In addition, a previous study has shown that smaller communities are more homogeneous with respect to the billing postal codes of their members [13], while another presented evidence that this persists across a hierarchy of communities [14]. However, there are no prior large-scale studies of the way in which community structure depends on geography, where the actual communication locations are used and where geographical properties of communities themselves are examined (see Fig. 1).
Figure 1. Visualization of a community in the mobile phone network. This juxtaposition of (A) the topological structure and (B) the geographical structure demonstrates the interplay of these two dimensions. The purple and orange nodes are geographically close, but topologically they lie at five degrees of separation. In contrast, the red and green nodes are connected to each other, and also share several neighbors, yet they are geographically separated by a large distance. Overlapping nodes in (B) have been moved slightly for visual clarity.
With respect to group formation, geography can be seen as a kind of constraint. That is, social connections not only face network constraints and opportunities (we tend to form ties with others who are the friends of our friends), but also, quite obviously, geographic constraints and opportunities. What is unclear, however, is the way in which such geographic constraints and opportunities affect and shape network communities above and beyond their effect on dyadic interactions.
We create a network of social interactions by measuring ties between individuals based on mobile phone call and text messaging data from an unnamed European country. Based on the records of 72.4 million calls and 17.1 million text messages accumulated over a one-month period, the resulting network has 3.4 million nodes connected by 5.2 million weighted (non-binary) ties, resulting in an average degree k ≈ 3.0. Each time a user initiated or received a call or a text message, the location of the tower routing the communication was recorded [10]. We exploited these records to assign each individual to the location where they conducted most of their cell phone communication, which for most individuals is likely to correspond to the location of their home or work. This resulted in one coordinate pair $(x_i, y_i)$ per user, which enabled us to define the geographic distance for any user pair as
$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}.$$
We used this to compute the probability of a call-tie and the probability of a text-tie as a function of distance (Fig. 2).
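The procedure described above (one modal location per user, a pairwise distance, and a tie probability as a function of distance) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`modal_location`, `distance`, `tie_probability`), the planar Euclidean distance, the bin edges, and the toy data are all assumptions made for illustration. The all-pairs loop is only practical for small samples; a network with millions of nodes would need spatial binning or sampling instead.

```python
from collections import Counter
from itertools import combinations
import math


def modal_location(events):
    """Return the tower coordinate a user communicated from most often.

    events: list of (x, y) tower coordinates, one per call or text.
    """
    return Counter(events).most_common(1)[0][0]


def distance(p, q):
    """Planar Euclidean distance between two (x, y) points (an assumption)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def tie_probability(locations, ties, bin_edges):
    """Estimate P(d) as (tied pairs) / (all pairs) within each distance bin.

    locations: dict mapping user -> (x, y)
    ties: set of frozenset({u, v}) for users who share a tie
    bin_edges: ascending distance-bin edges; bin b covers [edge_b, edge_b+1)
    Returns one value per bin, or None for a bin containing no pairs.
    """
    n_bins = len(bin_edges) - 1
    pairs = [0] * n_bins
    tied = [0] * n_bins
    for u, v in combinations(sorted(locations), 2):
        d = distance(locations[u], locations[v])
        for b in range(n_bins):
            if bin_edges[b] <= d < bin_edges[b + 1]:
                pairs[b] += 1
                if frozenset((u, v)) in ties:
                    tied[b] += 1
                break
    return [t / p if p else None for t, p in zip(tied, pairs)]


# Hypothetical toy data: tower coordinates logged per user.
events = {
    "a": [(0, 0), (0, 0), (5, 5)],
    "b": [(3, 4)],
    "c": [(0, 1), (0, 1)],
    "d": [(30, 40)],
}
locations = {user: modal_location(ev) for user, ev in events.items()}
ties = {frozenset(("a", "b")), frozenset(("a", "c"))}
p_of_d = tie_probability(locations, ties, bin_edges=[0, 10, 100])
print(p_of_d)
```

In this toy example, two of the three nearby pairs share a tie and none of the three distant pairs do, so the estimate falls with distance. If geography played no role, the values would be roughly equal across bins.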
Although from the point of view of technology there is very little difference between placing a short-distance or long-distance communication (for either voice or text), we find that the probability of communication is strongly related to the distance between the individuals, and it decreases by approximately [...]
Figure 2. The probability of having a tie decreases as a function of distance. Two limiting cases, corresponding to exponents one and two, are shown as dashed lines. Note that if geography played no role, we would expect P(d) to be independent of distance d, resulting in a horizontal line in this plot. Inset: Tie strength, in contrast to the communication probability, is nearly flat with distance, although there is a minor decreasin