Inference of the Russian drug community from one of the largest social networks in the Russian Federation
The criminal nature of narcotics complicates the direct assessment of a drug community, while a good understanding of the type of people drawn to or currently using drugs is vital for finding effective intervention strategies. This is of immediate concern for the Russian Federation in particular, given the dramatic increase in drug abuse it has seen since the fall of the Soviet Union in the early nineties. Using unique data from the Russian social network ‘LiveJournal’, which has over 39 million registered users worldwide, we were able for the first time to identify the on-line drug community by context-sensitive text mining of the users’ blogs using a dictionary of known drug-related official and ‘slang’ terminology. By comparing the interests of the users that most actively spread information on narcotics over the network with the interests of the individuals outside the on-line drug community, we found that the ‘average’ drug user in the Russian Federation is mostly interested in topics such as Russian rock, non-traditional medicine, UFOs, Buddhism, yoga and the occult. We identify three distinct scale-free sub-networks of users which can be uniquely classified as being either ‘infectious’, ‘susceptible’ or ‘immune’.
💡 Research Summary
The paper tackles the growing problem of drug abuse in the Russian Federation by exploiting a novel data source: the Russian‑language segment of LiveJournal, a global social network with over 39 million registered users worldwide. The authors first construct a comprehensive drug‑related lexicon that combines official terminology (e.g., names of controlled substances) with colloquial slang, euphemisms, and transliterations commonly used in Russian online communities. This dictionary contains roughly 1,200 entries, and each term is assigned a weight reflecting its typical relevance to drug discussion.
Using a context‑sensitive text‑mining pipeline, the authors scan the blog posts of all Russian‑language LiveJournal users. Rather than a naïve keyword count, the algorithm evaluates surrounding words, co‑occurrence patterns, and frequency to distinguish genuine drug references from benign uses of homonymous words (for example, differentiating “трак” as a slang term for a drug shipment from its literal meaning “truck”). Users whose posts exceed a predefined threshold of drug‑related content (e.g., cumulative weighted score > 5 or at least three consecutive drug‑related mentions) are classified as members of an “online drug community.”
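The scoring step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the lexicon entries are English placeholders, the context rule (an ambiguous term counts only when a cue word appears nearby) is a hypothetical stand-in for the co‑occurrence analysis, and "three consecutive drug‑related mentions" is interpreted here as three consecutive posts that each contain a drug reference. Only the two example thresholds come from the text.

```python
import re

# Hypothetical lexicon (term -> weight); illustrative placeholders,
# not entries from the paper's dictionary.
LEXICON = {"heroin": 3.0, "hash": 2.0, "grass": 1.0}

# Hypothetical context cues: an ambiguous term is counted only if one of
# these words occurs within WINDOW tokens of it.
AMBIGUOUS = {
    "grass": {"smoke", "smoked", "dealer", "joint"},
    "hash": {"smoke", "smoked", "dealer", "gram"},
}
WINDOW = 4
SCORE_THRESHOLD = 5.0  # example from the text: cumulative weighted score > 5
RUN_THRESHOLD = 3      # assumed reading: three consecutive drug-related posts


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def post_score(text):
    """Sum the weights of lexicon terms that pass the context check."""
    tokens = tokenize(text)
    score = 0.0
    for i, tok in enumerate(tokens):
        weight = LEXICON.get(tok)
        if weight is None:
            continue
        cues = AMBIGUOUS.get(tok)
        if cues is not None:
            window = tokens[max(0, i - WINDOW): i + WINDOW + 1]
            if not cues.intersection(window):
                continue  # likely a benign use of the word
        score += weight
    return score


def is_community_member(posts):
    """Apply both example criteria to a user's chronologically ordered posts."""
    scores = [post_score(p) for p in posts]
    if sum(scores) > SCORE_THRESHOLD:
        return True
    run = 0
    for s in scores:
        run = run + 1 if s > 0 else 0
        if run >= RUN_THRESHOLD:
            return True
    return False
```

For instance, `post_score("I mowed the grass today")` contributes nothing because no cue word is nearby, whereas `post_score("we smoked grass by the river")` counts the term. A real system would need a Russian tokenizer with morphological normalization, which this sketch omits.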
The resulting community comprises about 0.8 % of the examined Russian‑language accounts but shows markedly higher activity levels (posts per day, comment volume) and a denser friendship network than the rest of the platform. To understand the sociocultural profile of these users, the authors extract the self‑declared “interests” fields from user profiles and compare the distribution of interests between the drug community and the broader user base. Statistical analysis reveals that members of the drug community are disproportionately interested in Russian rock music, non‑traditional medicine (e.g., alternative healing, herbal remedies), UFOs and extraterrestrials, Buddhism, yoga, and occult or mystical topics. This pattern suggests that drug use in this context is intertwined with alternative worldviews, subcultural identity formation, and a search for non‑mainstream forms of meaning, rather than being driven solely by material addiction.
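The summary does not say which statistical test was used for the interest comparison. As a rough illustration only, the sketch below computes a smoothed ratio ("lift") between the share of community profiles and the share of other profiles that list each interest. The function names, the additive smoothing, and the toy profiles are all assumptions made for this example.

```python
from collections import Counter


def interest_counts(profiles):
    """Count how many profiles list each interest (once per profile)."""
    counts = Counter()
    for interests in profiles:
        counts.update(set(interests))
    return counts


def interest_lift(community, others, smoothing=1.0):
    """Ratio of smoothed interest shares: community vs. everyone else.

    A value above 1 means the interest is over-represented in the community.
    Additive smoothing avoids division by zero for interests that are
    absent from one of the two groups.
    """
    comm, rest = interest_counts(community), interest_counts(others)
    n_c, n_o = len(community), len(others)
    lift = {}
    for interest in set(comm) | set(rest):
        p_c = (comm[interest] + smoothing) / (n_c + 2 * smoothing)
        p_o = (rest[interest] + smoothing) / (n_o + 2 * smoothing)
        lift[interest] = p_c / p_o
    return lift


def top_interests(community, others, k=5):
    """Return the k interests with the highest lift, highest first."""
    lift = interest_lift(community, others)
    return sorted(lift, key=lift.get, reverse=True)[:k]
```

With real data, a ratio like this would normally be paired with a significance test and a correction for multiple comparisons, since many thousands of distinct interests are compared at once.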
Network analysis is then applied to the friendship graph. The authors compute degree distributions, clustering coefficients, and average path lengths, finding that the overall LiveJournal network exhibits a scale‑free topology. Within the drug‑related subgraph, three distinct, also scale‑free, subnetworks emerge, which the authors label “infectious,” “susceptible,” and “immune.”
- Infectious subnetwork – Contains high‑degree nodes that actively post drug‑related content. These hubs have short average distances to other users, making them efficient vectors for the rapid diffusion of drug‑related information, analogous to super‑spreaders in epidemiological models.
- Susceptible subnetwork – Consists of users with moderate degree who currently post little or no drug‑related material but are structurally close to the infectious hubs. Their position makes them vulnerable to future exposure and potential recruitment into the drug community.
- Immune subnetwork – Comprises low‑degree users who are largely disconnected from the drug‑related hubs and show negligible drug‑related posting. They represent a relatively protected segment of the population.
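The three labels can be illustrated with a simplified rule on a friendship graph. This is a sketch under stated assumptions, not the authors' classification: a user flagged by the text‑mining step is called infectious, an unflagged user within `max_hops` friendship links of an infectious user is called susceptible, and everyone else is called immune. The degree criteria mentioned in the list are not applied here; `degree_histogram` only shows how the degree distribution would be tabulated.

```python
from collections import Counter, deque


def degree_histogram(friends):
    """Map each degree value to the number of users with that degree."""
    return Counter(len(nbrs) for nbrs in friends.values())


def classify_users(friends, posters, max_hops=1):
    """Label users by graph distance to flagged posters.

    friends:  dict mapping each user to the set of their friends (undirected).
    posters:  set of users flagged as posting drug-related content.
    max_hops: assumed cutoff for being 'structurally close' to a poster.
    """
    # Multi-source breadth-first search starting from all flagged posters.
    dist = {u: 0 for u in posters if u in friends}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the cutoff
        for v in friends.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    labels = {}
    for u in friends:
        if u in posters:
            labels[u] = "infectious"
        elif u in dist:
            labels[u] = "susceptible"
        else:
            labels[u] = "immune"
    return labels
```

On a toy chain `a - b - c - d` with an isolated user `e` and `posters={"a"}`, the default cutoff labels `b` as susceptible and `c`, `d`, `e` as immune; raising `max_hops` to 2 moves `c` into the susceptible group.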
The identification of these three groups provides a quantitative framework for viewing drug use as a socially transmitted phenomenon. By mapping the network’s topology, interventions can be targeted more efficiently: for example, focusing educational or counter‑information campaigns on the infectious hubs could disrupt the primary transmission pathways, while monitoring susceptible users could enable early preventive measures.
The authors acknowledge several limitations. LiveJournal’s user base is not a statistically representative sample of the Russian population; it skews toward younger, more internet‑savvy individuals, and geographic or socioeconomic biases may affect generalizability. The drug lexicon, despite being extensive, may miss emerging slang, leading to false negatives. Moreover, textual evidence of drug discussion does not equate to actual consumption or trafficking behavior; offline validation would be required to confirm real‑world relevance.
Future work is suggested in three directions: (1) integrating data from additional platforms popular in Russia (e.g., VKontakte, Telegram) to broaden coverage; (2) employing machine‑learning classifiers (e.g., transformer‑based language models) to improve context understanding and reduce reliance on static dictionaries; and (3) linking online indicators with law‑enforcement or health‑service records to assess predictive power for real‑world drug‑related incidents.
In conclusion, the study demonstrates that large‑scale, publicly available social‑media data combined with sophisticated text‑mining and network‑analysis techniques can uncover hidden drug‑using subpopulations, characterize their cultural affinities, and map the structural pathways through which drug‑related information spreads. This methodological advance offers a valuable complement to traditional field surveys and provides policymakers with actionable, data‑driven insights for designing targeted prevention and intervention strategies in the Russian Federation.