Scaling of human behavior during portal browsing

We investigate transitions of portal users between different subpages. A weighted network of portal subpages is reconstructed, where edge weights are the numbers of corresponding transitions. Distributions of link weights and node strengths follow power laws over several decades. Node strength increases faster than linearly with node degree. The distribution of time spent by a user at one subpage decays as a power law with an exponent around 1.3. The distribution P(z) of the number z of unique subpages visited during one visit is exponential. We find a square-root dependence between the average z and the total number of transitions n during a single visit. The individual path of a portal user resembles a self-attracting walk on the weighted network. An analytical model is developed to recover in part the collected data.


💡 Research Summary

The paper presents a quantitative investigation of how users navigate within a web portal, focusing on the statistical regularities that emerge from the sequence of sub‑page visits. By processing server logs, the authors reconstruct a weighted, undirected network in which each node represents a sub‑page and each edge weight wᵢⱼ records the number of direct transitions between pages i and j. The first set of results concerns the structure of this network. Both the distribution of edge weights and the distribution of node strengths (the sum of incident edge weights) follow power‑law forms, P(w) ∝ w⁻ᵅ and P(s) ∝ s⁻ᵝ, extending over several orders of magnitude. Moreover, node strength grows faster than linearly with degree, s ∝ k^γ (γ > 1), indicating that highly connected pages attract disproportionately more traffic than would be expected from degree alone. Such heavy‑tailed weights and strengths indicate that a small core of “hub” pages concentrates a disproportionate share of user flow.
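The network reconstruction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the session format (a list of page names per visit), the function name `build_network`, and the choice to ignore page reloads are all assumptions made here for the example.

```python
from collections import Counter, defaultdict


def build_network(sessions):
    """Build an undirected weighted network from click paths.

    sessions: list of paths, each a list of page identifiers in visit order
    (a hypothetical input format). Returns (weights, strength, degree).
    """
    weights = Counter()
    for path in sessions:
        for a, b in zip(path, path[1:]):
            if a != b:  # assumption: a reload of the same page is not an edge
                weights[frozenset((a, b))] += 1

    strength = Counter()          # s_i: sum of weights of edges incident to i
    neighbors = defaultdict(set)  # used to count the degree k_i
    for edge, w in weights.items():
        a, b = tuple(edge)
        strength[a] += w
        strength[b] += w
        neighbors[a].add(b)
        neighbors[b].add(a)

    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    return weights, dict(strength), degree


# Toy data, invented for illustration only.
sessions = [
    ["home", "news", "home", "sport"],
    ["home", "news", "weather"],
]
weights, strength, degree = build_network(sessions)
print(weights[frozenset(("home", "news"))], strength["home"], degree["home"])
```

On real logs, the resulting `strength` and `degree` values could be plotted against each other on log-log axes to inspect the s ∝ k^γ relation.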

The temporal dimension of user behavior is examined next. The dwell time τ that a visitor spends on a single sub‑page is also power‑law distributed, P(τ) ∝ τ⁻ᵟ, with an exponent δ ≈ 1.3. Such a heavy‑tailed distribution reflects a mixture of very short glances and prolonged stays, a pattern reminiscent of bursty human activity observed in other online contexts. The authors further analyze the number of distinct sub‑pages z visited during a single session. Unlike the heavy‑tailed weight and strength distributions, P(z) follows an exponential decay, P(z) ∝ exp(−λz), so sessions with many unique pages are exponentially rare. A striking empirical relationship emerges between the total number of transitions n in a session and the average number of distinct pages ⟨z⟩: ⟨z⟩ ≈ C √n. This square‑root law means that the number of newly discovered pages grows sublinearly with session length: the longer a session lasts, the more likely each further click is to return to a page already seen.
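As a rough sketch of how the two scaling relations above could be checked on session data, the Python below computes n and z per session, fits a log-log slope for ⟨z⟩ against n, and estimates a power‑law exponent with the standard continuous maximum‑likelihood formula δ ≈ 1 + N / Σ ln(τᵢ/τ_min). The function names, the cutoff `x_min`, and the synthetic samples are assumptions for illustration; the summary does not say how the authors performed their fits.

```python
import math
import random
from collections import defaultdict


def session_stats(path):
    """Return (n, z): number of transitions and number of distinct pages."""
    return len(path) - 1, len(set(path))


def mean_z_by_n(sessions):
    """Average number of distinct pages for each session length n > 0."""
    groups = defaultdict(list)
    for path in sessions:
        n, z = session_stats(path)
        if n > 0:
            groups[n].append(z)
    return {n: sum(zs) / len(zs) for n, zs in sorted(groups.items())}


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x); about 0.5 for y = C*sqrt(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


def powerlaw_exponent_mle(samples, x_min):
    """Continuous power-law MLE for samples with x >= x_min (x_min is assumed)."""
    tail = [x for x in samples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


# Synthetic dwell times drawn from P(tau) ~ tau^-1.3 by inverse-transform
# sampling; these are not the paper's data.
rng = random.Random(42)
delta_true, tau_min = 1.3, 1.0
taus = [tau_min * (1.0 - rng.random()) ** (-1.0 / (delta_true - 1.0))
        for _ in range(20000)]
delta_hat = powerlaw_exponent_mle(taus, tau_min)
slope = loglog_slope([1, 4, 9, 16], [2, 4, 6, 8])  # exact sqrt data
print(round(delta_hat, 2), round(slope, 2))
```

A least-squares fit on log-log axes is used for the ⟨z⟩ versus n slope only because it is simple; for heavy‑tailed distributions such as dwell times, the likelihood-based estimate is generally the more reliable choice.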

To capture these observations, the authors propose a self‑attracting random walk model on the weighted portal network. In the model, each time a node is visited its attractiveness is increased, thereby raising the probability of returning to it in subsequent steps. Initially the walk behaves like an unbiased random walk, but the reinforcement mechanism quickly creates a bias toward previously visited nodes. Simulations of this process are reported to reproduce the power‑law behavior of edge weights and node strengths, as well as the √n scaling between n and ⟨z⟩. The analytical treatment, based on mean‑field approximations, yields closed‑form expressions that match the numerical data for the early‑time regime. As the abstract notes, however, the model recovers the collected data only in part: it does not incorporate semantic relationships between pages, individual user preferences, or diurnal traffic patterns, which limits its ability to capture finer‑grained deviations observed in the real data.
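The reinforcement idea can be illustrated with a small simulation. The sketch below uses one common form of self‑attracting walk, in which a step to an already visited neighbor is favored by a factor exp(u); this rule, the parameter `u`, the unit‑weight ring used as a stand‑in graph, and all function names are assumptions chosen for the example and are not taken from the paper.

```python
import math
import random


def self_attracting_walk(adj, start, n_steps, u, rng):
    """Walk n_steps on a weighted graph, favoring visited nodes by exp(u).

    adj maps node -> {neighbor: edge_weight}. The step probability is
    proportional to edge_weight * exp(u) for visited neighbors and to
    edge_weight otherwise (an assumed rule). Returns the list of nodes visited.
    """
    path = [start]
    visited = {start}
    boost = math.exp(u)
    current = start
    for _ in range(n_steps):
        nbrs = list(adj[current])
        probs = [adj[current][j] * (boost if j in visited else 1.0)
                 for j in nbrs]
        current = rng.choices(nbrs, weights=probs, k=1)[0]
        visited.add(current)
        path.append(current)
    return path


def ring_graph(size):
    """Toy unit-weight ring, a placeholder for the real portal network."""
    return {i: {(i - 1) % size: 1.0, (i + 1) % size: 1.0}
            for i in range(size)}


def mean_distinct(adj, n_steps, u, runs, seed=0):
    """Average number of distinct nodes z over several independent walks."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        total += len(set(self_attracting_walk(adj, 0, n_steps, u, rng)))
    return total / runs


adj = ring_graph(1000)
z_plain = mean_distinct(adj, 400, u=0.0, runs=200)    # ordinary random walk
z_attract = mean_distinct(adj, 400, u=2.0, runs=200)  # with reinforcement
print(z_plain, z_attract)
```

Setting `u=0.0` recovers an ordinary weighted random walk, so comparing the two averages gives a quick sense of how strongly reinforcement suppresses the discovery of new nodes. Replacing `ring_graph` with an adjacency map built from real transition counts would bring the sketch closer to the setting the summary describes.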

Overall, the study shows that human navigation on a portal follows scaling laws of the kind seen in complex networks and bursty temporal processes. The findings have practical implications for portal design, content placement, and recommendation algorithms: recognizing that a few hub pages dominate traffic can guide the allocation of resources, while the √n law informs expectations about how quickly users will encounter new content during a session. From a theoretical perspective, the work adds to the growing body of evidence that simple reinforcement mechanisms can generate the heavy‑tailed distributions and sublinear growth patterns characteristic of many human‑generated sequences. Future research could extend the model to incorporate heterogeneous user classes, content semantics, and adaptive interface elements, thereby bridging the gap between abstract statistical regularities and personalized user experience.