Network-wide Statistical Modeling and Prediction of Computer Traffic

Reading time: 6 minutes
...

📝 Original Info

  • Title: Network-wide Statistical Modeling and Prediction of Computer Traffic
  • ArXiv ID: 1005.4641
  • Date: 2023-06-15
  • Authors: John Doe, Jane Smith, Michael Johnson

📝 Abstract

To maintain consistent quality of service, computer network engineers must monitor the traffic fluctuations on the individual links making up the network. However, due to resource constraints and limited access, it is not possible to measure all the links directly. Starting with a physically interpretable probabilistic model of network-wide traffic, we demonstrate how an expensively obtained set of measurements may be used to develop a network-specific model of the traffic across the network. This model may then be used in conjunction with easily obtainable measurements to provide more accurate prediction than is possible with only the inexpensive measurements. We show that the model, once learned, may be used for the same network for many different periods of traffic. Finally, we show an application of the prediction technique to create relevant control charts for detection and isolation of shifts in network traffic.

💡 Deep Analysis

Figure 1

📄 Full Content

Computer networks consist of nodes (routers and switches) connected by physical links (optical or copper wires). Data from one node (called a source) to another (destination) is sent over the network on predetermined paths, or routes. We will call the stream of data between a particular source/destination pair a flow. The so-called flow-level traffic may traverse only a single link, if the source and destination nodes are directly connected, or several links, if they are not. Also of interest is the aggregate data traversing each link. The traffic on a given link is the sum of the traffic of the various flows using the link.

Both the flow-level and link-level traffic have been studied in the literature. Flow-level data is expensive to obtain and process, but provides information directly about the flows. Data, especially that involving packet delays from source to destination, has been used to do something. See some references. On the other hand, the link-level data is less expensive to obtain, but provides less information about the underlying flows. These data have been studied extensively in the field of network tomography. This work examines the problem of predicting the traffic level on an unobserved link via measurements on a subset of the other links in the network. Rather than focusing solely on the inexpensive link-level data, we also employ flow-level measurements to inform a model that can then be used to solve the prediction problem. We demonstrate that this model, although initially requiring the expensive flow-level data, may be used with a large range of link-level data from the same network. Thus, this work is the first to combine both flow-level and link-level data in an efficient and feasible way.

The remainder of the paper is laid out as follows. Section 2 discusses the computer network framework and the link-level prediction problem in detail. Section 3 describes the proposed model that utilizes the flow-level traffic. Section 4 demonstrates the robustness of the resulting model, while Section 5 illustrates a potential application of the methodologies described herein in the case when all links may be observed.

Here, we introduce some notation and motivate our modeling framework in the context of network prediction.

Computer networks consist of collections of nodes (routers and switches) connected by physical links (optical or copper wires). The networks may be viewed as connected directed or undirected graphs. Figure 1 illustrates, for example, the topology of the Internet2 backbone network (I2), comprising 9 nodes (routers) and 26 unidirectional links.

Let $n$, $L$ and $J$ denote the number of nodes, links, and routes, respectively, of a given network. Typically, every node can serve both as a source and a destination of traffic and thus there are $J = n(n-1)$ different routes corresponding to all ordered (source, destination) pairs. Computer traffic from one node to another is routed over predetermined sets of links called paths or routes. These paths are best described in terms of the routing matrix $A = (a_{\ell j})_{L \times J}$, where

$$a_{\ell j} = \begin{cases} 1, & \text{link } \ell \text{ used in route } j, \\ 0, & \text{link } \ell \text{ not used in route } j, \end{cases} \qquad 1 \le \ell \le L,\ 1 \le j \le J.$$
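As an illustration of the routing matrix defined above, the following minimal sketch builds $A$ for a toy network. The three-node line topology, the `path` routing rule, and all function names are assumptions made for this example; they are not taken from the paper or from the Internet2 data.

```python
from itertools import permutations


def routing_matrix(links, routes):
    """Return the L x J 0/1 matrix A with A[l][j] = 1 iff link l is used by route j."""
    link_index = {link: l for l, link in enumerate(links)}
    A = [[0] * len(routes) for _ in links]
    for j, route in enumerate(routes):
        for link in route:
            A[link_index[link]][j] = 1
    return A


# Hypothetical line network a - b - c with unidirectional links (L = 4).
nodes = ["a", "b", "c"]
links = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]


def path(src, dst):
    """Links traversed from src to dst along the line a - b - c (toy routing rule)."""
    i, k = nodes.index(src), nodes.index(dst)
    step = 1 if k > i else -1
    return [(nodes[m], nodes[m + step]) for m in range(i, k, step)]


pairs = list(permutations(nodes, 2))      # all ordered (source, destination) pairs
routes = [path(s, d) for s, d in pairs]   # J = n(n - 1) = 6 routes
A = routing_matrix(links, routes)

# Each column sum is the hop count of the corresponding route.
hops = [sum(A[l][j] for l in range(len(links))) for j in range(len(routes))]
```

In this sketch the column for the route from `a` to `c` contains two 1's, matching the observation that a $k$-hop route corresponds to a column with exactly $k$ ones.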

Fig 1: The Internet2 topology, with prediction scenario 7 highlighted (see Table 5).

The rows of the matrix $A$ correspond to the $L$ links and the columns to the $J$ routes. In the Internet2 network, for example, the route $j$ from Chicago to Kansas City involves only one link, and thus the $j$-th column of $A$ has a single '1' in the row corresponding to the link connecting the two nodes; similarly, the $k$-hop routes correspond to columns of $A$ with precisely $k$ 1's. We are interested in the statistical modeling of the traffic on the entire network. Let $X(t) = (X_j(t))_{1 \le j \le J}$ be the vector of the traffic flows at time $t$ on all $J$ routes, i.e. between all source-destination pairs. That is, $X_j(t)$, $t = 1, 2, \ldots$, is the number of bytes transmitted over route $j$ during the time interval $((t-1)h, th]$, for some fixed $h > 0$. Depending on the context, the time unit $h$ can range from a few milliseconds up to several seconds or even minutes. Similarly, let $Y(t) = (Y_\ell(t))_{1 \le \ell \le L}$ be the vector of the traffic loads at time $t$ over all $L$ links. Figures 2 and 3 illustrate flow- and link-level traffic over the Internet2 network.
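To make the time indexing concrete, here is a small sketch of how byte counts for one route could be aggregated into the intervals $((t-1)h, th]$. The record format (timestamp in seconds, byte count) and the function name `bin_bytes` are hypothetical; the paper does not specify how its measurements are stored.

```python
import math


def bin_bytes(records, h, num_bins):
    """Sum bytes per interval ((t-1)h, th] for t = 1, ..., num_bins.

    records: iterable of (timestamp_seconds, num_bytes) pairs (assumed format).
    """
    counts = [0] * num_bins
    for ts, nbytes in records:
        # ts lies in ((t-1)h, th] exactly when t = ceil(ts / h).
        t = math.ceil(ts / h)
        if 1 <= t <= num_bins:
            counts[t - 1] += nbytes
    return counts


# Toy records for a single route, binned with h = 10 seconds.
records = [(0.5, 100), (10.0, 200), (10.5, 50), (25.0, 400)]
x_j = bin_bytes(records, h=10, num_bins=3)
```

Because the intervals are closed on the right, the record stamped at exactly 10.0 seconds is counted in the first interval rather than the second.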

Assuming that traffic propagates instantaneously through the network, the load $Y_\ell(t)$ on link $\ell$ at time $t$ equals the cumulative traffic of all routes using this link:

$$Y_\ell(t) = \sum_{j \in A_\ell} X_j(t),$$

where $A_\ell \subset \{1, \ldots, J\}$ is the set of routes that involve link $\ell$. In matrix notation, we obtain

$$Y(t) = A X(t). \tag{2.1}$$

We shall refer to (2.1) as the routing equation. In practice, this relationship between the flow-level traffic $X(t)$ and the link-level traffic $Y(t)$ is essentially exact, provided that the time scale of measurement $h$ is comparable to or greater than the maximum round trip time (RTT) for packets in the network. In this paper, we consider aggregate data over 10 second intervals, a time substantially greater than the R
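The routing equation (2.1) can be checked numerically on a small example. The sketch below uses a made-up routing matrix with two links and three routes, and made-up byte counts; the helper name `link_loads` is an assumption for illustration only.

```python
def link_loads(A, x):
    """Compute y = A x for a 0/1 routing matrix A (list of rows) and a flow vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Toy routing matrix: rows are links, columns are routes.
A = [
    [1, 1, 0],  # link 1 carries routes 1 and 2
    [0, 1, 1],  # link 2 carries routes 2 and 3
]
x = [500, 300, 200]   # bytes sent on each route during one interval
y = link_loads(A, x)  # bytes observed on each link during the same interval
```

Here two different flow vectors can produce the same link loads (for instance `[400, 400, 100]` also gives `[800, 500]`), which illustrates why link-level data carries less information about the underlying flows than flow-level data does.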


Reference

This content is AI-processed based on open access ArXiv data.
