Building Models for Aggregate Traffic on Congested Links

Why we need better models for aggregate traffic:

Good models for aggregate traffic are needed for simulations, experiments, and analysis focused on evaluating mechanisms for active queue management, scheduling, differentiated and integrated services, aggregate-based congestion control, and the like. Research in this area requires a model for the aggregate traffic at a congested or occasionally-congested link; queue management, scheduling, and QoS mechanisms are of little concern for those links that *never* experience congestion. We already know a lot about characterizing aggregate traffic in terms of the range of round-trip times, the distribution of connection sizes, the long-range dependence of the aggregate traffic, etc. Topology-related factors such as the effects of congestion elsewhere in the network (on both the forward and reverse paths) can also significantly affect the behavior of aggregate traffic on a link. There is a wide range of research and investigation currently in progress on these issues.

In this web page we attempt to summarize what we collectively know about the characteristics of aggregate traffic, and what we know about realistic or interesting models for generating aggregate traffic for experiments and simulations. Good models of aggregate traffic are needed for analysis as well (so that the analysis is not restricted to models of one-way traffic with long-lived flows with a common round-trip time). In addition, we attempt to identify key open issues. One end goal is a set of reference scenarios for generating aggregate traffic, for use in simulations, experiments, and analysis.

This web page does not investigate models for end-to-end paths. Such models would be needed in simulations, experiments, and analysis investigating the end-to-end performance of transport protocols. Key properties of end-to-end paths include loss rates and per-packet delay, as well as reordering, corrupted packets from noisy links, reverse-path congestion, asymmetric bandwidth, and the like. The performance of transport protocols is of course also affected by the router mechanisms and competing traffic experienced along the path.

Characterizing the congested links:

* An earlier questions page asked the following question: Which links in fact experience congestion in the current Internet (and what kinds of links can be expected to experience congestion in the future Internet)? Where does the congestion occur? In edge networks? Peering points? Transoceanic links? Links outside of North America? How can the congested links be characterized, in terms of bandwidth, propagation delay, level of statistical multiplexing, and the like?

* Is it helpful to have separate models for different types of congested links: e.g., access links, campus links, transoceanic links, links to public or private peering points, etc.?

* What are "typical" patterns for the level of congestion?

Answer: We don't know. The "level of congestion" can be quantified by the packet drop rates at the queue. The Internet Traffic and Weather section of the web page on Measurement Studies of End-to-End Congestion Control in the Internet includes pointers to the Internet Traffic Report, the Internet Weather Report, the Internet End-to-end Performance Monitoring Group, and other sites that have long had measurements of the packet loss rates of pings to various routers in the Internet. Most of this measurement is necessarily about end-to-end path properties, rather than direct router measurements, but it gives an upper bound on packet loss rates for the links along that path.
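The upper-bound argument above can be illustrated with a minimal sketch. It assumes that packet losses on different links are independent, which is an assumption made here for illustration and not a claim from the measurement sites; the per-link drop rates are hypothetical values, and `end_to_end_loss` is a name introduced only for this example.

```python
from math import prod

def end_to_end_loss(link_loss_rates):
    """Path loss rate, ASSUMING independent losses on each link."""
    # A packet survives the path only if it survives every link.
    return 1.0 - prod(1.0 - p for p in link_loss_rates)

# Hypothetical per-link drop rates along one path (illustrative only).
link_loss = [0.001, 0.02, 0.005]
path_loss = end_to_end_loss(link_loss)

# The path loss rate is at least the loss rate of each link on the path,
# so a measured path loss rate bounds every link's loss rate from above.
assert all(p <= path_loss for p in link_loss)
print(f"path loss = {path_loss:.4f}, worst link = {max(link_loss):.4f}")
```

The bound can be loose: a path loss rate says nothing about which link along the path is responsible for the drops.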

* How often are there periods of extreme congestion, e.g., from flash crowds, DoS attacks, link failures, or other causes?

Characterizing the range of round-trip times:

* An earlier questions page asked the following question: For packets on a particular link, each packet could be assigned an estimated end-to-end delay (one-way or round-trip), based on the IP source and destination addresses for that packet. For packets on a particular link, what can we say about the distribution of end-to-end delay?

Answers: The summary so far is that a range of more than 10:1 in round-trip times seems common, with most of the connections (85%, in one case) having round-trip times between 15 and 500 ms.

* For round-trip time measurements greater than 500 ms, to what extent is this delay due to queueing delay, to routing problems, or to delay at the end node?

Characterizing the distribution of flow sizes:

* What is the distribution of flow sizes, ranging from "mice" (short web transfers) to "elephants" (long-lived sessions)?

Answer: The section on Mice and Elephants on the web page on Measurement Studies of End-to-End Congestion Control in the Internet includes pointers to a range of measurements for the distribution of flow sizes. The distribution of connection sizes is usually modeled by a log-normal distribution for the body of the distribution, and with a heavy tail (e.g., a Pareto distribution with shape parameter less than two).
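For simulations, the log-normal body with a Pareto tail described above could be sketched as a simple two-component mixture. This is only one possible construction, not a reference model: the function name `sample_flow_size` and every parameter value (`tail_prob`, `mu`, `sigma`, `tail_min`, `tail_shape`) are assumptions chosen for illustration rather than values taken from the measurements, apart from the tail shape being less than two.

```python
import random

def sample_flow_size(rng, tail_prob=0.05, mu=8.0, sigma=1.5,
                     tail_min=1_000_000.0, tail_shape=1.2):
    """Draw one flow size in bytes: log-normal body, Pareto tail.

    All default parameter values are illustrative assumptions.
    """
    if rng.random() < tail_prob:
        # paretovariate(alpha) returns a value >= 1; scale it by tail_min.
        return tail_min * rng.paretovariate(tail_shape)
    # Body of the distribution ("mice"): log-normal flow sizes.
    return rng.lognormvariate(mu, sigma)

rng = random.Random(1)  # fixed seed so that runs are repeatable
sizes = sorted(sample_flow_size(rng) for _ in range(100_000))

n_top = len(sizes) // 100            # the largest 1% of flows
share = sum(sizes[-n_top:]) / sum(sizes)
print(f"median flow size: {sizes[len(sizes) // 2]:.0f} bytes")
print(f"share of bytes in the largest 1% of flows: {share:.2f}")
```

With a tail shape below two the Pareto component has infinite variance, so byte totals from runs like this vary widely with the seed and the sample size.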

The web page on Self-Similarity and Long Range Dependence in Networks also has some pointers to work in this area.

See also the web page on Modeling Peer-to-Peer Traffic.

Characterizing the non-TCP traffic:

* What is the fraction of non-TCP traffic on a link?

Answers: Recent measurements tend to show that 90-95% of the bytes on a link are from TCP, leaving roughly 5-10% for non-TCP traffic. The section on Bandwidth used by Different Traffic Types on the web page on Measurement Studies of End-to-End Congestion Control in the Internet includes pointers to a range of studies showing the traffic breakdown on various links by protocol and by application.

* How can this non-TCP traffic be characterized, in terms of the applications and other characteristics?

Characterizing the effects of congestion and other delays elsewhere on the forward and reverse paths:

* The paper by Sarvotham et al. on Connection-level Analysis and Modeling of Network Traffic suggests that bursts arise from high-volume "alpha" traffic, with the "beta" traffic consisting of traffic constrained by low-bandwidth links elsewhere in the network.

Characterizing the delay distribution:

* The internet draft on Statistics of One-Way Internet Packet Delays by Corlett et al., 2002:
"The two local datasets are characterized by quiet periods ... separated by periods of severe volatility. In contrast, the international dataset showed only small variations in [delay statistics] over a four-day measurement period."
Several of these questions are taken from Sally's Questions web page.
Proposed additions to this page can be sent to Sally Floyd.
This material is based upon work supported by the National Science Foundation under Grant No. 0230921. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Last modified: September 2002