This is a short note to recommend changing the permitted initial window in TCP from 1 packet to 4K bytes. (More precisely, the initial window would be the maximum of the initial segment size and 4380 bytes (3x1460). An additional constraint would be that the initial window would be at most four times the initial segment size. As always, the upper bound imposed by the receiver's advertised window would still apply. For brevity, for the rest of the note I refer to this simply as a larger initial window.) We emphasize that we do not propose REQUIRING a larger initial window for TCP; we simply propose explicitly ALLOWING one.

This was discussed at the End-to-end Research Group meeting in November 1996. This general change is supported by a number of other people, including Fred Baker, Jon Crowcroft, Van Jacobson, Jamshid Mahdavi, Matt Mathis, and Vern Paxson. That does not, however, necessarily mean that they have each read and agree with every word in the most recent draft of the note. I also thank Bob Braden, K.K. Ramakrishnan, and Joe Touch for feedback on this note.

The change:

The recommendation is to allow the initial window used by a TCP connection to increase from 1 packet (or more precisely, segment) to roughly 4K bytes. The initial window would be the maximum of the initial segment size and 4380 bytes (3x1460). An additional constraint would be that the initial window would be at most four times the initial segment size. Thus, TCP connections sending 1460-byte packets could initially send three packets, and TCP connections sending 512-byte packets could initially send four packets.

This increased initial window would be optional: the proposal is that a TCP MAY start with a larger initial window, not that it SHOULD.

This would only apply to the initial window of the connection, in its very first roundtrip time of data, or to connections that are just beginning to send data after a long quiescent period.
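As a minimal sketch of the window rule stated above (not part of the proposal itself), the computation can be written as a small Python function. The function and parameter names are hypothetical, chosen only for illustration; the constants 4380 and 4 are the ones given in the text, and the optional receiver-window argument reflects the statement that the advertised window remains an upper bound.

```python
from typing import Optional


def initial_window_bytes(mss: int, cap_bytes: int = 4380, max_segments: int = 4) -> int:
    """Permitted initial window in bytes for a given segment size.

    Rule from the text: the maximum of the segment size and 4380 bytes,
    but never more than four times the segment size.
    """
    if mss <= 0:
        raise ValueError("segment size must be positive")
    return min(max_segments * mss, max(mss, cap_bytes))


def initial_window_segments(mss: int, rwnd: Optional[int] = None) -> int:
    """Number of full-sized segments that fit in the initial window.

    rwnd (hypothetical parameter) is the receiver's advertised window in
    bytes; when given, it caps the initial window.
    """
    window = initial_window_bytes(mss)
    if rwnd is not None:
        window = min(window, rwnd)
    return window // mss


if __name__ == "__main__":
    for mss in (512, 1460, 9000):
        print(mss, initial_window_bytes(mss), initial_window_segments(mss))
```

With these assumptions, a 1460-byte segment size gives three segments and a 512-byte segment size gives four, matching the two cases worked out in the text; a segment larger than 4380 bytes still gets a window of one segment.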
The larger initial window would not change the behavior after a retransmit timeout, when the sender would continue to slow-start from an initial window of one packet.

The benefits:

(1) For connections with only a small amount of data to send, a larger initial window would reduce the time needed to send all of the data (assuming moderate packet drop rates). For the many email and web page files that are less than 4K bytes, the larger initial window would reduce the data transfer to a single roundtrip time.

(2) For connections that will be able to use a large congestion window, this eliminates several roundtrip times in the initial slow-start to increase the congestion window from one packet to two, four, and then eight packets. This would be of particular benefit for high-bandwidth large-propagation-delay TCP connections, such as those over satellite links.

(3) For connections with packets of no more than 2K bytes, this means that the sender can initially send at least two packets. For data receivers that use a "delayed ACK" but that send an ACK for at least every second packet (as recommended in RFC 1122 Section 4.2.3.2), this means that the data receiver will send an ACK immediately after receiving the second packet. This avoids the delay (of 0.1 seconds or more) that can occur when a sender starts with a congestion window of one packet.

Implementation issues:

When implemented along with Path MTU Discovery, only one of the packets in the initial window should have the "Don't Fragment" bit set.

If implemented, the initial window MUST be configurable. The default setting of the initial window (to either one segment, or up to 4380 bytes) SHOULD be per assigned numbers. Thus implementations will use the preconfigured standard value by default, but the standard value can be tuned within the allowed range for some specific context.
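As a rough illustration of benefits (1) and (2) listed earlier, the following Python sketch counts the roundtrip times needed to deliver a transfer under a deliberately idealized slow-start model. The model is an assumption made for illustration only: no packet drops, an ACK for every packet (so no delayed-ACK effects), a window that doubles each roundtrip, and no limit from the receiver's advertised window. The function name is hypothetical.

```python
def slow_start_rounds(total_segments: int, initial_window: int) -> int:
    """Roundtrips needed to send total_segments in idealized slow-start.

    Assumes (for illustration only) no loss, one ACK per segment, and a
    congestion window that doubles every roundtrip.
    """
    if initial_window <= 0:
        raise ValueError("initial window must be at least one segment")
    sent = 0
    cwnd = initial_window
    rounds = 0
    while sent < total_segments:
        sent += cwnd   # send a full window in this roundtrip
        cwnd *= 2      # each ACKed segment grows the window by one segment
        rounds += 1
    return rounds


if __name__ == "__main__":
    # A file of under 4K bytes fits in three 1460-byte segments.
    print(slow_start_rounds(3, 1))    # 2 roundtrips with a one-segment window
    print(slow_start_rounds(3, 3))    # 1 roundtrip with a three-segment window
    # A longer transfer of 100 segments.
    print(slow_start_rounds(100, 1))  # 7
    print(slow_start_rounds(100, 4))  # 5
```

Under these simplifying assumptions, the small transfer completes in one roundtrip instead of two, and the longer transfer saves two roundtrips. Real connections with delayed ACKs or packet drops will differ.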
Even though the initial window is at most four times the initial segment size, under some limited conditions TCP may send more than four packets in the initial burst. This would occur, for example, if the TCP data sender sends an initial large packet with the "Don't Fragment" bit set, discovers that the MTU should be set to 512 bytes, and then retransmits eight 512-byte segments.

This larger initial window should not be viewed as an encouragement for web browsers to open four simultaneous TCP connections all with larger initial windows. (Web browsers should not open four simultaneous TCP connections to the same destination in any case, because this works against TCP's congestion control mechanisms.)

Are there drawbacks to the connection?

In high-congestion environments, particularly for routers that have a bias against bursty traffic (as in the typical Drop Tail router queues), a TCP connection could sometimes be better off starting with an initial window of one packet. There are scenarios where a TCP connection slow-starting from an initial window of one packet might not have packets dropped, while a TCP connection starting with an initial window of four packets might have packets dropped unnecessarily, due to the inability of the router to handle small bursts. This could result in an unnecessary retransmit timeout. For a large-window connection that is able to recover without a retransmit timeout, this could result in an unnecessarily early transition from the slow-start to the congestion-avoidance phase of the window increase algorithm.

These premature packet drops should not happen in uncongested networks, or in moderately-congested networks where the congested router uses RED (Random Early Detection) queue management [FJ93].

Some TCP connections will receive better performance with the higher initial window even if the burstiness of the initial window results in premature packet drops.
This will be true if (1) the TCP connection recovers from the packet drop without a retransmit timeout, and (2) the TCP connection is ultimately limited to a small congestion window by either network congestion or by the receiver's advertised window.

Because some connections could get better performance with an initial window of one packet, using a larger initial window should be optional, not a requirement.

Is there any danger to the network?

We consider two separate potential dangers for the network. The first danger would be a scenario where a large number of packets on congested links were duplicate or unnecessarily-retransmitted packets that had already been received at the receiver. The second danger would be a scenario where a large number of packets on congested links were packets that would be dropped later in the network before reaching their final destination.

Unnecessarily-retransmitted packets:

As described in the previous section, the larger initial window could occasionally result in a packet dropped from the initial window, when that packet might not have been dropped if the sender had slow-started from an initial window of one packet. However, Appendix A shows that even in this case, the larger initial window would not result in a large number of unnecessarily-retransmitted packets.

Packets dropped later in the network:

How much would the larger initial window for TCP increase the number of packets on congested links that would be dropped before reaching their final destination? This is a problem that can only occur for connections with multiple congested links, where some packets might use scarce bandwidth on the first congested link along the path, only to be dropped later along the path.

First, many of the TCP connections will have only one congested link along the path. Packets dropped from these connections do not "waste" scarce bandwidth, and do not contribute to congestion collapse.
However, some TCP connection paths will have multiple congested links, and packets dropped from the initial window could use scarce bandwidth along the earlier congested links before being dropped. To the extent that the drop rate is independent of the initial window used by TCP packets, the problem of congested links carrying packets that will be dropped before reaching their destination will be similar for TCP connections that start by sending four packets or one packet.

It is true that for a network with high packet drop rates, increasing the initial TCP congestion window could increase the packet drop rate even further. This is in part because routers with Drop Tail queue management have difficulties with bursty traffic in times of congestion. However, this should be a second-order effect. This has not been explored with extensive simulations, or with extensive analysis, and simulations would certainly be useful. However, given uncorrelated arrivals for TCP connections, the larger initial TCP congestion window should generally not significantly increase the packet drop rate.

Other changes in the network are also making a larger initial window less of a problem. These include the increasing deployment of higher-speed links, where 4K bytes is a rather small quantity of data, and the deployment of queue management mechanisms such as RED that are more tolerant of transient traffic bursts. The current dangers of congestion collapse most likely now come not from a 4K initial burst from TCP connections, but from the increased deployment of UDP connections without any end-to-end congestion control at all.

Are there fairness considerations?

No. This does not mean that a TCP connection with an initial window of one 1460-byte packet and a TCP connection with an initial window of three 1460-byte packets would see exactly the same throughput in all scenarios.
This means that there would be no systematic unfairness against TCP connections that do not use the larger initial window.

Summary:

This is a small change to make to TCP that does not present any major dangers, and that is likely to be of benefit to TCP connections with long roundtrip times (saving several roundtrip times of the initial slow-start).

The process:

What would be the process of making this change? If a discussion results in rough consensus on the end-to-end-interest mailing list, then I will write an internet draft, gather supporting documentation, and submit it to the Transport Area Directors, who would submit it to the IESG.

The current standards:

What do the current standards say about the initial congestion window? Section 4.2.2.15 of RFC 1122 says the following:

    Recent work by Jacobson [TCP:7] on Internet congestion and TCP retransmission stability has produced a transmission algorithm combining "slow start" with "congestion avoidance". A TCP MUST implement this algorithm.

I am not aware of any other discussion of the initial window in the standards literature.

Appendix A:

In the current environment (without Explicit Congestion Notification), all TCPs use packet drops as the feedback mechanism from the network about the limits of available bandwidth. However, the change to a larger initial window should not result in a large number of unnecessarily-retransmitted packets.

If a packet is dropped from the initial window, there are three different ways for TCP to recover: (1) Slow-starting from a window of one packet, as is done after a retransmit timeout, or after Fast Retransmit in Tahoe TCP; (2) Fast Recovery without SACK, as is done after three DUP ACKs in Reno TCP; and (3) Fast Recovery with SACK, for TCP where both the sender and the receiver support the SACK option.

In all three cases, if a single packet is dropped from the initial window, there are no unnecessarily-retransmitted packets.
(Note that for a TCP sending four 512-byte packets in the initial window, a single packet drop will not require a retransmit timeout, but can be recovered from using the Fast Retransmit procedure.)

What if multiple packets are dropped from the initial window? Using the first recovery method, slow-starting from a window of one packet, the number of unnecessarily-retransmitted packets is limited [FF96]. In the second case of Fast Recovery without SACK, multiple packet drops from a window of data generally result in a retransmit timeout. Again, the number of unnecessarily-retransmitted packets is small. In the third case, of Fast Recovery with SACK, there can only be unnecessarily-retransmitted packets if a precise pattern of ACK packets is also lost [F96], or if packets are seriously reordered in the network. In any case, the number of unnecessarily-retransmitted packets due to a larger initial window should be small.

References:

[FF96] Fall, K., and Floyd, S., Simulation-based Comparisons of Tahoe, Reno, and SACK TCP. To appear in Computer Communications Review, July 1996.

[F96] Floyd, S., Issues of TCP with SACK. Technical report, January 1996. Available from http://www-nrg.ee.lbl.gov/floyd/.

[FJ93] Floyd, S., and Jacobson, V., Random Early Detection gateways for Congestion Avoidance. IEEE/ACM Transactions on Networking, V.1 N.4, August 1993, p. 397-413.