Automatic Bandwidth Delay Product Discovery

Probably the most important unsolved technical problem with TCP over high-performance networks today is automatically determining the bandwidth-delay product (BDP), which is used to specify the simplified maximum TCP window size for each TCP session. The correct BDP is extremely important for maximum utilization of high-performance networks without undue memory consumption, particularly over very long distances. The simplified BDP is calculated by multiplying the round-trip time (RTT) by the maximum bandwidth of the least-capable hop among all of the router hops between two hosts using TCP. The RTT is easily obtained with "ping", which sends an ICMP Echo Request message, a special IP message that the receiving host echoes back to the sending host. Simply measuring the time it takes to get the echoed message back provides the RTT.

Right now, however, there is no quick and easy method to automatically obtain the least-bandwidth number between an arbitrary pair of TCP hosts, so all users who expect to obtain high-performance networking results are required to compute this value manually, which essentially implies that such users must have an intimate knowledge of the complete network topology between their own host and any host with which they wish to communicate. Furthermore, every application that is to make use of this information requires special coding and special user-interface parameters. This is a ludicrous situation akin to having to be an expert automobile mechanic to be able to drive an automobile (which was actually the case at the dawn of the automobile age). As long as dealing with BDP discovery requires network-engineer training of every user, we too will remain only at the dawn of the high-performance networking era.
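The simplified BDP calculation described above can be illustrated with a short Python sketch. The function name, the choice of integer milliseconds and bits per second as units, and the sample figures are assumptions made for illustration; they are not part of the proposal.

```python
def simplified_bdp_bytes(rtt_ms: int, bottleneck_bps: int) -> int:
    """Return the simplified BDP in bytes, rounded up.

    BDP (bits) = RTT (seconds) * bandwidth of the least-capable hop (bits/second).
    Integer milliseconds are used to avoid floating-point rounding.
    """
    bits_times_1000 = rtt_ms * bottleneck_bps   # RTT is in ms, so this is bits * 1000
    # Ceiling division by 8000 = 1000 (ms per second) * 8 (bits per byte).
    return -(-bits_times_1000 // 8000)


if __name__ == "__main__":
    # Illustrative numbers only: 70 ms RTT, slowest hop at 1 Gbit/s.
    print(simplified_bdp_bytes(70, 1_000_000_000))  # 8750000
```

With these sample numbers the window needed to keep the path full is about 8.75 MB, which shows why the value matters most on long-distance, high-bandwidth paths.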
The method proposed herein for automatic BDP discovery and caching is to use a simple mechanism modeled after the ICMP Echo Request and Echo Reply protocol to discover the bandwidth of the least-capable hop between a given source and destination host pair. This new mechanism could be a new type of ICMP Request/Reply pair, or it could be a simple enhancement to the existing Echo Request/Reply, but using a new IP option class/number combination. The main difference between the new mechanism and the existing ICMP Request/Reply pair is that each router would have to process two new fields in the message. The new mechanism would differ from the ICMP Echo protocol in only a couple of ways, and would work as follows:

1. Two new fields would be defined for the Request and Reply message types:

   a) A Next-Hop-Least-Bandwidth-Request field, and

   b) A Next-Hop-Least-Bandwidth-Reply field.

2. BDP Request and Reply would work as Echo Request and Reply do now, except that each router along the path would intercept the BDP messages and perform the following before forwarding the message to the next hop:

   a) If the message is a Request, the router would compare the value of the Next-Hop-Least-Bandwidth-Request field with the router's best knowledge of the next hop's maximum possible bandwidth. If the next-hop bandwidth is less than the field's value, the router would overwrite the field with the next-hop bandwidth; otherwise the field's existing value is left unchanged. Essentially, the router should always compare the maximum possible bandwidth of the next hop with the current Next-Hop-Least-Bandwidth-Request value. The goal is to return the maximum possible bandwidth of the least-capable link in the path so that the maximum TCP window will never be too small to consume all possible bandwidth, should such bandwidth ever be available. If a router isn't exactly sure of a link's maximum possible bandwidth, it should therefore use the largest bandwidth it thinks the link may be capable of. Note that this procedure is not a bandwidth allocation scheme, and a router should never use a value smaller than the maximum bandwidth that a link is capable of.

   b) If the message is a Reply, the router would compare the value of the Next-Hop-Least-Bandwidth-Reply field with the router's best knowledge of the next hop's maximum possible bandwidth. If the next-hop bandwidth is less than the field's value, the router would overwrite the field with the next-hop bandwidth; otherwise the field's existing value is left unchanged. Essentially, the router should always compare the maximum possible bandwidth of the next hop with the current Next-Hop-Least-Bandwidth-Reply value. The goal is to return the maximum possible bandwidth of the least-capable link in the path so that the maximum TCP window will never be too small to consume all possible bandwidth, should such bandwidth ever be available. If a router isn't exactly sure of a link's maximum possible bandwidth, it should therefore use the largest bandwidth it thinks the link may be capable of. Note that this procedure is not a bandwidth allocation scheme, and a router should never use a value smaller than the maximum bandwidth that a link is capable of.

3. The source host would fill in the initial Next-Hop-Least-Bandwidth-Request value, while the destination host would fill in the initial Next-Hop-Least-Bandwidth-Reply value. The destination host is also responsible for converting the received Request message into the outgoing Reply message, preserving all information received in the Request message, just as it would for a normal Echo Request message. The destination host could also extract the value of the Next-Hop-Least-Bandwidth-Request field.
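The per-router processing in steps 1 to 3 can be sketched as a small Python simulation. This is only an illustration under stated assumptions: the function names are hypothetical, bandwidths are plain integers in bits per second, and each host is assumed to initialize its field with the maximum bandwidth of its own outgoing link (the proposal does not specify the initial value).

```python
from typing import List, Tuple


def router_update(field_bps: int, next_hop_max_bps: int) -> int:
    """Per-router step: overwrite the field only if the next hop is slower."""
    return min(field_bps, next_hop_max_bps)


def traverse(link_max_bps: List[int]) -> int:
    """Carry a least-bandwidth field along one direction of the path.

    link_max_bps[0] is the originating host's own outgoing link (assumed
    initial field value); each later entry is the next-hop link that one
    router on the path compares against before forwarding.
    """
    field = link_max_bps[0]
    for next_hop in link_max_bps[1:]:
        field = router_update(field, next_hop)
    return field


def bdp_probe(forward_links: List[int], return_links: List[int]) -> Tuple[int, int]:
    """Simulate one Request/Reply exchange; return (request_field, reply_field)."""
    request_field = traverse(forward_links)  # updated hop by hop on the way out
    reply_field = traverse(return_links)     # updated hop by hop on the way back
    # The destination preserves the Request field when it builds the Reply,
    # so the source receives both values.
    return request_field, reply_field


if __name__ == "__main__":
    # Illustrative asymmetric path: 1 Gbit/s bottleneck out, 622 Mbit/s back.
    fwd = [10_000_000_000, 1_000_000_000, 10_000_000_000]
    ret = [10_000_000_000, 622_000_000, 10_000_000_000]
    print(bdp_probe(fwd, ret))  # (1000000000, 622000000)
```

Because each router only ever lowers the field, the value that arrives is the minimum over all links in that direction, which is why a router that is unsure should err toward the largest plausible bandwidth.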
Note that when the BDP Reply arrives at the source host, it would provide the RTT plus the least-bandwidth information for both the forward and return paths between the source host and the destination host, which is all the information necessary to calculate the pair of simplified BDPs. This information would be cached in a new kernel table indexed by destination host IP address, and individual cached entries would time out after some suitable interval to force fresh information to be obtained. Development of this BDP protocol initially requires the cooperation of at least one router vendor, though a crude prototype could be demonstrated with traceroute and SNMP-derived information.
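The cache described above might look roughly like the following Python sketch. It is a user-space illustration, not a kernel table; the class and field names, the 600-second default timeout, and the injectable clock are all assumptions introduced here, since the proposal leaves the "suitable interval" unspecified.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class BdpEntry:
    rtt_ms: int        # measured round-trip time
    forward_bps: int   # least bandwidth on the forward path
    return_bps: int    # least bandwidth on the return path
    stored_at: float   # clock reading when the entry was cached


class BdpCache:
    """Table of BDP probe results indexed by destination IP address."""

    def __init__(self, ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds   # hypothetical timeout; not given in the text
        self._clock = clock
        self._entries: Dict[str, BdpEntry] = {}

    def store(self, dest_ip: str, rtt_ms: int, forward_bps: int, return_bps: int) -> None:
        self._entries[dest_ip] = BdpEntry(rtt_ms, forward_bps, return_bps, self._clock())

    def lookup(self, dest_ip: str) -> Optional[BdpEntry]:
        """Return a fresh entry, or None if missing or timed out."""
        entry = self._entries.get(dest_ip)
        if entry is None:
            return None
        if self._clock() - entry.stored_at >= self._ttl:
            del self._entries[dest_ip]   # expired: caller must probe again
            return None
        return entry


if __name__ == "__main__":
    cache = BdpCache()
    cache.store("192.0.2.10", 70, 1_000_000_000, 622_000_000)
    print(cache.lookup("192.0.2.10"))
```

A `None` result from `lookup` is the signal to send a new BDP Request; the two cached bandwidths, each multiplied by the cached RTT, give the pair of simplified BDPs.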