As part of our Net100 efforts to improve bulk transfers over high-speed, high-latency networks, we are evaluating the Web100 tools. This work is being done by Tom Dunigan and Florence Fowler. The Web100 project provides the software and tools necessary for end-hosts to automatically and transparently achieve high-bandwidth data rates (100 Mbps) over high-performance research networks. Initially the software and tools have been developed for the Linux operating system. The Web100 distribution contains kernel patches, an API, and some GUI tools for observing and controlling the TCP parameters of a given flow (see GUI example).
At ORNL, we are interested in using the Web100 tools to diagnose network throughput problems and in using the auto-tuning capabilities to optimize bulk transfers over gigabit links without parallel streams. We have also been using TCP over UDP, which lets us tune TCP-like parameters and get detailed feedback on losses and retransmissions. This work is now part of the multi-institution Net100 project.
To date we have Web100 installed on five ORNL x86/Linux PCs (100/1000E), a laptop with wireless (802.11b) and 56kb slip/ppp, a home PC over cable modem, and a UT x86/Linux PC (100E). The Web100 project also provides access to web100-enabled Linux boxes at NERSC, NCAR, NCSA, SLAC, CERN, and LBL.
Our current Net100 Web100-based tools include:
Java bandwidth tester
As a demonstration project, we have developed a Java bandwidth tester in which the server reports back web100 info such as window sizes, RTTs, retransmits, SACK-enabled, etc., as illustrated in the following figure.
The applet talks with a web100-enabled server that reports back bandwidth and information on losses and buffer sizes. Try out our ORNL web100 Java applet or our UT web100 Java applet. The server logs a few of the Web100 stats for the server-to-client transfer. Here is a summary of lossy sessions from three servers that have been active at one time or another since fall of '01.
We have modified ttcp to report Web100 counters for a session. The user provides a config file specifying which web100 variables to report at the end of the transfer.
One of the interesting problems is how long to run a ttcp or iperf test to get a good bandwidth estimate. Often 10 seconds is chosen, but we think better bandwidth estimates can be had with web100.
The following plot (ns simulation, no packet loss, delayed ACKs) shows both average and instantaneous bandwidth for a 100 Mbps path with 100 ms RTT and a 1 Gbps path with 200 ms RTT.
For the 100 Mbps path, a 10-second test will report about 90% of the available bandwidth. For the 1 Gbps path, a 10-second test will report only about 60% of the real bandwidth. Also notice that slow-start is over in about two seconds for the 100 Mbps path but takes nearly 5 s for the 1 Gbps path. If one waits until slow-start is over and then uses web100 variables to measure the data throughput for about a second, one can get a good bandwidth estimate with a test under 10 seconds. The duration of slow-start is roughly log2 of the window size (in segments) times the RTT (actually, a little less than twice this because of delayed ACKs). For example, the bandwidth-delay product for 1 Gbps/200 ms RTT is about 16667 1500-byte segments. Log2 of that is about 14, so 2*14*0.200 is 5.6 seconds. For 10GigE on the same 200 ms path, slow-start lasts about 7 seconds (see the graph below). The time to reach 90% of the bandwidth is about 10 times the slow-start duration minus 10 times the RTT. Increasing the MTU/MSS can help, but doubling the MTU removes only one RTT from the duration of slow-start. Web100 can also tell you during the test whether you're experiencing drops; if so, you might as well terminate the throughput test. Les Cottrell and Ajay Tirumala have some experimental results with iperf quick mode. (Using iperf -i 1 can also give instantaneous snapshots.)
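The slow-start rule of thumb above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the text, not part of the Web100 tools: the function name and parameters are ours, the 1500-byte segment size is the one used in the example, and the factor of two for delayed ACKs is an upper bound (the text notes the real value is a little less).

```python
import math

def slow_start_seconds(bandwidth_bps, rtt_s, segment_bytes=1500, delayed_acks=True):
    """Rough slow-start duration: log2(window in segments) * RTT.

    With delayed ACKs the estimate is doubled, which slightly
    overstates the duration (the text says "a little less than twice").
    """
    # Bandwidth-delay product expressed in segments.
    window_segments = bandwidth_bps * rtt_s / (8 * segment_bytes)
    rounds = math.log2(window_segments)
    factor = 2.0 if delayed_acks else 1.0
    return factor * rounds * rtt_s

if __name__ == "__main__":
    paths = [
        ("100 Mbps, 100 ms RTT", 100e6, 0.100),
        ("1 Gbps, 200 ms RTT", 1e9, 0.200),
        ("10 Gbps, 200 ms RTT", 10e9, 0.200),
    ]
    for label, bw, rtt in paths:
        print(f"{label}: about {slow_start_seconds(bw, rtt):.1f} s of slow-start")
```

For the three paths discussed here, the estimates come out near 2, 5.6, and 7 seconds, in line with the figures quoted above.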
We developed a simple web100 daemon (webd) that has a config file of network addresses to "watch"; webd logs a selected set of Web100 variables when a "watched" stream closes. At present, the daemon wakes every 30 seconds to check stream status. The log is a simple flat ASCII file that could be used for statistical analysis or autotuning. An entry from the log file looks like
We also have a tracing daemon (traced) that has a config file specifying paths and web100 variables to trace at 0.1 sec intervals. A tracefile for each flow is produced. We are working with PSC on a more general Web100 tracing facility.
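As an illustration of how samples taken at fixed intervals can be turned into bandwidth figures, here is a minimal Python sketch. It assumes each sample is a (time in seconds, cumulative bytes sent) pair; that layout and the function names are hypothetical and are not the actual traced file format.

```python
def interval_rates_mbps(samples):
    """Per-interval throughput from (time_s, cumulative_bytes) samples.

    Returns a list of (time_s, Mbit/s) pairs, one per consecutive
    pair of samples. Intervals with non-increasing time are skipped.
    """
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue
        rates.append((t1, (b1 - b0) * 8 / dt / 1e6))
    return rates

def average_rate_mbps(samples, start_s=0.0):
    """Average throughput over samples at or after start_s.

    Setting start_s to the end of slow-start excludes the ramp-up
    from the estimate. Returns None if fewer than two samples remain.
    """
    kept = [s for s in samples if s[0] >= start_s]
    if len(kept) < 2 or kept[-1][0] <= kept[0][0]:
        return None
    (t0, b0), (t1, b1) = kept[0], kept[-1]
    return (b1 - b0) * 8 / (t1 - t0) / 1e6

if __name__ == "__main__":
    # Made-up samples at 0.1 s spacing, for illustration only.
    trace = [(0.0, 0), (0.1, 125_000), (0.2, 375_000), (0.3, 625_000)]
    print(interval_rates_mbps(trace))
    print(average_rate_mbps(trace, start_s=0.1))
```

Averaging only over samples taken after slow-start is the same idea as the shorter bandwidth test described earlier: the ramp-up period is excluded rather than averaged in.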
Web100 validation testing
We ran some tests to see whether the web100 kernel (0.2) ran any slower for bulk transfers. We tested over the local net with 100 Mbps Ethernet and GigE with jumbo frames. The following results show that the web100 additions have little effect on throughput.
We have run tests from NERSC, ANL, SDSC, ucar.edu, UT, SLAC, CERN, wireless (802.11b), ISDN, SLIP/PPP, and home cable systems. Wide-area networks included ESnet (OC12/OC3), UT (VBR/OC3), and Internet2. Local-area networks included 100T and GigE (including jumbo frames). We have also run tests over our NISTnet testbed. We have validated many of the Web100 variables against tcpdump/tcptrace, and submitted some bug fixes along the way.
Web100 as a diagnostic tool
We have used web100 to uncover a variety of network problems.