Supported by the GlobalNOC at Indiana University


Knowledge Base


One of the Performance Engagement Team's missions is to share information with the R&E community to help others recognize and troubleshoot potential network performance issues on their own. This section houses anonymized, post-resolution write-ups of issues that the PET considers interesting or useful beyond the incident's original participants.



University of Hawaii Asymmetrical Performance

Keywords: Asymmetrical, Hawaii, UOH


Completed: November 10th 2016

The initial symptom appeared in the NetSage dashboard (http://data.ctc.transpac.org/maddash-webui/index.cgi?dashboard=IRNC%20Mesh) as poor BWCTL throughput from a perfSONAR node at the University of Hawaii toward all other external Pacific IRNC test points. The notable exception was the Transpac Seattle node, where performance was graphed as good in both directions.

 

While the dashboard results were being confirmed manually, an MTU mismatch was found between the 10GE adapter on test.seat.transpac.org and the Transpac router. The mismatch held frames to 1500 bytes rather than the jumbo frames configured on the host. Curiously, once the MTU was fixed at 9000 end to end, what had previously been our only “good” test showed the same “poor” result as the others. That change led to the discovery that manually reducing the MSS in the test parameters improved performance:

 

 [xxxxx@test ~]$ bwctl -T iperf3 -s uhmanoa-tp.ps.uhnet.net 
TCP MSS: 8948 (default)
[ 14] 0.00-10.00 sec 270 MBytes 227 Mbits/sec 29 sender
[ 14] 0.00-10.00 sec 264 MBytes 221 Mbits/sec receiver

[xxxxx@test ~]$ bwctl -T iperf3 -s uhmanoa-tp.ps.uhnet.net -m 5010
[ 15] 0.00-10.00 sec 425 MBytes 357 Mbits/sec 19 sender
[ 15] 0.00-10.00 sec 418 MBytes 351 Mbits/sec receiver

[xxxxx@test ~]$ bwctl -T iperf3 -s uhmanoa-tp.ps.uhnet.net -m 4010
[ 15] 0.00-10.00 sec 2.04 GBytes 1756 Mbits/sec 26 sender
[ 15] 0.00-10.00 sec 2.03 GBytes 1745 Mbits/sec receiver

[xxxxx@test ~]$ bwctl -T iperf3 -s uhmanoa-tp.ps.uhnet.net -m 3010
[ ID] Interval Transfer Bandwidth Retr
[ 15] 0.00-10.00 sec 3.05 GBytes 2616 Mbits/sec 0 sender
[ 15] 0.00-10.00 sec 3.03 GBytes 2605 Mbits/sec receiver

[xxxxx@test ~]$ bwctl -T iperf3 -s uhmanoa-tp.ps.uhnet.net -m 2000
[ 15] 0.00-10.00 sec 4.22 GBytes 3626 Mbits/sec 0 sender
[ 15] 0.00-10.00 sec 4.22 GBytes 3626 Mbits/sec receiver
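The default MSS of 8948 reported in the first run is what a 9000-byte IP MTU would be expected to yield. Below is a minimal Python sketch of that arithmetic. It assumes IPv4 with no IP options and TCP timestamps enabled; neither is stated in the incident notes, and the function name `tcp_mss` is purely illustrative.

```python
# Header sizes in bytes. These are standard protocol values,
# not figures taken from the incident notes.
IPV4_HEADER = 20           # IPv4 header without options (assumed)
TCP_HEADER = 20            # TCP header without options
TCP_TIMESTAMP_OPTION = 12  # 10-byte timestamps option padded to 12 (assumed enabled)


def tcp_mss(ip_mtu: int, timestamps: bool = True) -> int:
    """Largest TCP payload per segment for a given IPv4 MTU."""
    mss = ip_mtu - IPV4_HEADER - TCP_HEADER
    if timestamps:
        mss -= TCP_TIMESTAMP_OPTION
    return mss


for mtu in (1500, 9000):
    print(f"IP MTU {mtu}: MSS {tcp_mss(mtu)}")
```

Under the same assumptions a 1500-byte MTU gives an MSS of 1448, so the MSS that a test reports is a quick hint about which MTU the path is actually using.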

 

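It was clear from this testing that something in the path between the hosts was affecting behavior: throughput rose as the MSS shrank, and retransmits dropped to zero at the two smallest MSS values.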
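The five runs above can be put side by side to make the trend easier to see. The sketch below is only a convenience for reading the results: the figures are the sender-side values transcribed from the output above, and `RUNS` and `summarize` are hypothetical names.

```python
# (MSS in bytes, sender throughput in Mbits/sec, retransmits),
# transcribed from the five bwctl/iperf3 runs above.
RUNS = [
    (8948, 227, 29),   # default MSS
    (5010, 357, 19),
    (4010, 1756, 26),
    (3010, 2616, 0),
    (2000, 3626, 0),
]


def summarize(runs):
    """Return rows of (mss, mbps, retransmits, speedup vs. the first run)."""
    baseline = runs[0][1]
    return [(mss, mbps, retr, mbps / baseline) for mss, mbps, retr in runs]


print(f"{'MSS':>6} {'Mbits/sec':>10} {'Retr':>5} {'vs default':>11}")
for mss, mbps, retr, ratio in summarize(RUNS):
    print(f"{mss:>6} {mbps:>10} {retr:>5} {ratio:>10.1f}x")
```

On these numbers the 2000-byte MSS run achieves roughly 16 times the throughput of the default run, and the two smallest MSS values are the only ones with zero retransmits.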

 

Several detailed UH maps depicting both the physical and logical network layout were provided to the IRNC NOC, and the same was requested from the layer 2 provider, Network A.

 

A software bug was identified with Vendor A, the equipment vendor for Network A, that loosely matched the characteristics of the issue. Although the bug could not be replicated in a lab environment, a chassis code upgrade was performed. Behavior did not change after the upgrade.

Another hardware point common to all traffic matching the issue was a switch at the UH border, in transit between the router and the optical equipment. This device, from Vendor B, is known to have small buffers and no cut-through switching. Maintenance was scheduled to bypass the switch altogether. Testing with the switch bypassed revealed a dramatic increase in performance with ip mtu 9000.

 

 

 Final Findings

  • Ultimately, the issue was caused by an under-buffered switch model from Vendor B that could not perform optimally on longer paths.
  • The initial issue matched the signature of Vendor A's software bug, which falsely pointed the finger in that direction until those switches could be upgraded and removed from the troubleshooting path.
  • Having accurate layer 2 maps, with equipment models identified, from the beginning of the investigation might have alerted the engineer sooner to the possibility that the issue was somewhere other than on Vendor A’s switches.

