Maximum Throughput Site-to-Site VPN Solution
Posted: Wed Jan 25, 2017 9:34 pm
I am working on a maximum-throughput site-to-site VPN solution, with dedicated VPN hardware on each end as well as dedicated traffic-generation computers (I used iperf, iperf3, and the SoftEther traffic generator). I am bench testing and everything is directly connected. End-to-end latency is ~5 ms through the loaded tunnel.
Pre-test 1: direct iperf (no SoftEther tunnel) between "vpnserver" and "vpnbridge" is ~920 Mbps with sub-0.1 ms latency, over 1 GigE NICs and a 1 GigE switch.
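For reference, the baseline runs were plain iperf between the two traffic generators, roughly like this (the address is a placeholder for my far-end box):

    iperf -s                       # on the far-end traffic generator
    iperf -c 192.0.2.10 -t 60      # single TCP stream, 60-second run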
Test 1: "vpnserver" and "vpnbridge" are Quad core i7-6700K @ 4.00 GHz on each end and with 1 GigE NIC hardware and 1 GigE switch. It appears I can max the tunnel out at ~850/850 Mbps bidirectional throughput. It does not appear to matter at all how many TCP sessions are active.
Pre-test 2: direct iperf (no SoftEther tunnel) between "vpnserver" and "vpnbridge" is ~9.15 Gbps with sub-0.1 ms latency, over 10 GigE NICs and a 10 GigE switch.
Test 2: "vpnserver" and "vpnbridge" are Octo core Xeon D-1541 @ 2.10 GHz on each end and with 10 GigE NIC hardware and 10 GigE switch. It appears the tunnel maxes out at ~425/425 Mbps bidirectional throughput. It does not appear to matter at all how many TCP sessions are active, I tried everything from 32 down to 1.
Enabling SoftEther Cascade "half-duplex" mode appeared to have no impact on overall throughput.
Disabling QoS appeared to have no impact on overall throughput.
During Test 2, I began going into the BIOS and reducing the number of active cores:
8 cores ~ 425/425 Mbps bidirectional throughput
4 cores ~ 425/425 Mbps bidirectional throughput
2 cores ~ 390/390 Mbps bidirectional throughput
1 core ~ 375/375 Mbps bidirectional throughput
(The reduced-core-count numbers are from memory; I was not recording at the time.)
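As an aside, if the boxes run Linux, the same core-reduction experiment can be done without a BIOS round-trip by offlining cores through sysfs (I used the BIOS; this is just an alternative):

    # take core 7 offline, then bring it back (cpu0 usually cannot be offlined)
    echo 0 | sudo tee /sys/devices/system/cpu/cpu7/online
    echo 1 | sudo tee /sys/devices/system/cpu/cpu7/online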
I also tried enabling "compression", but that really slowed things down (~125/125 Mbps bidirectional throughput).
My traffic load is iperf with "-d" (bidirectional) and many sessions running, which matches my real-world use case of many users on each side of the tunnel accessing multiple resources (not just a single bulk file copy). A representative invocation is shown below.
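For anyone wanting to reproduce the load, the client side looked roughly like this (address is a placeholder; this is classic iperf2, whose "-d" runs the reverse-direction test simultaneously):

    iperf -c 192.0.2.10 -d -P 16 -t 60    # 16 parallel TCP streams in each direction, 60 s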
My current conclusion, based on these results, is that SoftEther's tunnel processing appears to be bound to a single CPU core, so maximum throughput is dictated by maximum single-core processing speed.
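If it helps anyone confirm or refute this, per-core utilization during a test run can be watched like so (Linux, sysstat package; I am assuming a Linux host here):

    mpstat -P ALL 1    # per-core CPU utilization at 1-second intervals

One core pinned near 100% while the rest sit idle would support the single-core-bound theory.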
Can anyone confirm or deny this, please?
Did I miss something in my testing? Suggestions?
Does anyone know of a way to increase overall site to site throughput?
Is there a methodology for SoftEther to maximize core usage by establishing one tunnel per core and balancing the overall traffic load across them (a single traffic flow being limited to a single tunnel, but the aggregate being spread across all available cores)? Conceptually, I am picturing something like the sketch below.
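Purely as a hypothetical illustration of what I mean (not a SoftEther feature; interface names and the remote subnet are placeholders), on Linux one could run one bridge/client instance per tap interface and let kernel ECMP hash flows across the tunnels:

    # enable L4 (5-tuple) hashing so each flow sticks to one path
    sudo sysctl -w net.ipv4.fib_multipath_hash_policy=1
    # spread traffic for the remote site across two tunnel interfaces
    sudo ip route add 10.0.2.0/24 \
        nexthop dev tap0 weight 1 \
        nexthop dev tap1 weight 1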
Thank you!