Poor LACP LAG Utilization Using SRC IP+DST IP Hash Algorithm

Thursday - last edited Yesterday
Model: TL-SX3016F  
Hardware Version: V1
Firmware Version: 1.20

I have a 4-port LACP LAG configured between my 10G switch and each of my 3 Proxmox hosts. The hosts are using a layer 2+3 LACP hash, and the switch is configured to use a layer 3 hash (SRC IP+DST IP). I tested load balancing efficiency using iperf. My findings were very underwhelming. I found that even if the host distributes load across all 4 LAG member interfaces, the switch will, at most, use 2 interfaces. I expected to see this improve by increasing concurrent data streams but saw no change.


I understand that load balancing isn't exactly the primary purpose of the hashing algorithm, but these results seem a little extreme, almost like the switch isn't making much of an attempt at all. Is there anything that I can do to improve load balancing efficiency?

 

Additional testing notes:

  • I tested up to 14 concurrent data streams.
  • Each iperf data stream was bound to a different IP address on a different subnet.
  • If I connect 2 hosts directly, Proxmox manages to push approximately 30G. The switch maxes out at 12G.
  • Load balancing efficiency improves if I switch the hash algorithm to DST IP instead, but this seems less ideal for other reasons.
  • I'm using Omada controller version 6.0.0.25.

 

Thanks in advance

 

-Newb

4 Reply
Re:Poor LACP Load Balancing Performance Using SRC IP+DST IP Hash Algorithm
Thursday - last edited Thursday

  @NewbAdmin 

 

I found I got much better LAG balancing using only SRC or only DST hashing. The switch only applies the hash/load balancing to data it is putting onto the LAG, not to what it is receiving on the LAG, and getting the best mix at both ends is key.

 

For example

 

Gateway <10g> Core Switch <4x 1g LAG> POE Switch <> EAPs

 

I find that I get the best LAG usage if I set the POE switch to "DST IP" (because the gateway IPs are different, but the gateway MAC is always the same) and the core switch to "DST MAC", since wireless clients have different MACs but very similar IPs.

 

MAC hashing gives the most entropy and therefore the best likelihood of different links being used per stream, as long as the MACs are different.

IP hashing can struggle if everything is going to the same IP; there is no entropy in that.

Re:Poor LACP Load Balancing Performance Using SRC IP+DST IP Hash Algorithm
Thursday - last edited Thursday

  @GRL 

 

Here's my understanding of LACP load balancing. Please point out anything you think is wrong or incomplete.

  • An ethernet frame enters the switch.
  • The switch uses metadata in the frame to calculate a hash. Said metadata might be any combination of the source/destination MAC address, the source/destination IP address, and the source/destination port number. The switch should produce a unique hash for any variation in the metadata used to calculate the hash.
  • The switch uses the calculated hash to identify a data flow and ensures that data in the same flow will always land on the same member interface.
  • The more unique flows the switch is able to identify, the more likely it is to spread the traffic out across the member interfaces. 2 unique flows probably won't balance very well, but total LAG utilization should increase as more flows are identified (e.g. 50 flows would get balanced better than 10).
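For future readers, the steps above can be sketched in a few lines of Python. This is only an illustration of the general idea: `pick_member` and `members_used` are made-up names, the addresses are invented, and CRC32 followed by a modulo is a stand-in hash, not what TP-Link or any other vendor actually implements.

```python
import zlib

def pick_member(flow_key: tuple, num_links: int) -> int:
    """Map a flow key to a LAG member index (stand-in hash, not a vendor algorithm)."""
    digest = zlib.crc32(repr(flow_key).encode())
    return digest % num_links

def members_used(flow_keys, num_links: int) -> set:
    """Return the set of member indexes that the given flows land on."""
    return {pick_member(key, num_links) for key in flow_keys}

# Hypothetical flows: 50 different sources talking to one destination.
flows = [("192.168.1.%d" % i, "192.168.1.200") for i in range(1, 51)]

print(len(members_used(flows[:2], 4)), "member(s) used by 2 flows")
print(len(members_used(flows, 4)), "member(s) used by 50 flows")
```

The same flow key always returns the same member, which is what keeps a flow's frames in order. How evenly the flows spread depends entirely on how well the hash mixes its inputs.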

 

With that understanding in mind, let's say I have 3 devices on the network.

  • Device1 IP Address: 192.168.1.1
  • Device2 IP Address: 192.168.1.2
  • Device3 IP Address: 192.168.1.3

 

Device1 and Device2 are both talking to Device3:

192.168.1.1 --> 192.168.1.3

192.168.1.2 --> 192.168.1.3

 

If I create an LACP LAG between the switch and Device3 and configure the switch to use DST IP hashing, the switch should identify exactly 1 flow because the destination IP address is the same for both conversations (Device1 and Device2 are both sending data to Device3).

 

However, if I configure the switch to use SRC IP+DST IP hashing, then the switch should identify 2 unique flows because packets from Device1 and Device2 each have different source IP addresses, thus resulting in different hashes for their respective traffic.
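To make that reasoning concrete, here is a small Python sketch that counts the distinct hash inputs each policy would see for the two conversations. The policy names `dst-ip` and `src-dst-ip` and the helper `hash_inputs` are labels made up for this example, not switch CLI keywords.

```python
conversations = [
    ("192.168.1.1", "192.168.1.3"),  # Device1 --> Device3
    ("192.168.1.2", "192.168.1.3"),  # Device2 --> Device3
]

def hash_inputs(conversations, policy):
    """Return the distinct values a given policy would feed into the hash."""
    if policy == "dst-ip":
        return {dst for _src, dst in conversations}
    if policy == "src-dst-ip":
        return {(src, dst) for src, dst in conversations}
    raise ValueError("unknown policy: %s" % policy)

print(len(hash_inputs(conversations, "dst-ip")))      # 1
print(len(hash_inputs(conversations, "src-dst-ip")))  # 2
```

Note that the number of distinct inputs is only an upper bound on the number of links used. Whether two different inputs actually end up on different members depends on how the switch combines and reduces them.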

 

Why then does DST IP hashing outperform SRC IP+DST IP hashing? I know load balancing behavior differs by manufacturer, but that seems very backward.

 

-Newb

Re:Poor LACP Load Balancing Performance Using SRC IP+DST IP Hash Algorithm
Friday

  @NewbAdmin 

 

How do you find the standard SRC+DST MAC hash in terms of stream utilisation? It might work better for you, as in your case it has two inputs, one of which will always be vastly different (the client MAC), so the entropy (difference) is higher, which likely leads to better LAG member utilisation. It is a tricky thing to balance.

Re:Poor LACP Load Balancing Performance Using SRC IP+DST IP Hash Algorithm-Solution
Friday - last edited Yesterday

  @GRL 

 

A lot of the heavier traffic (Ceph, VM migrations, iSCSI, etc) between hosts is associated with Linux VLAN interfaces which have unique IP addresses but not unique MAC addresses (they inherit the MAC address of the LAG), making source/destination MAC hashing a poor choice.

 

In any case, I figured out more precisely what is going on. TP-Link's hashing algorithm just XORs the source and destination IP addresses. I didn't realize this until I ran show etherchannel load-balance. During my initial testing, I used a standard/predictable addressing scheme like this:

Host1 IPs

  • 192.168.21.1
  • 192.168.22.1
  • 192.168.23.1
  • ...etc

Host2 IPs

  • 192.168.21.2
  • 192.168.22.2
  • 192.168.23.2
  • ...etc

 

If you XOR the source and destination addresses of any pair on the same subnet, you end up with 00000000.00000000.00000000.00000011, because the network bits cancel out and only the last 2 host bits are different (.1 is 01 and .2 is 10). Same hash = same member interface = no balancing. I don't know if that's what you were getting at by talking about entropy; I would describe it more as a consequence of the hashing algorithm being dead simple, but I'm sorry if I failed to pick up on the specifics of what you were trying to tell me.
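Here is a quick Python check of that XOR arithmetic, using the addressing scheme above. The helper name `xor_ips` is made up for this example, and the sketch only shows the XOR step; how the switch turns the XOR result into a member interface is not something I know, so it is left out.

```python
import ipaddress

def xor_ips(src: str, dst: str) -> str:
    """XOR two IPv4 addresses and return the result in dotted-quad form."""
    value = int(ipaddress.IPv4Address(src)) ^ int(ipaddress.IPv4Address(dst))
    return str(ipaddress.IPv4Address(value))

# Host1 (.1) talking to Host2 (.2) on each of the test subnets.
pairs = [("192.168.%d.1" % n, "192.168.%d.2" % n) for n in (21, 22, 23)]

xor_results = {xor_ips(src, dst) for src, dst in pairs}
dst_only = {dst for _src, dst in pairs}

print(xor_results)    # {'0.0.0.3'}
print(len(dst_only))  # 3
```

Every pair collapses to the same XOR value, so any deterministic mapping from that value to a member interface has to pick the same link for all of them. The destination addresses on their own are still three distinct values, which at least gives a DST IP hash something different to work with.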

 

I confirmed that making the IP addresses a bit more random significantly increases LAG utilization. Alternatively, there is a way to force MAC address hashing to work better as well. Just for the sake of any future readers, you can edit /etc/network/interfaces on Proxmox and add hwaddress ether <some_random_MAC> to the VLAN interface block. Both methods resulted in approximately the same increase in LAG utilization when using the same number of iperf data streams.
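If you need a random MAC for that line, a minimal Python sketch like the one below will generate one. The function name `random_local_mac` is made up, and setting the "locally administered" bit while clearing the multicast bit is my own assumption about what makes a sensible random address; the only requirement stated above is that the MAC is random.

```python
import random

def random_local_mac(rng=random) -> str:
    """Return a random unicast, locally administered MAC address."""
    # Set bit 1 (locally administered) and clear bit 0 (multicast) of the first octet.
    first = (rng.randrange(256) | 0x02) & 0xFE
    rest = [rng.randrange(256) for _ in range(5)]
    return ":".join("%02x" % octet for octet in [first] + rest)

mac = random_local_mac()
print("hwaddress ether", mac)
```

Paste the printed line into the relevant VLAN interface block, using a different MAC for each VLAN interface.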

 

-Newb

Recommended Solution