[Note] LACP HAL learning note
===

###### tags: `LACP`, `IEEE 802.3ad`, `port channel`, `trunking`, `bundling`, `bonding`, `channeling`, `teaming`, `chip design`, `Realtek`, `Marvell`

[toc]

## Introduction

> Link aggregation refers to various methods of combining (aggregating) multiple network connections in parallel in order to increase throughput beyond what a single connection could sustain, and to provide redundancy in case one of the links should fail. A link aggregation group (LAG) is the collection of physical ports combined together. Other umbrella terms used to describe this method include EtherChannel and Port Bundling.

All ports in a trunk group must be configured with the same speed and the same transmission mode, which must be full-duplex.

**In this note, I will first describe the LACP behavior of Realtek's chip and then compare it with Marvell's. Based on these behaviors, we can implement the HAL, i.e., the Hardware Abstraction Layer.**

## Purpose

- For higher bandwidth
- For link redundancy (automatic failover)
- For high-bandwidth load balancing

## Standardization process

- At a November 1997 meeting, the IEEE 802.3 group agreed to form a study group to create an interoperable link layer standard.
- In 2000, 802.3ad was initially released.
- In 2008, the standard was moved to the 802.1 group (published as IEEE 802.1AX).

## Features in a chip

1. Load balancing
2. Traffic separation

### Load balancing

Traffic is distributed across trunk ports by a hash function. The switch takes fields from the packet or frame header as the hash key.
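The note does not say which hash function the chip actually uses, so the following is only a minimal sketch of the idea. It assumes the selected header fields are XOR-folded into one byte and reduced modulo the number of member ports. The function name `select_member_port` and its parameters are hypothetical and do not come from any vendor SDK.

```python
from functools import reduce


def select_member_port(member_ports, key_fields):
    """Pick the egress member port of a LAG for one frame.

    member_ports: list of physical port IDs in the LAG.
    key_fields:   integers taken from the frame header
                  (e.g. source and destination MAC as ints).
    The XOR-fold below is an illustrative assumption, not the
    hash of any particular chip.
    """
    if not member_ports:
        raise ValueError("LAG has no member ports")
    # Combine all selected header fields into a single key.
    key = reduce(lambda acc, field: acc ^ field, key_fields, 0)
    # Fold the key down to 8 bits so every byte contributes.
    folded = 0
    while key:
        folded ^= key & 0xFF
        key >>= 8
    # Map the folded key onto one of the member ports.
    return member_ports[folded % len(member_ports)]


if __name__ == "__main__":
    lag = [3, 4, 5]
    smac, dmac = 0x001122334455, 0x66778899AABB
    print(select_member_port(lag, [smac, dmac]))
```

Because the result depends only on the header fields, frames of the same flow always leave through the same member port, which keeps them in order.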
In Realtek chips, these keys are configurable through a 7-bit hash mask register field comprising:

| BIT | Acronym | Description             |
| --- | ------- | ----------------------- |
| 0   | SPA     | source physical port    |
| 1   | SMAC    | source MAC address      |
| 2   | DMAC    | destination MAC address |
| 3   | SIP     | source IP               |
| 4   | DIP     | destination IP          |
| 5   | SPORT   | source L4 port          |
| 6   | DPORT   | destination L4 port     |

A hash key can be any combination of these options. For instance, setting the hash mask to 0x3 selects SPA XOR SMAC as the hash key.

### Traffic separation

As the name implies, traffic is separated. Consider a user who would like to separate DLF traffic from other traffic because it is more likely to be malicious. The chip provides a mechanism to divert known multicast traffic and L2 lookup miss traffic to a specific port. It can be configured in one of four modes:

- Disable traffic separation
- Separate known multicast traffic
- Separate L2 lookup miss (DLF, Destination Lookup Failure)
- Separate both

## L2 table updating or flushing when LAG members change

When the member ports of a link aggregation group change, L2 table entries may need to be amended to prevent packet misdelivery or L2 table inconsistency. Thus, we have to update or flush the L2 MAC table in the following two conditions:

1. When a new leader port is added to a group.
2. When a leader port is removed from a group.

The root cause is that the SPA of every L2 entry learned by the member ports of a LAG is the same logical port ID (i.e., the leader port; the smallest logical port is selected as the leader port).

For example, observe the L2 MAC table and assume that in phase 0 we build a LAG (Link Aggregation Group) from ports 3, 4 and 5, with port 3 as the leader port.

Phase 0.

| MAC address | Port |
| ----------- | ---- |
| SMAC_3      | 3    |
| SMAC_4      | 3    |
| SMAC_5      | 3    |

Phase 1. We add port 2 to this LAG. Port 2 replaces port 3 as the leader port.

| MAC address | Port    |
| ----------- | ------- |
| SMAC_2      | 2       |
| SMAC_3      | ~~3~~ 2 |
| SMAC_4      | ~~3~~ 2 |
| SMAC_5      | ~~3~~ 2 |

Phase 2. We remove port 2 from this LAG. Port 3 takes over from port 2 as the leader port.

| MAC address | Port    |
| ----------- | ------- |
| SMAC_2      | 3       |
| SMAC_3      | ~~2~~ 3 |
| SMAC_4      | ~~2~~ 3 |
| SMAC_5      | ~~2~~ 3 |

Instead of updating L2 table entries, a simpler way to fix this misdelivery issue is to remove all entries whose SPA is the original logical ID, together with the entries of the new member ports. For instance, if the member ports of a LAG change from ports 3, 4, 5 to ports 8, 9, 10, we just remove the entries with SPA 3, 8, 9, 10.

## Comparison with MARVELL's chip

The chip designs of RTK and MVL differ. An RTK chip takes the smallest port ID in a LACP group as the leader port. An MVL chip uses another, unique port ID as the leader port. Let's see how it works.

In MARVELL's chip, there are trunking rules that are responsible for:

- MAC table trunk learning
- Loop prevention
- Load balancing

:::info
Assume that we would like to create a LAG X with physical ports 3 and 4, and create another LAG Y with physical ports 1, 2 and 7.
:::

### Trunk learning

First, learning on ports 3 and 4 should appear to occur as if these two MACs were one MAC. Thus, we have to set the PAV (Port Association Vector). The data in the PAV is used as the port vector loaded into the MAC table when learning occurs on a port. The table below shows the PAV values when the LAGs are created.

| Port ID | Trunk ID | Default PAV | Trunked PAV |
| ------- | -------- | ----------- | ----------- |
| 0       |          | 0x001       | 0x001       |
| 1       | Y        | 0x002       | 0x086       |
| 2       | Y        | 0x004       | 0x086       |
| 3       | X        | 0x008       | 0x018       |
| 4       | X        | 0x010       | 0x018       |
| 5       |          | 0x020       | 0x020       |
| 6       |          | 0x040       | 0x040       |
| 7       | Y        | 0x080       | 0x086       |
| 8       |          | 0x100       | 0x100       |

### Loop prevention

Secondly, we need to make sure that any frame received on port 3 is not sent out of port 4, and vice versa.
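This requirement amounts to clearing every fellow trunk member from a port's allowed-egress bit vector. The sketch below assumes an 11-bit vector in which bit n set means "may forward to port n", matching the 0x7FF-wide values used in this note. The names `member_mask` and `trunked_egress_mask` are hypothetical and are not part of any Marvell API.

```python
NUM_PORTS = 11                    # assumed vector width (values up to 0x7FF)
ALL_PORTS = (1 << NUM_PORTS) - 1  # 0x7FF


def member_mask(ports):
    """Return a bit vector with one bit set per port in `ports`."""
    mask = 0
    for port in ports:
        mask |= 1 << port
    return mask


def trunked_egress_mask(port, trunk_members):
    """Return the ports that `port` may forward to.

    A port never forwards to itself, and a trunked port must not
    forward to any other member of its own trunk either.
    """
    blocked = member_mask(trunk_members) | (1 << port)
    return ALL_PORTS & ~blocked


if __name__ == "__main__":
    lag_x = [3, 4]
    lag_y = [1, 2, 7]
    print(hex(trunked_egress_mask(3, lag_x)))  # port 3 in LAG X
    print(hex(trunked_egress_mask(1, lag_y)))  # port 1 in LAG Y
    print(hex(trunked_egress_mask(0, [])))     # untrunked port 0
```

For LAG X (ports 3 and 4) this gives 0x7E7 for both members, and for LAG Y (ports 1, 2 and 7) it gives 0x779. Note that `member_mask([3, 4])` is 0x018, the same vector used as the trunked PAV for LAG X in the trunk learning step.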
Port members in a LAG should look and act like a single MAC or port. Therefore, we have to configure the port-based VLAN table. The table below shows the values in the VLAN table when the LAGs are created.

| Port ID | Trunk ID | Default VLAN Table | Trunked VLAN Table |
| ------- | -------- | ------------------ | ------------------ |
| 0       |          | 0x7FE              | 0x7FE              |
| 1       | Y        | 0x7FD              | 0x779              |
| 2       | Y        | 0x7FB              | 0x779              |
| 3       | X        | 0x7F7              | 0x7E7              |
| 4       | X        | 0x7EF              | 0x7E7              |
| 5       |          | 0x7DF              | 0x7DF              |
| 6       |          | 0x7BF              | 0x7BF              |
| 7       | Y        | 0x77F              | 0x779              |
| 8       |          | 0x6FF              | 0x6FF              |

### Load balancing

Lastly, we have to ensure that any frame egressing this trunk uses only one of the trunk's links.

## Reference

https://en.wikipedia.org/wiki/Link_aggregation