Force10 - Hash Collisions And How To Avoid Them

Summary: How to avoid hash collisions on Force10 switches.

Article Content


Symptoms

Example log entries:

 

May 20 19:12:20: %EX8PB:2 %MACAGT-2-HASH_COLLISION_LOG: Mac:00:02:e8:d6:58:20/Vlan:203 could not be added to L2 CAM on portpipe 2 linecard 2 due to hash collision. Total number of hash collisions: 30211

May 20 19:12:20: %EX8PB:2 %MACAGT-2-HASH_COLLISION_LOG: Mac:00:02:e8:d6:58:20/Vlan:203 could not be added to L2 CAM on portpipe 3 linecard 2 due to hash collision. Total number of hash collisions: 31979
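For monitoring, entries like the two above can be picked out of a syslog stream and broken into fields. The following is a minimal sketch that assumes the message layout is exactly as shown in these examples; the function name `parse_hash_collision` and the field names in the returned dictionary are illustrative choices, not part of any Dell tool.

```python
import re

# Pattern derived only from the two example entries in this article.
HASH_COLLISION_RE = re.compile(
    r"%MACAGT-2-HASH_COLLISION_LOG: "
    r"Mac:(?P<mac>(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})/Vlan:(?P<vlan>\d+) "
    r"could not be added to L2 CAM on portpipe (?P<portpipe>\d+) "
    r"linecard (?P<linecard>\d+) due to hash collision\. "
    r"Total number of hash collisions: (?P<total>\d+)"
)


def parse_hash_collision(line):
    """Return the fields of a hash collision log line, or None if no match."""
    match = HASH_COLLISION_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "mac": fields["mac"],
        "vlan": int(fields["vlan"]),
        "portpipe": int(fields["portpipe"]),
        "linecard": int(fields["linecard"]),
        "total_collisions": int(fields["total"]),
    }
```

Tracking `total_collisions` per portpipe over time gives a rough view of whether the collision rate is growing.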

 

How it works:

 

In the switch CAM, a fixed number of entries is allocated to the "Host table". A portion of this table is reserved for ARP entries (/32 host routes), and the remainder is reserved for all other entries.

For example, if there are 1024 index values, each pointing to an array of 8 memory locations, then each index value can hold eight entries (8192 locations in total). All 8 locations in an array can hold ARP entries, but the total number of ARP entries across all locations cannot exceed the portion dedicated to ARP. The exact values vary between switch models.

When an ARP entry for an IP address is added to the switch’s CAM, the switching chip calculates an index value (0-1023) by hashing the IP address, and the ARP entry is saved to a memory location under that index.

In some cases the hashing algorithm selects an index whose memory locations are all already in use. The entry cannot be stored, and a hash collision is reported.
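The behavior described above can be modeled with a small bucketed table. This is a sketch only: it reuses the example figures of 1024 index values and 8 locations per index, and it substitutes CRC32 for the switching chip's actual hash, which this article does not specify. The names `bucket_index` and `HostTable` are hypothetical.

```python
import ipaddress
import zlib

NUM_INDEXES = 1024    # index values 0-1023, as in the example above
SLOTS_PER_INDEX = 8   # memory locations behind each index value


def bucket_index(ip):
    """Assumed stand-in for the chip's hash: CRC32 of the address bytes."""
    packed = ipaddress.ip_address(ip).packed
    return zlib.crc32(packed) % NUM_INDEXES


class HostTable:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_INDEXES)]
        self.collisions = 0

    def add(self, ip):
        """Return True if the entry is stored, False on a hash collision."""
        bucket = self.buckets[bucket_index(ip)]
        if ip in bucket:
            return True            # already present
        if len(bucket) >= SLOTS_PER_INDEX:
            self.collisions += 1   # all 8 locations for this index are in use
            return False
        bucket.append(ip)
        return True
```

Note that a collision can occur while most of the table is still empty: only the eight locations behind one index need to be full.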

When an IP address encounters a hash collision, its ARP entry is not added to the CAM. Instead, the CPU keeps the entry in its software table. Traffic to that IP address cannot be forwarded in hardware, so it is sent to the CPU and forwarded in software (soft forwarded). This adds load on the CPU and tends to introduce latency on the affected path. In some cases the volume of soft-forwarded traffic exceeds what the CPU can process, which leads to packet loss.

Resolution

Workarounds for hash failures:

 

Upgrade to software that supports dual hashing. Specific platforms support dual hashing from release 9.3 onward, for both L2 and L3 tables. The feature is enabled by default on those platforms when running 9.3. When a hash collision occurs, the switch tries to re-hash and re-order the tables to accommodate the new entry.
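The general idea behind dual hashing can be illustrated as follows. This is a conceptual sketch, not the switch's implementation: the two hash functions (CRC32 and Adler-32), the table dimensions, and the single-step relocation are all assumptions chosen for illustration, and the class name `DualHashTable` is hypothetical.

```python
import ipaddress
import zlib


def hash_a(ip):
    """Assumed first hash: CRC32 of the address bytes."""
    return zlib.crc32(ipaddress.ip_address(ip).packed)


def hash_b(ip):
    """Assumed second hash: Adler-32 of the address bytes."""
    return zlib.adler32(ipaddress.ip_address(ip).packed)


class DualHashTable:
    def __init__(self, first=hash_a, second=hash_b, num_indexes=1024, slots=8):
        self.first, self.second = first, second
        self.slots = slots
        self.buckets = [[] for _ in range(num_indexes)]

    def _candidates(self, key):
        size = len(self.buckets)
        return (self.first(key) % size, self.second(key) % size)

    def add(self, key):
        """Return True if stored, False if both candidate buckets stay full."""
        candidates = self._candidates(key)
        if any(key in self.buckets[i] for i in candidates):
            return True
        # Try the first bucket, then the second ("re-hash").
        for idx in candidates:
            if len(self.buckets[idx]) < self.slots:
                self.buckets[idx].append(key)
                return True
        # Both full: try to move one resident to its other bucket ("re-order").
        for idx in candidates:
            for resident in list(self.buckets[idx]):
                a, b = self._candidates(resident)
                other = b if idx == a else a
                if other != idx and len(self.buckets[other]) < self.slots:
                    self.buckets[idx].remove(resident)
                    self.buckets[other].append(resident)
                    self.buckets[idx].append(key)
                    return True
        return False
```

With two candidate buckets per entry, an insert fails only when both buckets are full and no resident can be moved, which is much less likely than a single bucket filling up.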

Add a routing layer (for core switch hash failures). The best way to overcome this limitation is to use a Top-of-Rack (ToR) design and enable routing between the ToR switches and the core switches. Placing this routing layer between the individual hosts and the core relieves the core of having to learn every host’s ARP entry, which reduces the ARP table size on the core.

Reduce the ARP timeout. The default is 4 hours. Reducing the time that ARP entries are retained frees table space sooner, so new ARP entries can be added more often. It also forces all entries to cycle through faster, which increases ARP traffic on the attached networks.

Redistribute IP addresses in the connected L3 network. A mapping of all possible IP addresses in the important subnets to their corresponding hash values can be created, although it is extremely cumbersome to produce. IP addresses can then be reassigned to avoid hash failures. This is the least effective short-term fix available.
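The mapping exercise described in the last workaround could be scripted along these lines. This is a sketch under stated assumptions: the platform's real hash function is not given in this article, so CRC32 is used as a placeholder, and the table dimensions reuse the 1024 x 8 example. The function names are hypothetical.

```python
import ipaddress
import zlib
from collections import defaultdict

NUM_INDEXES = 1024    # example figure from this article
SLOTS_PER_INDEX = 8   # example figure from this article


def bucket_index(addr):
    """Placeholder hash (assumption): CRC32 of the address bytes."""
    return zlib.crc32(addr.packed) % NUM_INDEXES


def map_subnet(cidr):
    """Map every host address in a subnet to its index value."""
    mapping = defaultdict(list)
    for addr in ipaddress.ip_network(cidr).hosts():
        mapping[bucket_index(addr)].append(addr)
    return mapping


def oversubscribed(in_use):
    """Return the index values that hold more in-use addresses than slots."""
    load = defaultdict(list)
    for ip in in_use:
        addr = ipaddress.ip_address(ip)
        load[bucket_index(addr)].append(addr)
    return {idx: addrs for idx, addrs in load.items()
            if len(addrs) > SLOTS_PER_INDEX}
```

Addresses listed by `oversubscribed` are candidates for reassignment to addresses that `map_subnet` places under lightly loaded index values. The results are only meaningful if the placeholder hash is replaced with the hash the platform actually uses.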

 

 

Article Properties


Affected Product

C7008/C300 Aggregation Core chassis Switch, PowerSwitch S4810P, Force10 S60-44T

Last Published Date

21 Feb 2021

Version

3

Article Type

Solution