Broadcom settings on NIC

Question

We have Dell poweredge servers with broadcom onboard and PCI cards. The settings below are configurable and we can't find anything that says what to set them to. We're connecting them to an iSCSI network that connects to our Equallogic SANs. We heard different things all the time. Please help and if possible provide a link to documentation. All the documentation we can find just says turn on flow control, enable jumbo frames (MTU >= 9000), and that's it.

Ethernet@WireSpeed
Flow Control
Interrupt Moderation
IPv4 Checksum Offload
IPv4 Large Send Offload
IPv6 Checksum Offload
IPv6 Large Send Offload
Jumbo Packet
Locally Administered Address
Number of RSS Queues
Pause On Exhausted Host Ring
Priority & VLAN
Receive Buffers
Receive Side Scaling
Speed & Duplex
Transmit Buffers
Virtual Machine Queues
VLAN ID
VMQ Lookahead Split

The default settings seems to vary, so I don't trust just letting them be.

Also, we were looking at offloading the iscsi traffic to separate pci NICs and were told it would help us to get the following NIC instead of using a basic gigabit NIC. But now we're being told to disable TOE, so I don't know why we'd buy the more expensive NIC.

Basic one: http://www.dell.com/us/business/p/broadcom-netxtremeii-5708-pcix/pd?refid=nic_broadcom_5708_toe&baynote_bnrank=0&baynote_irrank=0&~ck=baynoteSearch

Other one: http://www.dell.com/us/business/p/broadcom-netxtremeii-5708-pcix-iscsi/pd?refid=nic_broadcom_5708&baynote_bnrank=0&baynote_irrank=0&~ck=baynoteSearch

jp1110 · Answer

What are you hoping to accomplish (fix conflicts, standardize your deployments, tune SAN parameters)?

If your attempting to fine tune then start simple, initially. Beging to layer your approach (proof of concept). First I would, abstract out how the network is layed out (production, SAN, management) and how its managed (vlan, routing, ACL's, etc). Start from there...

MrVault · Answer

we are trying to reduce discards and errors on the interfaces so that the iSCSI traffic is more stable, with less drops, etc. we are using MPIO as well, but it's not reliable enough. we have the same setup across many like systems who are not reporting these errors and discards, so we're trying to figure out what setting makes the most sense.

Essentially for the most reliable, stable connection with the highest performance opportunity, we're hoping Dell has done the due diligence and come up with the recommended configuration for these settings. One that results in the least dropped connections and allows the data to move the fastest.

Also trying to find out which NIC is the right to purchase so we can put the iSCSI storage on 2 NICs and the regular LAN access/management traffic on two others.

jp1110 · Answer

Sounds like there are two objectives; stablize iSCSI connection & tune solution for peak performance (COMMs advanced features as well as MPIO function and net performance)?

Let's start with looking to stablize the connection/s by understanding what you have. What can you tell me about where these errored states are being reported - switchport interface counters, switch logs, OS netsh interface..., Target stats, etc... Can you also share a depiction of your setup (either explain or general drawing)?

I'll be asking quite a few more questions from the Init to the Target as we progress, but lets start with these questions first. Bear with me.

MrVault · Answer

We have a flat network with gigabit switches interconnected in a hub/spoke topology. They are all in the same VLAN on same subnet. Connected to the various switches are Windows Server 2008 R2 servers with 2 NICs. Attached to the various switches are Dell Equallogic SAN arrays using iSCSI connectivity. In this particular situation, the servers are attached to the same switch as the array and are in the same subnet.

Below are the stats for one of the switchports in question. the other is the same pretty much.

Hardware is GigabitEthernet, address is 000c.db6d.5c8c (bia 000c.db6d.5c8c)

Configured speed auto, actual 1Gbit, configured duplex fdx, actual fdx

Configured mdi mode AUTO, actual MDIX

Member of L2 VLAN ID 1, port is untagged, port state is FORWARDING

STP configured to ON, priority is level0, flow control enabled

mirror disabled, monitor disabled

Not member of any active trunks

Not member of any configured trunks

Port name is Server1-NIC1

IP MTU 10222 bytes

300 second input rate: 21361776 bits/sec, 1206 packets/sec, 2.15% utilization

300 second output rate: 45856168 bits/sec, 1684 packets/sec, 4.60% utilization

39485349681 packets input, 66689472467389 bytes, 0 no buffer

Received 241303 broadcasts, 914355 multicasts, 39484194023 unicasts

0 input errors, 0 CRC, 0 frame, 0 ignored

0 runts, 0 giants

41517355665 packets output, 144012302456805 bytes, 0 underruns

Transmitted 96183737 broadcasts, 122737070 multicasts, 41298434858 unicasts

0 output errors, 0 collisions

---------------------------------------------

here are the netstat output on the server:

C:\>netstat -s

IPv4 Statistics

Packets Received = 1458824526

Received Header Errors = 0

Received Address Errors = 6427

Datagrams Forwarded = 0

Unknown Protocols Received = 0

Received Packets Discarded = 452371

Received Packets Delivered = 3317488819

Output Requests = 1833484949

Routing Discards = 0

Discarded Output Packets = 802

Output Packet No Route = 1

Reassembly Required = 457925

Reassembly Successful = 91585

Reassembly Failures = 0

Datagrams Successfully Fragmented = 0

Datagrams Failing Fragmentation = 0

Fragments Created = 0

IPv6 Statistics

Packets Received = 0

Received Header Errors = 0

Received Address Errors = 0

Datagrams Forwarded = 0

Unknown Protocols Received = 0

Received Packets Discarded = 9114

Received Packets Delivered = 9716

Output Requests = 18832

Routing Discards = 0

Discarded Output Packets = 0

Output Packet No Route = 4

Reassembly Required = 0

Reassembly Successful = 0

Reassembly Failures = 0

Datagrams Successfully Fragmented = 0

Datagrams Failing Fragmentation = 0

Fragments Created = 0

ICMPv4 Statistics

Received Sent

Messages 437364 439713

Errors 0 0

Destination Unreachable 9119 11481

Time Exceeded 0 0

Parameter Problems 0 0

Source Quenches 0 0

Redirects 0 0

Echo Replies 0 428225

Echos 428245 0

Timestamps 0 0

Timestamp Replies 0 0

Address Masks 0 0

Address Mask Replies 0 0

Router Solicitations 0 0

Router Advertisements 0 0

ICMPv6 Statistics

Received Sent

Messages 9114 9114

Errors 0 0

Destination Unreachable 9114 9114

Packet Too Big 0 0

Time Exceeded 0 0

Parameter Problems 0 0

Echos 0 0

Echo Replies 0 0

MLD Queries 0 0

MLD Reports 0 0

MLD Dones 0 0

Router Solicitations 0 0

Router Advertisements 0 0

Neighbor Solicitations 0 0

Neighbor Advertisements 0 0

Redirects 0 0

Router Renumberings 0 0

TCP Statistics for IPv4

Active Opens = 13723304

Passive Opens = 14495802

Failed Connection Attempts = 258

Reset Connections = 257731

Current Connections = 1690

Segments Received = 3291832677

Segments Sent = 1812636097

Segments Retransmitted = 17947801

TCP Statistics for IPv6

Active Opens = 13

Passive Opens = 9

Failed Connection Attempts = 4

Reset Connections = 14

Current Connections = 0

Segments Received = 602

Segments Sent = 591

Segments Retransmitted = 11

UDP Statistics for IPv4

Datagrams Received = 24746073

No Ports = 452332

Receive Errors = 39

Datagrams Sent = 2461345

UDP Statistics for IPv6

Datagrams Received = 0

No Ports = 9114

Receive Errors = 0

Datagrams Sent = 9114

AND more:

C:\>netstat -e

Interface Statistics

Received Sent

Bytes 4148407736 2097489827

Unicast packets 1930810659 2414008888

Non-unicast packets 92435683 110666

Discards 30186 30186

Errors 0 0

Unknown protocols 0

That's all I can get at this point.

jp1110 · Answer

Well, there is more than just discard problems (resets, fragmentation, retransmits). Can you provide the switch stats for these ports, and if possible the switch logs for errored events (purge any company IP data before posting). I'm wondering if your seeing these issues as a function of resource limitations on the switch (rcv/tx buffer)... I assume the interfaces (ports, vlans, virtual interfaces) are all set to support JUMBO? EQL default is 9k and your server is +10K bytes.

Can you tell us what type switch this is and if flow control is enabled on the switch too?

INITIATOR:

General Server Configuration

- IP MTU 10222 bytes

- flow control enabled

General IPv4 Stats

- Received Address Errors = 6427 <-- ?!

- Received Packets Discarded = 452371 <-- Not surprised...

- Discarded Output Packets = 802 <-- timed out...

- Reassembly Required = 457925 < -- Fragmentation of frames?!

- Reassembly Successful = 91585 <-- 5 to 1 correction. Not good.

ICMP (IPv4) (Rcv/Trx)

- Messages 437364 439713

- Destination Unreachable 9119 11481 <-- ?!

ICMP (IPv6) (Rcv/Trx)

- Messages 9114 9114

- Destination Unreachable 9119 11481

TCP Statistics for IPv4

- Failed Connection Attempts = 258

- Reset Connections = 257731 <-- !

- Segments Retransmitted = 17947801 <-- wow!

seen enough on this end...

MrVault · Answer

Hi. Had serious issues at work this week with san arrays dieing, volumes, dieing, etc. You wouldn't believe it. Anyway, The MTU size on all our server nic cards is set to 9000. The switches have an MTU value of 10218 or something close to that. The reassembly numbers are disconcerting. There must be something wrong here. Is it possible some of that data is due to the traffic that is on the NIC but not iSCSI traffic? Like RDP, SQL connections, etc? We want to segregate them, but right now that's not an option for us unfortunately.

jp1110 · Answer

Sorry to hear that. It does sound like fragmentation is the problem. Look to my previous post and see if you can provide some of that info. To your question about traffic not completely being iSCSI, yes.

JP

MrVault · Answer

Thanks. I don't know if flow control is enabled on every switch. I had only ever read it should be on the host side. Should it be on every switch and are there any concerns with that?

Also, I don't understand how smaller traffic (MTU:1500) could be the reason for fragmented data numbers. It would be wasted space, but 9000 or 10218 is enough to cover the 1500 for regular traffic, right?

jp1110 · Answer

Flow Control should only be deployed at edge/access switches to hosts for congestion management. Never at the core. As for a smaller frame (MTU 1518) with the switch jumbo frame setting (9k) that should not be a problem.

Here are some things I have seen with interoping switches (Cisco, Juniper, Brocade, Nortel, Foundry, & Powerconnect) while using jumbo frame...

- COMMs (server nic) driver. This might be a point to look at. Set your NIC to MTU 1500 and monitor traffic for a spell and check your counters. If they stop that's one datapoint closer to helping you understanding.

- switch to switch incompatibility. I've seen some switches have resource problems where when set to a higher supported size and negotiate to lower, the consequence is allocated resources on the switch regardless on nonusage (i.e. rcv/tx buffer). This might be an area of interest for you. Try to dial down your 10219 to 9000. Check your counters.

- Switch mac table corruption causing frame to be lost. We found this to be true in some switches when utilization was high. You might want to monitor your proc usage. Again just another datapoint to help you understand.

- Switch to switch phy design. I've actually run across an issue recently where a switch and controller eth interface hand a preamble problem (interframe gap) resulting in packet losses but here was the catcher - more penevolent when Jumbo frames were used.

- not all switches metrics are the same - subject to interpretation. Meaning 9000 byte size frame is not neccessarily 90000 byte size frame. Some vendors adjust Only for payload and don't count the header and fcs... Dial down your switches to the 9000 metric and set your init and Target MTU to under that... 8000 byte size frames. Check counters... another data point...

This is really going to bite if its just an outdated driver problem... check to see if you have out of sequence frames. This would point to the switch fabric.

MrVault · Answer

so should I type 'no flow-control' on our core hub switches? Only the Firewall and other access switches are plugged into the core switches right now. I've updated the NIC drivers, with no improvement in numbers. I don't know if Brocade and Foundry switches can control specific MTU size. I think if you turn on jumbo frames it sets the MTU to whichever each supports. if processor usage is low, is mac table corruption still possible? how do I verify/fix?

jp1110 · Answer

Without understanding how your topology is really layed out, I can only tell you that flowcontrol is typically at the edge to help manage traffic bursts and such to your attached devices/appliances.

Can you show a topology/depiction of what you have (10,000 foot view) without giving up any of your trade secrets...

jp1110 · Answer

Also did you attempt to reduce the MTU of your server and check the counters.  That's relatively easy and quick to do...

MrVault · Answer

I'm going to wait a few more days, but there are zero errors or discards since I installed latest drivers and management suite from Broadcom site. I saw in an iSCSI document put out there that Dell recommends you get Broadcom's site's driver, not from Dell's support site. Very non-intuitive IMO. Anyway, we'll see.

MrVault · Answer

Can you point me to an article that discusses more in detail why not to turn on flow control at the core level?

jp1110 · Answer

Glad to see the errors resolve after the driver update.  Told you it was going to bite!  There are many sources for flow control best practices.  Google 'flow control at link layer'...  Keep in mind, Core is network layer, make sense?...

EqualLogic

Broadcom settings on NIC

Was this post helpful?