jp1110
67 Posts
0
September 18th, 2011 03:00
What are you hoping to accomplish (fix conflicts, standardize your deployments, tune SAN parameters)?
If you're attempting to fine-tune, then start simple and layer your approach (proof of concept). First, I would abstract out how the network is laid out (production, SAN, management) and how it's managed (VLANs, routing, ACLs, etc.). Start from there...
MrVault
37 Posts
0
September 19th, 2011 12:00
We are trying to reduce discards and errors on the interfaces so that the iSCSI traffic is more stable, with fewer drops, etc. We are using MPIO as well, but it's not reliable enough. We have the same setup across many similar systems that are not reporting these errors and discards, so we're trying to figure out which settings make the most sense.
Essentially, for the most reliable, stable connection with the highest performance opportunity, we're hoping Dell has done the due diligence and come up with a recommended configuration for these settings - one that results in the fewest dropped connections and lets the data move the fastest.
We're also trying to find out which NIC is the right one to purchase so we can put the iSCSI storage on 2 NICs and the regular LAN access/management traffic on two others.
jp1110
67 Posts
0
September 19th, 2011 23:00
Sounds like there are two objectives: stabilize the iSCSI connections, and tune the solution for peak performance (COMMs advanced features as well as MPIO function and net performance)?
Let's start by looking to stabilize the connection(s) by understanding what you have. What can you tell me about where these errored states are being reported - switchport interface counters, switch logs, OS netsh interface..., Target stats, etc.? Can you also share a depiction of your setup (either an explanation or a general drawing)?
I'll be asking quite a few more questions, from the Init to the Target, as we progress, but let's start with these first. Bear with me.
MrVault
37 Posts
0
September 21st, 2011 07:00
We have a flat network with gigabit switches interconnected in a hub/spoke topology. They are all in the same VLAN on the same subnet. Connected to the various switches are Windows Server 2008 R2 servers with 2 NICs. Also attached to the various switches are Dell EqualLogic SAN arrays using iSCSI connectivity. In this particular situation, the servers are attached to the same switch as the array and are in the same subnet.
Below are the stats for one of the switchports in question; the other is pretty much the same.
Hardware is GigabitEthernet, address is 000c.db6d.5c8c (bia 000c.db6d.5c8c)
Configured speed auto, actual 1Gbit, configured duplex fdx, actual fdx
Configured mdi mode AUTO, actual MDIX
Member of L2 VLAN ID 1, port is untagged, port state is FORWARDING
STP configured to ON, priority is level0, flow control enabled
mirror disabled, monitor disabled
Not member of any active trunks
Not member of any configured trunks
Port name is Server1-NIC1
IP MTU 10222 bytes
300 second input rate: 21361776 bits/sec, 1206 packets/sec, 2.15% utilization
300 second output rate: 45856168 bits/sec, 1684 packets/sec, 4.60% utilization
39485349681 packets input, 66689472467389 bytes, 0 no buffer
Received 241303 broadcasts, 914355 multicasts, 39484194023 unicasts
0 input errors, 0 CRC, 0 frame, 0 ignored
0 runts, 0 giants
41517355665 packets output, 144012302456805 bytes, 0 underruns
Transmitted 96183737 broadcasts, 122737070 multicasts, 41298434858 unicasts
0 output errors, 0 collisions
---------------------------------------------
Here is the netstat output on the server:
C:\>netstat -s
IPv4 Statistics
Packets Received = 1458824526
Received Header Errors = 0
Received Address Errors = 6427
Datagrams Forwarded = 0
Unknown Protocols Received = 0
Received Packets Discarded = 452371
Received Packets Delivered = 3317488819
Output Requests = 1833484949
Routing Discards = 0
Discarded Output Packets = 802
Output Packet No Route = 1
Reassembly Required = 457925
Reassembly Successful = 91585
Reassembly Failures = 0
Datagrams Successfully Fragmented = 0
Datagrams Failing Fragmentation = 0
Fragments Created = 0
IPv6 Statistics
Packets Received = 0
Received Header Errors = 0
Received Address Errors = 0
Datagrams Forwarded = 0
Unknown Protocols Received = 0
Received Packets Discarded = 9114
Received Packets Delivered = 9716
Output Requests = 18832
Routing Discards = 0
Discarded Output Packets = 0
Output Packet No Route = 4
Reassembly Required = 0
Reassembly Successful = 0
Reassembly Failures = 0
Datagrams Successfully Fragmented = 0
Datagrams Failing Fragmentation = 0
Fragments Created = 0
ICMPv4 Statistics
Received Sent
Messages 437364 439713
Errors 0 0
Destination Unreachable 9119 11481
Time Exceeded 0 0
Parameter Problems 0 0
Source Quenches 0 0
Redirects 0 0
Echo Replies 0 428225
Echos 428245 0
Timestamps 0 0
Timestamp Replies 0 0
Address Masks 0 0
Address Mask Replies 0 0
Router Solicitations 0 0
Router Advertisements 0 0
ICMPv6 Statistics
Received Sent
Messages 9114 9114
Errors 0 0
Destination Unreachable 9114 9114
Packet Too Big 0 0
Time Exceeded 0 0
Parameter Problems 0 0
Echos 0 0
Echo Replies 0 0
MLD Queries 0 0
MLD Reports 0 0
MLD Dones 0 0
Router Solicitations 0 0
Router Advertisements 0 0
Neighbor Solicitations 0 0
Neighbor Advertisements 0 0
Redirects 0 0
Router Renumberings 0 0
TCP Statistics for IPv4
Active Opens = 13723304
Passive Opens = 14495802
Failed Connection Attempts = 258
Reset Connections = 257731
Current Connections = 1690
Segments Received = 3291832677
Segments Sent = 1812636097
Segments Retransmitted = 17947801
TCP Statistics for IPv6
Active Opens = 13
Passive Opens = 9
Failed Connection Attempts = 4
Reset Connections = 14
Current Connections = 0
Segments Received = 602
Segments Sent = 591
Segments Retransmitted = 11
UDP Statistics for IPv4
Datagrams Received = 24746073
No Ports = 452332
Receive Errors = 39
Datagrams Sent = 2461345
UDP Statistics for IPv6
Datagrams Received = 0
No Ports = 9114
Receive Errors = 0
Datagrams Sent = 9114
AND more:
C:\>netstat -e
Interface Statistics
Received Sent
Bytes 4148407736 2097489827
Unicast packets 1930810659 2414008888
Non-unicast packets 92435683 110666
Discards 30186 30186
Errors 0 0
Unknown protocols 0
That's all I can get at this point.
jp1110
67 Posts
0
September 22nd, 2011 20:00
Well, there is more than just a discard problem here (resets, fragmentation, retransmits). Can you provide the switch stats for these ports, and if possible the switch logs for errored events (purge any company IP data before posting)? I'm wondering if you're seeing these issues as a function of resource limitations on the switch (rcv/tx buffers)... I assume the interfaces (ports, VLANs, virtual interfaces) are all set to support JUMBO? The EQL default is 9k and your server is at +10K bytes.
Can you tell us what type of switch this is, and whether flow control is enabled on the switch too?
INITIATOR:
General Server Configuration
- IP MTU 10222 bytes
- flow control enabled
General IPv4 Stats
- Received Address Errors = 6427 <-- ?!
- Received Packets Discarded = 452371 <-- Not surprised...
- Discarded Output Packets = 802 <-- timed out...
- Reassembly Required = 457925 <-- Fragmentation of frames?!
- Reassembly Successful = 91585 <-- 5 to 1 correction. Not good.
ICMP (IPv4) (Rcv/Trx)
- Messages 437364 439713
- Destination Unreachable 9119 11481 <-- ?!
ICMP (IPv6) (Rcv/Trx)
- Messages 9114 9114
- Destination Unreachable 9114 9114
TCP Statistics for IPv4
- Failed Connection Attempts = 258
- Reset Connections = 257731 <-- !
- Segments Retransmitted = 17947801 <-- wow!
seen enough on this end...
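For anyone following along, the ratios being flagged above fall straight out of the `netstat -s` output posted earlier; a quick sketch (all numbers copied verbatim from that output):

```python
# Ratios behind the red flags above, using the IPv4/TCP numbers
# from the server's `netstat -s` output (values copied verbatim).

reassembly_required = 457_925       # fragmented datagrams needing reassembly
reassembly_success  = 91_585        # only ~1 in 5 made it back together
segments_sent       = 1_812_636_097
segments_retrans    = 17_947_801

reassembly_rate = reassembly_success / reassembly_required
retrans_rate    = segments_retrans / segments_sent

print(f"reassembly success: {reassembly_rate:.1%}")  # ~20% -> the "5 to 1" above
print(f"TCP retransmit rate: {retrans_rate:.2%}")    # ~1%, very high for a SAN
```

Nothing fancy, but it makes jp1110's point concrete: four out of five fragmented datagrams never reassemble, and roughly one in a hundred TCP segments is being retransmitted.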
MrVault
37 Posts
0
October 1st, 2011 07:00
Hi. Had serious issues at work this week with SAN arrays dying, volumes dying, etc. You wouldn't believe it. Anyway, the MTU size on all our server NIC cards is set to 9000. The switches have an MTU value of 10218 or something close to that. The reassembly numbers are disconcerting; there must be something wrong here. Is it possible some of that data is due to traffic that is on the NIC but not iSCSI traffic - like RDP, SQL connections, etc.? We want to segregate them, but right now that's not an option for us, unfortunately.
jp1110
67 Posts
0
October 2nd, 2011 04:00
Sorry to hear that. It does sound like fragmentation is the problem. Look to my previous post and see if you can provide some of that info. To your question about the traffic not being exclusively iSCSI: yes.
JP
MrVault
37 Posts
0
October 6th, 2011 07:00
Thanks. I don't know if flow control is enabled on every switch. I had only ever read that it should be on the host side. Should it be on every switch, and are there any concerns with that?
Also, I don't understand how smaller traffic (MTU 1500) could be the reason for the fragmentation numbers. It would be wasted space, but 9000 or 10218 is enough to cover 1500 for regular traffic, right?
jp1110
67 Posts
0
October 6th, 2011 23:00
Flow control should only be deployed at edge/access switches toward hosts, for congestion management - never at the core. As for a smaller frame (1518 bytes) with the switch jumbo-frame setting (9k), that should not be a problem.
Here are some things I have seen with interoperating switches (Cisco, Juniper, Brocade, Nortel, Foundry, & PowerConnect) while using jumbo frames...
- COMMs (server NIC) driver. This might be a point to look at. Set your NIC to MTU 1500, monitor traffic for a spell, then check your counters. If they stop climbing, that's one datapoint closer to helping you understand.
- Switch-to-switch incompatibility. I've seen some switches have resource problems where, when set to a higher supported frame size and negotiated down to a lower one, the consequence is resources staying allocated on the switch regardless of nonusage (i.e. rcv/tx buffers). This might be an area of interest for you. Try dialing your 10222 down to 9000. Check your counters.
- Switch MAC table corruption causing frames to be lost. We found this to be true on some switches when utilization was high. You might want to monitor your processor usage. Again, just another datapoint to help you understand.
- Switch-to-switch PHY design. I've actually run across an issue recently where a switch and a controller Ethernet interface had a preamble problem (interframe gap) resulting in packet losses, but here was the catch - it was more prevalent when jumbo frames were used.
- Not all switch metrics are the same - they're subject to interpretation, meaning a 9000-byte frame size on one switch is not necessarily a 9000-byte frame size on another. Some vendors account only for payload and don't count the header and FCS... Dial your switches down to the 9000 metric and set your init and Target MTU to under that - 8000-byte frames. Check counters... another datapoint...
This is really going to bite if it's just an outdated driver problem... Check to see if you have out-of-sequence frames; that would point to the switch fabric.
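The "not all switch metrics are the same" point is easy to make concrete with a little arithmetic. A sketch of what a 9000-byte IP MTU actually costs on the wire (the exact overhead a vendor counts varies - header, FCS, and 802.1Q tag may or may not be included in their "MTU" number):

```python
# On-the-wire cost of a jumbo IP MTU. Some switch vendors' "MTU" counts
# only the IP payload; others include the L2 header, FCS, and/or VLAN tag.
ETH_HEADER = 14   # dst MAC + src MAC + EtherType
FCS        = 4    # frame check sequence
VLAN_TAG   = 4    # 802.1Q tag, if the port is tagged

def frame_size(ip_mtu: int, tagged: bool = False) -> int:
    """Full Ethernet frame size carrying a packet of the given IP MTU."""
    return ip_mtu + ETH_HEADER + FCS + (VLAN_TAG if tagged else 0)

print(frame_size(9000))               # 9018 - untagged jumbo frame
print(frame_size(9000, tagged=True))  # 9022 - tagged jumbo frame
print(frame_size(1500))               # 1518 - the standard frame mentioned above
```

So a switch whose jumbo setting counts the full frame needs at least 9018-9022 bytes to carry a NIC set to MTU 9000 - one reason the conservative advice above is to drop the endpoints to ~8000 and watch the counters.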
MrVault
37 Posts
0
October 7th, 2011 08:00
So should I type "no flow-control" on our core (hub) switches? Only the firewall and other access switches are plugged into the core switches right now.
I've updated the NIC drivers, with no improvement in numbers.
I don't know if Brocade and Foundry switches can control a specific MTU size. I think if you turn on jumbo frames, it sets the MTU to whatever each supports.
If processor usage is low, is MAC table corruption still possible? How do I verify/fix it?
jp1110
67 Posts
0
October 7th, 2011 22:00
Without understanding how your topology is really laid out, I can only tell you that flow control is typically applied at the edge, to help manage traffic bursts and such toward your attached devices/appliances.
Can you show a topology/depiction of what you have (a 10,000-foot view) without giving up any of your trade secrets?
jp1110
67 Posts
0
October 7th, 2011 22:00
Also, did you attempt to reduce the MTU of your server and check the counters? That's relatively easy and quick to do...
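The "reduce the MTU and check the counters" loop can also be approached from the other end: probe for the largest payload that crosses the path unfragmented. A sketch of the binary search involved - the probe here is a stub, and swapping in a real do-not-fragment ping (e.g. `ping -f -l <size> <target>` on Windows) is my assumption about the environment, not something from this thread:

```python
# Binary-search the largest ICMP payload that fits the path without
# fragmentation. `probe` is stubbed out below; a real probe would send
# a ping with the DF bit set and report success/failure.

def probe_factory(path_mtu: int):
    """Simulated path: a payload fits if payload + 28 (IP+ICMP headers) <= MTU."""
    def probe(payload: int) -> bool:
        return payload + 28 <= path_mtu
    return probe

def find_max_payload(probe, lo: int = 0, hi: int = 9000) -> int:
    """Largest payload for which probe() succeeds, via binary search."""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if probe(mid):
            lo = mid   # fits; try bigger
        else:
            hi = mid - 1
    return lo

probe = probe_factory(path_mtu=9000)
print(find_max_payload(probe))  # 8972 -> path MTU = 8972 + 28 = 9000
```

If the largest clean payload comes back well under what the NIC's 9000-byte MTU implies (8972), something in the path is not honoring the jumbo setting.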
MrVault
37 Posts
0
October 11th, 2011 08:00
I'm going to wait a few more days, but there have been zero errors or discards since I installed the latest drivers and management suite from the Broadcom site. I saw in an iSCSI document that Dell recommends getting the driver from Broadcom's site, not from Dell's support site. Very non-intuitive, IMO. Anyway, we'll see.
MrVault
37 Posts
0
October 11th, 2011 08:00
Can you point me to an article that discusses in more detail why not to turn on flow control at the core level?
jp1110
67 Posts
1
October 11th, 2011 20:00
Glad to see the errors resolved after the driver update. Told you it was going to bite! There are many sources for flow control best practices - Google "flow control at link layer"... Keep in mind, flow control is a link-layer mechanism, while the core's job is network-layer; make sense?...