We have ESXi hosts running the Nexus 1000v and connecting to Nexus 5548UP switches with vPC, which is the exact topology shown in the FCoE techbook.
Both the Cisco documentation and the FCoE techbook recommend binding the vFC interfaces to port-channel interfaces:
nexus1(config)# int eth 1/3
nexus1(config-if)# channel-group 10
nexus1(config-if)# int port-channel 10
nexus1(config-if)# vpc 10
nexus1(config-if)# int vfc10
nexus1(config-if)# bind interface port-channel 10
nexus1(config-if)# no shutdown
The problem we have is that the port-channel interface we're binding to does not come up until the Nexus 1000v is installed on the ESXi host and LACP is properly configured.
If we instead bind directly to the Ethernet interface (bind interface ethernet 1/3 under vfc10) rather than to the port-channel, ESXi connects to the datastores early in boot, but then loses those connections while it initializes LACP and restores them afterwards. The storage link bounces a few times, which causes errors. We have a case open with Cisco on this, and we have been waiting for more than a month.
I don't think it is good design for the storage connection to depend on the host's networking configuration, and I wonder why Cisco and EMC recommend it. Is anybody else using this topology, and if so, with what type of binding?
Hi Burhan! Sorry to hear you're having a problem with this configuration.
With regard to the LACP configuration, we used standard port-channels set to mode active.
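For reference, here is a minimal sketch of what "mode active" looks like on the switch side. The interface, channel-group, and vPC numbers are borrowed from your example rather than from our reference configurations, so treat them as placeholders:
nexus1(config)# feature lacp
nexus1(config)# int eth 1/3
nexus1(config-if)# channel-group 10 mode active
nexus1(config-if)# int port-channel 10
nexus1(config-if)# vpc 10
With mode active the switch negotiates LACP, so the port-channel will generally not come up until the host side responds with LACP, which is consistent with the behavior you describe.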
Unfortunately, we did not use the 1000v in any of our reference architectures, so I'd rather not speculate about what's causing the issue you're seeing. In any case, I'm very interested to hear what Cisco has to say about it. Let's follow up via email.
With regard to your general observation about network configuration impacting storage connectivity, I agree in principle, but I have to add that it depends on the configuration being used. When we first started testing FCoE, we spent a ton of time validating that network events on non-storage VLANs would not impact traffic on FCoE VLANs. We also attempted to create topologies that would ensure isolation between the two types of traffic. Early on in this process, we realized that in order to provide the same level of isolation and resiliency as a native FC SAN, you would end up creating two physically isolated networks: one for LAN traffic and another for storage, which just so happened to use FCoE instead of native FC. While this worked great from an isolation and resiliency perspective, these configurations diminished the value of convergence. As a result, we ended up qualifying a number of different topologies that provide different levels of isolation. Our thinking was, and still is, that we wanted to give customers the greatest amount of flexibility to choose a level of isolation that makes sense for their environment. As a matter of fact, J Metz (Cisco) and I presented a high-level overview of the tradeoffs at SNW Fall 2011 (see https://www.eiseverywhere.com/file_uploads/9a85a75f1b47b7e5a27b9957c354f5cc_MetzSmith_Designing_HA_i...)
So, with that background in mind, I feel obliged to point out that the techbook doesn't recommend a particular configuration for all cases. It's merely intended to show how to set up a particular configuration if you decide that you want to use it.
So that's the back story. Again, I'll follow up via email so we can figure out why you're seeing this particular issue.