November 8th, 2012 12:00

Ask The Expert: Discussing the challenges of Long Distance Links

The challenges of Long Distance Links with RRR & Jon Klaus

 

Welcome to the EMC Support Community Ask the Expert conversation. This is an opportunity to discuss and learn about the variety of challenges when connecting storage and host systems over long distances.

 

This discussion begins on Monday, November 19th. Get ready by bookmarking this page or signing up for email notifications.

 

Your hosts:

 


Rob Koper has been working in the IT industry since 1994 and for Open Line Consultancy since 2004. He started out with the CLARiiON CX300 and DMX-2 and has worked with all newer arrays since, up to current technologies like the VNX 5700 and the larger DMX-4 and VMAX 20K systems. He is mainly involved in managing and migrating data to storage arrays over large Cisco and Brocade SANs that span multiple sites spread across the Netherlands. Since 2007 he has been an active member of ECN and the Support Forums, and he currently holds Proven Professional certifications such as Implementation Engineer for VNX, CLARiiON (expert) and Symmetrix, as well as Technology Architect for VNX, CLARiiON and Symmetrix.

 


Jon Klaus has been working at Open Line since 2008 as a project consultant on various storage and server virtualization projects. In preparation for these projects, an intensive one-year barrage of courses on CLARiiON and Celerra earned him the EMCTAe and EMCIEe certifications on CLARiiON and EMCIE + EMCTA status on VNX and Celerra.

Currently Jon is contracted by a large multinational and is part of a team responsible for running and maintaining several (EMC) storage and backup systems throughout Europe. Among his day-to-day activities are performance troubleshooting, storage migrations and designing a new architecture for the European storage and backup environment.

 

The event takes place from the 19th to the 30th of November 2012. Rob and Jon will take your questions during this time.

Also watch out for the tweet chat about this very same topic on #EMCATE, taking place on the 26th of November.

2 Intern

 • 

5.7K Posts

November 20th, 2012 03:00

If anyone thinks I made a mistake here, please inform me, so we can set this straight.

14 Posts

November 20th, 2012 04:00

Nice breakdown, RRR

2 Intern

 • 

5.7K Posts

November 20th, 2012 04:00

Thanks. And I almost lost my post!! When I was done I accidentally switched to the advanced editor and nothing happened... aaaargh! Fortunately I had copied the HTML text to Notepad only a minute before, so I was able to restore my post. It took me an hour and a half or so to type, so I was glad it wasn't lost in the end.

So what about this 1.33 factor for a 4Gb link? Do you think it's correct? I can't find any errors in my calculations.

247 Posts

November 20th, 2012 07:00

And I was afraid my post was long... yikes!

I originally calculated all buffer credits back from 2 Gbit/s, which is/was 1 credit per km. 4 Gbit/s would need 2 credits per km, etc.

Don't forget that too many credits can't harm performance (assuming you're not running short on other links), and that you can always monitor credit usage on the switch. Cisco Device Manager has a counter that shows the slack you've got:

[Screenshot: Cisco Device Manager showing the current B2B credit counters for a port]

In this case it's an ISL that's maybe a couple of hundred meters long, so it could cope with 1 or 2 credits depending on the actual speed and distance. The port is nevertheless assigned 32 credits, of which 32 are still available (see the bottom two numbers, CurrRxBbCredits and CurrTxBbCredits).
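
For anyone who wants to play with the numbers, here is a minimal sketch of the rule of thumb above (1 credit per km at 2 Gbit/s, scaling linearly with link speed). The headroom value is just an illustration, not a vendor recommendation:

import math

# Rule of thumb from this thread: ~1 buffer credit per km at 2 Gbit/s,
# scaling linearly with link speed (2 per km at 4 Gbit/s, and so on).
# The headroom of 2 credits is an arbitrary illustration.
def bb_credits_needed(distance_km: float, speed_gbit: float, headroom: int = 2) -> int:
    credits_per_km = speed_gbit / 2.0      # 2 Gbit/s -> 1 credit per km
    return math.ceil(distance_km * credits_per_km) + headroom

print(bb_credits_needed(50, 4))            # 50 km ISL at 4 Gbit/s -> 102 credits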

2 Intern

 • 

5.7K Posts

November 22nd, 2012 01:00

Tonight (Europe time) we had planned to do the TweetChat, but yesterday Sean Thulin mentioned to us over Twitter that Thursday, November 22nd is Thanksgiving and might not be a good day for it. So after some careful consideration we decided to move the TweetChat to next Monday, November 26th. Same time, same place, new day!

247 Posts

November 23rd, 2012 04:00

Happy Thanksgiving everyone in the US and around the world! I hope the turkey tasted good!

So... A lot of people are reading this topic, but the replies are a bit on the low side. Is there no one else who wants to share their experiences with long distance links? Are long distance links that easy?

Please join the conversation; we won't bite!

2 Intern

 • 

5.7K Posts

November 23rd, 2012 06:00

So far we've talked about light, multi-mode and single-mode fiber, as well as attenuation: the maximum distance is defined by how sensitive the Rx side is.

We also explained why Buffer to Buffer Credits are needed and how you calculate how many of them you need.

The next thing we'd like to discuss is a way to make optimal use of the expensive fiber link you might have, and that way is to use colors. Since lasers are monochromatic anyway, using a sort of prism to bundle the colors coming from different lasers into a single multi-colored beam is the solution!

Techniques like CWDM, DWDM and EWDM were developed to fit as many colors as possible into a single beam.

WDM stands for wavelength-division multiplexing. Multiplexing says it all: it combines multiple sources into one.

The best known WDM product is CWDM: Coarse WDM. The standard for CWDM was set in 2002: wavelengths between 1270 and 1610 nm, with 20 nm spacing between the "colors". The most commonly used CWDM multiplexers (the prisms) provide 8 connections, so 8 sending long wave ports can share the same physical dark fiber. And although each of these "colors" is named after a visible color (like brown, yellow, blue and so on), the actual wavelengths are far beyond the visibility of the human eye.

The wavelengths between 1270 and 1470 nm are considered unusable because of increased attenuation, so only 1470 to 1610 nm is supported.


The "colors" used for CWDM are:

  • 1470 nm (gray)
  • 1490 nm (violet)
  • 1510 nm (blue)
  • 1530 nm (green)
  • 1550 nm (yellow)
  • 1570 nm (orange)
  • 1590 nm (red)
  • 1610 nm (brown)

These color names are what Cisco calls the different wavelengths. By putting colored tags on each SFP, the human eye can easily identify each wavelength.

Needless to say, to connect a network on site A using for example the 1530 nm wavelength, the same 1530 nm wavelength is needed on site B.

CWDM is unaware of the protocol used to transport data across it, so you can easily combine TCP/IP, Fibre Channel and even analogue TV signals, as long as each signal uses a different wavelength.
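
For reference, here is a minimal sketch of the channel plan described above (8 channels from 1470 to 1610 nm on a 20 nm spacing, with the Cisco color names); the dictionary is purely illustrative:

# The CWDM channel plan described above: 8 channels, 20 nm apart, 1470-1610 nm.
CISCO_COLORS = ["gray", "violet", "blue", "green", "yellow", "orange", "red", "brown"]

channels = {1470 + 20 * i: color for i, color in enumerate(CISCO_COLORS)}

for wavelength_nm, color in channels.items():
    print(f"{wavelength_nm} nm -> {color}")

# Both ends of the dark fiber must use the same wavelength: a 1530 nm (green)
# SFP on site A only talks to a 1530 nm SFP on site B.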

247 Posts

November 23rd, 2012 08:00

So we know how to get the signal across long distances and we know how to efficiently load the link using buffer credits. If we need more bandwidth than one link can deliver, we can send different wavelengths over one link using CWDM and DWDM to make things even more efficient. But there's one thing we can't influence (much): latency.

When we send information between two sites we use light. The speed of light is 299,792,458 m/s in a vacuum; in fiber it's a bit lower, somewhere around 200,000,000 m/s. Put differently, that's 200,000 km/s or 200 km/ms. This means that if we want to cover 200 km of distance, we'll have 1 ms of latency between putting light on the fiber and the other end seeing it.

To perform reliable communication, we can't just signal uni-directionally and leave it at that: we want to know whether the data arrived at the other end intact. This means we need to look at the bi-directional latency, also called the round-trip time. Our 200 km link is now costing us 2 ms of latency. Not too much, right? Well... hello SCSI!

FC data transport uses SCSI commands. If we assume I am the sending storage array on the source end and RRR is the receiving array on the other end, our (SCSI) conversation would be something along the lines of:

  1. RTT direction1: Hey RRR, I want to send you some data.
  2. RTT direction2: Sure Jon, go ahead.
  3. RTT direction1: Okay, here is the data.
  4. RTT direction2: Thanks Jon, got it in good shape.

Or, since a picture says more than a thousand words, a picture from the Brocade whitepapers:

[Diagram from a Brocade whitepaper: the round trips involved in a SCSI write]

This means that instead of 1 ms of latency, our 200 km link is actually costing us 4 ms. And that's only for the link itself.

If we send 32 KB of data, we also need to put that data on the line. Ignoring FC frame overhead, at 1 Gbit/s a 32 KB I/O takes 0.244 ms to load. You can compare this to a truck entering the highway: once the front fender crosses the starting line, the back of the truck is still not "loaded" onto the highway.

If we also have protocol converters between source and destination (because, for example, you switch from FC to FCIP and at the other end back from FCIP to FC), you will need to add another 0.5 ms for each conversion. With two conversions per pass and four passes over the link, you'll end up with 4 ms of protocol conversion time.

So we've got 4 ms of delay due to the speed of light, 4 ms due to protocol conversions or routers, and 0.244 ms of latency due to the fact that we can only load the data onto the line at 1 Gbit/s. Add it all up and at 200 km you have an additional latency of 8.244 ms.

Considering that an unmirrored write I/O hitting write cache only costs us something in the neighborhood of 0.5 ms, this is a lot. Switch on mirroring over that secondary link and your write latency is now 0.5 + 8.244 + 0.5 ms = 9.244 ms. If the secondary storage system is also running out of write cache and needs to force flush, you're in a world of pain...
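
To make the arithmetic easy to replay, here's a back-of-the-envelope version of the sums above in a few lines of Python; all numbers simply mirror the example in this post (200 km, four passes per SCSI write, 0.5 ms per FC/FCIP conversion, a 32 KB I/O at 1 Gbit/s and ~0.5 ms per cached write):

C_FIBER_KM_PER_MS = 200                      # ~speed of light in fiber

distance_km   = 200
passes        = 4                            # two round trips per SCSI write
propagation   = passes * distance_km / C_FIBER_KM_PER_MS   # 4 ms

conversions   = passes * 2 * 0.5             # FC->FCIP and FCIP->FC on every pass: 4 ms

io_bits       = 32 * 1024 * 8                # 32 KB I/O
serialization = io_bits / 1024**3 * 1000     # ~0.244 ms at 1 Gbit/s (binary Gbit)

link_penalty  = propagation + conversions + serialization   # ~8.244 ms
local_write   = 0.5                          # ms, write landing in cache
mirrored      = local_write + link_penalty + local_write    # ~9.244 ms

print(f"link penalty  : {link_penalty:.3f} ms")
print(f"mirrored write: {mirrored:.3f} ms")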

Moral of the story: be careful of synchronous replication over long distances!

And have a good weekend...

247 Posts

November 26th, 2012 10:00

And we're off with the tweetchat! Join us on Twitter with hashtag #EMCATE or go to the following website: http://tweetchat.com/room/EMCATE

Hope to see you there for questions, comments or just a casual chat!

2 Intern

 • 

5.7K Posts

November 26th, 2012 12:00

We talked about FCIP, buffer credits, in-order delivery, port groups, shared bandwidth, but also SONET OC-48 to cover almost 1,000 miles. Pretty impressive!

Lots of questions, lots of answers, but still a lot of unanswered ones. Need more time.......

2 Intern

 • 

5.7K Posts

November 27th, 2012 03:00

I fixed it. I totally forgot that the speed of light is lower in fiber than in a vacuum. It's about 1.5 times slower, so the speed of light in fiber is approximately 200,000 km/s.

This means more frames will live on the link, so more buffers are needed.
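
To show where those extra buffers come from, here's a rough frames-in-flight sketch. It assumes a full-size FC frame of roughly 2148 bytes and 8b/10b encoding (10 bits per byte on the wire), with 2 Gbit/s FC running at a 2.125 Gbaud line rate; those assumptions aren't from this thread, so treat the result as an approximation:

C_FIBER_KM_S = 200_000                       # ~speed of light in fiber, as above

def frames_in_flight(distance_km: float, line_rate_gbaud: float,
                     frame_bytes: int = 2148) -> float:
    # A credit only comes back once the frame arrives and the acknowledgement
    # (R_RDY) travels back, so the credits have to cover the full round trip.
    round_trip_s = 2 * distance_km / C_FIBER_KM_S
    frame_time_s = frame_bytes * 10 / (line_rate_gbaud * 1e9)   # 10 bits/byte on the wire
    return round_trip_s / frame_time_s

# 100 km at 2 Gbit/s FC (2.125 Gbaud) -> roughly 99 frames in flight,
# i.e. about 1 buffer credit per km, matching the rule of thumb earlier in the thread.
print(frames_in_flight(100, 2.125))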

11 Posts

November 27th, 2012 04:00

Hi Jon,

Can the counter actually drop to 0, or is 1 the minimum?

Because there’s always at least 1 BBC for a port.

247 Posts

November 27th, 2012 05:00

Hi Merlijn,

Yes, the counter can actually drop to 0. At that point, transmission of new frames will halt until at least 1 ACK is back and the counter is incremented again.

If you run the following CLI command against an interface, for example fc1/1:

sh int fc1/1 counters details | grep credit

You will see something like:

0 waits due to lack of transmit credits
14576233333 transitions of tx BB credit out of zero state
14 transitions of rx BB credit to zero state

I had to google a bit to make sense of it... apparently "transitions of tx BB credit out of zero state" does not indicate a problem. I have reset the counters and it keeps incrementing; the other two values remain 0. This is a 1 km link which has 32 credits, so it hardly ever dips below 31.
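
If you want to keep an eye on the one counter that does matter, something like this hypothetical little parser over the output quoted above would do; the sample text and the zero-means-healthy interpretation come straight from this thread, the helper itself is just an illustration:

import re

# Output of "sh int fc1/1 counters details | grep credit", as quoted above.
SAMPLE = """\
0 waits due to lack of transmit credits
14576233333 transitions of tx BB credit out of zero state
14 transitions of rx BB credit to zero state
"""

def credit_waits(output: str) -> int:
    # The cumulative "waits due to lack of transmit credits" counter is the one
    # that signals the port actually stalled for lack of credits.
    for line in output.splitlines():
        match = re.match(r"\s*(\d+) waits due to lack of transmit credits", line)
        if match:
            return int(match.group(1))
    raise ValueError("credit wait counter not found")

print(credit_waits(SAMPLE))   # 0 -> no credit starvation since the last counter reset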

247 Posts

November 27th, 2012 07:00

It's cumulative, so you don't have to F5 it like a maniac to spot the buffer underrun.

2 Intern

 • 

5.7K Posts

November 27th, 2012 07:00

0 waits due to lack of transmit credits
14576233333 transitions of tx BB credit out of zero state
14 transitions of rx BB credit to zero state

Is this "waits due to lack of transmit credits" cumulative or simply a counter for a certain point in time to present the value at THAT moment? I bet that if it's cumulative, you can easily track buffer shortages and act upon that.
