Unsolved

February 23rd, 2018 07:00

Did you use NAT (Network Address Translation) between the different servers of the same SRM platform?

I would like to know whether any of you have used NAT between the different servers of the same SRM platform. More precisely, I want to know whether you have used NAT to connect the PBE (primary backend) server to a remote collector.

I had to use this mechanism, and it may be the cause of some issues I am seeing, but I'm not sure.

Thanks and regards,

Mugur

9 Posts

February 26th, 2018 02:00

Hi Mugur.

Yes, you can do this, but you have to move the Load Balancer Connectors (LBCs) off the Collecting VMs and onto a VM within the main infrastructure. This is because the LBC needs the actual address of the PBE/ABE as supplied by the LB Arbiter (LBA), and the NAT will break this connection: the LBC will then try to send data to the wrong target. If you can allocate the extra resources, you can put the LBC on one of the existing FE/PBE/ABE servers, although I prefer to stand up a separate VM just for this purpose. That way, if you have multiple Collecting VMs, you can host multiple matching LBCs. On the Collecting VMs, you then configure each of the CMs to route through to the NAT address of the VM running its LBC, rather than the usual localhost:2020.

Obviously this has implications. The biggest is that you lose the buffer on the LBC for connection issues. Instead, each of the CMs will use its own failover filter. You may have to review the failover filter settings on each CM and check the disk space buffer it can use. This is a manual edit and you need to document this in case changes are reverted during upgrades.


I suggest you engage local Dell EMC Professional Services to help with this, as this is a customisation not covered in the existing documentation and unsuitable for specific detail to be published here.

25 Posts

February 28th, 2018 05:00

Hi Ken,

I think I may not have explained clearly how NAT is used in my case. The PBE/ABE see the remote collector through a NAT address (all the other collectors are in the same subnet as the PBE/ABEs), but the remote collector, and therefore the LBC that sits on it, sees the PBE/ABE with their real IP addresses.

I know that for connectivity from the remote collector (the source) to the PBE (the destination), the following ports, among others, need to be open on the destination: 2000, 2001, 2010, 2020 and 2040. For a connection to an ABE I would need 2100, 2101, 2200, 2201, 2300, 2301, 2400 and 2401.
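As an illustration only, the port lists above could be checked from the remote collector with a small script like the one below. It is a minimal sketch: the function names are made up for this example, and a successful connect only shows that the TCP handshake completes (the same thing a telnet test shows), not that the SRM component behind the port accepts data.

```python
import socket

# Ports quoted in the post above: PBE first, then ABE.
PBE_PORTS = [2000, 2001, 2010, 2020, 2040]
ABE_PORTS = [2100, 2101, 2200, 2201, 2300, 2301, 2400, 2401]


def check_ports(host, ports, timeout=3.0):
    """Return {port: True/False} for a plain TCP connect to each port."""
    results = {}
    for port in ports:
        try:
            # create_connection performs the TCP handshake and nothing more.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Refused, timed out, unreachable, or name not resolvable.
            results[port] = False
    return results


def closed_ports(host, ports, timeout=3.0):
    """Return the sorted list of ports that could not be reached."""
    status = check_ports(host, ports, timeout)
    return sorted(port for port, ok in status.items() if not ok)
```

Run on the remote collector, `closed_ports("PBE_IP_address", PBE_PORTS)` would list any PBE ports that did not accept a connection; `PBE_IP_address` is the same placeholder used in this thread.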

The strangest thing is that the data the remote collector tries to send to the PBE does not arrive, while the data it tries to send to an ordinary ABE arrives without any problem.

The result is this: data about a VMAX collected by the remote collector reaches the ABE, but data about another VMAX, collected by the same remote collector, which the “platform” chose to send to the PBE instead of an ABE, does not reach its destination.

To make things more interesting, there is one more fact. I’m getting an error on the PBE saying that the remote collector cannot connect to it (the PBE) on port 2000, yet a simple telnet test run on the remote collector, such as “telnet PBE_IP_address 2000”, works just fine.

I really don’t see an end to this story.

Thanks and regards,

Mugur

Mugur Stef

Storage Admin. Expert

Storage Services

Performance & Efficiency

mugur.stef@vodafone.com

+40 733 508856

Avrig Office | 3-5 Avrig St, 021571 Bucharest, Romania

Technology-Vodafone Shared Services Center Romania

vodafone.com

The future is exciting.

Ready?

9 Posts

February 28th, 2018 10:00

Hi Mugur.

Usually, only the ABE receives device metrics, with the PBE being reserved for events, topology, compliance, etc., so I'm not sure why one of the VMAXs should have its data sent to the PBE. The key component here is the LB Arbiter (LBA) on the PBE. It alone determines where devices go and shares this with the LBCs on a regular basis. What do the LBA logs look like? Are they clean? You should see messages that it is able to read the databases. Do the LBCs (on the Collecting VMs) show any errors? They should be able to receive the LBA updates to know where to send the data. In a NAT environment the key point is that the database access points will be the names/addresses local to the SRM infra. The LBCs get exactly the same details from the LBA, so if there is NAT between the Collecting VMs and the infra, the LBCs will try to send the data to the address the LBA knows about and not to the NAT address the LBC should be accessing.

Makes sense?

Ken.

25 Posts

March 5th, 2018 00:00

Hi Ken,

Not all of it. ☺

I’ll also try to involve Professional Services for this issue. I also opened an SR with support, but they told me it was a network issue.

I found some logs on the remote collector:

remote_collector:/opt/APG/Collecting/Collector-Manager/Load-Balancer/logs # grep -i PBE_hostname collecting-0-0.log

INFO -- -- BasicMessagesLoggingHandler::channelInactive(): Communication channel is now inactive/closed

com.watch4net.apg.v2.collector.PipeError: java.io.IOException: Cannot write to PBE_hostname:2000

Caused by: java.io.IOException: Cannot write to PBE_hostname:2000

INFO -- -- BasicMessagesLoggingHandler::channelActive(): Communication channel is now active

com.watch4net.apg.v2.collector.PipeError: java.io.IOException: Cannot write to PBE_hostname:2000

Caused by: java.io.IOException: Cannot write to PBE_hostname:2000
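To see at a glance which targets the Load Balancer fails to write to, the log could be summarised with a short script such as the following. This is only a sketch: the regular expression is derived from the “Cannot write to host:port” lines quoted above, and the function name is invented for the example.

```python
import re
from collections import Counter

# Pattern taken from the "Cannot write to PBE_hostname:2000" lines above.
WRITE_ERROR = re.compile(r"Cannot write to (?P<host>[^\s:]+):(?P<port>\d+)")


def count_write_errors(lines):
    """Count 'Cannot write to host:port' failures per (host, port)."""
    counts = Counter()
    for line in lines:
        # Each failure is logged twice (PipeError plus "Caused by:");
        # skip the "Caused by:" repeat so it is counted once.
        if line.lstrip().startswith("Caused by:"):
            continue
        match = WRITE_ERROR.search(line)
        if match:
            counts[(match.group("host"), int(match.group("port")))] += 1
    return counts
```

Passing an open file handle for collecting-0-0.log to `count_write_errors` would give the number of write failures per target. If only the PBE on port 2000 shows up, that would match the behaviour described earlier in the thread.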

The strange thing (but maybe not so strange) is that a test (made on the remote collector) like:

telnet PBE_IP 2000

works just fine so I’m not sure we are talking about a network issue.

Thanks and regards,

Mugur

9 Posts

March 7th, 2018 03:00

Hi Mugur.

Check that your telnet is using the NAT address. Also, if the Collecting VM is using the DNS name, does that resolve to the native address or the NAT address? That might be the root of the problem. In the Load Balancer config on the Collecting VM, it would be better to use the NAT address of the PBE/ABE rather than the DNS name, if the name does not resolve to the NAT address correctly.
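A quick way to check what a name resolves to on a given VM is to ask the local resolver directly. The sketch below uses Python's standard `socket.getaddrinfo`, which normally consults /etc/hosts as well as DNS, depending on the system's resolver configuration; the helper names are made up for this example.

```python
import socket


def resolve_ipv4(name):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives for name."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def resolves_to(name, expected_ip):
    """True if expected_ip is among the addresses the name resolves to."""
    return expected_ip in resolve_ipv4(name)
```

On the Collecting VM, `resolve_ipv4("PBE_hostname")` would show which address the Load Balancer ends up using when it is configured with the name rather than an IP.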

25 Posts

March 7th, 2018 04:00

Hi Ken,

I think I didn’t explain the NAT mechanism we are using very well.

Let’s say I have Collector VM as the source and the PBE as the destination.

We are using NAT to mask the real IP of the Collector VM, so:

- the Collector VM sees the PBE with its real IP;

- the PBE sees the Collector VM with the NAT IP and not with its real IP.

When I launch a telnet command from the Collector VM towards the PBE, I use the real IP of the PBE.

The DNS “mechanism” is implemented by using /etc/hosts and it is set up in the following way:

- /etc/hosts on the remote Collector VM contains the names and the real IPs of the rest of the SRM servers: the PBE/ABE and the other collectors;

- /etc/hosts on the PBE/ABE and the other (local) collectors contains, for the remote collector, its name and its NAT address (not its real address).
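The asymmetric /etc/hosts setup described above can be illustrated with a small parser that shows which address each side maps a given name to. This is a sketch only: the function names are hypothetical, and any host names or addresses fed to it are placeholders, not values from the real environment.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first address seen for a name.
            mapping.setdefault(name, ip)
    return mapping


def compare_views(name, collector_hosts, backend_hosts):
    """Show the address each side's hosts file gives for the same name."""
    return {
        "remote_collector": parse_hosts(collector_hosts).get(name),
        "pbe_abe": parse_hosts(backend_hosts).get(name),
    }
```

Calling `compare_views` with the remote collector's name and the contents of the two hosts files would be expected to return its real address on the collector side and the NAT address on the PBE/ABE side, matching the two bullets above.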

I hope now I explained our NAT better.

Thanks and regards,

Mugur
