Card down notification
I have a "Card Down" notification from InCharge AM-PM. When you go into Containment, the Cards tab reads "NOTPRESENT".
However, the cards are present and fully functional. I have a couple of questions regarding this....
1st, what is causing the alarm, and how can I stop it from recurring?
2nd, how can I delete the alarm without disabling future alarms in case there is a real issue?
Thanks in advance...
network12
12 Posts
0
October 2nd, 2007 05:00
Dinand1
89 Posts
0
October 2nd, 2007 23:00
The following root-cause problem is diagnosed for Card:
Down: Indicates that a card has failed. A card failure causes all ports and interfaces in
the card as well as any system functions associated with the card to fail. For example,
if a Router Switch Module (RSM) is associated with the card, the routing functions
provided by the RSM will fail. The events used to diagnose Card Down include:
◆ OperationallyDown for the card
◆ Card Down for any subcards
◆ SwitchOver for any supervisor cards
◆ Network adapter Down for any ports or interfaces realized by the card
◆ System Down for any systems packaged by the card
Well, you had better check your switch, your community strings, and whether the device is certified. It is possible that the chassis is not reporting the status properly. Do you have more switches of the same type? Did you try deleting the device and rediscovering it?
You could also unmanage the card if you don't want to see the alarm. This will remove the alarm from SAM, but be careful with the objects related to the card, because they might become unmanaged too.
cheers
Fernando
network12
12 Posts
0
October 3rd, 2007 03:00
I understand your diagnosis, but the cards themselves have not failed. The alarms I am seeing are false. The device and cards are functional.
I am looking for an answer to what might have caused the false alarm.
Thanks!
network12
12 Posts
0
October 9th, 2007 05:00
What made this difficult to diagnose is that there was nothing wrong with the router or the cards. All were functioning as they were supposed to. We were just getting false alarms.
Unmanaging, rediscovering, remanaging, etc. never worked. I finally contacted EMC about a "syntax string" that would remove the false alarm yet still allow the device to send a legitimate alarm if needed.
Thanks again....
Dinand1
89 Posts
0
October 9th, 2007 05:00
The 'not present' status also happens when the card disappears from the device. I had the same thing with interfaces like loopbacks. Did you try to remove and rediscover the device?
Did you check the latest certification list from EMC?
Is your device certified? What is the OID of your device?
cheers
F
Dinand1
89 Posts
0
October 9th, 2007 05:00
Would you mind posting the problem and the resolution?
It will be quite helpful in case this happens again and someone needs a possible or definitive solution. Any hints or solutions for similar cases sometimes help.
Many thanks in advance
Fernando
network12
12 Posts
0
October 9th, 2007 06:00
We run a Unix platform. The following were the steps taken to safely remove the alarm.
1. Log in as superuser: sudo -s
Alarms have to be removed manually from the Smarts bin directory, where the executables are located.
2. cd /opt/InCharge6/SAM/smarts/bin
The command must include the class and the event names. The syntax is:
3. ./sm_ems --server=INCHARGE-SA clear ( class ) ( event ) ( name ) ( event name ) ( source )
Example:
./sm_ems --server=INCHARGE-SA clear Interface Down IF-s00177.phoenixville.pa.chs.net/3 Down INCHARGE-AM-PM
Additionally, when you double-click the alarm and the details window opens, make sure your entry reads exactly like the alarm itself. If there are caps, use caps, etc.
Anyway, I hope this helps....
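Since the command has to match the alarm text exactly, it can help to print it for review before running it. The following is only a sketch: the function name build_clear_cmd is hypothetical and not part of Smarts, and the argument order simply copies the example in this post rather than any EMC reference. It prints the command and does not execute sm_ems.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Smarts): prints an sm_ems clear
# command for review. The argument order mirrors the example in this post:
#   clear ( class ) ( event ) ( name ) ( event name ) ( source )
build_clear_cmd() {
    if [ "$#" -ne 6 ]; then
        echo "usage: build_clear_cmd SERVER CLASS EVENT NAME EVENTNAME SOURCE" >&2
        return 1
    fi
    # Arguments are passed as data, never as part of the format string.
    printf './sm_ems --server=%s clear %s %s %s %s %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}
```

For example, build_clear_cmd INCHARGE-SA Interface Down IF-s00177.phoenixville.pa.chs.net/3 Down INCHARGE-AM-PM prints the example line above, which can then be compared character by character with the alarm details before it is pasted into the shell.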
Arpita2
3 Posts
0
June 4th, 2010 09:00
Hi All,
That's really helpful information.
I am also seeing many such false alarms reported by the users I support.
But there is still a question: what made this false alarm appear on the Notification Console even though the card was not down?
Even after walking the card's operational status, I get Card Down as the result, but the user repeatedly reports it as a false alarm.
Is this a certification issue, with the wrong MIB being polled?
Thanks
Arpita
DSW-msharp
1 Message
0
January 7th, 2011 07:00
I have a similar issue with a Cisco 2811 chassis. Unfortunately my SMARTs install is on a Windows platform, so the above solution does not work, but it points me in another direction. I had opened a case with EMC support, and the issue turned out to be that the Card-Fault = OldCiscoChassis instrumentation set in \IP\smarts\local\conf\discovery\oid2type_Cisco.conf is reporting this card as down or unknown (I forget which now). In any event, OldCiscoChassis has been deprecated by Cisco, yet we were unable to find any Cisco documentation for a new OID type to use. I have not done so yet, but I will be opening a TAC case with Cisco. I still fear that they will not have any solution available and that I will have to modify SMARTs to deal with this false alarm as above. Does anyone know how to apply the above Unix solution on a Windows platform?
bkuhhirte
52 Posts
0
February 2nd, 2011 12:00
The OldCiscoChassis MIB has historically been a sore spot for this kind of problem. It would randomly re-index the entries in the table, and in many cases they would either no longer correspond to the right cards, or the status was unreliable or just plain incorrect.
The trouble is that when they deprecated the MIB (back in IOS 11.x and earlier) they said the Cisco Entity FRU Control (CEFC) MIB was the designated replacement. Unfortunately, it still isn't generally available on all hardware - even with the latest IOS releases.
In many cases, customers have simply removed the "Card-Fault" entry in the oid2type.conf file for the device in question. This is equivalent to unmanaging the Card as there is no possible way for the Card to generate a root-cause under those circumstances.
In IP 8.1, we introduced an intermediate option called "Card_Fault_Default". This infers the state of the Card from the local and connected network adapters. It is still short of having correct instrumentation for the Card, but it is arguably better than the problems we experienced with the OldCiscoChassis MIB.
Also, in an upcoming release of IP, we plan to have the Card instrumentation (on Cisco devices) depend less on the SysObjectID and more on the MIBs actually present on the device. As the device moves to support the newer MIBs, we will adjust to take advantage of it.
In passing, I would be a little leery of using the "workaround" in question. It will indicate that the event has ostensibly been cleared from the source IP domain, thus making SAM clear the event, but if SAM is restarted the event will be re-notified, since IP still believes the condition on the card to be valid.
Regards,
Bill
jchoov
1 Message
0
July 19th, 2013 11:00
Hi,
I'm having the same issue with 3 of my routers, which show the following error whenever they are rediscovered by SMARTS. Has anyone found a resolution yet? This seems to be happening only for some routers and not all.
InCharge Server INCHARGE-SA:
NL_NOTIFY Card CARD-NAMR2.CG.COM/2 [] [C2921/C2951 AC Power Supply] Down (100%):
Indicates that a failed card is the root cause.
BenimusIQ
12 Posts
0
July 22nd, 2013 22:00
Just adding some more information around this: we have seen this problem a few times recently with customer upgrades from 8.x to 9.x, and it is generally caused by the device(s) not responding correctly to the CISCO-ENTITY-FRU-CONTROL-MIB (OID 1.3.6.1.4.1.9.9.117), which is now used by default for a lot of Cisco devices. Its support, however, seems to depend on the IOS version.
In some cases we have fixed it by reverting to the OLD-CISCO-CHASSIS-MIB (OID 1.3.6.1.4.1.9.3.6), and in others to the CISCO-STACK-MIB (OID .1.3.6.1.4.1.9.5.1.3.1), depending on the device type.
To do this, find the entry in the oid2type_Cisco.conf file, and change the Card-Fault entry under INSTRUMENTATION from CiscoEntityFRU to be the correct type for your device, either OldCiscoChassis or CiscoStack, e.g.
Card-Fault = OldCiscoChassis
Being good Smarts citizens, you will be using sm_edit to do this so that the original file is left untouched in the conf directory and your modifications are in local/conf so that whoever needs to look at this later can see what you have done.
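Before changing the Card-Fault entry, it may be worth checking which of the three MIB subtrees the device actually answers. The sketch below assumes the Net-SNMP snmpwalk tool is installed and that the device is reachable with SNMP v2c; the function names card_mib_oid and check_card_mibs are hypothetical, and the host and community are whatever you pass in. A subtree that returns data is not proof that the card status in it is reliable, so treat the result as a hint only.

```shell
#!/bin/sh
# Map a Card-Fault instrumentation name (as used in this thread) to the
# MIB subtree OID quoted above. Hypothetical helper, not part of Smarts.
card_mib_oid() {
    case "$1" in
        CiscoEntityFRU)  echo "1.3.6.1.4.1.9.9.117" ;;
        OldCiscoChassis) echo "1.3.6.1.4.1.9.3.6" ;;
        CiscoStack)      echo "1.3.6.1.4.1.9.5.1.3.1" ;;
        *) echo "unknown instrumentation: $1" >&2; return 1 ;;
    esac
}

# Walk each subtree with Net-SNMP's snmpwalk (assumed installed, SNMP v2c)
# and report whether the device returned any real data for it.
check_card_mibs() {
    host=$1
    community=$2
    for name in CiscoEntityFRU OldCiscoChassis CiscoStack; do
        oid=$(card_mib_oid "$name")
        # Drop the "nothing here" answers an agent gives for unsupported OIDs.
        if snmpwalk -v2c -c "$community" "$host" "$oid" 2>/dev/null |
            grep -v -e 'No Such Object' -e 'No Such Instance' \
                    -e 'No more variables' | grep -q .; then
            echo "$name ($oid): responds"
        else
            echo "$name ($oid): no data"
        fi
    done
}
```

Running check_card_mibs with the router's address and read community prints one line per instrumentation type; a type that shows "no data" is unlikely to be a good choice for Card-Fault on that device.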
Benjamin Johns
Senior Technology Consultant
iQ Consult Pty Ltd