Error in getting data from a Legato Networker 7.6 SP1

Hi,

I got the following error:

INFO    12395.1        20110701:124713         clctr.config - registerCollectorConfig(): Sending Controller gf0vswas022p:3916 our initial configuration
INFO    12395.1        20110701:124718       clctr.mod.disk - module_startup(): DPA Disk Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718      clctr.mod.fchba - module_startup(): DPA FCHBA Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718         clctr.mod.fs - module_startup(): DPA Filesystem Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718       clctr.mod.host - module_startup(): DPA Host Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718     clctr.mod.memory - module_startup(): DPA Memory Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718     clctr.mod.netint - module_startup(): DPA NetInt Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718        clctr.mod.nsr - module_startup(): DPA Networker Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718    clctr.mod.process - module_startup(): DPA Process Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718  clctr.mod.processor - module_startup(): DPA Processor Module - Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718                clctr - initAgentWorkers(): created 5 workers
INFO    12395.1        20110701:124718                clctr - daemonCollector(): DPA Collector on gf02sxnw10
INFO    12395.1        20110701:124718                clctr - daemonCollector(): Copyright 2002 - 2011 EMC Corporation.  All rights reserved.
INFO    12395.1        20110701:124718                clctr - daemonCollector(): Version: 5.8.0 build 5934 on solaris (sparc-64)
INFO    12395.1        20110701:124718                clctr - daemonCollector(): Logging at level Info
INFO    12395.1        20110701:124718                clctr - daemonCollector(): Collector started and listening on port 3741
INFO    12395.7        20110701:125235      clctr.mod.fchba - fchba_perf(): Failed to get adapter performance for adapter Emulex-LPe11002-S-0. Statistics are not gathered by the HBA. Please review HBA firmware version and documentation to see if this is provided with a later version
WARN    12395.8        20110701:125817    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.8        20110701:125817    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.8        20110701:130339    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.8        20110701:130339    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.6        20110701:130912    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:130912    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.8        20110701:131425    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.8        20110701:131425    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.8        20110701:132042    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.8        20110701:132042    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.8        20110701:132556    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.8        20110701:132556    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.6        20110701:133128    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:133128    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
INFO    12395.65       20110701:134258                clctr - agentCommand(): received reload
WARN    12395.6        20110701:134843    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:134843    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.6        20110701:135357    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:135357    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.8        20110701:135357    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.8        20110701:135357    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
INFO    12395.65       20110701:135912                clctr - agentCommand(): received reload
WARN    12395.6        20110701:135922    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:135922    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.6        20110701:140629    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:140629    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.6        20110701:141150    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:141150    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.5        20110701:141700    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.5        20110701:141700    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.6        20110701:141700    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:141700    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.5        20110701:142209    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.5        20110701:142209    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.6        20110701:142209    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:142209    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.6        20110701:142721    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:142721    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.5        20110701:143226    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.5        20110701:143227    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.6        20110701:143227    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.6        20110701:143227    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
INFO    12395.65       20110701:143423                clctr - agentCommand(): received reload
INFO    12395.65       20110701:143614                clctr - agentCommand(): received reload
INFO    12395.65       20110701:143616                clctr - agentCommand(): received reload
WARN    12395.7        20110701:143729    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.7        20110701:143729    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'
WARN    12395.5        20110701:144238    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.5        20110701:144238    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
WARN    12395.7        20110701:144737    fwk.base.sysutils - csystemreadline(): timeout while waiting for child process
ERR     12395.7        20110701:144737    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'

Is there anybody who could help?

Bye Jens

Accepted Solutions
swp1

Re: Error in getting data from a Legato Networker 7.6 SP1

Hi Jens,

The error is most likely caused by the jobmonitor request. By default, once the jobmonitor starts it has 5 minutes to complete its work. If it does not finish within that time, it produces the timeout error you are seeing in the log. To verify that it is the jobmonitor, in the DPA GUI right-click the 'gf0vsxnw10' node -> DPA -> Errors -> Collector Errors, and run the report for 'Last Day'. In the report you should see the same timeout error in the 'Message' field. Check the 'Function' field associated with this message (most likely it will be jobmonitor).
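As a cross-check outside the GUI, the collector log itself can be summarized. The following is only a minimal sketch: it assumes the log lines have the same layout as the excerpt posted in the question, and the function name `summarize_timeouts` is made up for illustration, not part of DPA. It counts the nsrRunQuery timeouts per server and reports the gaps between consecutive timeouts, which should sit a little above the 5-minute default if the timeout is what is being hit.

```python
import re
from collections import Counter
from datetime import datetime

# Matches ERR lines in the layout shown in the question, e.g.
# ERR  12395.8  20110701:125817  clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'
TIMEOUT_RE = re.compile(
    r"^ERR\s+\S+\s+(?P<ts>\d{8}:\d{6})\s+\S+\s+-\s+nsrRunQuery\(\): "
    r"timeout running (?P<query>\S+) against server '(?P<server>[^']+)'"
)


def summarize_timeouts(lines):
    """Return (counts, gaps) keyed by (server, query).

    counts: number of timeout errors per key.
    gaps:   seconds between consecutive timeouts per key, in log order.
    """
    counts = Counter()
    stamps = {}
    for line in lines:
        m = TIMEOUT_RE.match(line.strip())
        if not m:
            continue  # skip INFO/WARN lines and anything in another layout
        key = (m.group("server"), m.group("query"))
        counts[key] += 1
        ts = datetime.strptime(m.group("ts"), "%Y%m%d:%H%M%S")
        stamps.setdefault(key, []).append(ts)

    gaps = {
        key: [int((b - a).total_seconds()) for a, b in zip(ts, ts[1:])]
        for key, ts in stamps.items()
    }
    return counts, gaps


if __name__ == "__main__":
    # Three ERR lines copied from the log in the question.
    sample = [
        "ERR     12395.8        20110701:125817    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'",
        "ERR     12395.8        20110701:130339    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf02sxnw10'",
        "ERR     12395.6        20110701:130912    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'",
    ]
    counts, gaps = summarize_timeouts(sample)
    for key, n in counts.items():
        print(key, n, gaps[key])
```

To run it against a real log, pass the open file object instead of `sample`. The location of the collector log depends on the installation, so it is deliberately left out here.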

Once the function has been determined, you should change the timeout value for this particular function. To do this:

- Right click on 'gf0vsxnw10' -> Administration -> Properties

- When the 'Node Properties' window appears, click on the 'Assignations' tab.

- Scroll down the list of 'Requests' until you see the function in question, select it and click on 'Edit'

- Once you see the 'Request Settings' you will need to increase the values for 'Period' and 'Timeout'.

     - For Period, uncheck the 'Use Default Period' box and then increase the number of minutes. I recommend doubling it. For example, if it displays 5 minutes, raise it to 10.

     - For Timeout, check the box and double that value as well.

- Once the changes have been made, click 'OK' and then reload the collector.

Once this has been done, wait about 30 minutes and then rerun the 'Collector Errors' report for 'Last Hour'. When the report renders, look for that error message again and check when it last occurred. If the most recent occurrence was about 30 minutes ago (that is, before you made the change), you should be in the clear. You can also run the 'Request History' report in 'DPA' -> 'History' to verify that the function has run successfully.
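The same "has it happened since the change?" check can be done against the collector log. This is a rough sketch only: it assumes the timestamp is the third whitespace-separated field, as in the log posted in the question, and `last_timeout` and `clear_since` are hypothetical helper names, not DPA tools.

```python
from datetime import datetime


def last_timeout(lines, marker="timeout running"):
    """Return the timestamp of the most recent line containing `marker`, or None."""
    latest = None
    for line in lines:
        if marker not in line:
            continue
        fields = line.split()
        # Third field is the timestamp, e.g. 20110701:144737
        ts = datetime.strptime(fields[2], "%Y%m%d:%H%M%S")
        if latest is None or ts > latest:
            latest = ts
    return latest


def clear_since(lines, changed_at):
    """True if no query timeout was logged at or after `changed_at`."""
    latest = last_timeout(lines)
    return latest is None or latest < changed_at


if __name__ == "__main__":
    # Two lines copied from the log in the question: a reload, then a timeout.
    sample = [
        "INFO    12395.65       20110701:143616                clctr - agentCommand(): received reload",
        "ERR     12395.7        20110701:144737    clctr.mod.nsr.lib - nsrRunQuery(): timeout running jobquery against server 'gf0vsxnw10'",
    ]
    reloaded_at = datetime(2011, 7, 1, 14, 36, 16)
    print("last timeout:", last_timeout(sample))
    print("clear since reload:", clear_since(sample, reloaded_at))
```

Use the time at which you reloaded the collector as `changed_at`. If timeouts keep appearing after that point, the new Timeout value is probably still too short for this server.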

If you have any questions or concerns, let me know.

NOTE: Some requests may not have a timeout value to change. If it turns out to be one of those requests, please let me know.

Thanks,

Sang Pak

EMC Technical Support
