SRM 4.4.3 and 4.5: Oracle database collection does not recover after connection interruption

Summary: The SRM Oracle Database collector does not automatically recover when its session with the Oracle database is interrupted.

Symptoms

  • Data collection for the database in question fails and multiple timeout errors are observed in the collecting logs every polling cycle for the affected server:
SEVERE	 -- [2021-05-28 15:25:59 JST] -- AbstractJobExecutor::executeJobRunner(): Error while executing job Oracle-Perf-x-[IP-ADDRESS]-ORACLE32-getparameters removing it from the queue
com.watch4net.apg.concurrent.JobExecutionException: Error during execution of main query 'getparameters' on Oracle-Perf-x-[IP-ADDRESS]-ORACLE32-getparameters
	at com.watch4net.apg.v2.collector.plugins.sqlcollector.polling.MainCollectorJob.step(MainCollectorJob.java:196)
	at com.watch4net.apg.concurrent.executor.AbstractJobExecutor.executeJobRunner(AbstractJobExecutor.java:130)
	at com.watch4net.apg.concurrent.executor.AbstractJobExecutor.access$500(AbstractJobExecutor.java:25)
	at com.watch4net.apg.concurrent.executor.AbstractJobExecutor$JobRunnerImpl.run(AbstractJobExecutor.java:287)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
	at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
	at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
	at com.watch4net.apg.v2.collector.plugins.sqlcollector.polling.MainCollectorJob.step(MainCollectorJob.java:180)
	... 6 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
	at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134)
	at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
	... 8 more
  • The database in question is accessible and available, and can be accessed by SRM.
  • No errors or issues are reported during the SRM discovery test.

Cause

SRM does not retry the connection after its connection to the Oracle database is interrupted, for example when the Oracle database is restarted. As a result, every subsequent polling cycle fails to obtain a database connection, even though the database itself is available again.
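
The missing behavior described above is a retry that reconnects after a failure. The following is a minimal Java sketch of that general pattern, included only to illustrate the idea. It is not SRM's collector code and not the fix shipped in SRM 4.6; the class `ReconnectingRunner`, its nested interfaces, and the parameters `maxAttempts` and `backoffMillis` are all hypothetical names introduced here.

```java
/**
 * Hypothetical helper (an assumption for illustration, not part of SRM or
 * commons-dbcp): runs a unit of work against a freshly opened session and,
 * if anything fails, opens a new session and tries again.
 */
final class ReconnectingRunner {

    /** Opens a new session, for example a JDBC connection. */
    interface SessionFactory<S extends AutoCloseable> {
        S open() throws Exception;
    }

    /** One unit of work against an open session, for example a polling query. */
    interface Work<S, R> {
        R run(S session) throws Exception;
    }

    private ReconnectingRunner() {}

    static <S extends AutoCloseable, R> R runWithReconnect(
            SessionFactory<S> factory, Work<S, R> work,
            int maxAttempts, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // A new session is opened on every attempt, so a session broken by
            // a database restart is never reused. It is closed automatically.
            try (S session = factory.open()) {
                return work.run(session);
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff before the next attempt.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // Every attempt failed: surface the most recent error to the caller.
        throw last;
    }
}
```

With JDBC, the factory could be a lambda such as `() -> DriverManager.getConnection(url, user, password)`, where `url`, `user`, and `password` are placeholders. Opening a new session per attempt trades some efficiency for simplicity; a pooled setup would instead need to discard the broken connection before borrowing another.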

Resolution

Dell EMC SRM Engineering is aware of this issue and has implemented a fix in SRM version 4.6.

For SRM 4.4.3 and above, the SRM 4.6 Oracle Database package containing the fix can be installed. To obtain the fix for SRM 4.4.3 and above, go to the Dell EMC Online Support site (https://dell.com/support), open a Service Request (SR), and reference this Knowledgebase article number in the SR.

To work around this issue in versions prior to SRM 4.6:
  • Restart the Oracle Database collector-manager service. This resolves the issue until the next occurrence.

Affected Products

SRM

Article Properties
Article Number: 000189045
Article Type: Solution
Last Modified: 07 Sept 2021
Version: 3