SRM 4.4.3 and 4.5: Oracle database collection does not recover after connection interruption
Summary: The SRM Oracle Database collector does not automatically recover after its connection to the Oracle database is interrupted.
Symptoms
- Data collection for the database in question fails and multiple timeout errors are observed in the collecting logs every polling cycle for the affected server:
SEVERE -- [2021-05-28 15:25:59 JST] -- AbstractJobExecutor::executeJobRunner(): Error while executing job Oracle-Perf-x-[IP-ADDRESS]-ORACLE32-getparameters removing it from the queue
com.watch4net.apg.concurrent.JobExecutionException: Error during execution of main query 'getparameters' on Oracle-Perf-x-[IP-ADDRESS]-ORACLE32-getparameters
    at com.watch4net.apg.v2.collector.plugins.sqlcollector.polling.MainCollectorJob.step(MainCollectorJob.java:196)
    at com.watch4net.apg.concurrent.executor.AbstractJobExecutor.executeJobRunner(AbstractJobExecutor.java:130)
    at com.watch4net.apg.concurrent.executor.AbstractJobExecutor.access$500(AbstractJobExecutor.java:25)
    at com.watch4net.apg.concurrent.executor.AbstractJobExecutor$JobRunnerImpl.run(AbstractJobExecutor.java:287)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
    at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
    at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
    at com.watch4net.apg.v2.collector.plugins.sqlcollector.polling.MainCollectorJob.step(MainCollectorJob.java:180)
    ... 6 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
    at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134)
    at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
    ... 8 more
- The database in question is accessible and available, and can be accessed by SRM.
- No errors or issues are reported during the SRM discovery test.
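To confirm the first symptom, it can help to count how often the pool timeout recurs per job in a collecting log. The following is a minimal Python sketch, not a Dell EMC tool. It assumes that every log entry starts with a level and timestamp in the same shape as the SEVERE line above (LEVEL -- [timestamp] --), and that the job name follows the words "Error while executing job". The function names are hypothetical, and the log text is passed in as a string because the log location varies by installation.

```python
import re
from collections import Counter

# Signature copied from the stack trace in the Symptoms section.
POOL_TIMEOUT = "Timeout waiting for idle object"

# Assumption: each log entry begins with "LEVEL -- [", as the SEVERE line does.
ENTRY_START = re.compile(r"^[A-Z]+ -- \[", re.MULTILINE)

# The job name is the token that follows "Error while executing job".
JOB_PATTERN = re.compile(r"Error while executing job (\S+)")


def split_entries(log_text):
    """Split raw log text into entries, one per 'LEVEL -- [' header."""
    starts = [m.start() for m in ENTRY_START.finditer(log_text)]
    if not starts:
        return []
    starts.append(len(log_text))
    return [log_text[a:b] for a, b in zip(starts, starts[1:])]


def count_pool_timeouts(log_text):
    """Return a Counter mapping job name to the number of pool-timeout entries."""
    counts = Counter()
    for entry in split_entries(log_text):
        # The signature appears twice in one stack trace, so count per entry.
        if POOL_TIMEOUT not in entry:
            continue
        match = JOB_PATTERN.search(entry)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A job whose count grows by one on every polling cycle, while the database itself is reachable, matches the pattern described in this article.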
Cause
SRM does not retry the connection after its connection to the Oracle database is interrupted, for example when the Oracle database is restarted. As a result, SRM continues to fail to connect to the database.
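For illustration only, the general idea of retrying after an interruption can be sketched in Python as follows. This is not SRM code and does not describe how the SRM 4.6 fix is implemented. The class name, the connect factory, and the execute method are hypothetical, and the sketch assumes that a broken session surfaces as ConnectionError.

```python
class ReconnectingSession:
    """Hold one connection; on a failed query, reconnect once and retry."""

    def __init__(self, connect):
        # Hypothetical factory: a callable that returns a new connection object
        # exposing an execute(query) method.
        self._connect = connect
        self._conn = None

    def run(self, query):
        if self._conn is None:
            self._conn = self._connect()
        try:
            return self._conn.execute(query)
        except ConnectionError:
            # The old session is unusable (for example, the database restarted).
            # Replace it with a fresh connection and retry the query once.
            # A second failure propagates to the caller.
            self._conn = self._connect()
            return self._conn.execute(query)
```

Without the except branch, every later call would reuse the dead session and fail on each polling cycle, which is the behaviour described above.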
Resolution
Dell EMC SRM Engineering is aware of this issue and has implemented a fix in SRM version 4.6.
For SRM 4.4.3 and above, the SRM 4.6 Oracle Database package containing the fix can be installed. To obtain the fix for SRM 4.4.3 and above, go to the Dell EMC Online Support site (https://dell.com/support), open a Service Request (SR), and reference this Knowledgebase article number in your SR.
To work around this issue in versions prior to SRM 4.6:
- Restarting the Oracle Database collector-manager service resolves the issue until the next occurrence.
Affected Products
SRM
Article Properties
Article Number: 000189045
Article Type: Solution
Last Modified: 07 Sept 2021
Version: 3