Our team put together this checklist for reviewing an existing ViPR SRM environment ahead of a pending upgrade.
If you have any questions or suggestions, please let us know.
- 1) Run the latest Calypso reporting and review the Diagnostic_support.zip output
- Check Configuration and settings, Services, Tasks, CPU, Memory, Java Heap, File Space, Grants, Metrics on the Backends, Backlogs, Modules, etc.
- 2) Log in to the ViPR SRM UI Centralized Management – Click on Physical Overview.
- Are graphs recording data? – cpu, memory and disk utilization
- 3) Click on each host - Select the Services tab on the right
- Click on show all entries - Are all services running?
- 4) Click on each host – Select the Tasks tab on the right
- Click on show all entries - Are all enabled tasks completing successfully according to their schedules?
- 5) Check Licensing – Has the customer loaded the ELMS license or are they using temp licenses?
- Make sure the customer has a valid license and that it is not about to expire
- If any licenses have expired, delete them and re-synchronize.
- 6) Go to the User interface and expand Report Library
- Are all top level reports under Report Library showing information as they should?
- Any broken links?
- 7) Go to Explore>Storage>Storage Systems>Array Ser #>Topology Map
- Validate that end-to-end topology is working
- 8) Go to Explore>Hosts>Hostname>Topology Map
- Validate that end-to-end topology is working
- 9) Continue to M&R Health Reports
- Report Library>EMC M&R Health>Stress
- Change display to show 6 hours rather than the default of 1 week
- Look for Major and Critical indications – click on the box to drill down into a specific component
- There are 4 areas that apply to each component – Alerts, Availability (%), CPU usage (%), Memory Usage (%).
- Report Library>EMC M&R Health>Stress>Components>Backends
- Memory Utilization % should be less than 70%
- CPU Usage % should be less than 60%
- Current Queued Files Count should always be less than 20
- Report Library>EMC M&R Health>Stress>Components>Collector Managers
- Change display to show 6 hours
- Show all to be sure all information is presented
- Sort by severe log count column
- Memory Utilization % should be less than 90%
- CPU Usage % should be less than 60%
- Review specific reports for instances showing overages
- Report Library>EMC M&R Health>Stress>Compliance Backend
- Any issues with CPU and Memory?
- Report Library>EMC M&R Health>Stress>Components>Databases
- Current Metrics Count should be less than 1.5 million
- Contact SAM to connect with PS if database has to be split
- Report Library>EMC M&R Health>Stress>Event Processing Managers
- Any issues with CPU and Memory?
- Warning log count may be high – no issue.
- Report Library>EMC M&R Health>Stress>Topology Mapping Service
- Any issues with CPU and Memory?
- Report Library>EMC M&R Health>Stress>Components>Web Servers (Tomcat)
- Availability % should be near 100%
- CPU Usage % should be under defined thresholds
- Memory Utilization % should be under defined thresholds
- Increase memory and CPU as needed
- Report Library>EMC M&R Health>Servers Summary
- CPU Utilization values should be less than 60%
- Swap Usage values should be at or near 0
- Add memory if needed
- Report Library>EMC M&R Health>Servers Summary>server_name>File Systems Stats
- Utilization should be under 60%
- Add space if needed
- Report Library>EMC M&R Health>Servers Summary>server_name>Backend Temp File Count
- Temp file count should rise and fall consistently
- Report Library>EMC M&R Health>Misc. Reports>JVMs Sizing Recommendations
- Currently Allocated should be above Recommended Allocation
- Total Currently Allocated per host should not be more than physical RAM available
- Review other files for errors
- Collecting/Collector-Manager/Default/logs/collecting-0-0.log (for collection)
- Web-Servers/Tomcat/Default/logs/apg-tomcat-default.out (for current tomcat running instance)
- Web-Servers/Tomcat/Default/logs/catalina.<date>.log (the system will indicate if there are any errors)
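If you would rather not eyeball each log by hand, the review of the three log files above can be roughly scripted. This is only a sketch: `scan_logs`, `base_dir`, and the keyword pattern are our own illustrative names and assumptions, not anything shipped with ViPR SRM, and `base_dir` stands in for wherever the product is installed in your environment.

```python
import re
from pathlib import Path

# Assumption: lines worth reviewing contain one of these keywords.
# Adjust the pattern to whatever your logs actually use.
ERROR_PATTERN = re.compile(r"ERROR|SEVERE|Exception")

# Relative paths from the checklist; the catalina date is matched with a wildcard.
LOG_GLOBS = [
    "Collecting/Collector-Manager/Default/logs/collecting-0-0.log",
    "Web-Servers/Tomcat/Default/logs/apg-tomcat-default.out",
    "Web-Servers/Tomcat/Default/logs/catalina.*.log",
]


def scan_log(path, pattern=ERROR_PATTERN):
    """Return (line_number, text) pairs for lines matching the pattern."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits


def scan_logs(base_dir, relative_globs=LOG_GLOBS):
    """Scan every file under base_dir matching any glob; return {path: hits}."""
    results = {}
    for glob in relative_globs:
        for path in sorted(Path(base_dir).glob(glob)):
            hits = scan_log(path)
            if hits:
                results[str(path)] = hits
    return results
```

Calling `scan_logs("<install root>")` returns a dictionary keyed by file path, so an empty result means none of the keywords were found in those files. It does not replace reading the logs when something looks off.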
Issues seen with upgrades include solution pack configurations, missing grants, and missing information in the APG and APG-WS files on the front end, so review these entries as well.
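For anyone who wants to record the numbers from the health reports in step 9 and check them in one pass, here is a minimal sketch. The limits are the ones from the checklist; the component and metric key names (`backend`, `memory_pct`, and so on) are hypothetical labels we made up for illustration, and the readings are assumed to be typed in by hand from the reports.

```python
# Upper limits from the checklist; a reading must be strictly below its limit.
# Keys are (component, metric) labels invented for this sketch.
LIMITS = {
    ("backend", "memory_pct"): 70,
    ("backend", "cpu_pct"): 60,
    ("backend", "queued_files"): 20,
    ("collector_manager", "memory_pct"): 90,
    ("collector_manager", "cpu_pct"): 60,
    ("database", "metrics_count"): 1_500_000,
    ("server", "cpu_pct"): 60,
    ("server", "filesystem_pct"): 60,
}


def find_violations(readings, limits=LIMITS):
    """Return (component, metric, value, limit) for readings at or above the limit."""
    violations = []
    for key, value in readings.items():
        limit = limits.get(key)
        if limit is None:
            continue  # no checklist limit for this reading
        if value >= limit:
            component, metric = key
            violations.append((component, metric, value, limit))
    return violations


if __name__ == "__main__":
    # Made-up sample values, not real measurements.
    sample = {
        ("backend", "memory_pct"): 72,
        ("backend", "cpu_pct"): 40,
        ("database", "metrics_count"): 1_200_000,
    }
    for component, metric, value, limit in find_violations(sample):
        print(f"{component} {metric}: {value} (should be less than {limit})")
```

Items without a fixed number in the checklist, such as the Tomcat thresholds or swap usage "at or near 0", are left out on purpose and still need a manual look.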