fredde1

Reviewing Environment for Upgrade

Our team put together this checklist to help review an existing ViPR SRM environment ahead of a pending upgrade.

If you have any questions or suggestions regarding it, please let us know.

  1. Run the latest Calypso reporting and review the Diagnostic_support.zip output
    1. Check Configuration and settings, Services, Tasks, CPU, Memory, Java Heap, File Space, Grants, Metrics on the Backends, Backlogs, Modules, etc.
  2. Log in to the ViPR SRM UI Centralized Management – Click on Physical Overview.
    1. Are graphs recording data? – CPU, memory and disk utilization
  3. Click on each host - Select the Services tab on the right
    1. Click on show all entries - Are all services running?
  4. Click on each host – Select the Tasks tab on the right
    1. Click on show all entries - Are all enabled tasks completing successfully according to their schedules?
  5. Check Licensing – Has the customer loaded the ELMS license or are they using temp licenses?
    1. Make sure the customer has a valid license and that it is not about to expire
    2. If any licenses have expired, delete them and re-synchronize.
  6. Go to the User interface and expand Report Library
    1. Are all top level reports under Report Library showing information as they should?
    2. Any broken links?
  7. Go to Explore>Storage>Storage Systems>Array Ser #>Topology Map
    1. Validate that end-to-end topology is working
  8. Go to Explore>Hosts>Hostname>Topology Map
    1. Validate that end-to-end topology is working
  9. Continue to M&R Health Reports
    1. Report Library>EMC M&R Health>Stress
      1. Change display to show 6 hours rather than the default of 1 week
      2. Look for Major and Critical indications – click on the box to drill down into a specific component
        1. There are 4 areas that apply to each component – Alerts, Availability (%), CPU usage (%), Memory Usage (%).
    2. Report Library>EMC M&R Health>Stress>Components>Backends
      1. Memory Utilization % should be less than 70%
      2. CPU Usage % should be less than 60%
      3. Current Queued Files Count should always be less than 20
    3. Report Library>EMC M&R Health>Stress>Components>Collector Managers
      1. Change display to show 6 hours
      2. Show all to be sure all information is presented
      3. Sort by severe log count column
      4. Memory Utilization % should be less than 90%
      5. CPU Usage % should be less than 60%
      6. Review specific reports for instances showing overages
    4. Report Library>EMC M&R Health>Stress>Compliance Backend
      1. Any issues with CPU and Memory?
    5. Report Library>EMC M&R Health>Stress>Components>Databases
      1. Current Metrics Count should be less than 1.5 million
      2. Contact SAM to connect with PS if database has to be split
    6. Report Library>EMC M&R Health>Stress>Event Processing Managers
      1. Any issues with CPU and Memory?
      2. Warning log count may be high – no issue.
    7. Report Library>EMC M&R Health>Stress>Topology Mapping Service
      1. Any issues with CPU and Memory?
    8. Report Library>EMC M&R Health>Stress>Components>Web Servers (Tomcat)
      1. Availability % should be near 100%
      2. CPU Usage % should be under defined thresholds
      3. Memory Utilization % should be under defined thresholds
      4. Increase memory and CPU as needed
    9. Report Library>EMC M&R Health>Servers Summary
      1. CPU Utilization values should be less than 60%
      2. Swap Usage values should be at or near 0
      3. Add memory if needed
    10. Report Library>EMC M&R Health>Servers Summary>server_name>File Systems Stats
      1. Utilization should be under 60%
      2. Add space if needed
    11. Report Library>EMC M&R Health>Servers Summary>server_name>Backend Temp File Count
      1. Temp file count should rise and fall consistently
    12. Report Library>EMC M&R Health>Misc. Reports>JVMs Sizing Recommendations
      1. Currently Allocated should be above Recommended Allocation
      2. Total Currently Allocated per host should not be more than physical RAM available
    13. Review other files for errors
      1. Collecting/Collector-Manager/Default/logs/collecting-0-0.log (for collection)
      2. Web-Servers/Tomcat/Default/logs/apg-tomcat-default.out (for current tomcat running instance)
      3. Web-Servers/Tomcat/Default/logs/catalina.<date>.log (the system will indicate if there are any errors)
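
The log review in the last step can be partly scripted. Below is a minimal sketch, not an official tool. It assumes the install root is /opt/APG (adjust for your environment) and that lines of interest contain SEVERE, ERROR or Exception. Both the root path and the keyword list are assumptions, not something the product defines. The date in the catalina file name is matched with a wildcard.

```python
import glob
import os
import re

# Hypothetical install root; change this to the actual APG base directory.
APG_BASE = "/opt/APG"

# Relative log paths from the checklist. The catalina log name contains a
# date, so a wildcard stands in for it.
LOG_PATTERNS = [
    "Collecting/Collector-Manager/Default/logs/collecting-0-0.log",
    "Web-Servers/Tomcat/Default/logs/apg-tomcat-default.out",
    "Web-Servers/Tomcat/Default/logs/catalina.*.log",
]

# Assumed keywords that mark a line worth reading; extend as needed.
ERROR_RE = re.compile(r"SEVERE|ERROR|Exception")


def find_error_lines(base, patterns=LOG_PATTERNS, limit=20):
    """Return {path: [(line_no, text), ...]} for lines matching ERROR_RE.

    At most `limit` matching lines are kept per file.
    """
    hits = {}
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(base, pattern))):
            matches = []
            with open(path, errors="replace") as fh:
                for line_no, line in enumerate(fh, start=1):
                    if ERROR_RE.search(line):
                        matches.append((line_no, line.rstrip()))
                        if len(matches) >= limit:
                            break
            if matches:
                hits[path] = matches
    return hits


if __name__ == "__main__":
    for path, lines in find_error_lines(APG_BASE).items():
        print(path)
        for line_no, text in lines:
            print(f"  {line_no}: {text}")
```

Run it on each host that carries the relevant module. Files that do not exist on a host are simply skipped.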

Issues commonly seen with upgrades include solution pack configurations, missing grants, and missing information in the APG and APG-WS files on the front end, so review these entries as well.
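
The numeric limits from the M&R Health steps in the checklist can be collected in one place so that values read off the reports are checked consistently. The sketch below is only an illustration: the component and metric labels are made-up names, the readings are typed in by hand from the reports, and nothing here queries ViPR SRM itself. The limits are the ones given in the checklist.

```python
# Upper limits from the checklist. The (component, metric) labels are
# hypothetical names chosen for this sketch.
LIMITS = {
    ("backend", "memory_pct"): 70,
    ("backend", "cpu_pct"): 60,
    ("backend", "queued_files"): 20,
    ("collector_manager", "memory_pct"): 90,
    ("collector_manager", "cpu_pct"): 60,
    ("database", "metrics_count"): 1_500_000,
    ("server", "cpu_pct"): 60,
    ("server", "filesystem_pct"): 60,
}


def check_limits(component, readings):
    """Return [(metric, value, limit), ...] for readings at or above a limit.

    The checklist says each value should be *less than* its limit, so a
    reading equal to the limit is reported as well.
    """
    breaches = []
    for metric, value in readings.items():
        limit = LIMITS.get((component, metric))
        if limit is not None and value >= limit:
            breaches.append((metric, value, limit))
    return breaches


def jvm_allocation_ok(allocated_mb, physical_ram_mb):
    """True if the total JVM allocation on a host fits in physical RAM."""
    return sum(allocated_mb) <= physical_ram_mb


if __name__ == "__main__":
    # Example readings are invented for illustration only.
    print(check_limits("backend", {"memory_pct": 82, "cpu_pct": 35, "queued_files": 4}))
    # prints [('memory_pct', 82, 70)]
    print(jvm_allocation_ok([4096, 2048], 8192))
    # prints True
```

Checks that have no fixed number in the checklist (for example the Tomcat thresholds, or whether the temp file count rises and falls consistently) are left out on purpose and still need a manual look.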
