
Re: Monster VMs performance 32vCPUs experience? beyond 16vCPUs drops?

Please see the original post - I am in no way recommending a monster VM like this for every Oracle VM workload; in fact, best practice is to divide the workload whenever possible. But if you can't go that way, as in our case, read on for the details I was able to capture.

Good news: the customer is running a benchmark test on a system with 32 cores (HP B660 with 4x4650 with 8 cores each) and 768GB RAM. A monster VM now scales fine up to 28vCPUs and only drops to the performance of 24 cores with 32vCPUs. We haven't run the application yet, but it looks like Hyperthreading (HT) didn't work well on the previous machine. We are waiting for the vSphere 5.5 release to try advanced features and to apply best practices for Oracle tuning on a VMware VM. As posted earlier, the document on VMware's Oracle landing page seems to be outdated (2011), so here are more recent hints and tips for the scenario in which we currently run Oracle.

  • Oracle HT recommended
  • Memory
    • Oracle - set the memory reservation directly on the VM for a 32vCPU system, possibly up to 100%. We are using 128GB of VM RAM, but this value may be different for you; follow the formula from the best practices if you want to share some resources (in our case the maximum performance is needed for a nightly 8-hour job).
      • Memory reservation = size of the SGA + 2 times the aggregate PGA target + 500MB for the OS (assuming some flavor of Linux).
      • These should be for Production Clusters only, as Development and Test databases do not usually require peak performance.
    • Faster RAM - more physical memory can be counter-productive if it's accessed across the architecture (a CPU accesses the memory of another CPU, or the memory has higher latency due to the memory architecture). A Cisco B440 with 4x8 physical cores and 256GB memory has proven to be a pretty good sweet spot in terms of price / performance / resources.
    • NUMA locality
    • Linux huge pages (typical default page size 4K, large / huge pages 2MB, can be adjusted) - Oracle memory lookups (linear search) can improve by up to 15% - see also the EMC at VMworld 2013 Oracle session for more detail
  • Reservation / prioritization generally
    • CPU: Prod, Test, Dev in the same ESX cluster - e.g. Prod priority 1 / Test priority 2 / Dev priority 3 ...
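
To make the memory numbers above a bit more concrete, here is a minimal Python sketch of the reservation formula (SGA + 2 times the aggregate PGA target + 500MB for the OS). It also estimates how many 2MB huge pages would be needed to back the SGA. The function names and the 64GB / 16GB example values are hypothetical and are not our actual settings. Sizing huge pages to cover exactly the SGA is an assumption on my part, so check it against your own Oracle and Linux configuration.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def memory_reservation_bytes(sga_bytes, pga_aggregate_target_bytes,
                             os_overhead_bytes=500 * MB):
    """Reservation = SGA + 2 x aggregate PGA target + OS overhead.

    The 500MB default for the OS follows the rule of thumb quoted above
    (assuming some flavor of Linux).
    """
    return sga_bytes + 2 * pga_aggregate_target_bytes + os_overhead_bytes


def huge_pages_for_sga(sga_bytes, huge_page_bytes=2 * MB):
    """Number of huge pages needed to cover the SGA, rounded up.

    Assumption: only the SGA is backed by huge pages, and the huge page
    size is 2MB (adjust if your system uses a different size).
    """
    return -(-sga_bytes // huge_page_bytes)  # integer ceiling division


if __name__ == "__main__":
    # Hypothetical example values, not taken from the setup described here.
    sga = 64 * GB
    pga_target = 16 * GB

    reservation = memory_reservation_bytes(sga, pga_target)
    print(f"Memory reservation: {reservation / GB:.2f} GB")
    print(f"2MB huge pages for the SGA: {huge_pages_for_sga(sga)}")
```

If the reservation comes out close to the configured VM RAM, that is the "up to 100%" case mentioned above; if it is well below, the difference is what the VM could still share with other workloads.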

These are the key topics we identified in our situation - you might want to visit the following resources, which we looked at and which provide further detail:

Thanks to Darryl Smith, Bart Sjerps, Sam Lucido, itzikr and Jeff Browning, who provided input and insight.