VxBlock: Improved Memory RAS Features for Cisco UCS M5 Platforms

Summary: Due to memory DIMM errors and architectural changes in memory error handling on Intel Xeon Scalable processors (formerly code-named "Skylake Server") and 2nd Gen Intel Xeon Scalable processors (formerly code-named "Cascade Lake Server"), Cisco UCS M5 customers that experience memory DIMM errors might experience a higher rate of runtime uncorrectable memory errors than they experienced on previous generations with default SDDC Memory RAS mode. ...


Instructions

Problem Description

Cisco UCS M5 servers with certain Intel Xeon Scalable processors might experience a higher rate of runtime uncorrectable memory errors than previous generations with the default Single Device Data Correction (SDDC) Memory Reliability, Availability, and Serviceability (RAS) configuration.


Background

Intel Xeon Scalable processors and 2nd Gen Intel Xeon Scalable processors implemented changes in SDDC. SDDC is a fundamental Intel RAS feature available on all Cisco platforms. Because of these architectural changes, when dual inline memory module (DIMM) errors occur, the set of errors that is corrected differs between the previous generation of processors and the Xeon Scalable processor family.

The latest Intel microcode and BIOS enhancements improve management of memory errors by enabling additional Memory RAS features such as Adaptive Double Device Data Correction (ADDDC Sparing) and Post Package Repair (PPR). ADDDC Sparing and PPR are now the default Memory RAS configuration on Cisco UCS M5 servers with Intel Xeon Scalable processors.

Additional information about Memory RAS features, such as ADDDC and PPR, can be found in the following document:  Cisco UCS HX M5 Memory Technical Overview - Memory RAS Features.

Release Certification Matrix (RCM) Affected

  • RCM Releases prior to 7.0.3.0
  • RCM Releases prior to 6.7.9.0
  • RCM Releases prior to 6.5.16.0
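
The affected-release thresholds above can be checked programmatically. The following is a minimal sketch, not a Dell or Cisco tool. It assumes that an RCM release is written as four dot-separated numbers and that the first two numbers identify the release train (7.0, 6.7, or 6.5). The names FIXED_RCM, parse_rcm, and is_affected are hypothetical and used only for illustration. Components are compared numerically, so 6.5.9.0 is correctly treated as earlier than 6.5.16.0.

```python
from typing import Optional, Tuple

# First RCM release per train that is no longer affected, as listed in this article.
# Treating the first two components as the "train" is an assumption of this sketch.
FIXED_RCM = {
    (7, 0): (7, 0, 3, 0),
    (6, 7): (6, 7, 9, 0),
    (6, 5): (6, 5, 16, 0),
}


def parse_rcm(release: str) -> Tuple[int, ...]:
    """Split an RCM release string such as '6.7.9.0' into a tuple of integers."""
    parts = tuple(int(p) for p in release.strip().split("."))
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {release!r}")
    return parts


def is_affected(release: str) -> Optional[bool]:
    """Return True if the release is prior to the threshold for its train,
    False if it is at or beyond the threshold, and None if the train is not
    one of the three listed in this article."""
    version = parse_rcm(release)
    fixed = FIXED_RCM.get(version[:2])
    if fixed is None:
        return None
    return version < fixed


if __name__ == "__main__":
    for rcm in ("6.5.9.0", "6.7.9.0", "7.0.3.1", "6.0.1.0"):
        print(rcm, is_affected(rcm))
```

A result of None means that this article gives no threshold for that train; it does not mean the release is unaffected.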
 

UCS M5 Blades and Integrated UCS M5 Rack Servers

Cisco UCS 4.1(1d) and later releases expand memory fault coverage by adding two RAS features, Adaptive Double Device Data Correction (ADDDC Sparing) and Post Package Repair (PPR). Both are enabled and configured as "Platform Default" for Memory RAS configurations. Earlier releases had the Memory RAS configuration set to "Maximum Performance." The original firmware releases that incorporated ADDDC and PPR included UCS Manager 4.1(1d); however, a defect that may impact multiple systems, CSCvr79388, was identified in this version. Because of this defect, Cisco has changed the minimum required firmware and now recommends upgrading to UCS version 4.1(3b) or later, which includes the fix. The UCS 4.1(3b) release is included in the RCM versions listed below.


Standalone UCS M5 racks

In Cisco Integrated Management Controller (IMC) release 4.1(1d) and later, the Adaptive Double Device Data Correction (ADDDC Sparing) and Post Package Repair (PPR) features are available. They are enabled and configured as "Platform Default" for Memory RAS configurations. Earlier releases had the Memory RAS configuration set to "Maximum Performance." The same defect found in UCS Manager 4.1(1d) also affects Cisco IMC 4.1(1d). The first Cisco IMC version with the fix, 4.1(3b), is included in the RCM releases listed below.
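
The firmware guidance for UCS Manager and Cisco IMC comes down to comparing a running version against 4.1(3b). The sketch below shows one way to do that comparison. It assumes version strings of the form major.minor(maintenance + patch letters), such as 4.1(3b), and it assumes that patch letters sort alphabetically. The names parse_fw and meets_minimum are hypothetical; this is not a Cisco API.

```python
import re
from typing import Tuple

# Matches strings such as "4.1(3b)": major.minor(maintenance + patch letters).
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]+)\)$")

# Minimum recommended firmware named in this article.
MIN_RECOMMENDED = "4.1(3b)"


def parse_fw(version: str) -> Tuple[int, int, int, str]:
    """Turn '4.1(3b)' into (4, 1, 3, 'b') so that versions compare field by field."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware version: {version!r}")
    major, minor, maintenance, patch = match.groups()
    return int(major), int(minor), int(maintenance), patch


def meets_minimum(version: str, minimum: str = MIN_RECOMMENDED) -> bool:
    """Return True if the version is at or above the minimum.
    Assumption: patch letters order alphabetically (a < b < c ...)."""
    return parse_fw(version) >= parse_fw(minimum)


if __name__ == "__main__":
    for fw in ("4.0(4h)", "4.1(1d)", "4.1(3b)", "4.2(1a)"):
        print(fw, meets_minimum(fw))
```

Parsing into a tuple avoids the pitfalls of plain string comparison, which would order a maintenance number of 10 before 3.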


Required BIOS Settings

  • For UCS Manager managed servers with the BIOS POLICY for RAS configuration set to "Platform Default," no changes are required for ADDDC Sparing to take effect.
  • For UCS Manager managed servers with the BIOS POLICY for RAS configuration NOT set to "Platform Default," the policy must be changed to ADDDC Sparing (or Platform Default) to take advantage of ADDDC.
  • For Standalone (non-UCS Manager managed) servers, no changes are required for ADDDC Sparing to take effect.
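
The three rules above can be summarized as a small decision function. This is a sketch under stated assumptions: the policy values are the display labels used in this article ("Platform Default", "ADDDC Sparing", "Maximum Performance"), not identifiers from the UCS Manager API, and the function name ras_change_required is hypothetical.

```python
# RAS policy labels that already enable ADDDC Sparing, per the rules in this article.
# These are display labels from the article text, not UCS Manager API tokens.
ADDDC_ENABLED_POLICIES = {"platform default", "adddc sparing"}


def ras_change_required(ucsm_managed: bool, ras_policy: str = "Platform Default") -> bool:
    """Return True if the BIOS policy RAS setting must be changed to use ADDDC.

    Standalone (non-UCS Manager managed) servers need no change. UCS Manager
    managed servers need a change only when the RAS configuration is set to
    something other than Platform Default or ADDDC Sparing.
    """
    if not ucsm_managed:
        return False
    return ras_policy.strip().lower() not in ADDDC_ENABLED_POLICIES


if __name__ == "__main__":
    print(ras_change_required(True, "Platform Default"))     # managed, default policy
    print(ras_change_required(True, "Maximum Performance"))  # managed, overridden policy
    print(ras_change_required(False))                        # standalone server
```

The function only reports whether a change is needed; the change itself is made in the BIOS policy as described in the list above.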


Release Certification Matrix (RCM) Including Fix for Release 4.1(3b)

  • RCM Release 7.0.3.0 and later for the following systems
    • VxBlock 340, 350, 540, 540-40G, 740 and 1000
    • Tech Extension for Compute
  • RCM Release 6.7.9.0 and later for the following systems
    • Vblock 240, 340, 350, 540 and 740
    • VxBlock 240, 340, 350, 540, 540-40G, 740 and 1000
    • Tech Extension for Compute
  • RCM Release 6.5.16.0 and later for the following systems
    • Vblock 240, 340, 350, 540 and 740
    • VxBlock 240, 340, 350, 540, 740 and 1000

*Note* RCM release numbers may differ per VxBlock system. Reference the RCM version prior to upgrading your VxBlock system to ensure proper code compliance. 

IMPORTANT! Upgrade only if directly affected by the issue mentioned!

For additional details concerning the issue described in this article, see Cisco Field Notice FN - 70432.
Defect ID: CSCvq38078



Affected Products

VxBlock and Vblock Systems Series
Article Properties
Article Number: 000191333
Article Type: How To
Last Modified: 19 November 2025
Version:  3