9 Legend


16.3K Posts

October 14th, 2012 22:00

The SBE (single-bit error) log starts counting every time the server is booted.  Once the count reaches a certain threshold, it stops counting and alerts you to the problem.  You have a bad DIMM (or possibly a bad slot).

Which DIMM(s) is indicated to be the problem?

Once you have isolated the problem slot, swap its DIMM with one from another slot.  If the problem follows the DIMM, then the DIMM needs to be replaced.  If it stays on the same slot, regardless of which DIMM is in it, then the slot is bad.  If for some reason the server is not telling you which DIMM is the problem, then boot up with a SINGLE DIMM at a time to isolate the problem one.
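
The swap test above boils down to a small decision table. Here is a rough sketch of it in Python, purely as an illustration: the function name, its parameters, and the wording of the returned messages are made up for this example and do not come from any Dell tool.

```python
from typing import Optional


def classify_fault(original_slot: str, new_slot: str,
                   error_slot: Optional[str]) -> str:
    """Interpret the result of moving a suspect DIMM to another slot.

    original_slot: slot that reported the error before the swap (e.g. "1A")
    new_slot:      slot the suspect DIMM was moved to (e.g. "3A")
    error_slot:    slot reporting the error after the swap, or None if
                   no error has been reported since
    (All three names are hypothetical, chosen for this sketch.)
    """
    if error_slot is None:
        # Nothing reported after the swap: the reseat may have helped.
        return "no error after swap: keep monitoring"
    if error_slot == new_slot:
        # The error moved along with the DIMM.
        return "error followed the DIMM: replace the DIMM"
    if error_slot == original_slot:
        # The error stayed put even with a different DIMM installed.
        return "error stayed with the slot: the slot is bad"
    # The error shows up somewhere unexpected.
    return "inconclusive: boot with a single DIMM to isolate"


print(classify_fault("1A", "3A", "3A"))
```

The last branch corresponds to the fallback of booting with a single DIMM when the server does not point at a clear culprit.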

Also, have you run Dell's 32-bit Diagnostics (MPMemory)?

3 Posts

October 15th, 2012 00:00

Thanks for the quick reply.

For the first server, we moved the DIMMs from 1A-1B to 3A-3B and ran memtest86+. So far so good.

However, server #2 is different. The problem moved with the DIMMs, which might indicate a bad memory stick. But, we ran memtest86+ twice with both sticks installed, and it shows NO errors.

This makes me wonder if the stick is really bad, since it passed the memtest.

9 Legend


16.3K Posts

October 15th, 2012 09:00

Don't use memtest.  Use Dell's mpmemory diagnostics ... it is the most reliable diagnostic on Dell hardware.  Regardless of the diagnostics results, if you have isolated a particular DIMM as giving the error, no matter which slot it is in, it is bad.  The single-bit errors will eventually become multi-bit errors, which will crash the system and/or corrupt data.  Best to simply replace it now.

3 Posts

October 15th, 2012 12:00

The Dell memory tool says "health critical", but the logs show no errors. I put the DIMM in another server, and that server started to give "Log Disable SBE", which supports your assessment: the DIMM might be weak.

So, the new DIMMs are on their way and should arrive tomorrow.

6 Posts

May 19th, 2013 08:00

There are 2 different types of logs to clear:

Clear the ESM logs / hardware logs:
Windows:  OMSA: Main System Chassis >> Hardware Logs (or the DSET ESM clear option)

OR        at boot, press Ctrl-E (PE gen 9-11 only)


Clear the SBE memory log / error counts on each DIMM (shown as a red X on the DIMM in OMSA, with the DIMM disabled):
(Clearing the ESM and alert logs alone does not re-enable the error count.)

Windows CLI:  dcicfgxx.exe command=clearmemfailures  (OMSA must be installed, and you must be in the ..omsa..bin folder)

OR            boot the 32-bit diags disc, exit to the prompt, and run:  mpmemory -ptech -tlogclr

I'd clear both for this; it can generally be done without a reboot.

If the error comes back, clear the logs again and swap the DIMMs around in the box. You want to see whether re-seating cures it or, if the error does come back, whether it follows the device you swapped (the DIMM) or stays with the connection point (mobo/riser).

For the dcicfg command: all my OMSA installs use the same directory structure and all use the 32-bit version, so they can all use this batch file to automate clearing the SBE log/counter (season the "cd" folder name, and 32 or 64, to taste):

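@echo off
rem Switch to the OMSA bin folder (adjust the path to match your install).
cd /d "c:\program files\Dell\sysmgt\omsa\bin"
rem Clear the SBE memory failure log/counters on the DIMMs.
Dcicfg32.exe command=clearmemfailures
pause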

(same command line in OMSA Live/Linux)
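
If you would rather script this from Python than from a batch file, something along these lines might work. It is only a sketch under assumptions: the helper names are invented for this example, the folder is the one from the batch file above, and the 64-bit executable name (dcicfg64.exe) is inferred from the "dcicfgxx" / "32 or 64" wording rather than verified. Check what is actually in your OMSA bin folder.

```python
import ntpath
import subprocess


def build_clear_command(omsa_bin: str, bits: int = 32) -> list:
    """Build the command line that clears the SBE memory failure counters.

    omsa_bin: path to the OMSA bin folder (varies per install)
    bits:     32 or 64, selecting dcicfg32.exe or dcicfg64.exe
              (the 64-bit name is an assumption, see the note above)
    """
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    # ntpath joins with backslashes regardless of the OS running this.
    exe = ntpath.join(omsa_bin, "dcicfg%d.exe" % bits)
    return [exe, "command=clearmemfailures"]


def run_clear(omsa_bin: str, bits: int = 32) -> None:
    """Run the clear command from inside the OMSA bin folder."""
    subprocess.run(build_clear_command(omsa_bin, bits),
                   cwd=omsa_bin, check=True)


# Show the command without executing anything.
print(build_clear_command(r"c:\program files\Dell\sysmgt\omsa\bin"))
```

Calling run_clear() only makes sense on a Windows server with OMSA installed; build_clear_command() on its own is safe to try anywhere.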
