We have had exactly the same problem with a PE650.
A Dell service tech came and replaced the motherboard and CPU.
And still we get "failsafe timer interval elapsed" and a reboot.
I have run the Dell-specific memory test with all tests turned on overnight and it passed 117 times!!!
I have been selectively removing hardware to try to isolate this problem and found that the system is stable without a network card. Weird, huh.
I reported this to Dell and a new NIC appeared extremely quickly.
Upon replacing the NIC with a new card the problem is back!
OK we have eliminated all hardware!
The server runs fine in safe mode with networking, so it's definitely a service.
Will post later when I find out which one!!
There seems to be some sort of conflict between one of the DSM services and the Windows Time service.
It seems that stopping either Windows Time or the DSM services (4 of them) results in a stable machine with no "failsafe timer interval elapsed" messages.
Will post more details as they come to light.
Greg
This is interesting. Dell never did figure out what our problem was. It kept getting worse until the machine would restart every few minutes. They finally replaced the machine.
I have now uninstalled the DOMSA and restarted the machine; it has been stable for 1.5 hours since the restart.
I first traced the issue by starting the server in safe mode with networking. The machine was stable, so I took note of the services that were running, rebooted the machine, and then pared back the services (in Computer Management) to as close to the safe mode condition as I could.
With a stable but basic system, I restarted a few services at a time with an hour or so in between.
I must have picked the right / wrong ones to start with as it was on the second set that the problems came back.
This made troubleshooting from here really easy.
The first set was the four DSM (Dell OpenManage) services.
The second set included Windows Time. How lucky.
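For anyone wanting to repeat the approach above, here is a rough Python sketch of the bookkeeping only: work out which services run in a normal boot but not in safe mode with networking, then split them into small sets to re-enable one set at a time. The function names and the service names in the example are made up for illustration; in practice you would paste in the lists you noted down from the Services console. It does not start or stop anything itself.

```python
from typing import Iterable, List


def candidate_services(normal_boot: Iterable[str], safe_mode: Iterable[str]) -> List[str]:
    """Return services seen in a normal boot but not in safe mode.

    Comparison is case-insensitive; the original order and spelling
    from the normal-boot list are preserved, duplicates are dropped.
    """
    safe = {name.lower() for name in safe_mode}
    seen = set()
    result = []
    for name in normal_boot:
        key = name.lower()
        if key not in safe and key not in seen:
            seen.add(key)
            result.append(name)
    return result


def batches(services: List[str], size: int) -> List[List[str]]:
    """Split the candidates into sets of at most `size` services each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [services[i:i + size] for i in range(0, len(services), size)]


if __name__ == "__main__":
    # Hypothetical lists, noted down by hand from the two boot modes.
    normal = ["Netlogon", "Example Service A", "Example Service B", "Example Service C"]
    safe = ["netlogon"]
    for number, group in enumerate(batches(candidate_services(normal, safe), 2), start=1):
        # Re-enable one set, wait an hour or so, then watch for the reboot.
        print(f"Set {number}: {', '.join(group)}")
```

If the reboots come back after a given set, the culprit is in that set (or is conflicting with something in an earlier one, as turned out to be the case here), and you can split that set further.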
gr3g0s
August 28th, 2006 04:00
gr3g0s
August 29th, 2006 00:00
Ax-man
August 29th, 2006 01:00
gr3g0s
August 29th, 2006 02:00