LooJacob
1 Copper

Help: PowerEdge R420 crashed twice in a row! The cause turned out to be...

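The server went down once on the morning of the 21st and again on the morning of the 22nd. The log has a 40-minute gap around the first failure and a gap of 1 hour 10 minutes around the second.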

Does "Processor 1 below trip temperature. Throttling disabled" indicate a hardware fault? Any help is appreciated!

Excerpt of the messages in /var/log/messages:

Apr 21 07:40:01 TYGA-GIS /usr/sbin/cron[7877]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 21 07:46:14 TYGA-GIS syslog-ng[1873]: Log statistics; dropped='pipe(/dev/xconsole)=0', dropped='pipe(/dev/tty10)=0', processed='center(queued)=25164', processed='center(received)=16975', processed='destination(messages)=16965', processed='destination(mailinfo)=10', processed='destination(mailwarn)=0', processed='destination(localmessages)=0', processed='destination(newserr)=0', processed='destination(mailerr)=0', processed='destination(netmgm)=0', processed='destination(warn)=8135', processed='destination(console)=22', processed='destination(null)=0', processed='destination(mail)=10', processed='destination(xconsole)=22', processed='destination(firewall)=0', processed='destination(acpid)=0', processed='destination(newscrit)=0', processed='destination(newsnotice)=0', processed='source(src)=16975'
Apr 21 07:50:01 TYGA-GIS /usr/sbin/cron[7954]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 21 08:00:01 TYGA-GIS /usr/sbin/cron[8013]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 21 08:10:01 TYGA-GIS /usr/sbin/cron[8076]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 21 08:20:01 TYGA-GIS /usr/sbin/cron[8166]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 21 08:30:01 TYGA-GIS /usr/sbin/cron[8235]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 21 08:40:01 TYGA-GIS /usr/sbin/cron[8304]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 21 08:46:14 TYGA-GIS syslog-ng[1873]: Log statistics; dropped='pipe(/dev/xconsole)=0', dropped='pipe(/dev/tty10)=0', processed='center(queued)=25171', processed='center(received)=16982', processed='destination(messages)=16972', processed='destination(mailinfo)=10', processed='destination(mailwarn)=0', processed='destination(localmessages)=0', processed='destination(newserr)=0', processed='destination(mailerr)=0', processed='destination(netmgm)=0', processed='destination(warn)=8135', processed='destination(console)=22', processed='destination(null)=0', processed='destination(mail)=10', processed='destination(xconsole)=22', processed='destination(firewall)=0', processed='destination(acpid)=0', processed='destination(newscrit)=0', processed='destination(newsnotice)=0', processed='source(src)=16982'
Apr 21 08:50:01 TYGA-GIS /usr/sbin/cron[8395]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 21 09:21:50 TYGA-GIS syslog-ng[1924]: syslog-ng starting up; version='2.0.9'
Apr 21 09:21:50 TYGA-GIS firmware.sh[1970]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 21 09:21:50 TYGA-GIS firmware.sh[1976]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 21 09:21:50 TYGA-GIS firmware.sh[1988]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 21 09:21:50 TYGA-GIS firmware.sh[2001]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 21 09:21:50 TYGA-GIS firmware.sh[2014]: Cannot find firmware file 'intel-ucode/06-2d-07'

----------------------------------------------------


Apr 22 07:57:41 TYGA-GIS mcelog: Processor 15 below trip temperature. Throttling disabled
Apr 22 07:57:41 TYGA-GIS mcelog: Processor 5 below trip temperature. Throttling disabled
Apr 22 07:57:41 TYGA-GIS mcelog: Processor 17 below trip temperature. Throttling disabled
Apr 22 07:57:41 TYGA-GIS mcelog: Processor 7 below trip temperature. Throttling disabled
Apr 22 07:57:41 TYGA-GIS mcelog: Processor 19 below trip temperature. Throttling disabled
Apr 22 07:57:41 TYGA-GIS mcelog: Processor 9 below trip temperature. Throttling disabled
Apr 22 08:00:01 TYGA-GIS /usr/sbin/cron[15636]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 22 08:10:01 TYGA-GIS /usr/sbin/cron[15705]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 22 08:20:01 TYGA-GIS /usr/sbin/cron[15776]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
Apr 22 08:21:52 TYGA-GIS syslog-ng[1924]: Log statistics; dropped='pipe(/dev/xconsole)=0', dropped='pipe(/dev/tty10)=0', processed='center(queued)=689', processed='center(received)=497', processed='destination(messages)=491', processed='destination(mailinfo)=2', processed='destination(mailwarn)=0', processed='destination(localmessages)=4', processed='destination(newserr)=0', processed='destination(mailerr)=0', processed='destination(netmgm)=0', processed='destination(warn)=172', processed='destination(console)=7', processed='destination(null)=2', processed='destination(mail)=2', processed='destination(xconsole)=7', processed='destination(firewall)=0', processed='destination(acpid)=2', processed='destination(newscrit)=0', processed='destination(newsnotice)=0', processed='source(src)=497'
Apr 22 10:35:38 TYGA-GIS syslog-ng[1908]: syslog-ng starting up; version='2.0.9'
Apr 22 10:35:38 TYGA-GIS firmware.sh[1954]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 22 10:35:38 TYGA-GIS firmware.sh[1960]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 22 10:35:38 TYGA-GIS firmware.sh[1972]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 22 10:35:38 TYGA-GIS firmware.sh[1985]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 22 10:35:38 TYGA-GIS firmware.sh[1998]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 22 10:35:38 TYGA-GIS firmware.sh[2011]: Cannot find firmware file 'intel-ucode/06-2d-07'
Apr 22 10:35:38 TYGA-GIS firmware.sh[2024]: Cannot find firmware file 'intel-ucode/06-2d-07'
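
One way to measure the silent periods in an excerpt like the one above is to compare consecutive timestamps. The following is only a minimal sketch: the function name `find_gaps` and the 15-minute threshold are illustrative choices (cron logs a line every 10 minutes here), the year 2017 is an assumption because classic syslog timestamps omit it, and the sample message bodies are shortened to "(...)". It also assumes an English/C locale for month names.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2017, threshold=timedelta(minutes=15)):
    """Return (previous, current, gap) for consecutive entries further apart than threshold."""
    stamps = []
    for line in lines:
        try:
            # The first 15 characters look like "Apr 21 08:50:01"; the year is assumed.
            stamps.append(datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"))
        except ValueError:
            continue  # skip separators and lines without a leading timestamp
    return [(a, b, b - a) for a, b in zip(stamps, stamps[1:]) if b - a > threshold]


# Three lines taken from the excerpt, with the cron command text shortened.
sample = [
    "Apr 21 08:40:01 TYGA-GIS /usr/sbin/cron[8304]: (root) CMD (...)",
    "Apr 21 08:50:01 TYGA-GIS /usr/sbin/cron[8395]: (root) CMD (...)",
    "Apr 21 09:21:50 TYGA-GIS syslog-ng[1924]: syslog-ng starting up; version='2.0.9'",
]

gaps = find_gaps(sample)
for prev, cur, gap in gaps:
    print(f"no entries between {prev} and {cur} ({gap})")
```

Running it over the full /var/log/messages file (for example with `open("/var/log/messages")` as the `lines` argument) would list every silent period longer than the threshold, which helps pin down when the machine actually stopped logging.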

mcelog frequently reports the following error:

-----------

MCE 23
CPU 1 THERMAL EVENT TSC 2f117c16fa04a
TIME 1493264184 Thu Apr 27 11:36:24 2017
Processor 1 below trip temperature. Throttling disabled
STATUS c000000088380c00 MCGSTATUS 0
MCGCAP 1000c12 APICID 20 SOCKETID 1
CPUID Vendor Intel Family 6 Model 45
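
To make records like this easier to compare, the fields can be pulled out with a few regular expressions. This is a rough sketch, not an mcelog API: `parse_record` and the field names are made up for illustration, and the UTC+8 offset is an assumption chosen because it reproduces the local time printed in the record. The last step formats family 6 and model 45 as hexadecimal, which gives the `06-2d` prefix seen in the 'intel-ucode/06-2d-07' boot messages.

```python
import re
from datetime import datetime, timedelta, timezone

RECORD = """\
MCE 23
CPU 1 THERMAL EVENT TSC 2f117c16fa04a
TIME 1493264184 Thu Apr 27 11:36:24 2017
Processor 1 below trip temperature. Throttling disabled
STATUS c000000088380c00 MCGSTATUS 0
MCGCAP 1000c12 APICID 20 SOCKETID 1
CPUID Vendor Intel Family 6 Model 45
"""


def parse_record(text):
    """Extract a few decimal fields from one mcelog record (hypothetical helper)."""
    patterns = {
        "cpu": r"^CPU (\d+)",      # "CPU 1", not the "CPUID ..." line
        "epoch": r"^TIME (\d+)",   # seconds since the Unix epoch
        "socket": r"SOCKETID (\d+)",
        "family": r"Family (\d+)",
        "model": r"Model (\d+)",
    }
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, re.MULTILINE)
        if match:
            fields[name] = int(match.group(1))
    fields["thermal"] = "THERMAL EVENT" in text
    return fields


rec = parse_record(RECORD)

# Assumption: the server clock is UTC+8, which matches the printed local time.
when = datetime.fromtimestamp(rec["epoch"], tz=timezone(timedelta(hours=8)))

# Family and model in two-digit hex, as used in the microcode file name.
ucode_prefix = f"{rec['family']:02x}-{rec['model']:02x}"

print(rec)
print(when.strftime("%Y-%m-%d %H:%M:%S"), ucode_prefix)
```

Collecting the `cpu` and `socket` values across many records would show whether the thermal events are spread over both sockets or concentrated on one.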

6 replies
Community Administrator

RE: Help: PowerEdge R420 crashed twice in a row! The cause turned out to be...

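Is the hardware reporting any errors?

This kind of message can be triggered by the system kernel, by memory, or by certain PCIe devices, so the first step is to rule out a hardware problem: check what errors the hardware itself reports.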

Dell EMC | Global Support & Deployment

LooJacob
1 Copper

RE: Help: PowerEdge R420 crashed twice in a row! The cause turned out to be...

Where can I see those errors? In a log? If so, what is the path?

Or are they shown on the server's front panel?

Community Administrator

RE: Help: PowerEdge R420 crashed twice in a row! The cause turned out to be...

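When the hardware has a fault, the front-panel LED usually shows an error code. You can also collect the logs using the method below and post them in a reply so we can take a look.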

zh.community.dell.com/.../1084.13gtsr

Dell EMC | Global Support & Deployment

LooJacob
1 Copper

RE: Help: PowerEdge R420 crashed twice in a row! The cause turned out to be...

Here is the latest mcelog output. Hardware faults should show up in it, right?

Link: pan.baidu.com/.../1bo3as2N Password: as86

LooJacob
1 Copper

RE: Help: PowerEdge R420 crashed twice in a row! The cause turned out to be...

OK, I'll collect the logs following the wiki.

Community Administrator

RE: Help: PowerEdge R420 crashed twice in a row! The cause turned out to be...

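Network restrictions on our side prevent me from accessing external file-sharing sites. You can upload the logs by dragging them directly into the input box when you reply.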

Dell EMC | Global Support & Deployment
