5 Practitioner

 • 

274.2K Posts

July 9th, 2012 21:00

"power_saving" in centos 6.2 deployed on the PowerEdge R620

Hi, everyone

I have 68 R620 servers, each with two E5-2650 CPUs and 32 GB of memory.

I have deployed CentOS 6.2 on the servers. The BIOS settings are unchanged, except that I have disabled the logical processors.

I am seeing a very strange problem.

After some of the servers boot, their CPUs are occupied by the "power_saving" and "watchdog" processes, as follows:

top - 09:56:54 up 10:52,  3 users,  load average: 33.80, 33.71, 31.90

Tasks: 339 total,  40 running, 299 sleeping,   0 stopped,   0 zombie

Cpu(s):  0.0%us, 78.7%sy,  0.0%ni, 21.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

Mem:  32931132k total,   980388k used, 31950744k free,    72724k buffers

Swap: 67108856k total,        0k used, 67108856k free,   268520k cached

 

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                              

 6951 root      -2   0     0    0    0 R 96.7  0.0 210:54.71 power_saving/4                                                                                                                       

   50 root      RT   0     0    0    0 S 75.7  0.0 162:55.42 watchdog/11                                                                                                                           

   14 root      RT   0     0    0    0 S 61.1  0.0 168:48.67 watchdog/2                                                                                                                            

 6960 root      -2   0     0    0    0 R 25.6  0.0 262:19.47 power_saving/13                                                                                                                      

 6958 root      -2   0     0    0    0 R 21.3  0.0 241:30.96 power_saving/11                                                                                                                      

 6957 root      -2   0     0    0    0 R 18.9  0.0 237:24.04 power_saving/10                                                                                                                       

 6956 root      -2   0     0    0    0 R 16.9  0.0 236:43.27 power_saving/9                                                                                                                        

 6953 root      -2   0     0    0    0 R 10.6  0.0 218:47.43 power_saving/6                                                                                                                       

 6952 root      -2   0     0    0    0 R  8.6  0.0 246:29.83 power_saving/5                                                                                                                       

 6948 root      -2   0     0    0    0 R  6.0  0.0 199:38.34 power_saving/1                                                                                                                        

 6949 root      -2   0     0    0    0 R  2.3  0.0 204:18.63 power_saving/2                                                                                                                        

20073 nobody    20   0  154m 4792 2208 S  0.3  0.0   0:00.97 gmond                                                                                                                                

20666 root      20   0 15148 1428  940 R  0.3  0.0   0:00.05 top                                                                                                                                  

    1 root      20   0 19324 1620 1292 S  0.0  0.0   0:02.00 init                                                                                                                                  


                            

[root@compute-2-4 ~]# ps -aux |grep power_saving

Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ

root      6947 37.2  0.0      0     0 ?        R    01:13 194:38 [power_saving/0]

root      6948 38.1  0.0      0     0 ?        R    01:13 199:04 [power_saving/1]

root      6949 38.9  0.0      0     0 ?        R    01:13 203:29 [power_saving/2]

root      6950 39.5  0.0      0     0 ?        R    01:13 206:30 [power_saving/3]

root      6951 40.2  0.0      0     0 ?        R    01:13 210:06 [power_saving/4]

root      6952 47.0  0.0      0     0 ?        R    01:13 245:45 [power_saving/5]

root      6953 41.8  0.0      0     0 ?        R    01:13 218:09 [power_saving/6]

root      6954 43.3  0.0      0     0 ?        R    01:13 226:22 [power_saving/7]

root      6955 43.2  0.0      0     0 ?        R    01:13 225:44 [power_saving/8]

root      6956 45.1  0.0      0     0 ?        R    01:13 235:47 [power_saving/9]

root      6957 45.3  0.0      0     0 ?        R    01:13 236:42 [power_saving/10]

root      6958 46.0  0.0      0     0 ?        D    01:13 240:34 [power_saving/11]

root      6959 47.7  0.0      0     0 ?        D    01:13 249:16 [power_saving/12]

root      6960 50.1  0.0      0     0 ?        R    01:13 261:28 [power_saving/13]

root      6961 62.4  0.0      0     0 ?        D    01:13 325:53 [power_saving/14]

root      6962 36.3  0.0      0     0 ?        R    01:13 189:29 [power_saving/15]

 

[root@compute-2-4 ~]# ps -aux |grep watchdog   

Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ

root         6 26.5  0.0      0     0 ?        R    Jul09 173:17 [watchdog/0]

root        10 26.1  0.0      0     0 ?        R    Jul09 170:29 [watchdog/1]

root        14 25.8  0.0      0     0 ?        R    Jul09 168:28 [watchdog/2]

root        18 25.5  0.0      0     0 ?        R    Jul09 166:28 [watchdog/3]

root        22 25.6  0.0      0     0 ?        S    Jul09 166:57 [watchdog/4]

root        26 25.2  0.0      0     0 ?        R    Jul09 164:44 [watchdog/5]

root        30 27.9  0.0      0     0 ?        R    Jul09 182:05 [watchdog/6]

root        34 24.6  0.0      0     0 ?        R    Jul09 160:52 [watchdog/7]

root        38 24.7  0.0      0     0 ?        S    Jul09 161:08 [watchdog/8]

root        42 24.6  0.0      0     0 ?        S    Jul09 160:51 [watchdog/9]

root        46 24.7  0.0      0     0 ?        R    Jul09 161:31 [watchdog/10]

root        50 24.9  0.0      0     0 ?        R    Jul09 162:37 [watchdog/11]

root        54 24.4  0.0      0     0 ?        R    Jul09 159:23 [watchdog/12]

root        58 24.5  0.0      0     0 ?        R    Jul09 159:59 [watchdog/13]

root        62 24.1  0.0      0     0 ?        R    Jul09 157:21 [watchdog/14]

root        66 17.0  0.0      0     0 ?        S    Jul09 111:22 [watchdog/15]

I can find power_saving on the Linux system. After I reboot, it is OK for a while, and then these processes appear again at random on all the servers.

Could you please help me figure out what the problem is?

thank you very much!

July 10th, 2012 07:00

The CentOS 6.2 kernel is 2.6.32-220. It looks like you might be encountering the issue that is documented here: bugzilla.kernel.org/show_bug.cgi which affects, at least, kernel 2.6.32-131 (in 6.1) and newer. The fix is committed for 3.5-rc2 and it may take some time to be backported into RHEL and reach CentOS. If I hear anything new about the progress of this I'll post here.

1 Message

July 13th, 2012 09:00

Is there a list of affected Dell servers (12g line only?) and Operating systems?

Thank you

July 13th, 2012 16:00

From the notes on that bug report (provided this is what you are actually seeing!), this appears to be an issue with the Linux ACPI 4.0 support when it idles multiple cores (more than 2) in a short time. From the bug conversation, it seems to me this can be observed on any system running an affected kernel with many cores and ACPI 4.0.

5 Practitioner

 • 

274.2K Posts

July 14th, 2012 01:00

On some 12G servers I have installed CentOS 5.7, and everything is OK.

5 Practitioner

 • 

274.2K Posts

July 14th, 2012 01:00

Maybe I can try stopping the acpid service?

5 Practitioner

 • 

274.2K Posts

July 14th, 2012 02:00

Do you think it would work if I rmmod acpi_pad?

July 17th, 2012 12:00

If you do not require ACPI support those may be reasonable workarounds. Does it run ok for several days like that?

5 Practitioner

 • 

274.2K Posts

July 21st, 2012 20:00

It does not work

1 Message

August 22nd, 2012 13:00

Some other options to try:

1) Disable Logical Processor in the BIOS

2) Unload the acpi_pad module by running "rmmod acpi_pad"

3) Use a kernel parameter at boot time to prevent the acpi_pad module from loading on server boot. Append "acpi_pad.disable=1" to the kernel line
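
For option 2, a minimal sketch of how one might check whether acpi_pad is actually loaded before trying to unload it. The function names (module_loaded, unload_acpi_pad) and the optional file argument are made up for illustration; the only real pieces are /proc/modules and rmmod. Nothing runs until you call unload_acpi_pad yourself, as root.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any package.

module_loaded() {
    # $1 = module name, $2 = module list to read (defaults to /proc/modules).
    # /proc/modules has one module per line with the name in the first field.
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

unload_acpi_pad() {
    # Unload the module only if it is present; rmmod needs root.
    if module_loaded acpi_pad; then
        rmmod acpi_pad
    else
        echo "acpi_pad is not loaded"
    fi
}
```

Checking first makes it easier to tell "the module was never loaded" apart from "rmmod failed", which matters when a workaround is reported as not working.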

1 Message

September 25th, 2012 09:00

I'm having the same problem running CentOS 6.3 on Dell R720s. I've got six of them and they will randomly enter this state. My 'ps' output is pasted below. This happens sometimes immediately after reboot. I don't think the kernel bug mentioned by Jonathan is the problem, as I followed the steps to reproduce the problem given by Len Brown in comment #4 on bugzilla.kernel.org/show_bug.cgi and it didn't result in the same condition. It also sounds like the workaround didn't work for chufall. Any other suggestions for workarounds?

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND

root      7407 43.7  0.0      0     0 ?        R    Sep24 570:19  \_ [power_saving/14]

root      7406 43.1  0.0      0     0 ?        R    Sep24 562:42  \_ [power_saving/13]

root      7404 42.7  0.0      0     0 ?        D    Sep24 556:54  \_ [power_saving/11]

root      7405 42.6  0.0      0     0 ?        D    Sep24 555:22  \_ [power_saving/12]

root      7403 41.4  0.0      0     0 ?        R    Sep24 539:50  \_ [power_saving/10]

root      7402 41.3  0.0      0     0 ?        R    Sep24 539:06  \_ [power_saving/9]

root      7401 40.6  0.0      0     0 ?        D    Sep24 529:13  \_ [power_saving/8]

root      7400 39.8  0.0      0     0 ?        R    Sep24 518:56  \_ [power_saving/7]

root      7399 39.2  0.0      0     0 ?        D    Sep24 510:50  \_ [power_saving/6]

root      7398 38.8  0.0      0     0 ?        D    Sep24 505:43  \_ [power_saving/5]

root      7397 38.0  0.0      0     0 ?        D    Sep24 495:58  \_ [power_saving/4]

root      7396 37.0  0.0      0     0 ?        R    Sep24 482:34  \_ [power_saving/3]

root      7395 36.2  0.0      0     0 ?        D    Sep24 472:25  \_ [power_saving/2]

root      7394 36.1  0.0      0     0 ?        R    Sep24 471:21  \_ [power_saving/1]

root      7393 35.0  0.0      0     0 ?        R    Sep24 456:08  \_ [power_saving/0]

root         6 33.6  0.0      0     0 ?        S    Sep24 447:03  \_ [watchdog/0]

root      7408 33.3  0.0      0     0 ?        R    Sep24 434:51  \_ [power_saving/15]

root        10 33.2  0.0      0     0 ?        R    Sep24 441:59  \_ [watchdog/1]

root        18 32.8  0.0      0     0 ?        S    Sep24 436:22  \_ [watchdog/3]

root        14 32.7  0.0      0     0 ?        R    Sep24 435:59  \_ [watchdog/2]

root        66 32.4  0.0      0     0 ?        S    Sep24 431:23  \_ [watchdog/15]

root        62 32.3  0.0      0     0 ?        R    Sep24 430:35  \_ [watchdog/14]

root        22 32.3  0.0      0     0 ?        R    Sep24 430:22  \_ [watchdog/4]

root        26 32.2  0.0      0     0 ?        S    Sep24 428:53  \_ [watchdog/5]

root        30 32.0  0.0      0     0 ?        S    Sep24 426:16  \_ [watchdog/6]

root        58 31.9  0.0      0     0 ?        S    Sep24 424:31  \_ [watchdog/13]

root        50 31.9  0.0      0     0 ?        S    Sep24 424:33  \_ [watchdog/11]

root        38 31.9  0.0      0     0 ?        S    Sep24 425:21  \_ [watchdog/8]

root        34 31.8  0.0      0     0 ?        S    Sep24 423:39  \_ [watchdog/7]

root        54 31.7  0.0      0     0 ?        S    Sep24 422:49  \_ [watchdog/12]

root        46 31.7  0.0      0     0 ?        R    Sep24 422:15  \_ [watchdog/10]

root        42 31.7  0.0      0     0 ?        S    Sep24 422:10  \_ [watchdog/9]

1 Message

November 6th, 2012 13:00

Are you sure you've disabled the acpi_pad module? Garima's workarounds seem to be the right ones.

I noticed this problem on one of my own Dell R720s running RHEL6u3, and it was a clue to me that hyperthreading was turned on, which I had intended to disable. Once I disabled it the problem went away.

But if you don't want to disable virtual cores, then you can unload the acpi_pad module with "rmmod acpi_pad". Once that module came out, the wedged power_saving threads disappeared. If you add "acpi_pad.disable=1" to your kernel options with grubby, it'll stick through a reboot.
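
A rough sketch of the grubby step, plus a check after reboot that the argument really made it onto the running kernel's command line. The function names and the optional file argument are hypothetical; grubby's --update-kernel and --args options and /proc/cmdline are the real parts. This assumes you want the argument on every installed kernel entry (ALL); adjust if not.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

has_kernel_arg() {
    # $1 = argument to look for, $2 = file to read (defaults to /proc/cmdline).
    # Split on spaces and match the argument as a whole word.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

persist_acpi_pad_disable() {
    # Append the argument to all installed kernel entries; needs root.
    grubby --update-kernel=ALL --args="acpi_pad.disable=1"
}
```

After running persist_acpi_pad_disable and rebooting, `has_kernel_arg acpi_pad.disable=1 && echo present` should print "present" if the setting stuck.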

1 Message

January 29th, 2013 01:00

Has the fix for the bug reached CentOS?

130 Posts

June 8th, 2013 08:00

I'm still seeing this issue with the latest kernel (2.6.32-358.6.2.el6.x86_64) on a T620 system.

Any hints?

130 Posts

June 8th, 2013 08:00

access.redhat.com/.../366273

Resolution:

• Disable C states/Speed Step in system BIOS

• To immediately reduce load on the system without a reboot: modprobe -r acpi_pad

Not a real solution, more of a workaround?
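
Before unloading the module it can help to confirm a box is really in this state. A small sketch that counts power_saving kernel threads above a CPU threshold; the function name and the default threshold of 10% are arbitrary choices for illustration, not anything from the Red Hat article.

```shell
#!/bin/sh
# Sketch only: function name and default threshold are hypothetical.

count_busy_power_saving() {
    # $1 = %CPU threshold (default 10).
    # Reads lines of "<pcpu> <comm>" on stdin and prints how many
    # power_saving/N threads are above the threshold.
    awk -v t="${1:-10}" \
        '$2 ~ /^power_saving\// && $1 + 0 > t + 0 { n++ } END { print n + 0 }'
}

# Example on a live system (headers suppressed with the trailing "="):
#   ps -e -o pcpu= -o comm= | count_busy_power_saving 10
# If the count is non-zero, "modprobe -r acpi_pad" (as root) is the
# immediate workaround quoted above.
```

Note that ps reports %CPU averaged over the thread's lifetime, so a thread that only recently wedged may take a while to cross the threshold.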

2 Posts

January 27th, 2014 08:00

This problem still exists on the newest RHEL 6.5; I just bumped into it. It's a shame Red Hat hasn't backported the fix. Without gimping power management, you can just disable the "Logical Processor Idling" setting in the BIOS. With OpenManage installed, you can just run the following command and reboot:

omconfig chassis biossetup attribute=DynamicCoreAllocation setting=Disabled

You don't have to disable logical processors (i.e. HyperThreading); the above setting is sufficient to avoid the problem.
