October 24th, 2014 21:00

Unreliable suspend / resume with A05 and A06 bios (UPDATED)

I've been chasing a suspend/resume problem for a while. In general, suspending to and resuming from memory works reasonably well. However, I've found that after the system has been idle for a while (screen off, system unused, and plugged in but online), closing the lid and unplugging it, which initiates pm-suspend to memory, fails to suspend the system.

That said, suspend to memory only seems to fail after long periods of idle, and even then it seems somewhat random.

While debugging this, pm-suspend indicates that the suspend-to-memory request takes place; however, at that point (where the BIOS should take over) the system hard locks and requires power cycling.

I've had this problem before on other Dells, and in all cases it eventually required a BIOS fix. Would someone at Dell please investigate this?

36 Posts

December 17th, 2014 07:00

I'm sorry for this thread hijacking, but I have a couple of questions about this new BIOS.


Are the Fn keys that control the screen brightness still delayed? I decompiled the ACPI table and found that a delay of 200ms was intentionally added. This delay is somewhat problematic when you keep the Fn keys pressed, because it's as if the BIOS registers how long you've been pressing the key, guesses the equivalent number of keypresses and then sends one keypress every 200ms. So, if you keep the key pressed and then release it, the BIOS will keep sending events for a while. I find this rather annoying.

Second question. I have noticed that enabling the SATA link power management causes weird keyboard lags and repeated keys. Is this still a thing?


Thanks.

There are still SATA link power issues, which can result in file system corruption. It's best to disable power handling on SATA drives and controllers.

The keyboard lag still appears to be present when using the screen brightness function keys. However, it doesn't bother me as I don't use those keys; instead I use the ambient light sensor (ALS) with the als module (github.com/.../als), which works great with appropriate udev rules and a handler script.

42 Posts

December 18th, 2014 05:00

There are still SATA link power issues, which can result in file system corruption. It's best to disable power handling on SATA drives and controllers.

I haven't had any file system corruption yet. FYI, there are three different policies (min_power, medium_power and max_performance), and so far I've always used min_power when on battery. I noticed some warnings in my dmesg, but I've never looked into them. Maybe medium_power is a good compromise. I just can't explain how SATA link power management (I think it's only the link power management for host2 that causes problems) can have an effect on the keyboard.
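For anyone who wants to see which of the three policies each host is currently using, here is a minimal sketch. The function name show_link_policies and its optional root argument are my own assumptions for illustration; only the sysfs attribute name comes from this thread.

```shell
#!/bin/sh
# Print the current SATA link power management policy of every SCSI host.
# $1 is an optional sysfs root (hypothetical parameter, defaults to /sys)
# so the function can be tried against a scratch directory.
show_link_policies() {
    root="${1:-/sys}"
    for f in "$root"/class/scsi_host/host*/link_power_management_policy; do
        # Skip the unexpanded glob when no host exposes the attribute
        [ -r "$f" ] || continue
        host=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$host" "$(cat "$f")"
    done
}
```

Calling show_link_policies with no argument reads the live values and prints one "hostN: policy" line per host.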

The keyboard lag still appears to be present when using the screen brightness function keys. However, it doesn't bother me as I don't use those keys; instead I use the ambient light sensor (ALS) with the als module (github.com/.../als), which works great with appropriate udev rules and a handler script.

To be honest, it's mostly a pet peeve. It bothers me that it works like this and that somehow it works almost perfectly on Windows.


Anyway, thanks.

42 Posts

December 27th, 2014 04:00

A05 was skipped altogether and A06 was released instead. So far, so good (but I haven't used it that much).

I thought you might be interested given the problems you have with A05.

36 Posts

December 31st, 2014 09:00

After updating to the A06 BIOS, I can confirm that this issue is still present. It did take a while longer to trigger (about 4 suspend / resume cycles), but the system still fails to suspend reliably.

42 Posts

January 7th, 2015 04:00

The "[ahci] : port does not support device sleep" message in the console has always been there, and it's just a warning message.
This message is printed when you change the power management policy of SATA links and if I'm not wrong it happens with host2, which by the way is what causes the keyboard lags. Changing the policy also causes other errors that are more worrying than that message.

Anyway, I forgot about this, but I have been using the following script for quite a while now. Maybe that's the reason why I haven't had any problems.

$ cat /lib/systemd/system-sleep/xps13.sh

#!/bin/sh
# systemd runs this hook with $1 = pre|post and $2 = the sleep type.

# max_performance|medium_power|min_power
SATA_POLICY_BATTERY=min_power

# Default: max_performance (before sleeping, and after resume on AC)
policy=max_performance
if [ "$(cat /sys/class/power_supply/ADP0/online)" = 0 ] && [ "$1" = "post" ]; then
    # Assume min_power on battery
    policy=$SATA_POLICY_BATTERY
fi
case "$2" in
"suspend"|"hibernate"|"pre-hibernate")
    for i in /sys/class/scsi_host/host*/link_power_management_policy; do
        echo "$policy" > "$i"
    done
    ;;
esac

EDIT:
To make my post more clear: the above script changes the SATA power management policy to max_performance on suspend/hibernate and sets $SATA_POLICY_BATTERY (I set it to min_power) on resume if on battery. systemd is required.

13 Posts

January 7th, 2015 04:00

Hi,

This bug is back for me. It stopped when I reinstalled my computer in legacy mode, so I thought that was the cause. But then it reappeared after some time, even after upgrading to the A06 BIOS, as you mentioned.

I started to believe that I had installed or configured something that caused it. I was also getting messages like "[ahci] : port does not support device sleep" in the console the same day that the bug reappeared.

I've run a few experiments, managed to "activate and deactivate" this bug, and have finally had reliable suspends for the past week. Here is how:

Disable systemd handling of suspend in /etc/systemd/logind.conf:

[Login]
#NAutoVTs=6
#ReserveVT=6
#KillUserProcesses=no
#KillOnlyUsers=
#KillExcludeUsers=root
#InhibitDelayMaxSec=5
HandlePowerKey=ignore
HandleSuspendKey=ignore
#HandleHibernateKey=hibernate
HandleLidSwitch=ignore
#PowerKeyIgnoreInhibited=no
#SuspendKeyIgnoreInhibited=no
#HibernateKeyIgnoreInhibited=no
#LidSwitchIgnoreInhibited=yes
#IdleAction=suspend
#IdleActionSec=10min
#RuntimeDirectorySize=10%
#RemoveIPC=yes
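Since most of that file is commented-out defaults, it can help to list only the Handle* lines that are really in effect. This is a rough sketch; the function name active_handlers is made up for illustration, and the default path is the one mentioned above.

```shell
#!/bin/sh
# List the Handle* settings that are active (not commented out) in a
# logind.conf-style file. $1 is the file to inspect; it defaults to the
# path used in this post.
active_handlers() {
    conf="${1:-/etc/systemd/logind.conf}"
    grep -E '^Handle[A-Za-z]+=' "$conf"
}
```

With the file above, this should list only the three "ignore" lines. The changes only take effect once systemd-logind re-reads its configuration, for example after a reboot.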

I installed acpid to handle this, with the following /etc/acpi/lid:

#!/bin/sh
case "$3" in
   close)
      /etc/pm/power.d/10-power_script false
      pm-suspend;;
   open)
      /etc/pm/run_power_script;;
esac

(and of course in /etc/acpi/events/lid:

event=button/lid
action=/etc/acpi/lid %e

)
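To show why the lid script looks at "$3": acpid substitutes the event text for %e, and the shell splits it into words. The sketch below assumes the event text has the form "button/lid LID close", which may differ on other machines (acpi_listen shows the real text); the function name lid_state is hypothetical.

```shell
#!/bin/sh
# Return the lid state ("close" or "open") from an acpid event string.
# The sample event text "button/lid LID close" is an assumption about what
# acpid reports on this machine.
lid_state() {
    # Unquoted on purpose: split the event text into words, as the shell
    # does when acpid substitutes %e into the action line.
    set -- $1
    printf '%s\n' "$3"
}
```

If the third word on your system is not close/open, the case statement in /etc/acpi/lid needs to match whatever acpi_listen reports instead.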

 

 

Where the scripts in /etc/pm are slightly customized versions of those given here:

http://xps13-9333.appspot.com/root/etc/pm/sleep.d/10-run_power_script

http://xps13-9333.appspot.com/

Basically, calling /etc/pm/power.d/10-power_script with false disables all power savings on all devices. I have never seen it fail despite trying more than 20 times in a day, including with idle time in between.

Since then, I've been able to reproduce the problem again, but only deliberately, by setting the following powertop switches to good (and all others to bad):


Good Wireless Power Saving for interface wlan0
Good Bluetooth device interface status
Good NMI watchdog should be turned off
Good Autosuspend for USB device EHCI Host Controller [usb3]
Good Autosuspend for USB device xHCI Host Controller [usb1]
Good Autosuspend for USB device xHCI Host Controller [usb2]
Good Autosuspend for unknown USB device 3-1 (8087:8000)
Good Autosuspend for USB device Integrated_Webcam_HD [CN0Y2TKG7248745HA0VSA00]
Good Autosuspend for unknown USB device 1-1 (0424:2514)
Good Wake-on-lan status for device wlan0
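For the USB autosuspend entries in that list, the same toggle can be done without powertop by writing to each device's power/control attribute ("auto" corresponds to Good, "on" to Bad). This is only a sketch under that assumption; the function name and the optional root argument are hypothetical, and it changes every USB device rather than the specific ones listed.

```shell
#!/bin/sh
# Set runtime power management for every USB device under a sysfs root.
# $1: "auto" (autosuspend allowed, powertop "Good") or "on" (always powered).
# $2: optional sysfs root (hypothetical parameter, defaults to /sys).
set_usb_power_control() {
    mode="$1"
    root="${2:-/sys}"
    case "$mode" in
        auto|on) ;;
        *) echo "usage: set_usb_power_control auto|on [root]" >&2; return 1 ;;
    esac
    for f in "$root"/bus/usb/devices/*/power/control; do
        # Skip the unexpanded glob or attributes we cannot write to
        [ -w "$f" ] || continue
        echo "$mode" > "$f"
    done
}
```

Run as root, set_usb_power_control on should be roughly what the "false" power script does for USB, and set_usb_power_control auto should bring back the state in which I could reproduce the hang.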

I now believe that the fact that it appears only after some idle time also confirms a power saving problem. Switching smartconnect and rst on and off did not change anything for me.

Now I'm also not sure how switching back to legacy mode changed things.

Can you guys test this and confirm?

42 Posts

January 7th, 2015 13:00

I was playing with the link power management and somehow managed to replicate your problem.

I'm looking into the various problems and for now I'm trying to use medium_power when using AC and min_power when on battery.

Using min_power saves roughly 2W in idle, which is a lot, and I found that transitions between min_power and medium_power are not causing problems. I haven't tested the performance difference between medium_power and max_performance, but there's definitely a difference between medium_power and min_power.

Transitions from min_power or medium_power to max_performance cause some errors, and the kernel progressively lowers the max speed of the bus from 6.0 Gbps to 3.0 Gbps and finally to 1.5 Gbps as a safety measure. I suspect that those errors are playing a role in these failed suspensions.
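One way to check whether the kernel has already throttled the link is to filter the kernel log for the speed-limit message. The sketch below assumes the message contains the phrase "limiting SATA link speed"; the function name is hypothetical.

```shell
#!/bin/sh
# Print kernel log lines where libata reduced the SATA link speed.
# Reads log text on stdin, e.g.:  dmesg | link_speed_downgrades
# The matched phrase is an assumption about the message wording.
link_speed_downgrades() {
    # "|| true" keeps the exit status at 0 when nothing matches
    grep -i 'limiting SATA link speed' || true
}
```

No output means no downgrade was logged since boot, at least with that wording.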

Basically:

# If AC is present
echo medium_power > /sys/class/scsi_host/host2/link_power_management_policy
# If on battery
echo min_power > /sys/class/scsi_host/host2/link_power_management_policy

EDIT:
Be aware that pm-utils might override your preferences. For instance I have /usr/lib/pm-utils/power.d/sata_alpm that is overriding the policy.
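The two echo lines above can be folded into one helper that picks the policy from the AC adapter state. This is a sketch, not something I run as-is: the ADP0 and host2 paths are the ones used in this thread, while the function names and the optional root argument are hypothetical.

```shell
#!/bin/sh
# Map the AC adapter state to a SATA link policy, following the scheme above:
# medium_power on AC, min_power on battery.
# $1: contents of the adapter's "online" attribute (1 = AC present).
policy_for_ac_state() {
    if [ "$1" = 1 ]; then
        echo medium_power
    else
        echo min_power
    fi
}

# Apply the policy to host2. Both paths are taken from this thread; the
# optional $1 sysfs root is a hypothetical parameter for dry runs.
apply_sata_policy() {
    root="${1:-/sys}"
    online=$(cat "$root/class/power_supply/ADP0/online")
    policy_for_ac_state "$online" \
        > "$root/class/scsi_host/host2/link_power_management_policy"
}
```

Running apply_sata_policy as root after plugging or unplugging the adapter avoids ever selecting max_performance, which is the transition that produced the errors for me.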

36 Posts

January 8th, 2015 14:00

So I tried this and am still seeing the same lock up when everything in the suspend script finishes and it's handed off to the BIOS.

The BIOS with A06 is still very broken.

36 Posts

January 11th, 2015 13:00

Over the weekend I played around with this a bit more. It seems the issues are directly related to power state. If you adjust the power profile (in kernel) in any way whatsoever, the system fails to shut down.


This includes dropping to a more efficient power profile while on battery exclusively.

I've opened a service request with Dell (905781411). If you guys could also get in touch with Dell support, so they are aware that it's more than just me and my special configuration, that would be extremely useful. ;|

42 Posts

February 2nd, 2015 04:00

Since I got the WiFi card and screen replaced (and then the motherboard after that, because somehow these components made the coil whine really loud), it has happened 5 times already. I didn't have problems before.

Has anyone experienced the same on Windows? I basically don't use it, but I think it would be helpful to know.

42 Posts

March 5th, 2015 15:00

cfaber and ppeemm, could you try to blacklist mei and mei_me?

bugzilla.kernel.org/show_bug.cgi
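In case it helps, blacklisting is normally done with "blacklist" lines in a file under /etc/modprobe.d. A small sketch that generates those lines; the function name and the file name in the usage note are hypothetical.

```shell
#!/bin/sh
# Emit modprobe.d "blacklist" lines for the given module names.
blacklist_lines() {
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done
}
```

For example, as root: blacklist_lines mei mei_me > /etc/modprobe.d/blacklist-mei.conf, then reboot so the modules are not loaded.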

13 Posts

March 11th, 2015 01:00

So, I've disabled my previous solution (waking all devices up before suspending, which suspended correctly almost always; maybe 1 in 50 suspends would fail) and enabled this one. It's been working great for the few (~10) times I've tested it.