August 9th, 2018 13:00

Console access issue (R730)

I am aware that this is quite unusual, but I hope someone will be able to shed some light on the issue at hand.

Recently, I have been testing an odd behaviour because of a report I received from the field. Basically, when you unplug and replug the console cable from the device (details below) about 10 times, with ~1s intervals between plugging and unplugging, the serial device eventually hangs. Here are some details:

- the device in question is DELL's R730 server,
- the Linux kernel running on it is version 4.4.63,
- the ttyS0 (the serial device in question) is associated with the device present, as seen by the kernel, on the pnp bus,
- ttyS0 is managed by busybox's (1.24.1) getty implementation which is controlled by the runsv daemon. getty's cmdline looks like the following:

/sbin/getty -n -l path_to_login_script baud_rate ttyS0 vt102

- the path_to_login_script is just a bash script that reads the credentials in and forks another bash script that emulates a TUI in case login was successful,
- baud_rate depends on the configuration but in this case it's 9600,
- some details on the tty serial layer:

# cat /proc/tty/driver/serial
serinfo:1.0 driver revision:
0: uart:16550A port:000003F8 irq:4 tx:8820 rx:73 brk:6 RTS|CTS|DTR|DSR
1: uart:16550A port:000002F8 irq:3 tx:0 rx:0 CTS
2: uart:unknown port:000003E8 irq:4
3: uart:unknown port:000002E8 irq:3
# ./stty -F /dev/ttyS0
speed 9600 baud; line = 0;
eol = ^J;
-brkint ixoff -imaxbel
-iexten -echo -echok
#

- the ttyS1 seen above is the "virtual" console accessible via iDRAC and through the BMC's lan+ interface (basically SOL, so I guess it must be an integrated device). That device is not used though, since on the userspace side we have only one getty instance, and it handles just ttyS0,
- some details from tty|pnp sysfs interfaces:

# readlink -f /sys/class/tty/ttyS0/device
/sys/devices/pnp0/00:02
# grep . "${ff[@]}"
/sys/class/tty/ttyS0/close_delay:50
/sys/class/tty/ttyS0/closing_wait:3000
/sys/class/tty/ttyS0/custom_divisor:0
/sys/class/tty/ttyS0/dev:4:64
/sys/class/tty/ttyS0/flags:0x10000040
/sys/class/tty/ttyS0/io_type:0
/sys/class/tty/ttyS0/iomem_base:0x0
/sys/class/tty/ttyS0/iomem_reg_shift:0
/sys/class/tty/ttyS0/irq:4
/sys/class/tty/ttyS0/line:0
/sys/class/tty/ttyS0/port:0x3F8
/sys/class/tty/ttyS0/rx_trig_bytes:8
/sys/class/tty/ttyS0/type:4
/sys/class/tty/ttyS0/uartclk:1843200
/sys/class/tty/ttyS0/uevent:MAJOR=4
/sys/class/tty/ttyS0/uevent:MINOR=64
/sys/class/tty/ttyS0/uevent:DEVNAME=ttyS0
/sys/class/tty/ttyS0/xmit_fifo_size:16
# grep . /sys/devices/pnp0/00\:02/{resources,options}
/sys/devices/pnp0/00:02/resources:state = active
/sys/devices/pnp0/00:02/resources:io 0x3f8-0x3ff
/sys/devices/pnp0/00:02/resources:irq 4
/sys/devices/pnp0/00:02/options:Dependent: 00 - Priority acceptable
/sys/devices/pnp0/00:02/options:  port 0x3f8-0x3f8, align 0x7, size 0x8, 16-bit address decoding
/sys/devices/pnp0/00:02/options:  irq 4 High-Edge
/sys/devices/pnp0/00:02/options:Dependent: 01 - Priority acceptable
/sys/devices/pnp0/00:02/options:  port 0x3e8-0x3e8, align 0x7, size 0x8, 16-bit address decoding
/sys/devices/pnp0/00:02/options:  irq 4 High-Edge
/sys/devices/pnp0/00:02/options:Dependent: 02 - Priority acceptable
/sys/devices/pnp0/00:02/options:  port 0x2f8-0x2f8, align 0x7, size 0x8, 16-bit address decoding
/sys/devices/pnp0/00:02/options:  irq 3 High-Edge
/sys/devices/pnp0/00:02/options:Dependent: 03 - Priority acceptable
/sys/devices/pnp0/00:02/options:  port 0x3f8-0x3f8, align 0x7, size 0x8, 16-bit address decoding
/sys/devices/pnp0/00:02/options:  irq 3,4,5,6,7,12 High-Edge
/sys/devices/pnp0/00:02/options:Dependent: 04 - Priority acceptable
/sys/devices/pnp0/00:02/options:  port 0x2f8-0x2f8, align 0x7, size 0x8, 16-bit address decoding
/sys/devices/pnp0/00:02/options:  irq 3,4,5,6,7,12 High-Edge
# readlink -f /sys/class/tty/ttyS0/device/driver
/sys/bus/pnp/drivers/serial

- some details on the IRQs (there are 24 CPU threads in total, so I cut the output down to the relevant one, cpu0):

# [[ -d /proc/irq/4/serial ]]; echo $?
0
# grep serial /proc/interrupts
    4:        970    ...          IO-APIC    4-edge      serial
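
One way to tell whether the UART still raises interrupts while the console is hung would be to sample the IRQ 4 count twice and compare. This is only a sketch: `irq_cpu0_count` is a hypothetical helper name, and it assumes the usual /proc/interrupts layout where the first column is the IRQ number and the second is the cpu0 count.

```shell
# Hypothetical helper (not an existing tool): read /proc/interrupts-style
# text on stdin and print the cpu0 counter for the given IRQ number.
irq_cpu0_count() {
    awk -v irq="$1:" '$1 == irq { print $2 }'
}

# Intended use on the affected box (left as comments, needs the real hardware):
#   before=$(irq_cpu0_count 4 < /proc/interrupts)
#   ...type something on the attached terminal...
#   after=$(irq_cpu0_count 4 < /proc/interrupts)
#   echo "irq 4 on cpu0: $before -> $after"
```

If the count stays the same while keys are pressed on the terminal, the interrupts are not reaching cpu0 at all. Note that this looks at cpu0 only, as in the trimmed output above.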

Whenever the "hang" happens (as in, the user is not able to interact with the login script), it looks like bash is stuck on the read from /dev/ttyS0, hence the feeling of unresponsiveness. If we check the rx/tx stats under procfs, the counters (rx in particular) do not increase at all. Funnily enough, writes to /dev/ttyS0 succeed, but the data is not seen on the terminal connected to said device (the counters don't increase either). Nothing is reported by the kernel (tty layer or the pnp drivers) when that scenario happens. It's as if the cable hadn't been connected at all. HOWEVER, communication is restored when the getty daemon is restarted or, and that's the weirdest one, when a write to the pnp device's resources interface is performed. It doesn't matter what you write into it; it may be complete garbage that's not understood by pnp/interface.c. The driver will return -EBUSY and the write will fail each time, but as soon as that's done, the console wakes up immediately and you can proceed with the login.
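
For anyone who wants to reproduce the observation, here is a rough sketch of how the counters can be compared around the resources write. `serial_counters` is a hypothetical helper name; it only parses the /proc/tty/driver/serial format shown earlier, and the sysfs path in the comments is the one from this particular box.

```shell
# Hypothetical helper (not an existing tool): read /proc/tty/driver/serial-style
# text on stdin and print the tx/rx counters for one line index
# (0 corresponds to ttyS0 in the listing above).
serial_counters() {
    awk -v port="$1:" '$1 == port {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^tx:/) tx = substr($i, 4)
            if ($i ~ /^rx:/) rx = substr($i, 4)
        }
        print "tx=" tx, "rx=" rx
    }'
}

# Intended use on the affected box (left as comments, needs root and the real hardware):
#   before=$(serial_counters 0 < /proc/tty/driver/serial)
#   echo garbage > /sys/devices/pnp0/00:02/resources   # the write itself fails with EBUSY
#   ...type something on the attached terminal...
#   after=$(serial_counters 0 < /proc/tty/driver/serial)
#   echo "before: $before  after: $after"
```

On the hung console the two snapshots stay identical until the write is attempted; afterwards the counters start moving again.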

Any hints on what could be the problem here would be appreciated. Thanks! :)
