Windows Server: MPIO(MSDSM): Changing the PathVerificationPeriod does not make the path fail faster
Summary: Adjusting the Msdsm PathVerificationPeriod does not make the path fail faster. This is expected behaviour, because the underlying HBA still signals a healthy path. Instead, you need to change the respective HBA parameters so that a path failure is signalled earlier. ...
Instructions
Reducing the PathVerificationPeriod does not speed up path failure detection as expected if the underlying HBA-level timers are set to higher values and your path failure test involves removing or disabling links in the SAN. For Emulex HBAs, these timers are NodeTimeOut and LinkTimeOut. For QLogic HBAs, the corresponding parameters are called Link Down Timeout and Port Down Retry Count.
Please note that the Msdsm PathVerificationPeriod is not a "PathVerificationTimeOut". It is the interval at which path tests are scheduled. Reducing the value therefore schedules tests more often, but the underlying layer still signals a state change only after the HBA timers have expired.
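The relationship between the verification period and the HBA timers can be illustrated with a small Python sketch. It is a deliberately simplified toy model, not a description of the actual MPIO or HBA driver logic: it assumes the link drops at t=0 and that the HBA layer reports the path as healthy until a single timer (hba_timeout) has expired. The function and parameter names are hypothetical.

```python
def path_test_results(verification_period, hba_timeout, horizon):
    """Return (time, state) for each scheduled path test after a link drop at t=0.

    Toy model (assumption): the HBA layer keeps reporting the path as
    healthy until hba_timeout seconds have passed since the link went down.
    """
    results = []
    t = verification_period
    while t <= horizon:
        state = "failed" if t >= hba_timeout else "healthy"
        results.append((t, state))
        t += verification_period
    return results


def first_failed_test(verification_period, hba_timeout, horizon=300):
    """Time of the first scheduled test that sees the failure, or None."""
    for t, state in path_test_results(verification_period, hba_timeout, horizon):
        if state == "failed":
            return t
    return None


# HBA timers left at 30 s: a 10 s period only adds tests that still see a healthy path.
print(path_test_results(10, 30, 40))
print(first_failed_test(30, 30), first_failed_test(10, 30))
# HBA timers lowered to 10 s as well: the failure becomes visible at 10 s.
print(first_failed_test(10, 10))
```

In this model, lowering the period from 30 to 10 seconds leaves the first failed test at 30 seconds. Only lowering the HBA timer as well moves it to 10 seconds.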
Emulex (Broadcom)
The parameters are defined in the Emulex documentation as follows:
LinkTimeOut
"A timer is started on all mapped targets using the LinkTimeOut value when a linkdown
event is detected. If the timer expires before link-up discovery is resolved,
commands issued to timed-out devices return a SELECTION_TIMEOUT status. The
Storport Miniport driver is notified of a bus change event, which leads to the removal
of all LUNs on the timed-out devices.
Values: 0 to 255 seconds or 0x0 to 0xFF (hexadecimal)
Default: 30 (0x1E)"
NodeTimeout
"The node timer starts when a node (that is, a discovered target or adapter) becomes
unavailable. If the node fails to become available before the NodeTimeout interval
expires, the operating system is notified so that any associated devices (if the node
is a target) can be removed. If the node becomes available before the NodeTimeout
interval expires, the timer is canceled and no notification is made.
Values: 1 to 255 seconds or 0x0 to 0xFF (hexadecimal)
Default: 30 (0x1E)"
For example, if you reduce the Msdsm PathVerificationPeriod from 30 (0x1e) to 10 (0xa) seconds on a system with an Emulex FC adapter, you must also set NodeTimeOut and LinkTimeOut to 10 (0xa).
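A planned set of values can be sanity-checked before it is applied. The following Python sketch is only an illustration: the function name and the dictionary layout are hypothetical, the value ranges are taken from the Emulex definitions quoted above, and the values must be entered by hand because the script does not read anything from the system or the HBA.

```python
# Value ranges as quoted from the Emulex documentation above.
EMULEX_RANGES = {"LinkTimeOut": (0, 255), "NodeTimeOut": (1, 255)}


def check_emulex_timers(path_verification_period, timers):
    """Return a list of findings; an empty list means the values are consistent.

    A timer is flagged if it is outside its documented range or if it is
    higher than the planned PathVerificationPeriod (all values in seconds).
    """
    findings = []
    for name, (low, high) in EMULEX_RANGES.items():
        value = timers[name]
        if not low <= value <= high:
            findings.append(f"{name}={value} is outside {low}..{high}")
        elif value > path_verification_period:
            findings.append(
                f"{name}={value} ({value:#x}) exceeds "
                f"PathVerificationPeriod={path_verification_period} "
                f"({path_verification_period:#x})"
            )
    return findings


# Period reduced to 10 s while the HBA timers are still at their 30 s default.
for finding in check_emulex_timers(10, {"LinkTimeOut": 30, "NodeTimeOut": 30}):
    print(finding)
```

With both timers set to 10, the function returns an empty list.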
Refer to the Configuration section of the Emulex Drivers for Windows User Guide on the Broadcom web site for further details about using Emulex HBA Manager or Emulex HBA Manager CLI (https://docs.broadcom.com/docs/elx_DRVWin-UG144-100.pdf).
QLogic (Marvell)
The parameters are defined in the QLogic documentation as follows:
Link Down Timeout
"Specifies the number of seconds the software waits for a link that is down to come up."
Port Down Retry Count
"Specifies the number of seconds the software waits before resending a command
to a port whose status indicates that the port is down."
For example, if you reduce the Msdsm PathVerificationPeriod from 30 (0x1e) to 10 (0xa) seconds on a system with a QLogic FC adapter, you must also set Link Down Timeout and Port Down Retry Count to 10 (0xa). Please note that Port Down Retry Count is a value in seconds and not a number of cycles, as one could conclude from the term "count".
Refer to the QLogic User Guide on the Marvell web page for further details (https://www.marvell.com/content/dam/marvell/en/public-collateral/fibre-channel/marvell-fibre-channel-adapters-qlogic-series-2700-user-guide.pdf).
CLI References
CLI references can be found in: