PowerEdge: How to Install NVIDIA Driver in Red Hat Enterprise Linux
Summary: This article discusses compiling and installing an NVIDIA driver with Dynamic Kernel Module Support (DKMS) in Red Hat Enterprise Linux that has "Secure Boot" disabled.
Instructions
Before getting started, ensure that Secure Boot is disabled in the BIOS. This installation method uses DKMS to compile the NVIDIA driver from source for the running kernel, so the compiled driver carries no vendor signature. If Secure Boot is enabled, the self-compiled driver fails to load with the error Required key not available. Verify the current status with the command mokutil --sb-state in Red Hat Enterprise Linux, and change the setting in the BIOS by pressing F2 during server POST.
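The Secure Boot check can be scripted before any package is installed. The following is a minimal sketch, assuming that mokutil --sb-state prints a line containing "SecureBoot enabled" or "SecureBoot disabled". The function name sb_state_from_output is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: classify the text printed by `mokutil --sb-state`.
sb_state_from_output() {
    case "$1" in
        *"SecureBoot enabled"*)  echo "enabled" ;;
        *"SecureBoot disabled"*) echo "disabled" ;;
        *)                       echo "unknown" ;;
    esac
}

# Query the firmware only when mokutil is installed.
if command -v mokutil >/dev/null 2>&1; then
    state=$(sb_state_from_output "$(mokutil --sb-state 2>&1)")
    echo "Secure Boot state: $state"
    if [ "$state" = "enabled" ]; then
        echo "Disable Secure Boot in the BIOS (F2 during POST) before continuing." >&2
    fi
fi
```

An "unknown" result usually means that the system did not boot in UEFI mode or that mokutil could not read the EFI variables, so check the BIOS setting by hand in that case.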
If Secure Boot is required, or if you prefer a pre-compiled driver, see the following article instead: How to Install NVIDIA Driver Online in Red Hat Enterprise Linux with Secure Boot Enabled.
- Download the required driver from the NVIDIA site.
- Select the correct version of Red Hat Enterprise Linux, for example Red Hat Enterprise Linux 8.
- Select the Compute Unified Device Architecture (CUDA) version that matches the CUDA toolkit you plan to install, for example 12.2.
- The download package is an RPM, for example nvidia-driver-local-repo-rhel8-535.54.03-1.0-1.x86_64.rpm
- Install the RPM. This creates a local repository on the server.
[root@rhel87 ~]# ls
anaconda-ks.cfg  nvidia-driver-local-repo-rhel8-535.54.03-1.0-1.x86_64.rpm
[root@rhel87 ~]# yum localinstall ./nvidia-driver-local-repo-rhel8-535.54.03-1.0-1.x86_64.rpm
...output skipped...
[root@rhel87 ~]# yum repolist
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered with an entitlement server. You can use subscription-manager to register.
repo id                                  repo name
my-rhel-87-AppStream-iso                 my RHEL 87 AppStream iso
my-rhel-87-BaseOS-iso                    my RHEL 87 BaseOS iso
my-rhel-extra-rpms                       my RHEL extra rpms
nvidia-driver-local-rhel8-535.54.03      nvidia-driver-local-rhel8-535.54.03
[root@rhel87 ~]#
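To confirm from a script that the local repository was registered, the repolist output can be filtered for the NVIDIA repo id. This is a small sketch, assuming that the repo id starts with "nvidia-driver-local" as in the output above. The function name nvidia_local_repos is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `yum repolist` output on stdin and print every
# repo id (first column) that starts with "nvidia-driver-local".
nvidia_local_repos() {
    awk '$1 ~ /^nvidia-driver-local/ { print $1 }'
}

# Example (run on the server):
#   yum repolist 2>/dev/null | nvidia_local_repos
```

An empty result suggests that the RPM was not installed or that the repository is disabled.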
- Install DKMS. DKMS is not included in Red Hat Enterprise Linux; it is available from Extra Packages for Enterprise Linux (EPEL). For more details about DKMS, see the Red Hat article Is DKMS provided in Red Hat Enterprise Linux.
[root@rhel87 ~]# yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
[root@rhel87 ~]# yum install dkms
- You may disable or remove the EPEL repository afterward if required.
# to disable epel, edit the following configuration and change to "enabled=0"
[root@rhel87 ~]# vi /etc/yum.repos.d/epel.repo
# to remove the epel
[root@rhel87 ~]# yum remove epel-release
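The manual edit of the repo file can also be done non-interactively. The following is a sketch only, assuming that the repo file uses plain "enabled=1" lines. The function name disable_repo_file is hypothetical, and it is worth trying it on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper: set every "enabled=1" line in a .repo file to
# "enabled=0". The file is rewritten in place through a temporary copy.
disable_repo_file() {
    repo_file=$1
    tmp_file="${repo_file}.tmp.$$"
    sed 's/^enabled[[:space:]]*=[[:space:]]*1[[:space:]]*$/enabled=0/' "$repo_file" > "$tmp_file" &&
        cat "$tmp_file" > "$repo_file" &&
        rm -f "$tmp_file"
}

# Example (run as root on the server):
#   disable_repo_file /etc/yum.repos.d/epel.repo
```

This disables every section in the file, including any that were already disabled, and leaves all other settings untouched.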
- Install the compilation tools and the kernel headers for the running kernel. Ensure that the Red Hat subscription is attached.
[root@rhel87 ~]# yum groupinstall "Development Tools"
[root@rhel87 ~]# yum install kernel-devel-$(uname -r)
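DKMS can only build the module when the kernel-devel tree matches the running kernel exactly. The check below is a minimal sketch, assuming that kernel-devel unpacks under /usr/src/kernels/ followed by the kernel release. The function name kernel_devel_present and its optional second argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the kernel-devel tree for a kernel release
# exists. The optional second argument overrides the base directory, which is
# assumed to be /usr/src/kernels on Red Hat Enterprise Linux.
kernel_devel_present() {
    release=$1
    base=${2:-/usr/src/kernels}
    [ -d "$base/$release" ]
}

running=$(uname -r)
if kernel_devel_present "$running"; then
    echo "kernel-devel tree found for $running"
else
    echo "kernel-devel tree missing for $running; DKMS cannot build against this kernel" >&2
fi
```

A mismatch typically appears after a kernel update without a reboot, because uname -r still reports the old kernel.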
- Install the cuda-driver
[root@rhel87 ~]# yum install cuda-driver
- Confirm that the driver was installed successfully.
[root@rhel87 ~]# dkms status
nvidia/535.54.03, 4.18.0-425.3.1.el8.x86_64, x86_64: installed
[root@rhel87 ~]#
- If the status shown above is added instead of installed, build the module.
[root@rhel87 ~]# dkms build nvidia/535.54.03
- If the status is built, install the module.
[root@rhel87 ~]# dkms install nvidia/535.54.03
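The status-to-action mapping in the steps above can be sketched as a small helper. It assumes that the state is the text after the last ": " on each dkms status line, as in the sample output in this article. The function name dkms_next_step is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: map one line of `dkms status` output to the next action.
# Assumes the state is the text after the last ": " on the line.
dkms_next_step() {
    state=$(printf '%s\n' "$1" | sed 's/.*: *//')
    case "$state" in
        added*)     echo "build" ;;
        built*)     echo "install" ;;
        installed*) echo "none" ;;
        *)          echo "unknown" ;;
    esac
}

# Report the next step for every module that DKMS knows about.
if command -v dkms >/dev/null 2>&1; then
    dkms status | while IFS= read -r line; do
        echo "$line -> next step: $(dkms_next_step "$line")"
    done
fi
```

The helper only reports the next step; it does not run dkms build or dkms install itself.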
- If the build or the installation fails, review the log at the following path:
[root@rhel87 ~]# ls /var/lib/dkms/nvidia/535.54.03/4.18.0-425.3.1.el8.x86_64/x86_64/log/make.log
[root@rhel87 ~]#
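Build logs are long, so it can help to print only the lines that mention an error. The following is a sketch under the assumption that compiler and make failures contain the word "error" in any letter case. The function name show_build_errors is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the numbered lines of a build log that mention
# "error" (case-insensitive), limited to the last 20 matches.
show_build_errors() {
    log_file=$1
    [ -r "$log_file" ] || { echo "log not readable: $log_file" >&2; return 1; }
    grep -n -i 'error' "$log_file" | tail -n 20
}

# Example, using the path from this article:
#   show_build_errors /var/lib/dkms/nvidia/535.54.03/4.18.0-425.3.1.el8.x86_64/x86_64/log/make.log
```

The first matching line is usually the root cause. Later matches are often follow-on failures from make.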
- Reboot the server after the NVIDIA driver is installed.
[root@rhel87 ~]# systemctl reboot
- Verify that the driver is up and running.
[root@rhel87 ~]# lsmod | grep nvidia
nvidia_drm             73728  0
nvidia_modeset       1306624  1 nvidia_drm
nvidia_uvm           1523712  0
nvidia              56426496  2 nvidia_uvm,nvidia_modeset
drm_kms_helper        176128  4 qxl,nvidia_drm
drm                   565248  7 drm_kms_helper,qxl,nvidia,drm_ttm_helper,nvidia_drm,ttm
[root@rhel87 ~]# nvidia-smi
Tue Jul 25 12:00:29 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P100-PCIE-12GB           Off | 00000000:07:00.0 Off |                    0 |
| N/A   33C    P0              29W / 250W |      0MiB / 12288MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
[root@rhel87 ~]# modinfo nvidia
filename:       /lib/modules/4.18.0-425.3.1.el8.x86_64/extra/nvidia.ko.xz
firmware:       nvidia/535.54.03/gsp_tu10x.bin
firmware:       nvidia/535.54.03/gsp_ga10x.bin
alias:          char-major-195-*
version:        535.54.03
supported:      external
license:        NVIDIA
rhelversion:    8.7
srcversion:     EA9C7EF32617E104C8240C4
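The lsmod check can be turned into a pass or fail test for use after the reboot. This is a minimal sketch, assuming that the core module is listed with the exact name "nvidia" in the first column, as in the output above. The function name nvidia_module_loaded is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `lsmod` output on stdin and succeed only when a
# module named exactly "nvidia" is listed in the first column.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit (found ? 0 : 1) }'
}

if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nvidia_module_loaded; then
        echo "nvidia kernel module is loaded"
    else
        echo "nvidia kernel module is not loaded; check dkms status and dmesg" >&2
    fi
fi
```

Matching the first column exactly avoids a false positive when only a dependent module such as nvidia_drm appears in the list.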
If there is any issue, collect the following logs and contact Dell Support:
- sosreport
- The build log (make.log) mentioned above, if the driver build failed
- Any /var/log/nvidia-installer.log, or any logs mentioned in the output while installing
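Gathering the log files into one directory makes them easier to attach to a support case. The following is a sketch only. The function name collect_logs and the destination /tmp/nvidia-logs are hypothetical, and the sosreport itself still has to be generated separately.

```shell
#!/bin/sh
# Hypothetical helper: copy every existing file from the argument list into a
# destination directory and print the number of files copied. Files with the
# same base name overwrite each other in the destination.
collect_logs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    copied=0
    for f in "$@"; do
        if [ -f "$f" ]; then
            cp "$f" "$dest/" && copied=$((copied + 1))
        fi
    done
    echo "$copied"
}

# Example, using the paths from this article (run as root):
#   collect_logs /tmp/nvidia-logs \
#       /var/lib/dkms/nvidia/535.54.03/4.18.0-425.3.1.el8.x86_64/x86_64/log/make.log \
#       /var/log/nvidia-installer.log
```

Missing files are skipped silently, so a count of 0 means that none of the listed logs exist on the server.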
Other information to know:
- If Red Hat Enterprise Linux is installed with a graphical interface and boots into graphical.target, you may see a black screen after the reboot. The solution is to move /etc/X11/xorg.conf.d/10-nvidia.conf out of the X11 folder and reboot the server.
- If you pass the GPU through to a VM in KVM while the hypervisor runs in graphical.target, the passthrough fails. The solution is to boot the Red Hat Enterprise Linux hypervisor into multi-user.target, because graphical.target prevents the NVIDIA driver from being unloaded before the GPU is passed through to the VM.
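Both workarounds above can be sketched as shell helpers. The function names move_xorg_snippet and needs_multi_user_switch and the backup directory /root/xorg-backup are hypothetical. The systemctl get-default and systemctl set-default commands are standard systemd commands, shown here only in the commented examples.

```shell
#!/bin/sh
# Hypothetical helper: move an X11 configuration snippet into a backup
# directory if it exists. Both paths are parameters so nothing is hard-coded.
move_xorg_snippet() {
    conf=$1
    backup_dir=$2
    [ -f "$conf" ] || { echo "nothing to move: $conf" >&2; return 0; }
    mkdir -p "$backup_dir" && mv "$conf" "$backup_dir/"
}

# Hypothetical helper: succeed when the given default target is anything other
# than multi-user.target, meaning a switch is needed before GPU passthrough.
needs_multi_user_switch() {
    [ "$1" != "multi-user.target" ]
}

# Example for the black screen (run as root, then reboot):
#   move_xorg_snippet /etc/X11/xorg.conf.d/10-nvidia.conf /root/xorg-backup
#   systemctl reboot
#
# Example for GPU passthrough on the hypervisor (run as root, then reboot):
#   if needs_multi_user_switch "$(systemctl get-default)"; then
#       systemctl set-default multi-user.target
#   fi
#   systemctl reboot
```

Moving the file instead of deleting it keeps a copy that can be restored if the graphical session is needed again.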