PowerEdge: How to Install NVIDIA Driver in Red Hat Enterprise Linux

Summary: This article discusses compiling and installing an NVIDIA driver with Dynamic Kernel Module Support (DKMS) in Red Hat Enterprise Linux that has "Secure Boot" disabled.


Instructions

This article discusses how to compile and install an NVIDIA driver with DKMS in Red Hat Enterprise Linux that has "Secure Boot" disabled.

Before getting started, ensure that Secure Boot is disabled in the BIOS. This installation method uses DKMS to compile the NVIDIA driver from source code for the currently running kernel, so the compiled driver carries no vendor signature. If Secure Boot is enabled, the self-compiled driver fails to load with the error "Required key not available". Verify the current status in Red Hat Enterprise Linux with the command mokutil --sb-state, and change it in the BIOS by pressing F2 during server POST.

If Secure Boot is required, or if you prefer a pre-compiled driver, see the article How to Install NVIDIA Driver Online in Red Hat Enterprise Linux with Secure Boot Enabled.
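
The Secure Boot check described above can be scripted so that the installation stops early on a system that cannot load an unsigned module. The following is a minimal sketch and not part of the original procedure: the function name secure_boot_enabled is hypothetical, and it assumes that mokutil --sb-state prints a line containing "SecureBoot enabled" when Secure Boot is on.

```shell
# Sketch: decide whether Secure Boot is on from the text printed by
# "mokutil --sb-state". The function name is illustrative only.
# Assumption: the output contains "SecureBoot enabled" when Secure Boot is on.
secure_boot_enabled() {
    case "$1" in
        *"SecureBoot enabled"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on a live system (run as root):
#   if secure_boot_enabled "$(mokutil --sb-state 2>&1)"; then
#       echo "Secure Boot is enabled; disable it in the BIOS (F2 during POST)." >&2
#       exit 1
#   fi
```

Passing the command output in as an argument keeps the check separate from the mokutil call, so it can be tried on sample text first.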

  1. Download the required driver from the NVIDIA site.
    1. Select the correct version of Red Hat Enterprise Linux, for example Red Hat Enterprise Linux 8.
    2. Select the Compute Unified Device Architecture (CUDA) version that matches the CUDA toolkit that you plan to install, for example 12.2.
    3. The download package is an RPM, for example nvidia-driver-local-repo-rhel8-535.54.03-1.0-1.x86_64.rpm
  2. Install the RPM, which creates a local repository.
[root@rhel87 ~]# ls
anaconda-ks.cfg  nvidia-driver-local-repo-rhel8-535.54.03-1.0-1.x86_64.rpm
[root@rhel87 ~]# yum localinstall ./nvidia-driver-local-repo-rhel8-535.54.03-1.0-1.x86_64.rpm
...output skipped...
[root@rhel87 ~]# yum repolist
Updating Subscription Management repositories.
Unable to read consumer identity

This system is not registered with an entitlement server. You can use subscription-manager to register.

repo id                              repo name
my-rhel-87-AppStream-iso             my RHEL 87 AppStream iso
my-rhel-87-BaseOS-iso                my RHEL 87 BaseOS iso
my-rhel-extra-rpms                   my RHEL extra rpms
nvidia-driver-local-rhel8-535.54.03  nvidia-driver-local-rhel8-535.54.03
[root@rhel87 ~]#
  3. Install DKMS. DKMS is not included in Red Hat Enterprise Linux; it is available from Extra Packages for Enterprise Linux (EPEL). For more details about DKMS, see the Red Hat article Is DKMS provided in Red Hat Enterprise Linux?
[root@rhel87 ~]# yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
[root@rhel87 ~]# yum install dkms
  4. Disable or remove the EPEL repository afterward if required.
# To disable EPEL, edit the following file and set "enabled=0"
[root@rhel87 ~]# vi /etc/yum.repos.d/epel.repo

# To remove EPEL
[root@rhel87 ~]# yum remove epel-release
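
The manual edit of epel.repo described above can also be done non-interactively. The following is a minimal sketch under stated assumptions: disable_repo_file is a hypothetical helper, and it assumes that the repository file marks enabled sections with a line of the form "enabled=1".

```shell
# Sketch: set "enabled=0" in every section of a yum repository file.
# disable_repo_file is a hypothetical helper; the default path is the
# file edited with vi in the step above.
disable_repo_file() {
    local repo_file="${1:-/etc/yum.repos.d/epel.repo}"
    # Rewrite only lines of the form "enabled=1" (spaces around "=" allowed).
    sed -i 's/^enabled[[:space:]]*=[[:space:]]*1[[:space:]]*$/enabled=0/' "$repo_file"
}

# Example use (run as root):
#   disable_repo_file /etc/yum.repos.d/epel.repo
```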
  5. Install the compilation tools and the kernel headers that match the running kernel. Ensure that the Red Hat subscription is attached.
[root@rhel87 ~]# yum groupinstall "Development Tools"
[root@rhel87 ~]# yum install kernel-devel-$(uname -r)
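
Because DKMS compiles the driver for the running kernel, it can help to confirm that headers for that exact kernel release are present before continuing. The following is a minimal sketch: kernel_headers_present is a hypothetical helper, and it assumes that the kernel-devel package places its files under /usr/src/kernels/<release>.

```shell
# Sketch: report whether kernel headers for a given kernel release exist.
# kernel_headers_present is a hypothetical helper.
# Assumption: kernel-devel installs into /usr/src/kernels/<release>.
kernel_headers_present() {
    local release="$1"
    local root="${2:-/usr/src/kernels}"   # second argument overrides the base directory
    [ -d "$root/$release" ]
}

# Example use:
#   kernel_headers_present "$(uname -r)" \
#       || echo "kernel-devel for $(uname -r) is missing" >&2
```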
  6. Install the cuda-driver package.
[root@rhel87 ~]# yum install cuda-driver
  7. Confirm that the driver was installed successfully.
[root@rhel87 ~]# dkms status
nvidia/535.54.03, 4.18.0-425.3.1.el8.x86_64, x86_64: installed
[root@rhel87 ~]#
  8. If the status shown above is "added" instead of "installed", build the module.
[root@rhel87 ~]# dkms build nvidia/535.54.03
  9. If the status is "built", install the module.
[root@rhel87 ~]# dkms install nvidia/535.54.03
  10. If the build or the installation fails, review the log at the following path:
[root@rhel87 ~]# ls /var/lib/dkms/nvidia/535.54.03/4.18.0-425.3.1.el8.x86_64/x86_64/log/make.log
[root@rhel87 ~]#
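
The decision between building, installing, and doing nothing depends only on the state at the end of the dkms status line. The following is a minimal sketch of that decision: next_dkms_step is a hypothetical helper, and it assumes that each line has the form "<module>/<version>[, <kernel>, <arch>]: <state>", as in the sample output above. Other DKMS versions may print a different layout.

```shell
# Sketch: map one line of "dkms status" output to the next command to run.
# next_dkms_step is a hypothetical helper.
# Assumption: the line looks like "<module>/<version>[, <kernel>, <arch>]: <state>".
next_dkms_step() {
    local line="$1"
    local module="${line%%,*}"   # text before the first comma, if any
    module="${module%%:*}"       # drop ": <state>" when there is no comma
    local state="${line##*: }"   # text after the last ": "
    case "$state" in
        installed*) echo "nothing to do: $module is installed" ;;
        built*)     echo "dkms install $module" ;;
        added*)     echo "dkms build $module" ;;
        *)          echo "unknown state: $state" ;;
    esac
}

# Example use:
#   dkms status | while IFS= read -r line; do next_dkms_step "$line"; done
```

The helper only prints the suggested command, so the output can be reviewed before anything is run.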
  11. Reboot the server after the NVIDIA driver is installed.
[root@rhel87 ~]# systemctl reboot
  12. Verify that the driver is loaded and running.
[root@rhel87 ~]# lsmod | grep nvidia
nvidia_drm             73728  0
nvidia_modeset       1306624  1 nvidia_drm
nvidia_uvm           1523712  0
nvidia              56426496  2 nvidia_uvm,nvidia_modeset
drm_kms_helper        176128  4 qxl,nvidia_drm
drm                   565248  7 drm_kms_helper,qxl,nvidia,drm_ttm_helper,nvidia_drm,ttm
[root@rhel87 ~]# nvidia-smi
Tue Jul 25 12:00:29 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P100-PCIE-12GB           Off | 00000000:07:00.0 Off |                    0 |
| N/A   33C    P0              29W / 250W |      0MiB / 12288MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
[root@rhel87 ~]# modinfo nvidia
filename:       /lib/modules/4.18.0-425.3.1.el8.x86_64/extra/nvidia.ko.xz
firmware:       nvidia/535.54.03/gsp_tu10x.bin
firmware:       nvidia/535.54.03/gsp_ga10x.bin
alias:          char-major-195-*
version:        535.54.03
supported:      external
license:        NVIDIA
rhelversion:    8.7
srcversion:     EA9C7EF32617E104C8240C4
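
The lsmod check above can be reduced to a single pass or fail result, which is convenient in a post-reboot script. The following is a minimal sketch: nvidia_module_loaded is a hypothetical helper that takes the text of lsmod as its argument and looks for the core nvidia module only.

```shell
# Sketch: check that the core "nvidia" module appears in lsmod output.
# nvidia_module_loaded is a hypothetical helper; pass it the text of "lsmod".
nvidia_module_loaded() {
    # Match a line whose first field is exactly "nvidia",
    # not nvidia_drm, nvidia_uvm, or nvidia_modeset.
    printf '%s\n' "$1" | awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Example use after the reboot:
#   nvidia_module_loaded "$(lsmod)" && nvidia-smi
```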



If there is any issue, collect the following logs and contact Dell Support:

  • sosreport
  • The DKMS build log (make.log) mentioned above, if the driver build failed
  • /var/log/nvidia-installer.log, if present, and any logs mentioned in the output while installing
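
The log files listed above can be gathered into one archive before contacting support. The following is a minimal sketch: collect_logs is a hypothetical helper, the archive name in the example is an assumption, and the sosreport is still generated separately.

```shell
# Sketch: bundle whichever of the given log files exist into one archive.
# collect_logs is a hypothetical helper; the caller chooses the paths.
collect_logs() {
    local archive="$1"; shift
    local existing=()
    local f
    for f in "$@"; do
        if [ -f "$f" ]; then
            existing+=("$f")
        fi
    done
    if [ "${#existing[@]}" -eq 0 ]; then
        echo "no log files found" >&2
        return 1
    fi
    # GNU tar strips the leading "/" from member names and notes this on stderr.
    tar -czf "$archive" "${existing[@]}"
}

# Example use (run as root):
#   collect_logs /tmp/nvidia-logs.tar.gz \
#       "/var/lib/dkms/nvidia/535.54.03/$(uname -r)/x86_64/log/make.log" \
#       /var/log/nvidia-installer.log
```

Files that do not exist are skipped, so the same call works whether the failure happened during the build or later.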

Other information to know:

  • If Red Hat Enterprise Linux is installed and boots into graphical.target, you may see a black screen after the reboot. The solution is to move /etc/X11/xorg.conf.d/10-nvidia.conf out of the /etc/X11 directory and reboot the server.
  • Passing the GPU through to a VM in KVM fails when the hypervisor runs in graphical.target, because graphical.target prevents the NVIDIA driver from unloading before the GPU is passed through to the VM. The solution is to boot the Red Hat Enterprise Linux hypervisor into multi-user.target.
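
Both workarounds above come down to one or two commands. The following is a minimal sketch: set_aside_nvidia_xorg_conf is a hypothetical helper, and the backup directory /root/xorg-backup is an assumed location; any directory outside /etc/X11 would serve.

```shell
# Sketch: move 10-nvidia.conf out of the X11 configuration tree.
# set_aside_nvidia_xorg_conf is a hypothetical helper, and the default
# backup directory is an assumption, not a path from the article.
set_aside_nvidia_xorg_conf() {
    local conf="${1:-/etc/X11/xorg.conf.d/10-nvidia.conf}"
    local backup_dir="${2:-/root/xorg-backup}"
    [ -f "$conf" ] || return 0          # nothing to move
    mkdir -p "$backup_dir" && mv "$conf" "$backup_dir/"
}

# Example use for the black screen case (run as root, then reboot):
#   set_aside_nvidia_xorg_conf
#
# For the GPU passthrough case, the default boot target of the hypervisor
# can be changed with the standard systemd command, followed by a reboot:
#   systemctl set-default multi-user.target
```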

Affected Products

Red Hat Enterprise Linux Version 7, Red Hat Enterprise Linux Version 9, Red Hat Enterprise Linux Version 8

Products

DSS 8440, PowerEdge C4140, PowerEdge C6525, PowerEdge R640, PowerEdge R650, PowerEdge R6515, PowerEdge R6525, PowerEdge R740, PowerEdge R740XD, PowerEdge R7425, PowerEdge R750, PowerEdge R750XA, PowerEdge R7515, PowerEdge R7525, PowerEdge R840, PowerEdge R940xa, PowerEdge T550, PowerEdge T640, PowerEdge XE2420, PowerEdge XE7420, PowerEdge XE9680 ...
Article Properties
Article Number: 000216077
Article Type: How To
Last Modified: 06 Dec 2024
Version:  5