NVIDIA Driver Installation Guide

From Server rental store
Revision as of 10:00, 14 April 2026 by Admin (talk | contribs) (New server guide)

This guide provides detailed instructions for installing NVIDIA drivers on Ubuntu, Debian, and CentOS systems. Proper driver installation is crucial for leveraging the full potential of NVIDIA GPUs, particularly for AI/ML workloads, scientific computing, and high-performance graphics.

Prerequisites

Before you begin, ensure you have:

  1. A server with an NVIDIA GPU installed.
  2. Root or sudo privileges on your server.
  3. Internet connectivity to download necessary packages.
  4. Basic familiarity with the Linux command line.
  5. The specific model of your NVIDIA GPU. You can usually find this using:
lspci | grep -i nvidia

Important: Ensure your system is up to date before you begin.

For Ubuntu/Debian:
sudo apt update && sudo apt upgrade -y

For CentOS:
sudo yum update -y
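
Which of the two update commands applies depends on the distribution family. The following is a minimal sketch, assuming the standard `ID` field in `/etc/os-release` is present; the helper name `pkg_family` is illustrative and not part of any tool.

```shell
#!/bin/sh
# Map an os-release ID to the package manager family used in this guide.
# pkg_family is a hypothetical helper name chosen for illustration.
pkg_family() {
    case "$1" in
        ubuntu|debian) echo "apt" ;;
        centos) echo "yum" ;;
        *) echo "unknown" ;;
    esac
}

# Read the ID of the running system in a subshell, if the file exists,
# so that sourcing it does not overwrite variables in this shell.
if [ -r /etc/os-release ]; then
    os_id=$( . /etc/os-release && printf '%s' "${ID:-unknown}" )
    echo "Detected: $os_id -> $(pkg_family "$os_id")"
fi
```

An `unknown` result means none of the commands in this guide can be assumed to apply as written.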

Step 1: Identify Your GPU and Kernel Version

Knowing your GPU model and kernel version helps in selecting the correct driver.

  1. Check GPU:
lspci | grep -i nvidia

Example Output:

01:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2070] (rev a1)
  2. Check Kernel Version:
uname -r

Example Output:

5.15.0-56-generic
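
Both checks can be combined into one short report. This is a sketch only: the function name `gpu_description` is an illustrative choice, and the sample line is the `lspci` example shown earlier in this step.

```shell
#!/bin/sh
# Reduce lspci lines that mention NVIDIA to the device description,
# that is, everything after the last ": " on the line.
gpu_description() {
    grep -i nvidia | sed 's/.*: //'
}

# Sample line taken from the example lspci output in this step.
sample='01:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2070] (rev a1)'
printf '%s\n' "$sample" | gpu_description

# On a real system with lspci installed, use:
#   lspci | gpu_description
echo "Running kernel: $(uname -r)"
```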

Step 2: Install Necessary Build Tools

The NVIDIA driver installation often requires kernel headers and build tools to compile modules for your specific kernel.

For Ubuntu/Debian

sudo apt install build-essential linux-headers-$(uname -r) -y

For CentOS

CentOS uses `kernel-devel` for kernel headers:
sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc make -y

Why this matters: The NVIDIA driver is a kernel module. To load and function correctly, it needs to be compiled against the exact headers of your running kernel. Missing these will lead to a non-functional driver.
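
One way to confirm the headers match the running kernel is to check for the kernel's build tree. The sketch below assumes the common layout in which `/lib/modules/<version>/build` points at the installed headers; the helper name `headers_present` and its optional root-prefix argument are illustrative.

```shell
#!/bin/sh
# Return success if a kernel build tree exists for the given kernel version.
# The optional second argument is a root prefix, used here only for dry runs.
# A dangling symlink counts as missing, because -e follows the link.
headers_present() {
    kver="$1"
    root="${2:-}"
    [ -e "$root/lib/modules/$kver/build" ]
}

if headers_present "$(uname -r)"; then
    echo "Headers found for $(uname -r)"
else
    echo "Headers missing for $(uname -r); install them before the driver"
fi
```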

Step 3: Disable Nouveau Driver

The Nouveau driver is an open-source driver for NVIDIA cards that can interfere with the proprietary NVIDIA driver installation. It's essential to disable it.

  1. Blacklist Nouveau:

Create a new configuration file:

sudo nano /etc/modprobe.d/blacklist-nouveau.conf

Add the following lines:

blacklist nouveau
options nouveau modeset=0

Save and exit (Ctrl+X, Y, Enter in nano).

  2. Update initramfs:

For Ubuntu/Debian:
sudo update-initramfs -u

For CentOS:
sudo dracut --force

  3. Reboot your system:
sudo reboot

Why this matters: The Nouveau driver might try to claim the GPU, preventing the NVIDIA driver from doing so. Blacklisting ensures it's not loaded at boot.
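
For scripted setups, the blacklist file can be written without an editor, and the result checked after the reboot. This is a sketch under stated assumptions: the helper names are illustrative, and the dry run writes to a temporary file rather than to `/etc/modprobe.d/`.

```shell
#!/bin/sh
# Write the two blacklist lines to the given file without an editor.
# For the real location, run as root and pass
# /etc/modprobe.d/blacklist-nouveau.conf as the argument.
write_nouveau_blacklist() {
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > "$1"
}

# Return success if lsmod-style text on stdin lists the nouveau module.
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Dry run against a temporary file.
demo_conf=$(mktemp)
write_nouveau_blacklist "$demo_conf"
cat "$demo_conf"

# After the reboot, confirm the module is gone (only if lsmod is available).
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
fi
```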

Step 4: Install NVIDIA Driver

There are generally two recommended methods: using the distribution's package manager or downloading the driver from NVIDIA's website. Using the distribution's repository is often simpler and better integrated.

Method 1: Using Distribution Repositories (Recommended)

This is the easiest and most stable method for most users.

For Ubuntu/Debian

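  1. On Ubuntu, add the graphics-drivers PPA (Personal Package Archive) for newer drivers. PPAs and the `ubuntu-drivers` tool are Ubuntu-specific; on Debian, install the `nvidia-driver` package from the non-free repository component instead.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
  2. Find the recommended driver: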
ubuntu-drivers devices

Example Output:

== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
driver   : nvidia-driver-470 - distro non-free recommended
driver   : nvidia-driver-510 - distro non-free
driver   : nvidia-driver-515 - distro non-free
driver   : nvidia-driver-450 - distro non-free
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-525 - distro non-free
driver   : nvidia-driver-495 - distro non-free
driver   : nvidia-driver-535 - distro non-free
driver   : nvidia-driver-545 - distro non-free
driver   : nvidia-driver-550 - distro non-free
driver   : nvidia-driver-555 - distro non-free
driver   : nvidia-driver-560 - distro non-free
...
  3. Use the recommended driver:
sudo ubuntu-drivers autoinstall

Alternatively, install a specific version (e.g., 535):

sudo apt install nvidia-driver-535 -y
  4. Reboot your system:
sudo reboot
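
To pin the recommended package explicitly rather than relying on `autoinstall`, its name can be pulled out of the `ubuntu-drivers devices` listing. This is a minimal sketch, assuming the `driver : <package> - ...` line layout shown in the example output; the function name and the two sample lines are illustrative.

```shell
#!/bin/sh
# Print the package name from the first "driver" line flagged "recommended".
# Expects `ubuntu-drivers devices` style text on stdin.
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Illustrative sample input in the "driver : <package> - ..." layout.
sample_output='driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-535 - distro non-free recommended'
printf '%s\n' "$sample_output" | recommended_driver

# On Ubuntu with the ubuntu-drivers tool installed:
#   sudo apt install -y "$(ubuntu-drivers devices | recommended_driver)"
```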

For CentOS

CentOS does not ship the NVIDIA driver in its base repositories. It is usually installed from RPM Fusion, which in turn depends on EPEL (Extra Packages for Enterprise Linux).

  1. Install EPEL if not already present:
sudo yum install epel-release -y
  2. Install RPM Fusion (for NVIDIA drivers):
sudo yum install --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-$(rpm -E %rhel).noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-$(rpm -E %rhel).noarch.rpm -y
  3. Search for available driver packages:
sudo yum search nvidia
  4. Install the latest available driver (e.g., `akmod-nvidia` for kernel modules):
sudo yum install akmod-nvidia xorg-x11-drv-nvidia-cuda -y
  5. Reboot your system:
sudo reboot

Why this matters: Using distribution repositories ensures that the driver is compatible with your system's libraries and kernel. `akmod-nvidia` on CentOS automatically rebuilds the kernel module when the kernel is updated.
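
The RPM Fusion command relies on `$(rpm -E %rhel)` expanding to the major release number. The sketch below shows how the two release-package URLs are composed; the helper name `rpmfusion_url` is illustrative, and the fallback version is an assumed placeholder for machines without `rpm`.

```shell
#!/bin/sh
# Build the RPM Fusion release-package URL for a repository ("free" or
# "nonfree") and an Enterprise Linux major version.
rpmfusion_url() {
    echo "https://download1.rpmfusion.org/$1/el/rpmfusion-$1-release-$2.noarch.rpm"
}

# On CentOS, `rpm -E %rhel` prints the major version number.
if command -v rpm >/dev/null 2>&1; then
    el_version=$(rpm -E %rhel)
else
    el_version=8   # assumed value, for illustration only
fi

rpmfusion_url free "$el_version"
rpmfusion_url nonfree "$el_version"
```

Printing the URLs first makes it easy to spot an unexpanded `%rhel`, which is what `rpm` prints on systems where the macro is not defined.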

Method 2: Using NVIDIA's Runfile Installer

This method offers the latest drivers directly from NVIDIA but can be more complex to manage, especially during kernel updates.

  1. Visit the NVIDIA Driver Download page.
  2. Select your GPU model, operating system, and download the latest recommended driver.
  3. Make the downloaded file executable:
chmod +x NVIDIA-Linux-x86_64-*.run
  4. Stop your display manager. This is crucial to prevent conflicts.

For Ubuntu/Debian (using `gdm3` or `lightdm`):
sudo systemctl stop gdm3

or

sudo systemctl stop lightdm

For CentOS (using `gdm`):
sudo systemctl stop gdm

  5. Run the installer:
sudo ./NVIDIA-Linux-x86_64-*.run
  6. Follow the on-screen prompts. It's generally recommended to accept the default options, including installing the 32-bit compatibility libraries if prompted. The installer might ask to register the kernel module with DKMS (Dynamic Kernel Module Support). It's usually best to say 'yes' if available, so the module is rebuilt on kernel updates.
  7. Restart your display manager (substitute your own if it is not `gdm3`), then reboot:
sudo systemctl start gdm3
sudo reboot

Why this matters: Stopping the display manager ensures the X server is not running, which is necessary for the driver installer to properly load and configure the graphics components. DKMS helps manage kernel module updates automatically.
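
If it is unclear which display manager a server runs, the candidates named above can be probed in turn. This is a sketch, not a complete detector: the candidate list covers only the display managers mentioned in this guide, and `PROBE` is a hypothetical override that exists so the function can be dry-run without systemd.

```shell
#!/bin/sh
# Print the first active display manager from a short candidate list.
# PROBE is a hypothetical override for dry runs; by default the function
# calls the real `systemctl is-active --quiet <unit>`.
active_display_manager() {
    for dm in gdm3 gdm lightdm; do
        if ${PROBE:-systemctl is-active --quiet} "$dm" 2>/dev/null; then
            echo "$dm"
            return 0
        fi
    done
    return 1
}

if dm_name=$(active_display_manager); then
    echo "Stop it before running the installer: sudo systemctl stop $dm_name"
else
    echo "No display manager from the candidate list is active"
fi
```

On a headless server the second branch is the expected result, and the installer can be run directly.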

Step 5: Verify Installation

After rebooting, verify that the NVIDIA driver is loaded correctly.

  1. Check NVIDIA SMI (System Management Interface):
nvidia-smi

Example Output:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------|
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2070        Off | 00000000:01:00.0  On |                  N/A |
| N/A   45C    P8               8W /  N/A |      1MiB /  8192MiB |      0%      Default |
+-----------------------------------------+----------------------+----------------------+
...
  2. Check loaded kernel modules:
lsmod | grep nvidia

Example Output:

nvidia_uvm            962570  0
nvidia_drm             57344  1
nvidia_modeset       119808  2 nvidia_drm
nvidia              3440640  15 nvidia_uvm,nvidia_modeset
i2c_algo_bit           16384  1 nvidia_drm
video                  53248  1 nvidia_modeset
  3. Check Xorg log for errors (only relevant if the server runs a graphical session):
grep "(EE)" /var/log/Xorg.0.log

Expected Output: Should be empty or contain no critical errors related to NVIDIA.

Why this matters: `nvidia-smi` is the primary tool to confirm the driver is active and communicating with the GPU. `lsmod` shows kernel modules, and checking Xorg logs helps diagnose graphical display issues.
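
For scripted checks, the driver version can be read from the `nvidia-smi` banner or queried directly. This is a minimal sketch: the function name `driver_version` is illustrative, and the sample banner follows the format of the example output in this step.

```shell
#!/bin/sh
# Extract the driver version from an nvidia-smi banner line on stdin.
driver_version() {
    sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p'
}

# Banner line in the format of the example nvidia-smi output.
banner='| NVIDIA-SMI 535.104.05   Driver Version: 535.104.05   CUDA Version: 12.2     |'
printf '%s\n' "$banner" | driver_version

# On a machine with the driver installed, query the same value directly.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader || true
fi
```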

Troubleshooting

  • Black Screen After Reboot:
    • This often indicates a driver conflict or incorrect installation.
    • Try booting into recovery mode and uninstalling the NVIDIA driver.
    • If you used the runfile installer, run it again with the `--uninstall` flag.
    • If you used package managers, use `sudo apt autoremove 'nvidia-*'` (Ubuntu/Debian) or `sudo yum remove akmod-nvidia` (CentOS). Quoting the pattern stops the shell from expanding it against files in the current directory.
    • Ensure Nouveau is properly blacklisted.
  • `nvidia-smi` command not found or "Failed to initialize NVML":
    • The driver is likely not loaded correctly.
    • Double-check that you rebooted after installation.
    • Verify that the `nvidia` kernel module is loaded (`lsmod | grep nvidia`).
    • Ensure you installed the correct driver version for your GPU and kernel.
    • On CentOS, ensure `akmod-nvidia` is installed and has built successfully.
  • CUDA Toolkit Issues:
    • Ensure you have installed the CUDA Toolkit, which is separate from the driver. Refer to the CUDA Installation Guide.
    • The driver version must be compatible with the CUDA Toolkit version. Check NVIDIA's CUDA documentation for compatibility matrices.
  • Kernel Updates Break Driver:
    • If you used the runfile installer without DKMS, you'll need to reinstall the driver after a kernel update.
    • If you used distribution packages with DKMS or `akmod-nvidia`, the module should rebuild automatically. If not, manually trigger a rebuild or reinstall.
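
After a kernel update on a DKMS-managed install, the output of `dkms status` shows whether the module was rebuilt for the new kernel. The sketch below assumes each status line starts with `nvidia` and ends with the state; the exact layout differs between DKMS versions, and the function name and sample line are illustrative.

```shell
#!/bin/sh
# Return success if `dkms status` text on stdin reports an installed nvidia
# module for the given kernel version. Both the older comma-separated and
# the newer slash-separated name/version layouts start with "nvidia".
nvidia_built_for() {
    grep -i '^nvidia' | grep -F -- "$1" | grep -q 'installed'
}

# Illustrative sample line, not taken from a real system.
sample_status='nvidia/535.104.05, 5.15.0-56-generic, x86_64: installed'
if printf '%s\n' "$sample_status" | nvidia_built_for "5.15.0-56-generic"; then
    echo "module present for 5.15.0-56-generic"
fi

# On a real system using DKMS:
#   dkms status | nvidia_built_for "$(uname -r)" || echo "rebuild needed"
```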
