Setting Up NVIDIA CUDA on Linux

This guide provides a comprehensive walkthrough for installing and configuring the NVIDIA CUDA Toolkit on a Linux system. CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA. It enables software developers and engineers to use a CUDA-enabled graphics processing unit (GPU) for general purpose processing – an approach known as GPGPU (General-Purpose computing on Graphics Processing Units). This is particularly useful for computationally intensive tasks such as machine learning, deep learning, scientific simulations, and video processing.

Prerequisites

Before you begin, ensure your system meets the following requirements:

NVIDIA GPU: A CUDA-enabled NVIDIA graphics card is essential. You can check compatibility on the [CUDA GPUs] list.
Linux Distribution: A supported Linux distribution (e.g., Ubuntu, CentOS, Debian, Fedora, RHEL). This guide primarily uses examples for Ubuntu/Debian-based systems, but commands can be adapted for others.
Root or sudo privileges: You will need administrative access to install software and modify system configurations.
Internet connection: To download the CUDA Toolkit and drivers.
Basic Linux command-line familiarity: Understanding of package management, file system navigation, and text editing.

Step 1: Identify Your NVIDIA GPU

First, confirm that your NVIDIA GPU is recognized by the system.

lspci | grep -i nvidia

This command should output information about your NVIDIA graphics card.
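As a small sketch, the filter below narrows `lspci` output to display-class devices from NVIDIA, so that audio or USB functions on the same card are not counted as GPUs. The function name `list_nvidia_gpus` is illustrative and not part of any tool; it reads `lspci` output on standard input and assumes the usual "VGA compatible controller" and "3D controller" class labels.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only NVIDIA display devices from lspci output.
list_nvidia_gpus() {
    # Display-class devices usually appear as "VGA compatible controller"
    # or "3D controller"; other NVIDIA functions (audio, USB) are dropped.
    grep -iE 'VGA compatible controller|3D controller' | grep -i 'nvidia'
}

# Usage:
#   lspci | list_nvidia_gpus
```

If the function prints nothing, the card may not be seated or detected, and the later steps are unlikely to succeed.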

Step 2: Install NVIDIA Drivers

The CUDA Toolkit requires compatible NVIDIA drivers. It's generally recommended to install the drivers *before* the CUDA Toolkit.

Option A: Using the Distribution's Package Manager (Recommended for ease of use)

For Ubuntu/Debian:

sudo apt update
sudo apt install nvidia-driver-XXX

Replace `XXX` with the recommended driver version for your distribution or GPU. You can often find the recommended version by running:

ubuntu-drivers devices

Alternatively, skip the manual package selection and let the tool install the recommended driver for you:

sudo ubuntu-drivers autoinstall
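To pick the package name for the `apt install` command without reading the listing by eye, a sketch like the following could extract it. It assumes the listing marks one line with the word `recommended` in the form `driver : nvidia-driver-NNN - distro non-free recommended`; the function name `recommended_driver` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the package marked "recommended" in
# `ubuntu-drivers devices` output read from standard input.
recommended_driver() {
    # Assumed line shape: "driver   : <package> - distro non-free recommended"
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Usage:
#   ubuntu-drivers devices | recommended_driver
```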

For CentOS/RHEL:

sudo yum update
sudo yum install epel-release
sudo yum install xorg-x11-drv-nvidia-XXX

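Replace `XXX` with the appropriate version. Note that EPEL itself does not ship the NVIDIA driver; the `xorg-x11-drv-nvidia` packages come from a third-party repository such as RPM Fusion, which must be enabled before the install command will find them.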

After installation, reboot your system:

sudo reboot

Option B: Downloading Drivers from NVIDIA (More control, but more complex)

1. Visit the [Driver Downloads] page.
2. Select your GPU model, operating system, and download type (e.g., "Production Branch").
3. Download the `.run` file.
4. Before running the installer, you might need to stop your display manager. For example, on Ubuntu:

sudo systemctl stop display-manager

5. Navigate to the directory where you downloaded the file and run it with root privileges:

sudo sh NVIDIA-Linux-x86_64-XXX.XX.run

   Follow the on-screen prompts.

6. Reboot your system:

sudo reboot

Step 3: Verify Driver Installation

After rebooting, check if the NVIDIA driver is loaded correctly.

nvidia-smi

This command should display information about your GPU(s), including the driver version and CUDA version supported by the driver.

If `nvidia-smi` fails, it indicates a problem with the driver installation. Consult the troubleshooting section.
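For scripted checks, `nvidia-smi` can print selected fields as CSV through its `--query-gpu` and `--format` options. The sketch below assumes that CSV form and reduces it to the distinct driver versions in use; the function name `driver_versions` is illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the unique driver versions from CSV produced by
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
driver_versions() {
    # The driver version is the last comma-separated field on each line.
    awk -F', *' '{ print $NF }' | sort -u
}

# Usage:
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader | driver_versions
```

More than one line of output would suggest GPUs reporting different driver versions, which is worth investigating before installing the toolkit.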

Step 4: Install the CUDA Toolkit

NVIDIA provides several methods for installing the CUDA Toolkit. The most common are using a package manager (deb/rpm) or a runfile installer.

Option A: Using the CUDA Repository (Recommended)

This method is generally preferred as it integrates well with your system's package manager and simplifies updates.

1. Add the CUDA Repository:

   Visit the [CUDA Downloads] page. Select your Operating System, Architecture, Distribution, Version, and Installer Type (e.g., `deb (local)` or `rpm (network)`). The page will provide the exact commands.

   For Ubuntu (example using `deb (network)`, which registers the repository through the `cuda-keyring` package):

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update

   For CentOS/RHEL (example using `rpm (network)`):

# The .repo file references NVIDIA's current signing key, so no separate
# `rpm --import` is needed (the 7fa2af80 key was retired in 2022).
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
sudo dnf clean all

2. Install the CUDA Toolkit:

   For Ubuntu:

sudo apt-get -y install cuda

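   The `cuda` metapackage installs the latest CUDA Toolkit together with NVIDIA's driver packages, which can replace the driver you installed in Step 2. To keep your existing driver, or to pin a specific version, install a toolkit-only package such as `cuda-toolkit-XX-Y` (e.g., `cuda-toolkit-12-2`) instead.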

   For CentOS/RHEL:

sudo dnf -y install cuda

3. Set Environment Variables:

   Add CUDA to your PATH and LD_LIBRARY_PATH. This is crucial for the system to find CUDA executables and libraries.
   Add the following lines to your `~/.bashrc` or `~/.zshrc` file:

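export PATH=/usr/local/cuda/bin:$PATH
# Append the old value only when it is non-empty, so no trailing ":" is left behind
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}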

   Then, apply the changes to your current session:

source ~/.bashrc

   (Or `source ~/.zshrc` if you use Zsh)
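Sourcing the startup file repeatedly adds the same directories to these variables again and again. A minimal sketch of a guard against that is shown below; `prepend_path` is a hypothetical helper name, and the `/usr/local/cuda` paths are the ones used in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend a directory to a colon-separated list only if
# it is not already present, and never leave an empty entry behind.
prepend_path() {
    local dir=$1 list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;        # already present: unchanged
        *) printf '%s\n' "$dir${list:+:$list}" ;;   # prepend, adding ":" only if needed
    esac
}

# Usage in ~/.bashrc or ~/.zshrc:
#   export PATH=$(prepend_path /usr/local/cuda/bin "$PATH")
#   export LD_LIBRARY_PATH=$(prepend_path /usr/local/cuda/lib64 "${LD_LIBRARY_PATH:-}")
```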

Option B: Using the Runfile Installer

1. Download the CUDA Toolkit runfile (`.run` extension) from the [CUDA Downloads] page, selecting the appropriate OS and version.
2. Make the runfile executable:

chmod +x cuda_XXX.XX_linux.run

3. Run the installer with root privileges. It's often recommended to *not* install the driver if you've already installed a compatible one.

sudo sh cuda_XXX.XX_linux.run

   Follow the on-screen prompts. When asked about installing the driver, choose "no" if you already have one installed and verified.

4. Set environment variables as described in Option A, Step 3.

Step 5: Verify CUDA Toolkit Installation

After installing the toolkit and setting environment variables, verify that CUDA is accessible.

1. Check nvcc version:

   The `nvcc` (NVIDIA CUDA Compiler) is the compiler for CUDA.

nvcc --version

   This should display the installed CUDA Toolkit version.
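If a script needs only the release number, it can be pulled out of the version banner. The sketch below assumes the banner contains a line of the form `Cuda compilation tools, release 12.2, V12.2.140`; the function name `toolkit_release` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract "MAJOR.MINOR" from `nvcc --version` output
# read on standard input.
toolkit_release() {
    # Assumed banner line: "Cuda compilation tools, release 12.2, V12.2.140"
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Usage:
#   nvcc --version | toolkit_release
```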

2. Compile and run CUDA Samples:

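   CUDA Toolkit releases before 11.6 ship sample applications in `/usr/local/cuda/samples`. Newer releases no longer bundle them; NVIDIA publishes them in the `NVIDIA/cuda-samples` repository on GitHub, where the directory layout and build steps may differ from the ones shown here.
   If your toolkit includes the samples, first navigate to the samples directory.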

cd /usr/local/cuda/samples

   Then build the samples. Running `make` in this directory compiles every sample, including deviceQuery, and can take several minutes:

sudo make && cd bin/x86_64/linux/release/
./deviceQuery

   This utility lists the CUDA-capable devices it detects along with their properties. You should see "Result = PASS" at the end.

   You can also run the `bandwidthTest` sample. The `make` invocation above already built it into the same release directory:

./bandwidthTest

   This tests memory bandwidth between the host and device. It should also report "Result = PASS".
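When running the samples from a script, the final "Result = PASS" line can be checked automatically. The following is a sketch only; `check_sample` is a hypothetical name, and it relies on the samples printing that exact string as described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a sample's output on standard input and report
# whether it contains the "Result = PASS" line.
check_sample() {
    local name=$1
    if grep -q 'Result = PASS'; then
        printf '%s: PASS\n' "$name"
    else
        printf '%s: FAIL\n' "$name"
        return 1
    fi
}

# Usage:
#   ./deviceQuery   | check_sample deviceQuery
#   ./bandwidthTest | check_sample bandwidthTest
```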

Troubleshooting

`nvidia-smi` command not found or fails:

   *   Ensure the NVIDIA driver is correctly installed and loaded. Rebooting often helps.
   *   Check if `/usr/bin/nvidia-smi` or a similar path exists.
   *   Verify that the `nvidia` kernel module is loaded:

lsmod | grep nvidia
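A plain `grep nvidia` also matches companion modules such as `nvidia_drm` or `nvidia_uvm`. As a sketch, the helper below matches the module name exactly against the first column of `lsmod` output; the name `module_loaded` is illustrative and not an existing command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the named module appears in the first
# column of lsmod output read on standard input.
module_loaded() {
    local module=$1
    # Exact match on the module name; exit status 0 when found, 1 otherwise.
    awk -v m="$module" '$1 == m { found = 1 } END { exit !found }'
}

# Usage:
#   lsmod | module_loaded nvidia || echo "nvidia kernel module is not loaded"
```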

`nvcc: command not found` or CUDA samples don't compile:

   *   Ensure the CUDA environment variables (`PATH` and `LD_LIBRARY_PATH`) are correctly set in your `~/.bashrc` or `~/.zshrc` and that you've sourced the file (`source ~/.bashrc`).
   *   Check if `/usr/local/cuda/bin/nvcc` exists.
   *   Verify that the CUDA Toolkit was installed correctly. Reinstall if necessary.

Driver/Toolkit Version Mismatch:

   *   The `nvidia-smi` output shows the maximum CUDA version supported by the driver. The CUDA Toolkit version you install should be less than or equal to this supported version. If you install a newer CUDA Toolkit than your driver supports, you might encounter issues. You may need to update your NVIDIA driver or install an older CUDA Toolkit.
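The comparison described above can be sketched in shell. The two helper names are hypothetical: `driver_cuda_version` assumes the `nvidia-smi` banner contains `CUDA Version: X.Y`, and `version_le` relies on the version ordering of GNU `sort -V`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the maximum supported CUDA version from the
# nvidia-smi banner read on standard input (assumes "CUDA Version: X.Y").
driver_cuda_version() {
    sed -n 's/.*CUDA Version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed when version $1 <= version $2.
# Uses GNU sort -V, so 12.4 sorts before 12.10.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage (the toolkit version here is an example value):
#   max=$(nvidia-smi | driver_cuda_version)
#   version_le 12.2 "$max" || echo "Toolkit 12.2 is newer than the driver supports ($max)"
```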

Secure Boot Issues:

   *   If you have Secure Boot enabled on your system, you might need to sign the NVIDIA kernel modules. This is often handled during driver installation, but if not, you might need to manually sign them or disable Secure Boot.
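To find out whether Secure Boot is active, the `mokutil` utility (which may need to be installed separately) reports the state with `mokutil --sb-state`. The sketch below assumes its output begins with `SecureBoot enabled` or `SecureBoot disabled`; the function name `secure_boot_enabled` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when `mokutil --sb-state` output, read on
# standard input, reports that Secure Boot is enabled.
secure_boot_enabled() {
    grep -qi '^SecureBoot enabled'
}

# Usage:
#   mokutil --sb-state | secure_boot_enabled \
#       && echo "Secure Boot is on: the NVIDIA kernel modules must be signed"
```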
