CUDA installation

This article details the installation of the Compute Unified Device Architecture (CUDA) toolkit on our servers. CUDA is a parallel computing platform and programming model developed by NVIDIA, and is essential for utilizing the GPU for accelerated computing tasks. This guide is geared towards new server engineers and assumes a basic understanding of Linux command-line operations. It covers installation on a Debian 11 (Bullseye) system, which is our standard server OS. Adjustments may be needed for other distributions. We will address download, installation, environment variable setup, and verification. See also GPU Acceleration Overview and Server Operating System Installation for related information.

Prerequisites

Before beginning the CUDA installation, ensure the following prerequisites are met:

- A supported NVIDIA GPU is installed on the server. Check GPU Hardware List for compatibility.
- The server is running Debian 11 (Bullseye) and you have root or sudo access.
- A stable internet connection is available for downloading the CUDA toolkit.
- The system is fully updated. Run `sudo apt update && sudo apt upgrade`.
- A compatible GCC compiler is installed (version 7.5 or higher is recommended). Check with `gcc --version`. If necessary, install GCC with `sudo apt install gcc g++`. See Compiler Installation Guide for details.
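
The GPU and compiler checks above can be scripted. The following is a minimal sketch, not an official tool: the `version_ge` helper and the `MIN_GCC` variable are names introduced here for illustration, and the version comparison assumes GNU `sort -V` is available (it is on Debian).

```bash
#!/usr/bin/env bash
# Pre-flight check (sketch). MIN_GCC mirrors the 7.5 recommendation above.
MIN_GCC="7.5"

# version_ge A B: succeeds when version A >= version B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Look for an NVIDIA device on the PCI bus (lspci comes from the pciutils package).
if command -v lspci >/dev/null 2>&1 && lspci | grep -qi 'nvidia'; then
    echo "NVIDIA device: found"
else
    echo "NVIDIA device: not detected (or lspci unavailable)"
fi

# Compare the installed GCC version against the minimum.
if command -v gcc >/dev/null 2>&1; then
    gcc_version="$(gcc -dumpfullversion 2>/dev/null || gcc -dumpversion)"
    if version_ge "$gcc_version" "$MIN_GCC"; then
        echo "GCC $gcc_version: OK"
    else
        echo "GCC $gcc_version: older than $MIN_GCC"
    fi
else
    echo "GCC: not installed"
fi
```

The script only reports what it finds; it does not install or change anything.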

Downloading the CUDA Toolkit

The CUDA toolkit can be downloaded from the NVIDIA developer website. Downloading the toolkit does not require an NVIDIA developer account. Navigate to the [CUDA Toolkit Archive](https://developer.nvidia.com/cuda-toolkit-archive) and select the release that matches your GPU and operating system. This guide uses CUDA 12.2.

Download the `runfile (local)` installer for Linux (e.g., `cuda_12.2.2_535.154.05_linux.run`; the exact filename depends on the release and its bundled driver version). We recommend downloading directly to the `/tmp` directory.
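
Because the installer is large, it is worth checking the download before running it. The sketch below assumes you have copied the installer's MD5 digest from NVIDIA's published checksum list into the hypothetical `CUDA_INSTALLER_MD5` variable; `verify_md5` is a helper name introduced here, not part of any NVIDIA tooling.

```bash
# verify_md5 FILE EXPECTED: succeeds when FILE's MD5 digest equals EXPECTED.
verify_md5() {
    actual="$(md5sum "$1" | awk '{print $1}')"
    [ "$actual" = "$2" ]
}

installer="/tmp/cuda_12.2.2_535.154.05_linux.run"
# Hypothetical variable: set it to the digest published for your download.
expected="${CUDA_INSTALLER_MD5:-}"

if [ -f "$installer" ] && [ -n "$expected" ]; then
    if verify_md5 "$installer" "$expected"; then
        echo "Checksum OK"
    else
        echo "Checksum mismatch: download the installer again"
    fi
else
    echo "Skipping check: installer or expected checksum missing"
fi
```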

Installation Process

1. **Navigate to the download directory:**

   ```bash
   cd /tmp
   ```

2. **Make the installer executable:**

   ```bash
   chmod +x cuda_12.2.2_535.154.05_linux.run
   ```

3. **Run the installer:**

   ```bash
   sudo ./cuda_12.2.2_535.154.05_linux.run
   ```

   The installer presents a series of prompts. Accept the license agreement and choose the installation directory; we recommend keeping the default, `/usr/local/cuda-12.2`. The installer also offers to install the bundled NVIDIA driver. If a compatible driver is already installed (confirm with `nvidia-smi`), decline this option. See NVIDIA Driver Installation for driver details.
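
For scripted deployments, the runfile installer can also run without prompts. The sketch below only prints the command rather than executing it. It uses the installer's `--silent` and `--toolkit` options to install the toolkit without the driver; `build_install_cmd` is a helper name introduced here for illustration.

```bash
# build_install_cmd RUNFILE: print the unattended, toolkit-only install command.
build_install_cmd() {
    printf 'sudo sh %s --silent --toolkit\n' "$1"
}

# A working nvidia-smi indicates that a driver is already installed and loaded.
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo "Driver detected; a toolkit-only install is appropriate:"
else
    echo "No working driver detected; install the driver first, then run:"
fi
build_install_cmd "/tmp/cuda_12.2.2_535.154.05_linux.run"
```

Printing the command first makes it easy to review before pasting it into a provisioning script.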

Post-Installation Configuration

After the installation completes, several environment variables need to be set up to ensure the CUDA toolkit is properly accessible.

1. **Add CUDA paths to the `~/.bashrc` file:**

   Open the `~/.bashrc` file using a text editor (e.g., `nano ~/.bashrc`) and add the following lines at the end of the file:

   ```bash
   export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}
   export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
   ```

   These lines prepend the CUDA binary and library directories to `PATH` and `LD_LIBRARY_PATH`, respectively, for the current user.

2. **Source the `~/.bashrc` file:**

   Run the following command to apply the changes:

   ```bash
   source ~/.bashrc
   ```
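
The `${VAR:+:${VAR}}` expansion in the export lines appends a colon and the old value only when the variable is already set and non-empty. This avoids a trailing colon, which would add an empty entry that is treated as the current directory. The sketch below illustrates the behaviour and then checks that `nvcc` resolves; `prepend_dir` is a helper name introduced here for illustration.

```bash
# prepend_dir DIR CURRENT: print DIR, followed by ":CURRENT" only if CURRENT is non-empty.
prepend_dir() {
    current="$2"
    printf '%s\n' "$1${current:+:${current}}"
}

# Empty previous value: no trailing colon is produced.
prepend_dir /usr/local/cuda-12.2/lib64 ""
# Non-empty previous value: the old entries follow after a colon.
prepend_dir /usr/local/cuda-12.2/bin "/usr/bin:/bin"

# Confirm that the shell now finds nvcc.
if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc resolves to: $(command -v nvcc)"
    nvcc --version
else
    echo "nvcc not found on PATH; re-check ~/.bashrc"
fi
```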

CUDA Toolkit Details

The CUDA toolkit includes various components. Here's a table outlining key components and their functionalities:

| Component | Description |
| --- | --- |
| nvcc | The NVIDIA CUDA Compiler. Used to compile CUDA C/C++ code. |
| cuda-gdb | The NVIDIA CUDA Debugger. Used for debugging CUDA applications. |
| compute-sanitizer | A memory and correctness checker for CUDA applications. It replaces `cuda-memcheck`, which was removed in CUDA 12.0. |
| cuBLAS | A library for Basic Linear Algebra Subroutines (BLAS). |
| cuFFT | A library for Fast Fourier Transforms (FFT). |
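
To confirm that these components were actually installed, a quick presence check can help. This is a sketch under the assumption that the toolkit lives in the default `/usr/local/cuda-12.2`; `CUDA_HOME` and `missing_tools` are names introduced here for illustration.

```bash
# Assumed install location; override CUDA_HOME if you chose another directory.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda-12.2}"

# missing_tools DIR TOOL...: print each TOOL that is not an executable file in DIR.
missing_tools() {
    dir="$1"; shift
    for tool in "$@"; do
        [ -x "$dir/$tool" ] || printf '%s\n' "$tool"
    done
}

missing="$(missing_tools "$CUDA_HOME/bin" nvcc cuda-gdb)"
if [ -z "$missing" ]; then
    echo "All checked tools found in $CUDA_HOME/bin"
else
    echo "Missing from $CUDA_HOME/bin:"
    echo "$missing"
fi

# The libraries are shared objects under lib64 rather than executables.
for lib in libcublas.so libcufft.so; do
    if [ -e "$CUDA_HOME/lib64/$lib" ]; then
        echo "$lib: found"
    else
        echo "$lib: not found"
    fi
done
```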

Supported CUDA Versions & Hardware

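The following table lists representative GPU architectures for each CUDA major version. It is not exhaustive: each release also supports older architectures (CUDA 12.x still supports Turing and Ampere GPUs, for example), so check NVIDIA's release notes for the authoritative list.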

| CUDA Version | Supported GPU Architecture | Example GPU |
| --- | --- | --- |
| 11.x | Turing, Ampere | GeForce RTX 2080, GeForce RTX 3090 |
| 12.x | Ada Lovelace, Hopper | GeForce RTX 4090, NVIDIA H100 |
| 13.x (Beta) | Blackwell | NVIDIA B100 (Future) |
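
To find out which architecture a server's GPU belongs to, you can query its compute capability. The sketch below assumes a driver recent enough for `nvidia-smi` to support the `compute_cap` query field. The `arch_for_cc` helper is introduced here for illustration and covers only the Turing-through-Hopper architectures from the table.

```bash
# arch_for_cc CC: map a compute capability to its architecture name.
arch_for_cc() {
    case "$1" in
        7.5)     echo "Turing" ;;
        8.0|8.6) echo "Ampere" ;;
        8.9)     echo "Ada Lovelace" ;;
        9.0)     echo "Hopper" ;;
        *)       echo "unknown" ;;
    esac
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Each output line looks like "<GPU name>, <compute capability>".
    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS=, read -r name cc; do
        cc="$(printf '%s' "$cc" | tr -d ' ')"
        echo "$name: compute capability $cc ($(arch_for_cc "$cc"))"
    done
else
    echo "nvidia-smi not available"
fi
```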

Verification

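To verify the installation, build and run the `deviceQuery` CUDA sample. Since CUDA 11.6 the samples are no longer shipped inside the toolkit directory; NVIDIA publishes them in the `cuda-samples` repository on GitHub. The steps below require `git` and assume that the repository's `v12.2` tag matches the installed toolkit.

1. **Fetch the samples and navigate to the deviceQuery directory:**

   ```bash
   git clone --depth 1 --branch v12.2 https://github.com/NVIDIA/cuda-samples.git
   cd cuda-samples/Samples/1_Utilities/deviceQuery
   ```

2. **Compile the deviceQuery sample:**

   ```bash
   make
   ```

3. **Run the deviceQuery executable:**

   ```bash
   ./deviceQuery
   ```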

   If the installation was successful, the `deviceQuery` program will display information about your NVIDIA GPU, including its name, compute capability, and memory size. See Troubleshooting CUDA Installation if issues arise.
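
In automated provisioning it can be convenient to check the result without reading the full output. `deviceQuery` ends its output with a `Result = PASS` line on success, so a minimal sketch might look like this; `device_query_passed` is a helper name introduced here for illustration.

```bash
# device_query_passed: read deviceQuery output on stdin; succeed if it reports PASS.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Run from the directory that contains the compiled deviceQuery binary.
if [ -x ./deviceQuery ]; then
    if ./deviceQuery | device_query_passed; then
        echo "CUDA verification passed"
    else
        echo "CUDA verification failed"
    fi
else
    echo "deviceQuery binary not found in the current directory"
fi
```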

System Specifications

Here's a table showing our standard server specifications for CUDA development:

| Component | Specification |
| --- | --- |
| CPU | Intel Xeon Gold 6248R |
| RAM | 128 GB DDR4 ECC |
| GPU | NVIDIA GeForce RTX 3090 |
| Storage | 2 TB NVMe SSD |
| Operating System | Debian 11 (Bullseye) |
