How to Install CUDA Toolkit: Complete Setup Guide

Chapter 0 — Install and configure the NVIDIA CUDA Toolkit on Windows and Linux

This guide walks you through installing the NVIDIA CUDA Toolkit — the compiler, runtime libraries, and developer tools needed to build GPU-accelerated programs. If you'd rather skip local setup entirely, the cuda.live Playground lets you write and run CUDA code in your browser on real cloud GPUs with zero installation.

No GPU? No Problem

Don't have an NVIDIA GPU locally? Skip this chapter and jump straight to Chapter 1 — Introduction to CUDA. The Playground gives you a real NVIDIA T4 GPU in the cloud.

System Requirements

Before installing, verify you meet these requirements:

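  • NVIDIA GPU — compute capability 5.0 or higher for CUDA 12.x (GeForce GTX 900 series and newer, all RTX cards, and Maxwell-or-later Quadro/Tesla). Kepler GPUs with compute capability 3.5, such as the GTX 780, are limited to CUDA 11.x. Check developer.nvidia.com/cuda-gpus for the full list.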
  • Operating System — Windows 10/11 (64-bit), Ubuntu 20.04/22.04/24.04, or another supported Linux distro. macOS has not been supported since CUDA 10.2.
  • Disk space — ~4 GB for a full CUDA Toolkit install.
  • C/C++ compiler — Visual Studio (Windows) or GCC (Linux), installed before the CUDA Toolkit.

macOS Not Supported

NVIDIA dropped CUDA support for macOS after CUDA 10.2. On Apple Silicon (M1/M2/M3), use Apple's Metal framework for GPU compute. For cross-platform CUDA learning, use cuda.live.

Installing CUDA on Windows 10 / Windows 11

Step 1 — Check Your GPU

Press Win + X → Device Manager → Display adapters. You should see an NVIDIA GPU listed (e.g., "NVIDIA GeForce RTX 4080"). If you see only integrated graphics, CUDA cannot run on that machine.

Step 2 — Install or Update NVIDIA Drivers

Download the latest driver from nvidia.com/drivers (Game Ready or Studio Driver). CUDA 12.x requires driver version ≥ 527.41. If your driver is already up to date, skip this step.

Step 3 — Download the CUDA Toolkit

Go to developer.nvidia.com/cuda-downloads and select:

  • Operating System: Windows
  • Architecture: x86_64
  • Version: 11 or 10 (your Windows version, not the CUDA version)
  • Installer Type: exe (local) — downloads the full installer offline

Step 4 — Run the Installer

Double-click the downloaded .exe. Choose Express (Recommended) to install CUDA, drivers, and Visual Studio integration in one go. The installer will automatically add CUDA to your PATH.

Step 5 — Verify the Installation

Open a new Command Prompt or PowerShell (important — old windows won't see the updated PATH) and run:

cmd
nvcc --version
# Expected output:
# nvcc: NVIDIA (R) Cuda compiler driver
# Copyright (c) 2005-2024 NVIDIA Corporation
# Built on ...
# Cuda compilation tools, release 12.x, V12.x.xxx

nvidia-smi
# Shows your GPU name, driver version, and CUDA version

Step 6 — Compile and Run a Test Program

cmd
# Run the prebuilt deviceQuery tool that ships in the toolkit's demo suite (nothing is compiled in this step)
cd "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\extras\demo_suite"
deviceQuery.exe
# Should print: Result = PASS

nvcc not found on Windows?

Manually add the CUDA bin path to your system PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\bin. Open System Properties → Environment Variables → Path → Edit → New, paste the path, click OK, then open a fresh terminal.

Installing CUDA on Ubuntu / Debian Linux

Ubuntu is the most common Linux distribution for CUDA development. These steps work for Ubuntu 20.04, 22.04, and 24.04. For other distros (Fedora, CentOS, RHEL), use the runfile installer from the CUDA Downloads page.

Step 1 — Install NVIDIA Drivers

terminal
# Verify an NVIDIA GPU is detected
lspci | grep -i nvidia

# Install the recommended driver (replace 535 with latest version)
sudo apt update
sudo apt install -y nvidia-driver-535

# Reboot to load the new driver
sudo reboot

After rebooting, verify the driver loaded:

terminal
nvidia-smi
# Should show GPU name, temperature, and driver version

Step 2 — Add the CUDA APT Repository

NVIDIA provides an official apt repository. Visit developer.nvidia.com/cuda-downloads and select Linux → x86_64 → Ubuntu → your version → deb (network) to get the exact commands for your distro. The generic Ubuntu 22.04 commands are:

terminal
# Download and install the CUDA keyring
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb

# Update package list
sudo apt update

# Install the CUDA Toolkit (replace 12-4 with the release you want)
sudo apt install -y cuda-toolkit-12-4
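
The Ubuntu 22.04 commands above differ for other releases only in the repository directory and the package suffix. The following is a minimal sketch of that mapping, assuming NVIDIA keeps the ubuntuXXYY directory naming and the cuda-keyring_1.1-1 file name shown above. The helper names repo_slug, toolkit_pkg, and keyring_url and the CUDA_RELEASE variable are made up for illustration; the CUDA Downloads page remains the authority for the exact commands.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: derive repo and package names from version numbers.

# "22.04" -> "ubuntu2204" (repository directory name)
repo_slug() { printf 'ubuntu%s\n' "$(printf '%s' "$1" | tr -d '.')"; }

# "12.4" -> "cuda-toolkit-12-4" (apt package name)
toolkit_pkg() { printf 'cuda-toolkit-%s\n' "$(printf '%s' "$1" | tr '.' '-')"; }

# Keyring URL for a given Ubuntu release (file name copied from the guide; it may change)
keyring_url() {
  printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/x86_64/cuda-keyring_1.1-1_all.deb\n' \
    "$(repo_slug "$1")"
}

# Only print the commands; nothing is downloaded or installed here.
ubuntu_version=""
if [ -r /etc/os-release ]; then
  ubuntu_version="$(. /etc/os-release; [ "${ID:-}" = ubuntu ] && printf '%s' "${VERSION_ID:-}" || true)"
fi
if [ -n "$ubuntu_version" ]; then
  echo "wget $(keyring_url "$ubuntu_version")"
  echo "sudo apt install -y $(toolkit_pkg "${CUDA_RELEASE:-12.4}")"
fi
```

Printing the commands instead of running them makes it easy to compare the result with what the Downloads page generates before anything touches the system.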

Step 3 — Set Environment Variables

Add CUDA to your PATH and LD_LIBRARY_PATH. Append these lines to your ~/.bashrc (or ~/.zshrc):

~/.bashrc
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

terminal
# Apply without restarting the shell
source ~/.bashrc

# Verify
nvcc --version
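
Every time ~/.bashrc is sourced again, plain export lines prepend the CUDA directories once more. As a sketch of one way to avoid the duplicates, the snippet below prepends a directory only when it is missing. The function name prepend_once and the CUDA_HOME variable are illustrative assumptions, not something the toolkit defines.

```shell
#!/usr/bin/env bash
# Prepend a directory to a colon-separated list only if it is not already present.
prepend_once() {
  local dir="$1" list="$2"
  case ":$list:" in
    *":$dir:"*) printf '%s' "$list" ;;               # already there: keep the list as-is
    *)          printf '%s' "${dir}${list:+:$list}" ;; # add it, avoiding a trailing colon
  esac
}

# CUDA_HOME is an assumed convenience variable; /usr/local/cuda is the path used in this guide.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
export PATH="$(prepend_once "$CUDA_HOME/bin" "$PATH")"
export LD_LIBRARY_PATH="$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")"
```

With this in place, running source ~/.bashrc repeatedly leaves PATH and LD_LIBRARY_PATH unchanged after the first time.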

Step 4 — Install GCC (if not already installed)

terminal
sudo apt install -y build-essential
gcc --version

Step 5 — Compile a Test Program

terminal
# Write a minimal CUDA test
cat > /tmp/hello.cu << 'EOF'
#include <stdio.h>
__global__ void hello() { printf("Hello from GPU thread %d\n", threadIdx.x); }
int main() { hello<<<1,4>>>(); cudaDeviceSynchronize(); return 0; }
EOF

# Compile with nvcc
nvcc /tmp/hello.cu -o /tmp/hello

# Run
/tmp/hello
# Hello from GPU thread 0
# Hello from GPU thread 1
# Hello from GPU thread 2
# Hello from GPU thread 3
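
If you want to automate this smoke test, for example in a provisioning script, the output can be checked instead of read by eye. The sketch below assumes the binary was built at /tmp/hello as above; check_hello_output is a hypothetical helper name. It sorts the lines before comparing, so it does not rely on the threads printing in order.

```shell
#!/usr/bin/env bash
# Succeed only if stdin holds exactly one greeting from each of threads 0..(n-1).
check_hello_output() {
  local n="$1" expected actual
  expected="$(seq 0 $((n - 1)) | sed 's/^/Hello from GPU thread /')"
  actual="$(sort -t' ' -k5,5n)"   # sort stdin numerically by the thread index (5th field)
  [ "$actual" = "$expected" ]
}

# Run the check only when the test binary from this step exists.
if [ -x /tmp/hello ]; then
  if /tmp/hello | check_hello_output 4; then
    echo "CUDA smoke test passed"
  else
    echo "CUDA smoke test failed: unexpected output from /tmp/hello"
  fi
fi
```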

CUDA on WSL2 (Windows Subsystem for Linux)

WSL2 has supported CUDA natively since Windows 11 and Windows 10 version 21H2. Your Windows NVIDIA driver already includes the WSL2 CUDA driver, so do not install a separate Linux driver inside WSL2.

wsl2 terminal
# Inside WSL2 — skip driver install, just add the toolkit repository
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt install -y cuda-toolkit-12-4

# Then set PATH as above and verify
nvcc --version
nvidia-smi   # Should show your Windows GPU from inside WSL2
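
The two verification commands report different things: nvcc prints the toolkit release you installed, while the nvidia-smi banner prints the highest CUDA version the driver supports. As a rough sketch, the snippet below extracts both numbers so they can be compared side by side. The function names are hypothetical, and the patterns assume the output formats shown earlier in this guide.

```shell
#!/usr/bin/env bash
# Extract "12.4" from a line such as "Cuda compilation tools, release 12.4, V12.4.131".
parse_nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Extract the "CUDA Version: X.Y" field from the nvidia-smi banner.
parse_smi_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Report both values when the tools are available; otherwise stay silent.
if command -v nvcc >/dev/null 2>&1 && command -v nvidia-smi >/dev/null 2>&1; then
  echo "Toolkit release (nvcc):         $(nvcc --version | parse_nvcc_release)"
  echo "Driver supports CUDA up to:     $(nvidia-smi | parse_smi_cuda_version)"
fi
```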

CUDA with Docker (NVIDIA Container Toolkit)

Docker is the cleanest way to manage CUDA environments — no PATH conflicts, easy version switching, reproducible builds.

Install NVIDIA Container Toolkit (Linux)

terminal
# Add NVIDIA container toolkit repo
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt update && sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Run a CUDA Container

terminal
# Pull and run an official CUDA development image
docker run --gpus all --rm nvidia/cuda:12.4.1-devel-ubuntu22.04 nvcc --version

# Interactive CUDA development shell
docker run --gpus all -it --rm \
  -v "$(pwd)":/workspace -w /workspace \
  nvidia/cuda:12.4.1-devel-ubuntu22.04 bash
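
Switching CUDA versions with Docker comes down to changing the image tag, which follows the pattern version-flavor-os seen in the commands above. The following is a small sketch of a wrapper around that idea. The names cuda_image and cuda_run are invented for this example, and it assumes the tag you ask for actually exists on Docker Hub.

```shell
#!/usr/bin/env bash
# Build an image reference in the nvidia/cuda tag pattern <version>-<flavor>-<os>.
# Flavor and OS default to the values used in this guide.
cuda_image() {
  local version="$1" flavor="${2:-devel}" os="${3:-ubuntu22.04}"
  printf 'nvidia/cuda:%s-%s-%s\n' "$version" "$flavor" "$os"
}

# Run a command against the current directory inside the given CUDA version.
cuda_run() {
  local version="$1"
  shift
  docker run --gpus all --rm \
    -v "$(pwd)":/workspace -w /workspace \
    "$(cuda_image "$version")" "$@"
}
```

For example, cuda_run 12.4.1 nvcc --version runs the compiler from the 12.4.1 development image without changing anything on the host.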

Common Installation Errors and Fixes

Error: nvcc: command not found

CUDA bin directory is not in PATH. Add export PATH=/usr/local/cuda/bin:$PATH to ~/.bashrc and run source ~/.bashrc.

Error: CUDA driver version is insufficient for CUDA runtime version

Your GPU driver is too old for the installed CUDA version. Update the driver from nvidia.com/drivers. The driver version must be ≥ the minimum required for your CUDA version (see CUDA release notes).
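
To see whether the installed driver clears a given minimum, the version strings can be compared directly. This is a minimal sketch: version_ge is a hypothetical helper built on GNU sort -V, and the MIN_DRIVER default is only a placeholder that you should replace with the value from the release notes for your CUDA version.

```shell
#!/usr/bin/env bash
# Return 0 when dotted version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder minimum (assumption); take the real value from the CUDA release notes.
MIN_DRIVER="${MIN_DRIVER:-525.60.13}"

if command -v nvidia-smi >/dev/null 2>&1; then
  installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  if version_ge "$installed" "$MIN_DRIVER"; then
    echo "Driver $installed meets the minimum $MIN_DRIVER"
  else
    echo "Driver $installed is older than $MIN_DRIVER; update it"
  fi
else
  echo "nvidia-smi not found; is the NVIDIA driver installed?"
fi
```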

Error: no kernel image is available for execution on the device

The binary contains no code that your GPU can run, typically because it was compiled for a different compute capability than the one your GPU has. Recompile with the architecture flag that matches your GPU:

terminal
# For an RTX 3080 (compute capability 8.6):
nvcc -arch=sm_86 mycode.cu -o mycode

# For an older GTX 1080 (compute capability 6.1):
nvcc -arch=sm_61 mycode.cu -o mycode

# Auto-detect and compile for the current GPU:
nvcc -arch=native mycode.cu -o mycode
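
When -arch=native is not an option, the flag can be derived from the compute capability that the driver reports. The sketch below assumes a driver recent enough that nvidia-smi accepts the compute_cap query field; cc_to_arch is a hypothetical helper name. It only prints the suggested command and does not compile anything.

```shell
#!/usr/bin/env bash
# Turn a compute capability such as "8.6" into the matching nvcc value "sm_86".
cc_to_arch() {
  printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # Query the first GPU only; older drivers may not know the compute_cap field.
  cc="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1)"
  if [ -n "$cc" ]; then
    echo "Suggested command: nvcc -arch=$(cc_to_arch "$cc") mycode.cu -o mycode"
  fi
fi
```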

Error: cudaErrorNoDevice — no CUDA-capable device is detected

Either the driver is not loaded or the GPU is not visible. On Linux, check:

terminal
# Check if nvidia kernel module is loaded
lsmod | grep nvidia

# If empty, load it:
sudo modprobe nvidia

# Verify GPU is visible
ls /dev/nvidia*

Summary

  • CUDA 12.x requires an NVIDIA GPU with compute capability ≥ 5.0 (compute capability 3.5 cards are limited to CUDA 11.x)
  • Windows: Download the exe installer from developer.nvidia.com/cuda-downloads → Express install → open new terminal → nvcc --version
  • Linux: Install driver → add apt repo → install cuda-toolkit → set PATH → nvcc --version
  • WSL2 inherits your Windows driver; only install the toolkit inside WSL2
  • Docker + NVIDIA Container Toolkit = cleanest multi-version setup
  • No GPU? Use cuda.live Playground — real T4 GPU, zero setup

Frequently Asked Questions

What NVIDIA GPU do I need for CUDA?

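For CUDA 12.x, any NVIDIA GPU with compute capability 5.0 or higher works: the GeForce GTX 900 series and newer, all RTX cards, and current datacenter GPUs. Older Kepler cards with compute capability 3.5, including parts of the GTX 700 series, are limited to CUDA 11.x. Check the full list at developer.nvidia.com/cuda-gpus.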

Does CUDA work on AMD or Intel GPUs?

No — CUDA is NVIDIA-exclusive. For AMD GPUs use ROCm/HIP; for Intel GPUs use oneAPI/SYCL. Alternatively, run CUDA code on real cloud GPUs via cuda.live.

Can I install CUDA without a GPU?

The CUDA Toolkit (compiler, headers, libraries) installs fine without a GPU. You can compile .cu files but cannot execute kernels. Use cuda.live to run on real GPUs from the browser.

What is the difference between CUDA Toolkit and CUDA drivers?

The CUDA driver is part of the NVIDIA GPU driver package; it is the low-level bridge between software and hardware. The CUDA Toolkit is the developer layer: the nvcc compiler, libraries such as cuBLAS, headers, and profilers. You need both, and the toolkit is downloaded separately from the driver. cuDNN is not part of the toolkit; it is a separate download.

Further Reading