Fixing “Segmentation fault” / “no CUDA-capable device is detected” for FaceFusion on Ubuntu

If the logs show:

Segmentation fault (core dumped)

and later:

CudaCall CUDA failure 100: no CUDA-capable device is detected ; GPU=-1 ; … cudaGetDeviceCount(&num_devices);

on Ubuntu 22.04 with an Nvidia RTX-4000, work through the steps below in order.

✅ Step-by-Step Instructions

# 1. Update system packages
sudo apt update && sudo apt upgrade

# 2. Remove old Nvidia/CUDA drivers completely
sudo apt autoremove --purge 'nvidia*'

# 3. Detect recommended Nvidia driver for your GPU
ubuntu-drivers devices
# Note the driver marked "recommended" in the output (e.g. nvidia-driver-550)

# 4. Install kernel headers & build tools
sudo apt install linux-headers-$(uname -r) build-essential

# 5. Reboot system
sudo reboot

# 6. After reboot, install CUDA toolkit
sudo apt install nvidia-cuda-toolkit

# 7. Install the Nvidia driver recommended in step 3
sudo apt install nvidia-driver-550   # or whatever version ubuntu-drivers recommended

# 8. Reboot again
sudo reboot

# 9. Verify GPU / driver / CUDA visibility

lspci | grep -i nvidia
nvidia-smi
nvcc --version
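
The checks in step 9 can be bundled into a few helper functions. This is a minimal sketch, not part of FaceFusion or any Nvidia tooling: the names `check_tool`, `driver_cuda_version`, `toolkit_cuda_version`, and `version_le` are hypothetical, and the parsing assumes the usual `CUDA Version: X.Y` field in the `nvidia-smi` header and the `release X.Y,` fragment in `nvcc --version` output.

```shell
#!/bin/sh
# Sketch only: helper functions for the step 9 checks.
# All function names are illustrative assumptions.

# Report whether a command is on PATH; returns non-zero if it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "MISSING: $1"
        return 1
    fi
}

# Read `nvidia-smi` output on stdin and print the highest CUDA version the
# driver supports (assumes the usual "CUDA Version: X.Y" header field).
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Read `nvcc --version` output on stdin and print the toolkit release
# (assumes the usual "release X.Y," fragment).
toolkit_cuda_version() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Succeed if version $1 <= version $2 (relies on GNU `sort -V`).
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

After the second reboot, `for t in lspci nvidia-smi nvcc; do check_tool "$t"; done` shows which tools are present, and `version_le "$(nvcc --version | toolkit_cuda_version)" "$(nvidia-smi | driver_cuda_version)"` succeeds when the installed toolkit is not newer than the CUDA version the driver reports.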

🔍 What This Fixes

  • Removes leftover, conflicting driver installations
  • Ensures the Nvidia kernel module loads (correct headers and a matching driver)
  • Ensures the CUDA toolkit and the driver are both installed and aligned
  • After the reboot, ONNX Runtime / FaceFusion should see the GPU, so the “GPU=-1” error should no longer appear
  • Should remove the segmentation fault at startup when CUDA is selected
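
To confirm that the Python side can use CUDA as well, one option is to ask ONNX Runtime which execution providers it offers. This is a minimal sketch: it assumes `python3` resolves to the interpreter FaceFusion runs under and that the `onnxruntime` package is importable there, and the function names are hypothetical. `onnxruntime.get_available_providers()` lists the providers available in the installed build, so seeing `CUDAExecutionProvider` is a necessary sign rather than proof that the device initialises.

```shell
#!/bin/sh
# Sketch only: function names are illustrative assumptions.

# Print the execution providers of the installed onnxruntime build.
# Assumes python3 is the interpreter FaceFusion uses.
list_ort_providers() {
    python3 -c 'import onnxruntime as ort; print(ort.get_available_providers())'
}

# Read a provider list on stdin; succeed if the CUDA provider is in it.
has_cuda_provider() {
    grep -q 'CUDAExecutionProvider'
}
```

Run `list_ort_providers | has_cuda_provider && echo "CUDA provider available"` inside the FaceFusion environment. If only `CPUExecutionProvider` is listed, the installed ONNX Runtime build is likely the CPU-only one, which is a separate problem from the driver setup above.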

📝 Example/Error Logs Involved

  • Original crash: Segmentation fault (core dumped)
  • After installing nvidia-cuda-toolkit:
    CudaCall CUDA failure 100: no CUDA-capable device is detected ; GPU=-1 ; … cudaGetDeviceCount(&num_devices);
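
When checking a saved log for these two signatures, a small filter can tell them apart. This is a sketch only; the function name `classify_failure` and the log file name in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch only: classify_failure is an illustrative name.

# Read a log on stdin and print a label for each known failure signature
# found in it. Prints nothing if neither signature is present.
classify_failure() {
    log_text="$(cat)"
    case "$log_text" in
        *"Segmentation fault"*) echo "segfault" ;;
    esac
    case "$log_text" in
        *"no CUDA-capable device is detected"*) echo "no-cuda-device" ;;
    esac
}
```

For example, `classify_failure < facefusion.log` prints `segfault`, `no-cuda-device`, both, or nothing, depending on which messages the log contains.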
    

That’s it. Run the commands in the order given and confirm the checks in step 9 before launching FaceFusion again.