I am going to buy a new graphics card and can’t choose between Nvidia and AMD. I know that Nvidia has a bad reputation in the Linux community, but how well does it really work? I also heard that their drivers got better recently. What can you recommend?
P.S. I don’t want any proprietary drivers, so I am talking about Nouveau or any other FOSS Nvidia driver, if one exists.
The only reason I still go Nvidia is because I self host AI, which afaik takes advantage of CUDA and just runs overall better on Nvidia cards, or at the very least is easier to set up. Really, the top reason is that it’s the devil I know right now.
If I didn’t self host AI, I would 100% go AMD, especially if you don’t want to use proprietary drivers. That being said, my old gaming laptop runs NixOS with Nouveau and there have definitely been improvements since I first tried it years ago, but I don’t do much gaming on it. It’s more a TV media station these days (so I can avoid the stupid smart TV bloat agenda, where your TV gets gradually slower and fits fewer and fewer of the ever more bloated apps over time).
If it’s just about self-hosting and not training, ROCm works perfectly fine for that. I self-host DeepSeek R1 32b and FLUX.1-dev on my 7900 XTX.
You even get more VRAM for cheaper.
I’m curious. Say you are getting a new computer, put Debian on it, and want to run e.g. DeepSeek via ollama in a container (e.g. Docker or podman) and also play games. How easy or difficult is that?
I know that for NVIDIA you install the (closed official) drivers, set up the container ensuring you get GPU passthrough, and thanks to CUDA from the driver, you’re pretty much good to go. Is it the same for AMD? Do you “just” need to install another package or is there more tinkering involved?
On the host system, you don’t need to do anything. AMDGPU and Mesa are included on most distros.
For LLMs you can go the easy route and just install the Alpaca flatpak and the AMD addon. It will work out of the box and uses ollama in the background.
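If you later want to talk to ollama directly instead of going through a GUI, here is a rough sketch of what that could look like. The assumptions are mine and not from the posts above: a stock ollama server on its default port 11434 (Alpaca's bundled instance may listen somewhere else) and a model pulled under the tag `deepseek-r1:32b`. The helper names `build_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Assumption: stock ollama server on its default port.
# Adjust the URL if your instance (e.g. one bundled with a GUI) differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="deepseek-r1:32b", url=OLLAMA_URL):
    """Build a non-streaming POST request for ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_request(prompt, **kwargs)
    # Large models can take a while to load, hence the generous timeout.
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.loads(response.read())["response"]
```

With the server running, `print(ask("Why is the sky blue?"))` should print the model's answer. Nothing in here is AMD-specific: the client does not care which GPU backend ollama uses.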
If you need a Docker container for it: AMD provides the handy rocm/dev-ubuntu-${UBUNTU_VERSION}:${ROCM_VERSION}-complete images. They contain all the required ROCm dependencies and runtimes, and you can just install your stuff on top of them.
As for GPU passthrough, all you need to do is add a device link for /dev/kfd and /dev/dri and you are set. For example, in a docker-compose.yml you just add this:
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
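For a plain `docker run` instead of compose, the same two device links can be passed with `--device`. Below is a small sketch that checks whether the nodes exist on the host and assembles the command. The function names are hypothetical, and the `--rm -it` flags are just my choice for an interactive test run.

```python
import os

# The two device nodes from the post: /dev/kfd is the amdgpu compute
# interface, /dev/dri holds the GPU render nodes.
ROCM_DEVICES = ("/dev/kfd", "/dev/dri")


def missing_devices(devices=ROCM_DEVICES):
    """Return the device paths that do not exist on this host."""
    return [dev for dev in devices if not os.path.exists(dev)]


def docker_run_command(image, devices=ROCM_DEVICES, extra_args=()):
    """Build a `docker run` argument list that passes the GPU devices through."""
    cmd = ["docker", "run", "--rm", "-it"]
    for dev in devices:
        # Same host:container mapping as the compose `devices:` entries.
        cmd += ["--device", f"{dev}:{dev}"]
    cmd += list(extra_args)
    cmd.append(image)
    return cmd
```

For example, `docker_run_command("rocm/dev-ubuntu-24.04:6.3-complete")` gives a list you can hand to `subprocess.run`. If `missing_devices()` is not empty, the amdgpu driver is probably not loaded on the host.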
For example, this is the entire Dockerfile needed to build ComfyUI from scratch with ROCm. The user/group commands are only needed to get the container groups to align with my Fedora host system.
    ARG UBUNTU_VERSION=24.04
    ARG ROCM_VERSION=6.3
    ARG BASE_ROCM_DEV_CONTAINER=rocm/dev-ubuntu-${UBUNTU_VERSION}:${ROCM_VERSION}-complete

    # For 6000 series
    #ARG ROCM_DOCKER_ARCH=gfx1030
    # For 7000 series
    ARG ROCM_DOCKER_ARCH=gfx1100

    FROM ${BASE_ROCM_DEV_CONTAINER}

    RUN apt-get update && apt-get install -y git python-is-python3 && rm -rf /var/lib/apt/lists/*
    RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.3 --break-system-packages

    # Change group IDs to match Fedora
    RUN groupmod -g 1337 irc && groupmod -g 105 render && groupmod -g 39 video

    # Rename user on newer 24.04 release and add to video/render group
    RUN usermod -l ai ubuntu && \
        usermod -d /home/ai -m ai && \
        usermod -a -G video ai && \
        usermod -a -G render ai

    USER ai
    WORKDIR /app

    ENV PATH="/home/ai/.local/bin:${PATH}"

    RUN git clone https://github.com/comfyanonymous/ComfyUI .
    RUN pip install -r requirements.txt --break-system-packages

    COPY start.sh /start.sh
    CMD /start.sh
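The group IDs in the `groupmod` line (105 for render, 39 for video) are specific to that Fedora host, so on another distro you would need your own numbers. Here is a minimal sketch for looking them up on the host. The helper names are hypothetical, and the sketch only prints the lines; it does not touch any Dockerfile.

```python
import grp


def host_gids(names=("render", "video")):
    """Map each group name to its numeric GID on this host (None if absent)."""
    gids = {}
    for name in names:
        try:
            gids[name] = grp.getgrnam(name).gr_gid
        except KeyError:
            # The group does not exist on this system.
            gids[name] = None
    return gids


def groupmod_lines(gids):
    """Format Dockerfile RUN lines that remap container groups to the given GIDs."""
    return [
        f"RUN groupmod -g {gid} {name}"
        for name, gid in gids.items()
        if gid is not None
    ]
```

Running `print("\n".join(groupmod_lines(host_gids())))` on the host prints lines you could paste in place of the Fedora-specific ones. If a target GID is already used by another group inside the image, that group has to be moved out of the way first, which is presumably what the `irc` remap in the Dockerfile is for.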
This is very good to know. I read that ROCm can be a pain to get up and running, but I read that months ago and this space is moving fast. I may switch over when I can if this is the case. My 3080 is feeling its age already. Thank you!
That used to be the case, yes.
Alpaca pretty much allows running LLMs out of the box on AMD after installing the ROCm addon in Discover/Software. LM Studio also works perfectly.
Image generation is a little bit more complicated. ComfyUI supports AMD when all ROCm dependencies are installed and the PyTorch version is swapped for the AMD version.
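A quick way to check whether the swap to the AMD PyTorch build actually took is to ask PyTorch itself. This is only a sketch with a made-up function name; it relies on `torch.version.hip` being set on ROCm builds and on ROCm builds reporting the GPU through the usual `torch.cuda` API.

```python
def rocm_status():
    """Report whether PyTorch is installed, built for ROCm, and sees a GPU."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "hip_version": None, "gpu_visible": False}

    # None on CUDA or CPU-only builds, a version string on ROCm builds.
    hip_version = getattr(torch.version, "hip", None)
    return {
        "torch_installed": True,
        "hip_version": hip_version,
        # ROCm builds reuse the torch.cuda namespace for AMD GPUs.
        "gpu_visible": bool(torch.cuda.is_available()),
    }
```

If `hip_version` comes back as `None`, pip most likely installed the default build instead of the ROCm one. If it is set but `gpu_visible` is `False`, the device links or group permissions inside the container are the usual suspects.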
However, ComfyUI provides no builds for Linux or AMD right now and you have to build it yourself. I currently use a simple Docker container for ComfyUI which just takes the AMD ROCm image and installs ComfyUI on top.
Definitely bookmarking this reply. I haven’t tried ComfyUI yet, but I’ve had it starred on GitHub from back when it was fairly new. I’m no stranger to building from source, but I have not dived into Docker yet, which is becoming more and more of a weakness by the day. Docker is sometimes required by some really cool projects and I’m missing out.