You can run local LLMs on AMD GPUs using Ollama. This tutorial covers Llama 3.2, Meta's compact, multimodal model family released on September 25, 2024, which is available in 1B, 3B, 11B, and 90B sizes. Below is a step-by-step guide to installing Ollama on Linux and Windows with Radeon GPUs, along with notes on running these models across different AMD hardware configurations.
Supported AMD GPUs
Ollama supports a variety of AMD GPU models, covering both current and older generations.
Ollama Installation & Setup Guide
Linux
System prerequisites
- Ubuntu 22.04.4
- A supported AMD GPU with the most recent version of AMD ROCm
- ROCm 6.1.3 installed (see the commands below)
- Ollama installed with a single command (see the commands below)
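A minimal sketch of both installs, assuming Ubuntu 22.04 ("jammy"): the Ollama one-liner is the official installer from ollama.com, while the ROCm package filename shown (amdgpu-install_6.1.60103-1_all.deb) is the 6.1.3 release artifact and should be verified against repo.radeon.com before use.

```bash
# Install ROCm 6.1.3 via the amdgpu-install package (Ubuntu 22.04 "jammy";
# verify the exact .deb filename at https://repo.radeon.com/amdgpu-install/)
wget https://repo.radeon.com/amdgpu-install/6.1.3/ubuntu/jammy/amdgpu-install_6.1.60103-1_all.deb
sudo apt install ./amdgpu-install_6.1.60103-1_all.deb
sudo amdgpu-install --usecase=rocm

# Install Ollama with a single command (official installer script)
curl -fsSL https://ollama.com/install.sh | sh
```

After a reboot, running rocminfo should list your GPU and confirm the ROCm stack is active.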
Windows
System prerequisites
- Windows 10 or later
- Installed drivers for a supported AMD GPU
- After installation, launch PowerShell.
- Run `ollama run llama3.2` (see the example below).
- That's all; you're ready to chat with your local LLM.
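In PowerShell, that looks like the following; the plain llama3.2 tag pulls the default 3B model, and llama3.2:1b selects the smaller variant.

```powershell
# Download (on first run) and start an interactive chat with Llama 3.2 (3B default)
ollama run llama3.2

# Or pull the 1B variant explicitly
ollama run llama3.2:1b
```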
AMD ROCm Supported GPUs
Installing Radeon Software for Linux with ROCm
The amdgpu-install script installs a cohesive collection of stack components, including the ROCm software stack and other Radeon Software for Linux components. It simplifies installation of the AMD GPU stack by encapsulating the distribution-specific package installation logic and exposing command-line arguments that let you select:

- The use case of the AMD GPU stack to install (for example, graphics or workstation).
- The combination of components (user selection or Pro stack).

It also performs post-install checks to confirm that the installation succeeded, and it installs an uninstallation script that lets you remove the entire AMD GPU stack with a single command (see the sketch after this section).
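As a sketch, the invocations below show how those selections map onto amdgpu-install's flags; the --usecase and --vulkan flags are part of AMD's documented interface, but run the --list-usecase query first to see what your driver release actually offers.

```bash
# Show the use cases supported by your installed amdgpu-install release
sudo amdgpu-install --list-usecase

# Install the graphics use case with the default (user) component selection
sudo amdgpu-install --usecase=graphics

# Install the workstation use case with the Pro stack Vulkan variant
sudo amdgpu-install --usecase=workstation --vulkan=pro

# Remove the entire AMD GPU stack with the bundled uninstaller
sudo amdgpu-uninstall
```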
AMD Radeon GPUs
Ollama supports the following AMD GPUs:
Linux Support
Family | Cards and accelerators
---|---
AMD Radeon RX | 7900 XTX, 7900 XT, 7900 GRE, 7800 XT, 7700 XT, 7600 XT, 7600, 6950 XT, 6900 XTX, 6900 XT, 6800 XT, 6800, Vega 64, Vega 56
AMD Radeon PRO | W7900, W7800, W7700, W7600, W7500, W6900X, W6800X Duo, W6800X, W6800, V620, V420, V340, V320, Vega II Duo, Vega II, VII, SSG
AMD Instinct | MI300X, MI300A, MI300, MI250X, MI250, MI210, MI200, MI100, MI60, MI50
Linux Overrides
Ollama relies on the AMD ROCm library, which does not support every AMD GPU. In some cases you can force the system to try a nearby, comparable LLVM target. For example, the Radeon RX 5400 is gfx1034 (also referred to as 10.3.4), a target ROCm does not yet support; the closest supported target is gfx1030. You can use the environment variable HSA_OVERRIDE_GFX_VERSION with x.y.z syntax. For instance, setting HSA_OVERRIDE_GFX_VERSION="10.3.0" in the server's environment forces the system to run on the RX 5400.
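For example, to force the RX 5400 onto the gfx1030 target (a sketch; ollama.service is the systemd unit the Linux installer creates):

```bash
# One-off: launch the server with the override in its environment
HSA_OVERRIDE_GFX_VERSION="10.3.0" ollama serve

# Persistent: add the variable to the systemd unit the installer created
sudo systemctl edit ollama.service
# ...then add under [Service]:
#   Environment="HSA_OVERRIDE_GFX_VERSION=10.3.0"
sudo systemctl restart ollama
```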
AMD continues to improve ROCm, and future v6 releases are expected to support a greater number of GPU families.
GPU Selection
To restrict Ollama to a subset of your system's AMD GPUs, set HIP_VISIBLE_DEVICES to a comma-separated list of GPU IDs. You can see the list of devices with rocminfo. To force CPU usage and ignore the GPUs entirely, use an invalid GPU ID (such as "-1").
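For example, assuming rocminfo reports your first GPU as device 0:

```bash
# List your AMD devices; note the GPU IDs in the output
rocminfo

# Restrict Ollama to the first GPU only
HIP_VISIBLE_DEVICES=0 ollama serve

# Force CPU inference by supplying an invalid GPU ID
HIP_VISIBLE_DEVICES=-1 ollama serve
```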
Permission for Containers
On some Linux distributions, SELinux can prevent containers from accessing AMD GPU hardware. To allow containers to use the devices, run `sudo setsebool container_use_devices=1` on the host system.
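For example, with Docker and Ollama's ROCm image (a sketch; /dev/kfd and /dev/dri are the standard device nodes for ROCm passthrough):

```bash
# Allow containers to access host devices under SELinux
sudo setsebool container_use_devices=1

# Run Ollama's ROCm image with the AMD GPU device nodes passed through
docker run -d --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm
```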
Metal: Apple GPUs
Ollama supports GPU acceleration on Apple devices through the Metal API.
In summary
Ollama's broad AMD GPU support shows how accessible running LLMs locally has become. With options ranging from high-end AMD Instinct accelerators to consumer-grade AMD Radeon RX graphics cards, users can run models like Llama 3.2 on their own hardware. This flexible approach enables greater customization, privacy, and experimentation with AI applications across a range of industries.