
Instinct MI100 for local AI

The Instinct MI100 provides 32 GB of VRAM for local AI. Of the 242 models in the LocalIA catalog, 207 run comfortably on a single card.

VRAM: 32 GB
Category: Datacenter
Series: Instinct CDNA 1-2
Vendor: AMD

Models that run comfortably

These models fit in 32 GB with room for context and stable inference.

Falcon 40B (falcon) · 25.1 GB · Q4 / 32 GB · comfortable
Command R 35B (command) · 26.9 GB · Q5 / 32 GB · comfortable
Aya 23 35B (aya) · 26.9 GB · Q5 / 32 GB · comfortable
CodeLlama 34B (codellama) · 26.1 GB · Q5 / 32 GB · comfortable
Yi 1.5 34B (yi) · 26.1 GB · Q5 / 32 GB · comfortable
dolphin 2.9.1 yi 1.5 34b (yi) · 26.1 GB · Q5 / 32 GB · comfortable
Qwen 2.5 32B (qwen) · 24.6 GB · Q5 / 32 GB · comfortable
Qwen 2.5 Coder 32B (qwen) · 24.6 GB · Q5 / 32 GB · comfortable
Qwen 3 32B (qwen) · 24.6 GB · Q5 / 32 GB · comfortable
QwQ 32B (qwq) · 24.6 GB · Q5 / 32 GB · comfortable
DeepSeek R1 Distill 32B (deepseek) · 24.6 GB · Q5 / 32 GB · comfortable
Qwen 2.5 VL 32B (qwen) · 24.6 GB · Q5 / 32 GB · comfortable
Granite 4 H-Small 32B-A9B (granite) · 24.6 GB · Q5 / 32 GB · comfortable
GLM-4.6 (glm) · 24.6 GB · Q5 / 32 GB · comfortable
GLM-4.7 (glm) · 24.6 GB · Q5 / 32 GB · comfortable
GLM-5 (glm) · 24.6 GB · Q5 / 32 GB · comfortable
GLM-5.1 (glm) · 24.6 GB · Q5 / 32 GB · comfortable
Qwen3 32B (qwen) · 24.6 GB · Q5 / 32 GB · comfortable
Qwen2.5 Coder 32B Instruct (qwen) · 24.6 GB · Q5 / 32 GB · comfortable
DeepSeek R1 Distill Qwen 32B (qwen) · 24.6 GB · Q5 / 32 GB · comfortable
Qwen2.5 32B Instruct (qwen) · 24.6 GB · Q5 / 32 GB · comfortable
Gemma 4 31B (gemma) · 23.8 GB · Q5 / 32 GB · comfortable
Qwen 3 30B A3B (qwen) · 23.1 GB · Q5 / 32 GB · comfortable
MPT 30B (mpt) · 23.1 GB · Q5 / 32 GB · comfortable
Qwen3 Coder 30B A3B Instruct (qwen) · 23.1 GB · Q5 / 32 GB · comfortable
Qwen3 30B A3B (qwen) · 23.1 GB · Q5 / 32 GB · comfortable
Qwen3 30B A3B Instruct 2507 (qwen) · 23.1 GB · Q5 / 32 GB · comfortable
NVIDIA Nemotron 3 Nano 30B A3B BF16 (nemotron) · 23.1 GB · Q5 / 32 GB · comfortable
Gemma 2 27B (gemma) · 20.7 GB · Q5 / 32 GB · comfortable
Gemma 3 27B (gemma) · 20.7 GB · Q5 / 32 GB · comfortable
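The sizes above follow a familiar rule of thumb: quantized weight size ≈ parameter count × effective bits per weight ÷ 8. A minimal sketch in Python; the effective-bpw values are assumptions inferred from the figures on this page (not LocalIA's published formula), and the 4 GB headroom threshold is likewise a hypothetical illustration of the comfortable/tight split:

```python
# Effective bits per weight inferred from this page's figures (assumptions):
# K-quants store scales alongside weights, so Q4 lands near ~5.0 bpw
# and Q5 near ~6.15 bpw in practice.
EFFECTIVE_BPW = {"Q4": 5.0, "Q5": 6.15}


def model_size_gb(params_billion: float, quant: str) -> float:
    """Rough quantized weight size in GB: params x bits-per-weight / 8."""
    return params_billion * EFFECTIVE_BPW[quant] / 8


def fits(params_billion: float, quant: str, vram_gb: float,
         headroom_gb: float = 4.0) -> str:
    """Classify fit: 'comfortable' leaves headroom for KV cache and context.

    The 4 GB headroom cutoff is a hypothetical value for illustration.
    """
    size = model_size_gb(params_billion, quant)
    if size + headroom_gb <= vram_gb:
        return "comfortable"
    if size <= vram_gb:
        return "tight"
    return "too large"
```

For example, a 32B model at Q5 estimates to about 24.6 GB, matching the rows above, and a ~47B Mixtral-class model at Q4 (~29.5 GB) classifies as tight on a 32 GB card.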

Tight models

These models barely fit. They can run, but context and speed will be limited.

Mixtral 8x7B (mistral) · 29.5 GB · Q4 / 32 GB · tight
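Why a tight fit limits context: whatever VRAM the weights leave free must hold the KV cache, which grows linearly with context length. A back-of-the-envelope sketch; the Mixtral-like geometry (32 layers, 8 KV heads via GQA, head dim 128) and the ~2 GB of leftover headroom are illustrative assumptions, not figures from this page:

```python
def kv_bytes_per_token(n_layers: int, n_kv_heads: int,
                       head_dim: int, dtype_bytes: int = 2) -> int:
    """Per-token KV-cache cost: keys and values (factor 2) for every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * dtype_bytes


# Mixtral-8x7B-like geometry (illustrative): 32 layers, 8 KV heads (GQA),
# head_dim 128, fp16 cache entries -> 131072 bytes per token of context.
per_token = kv_bytes_per_token(32, 8, 128)

# If ~29.5 GB of a 32 GB card is taken by weights, roughly 2 GB of
# headroom supports only a modest context window (~15k tokens).
max_tokens = int(2e9 / per_token)
```

The same arithmetic explains why the "comfortable" rows, which leave several GB free, can sustain longer contexts.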

Unlocked in a 2x rig

With two cards in parallel (64 GB total), larger models become reachable.

Qwen3 Next 80B A3B Instruct (qwen) · 50.3 GB · Q4 / 64 GB · comfortable
Qwen 2.5 72B (qwen) · 45.3 GB · Q4 / 64 GB · comfortable
Qwen 2.5 VL 72B (qwen) · 45.3 GB · Q4 / 64 GB · comfortable
Qwen2.5 72B Instruct (qwen) · 45.3 GB · Q4 / 64 GB · comfortable
Llama 2 70B (llama) · 53.8 GB · Q5 / 64 GB · comfortable
Llama 3 70B (llama) · 53.8 GB · Q5 / 64 GB · comfortable
Llama 3.1 70B (llama) · 53.8 GB · Q5 / 64 GB · comfortable
Llama 3.3 70B (llama) · 53.8 GB · Q5 / 64 GB · comfortable
CodeLlama 70B (codellama) · 53.8 GB · Q5 / 64 GB · comfortable
DeepSeek R1 Distill 70B (deepseek) · 53.8 GB · Q5 / 64 GB · comfortable
Hermes 3 70B (hermes) · 53.8 GB · Q5 / 64 GB · comfortable
Llama 3.1 Nemotron 70B (nemotron) · 53.8 GB · Q5 / 64 GB · comfortable
Athene 70B (athene) · 53.8 GB · Q5 / 64 GB · comfortable
Llama 3.3 70B Instruct (llama) · 53.8 GB · Q5 / 64 GB · comfortable
Llama 3.1 70B Instruct (llama) · 53.8 GB · Q5 / 64 GB · comfortable

Unlocked in a 4x rig

A server-style configuration (128 GB total) brings the largest open-weight models within reach.

Falcon 180B (falcon) · 113.2 GB · Q4 / 128 GB · tight
Mixtral 8x22B (mistral) · 108.3 GB · Q5 / 128 GB · comfortable
Mistral Large 123B (mistral) · 94.5 GB · Q5 / 128 GB · comfortable
NVIDIA Nemotron 3 Super 120B A12B BF16 (nemotron) · 92.2 GB · Q5 / 128 GB · comfortable
Llama 4 Scout 17Bx16 (llama) · 83.7 GB · Q5 / 128 GB · comfortable
Command R+ 104B (command) · 79.9 GB · Q5 / 128 GB · comfortable

VRAM estimates updated 2026-05-12.