RTX 3090 for local AI

The RTX 3090 provides 24 GB of VRAM, enough for most mid-size open-weight models to run locally. In the LocalIA catalog, 201 of 242 models run comfortably on a single card.

VRAM: 24 GB
Category: Consumer
Series: RTX 30
Vendor: NVIDIA

Models that run comfortably

These models fit in 24 GB with room for context and stable inference.

| Model | Family | Quant | VRAM (est.) | Fit |
| --- | --- | --- | --- | --- |
| Qwen 2.5 32B | qwen | Q4 | 20.1 / 24 GB | comfortable |
| Qwen 2.5 Coder 32B | qwen | Q4 | 20.1 / 24 GB | comfortable |
| Qwen 3 32B | qwen | Q4 | 20.1 / 24 GB | comfortable |
| QwQ 32B | qwq | Q4 | 20.1 / 24 GB | comfortable |
| DeepSeek R1 Distill 32B | deepseek | Q4 | 20.1 / 24 GB | comfortable |
| Qwen 2.5 VL 32B | qwen | Q4 | 20.1 / 24 GB | comfortable |
| Granite 4 H-Small 32B-A9B | granite | Q4 | 20.1 / 24 GB | comfortable |
| GLM-4.6 | glm | Q4 | 20.1 / 24 GB | comfortable |
| GLM-4.7 | glm | Q4 | 20.1 / 24 GB | comfortable |
| GLM-5 | glm | Q4 | 20.1 / 24 GB | comfortable |
| GLM-5.1 | glm | Q4 | 20.1 / 24 GB | comfortable |
| Qwen3 32B | qwen | Q4 | 20.1 / 24 GB | comfortable |
| Qwen2.5 Coder 32B Instruct | qwen | Q4 | 20.1 / 24 GB | comfortable |
| DeepSeek R1 Distill Qwen 32B | qwen | Q4 | 20.1 / 24 GB | comfortable |
| Qwen2.5 32B Instruct | qwen | Q4 | 20.1 / 24 GB | comfortable |
| Gemma 4 31B | gemma | Q4 | 19.5 / 24 GB | comfortable |
| Qwen 3 30B A3B | qwen | Q4 | 18.9 / 24 GB | comfortable |
| MPT 30B | mpt | Q4 | 18.9 / 24 GB | comfortable |
| Qwen3 Coder 30B A3B Instruct | qwen | Q4 | 18.9 / 24 GB | comfortable |
| Qwen3 30B A3B | qwen | Q4 | 18.9 / 24 GB | comfortable |
| Qwen3 30B A3B Instruct 2507 | qwen | Q4 | 18.9 / 24 GB | comfortable |
| NVIDIA Nemotron 3 Nano 30B A3B BF16 | nemotron | Q4 | 18.9 / 24 GB | comfortable |
| Gemma 2 27B | gemma | Q4 | 17.0 / 24 GB | comfortable |
| Gemma 3 27B | gemma | Q4 | 17.0 / 24 GB | comfortable |
| Gemma 4 26B A4B | gemma | Q5 | 20.0 / 24 GB | comfortable |
| Mistral Small 3 24B | mistral | Q5 | 18.4 / 24 GB | comfortable |
| Mistral Small 3.1 24B | mistral | Q5 | 18.4 / 24 GB | comfortable |
| Mistral Small 3.2 24B | mistral | Q5 | 18.4 / 24 GB | comfortable |
| Devstral Small 2 24B | devstral | Q5 | 18.4 / 24 GB | comfortable |
| Mistral Small 22B | mistral | Q5 | 16.9 / 24 GB | comfortable |
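The sizes above follow a simple bits-per-weight rule of thumb. The sketch below is an approximation, not the catalog's exact method: the ~5.0 bits/weight for Q4 and ~6.1 for Q5 are assumptions modeled on typical K-quant GGUF files, which keep some tensors at higher precision than the nominal 4 or 5 bits.

```python
def quantized_size_gb(params_b: float, bits_per_weight: float = 5.0) -> float:
    """Approximate weight footprint of a quantized model in GB.

    bits_per_weight is an assumption: ~5.0 for Q4-class files,
    ~6.1 for Q5-class (effective rate exceeds the nominal bits
    because some tensors stay at higher precision).
    """
    return params_b * bits_per_weight / 8

print(round(quantized_size_gb(32), 1))       # 20.0 -> near the ~20.1 GB listed for 32B Q4
print(round(quantized_size_gb(24, 6.1), 1))  # 18.3 -> near Mistral Small's 18.4 GB at Q5
```

This is weights only; the runtime adds KV cache and buffers on top, which is why a 22 GB model on a 24 GB card counts as tight rather than comfortable.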

Tight models

These models barely fit. They can run, but context and speed will be limited.

| Model | Family | Quant | VRAM (est.) | Fit |
| --- | --- | --- | --- | --- |
| Command R 35B | command | Q4 | 22.0 / 24 GB | tight |
| Aya 23 35B | aya | Q4 | 22.0 / 24 GB | tight |
| CodeLlama 34B | codellama | Q4 | 21.4 / 24 GB | tight |
| Yi 1.5 34B | yi | Q4 | 21.4 / 24 GB | tight |
| dolphin 2.9.1 yi 1.5 34b | yi | Q4 | 21.4 / 24 GB | tight |
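Why a tight fit costs context: whatever VRAM the weights leave free must hold the KV cache, which grows linearly with context length. A rough sketch, where the layer count, KV-head count, and head dimension are illustrative defaults, not taken from any particular model:

```python
def max_context_tokens(free_vram_gb: float,
                       n_layers: int = 60,
                       n_kv_heads: int = 8,
                       head_dim: int = 128,
                       bytes_per_val: int = 2) -> int:
    """Tokens of FP16 KV cache that fit in the given free VRAM.

    Per token: 2 (K and V) * layers * kv_heads * head_dim * bytes.
    Architecture defaults are illustrative assumptions (GQA-style model).
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_val
    return int(free_vram_gb * 1e9 // per_token)

# A 22.0 GB model on a 24 GB card leaves ~2 GB for cache:
print(max_context_tokens(2.0))  # 8138
```

Under these assumptions, a tight model caps out around 8K tokens of context, while a comfortable one (4-7 GB free) supports several times that.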

Unlocked in a 2x rig

With two cards in parallel (48 GB total), larger models become reachable.

| Model | Family | Quant | VRAM (est.) | Fit |
| --- | --- | --- | --- | --- |
| Qwen 2.5 72B | qwen | Q4 | 45.3 / 48 GB | tight |
| Qwen 2.5 VL 72B | qwen | Q4 | 45.3 / 48 GB | tight |
| Qwen2.5 72B Instruct | qwen | Q4 | 45.3 / 48 GB | tight |
| Llama 2 70B | llama | Q4 | 44.0 / 48 GB | tight |
| Llama 3 70B | llama | Q4 | 44.0 / 48 GB | tight |
| Llama 3.1 70B | llama | Q4 | 44.0 / 48 GB | tight |
| Llama 3.3 70B | llama | Q4 | 44.0 / 48 GB | tight |
| CodeLlama 70B | codellama | Q4 | 44.0 / 48 GB | tight |
| DeepSeek R1 Distill 70B | deepseek | Q4 | 44.0 / 48 GB | tight |
| Hermes 3 70B | hermes | Q4 | 44.0 / 48 GB | tight |
| Llama 3.1 Nemotron 70B | nemotron | Q4 | 44.0 / 48 GB | tight |
| Athene 70B | athene | Q4 | 44.0 / 48 GB | tight |
| Llama 3.3 70B Instruct | llama | Q4 | 44.0 / 48 GB | tight |
| Llama 3.1 70B Instruct | llama | Q4 | 44.0 / 48 GB | tight |
| Mixtral 8x7B | mistral | Q5 | 36.1 / 48 GB | comfortable |
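Splitting a model across two cards roughly halves the per-GPU weight load, but each card still carries its own activation buffers and runtime overhead. A back-of-envelope sketch (the ~1 GB per-GPU overhead is an assumption, not a measured figure):

```python
def per_gpu_gb(model_gb: float, n_gpus: int, overhead_gb: float = 1.0) -> float:
    """Approximate VRAM needed per GPU with weights split evenly.

    overhead_gb covers activations/buffers replicated on every card
    (assumed value; actual overhead varies by runtime and context size).
    """
    return model_gb / n_gpus + overhead_gb

# A 70B model at Q4 (44.0 GB above) across two 24 GB cards:
print(round(per_gpu_gb(44.0, 2), 1))  # 23.0 -> tight against 24 GB
```

This is why the 70B-class entries land in the tight column even on a 2x rig: each card sits within about 1 GB of its ceiling before any KV cache is allocated.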

Unlocked in a 4x rig

A server-style configuration (96 GB total) reaches some of the largest open-weight models.

| Model | Family | Quant | VRAM (est.) | Fit |
| --- | --- | --- | --- | --- |
| Mixtral 8x22B | mistral | Q4 | 88.6 / 96 GB | tight |
| Mistral Large 123B | mistral | Q4 | 77.3 / 96 GB | comfortable |
| NVIDIA Nemotron 3 Super 120B A12B BF16 | nemotron | Q4 | 75.4 / 96 GB | comfortable |
| Llama 4 Scout 17Bx16 | llama | Q4 | 68.5 / 96 GB | comfortable |
| Command R+ 104B | command | Q5 | 79.9 / 96 GB | comfortable |
| Qwen3 Next 80B A3B Instruct | qwen | Q5 | 61.5 / 96 GB | comfortable |

VRAM estimates updated 2026-05-12.