
NVIDIA H100 80GB for local AI

The NVIDIA H100 80GB offers 80 GB of VRAM for local AI inference. Of the 242 models in the LocalIA catalog, 224 run comfortably on a single card.

VRAM: 80 GB
Category: Datacenter
Series: Hopper
Vendor: NVIDIA
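
Throughout this page each model carries a fit label: "comfortable" or "tight". A minimal sketch of how such a label could be assigned; the ~85% headroom threshold is an assumption inferred from the rows below, not a documented LocalIA rule.

```python
# Minimal fit-label sketch. The ~85% comfort threshold is an assumption
# inferred from this page's tables, not a documented LocalIA rule.

def fit_label(model_vram_gb: float, card_vram_gb: float = 80.0,
              comfort_ratio: float = 0.85) -> str:
    """Classify a model's estimated footprint against a card's VRAM."""
    if model_vram_gb > card_vram_gb:
        return "does not fit"
    if model_vram_gb > comfort_ratio * card_vram_gb:
        return "tight"        # fits, but little headroom left for context
    return "comfortable"      # fits with room for KV cache and stability

print(fit_label(65.4))   # Command R+ 104B at Q4    -> comfortable
print(fit_label(77.3))   # Mistral Large 123B at Q4 -> tight
```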

Models that run comfortably

These models fit in 80 GB with room for context and stable inference.

Model                           Family      Quant   VRAM (est.)       Fit
Command R+ 104B                 command     Q4      65.4 / 80 GB      comfortable
Qwen3 Next 80B A3B Instruct     qwen        Q5      61.5 / 80 GB      comfortable
Qwen 2.5 72B                    qwen        Q5      55.3 / 80 GB      comfortable
Qwen 2.5 VL 72B                 qwen        Q5      55.3 / 80 GB      comfortable
Qwen2.5 72B Instruct            qwen        Q5      55.3 / 80 GB      comfortable
Llama 2 70B                     llama       Q5      53.8 / 80 GB      comfortable
Llama 3 70B                     llama       Q5      53.8 / 80 GB      comfortable
Llama 3.1 70B                   llama       Q5      53.8 / 80 GB      comfortable
Llama 3.3 70B                   llama       Q5      53.8 / 80 GB      comfortable
CodeLlama 70B                   codellama   Q5      53.8 / 80 GB      comfortable
DeepSeek R1 Distill 70B         deepseek    Q5      53.8 / 80 GB      comfortable
Hermes 3 70B                    hermes      Q5      53.8 / 80 GB      comfortable
Llama 3.1 Nemotron 70B          nemotron    Q5      53.8 / 80 GB      comfortable
Athene 70B                      athene      Q5      53.8 / 80 GB      comfortable
Llama 3.3 70B Instruct          llama       Q5      53.8 / 80 GB      comfortable
Llama 3.1 70B Instruct          llama       Q5      53.8 / 80 GB      comfortable
Mixtral 8x7B                    mistral     Q8      52.5 / 80 GB      comfortable
Falcon 40B                      falcon      Q8      44.7 / 80 GB      comfortable
Command R 35B                   command     Q8      39.1 / 80 GB      comfortable
Aya 23 35B                      aya         Q8      39.1 / 80 GB      comfortable
CodeLlama 34B                   codellama   Q8      38.0 / 80 GB      comfortable
Yi 1.5 34B                      yi          Q8      38.0 / 80 GB      comfortable
Dolphin 2.9.1 Yi 1.5 34B        yi          Q8      38.0 / 80 GB      comfortable
Qwen 2.5 32B                    qwen        Q8      35.8 / 80 GB      comfortable
Qwen 2.5 Coder 32B              qwen        Q8      35.8 / 80 GB      comfortable
Qwen 3 32B                      qwen        Q8      35.8 / 80 GB      comfortable
QwQ 32B                         qwq         Q8      35.8 / 80 GB      comfortable
DeepSeek R1 Distill 32B         deepseek    Q8      35.8 / 80 GB      comfortable
Qwen 2.5 VL 32B                 qwen        Q8      35.8 / 80 GB      comfortable
Granite 4 H-Small 32B-A9B       granite     Q8      35.8 / 80 GB      comfortable
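
The estimates above track parameter count and quantization closely. A rough reconstruction, fitted to the rows on this page rather than taken from any documented formula: weights cost about params × effective-bits / 8 GB, plus roughly 12% runtime overhead.

```python
# Reverse-engineered approximation of the catalog's VRAM estimates.
# Effective bits per weight and the overhead factor are fitted to the
# table above; they are assumptions, not an official LocalIA formula.

EFFECTIVE_BITS = {"Q4": 4.5, "Q5": 5.5, "Q8": 8.0}  # approx. GGUF-style bpw
OVERHEAD = 1.118  # ~12% for runtime buffers; fitted, not official

def estimate_vram_gb(params_billions: float, quant: str) -> float:
    """Estimated inference footprint in GB at the given quantization."""
    weights_gb = params_billions * EFFECTIVE_BITS[quant] / 8
    return round(weights_gb * OVERHEAD, 1)

print(estimate_vram_gb(70, "Q5"))   # 53.8 -> matches the Llama 3 70B row
print(estimate_vram_gb(104, "Q4"))  # 65.4 -> matches Command R+ 104B
print(estimate_vram_gb(32, "Q8"))   # 35.8 -> matches Qwen 2.5 32B
```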

Tight models

These models barely fit: they will run, but with limited context length and reduced speed.

Model                           Family      Quant   VRAM (est.)       Fit
Mistral Large 123B              mistral     Q4      77.3 / 80 GB      tight
Llama 4 Scout 17Bx16            llama       Q4      68.5 / 80 GB      tight
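
Why a tight fit limits context: the KV cache grows linearly with context length and must share the card with the weights. A sizing sketch under illustrative architecture values (roughly a 123B-class dense model with grouped-query attention; not exact Mistral Large parameters):

```python
# KV-cache sizing sketch. Layer count, KV heads, and head dim below are
# illustrative assumptions for a 123B-class dense model with GQA, not
# exact Mistral Large parameters.

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2) -> float:
    """FP16 KV cache in GB: keys and values across all layers."""
    elems = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elems * bytes_per_elem / 1024**3

headroom_gb = 80.0 - 77.3  # ~2.7 GB left after a tight Q4 fit
for ctx in (2048, 8192, 32768):
    cache = kv_cache_gb(88, 8, 128, ctx)
    print(f"{ctx:>6} tokens: {cache:.2f} GB",
          "(fits)" if cache < headroom_gb else "(too big)")
# 2048 -> 0.69 GB (fits); 8192 -> 2.75 GB (too big); 32768 -> 11.00 GB (too big)
```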

Unlocked in a 2x rig

With two cards in parallel (160 GB total), larger models become reachable.

Model                           Family      Quant   VRAM (est.)       Fit
DeepSeek V2                     deepseek    Q4      148.4 / 160 GB    tight
DeepSeek Coder V2               deepseek    Q4      148.4 / 160 GB    tight
Qwen3 235B A22B                 qwen        Q4      147.7 / 160 GB    tight
Falcon 180B                     falcon      Q4      113.2 / 160 GB    comfortable
Mixtral 8x22B                   mistral     Q5      108.3 / 160 GB    comfortable
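
One common way to pool two cards is tensor parallelism. A minimal vLLM sketch; the model id is illustrative, and in practice a quantized build (e.g. AWQ or FP8) would be needed to match the table's Q5-class footprint.

```python
# Minimal tensor-parallel launch sketch with vLLM. The model id below is
# illustrative; a quantized variant would be needed to match the table's
# Q5-class footprint on 2x 80 GB.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mixtral-8x22B-Instruct-v0.1",  # example 2x-rig model
    tensor_parallel_size=2,  # shard weights across both H100s
)
outputs = llm.generate(
    ["Summarize grouped-query attention in one sentence."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```

The same pattern extends to the 4x configuration below with tensor_parallel_size=4.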

Unlocked in a 4x rig

Server-style configuration (320 GB total) for the largest open-weight models.

Model                           Family      Quant   VRAM (est.)       Fit
Llama 3.1 405B                  llama       Q4      254.6 / 320 GB    comfortable
Hermes 3 405B                   hermes      Q4      254.6 / 320 GB    comfortable
Llama 4 Maverick 17Bx128        llama       Q4      251.5 / 320 GB    comfortable
Nemotron 340B                   nemotron    Q5      261.2 / 320 GB    comfortable


VRAM estimates updated 2026-05-12.