
Mac Studio M2 Ultra (128GB) for local AI

Mac Studio M2 Ultra (128GB) provides 128 GB of unified memory for local AI, which the catalog counts as VRAM. In the LocalIA catalog, 227 of 242 models run comfortably on a single machine.


VRAM

128 GB

Models that run comfortably

227 models

These models fit in 128 GB with room for context and stable inference.

| # | Model | Family | Memory | Fit | Quant | Available |
|---|---|---|---|---|---|---|
| 01 | Mixtral 8x22B | mistral | 108.3 GB | comfortable | Q5 | 128 GB |
| 02 | ★ Mistral Large 123B | mistral | 94.5 GB | comfortable | Q5 | 128 GB |
| 03 | ★ NVIDIA Nemotron 3 Super 120B A12B BF16 | nemotron | 92.2 GB | comfortable | Q5 | 128 GB |
| 04 | ★ Llama 4 Scout 17Bx16 | llama | 83.7 GB | comfortable | Q5 | 128 GB |
| 05 | ★ Command R+ 104B | command | 79.9 GB | comfortable | Q5 | 128 GB |
| 06 | ★ Qwen 2.5 72B | qwen | 80.5 GB | comfortable | Q8 | 128 GB |
| 07 | Qwen 2.5 VL 72B | qwen | 80.5 GB | comfortable | Q8 | 128 GB |
| 08 | ★ Qwen2.5 72B Instruct | qwen | 80.5 GB | comfortable | Q8 | 128 GB |
| 09 | Llama 2 70B | llama | 78.2 GB | comfortable | Q8 | 128 GB |
| 10 | Llama 3 70B | llama | 78.2 GB | comfortable | Q8 | 128 GB |
| 11 | Llama 3.1 70B | llama | 78.2 GB | comfortable | Q8 | 128 GB |
| 12 | ★ Llama 3.3 70B | llama | 78.2 GB | comfortable | Q8 | 128 GB |
| 13 | CodeLlama 70B | codellama | 78.2 GB | comfortable | Q8 | 128 GB |
| 14 | ★ DeepSeek R1 Distill 70B | deepseek | 78.2 GB | comfortable | Q8 | 128 GB |
| 15 | Hermes 3 70B | hermes | 78.2 GB | comfortable | Q8 | 128 GB |
| 16 | ★ Llama 3.1 Nemotron 70B | nemotron | 78.2 GB | comfortable | Q8 | 128 GB |
| 17 | Athene 70B | athene | 78.2 GB | comfortable | Q8 | 128 GB |
| 18 | ★ Llama 3.3 70B Instruct | llama | 78.2 GB | comfortable | Q8 | 128 GB |
| 19 | ★ Llama 3.1 70B Instruct | llama | 78.2 GB | comfortable | Q8 | 128 GB |
| 20 | ★ DeepSeek R1 Distill Llama 70B | llama | 78.2 GB | comfortable | Q8 | 128 GB |
| 21 | ★ Llama 3_3 Nemotron Super 49B v1_5 | llama | 54.8 GB | comfortable | Q8 | 128 GB |
| 22 | ★ Mixtral 8x7B | mistral | 105.1 GB | comfortable | FP16 | 128 GB |
| 23 | Falcon 40B | falcon | 89.4 GB | comfortable | FP16 | 128 GB |
| 24 | Command R 35B | command | 78.2 GB | comfortable | FP16 | 128 GB |
| 25 | Aya 23 35B | aya | 78.2 GB | comfortable | FP16 | 128 GB |
| 26 | CodeLlama 34B | codellama | 76.0 GB | comfortable | FP16 | 128 GB |
| 27 | Yi 1.5 34B | yi | 76.0 GB | comfortable | FP16 | 128 GB |
| 28 | ★ dolphin 2.9.1 yi 1.5 34b | yi | 76.0 GB | comfortable | FP16 | 128 GB |
| 29 | ★ Qwen 2.5 32B | qwen | 71.5 GB | comfortable | FP16 | 128 GB |
| 30 | ★ Qwen 2.5 Coder 32B | qwen | 71.5 GB | comfortable | FP16 | 128 GB |
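
The catalog does not say how its memory figures are computed. As a rough illustration, a common rule of thumb is weight size (parameters times bits per weight) plus a margin for context and runtime buffers. The sketch below uses assumed constants throughout: the effective bits per weight for each quantization level, the 1.2 overhead multiplier, and the function name `estimate_memory_gb` are illustrative choices, not values taken from LocalIA. With these assumptions the result lands close to several rows above (for example, a 70B model at Q8), but that agreement does not confirm the catalog's method.

```python
# Rule-of-thumb memory estimate for a quantized model.
# Every constant here is an assumption for illustration, not a catalog value.

GIB = 1024 ** 3  # bytes per GiB

# Assumed effective bits per weight; quantized formats also store scale
# factors, so Q4 and Q5 are taken as slightly above 4 and 5 bits.
BITS_PER_WEIGHT = {"Q4": 4.5, "Q5": 5.5, "Q8": 8.0, "FP16": 16.0}

# Assumed multiplier covering KV cache, activations, and runtime buffers.
OVERHEAD = 1.2


def estimate_memory_gb(params_billion: float, quant: str) -> float:
    """Return an approximate memory footprint in GiB."""
    weight_bytes = params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8
    return weight_bytes * OVERHEAD / GIB


if __name__ == "__main__":
    for size, quant in [(70, "Q8"), (32, "FP16"), (180, "Q4")]:
        print(f"{size}B at {quant}: ~{estimate_memory_gb(size, quant):.1f} GB")
```

Real usage depends on context length, runtime, and the exact quantization variant, so treat the output as a planning estimate rather than a guarantee.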

Tight models

1 model

These models barely fit. They can run, but context and speed will be limited.

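| # | Model | Family | Memory | Fit | Quant | Available |
|---|---|---|---|---|---|---|
| 01 | Falcon 180B | falcon | 113.2 GB | tight | Q4 | 128 GB |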
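
The distinction between "comfortable" and "tight" can be sketched as a threshold on the share of memory a model consumes. The catalog does not publish its cutoff; the 0.85 ratio and the function name `classify_fit` below are hypothetical, chosen only so that the example agrees with the rows shown on this page.

```python
# Classify how well a model fits into a memory budget.
# COMFORTABLE_RATIO is a hypothetical cutoff, not a documented catalog value.

COMFORTABLE_RATIO = 0.85


def classify_fit(required_gb: float, budget_gb: float,
                 comfortable_ratio: float = COMFORTABLE_RATIO) -> str:
    """Return 'comfortable', 'tight', or 'does not fit'."""
    if required_gb > budget_gb:
        return "does not fit"
    if required_gb <= budget_gb * comfortable_ratio:
        return "comfortable"
    # Fits, but leaves little headroom for context or other processes.
    return "tight"


if __name__ == "__main__":
    print(classify_fit(108.3, 128))  # Mixtral 8x22B at Q5
    print(classify_fit(113.2, 128))  # Falcon 180B at Q4
    print(classify_fit(254.6, 128))  # Llama 3.1 405B at Q4
```

Because part of unified memory stays reserved for the system on Apple Silicon, a stricter ratio may be more realistic in practice.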

Unlocked in a 2x rig

256 GB

With two machines in parallel (256 GB total), larger models become reachable.

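| # | Model | Family | Memory | Fit | Quant | Available |
|---|---|---|---|---|---|---|
| 01 | ★ Llama 3.1 405B | llama | 254.6 GB | tight | Q4 | 256 GB |
| 02 | Hermes 3 405B | hermes | 254.6 GB | tight | Q4 | 256 GB |
| 03 | ★ Llama 3.1 405B | llama | 254.6 GB | tight | Q4 | 256 GB |
| 04 | ★ Llama 4 Maverick 17Bx128 | llama | 251.5 GB | tight | Q4 | 256 GB |
| 05 | Nemotron 340B | nemotron | 213.7 GB | comfortable | Q4 | 256 GB |
| 06 | DeepSeek V2 | deepseek | 181.3 GB | comfortable | Q5 | 256 GB |
| 07 | DeepSeek Coder V2 | deepseek | 181.3 GB | comfortable | Q5 | 256 GB |
| 08 | ★ Qwen 3 235B A22B | qwen | 180.6 GB | comfortable | Q5 | 256 GB |
| 09 | ★ Qwen3 235B A22B | qwen | 180.6 GB | comfortable | Q5 | 256 GB |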

Unlocked in a 4x rig

512 GB

Server-style configuration (512 GB total) for the largest open-weight models.

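| # | Model | Family | Memory | Fit | Quant | Available |
|---|---|---|---|---|---|---|
| 01 | ★ DeepSeek V3.2 | deepseek | 430.6 GB | comfortable | Q4 | 512 GB |
| 02 | ★ DeepSeek V4 Pro | deepseek | 430.6 GB | comfortable | Q4 | 512 GB |
| 03 | ★ DeepSeek R1 | deepseek | 421.8 GB | comfortable | Q4 | 512 GB |
| 04 | ★ DeepSeek V3 | deepseek | 421.8 GB | comfortable | Q4 | 512 GB |
| 05 | ★ DeepSeek R1 (0528 snapshot) | deepseek | 421.8 GB | comfortable | Q4 | 512 GB |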
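
The 1x, 2x, and 4x tiers on this page amount to finding the smallest rig whose pooled memory covers a model's footprint. The sketch below assumes memory pools perfectly across units, which is optimistic: it ignores interconnect overhead and the memory each machine reserves for its own system. The function name `smallest_rig` and its defaults are hypothetical.

```python
# Find the smallest rig whose pooled memory holds a given model.
# Assumes ideal pooling across units; real multi-machine setups lose
# some capacity to per-machine reservation and communication buffers.

from typing import Optional, Sequence

RIG_SIZES = (1, 2, 4)  # rig sizes listed on this page


def smallest_rig(required_gb: float, unit_gb: float = 128.0,
                 rig_sizes: Sequence[int] = RIG_SIZES) -> Optional[int]:
    """Return the smallest unit count that fits the model, or None."""
    for units in sorted(rig_sizes):
        if required_gb <= units * unit_gb:
            return units
    return None


if __name__ == "__main__":
    for name, gb in [("Falcon 180B", 113.2),
                     ("Llama 3.1 405B", 254.6),
                     ("DeepSeek R1", 421.8)]:
        print(f"{name}: {smallest_rig(gb)}x rig")
```

A result of `None` means the model exceeds the largest configuration considered, in which case a lower quantization level or a larger rig would be needed.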


VRAM estimates updated 2026-06-27. Apple Silicon: part of unified memory remains reserved for the system.