NVIDIAConsumerGTX 16

GTX 1660 Super for local AI

GTX 1660 Super provides 6 GB of VRAM for local AI. In the LocalIA catalog, 135 out of 242 models run comfortably on a single card.

View all compatible models →Rig around the Super ↗

VRAM

6GB

Models that run comfortably

135 models

These models fit in 6 GB with room for context and stable inference.

01Llama 3 8Bllama5.0 GBcomfortableQ4 · / 6 GB

02★Llama 3.1 8Bllama5.0 GBcomfortableQ4 · / 6 GB

03Ministral 8Bmistral5.0 GBcomfortableQ4 · / 6 GB

04★Qwen 3 8Bqwen5.0 GBcomfortableQ4 · / 6 GB

05DeepSeek R1 Distill 8Bdeepseek5.0 GBcomfortableQ4 · / 6 GB

06Aya 23 8Baya5.0 GBcomfortableQ4 · / 6 GB

07Granite 3 8Bgranite5.0 GBcomfortableQ4 · / 6 GB

08★Hermes 3 8Bhermes5.0 GBcomfortableQ4 · / 6 GB

09★DeepSeek R1 Distill Llama 8Bdeepseek5.0 GBcomfortableQ4 · / 6 GB

10★MiniCPM 4.1 8Bminicpm5.0 GBcomfortableQ4 · / 6 GB

11★Qwen3 8Bqwen5.0 GBcomfortableQ4 · / 6 GB

12★Llama 3.1 8B Instructllama5.0 GBcomfortableQ4 · / 6 GB

13★Llama 3.1 8Bllama5.0 GBcomfortableQ4 · / 6 GB

14★Meta Llama 3 8B Instructllama5.0 GBcomfortableQ4 · / 6 GB

15★Meta Llama 3 8Bllama5.0 GBcomfortableQ4 · / 6 GB

16★DeepSeek R1 0528 Qwen3 8Bqwen5.0 GBcomfortableQ4 · / 6 GB

17★Nemotron Labs Diffusion 8B Basenemotron5.0 GBcomfortableQ4 · / 6 GB

18★Qwen3 8B Baseqwen5.0 GBcomfortableQ4 · / 6 GB

19★Meta Llama 3.1 8B Instructllama5.0 GBcomfortableQ4 · / 6 GB

20★saiga_llama3_8bllama5.0 GBcomfortableQ4 · / 6 GB

21★DeepSeek R1 Distill Llama 8Bllama5.0 GBcomfortableQ4 · / 6 GB

22★Hermes 3 Llama 3.1 8Bllama5.0 GBcomfortableQ4 · / 6 GB

23★Meta Llama 3.1 8B Instructllama5.0 GBcomfortableQ4 · / 6 GB

24★Phi Mini MoE 7.6Bphi4.8 GBcomfortableQ4 · / 6 GB

25Llama 2 7Bllama4.4 GBcomfortableQ4 · / 6 GB

26CodeLlama 7Bcodellama4.4 GBcomfortableQ4 · / 6 GB

27★Mistral 7Bmistral4.4 GBcomfortableQ4 · / 6 GB

28Mathstral 7Bmistral4.4 GBcomfortableQ4 · / 6 GB

29★Qwen 2.5 7Bqwen4.4 GBcomfortableQ4 · / 6 GB

30★Qwen 2.5 Coder 7Bqwen4.4 GBcomfortableQ4 · / 6 GB

Tight models

8 models

These models barely fit. They can run, but context and speed will be limited.

01★Gemma 2 9Bgemma5.7 GBtightQ4 · / 6 GB

02Yi 1.5 9Byi5.7 GBtightQ4 · / 6 GB

03★Qwen 3.5 9Bqwen5.7 GBtightQ4 · / 6 GB

04★GLM-4 9Bglm5.7 GBtightQ4 · / 6 GB

05★GLM-4.7 Flashglm5.7 GBtightQ4 · / 6 GB

06GLM-4.1V 9B Thinkingglm5.7 GBtightQ4 · / 6 GB

07★NVIDIA Nemotron Nano 9Bnemotron5.7 GBtightQ4 · / 6 GB

08★gemma 2 9b itgemma5.7 GBtightQ4 · / 6 GB

Unlocked in a 2x rig

12 GB

With two cards in parallel (12 GB total), larger models become reachable.

01DeepSeek V2 Litedeepseek10.1 GBcomfortableQ4 · / 12 GB

02DeepSeek Coder V2 Litedeepseek10.1 GBcomfortableQ4 · / 12 GB

03StarCoder 2 15Bstarcoder9.4 GBcomfortableQ4 · / 12 GB

04★Phi-4 Reasoning Vision 15Bphi9.4 GBcomfortableQ4 · / 12 GB

05★Qwen 2.5 14Bqwen8.8 GBcomfortableQ4 · / 12 GB

06Qwen 2.5 Coder 14Bqwen8.8 GBcomfortableQ4 · / 12 GB

07★Qwen 3 14Bqwen8.8 GBcomfortableQ4 · / 12 GB

08★DeepSeek R1 Distill 14Bdeepseek8.8 GBcomfortableQ4 · / 12 GB

09Phi-3 Medium 14Bphi8.8 GBcomfortableQ4 · / 12 GB

10★Phi-4 14Bphi8.8 GBcomfortableQ4 · / 12 GB

11★GLM-4.5 Airglm8.8 GBcomfortableQ4 · / 12 GB

12★Qwen2.5 Coder 14B Instructqwen8.8 GBcomfortableQ4 · / 12 GB

13★Qwen3 14Bqwen8.8 GBcomfortableQ4 · / 12 GB

14★Qwen2.5 14B Instructqwen8.8 GBcomfortableQ4 · / 12 GB

15Llama 2 13Bllama10.0 GBcomfortableQ5 · / 12 GB

Unlocked in a 4x rig

24 GB

Server-style configuration (24 GB total) for the largest open-weight models.

01Command R 35Bcommand22.0 GBtightQ4 · / 24 GB

02Aya 23 35Baya22.0 GBtightQ4 · / 24 GB

03CodeLlama 34Bcodellama21.4 GBtightQ4 · / 24 GB

04Yi 1.5 34Byi21.4 GBtightQ4 · / 24 GB

05★dolphin 2.9.1 yi 1.5 34byi21.4 GBtightQ4 · / 24 GB

06★Qwen 2.5 32Bqwen20.1 GBcomfortableQ4 · / 24 GB

07★Qwen 2.5 Coder 32Bqwen20.1 GBcomfortableQ4 · / 24 GB

08★Qwen 3 32Bqwen20.1 GBcomfortableQ4 · / 24 GB

09★QwQ 32Bqwq20.1 GBcomfortableQ4 · / 24 GB

10★DeepSeek R1 Distill 32Bdeepseek20.1 GBcomfortableQ4 · / 24 GB

Similar GPUs

VRAM estimates updated 2026-06-27.