Llama · 3B params · popular

Llama 3.2 3B locally

Llama 3.2 3B is an open-weight LLM from the Llama family with 3B parameters. Main uses: chat, RAG, and general assistance. Detected minimum hardware: GTX 1650 (4 GB).

Technical info

Parameters: 3B
Q4_K_M: 1.9 GB
Q5_K_M: 2.3 GB
Q8: 3.4 GB
FP16: 6.7 GB
Family: Llama
Last sync: 2026-05-12

Available quantizations

GGUF weights

Q4_K_M (1.9 GB): Acceptable quality. A good choice when VRAM is limited.
Q5_K_M (2.3 GB): Good quality. The sweet spot between size and precision.
Q8 (3.4 GB): Near-FP16 quality. Comfortable for production.
FP16 (6.7 GB): Reference precision. Maximum quality, at roughly double the VRAM of Q8.
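As an illustration of how the sizes above translate into a choice, here is a minimal Python sketch that picks the largest listed quantization fitting a given amount of VRAM. The function name `best_quant` and the `reserve_gb` parameter are hypothetical helpers for this example, not part of any tool mentioned on this page; the sizes are the ones listed above.

```python
# Sizes in GB, copied from the quantization list above.
QUANT_SIZES_GB = {
    "Q4_K_M": 1.9,
    "Q5_K_M": 2.3,
    "Q8": 3.4,
    "FP16": 6.7,
}


def best_quant(vram_gb, reserve_gb=0.0):
    """Return the largest listed quantization that fits in vram_gb.

    reserve_gb is a hypothetical extra allowance (for example for a
    longer context); it defaults to 0. Returns None if nothing fits.
    """
    budget = vram_gb - reserve_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    # Pick the entry with the largest footprint, i.e. the highest precision.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for vram in (4, 6, 8):
        print(f"{vram} GB -> {best_quant(vram)}")
```

With no reserve, a 4 GB or 6 GB card resolves to Q8 and an 8 GB card to FP16. Raising `reserve_gb` pushes the choice toward smaller quantizations.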

Compatible GPUs

12 single-GPU cards

GPUs that can run Llama 3.2 3B on a single card, sorted by VRAM margin.

3.4 / 4 GB · comfortable · Q8
3.4 / 6 GB · comfortable · Q8
3.4 / 6 GB · comfortable · Q8
3.4 / 6 GB · comfortable · Q8
3.4 / 6 GB · comfortable · Q8
3.4 / 6 GB · comfortable · Q8
3.4 / 6 GB · comfortable · Q8
6 GB · Arc Alchemist · 3.4 / 6 GB · comfortable · Q8
6.7 / 8 GB · comfortable · FP16
6.7 / 8 GB · comfortable · FP16
6.7 / 8 GB · comfortable · FP16
6.7 / 8 GB · comfortable · FP16

Recommended multi-GPU rigs

2× / 4× consumer GPUs

For Llama 3.2 3B at a higher-precision quantization or with more context, a multi-GPU rig gives more headroom.

6.7 / 8 GB · comfortable · FP16
2× GTX 1060 6GB · 12 GB · GTX 10 · 6.7 / 12 GB · comfortable · FP16
12 GB · GTX 16 · 6.7 / 12 GB · comfortable · FP16
2× GTX 1660 Super · 12 GB · GTX 16 · 6.7 / 12 GB · comfortable · FP16
2× GTX 1660 Ti · 12 GB · GTX 16 · 6.7 / 12 GB · comfortable · FP16
2× RTX 2060 6GB · 12 GB · RTX 20 · 6.7 / 12 GB · comfortable · FP16
2× RTX 3050 6GB · 12 GB · RTX 30 · 6.7 / 12 GB · comfortable · FP16
12 GB · Arc Alchemist · 6.7 / 12 GB · comfortable · FP16
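The totals in these rigs appear to be simple sums of per-card VRAM. A rough Python sketch of that arithmetic follows. The helper name `pooled_headroom` is hypothetical, and treating the combined VRAM as a single pool is a simplifying assumption, since real engines split the model across cards and add per-card overhead.

```python
def pooled_headroom(cards, vram_per_card_gb, needed_gb):
    """Return (total VRAM, headroom) in GB for a multi-GPU rig.

    Assumes VRAM adds up linearly across cards, which is a
    simplification: real engines keep some overhead on every card.
    """
    total_gb = cards * vram_per_card_gb
    headroom_gb = round(total_gb - needed_gb, 1)
    return total_gb, headroom_gb


if __name__ == "__main__":
    # Two 6 GB cards running the FP16 weights (6.7 GB, from the list above).
    total, headroom = pooled_headroom(cards=2, vram_per_card_gb=6, needed_gb=6.7)
    print(f"{total} GB total, {headroom} GB headroom")
```

Under this assumption, two 6 GB cards give 12 GB in total and about 5.3 GB of headroom at FP16, which is the extra room available for context and batching.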

Recommended rig

2× GTX 1650

Llama 3.2 3B preinstalled with Ubuntu, vLLM, and Open WebUI, with the model already downloaded.

VRAM estimate: parameters × bits / 8, plus margin. Actual performance depends on the engine, context length, and batch size.
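The estimate formula can be sketched in a few lines of Python. The function name `estimate_vram_gb` and the `margin_gb` parameter are hypothetical. The page does not state how large its margin is, so the default of 0 is only a placeholder, and 1 GB is taken as 10^9 bytes.

```python
def estimate_vram_gb(params_billions, bits_per_weight, margin_gb=0.0):
    """Estimate VRAM in GB as parameters x bits / 8, plus a margin.

    params_billions: parameter count in billions (3 for a 3B model).
    bits_per_weight: 16 for FP16, 8 for an 8-bit quantization, and so on.
    margin_gb: assumed allowance for runtime overhead; the value the
               page actually uses is not documented.
    """
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + margin_gb


if __name__ == "__main__":
    for label, bits in (("FP16", 16), ("8-bit", 8), ("4-bit", 4)):
        print(f"{label}: {estimate_vram_gb(3, bits):.1f} GB before margin")
```

For a nominal 3B parameters this gives 6.0 GB at 16 bits before any margin. The listed figure of 6.7 GB is higher, which presumably reflects the margin and the model's exact parameter count.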
sync: 2026-05-12