§01·model

Qwen3-8B

llm · active
§02·GPUs that run this model
15 total
| GPU | VRAM | Series | Best speed (tokens/s) | Min VRAM | Works | Benchmarks |
|---|---|---|---|---|---|---|
| RTX 5090 | 32GB | 50 | 200.4 | | 3 | check ↗ |
| RTX 5080 | 16GB | 50 | 129.1 | 16GB | 2 | check ↗ |
| RTX 3090 Ti | 24GB | 30 | 123.7 | 24GB | 2 | check ↗ |
| RTX 5070 Ti | 16GB | 50 | 120.5 | 16GB | 2 | check ↗ |
| RTX 3090 | 24GB | 30 | 115.3 | 24GB | 2 | check ↗ |
| RTX 3080 Ti | 12GB | 30 | 115.2 | | 2 | check ↗ |
| RTX 4080 Super | 16GB | 40 | 104.2 | 16GB | 2 | check ↗ |
| RTX 4080 | 16GB | 40 | 102.7 | 16GB | 2 | check ↗ |
| RTX 4070 Ti Super | 16GB | 40 | 96.3 | 16GB | 2 | check ↗ |
| RTX 5070 | 12GB | 50 | 85.8 | 12GB | 2 | check ↗ |
| RTX 4070 Ti | 12GB | 40 | 75.8 | 12GB | 2 | check ↗ |
| RTX 4070 Super | 12GB | 40 | 75.4 | 12GB | 2 | check ↗ |
| RTX 5060 Ti | 16GB | 50 | 69.2 | 16GB | 2 | check ↗ |
| RTX 4060 Ti 16GB | 16GB | 40 | 45.8 | 16GB | 1 | check ↗ |
| RTX 3060 | 12GB | 30 | 42 | | 1 | check ↗ |