ModelMeter
Catalog

Backends

6 entries

Backend list

| Name | Category | Target HW | Multiuser | OpenAI API | Configurable parameters |
|---|---|---|---|---|---|
| Hugging Face Transformers | library | nvidia, amd, intel, cpu, apple | No | No | dtype, device_map, max_new_tokens, num_beams |
| MLX-LM | apple | apple | No | No | max_tokens, temperature, top_p |
| Ollama | local | apple, nvidia, amd, cpu | No | Yes | num_ctx, num_predict, temperature, top_p |
| Text Generation Inference | server | nvidia, amd | Yes | Yes | dtype, quantize, num_shard, cuda_memory_fraction, max_concurrent_requests, max_input_tokens, max_total_tokens |
| llama.cpp | local | cpu, metal, cuda, hip, vulkan | No | No | ctx_size, n_predict, n_gpu_layers |
| vLLM | server | nvidia, amd, intel, cpu, apple | Yes | Yes | dtype, quantization, gpu_memory_utilization, max_model_len, tensor_parallel_size, max_num_seqs |
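
As a rough illustration of how the backend list above could be queried, the sketch below transcribes the six entries into plain Python records and filters them by target hardware, multiuser support, and OpenAI API support. The `Backend` dataclass, the `CATALOG` tuple, and the `select` helper are hypothetical names introduced here for illustration; they are not part of ModelMeter or of any listed backend.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Backend:
    """One row of the backend list (hypothetical structure, not a ModelMeter API)."""
    name: str
    category: str
    target_hw: Tuple[str, ...]
    multiuser: bool
    openai_api: bool
    parameters: Tuple[str, ...]


# Values transcribed from the backend list; nothing is added or inferred.
CATALOG = (
    Backend("Hugging Face Transformers", "library",
            ("nvidia", "amd", "intel", "cpu", "apple"), False, False,
            ("dtype", "device_map", "max_new_tokens", "num_beams")),
    Backend("MLX-LM", "apple", ("apple",), False, False,
            ("max_tokens", "temperature", "top_p")),
    Backend("Ollama", "local", ("apple", "nvidia", "amd", "cpu"), False, True,
            ("num_ctx", "num_predict", "temperature", "top_p")),
    Backend("Text Generation Inference", "server", ("nvidia", "amd"), True, True,
            ("dtype", "quantize", "num_shard", "cuda_memory_fraction",
             "max_concurrent_requests", "max_input_tokens", "max_total_tokens")),
    Backend("llama.cpp", "local",
            ("cpu", "metal", "cuda", "hip", "vulkan"), False, False,
            ("ctx_size", "n_predict", "n_gpu_layers")),
    Backend("vLLM", "server",
            ("nvidia", "amd", "intel", "cpu", "apple"), True, True,
            ("dtype", "quantization", "gpu_memory_utilization",
             "max_model_len", "tensor_parallel_size", "max_num_seqs")),
)


def select(catalog: Sequence[Backend], *,
           hardware: Optional[str] = None,
           multiuser: Optional[bool] = None,
           openai_api: Optional[bool] = None) -> list:
    """Return the backends that match every filter that is not None."""
    matches = []
    for backend in catalog:
        if hardware is not None and hardware not in backend.target_hw:
            continue
        if multiuser is not None and backend.multiuser != multiuser:
            continue
        if openai_api is not None and backend.openai_api != openai_api:
            continue
        matches.append(backend)
    return matches


if __name__ == "__main__":
    # Multiuser servers with an OpenAI-compatible API that list AMD hardware.
    for backend in select(CATALOG, hardware="amd", multiuser=True, openai_api=True):
        print(backend.name, "->", ", ".join(backend.parameters))
```

Filters left as `None` are ignored, so `select(CATALOG, openai_api=True)` alone returns every entry marked "Yes" in the OpenAI API column regardless of hardware.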