Unload All llama.cpp Router Models Without Restarting
Free VRAM without killing llama-server.
llama.cpp's router mode is one of the most useful additions to llama-server in years. It finally gives local LLM operators something close to the model-management experience people expect from Ollama, while keeping the raw performance and low-level control that make llama.cpp worth using in the first place.