It can connect to several existing backends such as Anthropic, Cohere, OpenAI, NVIDIA NIM, and Mistral AI,
and it can also host models on its own: see the Cortex section in the screenshot below, which shows Jan having downloaded and locally hosting Llama 3 8B (q4) and Phi-3 Medium (q4).
Vane is one of the more pragmatic entries in the “AI search with citations” space: a self-hosted answering engine that mixes live web retrieval with local or cloud LLMs, while keeping the whole stack under your control.
Quick overview of the most prominent UIs for Ollama in 2025
A locally hosted Ollama instance lets you run large language models on your own machine, but driving it from the command line isn't user-friendly.
Several open-source projects provide ChatGPT-style interfaces that connect to a local Ollama instance.
That’s very exciting!
Instead of calling Copilot or perplexity.ai and telling the whole world what you are after,
you can now host a similar service on your own PC or laptop!
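Under the hood, these UIs simply talk to the HTTP API that a local Ollama instance exposes on port 11434 by default. A minimal sketch of that interaction, using only Python's standard library (the model name `llama3:8b` is just an example; substitute whatever model you have pulled):

```python
import json
import urllib.request

# Ollama's default local generate endpoint
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build the JSON request that chat UIs send to a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )

def ask(model: str, prompt: str) -> str:
    """Send the prompt and return the generated text (requires Ollama running)."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask("llama3:8b", "Why host an LLM locally?")` returns the model's answer as a plain string; with `"stream": True` the server instead emits a sequence of JSON chunks, which is what the chat UIs use to render tokens as they arrive.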