Ollama

How Ollama Handles Parallel Requests

Configuring Ollama for parallel request execution.

When the Ollama server receives two requests at the same time, its behavior depends on its configuration and available system resources.
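A quick way to see this in practice is to start the server with OLLAMA_NUM_PARALLEL set (e.g. `OLLAMA_NUM_PARALLEL=2 ollama serve`) and fire two requests at once. Below is a minimal Python sketch, assuming a local Ollama on the default port and a model tag you have already pulled (the tag here is an assumption):

```python
import json
import threading
import urllib.request

# Default local Ollama endpoint; OLLAMA_NUM_PARALLEL on the server side
# controls how many requests a loaded model serves concurrently.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3.1"  # assumed model tag; use any model you have pulled

def ask(prompt: str) -> None:
    payload = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(OLLAMA_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"][:80])

# Fire two requests at the same time; with OLLAMA_NUM_PARALLEL=2 they are
# served in parallel, otherwise the second one waits in the queue.
threads = [threading.Thread(target=ask, args=(p,))
           for p in ("Why is the sky blue?", "Why is grass green?")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the default settings you should see the two responses arrive one after the other; with parallelism enabled (and enough VRAM) they come back interleaved.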

Testing Deepseek-R1 on Ollama

Comparing two DeepSeek-R1 models with their two base counterparts.

DeepSeek’s first generation of reasoning models, with performance comparable to OpenAI o1, including six dense models distilled from DeepSeek-R1 based on Llama and Qwen.
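A side-by-side comparison is easy to script against the Ollama API. Here is a sketch, assuming the Llama- and Qwen-based distills are paired with their base models; the exact tags and the pairings are assumptions, so adjust them to whatever you have pulled locally:

```python
import json
import urllib.request

URL = "http://localhost:11434/api/generate"

# Assumed pairings: deepseek-r1:8b distilled from Llama, deepseek-r1:7b
# from Qwen; substitute the tags you actually have pulled.
PAIRS = [("deepseek-r1:8b", "llama3.1:8b"),
         ("deepseek-r1:7b", "qwen2.5:7b")]

PROMPT = ("A bat and a ball cost $1.10 in total; the bat costs $1 more "
          "than the ball. How much does the ball cost?")

def generate(model: str, prompt: str) -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Ask each distilled model and its base the same question and eyeball the answers.
for distilled, base in PAIRS:
    for model in (distilled, base):
        print(f"--- {model} ---")
        print(generate(model, PROMPT)[:200])
```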

Self-hosting Perplexica - with Ollama

Running a copilot-style service locally? Easy!

That’s very exciting! Instead of calling Copilot or perplexity.ai and telling the whole world what you are after, you can now host a similar service on your own PC or laptop!
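For illustration, this is roughly what a query against a locally running Perplexica instance could look like. The port, endpoint, and field names below are assumptions based on Perplexica’s documented search API and may differ in the version you deploy, so check its docs:

```python
import json
import urllib.request

# Assumed endpoint of a local Perplexica deployment.
URL = "http://localhost:3001/api/search"

payload = {
    # Model names and payload shape are assumptions; see Perplexica's docs.
    "chatModel": {"provider": "ollama", "name": "llama3.1"},
    "embeddingModel": {"provider": "ollama", "name": "nomic-embed-text"},
    "focusMode": "webSearch",
    "query": "What is Perplexica?",
    "history": [],
}

req = urllib.request.Request(URL, data=json.dumps(payload).encode(),
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    answer = json.loads(resp.read())
    print(answer["message"])  # the generated answer; key name is an assumption
```

The nice part is that both the search frontend and the model it talks to stay on your machine.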

Gemma2 vs Qwen2 vs Mistral Nemo vs...

Testing logical fallacy detection

Several new LLMs have been released recently. Exciting times. Let’s test how well they perform at detecting logical fallacies.
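The test itself can be a simple loop over models via the Ollama API: give each one the same flawed argument and see whether it names the fallacy. A minimal sketch, with the model tags as assumptions (use whatever you have pulled):

```python
import json
import urllib.request

URL = "http://localhost:11434/api/generate"

# Assumed model tags; substitute the ones available locally.
MODELS = ["gemma2", "qwen2", "mistral-nemo"]

STATEMENT = ("Every time I wash my car it rains, "
             "so washing my car causes rain.")

PROMPT = ("Does the following argument contain a logical fallacy? "
          f"If so, name it and explain briefly.\n\n{STATEMENT}")

def generate(model: str) -> str:
    payload = json.dumps({"model": model, "prompt": PROMPT, "stream": False}).encode()
    req = urllib.request.Request(URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Same statement to every model; a good answer names post hoc ergo propter hoc.
for model in MODELS:
    print(f"--- {model} ---")
    print(generate(model))
```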

LLM Frontends

Not so many to choose from, but still...

When I started experimenting with LLMs, their UIs were still under active development; by now, some of them are really good.