Ollama

Comparison of Hugo Page Translation quality - LLMs on Ollama

In this test I’m comparing how different LLMs hosted on Ollama translate Hugo pages from English to German. The three pages I tested were on different topics and had some nice Markdown structure: headers, lists, tables, links, etc.

Reranking text documents with Ollama and Qwen3 Embedding model - in Go

This little reranking example in Go calls Ollama to generate embeddings for the query and for each candidate document, then sorts the documents in descending order of cosine similarity.

Qwen3 Embedding & Reranker Models on Ollama: State-of-the-Art Performance

The Qwen3 Embedding and Reranker models are the latest releases in the Qwen family, specifically designed for advanced text embedding, retrieval, and reranking tasks.

LLM Performance and PCIe Lanes: Key Considerations

How do PCIe lanes affect LLM performance? It depends on the task. For training and multi-GPU inference, the performance drop is significant.

Convert HTML content to Markdown using LLM and Ollama

The Ollama models library includes models that are able to convert HTML content to Markdown, which is useful for content conversion tasks.

Here I list some AI-assisted coding tools and AI coding assistants, along with their nice sides.

Test: How Ollama is using Intel CPU Performance and Efficient Cores

I’ve got a theory to test: would utilising ALL cores on an Intel CPU raise the speed of LLMs? It is bugging me that the new gemma3 27B model (gemma3:27b, 17GB on Ollama) does not fit into the 16GB VRAM of my GPU and runs partially on the CPU.

When the Ollama server receives two requests at the same time, its behavior depends on its configuration and available system resources.

DeepSeek’s first generation of reasoning models with performance comparable to OpenAI-o1, including six dense models distilled from DeepSeek-R1 based on Llama and Qwen.

Here is a list, with examples, of the most useful Ollama commands (an Ollama commands cheatsheet) that I compiled some time ago. Hopefully it will be useful to you.

Mistral Small was released not long ago. Let’s catch up and test how it performs compared to other LLMs.

Reranking is a second step in Retrieval Augmented Generation (RAG) systems, right between Retrieving and Generating.

Testing how Perplexica performs with various LLMs running on local Ollama: Llama3, Llama3.1, Hermes 3, Mistral Nemo, Mistral Large, Gemma 2, Qwen2, Phi 3 and Command-r in various quants, and selecting the best LLM for Perplexica.

Awesome food is a pleasure for your eyes too. But in this post we will compare two AI-based search systems, Farfalle and Perplexica.

That’s very exciting! Instead of calling copilot or perplexity.ai and telling all the world what you are after, you can now host a similar service on your own PC or laptop!

Comparison of Hugo Page Translation quality - LLMs on Ollama

Reranking text documents with Ollama and Qwen3 Embedding model - in Go

Qwen3 Embedding & Reranker Models on Ollama: State-of-the-Art Performance

LLM Performance and PCIe Lanes: Key Considerations

Convert HTML content to Markdown using LLM and Ollama

AI Coding Assistants comparison

Test: How Ollama is using Intel CPU Performance and Efficient Cores

How Ollama Handles Parallel Requests

Testing Deepseek-R1 on Ollama

Ollama Cheatsheet

Mistral Small, Gemma 2, Qwen 2.5, Mistral Nemo, LLama3 and Phi - LLM Test

Reranking with embedding models

Choosing best LLM for Perplexica

Farfalle vs Perplexica

Self-hosting Perplexica - with Ollama