Jupyter Notebook Cheatsheet
Essential shortcuts and magic commands
Jumpstart your Jupyter Notebook productivity with essential shortcuts, magic commands, and workflow tips that will transform your data science and development experience.
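A quick taste of what the cheatsheet covers: IPython's built-in line magics work in any notebook cell (cell magics such as %%time follow the same idea but apply to a whole cell).

```python
# Standard IPython line magics; each one acts on a single statement.
%timeit sum(range(1_000))   # micro-benchmark this expression

# List every variable defined in the current session:
%who

# Print the notebook's working directory:
%pwd
```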
Build AI search agents with Python and Ollama
Ollama’s Python library now includes native web search capabilities. With just a few lines of code, you can augment your local LLMs with real-time information from the web, reducing hallucinations and improving accuracy.
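As a minimal sketch of the idea, assuming a recent ollama-python release that ships the web_search helper, an OLLAMA_API_KEY exported in the environment, and the result fields shown in Ollama's announcement:

```python
import ollama

# web_search and the result fields below follow Ollama's announcement;
# verify them against the version of the library you actually run.
response = ollama.web_search("what's new in the Ollama Python library?")
for result in response.results:
    print(result.title, result.url)  # result.content holds the page text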
Pick the right vector DB for your RAG stack
Choosing the right vector store can make or break your RAG application’s performance, cost, and scalability. This comprehensive comparison covers the most popular options in 2024-2025.
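Whichever store you pick, the core retrieval loop looks much the same. A minimal sketch with Chroma, one of the options compared (its default embedding model handles both indexing and querying here):

```python
import chromadb

client = chromadb.Client()  # in-memory; use PersistentClient(path=...) for disk
docs = client.create_collection("docs")
docs.add(
    ids=["a", "b"],
    documents=["Chroma is an embedded vector store.",
               "pgvector adds vector search to Postgres."],
)
hits = docs.query(query_texts=["embedded vector database"], n_results=1)
print(hits["documents"])
```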
Build AI search agents with Go and Ollama
Ollama’s Web Search API lets you augment local LLMs with real-time web information. This guide shows you how to implement web search capabilities in Go, from simple API calls to full-featured search agents.
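Under the hood, the Go examples wrap a plain REST call. Here is the same request in Python for consistency with the other snippets on this page, with the hosted endpoint and bearer-token auth taken from Ollama's announcement (verify both against the current API docs):

```python
import os
import requests

resp = requests.post(
    "https://ollama.com/api/web_search",   # endpoint per the announcement
    headers={"Authorization": f"Bearer {os.environ['OLLAMA_API_KEY']}"},
    json={"query": "golang http client timeout"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```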
Master local LLM deployment with 12+ tools compared
Local deployment of LLMs has become increasingly popular as developers and organizations seek enhanced privacy, reduced latency, and greater control over their AI infrastructure.
Build robust AI/ML pipelines with Go microservices
As AI and ML workloads become increasingly complex, the need for robust orchestration systems grows with them. Go’s simplicity, performance, and concurrency make it an ideal choice for building the orchestration layer of ML pipelines, even when the models themselves are written in Python.
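The Python half of that pattern is simply a model wrapped as a small HTTP service that the Go orchestrator can call. A hypothetical sketch; the /predict route and payload shape are illustrative, not from the article:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(features):
    # Stand-in for real inference; swap in your loaded model here.
    return sum(features)

@app.post("/predict")  # hypothetical route the Go orchestrator would call
def predict_route():
    payload = request.get_json()
    return jsonify({"prediction": predict(payload["features"])})

if __name__ == "__main__":
    app.run(port=8000)
```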
Unify text, images, and audio in shared embedding spaces
Cross-modal embeddings represent a breakthrough in artificial intelligence, enabling understanding and reasoning across different data types within a unified representation space.
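CLIP is the canonical example for text and images (audio models such as CLAP follow the same recipe). A short sketch via sentence-transformers, assuming the clip-ViT-B-32 checkpoint and a local cat.jpg:

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")
img_emb = model.encode(Image.open("cat.jpg"))     # image to vector
txt_emb = model.encode(["a photo of a cat",
                        "a photo of a dog"])      # text to vectors

# Similarities are meaningful only because both live in one shared space.
print(util.cos_sim(img_emb, txt_emb))
```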
Deploy enterprise AI on budget hardware with open models
The democratization of AI is here. With open-source LLMs like Llama 3, Mixtral, and Qwen now rivaling proprietary models, teams can build powerful AI infrastructure using consumer hardware, slashing costs while maintaining complete control over data privacy and deployment.
LongRAG, Self-RAG, GraphRAG - Next-gen techniques
Retrieval-Augmented Generation (RAG) has evolved far beyond simple vector similarity search. LongRAG, Self-RAG, and GraphRAG represent the cutting edge of retrieval techniques.
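To make Self-RAG's contribution concrete, here is a toy sketch of its control flow: generate, self-critique, and widen retrieval only when support looks weak. All three functions are stubs standing in for a real retriever and LLM:

```python
def retrieve(query, k=3):
    corpus = ["Self-RAG trains the model to emit reflection tokens.",
              "GraphRAG builds a knowledge graph before retrieval.",
              "LongRAG retrieves long units to preserve context."]
    words = query.lower().split()
    return [d for d in corpus if any(w in d.lower() for w in words)][:k]

def generate(query, context):
    return f"Answer to {query!r} grounded in {len(context)} passage(s)."

def critique(answer, context):
    # Stand-in for Self-RAG's learned reflection tokens.
    return "supported" if context else "unsupported"

query = "How does Self-RAG decide when to retrieve?"
context = retrieve(query)
answer = generate(query, context)
if critique(answer, context) == "unsupported":
    context = retrieve(query, k=5)   # widen retrieval and try again
    answer = generate(query, context)
print(answer)
```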
Master Linux environment setup for data science work
Linux has become the de facto operating system for data science professionals, offering unmatched flexibility, performance, and a rich ecosystem of tools.
Speed-up FLUX.1-dev with GGUF quantization
FLUX.1-dev is a powerful text-to-image model that produces stunning results, but its 24GB+ memory requirement makes it challenging to run on many systems. GGUF quantization of FLUX.1-dev offers a solution, reducing memory usage by approximately 50% while maintaining excellent image quality.
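With recent diffusers releases (plus the gguf package), a quantized checkpoint can be loaded straight into the standard pipeline. A sketch assuming the community city96/FLUX.1-dev-gguf Q4_K_S file; other quantization levels load the same way:

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

ckpt = "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q4_K_S.gguf"
transformer = FluxTransformer2DModel.from_single_file(
    ckpt,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,          # only the transformer is quantized
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()       # spill idle layers to CPU as needed
pipe("a misty forest at dawn").images[0].save("flux-gguf.png")
```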
Configure context sizes in Docker Model Runner with workarounds
Configuring context sizes in Docker Model Runner is more complex than it should be.
AI model for augmenting images with text instructions
Black Forest Labs has released FLUX.1-Kontext-dev, an advanced image-to-image AI model that augments existing images using text instructions.
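Kontext is exposed through diffusers' FluxKontextPipeline (a gated model, so accept the license on Hugging Face first). A minimal sketch, with the guidance value taken from the model's examples and input.png as a placeholder:

```python
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

source = load_image("input.png")      # placeholder: any local image or URL
edited = pipe(image=source, prompt="Make the sky stormy",
              guidance_scale=2.5).images[0]
edited.save("kontext-edit.png")
```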
Enable GPU acceleration for Docker Model Runner with NVIDIA CUDA support
Docker Model Runner is Docker’s official tool for running AI models locally, but enabling NVIDIA GPU acceleration requires specific configuration.
Cut LLM costs by 80% with smart token optimization
Token optimization is the critical skill separating cost-effective LLM applications from budget-draining experiments.
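Step zero of any token budget is measuring before you send. With tiktoken (OpenAI's tokenizer library; other providers ship their own), clipping oversized context takes a few lines:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def clip_to_budget(text: str, max_tokens: int) -> str:
    # Truncate in token space, not character space, so the budget holds.
    tokens = enc.encode(text)
    return enc.decode(tokens[:max_tokens]) if len(tokens) > max_tokens else text

prompt = "Retrieved context repeated far past usefulness. " * 500
print(len(enc.encode(prompt)), "tokens before clipping")
print(len(enc.encode(clip_to_budget(prompt, 1_000))), "after")
```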
GPT-OSS 120b benchmarks on three AI platforms
I dug up some interesting performance tests of GPT-OSS 120b running on Ollama across three different platforms: NVIDIA DGX Spark, Mac Studio, and RTX 4080. The GPT-OSS 120b model from the Ollama library weighs in at 65GB, which means it doesn’t fit into the 16GB VRAM of an RTX 4080 (or the newer RTX 5080).