Using Ollama Web Search API in Go
Build AI search agents with Go and Ollama
Ollama’s Web Search API lets you augment local LLMs with real-time web information. This guide shows you how to implement web search capabilities in Go, from simple API calls to full-featured search agents.

Getting Started
Does Ollama have an official Go library for web search? Not yet. Ollama exposes web search as a REST API that works with any Go HTTP client, so you can implement the calls with nothing beyond the standard library (net/http and encoding/json).
First, create an API key from your Ollama account. For a comprehensive reference on Ollama commands and usage, check out the Ollama cheatsheet.
Set your API key as an environment variable:
export OLLAMA_API_KEY="your_api_key"
On Windows PowerShell:
$env:OLLAMA_API_KEY = "your_api_key"
Project Setup
Create a new Go module:
mkdir ollama-search
cd ollama-search
go mod init ollama-search
Basic Web Search
How do I authenticate with Ollama’s web search API in Go? Set the Authorization header with your API key as a Bearer token. Create an API key from your Ollama account and pass it in the request header.
Here’s a complete implementation of the web search API:
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
)

// Request/Response types for web_search
type WebSearchRequest struct {
    Query      string `json:"query"`
    MaxResults int    `json:"max_results,omitempty"`
}

type WebSearchResult struct {
    Title   string `json:"title"`
    URL     string `json:"url"`
    Content string `json:"content"`
}

type WebSearchResponse struct {
    Results []WebSearchResult `json:"results"`
}

func webSearch(query string, maxResults int) (*WebSearchResponse, error) {
    apiKey := os.Getenv("OLLAMA_API_KEY")
    if apiKey == "" {
        return nil, fmt.Errorf("OLLAMA_API_KEY environment variable not set")
    }

    reqBody := WebSearchRequest{
        Query:      query,
        MaxResults: maxResults,
    }
    jsonData, err := json.Marshal(reqBody)
    if err != nil {
        return nil, fmt.Errorf("failed to marshal request: %w", err)
    }

    req, err := http.NewRequest("POST", "https://ollama.com/api/web_search", bytes.NewBuffer(jsonData))
    if err != nil {
        return nil, fmt.Errorf("failed to create request: %w", err)
    }
    req.Header.Set("Authorization", "Bearer "+apiKey)
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return nil, fmt.Errorf("request failed: %w", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        body, _ := io.ReadAll(resp.Body)
        return nil, fmt.Errorf("API error (status %d): %s", resp.StatusCode, string(body))
    }

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, fmt.Errorf("failed to read response: %w", err)
    }

    var searchResp WebSearchResponse
    if err := json.Unmarshal(body, &searchResp); err != nil {
        return nil, fmt.Errorf("failed to unmarshal response: %w", err)
    }
    return &searchResp, nil
}

func main() {
    results, err := webSearch("What is Ollama?", 5)
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    fmt.Println("Search Results:")
    fmt.Println("===============")
    for i, result := range results.Results {
        fmt.Printf("\n%d. %s\n", i+1, result.Title)
        fmt.Printf("   URL: %s\n", result.URL)
        fmt.Printf("   %s\n", truncate(result.Content, 150))
    }
}

func truncate(s string, maxLen int) string {
    if len(s) <= maxLen {
        return s
    }
    return s[:maxLen] + "..."
}
Web Fetch Implementation
What is the difference between web_search and web_fetch endpoints? The web_search endpoint queries the internet and returns multiple search results with titles, URLs, and snippets. The web_fetch endpoint retrieves the full content of a specific URL, returning the page title, content, and links.
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
)

// Request/Response types for web_fetch
type WebFetchRequest struct {
    URL string `json:"url"`
}

type WebFetchResponse struct {
    Title   string   `json:"title"`
    Content string   `json:"content"`
    Links   []string `json:"links"`
}

func webFetch(url string) (*WebFetchResponse, error) {
    apiKey := os.Getenv("OLLAMA_API_KEY")
    if apiKey == "" {
        return nil, fmt.Errorf("OLLAMA_API_KEY environment variable not set")
    }

    reqBody := WebFetchRequest{URL: url}
    jsonData, err := json.Marshal(reqBody)
    if err != nil {
        return nil, fmt.Errorf("failed to marshal request: %w", err)
    }

    req, err := http.NewRequest("POST", "https://ollama.com/api/web_fetch", bytes.NewBuffer(jsonData))
    if err != nil {
        return nil, fmt.Errorf("failed to create request: %w", err)
    }
    req.Header.Set("Authorization", "Bearer "+apiKey)
    req.Header.Set("Content-Type", "application/json")

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        return nil, fmt.Errorf("request failed: %w", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        body, _ := io.ReadAll(resp.Body)
        return nil, fmt.Errorf("API error (status %d): %s", resp.StatusCode, string(body))
    }

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, fmt.Errorf("failed to read response: %w", err)
    }

    var fetchResp WebFetchResponse
    if err := json.Unmarshal(body, &fetchResp); err != nil {
        return nil, fmt.Errorf("failed to unmarshal response: %w", err)
    }
    return &fetchResp, nil
}

func main() {
    result, err := webFetch("https://ollama.com")
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    fmt.Printf("Title: %s\n\n", result.Title)
    fmt.Printf("Content:\n%s\n\n", result.Content)
    fmt.Printf("Links found: %d\n", len(result.Links))
    for i, link := range result.Links {
        if i >= 5 {
            fmt.Printf("  ... and %d more\n", len(result.Links)-5)
            break
        }
        fmt.Printf("  - %s\n", link)
    }
}
Reusable Client Package
Create a reusable Ollama client package for cleaner code:
// ollama/client.go
package ollama

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
    "time"
)

type Client struct {
    apiKey     string
    httpClient *http.Client
    baseURL    string
}

func NewClient() (*Client, error) {
    apiKey := os.Getenv("OLLAMA_API_KEY")
    if apiKey == "" {
        return nil, fmt.Errorf("OLLAMA_API_KEY environment variable not set")
    }
    return &Client{
        apiKey: apiKey,
        httpClient: &http.Client{
            Timeout: 30 * time.Second,
        },
        baseURL: "https://ollama.com/api",
    }, nil
}

type SearchResult struct {
    Title   string `json:"title"`
    URL     string `json:"url"`
    Content string `json:"content"`
}

type SearchResponse struct {
    Results []SearchResult `json:"results"`
}

type FetchResponse struct {
    Title   string   `json:"title"`
    Content string   `json:"content"`
    Links   []string `json:"links"`
}

func (c *Client) WebSearch(query string, maxResults int) (*SearchResponse, error) {
    payload := map[string]interface{}{
        "query":       query,
        "max_results": maxResults,
    }
    return doRequest[SearchResponse](c, "/web_search", payload)
}

func (c *Client) WebFetch(url string) (*FetchResponse, error) {
    payload := map[string]string{"url": url}
    return doRequest[FetchResponse](c, "/web_fetch", payload)
}

func doRequest[T any](c *Client, endpoint string, payload interface{}) (*T, error) {
    jsonData, err := json.Marshal(payload)
    if err != nil {
        return nil, err
    }

    req, err := http.NewRequest("POST", c.baseURL+endpoint, bytes.NewBuffer(jsonData))
    if err != nil {
        return nil, err
    }
    req.Header.Set("Authorization", "Bearer "+c.apiKey)
    req.Header.Set("Content-Type", "application/json")

    resp, err := c.httpClient.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, err
    }
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("API error (status %d): %s", resp.StatusCode, string(body))
    }

    var result T
    if err := json.Unmarshal(body, &result); err != nil {
        return nil, err
    }
    return &result, nil
}
Usage:
package main

import (
    "fmt"
    "log"

    "ollama-search/ollama"
)

func main() {
    client, err := ollama.NewClient()
    if err != nil {
        log.Fatal(err)
    }

    // Search
    results, err := client.WebSearch("Ollama new features", 5)
    if err != nil {
        log.Fatal(err)
    }
    for _, r := range results.Results {
        fmt.Printf("- %s\n  %s\n\n", r.Title, r.URL)
    }

    // Fetch
    page, err := client.WebFetch("https://ollama.com")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Page: %s\n", page.Title)
}
Building a Search Agent
Which models work best for Go-based Ollama search agents? Models with strong tool-use capabilities work best, including qwen3, gpt-oss, and cloud models like qwen3:480b-cloud and deepseek-v3.1-cloud. If you’re working with Qwen3 models and need to process or rerank search results, see our guide on reranking text documents with Ollama and Qwen3 Embedding model in Go.
Here’s a complete search agent implementation:
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
)

// Chat types
type Message struct {
    Role      string     `json:"role"`
    Content   string     `json:"content,omitempty"`
    ToolCalls []ToolCall `json:"tool_calls,omitempty"`
    ToolName  string     `json:"tool_name,omitempty"`
}

type ToolCall struct {
    Function FunctionCall `json:"function"`
}

type FunctionCall struct {
    Name      string                 `json:"name"`
    Arguments map[string]interface{} `json:"arguments"`
}

type Tool struct {
    Type     string       `json:"type"`
    Function ToolFunction `json:"function"`
}

type ToolFunction struct {
    Name        string                 `json:"name"`
    Description string                 `json:"description"`
    Parameters  map[string]interface{} `json:"parameters"`
}

type ChatRequest struct {
    Model    string    `json:"model"`
    Messages []Message `json:"messages"`
    Tools    []Tool    `json:"tools,omitempty"`
    Stream   bool      `json:"stream"`
}

type ChatResponse struct {
    Message Message `json:"message"`
}

// SearchAgent orchestrates web search with LLM
type SearchAgent struct {
    model      string
    ollamaURL  string
    apiKey     string
    maxIter    int
    httpClient *http.Client
}

func NewSearchAgent(model string) *SearchAgent {
    return &SearchAgent{
        model:      model,
        ollamaURL:  "http://localhost:11434/api/chat",
        apiKey:     os.Getenv("OLLAMA_API_KEY"),
        maxIter:    10,
        httpClient: &http.Client{},
    }
}

func (a *SearchAgent) Query(question string) (string, error) {
    tools := []Tool{
        {
            Type: "function",
            Function: ToolFunction{
                Name:        "web_search",
                Description: "Search the web for current information",
                Parameters: map[string]interface{}{
                    "type": "object",
                    "properties": map[string]interface{}{
                        "query": map[string]string{
                            "type":        "string",
                            "description": "The search query",
                        },
                    },
                    "required": []string{"query"},
                },
            },
        },
        {
            Type: "function",
            Function: ToolFunction{
                Name:        "web_fetch",
                Description: "Fetch the full content of a web page",
                Parameters: map[string]interface{}{
                    "type": "object",
                    "properties": map[string]interface{}{
                        "url": map[string]string{
                            "type":        "string",
                            "description": "The URL to fetch",
                        },
                    },
                    "required": []string{"url"},
                },
            },
        },
    }

    messages := []Message{
        {Role: "user", Content: question},
    }

    for i := 0; i < a.maxIter; i++ {
        response, err := a.chat(messages, tools)
        if err != nil {
            return "", fmt.Errorf("chat error: %w", err)
        }
        messages = append(messages, response.Message)

        // No tool calls means we have a final answer
        if len(response.Message.ToolCalls) == 0 {
            return response.Message.Content, nil
        }

        // Execute tool calls
        for _, toolCall := range response.Message.ToolCalls {
            fmt.Printf("🔧 Calling: %s\n", toolCall.Function.Name)
            result, err := a.executeTool(toolCall)
            if err != nil {
                result = fmt.Sprintf("Error: %v", err)
            }
            // Truncate for context limits
            if len(result) > 8000 {
                result = result[:8000] + "... [truncated]"
            }
            messages = append(messages, Message{
                Role:     "tool",
                Content:  result,
                ToolName: toolCall.Function.Name,
            })
        }
    }
    return "", fmt.Errorf("max iterations reached")
}

func (a *SearchAgent) chat(messages []Message, tools []Tool) (*ChatResponse, error) {
    reqBody := ChatRequest{
        Model:    a.model,
        Messages: messages,
        Tools:    tools,
        Stream:   false,
    }
    jsonData, err := json.Marshal(reqBody)
    if err != nil {
        return nil, err
    }

    resp, err := a.httpClient.Post(a.ollamaURL, "application/json", bytes.NewBuffer(jsonData))
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, err
    }
    // Without this check, an error response would decode into an empty
    // message and be mistaken for a final answer.
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("chat API error (status %d): %s", resp.StatusCode, string(body))
    }

    var chatResp ChatResponse
    if err := json.Unmarshal(body, &chatResp); err != nil {
        return nil, err
    }
    return &chatResp, nil
}

func (a *SearchAgent) executeTool(toolCall ToolCall) (string, error) {
    switch toolCall.Function.Name {
    case "web_search":
        query, ok := toolCall.Function.Arguments["query"].(string)
        if !ok {
            return "", fmt.Errorf("invalid query argument")
        }
        return a.webSearch(query)
    case "web_fetch":
        url, ok := toolCall.Function.Arguments["url"].(string)
        if !ok {
            return "", fmt.Errorf("invalid url argument")
        }
        return a.webFetch(url)
    default:
        return "", fmt.Errorf("unknown tool: %s", toolCall.Function.Name)
    }
}

func (a *SearchAgent) webSearch(query string) (string, error) {
    payload := map[string]interface{}{"query": query, "max_results": 5}
    jsonData, _ := json.Marshal(payload)
    req, _ := http.NewRequest("POST", "https://ollama.com/api/web_search", bytes.NewBuffer(jsonData))
    req.Header.Set("Authorization", "Bearer "+a.apiKey)
    req.Header.Set("Content-Type", "application/json")

    resp, err := a.httpClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    body, _ := io.ReadAll(resp.Body)
    return string(body), nil
}

func (a *SearchAgent) webFetch(url string) (string, error) {
    payload := map[string]string{"url": url}
    jsonData, _ := json.Marshal(payload)
    req, _ := http.NewRequest("POST", "https://ollama.com/api/web_fetch", bytes.NewBuffer(jsonData))
    req.Header.Set("Authorization", "Bearer "+a.apiKey)
    req.Header.Set("Content-Type", "application/json")

    resp, err := a.httpClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    body, _ := io.ReadAll(resp.Body)
    return string(body), nil
}

func main() {
    agent := NewSearchAgent("qwen3:4b")
    answer, err := agent.Query("What are the latest features in Ollama?")
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }
    fmt.Println("\n📝 Answer:")
    fmt.Println(answer)
}
How do I handle large web search responses in Go? Truncate the response content before adding it to the model's context. Simple byte slicing to roughly 8,000 characters (as the agent above does) fits most context windows, though it can split a multi-byte UTF-8 character at the cut point.
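If responses may contain non-ASCII text, slicing by runes avoids cutting a character in half. A minimal sketch; truncateRunes is an illustrative helper, not part of any Ollama API:

```go
package main

import "fmt"

// truncateRunes shortens s to at most max runes, appending an ellipsis.
// Unlike byte slicing (s[:8000]), it never splits a multi-byte UTF-8
// character in the middle.
func truncateRunes(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max]) + "..."
}

func main() {
	fmt.Println(truncateRunes("héllo wörld", 4)) // héll...
}
```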
Concurrent Search
Go excels at concurrent operations. Here’s how to perform multiple searches in parallel. Understanding how Ollama handles parallel requests can help you optimize your concurrent search implementations.
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
    "sync"
)

type SearchResult struct {
    Query   string
    Results []struct {
        Title   string `json:"title"`
        URL     string `json:"url"`
        Content string `json:"content"`
    } `json:"results"`
    Error error
}

func concurrentSearch(queries []string) []SearchResult {
    results := make([]SearchResult, len(queries))
    var wg sync.WaitGroup

    for i, query := range queries {
        wg.Add(1)
        go func(idx int, q string) {
            defer wg.Done()
            result := SearchResult{Query: q}

            payload := map[string]string{"query": q}
            jsonData, _ := json.Marshal(payload)
            req, _ := http.NewRequest("POST", "https://ollama.com/api/web_search", bytes.NewBuffer(jsonData))
            req.Header.Set("Authorization", "Bearer "+os.Getenv("OLLAMA_API_KEY"))
            req.Header.Set("Content-Type", "application/json")

            resp, err := http.DefaultClient.Do(req)
            if err != nil {
                result.Error = err
                results[idx] = result
                return
            }
            defer resp.Body.Close()

            body, _ := io.ReadAll(resp.Body)
            if resp.StatusCode != http.StatusOK {
                result.Error = fmt.Errorf("API error (status %d): %s", resp.StatusCode, string(body))
            } else if err := json.Unmarshal(body, &result); err != nil {
                result.Error = err
            }
            results[idx] = result
        }(i, query)
    }

    wg.Wait()
    return results
}

func main() {
    queries := []string{
        "Ollama latest features",
        "local LLM deployment",
        "AI search agents Go",
    }

    results := concurrentSearch(queries)
    for _, r := range results {
        fmt.Printf("\n🔍 Query: %s\n", r.Query)
        if r.Error != nil {
            fmt.Printf("   Error: %v\n", r.Error)
            continue
        }
        for _, item := range r.Results[:min(3, len(r.Results))] {
            fmt.Printf("   • %s\n", item.Title)
        }
    }
}

// min is built into Go 1.21+; defined here for older toolchains.
func min(a, b int) int {
    if a < b {
        return a
    }
    return b
}
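The goroutine-per-query pattern above starts every request at once; with many queries you may want to cap the number of in-flight requests. A minimal sketch of bounded concurrency using a buffered channel as a counting semaphore; runBounded and its stand-in worker are illustrative, not part of the Ollama API:

```go
package main

import (
	"fmt"
	"sync"
)

// runBounded executes fn for every item with at most limit goroutines
// doing work at once; the buffered channel acts as a counting semaphore.
func runBounded(items []string, limit int, fn func(string) string) []string {
	results := make([]string, len(items))
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		go func(idx int, it string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			results[idx] = fn(it)
		}(i, item)
	}
	wg.Wait()
	return results
}

func main() {
	queries := []string{"a", "b", "c", "d", "e"}
	out := runBounded(queries, 2, func(q string) string {
		// replace this stand-in with a real webSearch call
		return "result for " + q
	})
	fmt.Println(out)
}
```

Each goroutine blocks on the semaphore before doing work, so at most two searches run at a time regardless of how many queries are queued.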
Context and Cancellation
Add context support for timeouts and cancellation:
package main

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
    "time"
)

func webSearchWithContext(ctx context.Context, query string) (string, error) {
    payload := map[string]string{"query": query}
    jsonData, _ := json.Marshal(payload)

    req, err := http.NewRequestWithContext(ctx, "POST", "https://ollama.com/api/web_search", bytes.NewBuffer(jsonData))
    if err != nil {
        return "", err
    }
    req.Header.Set("Authorization", "Bearer "+os.Getenv("OLLAMA_API_KEY"))
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return "", err
    }
    return string(body), nil
}

func main() {
    // Create context with 10 second timeout
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    result, err := webSearchWithContext(ctx, "Ollama web search API")
    if err != nil {
        if ctx.Err() == context.DeadlineExceeded {
            fmt.Println("Request timed out")
        } else {
            fmt.Printf("Error: %v\n", err)
        }
        return
    }
    fmt.Println(result)
}
Recommended Models
What context length should I use for Go search agents? Around 32,000 tokens is a reasonable starting point. Search agents benefit from as much context as the model supports, since web search results can be extensive. If you need to manage your Ollama models or move them to different locations, see our guide on how to move Ollama models to different drive or folder.
| Model | Parameters | Best For |
|---|---|---|
| `qwen3:4b` | 4B | Quick local searches |
| `qwen3` | 8B | General purpose agent |
| `gpt-oss` | Various | Research tasks |
| `qwen3:480b-cloud` | 480B | Complex reasoning (cloud) |
| `gpt-oss:120b-cloud` | 120B | Long-form research (cloud) |
| `deepseek-v3.1-cloud` | - | Advanced analysis (cloud) |
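To apply a context length to a local model, the chat request can carry an options object; num_ctx is Ollama's parameter for context length. A minimal sketch of building such a payload (buildChatRequest is a hypothetical helper; the structs mirror only the fields needed here):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChatMessage and ChatRequest mirror the parts of Ollama's /api/chat
// payload used here; Options carries model parameters such as num_ctx.
type ChatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type ChatRequest struct {
	Model    string                 `json:"model"`
	Messages []ChatMessage          `json:"messages"`
	Options  map[string]interface{} `json:"options,omitempty"`
	Stream   bool                   `json:"stream"`
}

// buildChatRequest marshals a single-turn request that asks the model
// to run with a numCtx-token context window.
func buildChatRequest(model, prompt string, numCtx int) ([]byte, error) {
	return json.Marshal(ChatRequest{
		Model:    model,
		Messages: []ChatMessage{{Role: "user", Content: prompt}},
		Options:  map[string]interface{}{"num_ctx": numCtx},
		Stream:   false,
	})
}

func main() {
	data, err := buildChatRequest("qwen3:4b", "hi", 32000)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```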
For advanced AI applications that combine text and visual content, consider exploring cross-modal embeddings to extend your search capabilities beyond text-only queries.
Best Practices
- Error Handling: Always check for errors and handle API failures gracefully
- Timeouts: Use context with timeouts for network requests
- Rate Limiting: Respect Ollama’s API rate limits. Be aware of potential changes to Ollama’s API, as discussed in first signs of Ollama enshittification
- Result Truncation: Truncate results to ~8000 chars for context limits
- Concurrent Requests: Use goroutines for parallel searches
- Connection Reuse: Reuse a single http.Client across requests so TCP connections are pooled
- Testing: Write comprehensive unit tests for your search implementations. Follow Go unit testing best practices to ensure your code is robust and maintainable
Useful Links
- Ollama cheatsheet
- Reranking text documents with Ollama and Qwen3 Embedding model - in Go
- Cross-Modal Embeddings: Bridging AI Modalities
- Go Unit Testing: Structure & Best Practices
- How to Move Ollama Models to Different Drive or Folder
- How Ollama Handles Parallel Requests
- First Signs of Ollama Enshittification
- Ollama Web Search Blog Post
- Ollama Official Documentation
- Ollama Go Examples
- Ollama GitHub Repository