BAML vs Instructor: Structured LLM Outputs
Type-safe LLM outputs with BAML and Instructor
When working with Large Language Models in production, getting structured, type-safe outputs is critical. Two popular frameworks, BAML and Instructor, take different approaches to solving this problem.
This comparison helps you choose the right tool for your Python LLM applications.

Understanding Structured Output Challenges
LLMs naturally generate unstructured text, but modern applications need predictable, parseable data. Whether you’re building chatbots, data extraction pipelines, or AI agents, you need JSON objects, validated data types, and error handling—not free-form responses.
Both BAML and Instructor address this challenge but with fundamentally different philosophies: BAML uses a contract-first approach with code generation, while Instructor leverages Python's type system with runtime validation. For broader context, the comparison of structured output approaches across popular LLM providers (linked below) shows how these two frameworks fit into the wider ecosystem.
BAML: Domain-Specific Language for LLMs
BAML (BoundaryML’s language) introduces a dedicated DSL for defining LLM interactions. You write .baml files that declare your prompts, types, and functions, then BAML generates type-safe client code for multiple languages including Python.
Key Features of BAML
Type Safety Across Languages: BAML generates clients for Python, TypeScript, and Ruby from the same .baml definitions, ensuring consistency across your stack.
Version Control for Prompts: Your prompts live in .baml files, making them easy to track, review, and test independently from application code.
Built-in Testing Framework: BAML includes testing tools to validate prompt behavior before deployment, catching issues early in development.
Playground Interface: The BAML playground lets you iterate on prompts visually with immediate feedback, accelerating development cycles.
BAML Example Implementation
// First, define your schema in a .baml file:
// persona.baml
class Person {
  name string
  age int
  occupation string
  skills string[]
}

function ExtractPerson(text: string) -> Person {
  client GPT4
  prompt #"
    Extract person information from: {{ text }}
    Return structured data.
  "#
}
The generated Python client provides type-safe access:
from baml_client import b
from baml_client.types import Person

# Use the generated client
text = "John Smith, 34, software engineer skilled in Python and Go"
result: Person = b.ExtractPerson(text)

print(f"{result.name} is {result.age} years old")
print(f"Skills: {', '.join(result.skills)}")
BAML’s approach shines when you have multiple services consuming the same LLM contracts or when you need strong guarantees about data shapes across language boundaries.
Instructor: Pydantic-Native Python Framework
Instructor takes a Python-first approach, extending Pydantic models with LLM capabilities. It feels natural to Python developers already using Pydantic for validation and type hints.
Key Features of Instructor
Zero Boilerplate: Instructor works directly with your existing Pydantic models by patching your LLM client. No code generation or build steps required.
Rich Validation: Leverage Pydantic’s entire validation ecosystem—custom validators, field constraints, computed fields, and complex nested structures.
Multiple Provider Support: Works seamlessly with OpenAI, Anthropic, Google, and Ollama through a unified interface.
Streaming Support: First-class support for streaming responses with incremental Pydantic model updates.
Retry Logic: Built-in retry mechanisms with exponential backoff and validator-based error recovery.
Instructor Example Implementation
from pydantic import BaseModel, Field
from instructor import from_openai
from openai import OpenAI

# Define your Pydantic model
class Person(BaseModel):
    name: str = Field(description="Full name of the person")
    age: int = Field(ge=0, le=120, description="Age in years")
    occupation: str
    skills: list[str] = Field(description="List of professional skills")

# Patch the OpenAI client
client = from_openai(OpenAI())

# Extract structured data
text = "John Smith, 34, software engineer skilled in Python and Go"
result = client.chat.completions.create(
    model="gpt-4",
    response_model=Person,
    messages=[
        {"role": "user", "content": f"Extract person info: {text}"}
    ],
)

print(f"{result.name} is {result.age} years old")
print(f"Skills: {', '.join(result.skills)}")
Instructor’s strength lies in its simplicity and integration with Python’s ecosystem. If you’re already using Pydantic, the learning curve is minimal. For developers new to Python or needing quick reference for Python-specific patterns, our Python cheatsheet provides helpful syntax reminders alongside these frameworks.
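The streaming support mentioned above follows the same pattern. A minimal sketch, assuming instructor's create_partial helper from recent 1.x releases and the client, model, and text from the example above:

# Stream partially validated Person objects as the response arrives
for partial in client.chat.completions.create_partial(
    model="gpt-4",
    response_model=Person,
    messages=[{"role": "user", "content": f"Extract person info: {text}"}],
):
    print(partial)  # fields fill in incrementally as tokens stream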
Detailed Comparison: BAML vs Instructor
Development Experience
BAML requires an additional build step and tooling setup. You write .baml files, run the generator, then import the generated code. This creates a clear separation between prompt engineering and application logic, which can be beneficial for larger teams.
Instructor has zero setup friction—pip install and you’re ready. Your prompts live alongside your code, making rapid iteration easier for smaller projects or prototypes.
Type Safety and Validation
BAML provides compile-time type checking in the generated code. Your IDE knows exactly what fields are available before you run anything. Cross-language consistency is guaranteed since the same .baml file generates clients for all supported languages.
Instructor offers runtime validation through Pydantic. While Python type hints provide IDE support, errors surface during execution. This is standard for Python but means less static guarantee than BAML’s generated code.
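To make that concrete, here is a minimal sketch of where Instructor-style errors surface, reusing the Person model from the earlier example:

from pydantic import ValidationError

try:
    # Fails at runtime when the data contains a non-numeric age
    Person.model_validate({"name": "John", "age": "thirty-four",
                           "occupation": "engineer", "skills": []})
except ValidationError as exc:
    print(exc.error_count(), "validation error(s)")  # caught only at execution time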
Working with Local LLMs
Both frameworks support local models, which is crucial for privacy, cost control, and offline development. When using Ollama or other local LLM providers, you maintain the same structured output benefits without external API dependencies. For a deeper dive into constraining LLMs with structured output using Ollama, Qwen3, and Python or Go, see the related article linked below; both frameworks provide production-ready abstractions over those lower-level APIs.
BAML connects to Ollama by configuring the client in your .baml file:
// In your .baml file:
client<llm> OllamaLocal {
  provider ollama
  options {
    model "llama2"
    base_url "http://localhost:11434"
  }
}
Instructor works with Ollama through the OpenAI-compatible API:
from openai import OpenAI
from instructor import from_openai

client = from_openai(OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # dummy key; Ollama ignores it
))
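Calls then look identical to the hosted-API version. One caveat: many local models lack reliable tool or function calling, so JSON mode is often the safer choice. A sketch assuming instructor's Mode.JSON setting and the Person model from earlier, with llama2 pulled locally:

import instructor
from openai import OpenAI

# JSON mode asks for raw JSON instead of relying on tool calling
client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON,
)

result = client.chat.completions.create(
    model="llama2",
    response_model=Person,
    messages=[{"role": "user", "content": "Extract person info: Jane Doe, 29, data scientist"}],
)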
Note that when working with local models, you should be aware of potential structured output issues with Ollama and GPT-OSS models, as not all models handle structured outputs with equal reliability.
Error Handling and Retries
BAML handles retries at the framework level with configurable strategies. Errors in schema validation trigger automatic reprompting with error context.
Instructor offers hooks for custom retry behavior and composes cleanly with decorator-based libraries such as tenacity:
from tenacity import retry, stop_after_attempt

@retry(stop=stop_after_attempt(3))
def extract_with_retry(text: str) -> Person:
    return client.chat.completions.create(
        model="gpt-4",
        response_model=Person,
        messages=[{"role": "user", "content": text}],
    )
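Validators close the loop: when validation fails, Instructor feeds the error message back to the model and retries. A minimal sketch using Instructor's built-in max_retries parameter, the client from earlier, and a hypothetical Contact model:

from pydantic import BaseModel, field_validator

class Contact(BaseModel):
    email: str

    @field_validator("email")
    @classmethod
    def must_look_like_email(cls, v: str) -> str:
        if "@" not in v:
            # This message is sent back to the model on the retry
            raise ValueError("email must contain an @ sign")
        return v

contact = client.chat.completions.create(
    model="gpt-4",
    response_model=Contact,
    max_retries=2,  # re-prompt up to twice with the validation error
    messages=[{"role": "user", "content": "Extract the email: reach me at jane at example dot com"}],
)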
Testing and Observability
BAML includes a testing framework where you can write test cases directly in .baml files, validating prompt behavior across different inputs. The playground provides visual debugging.
Instructor integrates with standard Python testing frameworks. You can use pytest fixtures, mocking libraries, and assertion helpers just like any Python code.
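For example, you can pin the schema down against recorded model outputs without any network calls. A small sketch using pytest and the Person model from earlier:

import pytest
from pydantic import ValidationError

CANNED = '{"name": "John Smith", "age": 34, "occupation": "software engineer", "skills": ["Python", "Go"]}'

def test_schema_accepts_recorded_output():
    person = Person.model_validate_json(CANNED)
    assert person.skills == ["Python", "Go"]

def test_schema_rejects_out_of_range_age():
    # age is constrained to 0-120 in the model definition
    with pytest.raises(ValidationError):
        Person.model_validate_json(CANNED.replace('"age": 34', '"age": 999'))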
Performance Considerations
Runtime performance is comparable—both frameworks ultimately make the same LLM API calls. The overhead for validation and parsing is negligible compared to network latency and model inference time.
Development velocity differs significantly:
- BAML’s code generation means better autocomplete and earlier error detection but requires a build step
- Instructor’s decorator approach means faster iteration but runtime error discovery
For production systems processing millions of requests, both frameworks handle load equally well. Your choice depends more on development workflow preferences than performance characteristics.
When to Choose BAML
Select BAML when you need:
- Multi-language support: Accessing the same LLM contracts from Python, TypeScript, and Ruby services
- Contract-first development: API-style development where LLM interfaces are designed before implementation
- Team collaboration: Separate prompt engineering workflows from application development
- Strong typing guarantees: Compile-time checks across your entire stack
- Visual prompt development: Playground-driven iteration on prompts
When to Choose Instructor
Choose Instructor when you want:
- Python-only projects: No need for cross-language consistency
- Rapid prototyping: Minimum setup to get structured outputs working
- Pydantic integration: Leveraging existing Pydantic models and validators
- Simple deployment: No build steps or generated code to manage
- Rich Python ecosystem: Using Python-specific libraries and patterns
Combining Approaches
Some projects benefit from using both frameworks. For example, you might use BAML for customer-facing APIs that need cross-language clients, while using Instructor for internal Python services that need rapid iteration.
You can also transition between frameworks as your project matures—starting with Instructor for quick validation, then moving to BAML when you need broader language support or stricter contracts.
Real-World Use Cases
Data Extraction Pipeline (BAML)
A document processing system uses BAML to extract structured data from invoices, contracts, and receipts. The .baml definitions serve as contracts between the ML team and the backend services, with TypeScript clients for the web dashboard and Python clients for batch processing.
Customer Support Bot (Instructor)
A support bot uses Instructor to classify tickets, extract user intents, and generate responses. The team iterates quickly on prompts using Pydantic models, with validators ensuring extracted phone numbers, emails, and ticket IDs meet format requirements.
Multi-Modal AI Agent (Both)
An AI agent system uses BAML for core agent-to-agent communication contracts, ensuring type safety across the distributed system, while individual agents use Instructor internally for flexible, Python-native processing of user inputs. Similar patterns apply when building MCP servers in Python, where structured outputs enable reliable tool integration with AI assistants.
Migration and Integration Paths
If you’re already using basic JSON parsing with LLMs, both frameworks offer straightforward migration paths:
From JSON to BAML: Convert your JSON schemas to BAML type definitions, move prompts into .baml files, generate clients, and replace manual parsing with generated types.
From JSON to Instructor: Add Pydantic models matching your JSON structure, install instructor, patch your OpenAI client, and replace JSON parsing with response_model parameters.
Both migrations can be incremental—you don’t need to convert your entire codebase at once.
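For the Instructor path, the diff is often only a few lines. A before-and-after sketch reusing the Person model (prompt text is illustrative):

import json
import instructor
from openai import OpenAI

prompt = "Extract person info as JSON: John Smith, 34, software engineer"

# Before: manual parsing, no validation
raw = OpenAI().chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
data = json.loads(raw.choices[0].message.content)  # raises on malformed JSON

# After: the patched client returns a validated Person directly
client = instructor.from_openai(OpenAI())
person = client.chat.completions.create(
    model="gpt-4",
    response_model=Person,
    messages=[{"role": "user", "content": prompt}],
)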
Future Outlook and Community
Both frameworks are actively developed with strong communities:
BAML (BoundaryML) focuses on expanding language support, improving the playground, and enhancing testing capabilities. The commercial backing suggests long-term stability.
Instructor maintains a strong open-source presence with frequent updates, extensive documentation, and growing adoption. The project is well-maintained by Jason Liu and contributors.
Conclusion
BAML and Instructor represent two excellent but distinct approaches to structured LLM outputs. BAML’s contract-first, multi-language philosophy suits teams building distributed systems with strict type requirements. Instructor’s Python-native, Pydantic-based approach fits rapid development and Python-centric stacks.
Neither is universally better—your choice depends on your team’s size, language preferences, development workflow, and type safety requirements. Many teams will find that starting with Instructor for prototyping, then adopting BAML for production multi-service architectures, offers the best of both worlds.
Useful links
Related articles on this site
- Constraining LLMs with Structured Output: Ollama, Qwen3 & Python or Go
- Structured output comparison across popular LLM providers - OpenAI, Gemini, Anthropic, Mistral and AWS Bedrock
- Ollama GPT-OSS Structured Output Issues
- Building MCP Servers in Python: WebSearch & Scrape
- Python Cheatsheet
- Ollama Cheatsheet