Ollama GPT-OSS Structured Output Issues
Not very nice.
Ollama’s GPT-OSS models have recurring issues with structured output, especially when used through frameworks such as LangChain, the OpenAI SDK, vLLM, and others.
Many users report failures to generate valid JSON or other structured formats, hallucinated format elements, and inconsistent or empty response content. These problems stem from compatibility gaps, response-format changes (such as Harmony), and incomplete enforcement of output schemas by both Ollama and third-party APIs.
About GPT-OSS
This is a new and very interesting LLM family from OpenAI. Just look at these parameters:
| Model | gpt-oss-120b | gpt-oss-20b |
|---|---|---|
| Layers | 36 | 24 |
| Total Params | 117B | 21B |
| Active Params Per Token | 5.1B | 3.6B |
| Total Experts | 128 | 32 |
| Active Experts Per Token | 4 | 4 |
| Context Length | 128k | 128k |
The release notes say (here and here):
- Permissive Apache 2.0 license: Build freely without copyleft restrictions or patent risk—ideal for experimentation, customization, and commercial deployment.
- Configurable reasoning effort: Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.
- Full chain-of-thought: Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. It’s not intended to be shown to end users.
- Fine-tunable: Fully customize models to your specific use case through parameter fine-tuning.
- Agentic capabilities: Use the models’ native capabilities for function calling, web browsing, Python code execution, and Structured Outputs.
- MXFP4 quantization: The models were post-trained with MXFP4 quantization of the MoE weights, making gpt-oss-120b run on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) and the gpt-oss-20b model run within 16GB of memory. All evals were performed with the same MXFP4 quantization.
What’s not to love? The behaviour of structured output, that’s what. Overall, this issue is very disappointing, especially because structured output works so well with Ollama and Qwen3.
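For reference, this is roughly what a schema-constrained request to Ollama’s `/api/chat` endpoint looks like. A minimal sketch: the schema and prompt are made up, and it only builds the payload — in practice you would POST it to a local Ollama server (since Ollama 0.5, `format` accepts a full JSON schema, not just `"json"`):

```python
import json

# Hypothetical example schema -- any JSON schema works here.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "capital": {"type": "string"},
    },
    "required": ["name", "capital"],
}

# Payload for POST http://localhost:11434/api/chat (Ollama's chat endpoint).
payload = {
    "model": "gpt-oss:20b",
    "messages": [
        {"role": "user", "content": "Name any country and its capital."}
    ],
    "format": schema,   # ask Ollama to constrain output to this schema
    "stream": False,
}

print(json.dumps(payload, indent=2))
```

With Qwen3 this kind of request reliably yields schema-compliant JSON; with gpt-oss the issues below appear.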
Common Issues
- Models like gpt-oss:20b frequently fail to produce strict JSON or schema-compliant output, with responses often containing extra commentary or incomplete objects.
- Integration with LangChain and OpenAI SDK tends to throw parsing/validation errors due to non-structured output, making pipelines unusable in production environments.
- Harmony format in gpt-oss introduces reasoning traces even when not requested, complicating schema parsing compared to other models such as Qwen3.
- With vLLM, structured output enforcement mechanisms are either missing or deprecated, so the output is frequently “unguided” and must be parsed manually.
- There are reports of the model producing the correct structured output, then continuing with unrelated content, breaking standard parsers.
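The trailing-content failure in the last bullet can often be mitigated with a lenient extractor that decodes only the first complete JSON object and ignores whatever the model appends afterwards. A minimal stdlib-only sketch (the sample string imitates the reported failure mode):

```python
import json

def extract_first_json(text: str) -> dict:
    """Decode the first complete JSON object in `text`, ignoring any
    commentary the model appends after it. Raises ValueError if none found."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not a valid object here; try the next opening brace.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")

# Typical gpt-oss failure mode: valid JSON followed by unrelated prose.
raw = '{"name": "France", "capital": "Paris"} Sure! Let me know if...'
print(extract_first_json(raw))  # {'name': 'France', 'capital': 'Paris'}
```

This does not fix the model, but it keeps pipelines from crashing on the “correct JSON plus extra chatter” pattern.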
Workarounds and Fixes
- Some users suggest explicitly specifying the JSON schema in the prompt and attempting manual parsing of model outputs, sometimes using pre- and post-split markers.
- Another approach is to run a post-processing layer or a smaller LLM to reformat GPT-OSS output to the desired schema, though this is resource-intensive.
- A few bugfixes and pull requests (PRs) have incrementally improved Harmony format compliance, particularly with newer Ollama releases, but complete parity with previous models is not achieved yet.
- When using vLLM, patching specific functions may help, but robust schema enforcement generally isn’t supported at this time.
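The marker-based workaround from the first bullet can be sketched like this. The marker strings and prompt wording are my own; any distinctive pre/post pair works:

```python
import json

BEGIN, END = "<<<JSON>>>", "<<<END>>>"

# {schema} is a str.format placeholder to be filled with your JSON schema.
PROMPT_TEMPLATE = (
    "Answer ONLY with a JSON object matching this schema: {schema}. "
    "Wrap it between " + BEGIN + " and " + END + " with nothing else inside."
)

def parse_marked_output(text: str) -> dict:
    """Pull the JSON between the pre/post markers and parse it strictly."""
    try:
        inner = text.split(BEGIN, 1)[1].split(END, 1)[0]
    except IndexError:
        raise ValueError("markers not found in model output")
    return json.loads(inner)

# Simulated model reply: reasoning before the markers, chatter after.
reply = ('Let me think... <<<JSON>>>'
         '{"name": "Japan", "capital": "Tokyo"}<<<END>>> Done.')
print(parse_marked_output(reply))  # {'name': 'Japan', 'capital': 'Tokyo'}
```

Because gpt-oss tends to emit Harmony reasoning traces around the payload, explicit markers are often more reliable than hoping the whole response is parseable JSON.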
Recommendations
- Avoid relying solely on GPT-OSS for strict structured output until full compatibility is restored in Ollama and downstream frameworks.
- Where structured output is critical, use additional parsing or a model better known for schema compliance.
- Monitor relevant GitHub issues (ollama/ollama, langchain-ai/langchain, vllm-project/vllm) for fixes and integration updates.
In summary, GPT-OSS with Ollama currently struggles with structured output, largely due to incomplete format enforcement, Harmony format changes, and uneven support across toolchains. Manual workarounds may help, but consistent success is not guaranteed.
Useful links
- https://www.reddit.com/r/MachineLearning/comments/1n37qnu/d_ollamagptoss20b_cant_seem_to_generate/
- https://github.com/vllm-project/vllm/issues/23120
- https://github.com/ollama/ollama/issues/11691
- https://huggingface.co/openai/gpt-oss-20b/discussions/111
- https://github.com/langchain-ai/langchain/issues/33116
- https://ollama.com/library/gpt-oss
- https://openai.com/index/introducing-gpt-oss/
Other Ollama Articles
- LLMs and Structured Output: Ollama, Qwen3 & Python or Go
- Structured output comparison across popular LLM providers - OpenAI, Gemini, Anthropic, Mistral and AWS Bedrock
- Ollama cheatsheet
- Test: How Ollama is using Intel CPU Performance and Efficient Cores
- How Ollama Handles Parallel Requests
- LLM Performance and PCIe Lanes: Key Considerations