Unlock provider-specific features
While Zaguán maintains 97%+ OpenAI compatibility, you can access advanced features unique to Google Gemini, Anthropic Claude, and xAI Grok. Control reasoning depth, enable prompt caching (90% cost savings), manage thinking budgets, and access stateful conversations—all through the same familiar API interface using our extra_body translation layer.
How it works
Zaguán automatically routes your requests based on the model name and translates parameters between OpenAI format and provider-specific formats. You get the best of both worlds: OpenAI compatibility plus native provider features.
Send request
Use OpenAI SDK with Zaguán base URL
Auto-translate
We convert parameters to provider format
Get response
Receive OpenAI-compatible response
Google Gemini Features
Advanced reasoning and thinking control for complex tasks
Key capabilities
- Reasoning control: Adjust depth with reasoning_effort
- Thinking budgets: Allocate tokens for internal reasoning
- Thought visibility: See the model's reasoning process
- Safety settings: Configure content filtering per harm category
- Function calling: OpenAI-compatible tool use
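As a sketch of how these capabilities combine, the helper below builds the request kwargs for a Gemini call. The extra_body keys mirror the quick examples later in this guide (reasoning_effort, thinking_budget); treat the exact accepted values as something to confirm against the Zaguán reference.

```python
def build_gemini_request(prompt: str, effort: str = "high", budget: int = 2048) -> dict:
    """Build kwargs for client.chat.completions.create targeting Gemini.

    The extra_body keys follow this guide's quick examples; exact
    accepted values may vary by Zaguán version.
    """
    return {
        "model": "google/gemini-2.5-pro",
        "messages": [{"role": "user", "content": prompt}],
        "extra_body": {
            "reasoning_effort": effort,   # controls reasoning depth
            "thinking_budget": budget,    # tokens reserved for internal reasoning
        },
    }

kwargs = build_gemini_request("Explain quantum computing")
# Pass to an OpenAI client pointed at the Zaguán base URL:
# client.chat.completions.create(**kwargs)
```

Keeping the payload construction in one function makes it easy to tune reasoning depth per task without touching call sites.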
Anthropic Claude Features
Prompt caching (90% cost savings), extended thinking, and PDF support
Key capabilities
- Prompt caching: 90% cost savings on cached tokens (5-minute TTL)
- Extended thinking: Up to 8K token outputs for complex reasoning
- PDF support: Native document analysis with base64 encoding
- System prompts: Automatic extraction and conversion
- Vision support: Analyze images with multimodal models
- Tool use: Native function calling translation
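Prompt caching pays off when the same long system prompt is reused within the 5-minute TTL. The sketch below builds a cacheable Claude request using the system_cache_control key from this guide's quick examples; Zaguán extracts the system message and applies the cache control on translation.

```python
def build_claude_cached_request(system_prompt: str, user_prompt: str) -> dict:
    """Request kwargs enabling Anthropic prompt caching via Zaguán.

    system_cache_control matches this guide's quick examples; the 90%
    savings apply to cache hits within the 5-minute TTL.
    """
    return {
        "model": "anthropic/claude-3-5-sonnet",
        "messages": [
            # A long, stable system prompt is the ideal caching candidate.
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "extra_body": {"system_cache_control": {"type": "ephemeral"}},
    }
```

The first call pays full price to write the cache; subsequent calls within the TTL hit the cached prefix.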
xAI Grok Features
Stateful conversations with server-side storage and encrypted thinking
Key capabilities
- Stateful conversations: 30-day server-side storage
- Encrypted thinking: Access model's reasoning process
- Automatic context: No need to send full history
- Response IDs: Reference previous responses easily
The extra_body parameter
This is your gateway to provider-specific features. Add it to any OpenAI SDK call to access advanced capabilities:
from openai import OpenAI
client = OpenAI(
api_key="your-zaguan-api-key",
base_url="https://api.zaguanai.com/v1",
)
response = client.chat.completions.create(
model="google/gemini-2.5-pro",
messages=[{"role": "user", "content": "Explain quantum computing"}],
extra_body={
"reasoning_effort": "high" # Provider-specific feature
}
)

Quick examples
Enable Gemini reasoning
{
"model": "google/gemini-2.5-pro",
"messages": [...],
"extra_body": {
"reasoning_effort": "high",
"thinking_budget": 2048
}
}

Enable Claude prompt caching
{
"model": "anthropic/claude-3-5-sonnet",
"messages": [...],
"extra_body": {
"system_cache_control": {
"type": "ephemeral"
}
}
}

xAI stateful conversation
{
"model": "xai/grok-beta",
"messages": [...],
"extra_body": {
"use_responses_api": true,
"store": true
}
}

Claude extended thinking
{
"model": "anthropic/claude-3-5-sonnet",
"messages": [...],
"max_tokens": 8000,
"extra_body": {
"extended_thinking": true
}
}

Best practices for solo devs & small teams
- Start simple: Use standard parameters first, and add advanced features only when needed
- Test incrementally: Try one provider-specific feature at a time to understand its impact
- Monitor costs: Reasoning and extended output consume more tokens, so track usage in your dashboard
- Keep it portable: Wrap provider-specific code in functions so you can swap providers easily
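One way to keep things portable is to key provider-specific extras off the model prefix, so call sites never mention extra_body directly. This is an illustrative pattern, not a Zaguán API; the defaults in the table are assumptions you would tune per project.

```python
# Illustrative defaults per provider prefix (adjust to your needs).
PROVIDER_EXTRAS = {
    "google/": {"reasoning_effort": "high"},
    "anthropic/": {"system_cache_control": {"type": "ephemeral"}},
    "xai/": {"use_responses_api": True, "store": True},
}

def with_provider_extras(model: str, **kwargs) -> dict:
    """Attach provider-specific extra_body based on the model prefix,
    so calling code stays provider-agnostic."""
    for prefix, extras in PROVIDER_EXTRAS.items():
        if model.startswith(prefix):
            kwargs.setdefault("extra_body", {}).update(extras)
            break
    return {"model": model, **kwargs}

# Swapping providers is now a one-string change at the call site:
# client.chat.completions.create(**with_provider_extras(
#     "anthropic/claude-3-5-sonnet",
#     messages=[{"role": "user", "content": "Explain quantum computing"}],
# ))
```

Models without a matching prefix get no extras, so standard OpenAI-compatible calls pass through unchanged.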