Advanced

Unlock provider-specific features

While Zaguán maintains 97%+ OpenAI compatibility, you can access advanced features unique to Google Gemini, Anthropic Claude, and xAI Grok. Control reasoning depth, enable prompt caching (90% cost savings), manage thinking budgets, and access stateful conversations—all through the same familiar API interface using our extra_body translation layer.

How it works

Zaguán automatically routes your requests based on the model name and translates parameters between OpenAI format and provider-specific formats. You get the best of both worlds: OpenAI compatibility plus native provider features.

1. Send request: use the OpenAI SDK with the Zaguán base URL.
2. Auto-translate: we convert your parameters to the provider's native format.
3. Get response: receive an OpenAI-compatible response.
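The routing in step 2 happens server-side, but the idea can be sketched client-side. This is illustrative only: the "provider/model" prefix convention matches the model names used on this page, and the fallback for unprefixed models is an assumption, not documented behavior.

```python
# Illustrative sketch of prefix-based routing (Zaguán does this server-side).
def provider_for(model: str) -> str:
    prefix = model.split("/", 1)[0]
    # Fallback to "openai" for unprefixed names is an assumption for this sketch.
    return {"google": "gemini", "anthropic": "claude", "xai": "grok"}.get(prefix, "openai")

provider_for("google/gemini-2.5-pro")  # resolves to the Gemini backend
```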

Google Gemini Features

Advanced reasoning and thinking control for complex tasks

Key capabilities

  • Reasoning control: Adjust depth with reasoning_effort
  • Thinking budgets: Allocate tokens for internal reasoning
  • Thought visibility: See the model's reasoning process
  • Safety settings: Configure content filtering per harm category
  • Function calling: OpenAI-compatible tool use
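As a minimal sketch, the reasoning controls above can be bundled into request kwargs for the OpenAI SDK. The `gemini_reasoning_request` helper is hypothetical; the `reasoning_effort` and `thinking_budget` field names come from the quick examples on this page, and everything else is illustrative.

```python
# Hypothetical helper: build request kwargs with Gemini reasoning controls
# routed through extra_body. Field names follow this page's quick examples.
def gemini_reasoning_request(prompt, effort="high", thinking_budget=2048):
    return {
        "model": "google/gemini-2.5-pro",
        "messages": [{"role": "user", "content": prompt}],
        "extra_body": {
            "reasoning_effort": effort,          # depth of reasoning
            "thinking_budget": thinking_budget,  # tokens for internal reasoning
        },
    }

kwargs = gemini_reasoning_request("Explain quantum computing")
# Then: client.chat.completions.create(**kwargs)
```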

Anthropic Claude Features

Prompt caching (90% cost savings), extended thinking, and PDF support

Key capabilities

  • Prompt caching: 90% cost savings on cached tokens (5-minute TTL)
  • Extended thinking: Up to 8K token outputs for complex reasoning
  • PDF support: Native document analysis with base64 encoding
  • System prompts: Automatic extraction and conversion
  • Vision support: Analyze images with multimodal models
  • Tool use: Native function calling translation
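A minimal sketch of prompt caching: the `system_cache_control` shape matches the quick example on this page, while the `claude_cached_request` helper itself is hypothetical. The cache TTL and savings figure are provider-side behavior.

```python
# Hypothetical helper: enable Claude prompt caching on the system prompt
# via extra_body. Cached tokens get ~90% savings with a 5-minute TTL.
def claude_cached_request(system_prompt, user_prompt):
    return {
        "model": "anthropic/claude-3-5-sonnet",
        "messages": [
            {"role": "system", "content": system_prompt},  # extracted and converted by the gateway
            {"role": "user", "content": user_prompt},
        ],
        "extra_body": {
            "system_cache_control": {"type": "ephemeral"},
        },
    }

kwargs = claude_cached_request("You are a contract-review assistant.", "Summarize clause 4.")
# Then: client.chat.completions.create(**kwargs)
```

Caching pays off when the same long system prompt is reused across many calls within the TTL window.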

xAI Grok Features

Stateful conversations with server-side storage and encrypted thinking

Key capabilities

  • Stateful conversations: 30-day server-side storage
  • Encrypted thinking: Access model's reasoning process
  • Automatic context: No need to send full history
  • Response IDs: Reference previous responses easily
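A sketch of a stateful call: `use_responses_api` and `store` come from the quick example on this page; the `previous_response_id` field name is a guess used here for illustration only, so check the examples page for the exact key.

```python
# Hypothetical helper: build a stateful Grok request. With store=True the
# server keeps the conversation (30-day retention), so follow-ups can
# reference a prior response instead of resending full history.
def grok_stateful_request(prompt, previous_response_id=None):
    extra = {"use_responses_api": True, "store": True}
    if previous_response_id is not None:
        # Field name is illustrative, not a confirmed schema.
        extra["previous_response_id"] = previous_response_id
    return {
        "model": "xai/grok-beta",
        "messages": [{"role": "user", "content": prompt}],
        "extra_body": extra,
    }

kwargs = grok_stateful_request("Start a planning session")
# Then: client.chat.completions.create(**kwargs)
```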

The extra_body parameter

This is your gateway to provider-specific features. Add it to any OpenAI SDK call to access advanced capabilities:

from openai import OpenAI

client = OpenAI(
    api_key="your-zaguan-api-key",
    base_url="https://api.zaguanai.com/v1",
)

response = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    extra_body={
        "reasoning_effort": "high"  # Provider-specific feature
    }
)

Quick examples

Enable Gemini reasoning

{
  "model": "google/gemini-2.5-pro",
  "messages": [...],
  "extra_body": {
    "reasoning_effort": "high",
    "thinking_budget": 2048
  }
}

Enable Claude prompt caching

{
  "model": "anthropic/claude-3-5-sonnet",
  "messages": [...],
  "extra_body": {
    "system_cache_control": {
      "type": "ephemeral"
    }
  }
}

xAI stateful conversation

{
  "model": "xai/grok-beta",
  "messages": [...],
  "extra_body": {
    "use_responses_api": true,
    "store": true
  }
}

Claude extended thinking

{
  "model": "anthropic/claude-3-5-sonnet",
  "messages": [...],
  "max_tokens": 8000,
  "extra_body": {
    "extended_thinking": true
  }
}

Best practices for solo devs & small teams

  • Start simple: Use standard parameters first, add advanced features only when needed
  • Test incrementally: Try one provider-specific feature at a time to understand its impact
  • Monitor costs: Reasoning and extended output consume more tokens—track usage in your dashboard
  • Keep it portable: Wrap provider-specific code in functions so you can swap providers easily
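One way to keep it portable, sketched below: centralize the provider-specific `extra_body` payloads in a single table so the rest of your code stays provider-agnostic. The `PROVIDER_EXTRAS` mapping and `build_request` helper are hypothetical; the payload shapes reuse the quick examples above.

```python
# Hypothetical portability wrapper: provider-specific extras live in one
# place, keyed by model prefix, so swapping providers is a one-line change.
PROVIDER_EXTRAS = {
    "google/": {"reasoning_effort": "high"},
    "anthropic/": {"system_cache_control": {"type": "ephemeral"}},
    "xai/": {"use_responses_api": True, "store": True},
}

def build_request(model, messages):
    kwargs = {"model": model, "messages": messages}
    for prefix, extras in PROVIDER_EXTRAS.items():
        if model.startswith(prefix):
            kwargs["extra_body"] = dict(extras)  # copy, so callers can tweak safely
            break
    return kwargs

kwargs = build_request("anthropic/claude-3-5-sonnet",
                       [{"role": "user", "content": "Hello"}])
# Then: client.chat.completions.create(**kwargs)
```

Unrecognized models simply get no `extra_body`, so the same call path works for plain OpenAI-compatible requests.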