Anthropic Claude

Extended context & vision capabilities

Anthropic Claude models excel at long-form content, vision analysis, and precise instruction following. Zaguán automatically handles Claude-specific requirements while maintaining OpenAI compatibility.

What Zaguán handles for you

System prompts

Automatically extracts and converts system messages to Claude's format

Token limits

Sets model-specific defaults for max_tokens (required by Claude)

Tool translation

Converts OpenAI function format to Claude's native tool use

System prompts made easy

Claude uses a dedicated system parameter. Zaguán automatically extracts system messages from your OpenAI-style messages array and converts them.

You write (OpenAI format)

Standard OpenAI messages with system role

from openai import OpenAI

client = OpenAI(
    api_key="your-zaguan-api-key",
    base_url="https://api.zaguanai.com/v1",
)

response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to sort a list"}
    ]
)

Zaguán sends (Claude format)

Automatically converted to Claude's native format

{
  "system": "You are a helpful coding assistant.",
  "messages": [
    {"role": "user", "content": "Write a Python function to sort a list"}
  ]
}
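To picture the extraction itself, here is a minimal sketch of the kind of transformation involved. This is illustrative only, not Zaguán's actual implementation:

```python
def to_claude_payload(messages):
    """Split OpenAI-style messages into Claude's system + messages shape."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    payload = {"messages": [m for m in messages if m["role"] != "system"]}
    if system_parts:
        # Claude takes a single system string, so multiple system messages
        # are joined here; the real gateway may handle this differently.
        payload["system"] = "\n\n".join(system_parts)
    return payload
```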

Maximum output tokens

Claude requires the max_tokens parameter. Zaguán sets smart defaults based on the model, but you can override them:

anthropic/claude-opus-4

Default: 32,000 tokens

anthropic/claude-sonnet-4

Default: 64,000 tokens

anthropic/claude-3.7-sonnet

Default: 64,000 tokens

anthropic/claude-3.5-sonnet

Default: 8,192 tokens

# Override the default
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[...],
    max_tokens=4096  # Custom limit
)

Vision & multimodal

Claude excels at analyzing images. Use OpenAI's standard image format—Zaguán handles the conversion.

Image analysis example

Send images using base64 encoding or URLs

import base64

# Load and encode your image
with open("diagram.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{image_data}"
                    }
                }
            ]
        }
    ],
    max_tokens=8192
)

print(response.choices[0].message.content)
Behind the scenes: Zaguán converts the image_url format to Claude's native image format with proper media type detection and base64 data extraction.
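To make that conversion concrete, here is a hedged sketch of parsing a base64 data URL into Claude's native image block. It is an illustration of the idea, not the gateway's actual code:

```python
def data_url_to_claude_image(url):
    """Convert an OpenAI-style base64 data URL to Claude's image block."""
    header, data = url.split(",", 1)               # header: "data:image/png;base64"
    media_type = header.split(":", 1)[1].split(";", 1)[0]  # -> "image/png"
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }
```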

Function calling (Tool use)

Claude's tool use is powerful and precise. Zaguán translates OpenAI's function format automatically.

Tool use example

Define tools using OpenAI format—works seamlessly with Claude

response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "Calculate 15% tip on $87.50"}
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "calculate_tip",
                "description": "Calculate tip amount",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "amount": {"type": "number"},
                        "percentage": {"type": "number"}
                    },
                    "required": ["amount", "percentage"]
                }
            }
        }
    ],
    tool_choice="auto"
)

# Handle tool calls
import json

message = response.choices[0].message
if message.tool_calls:
    for tool_call in message.tool_calls:
        # Parse the arguments Claude chose, then execute your function
        args = json.loads(tool_call.function.arguments)
        result = calculate_tip(**args)
        # Continue conversation...
Translation details: Zaguán converts the function.parameters to Claude's input_schema and handles tool_choice format differences automatically.
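As an illustration of that translation (a sketch of the mapping, not Zaguán's actual code), an OpenAI tool definition maps to Claude's native shape roughly like this:

```python
def openai_tool_to_claude(tool):
    """Map an OpenAI function-style tool to Claude's native tool shape."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        # OpenAI's JSON Schema "parameters" becomes Claude's "input_schema"
        "input_schema": fn["parameters"],
    }
```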

Streaming responses

Stream Claude responses for real-time user feedback. Works exactly like OpenAI streaming:

stream = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "Write a short story"}
    ],
    stream=True,
    max_tokens=4096
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Best practices

  • Use system prompts: Claude performs best with clear system instructions. Always include a system message for consistent behavior.
  • Set appropriate max_tokens: For long-form content, use higher limits. For quick responses, keep it lower to control costs.
  • Leverage vision capabilities: Claude 3.5+ models excel at analyzing charts, diagrams, and screenshots. Use them for document analysis.
  • Monitor finish_reason: Check response.choices[0].finish_reason to see if output was truncated ("length") or completed naturally ("stop").
  • Use Haiku for speed: Claude Haiku is significantly faster and cheaper for simple tasks. Reserve Sonnet/Opus for complex reasoning.
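The finish_reason check is worth wrapping in a small helper so truncated responses are never silently accepted. A sketch, assuming the standard OpenAI response shape:

```python
def was_truncated(response):
    """True when generation stopped because max_tokens was reached."""
    return response.choices[0].finish_reason == "length"
```

If this returns True, retry with a higher max_tokens or continue the generation in a follow-up request.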

Model recommendations

anthropic/claude-sonnet-4

Best for: Complex analysis, long-form content, coding tasks. Up to 64K output tokens.

anthropic/claude-3.5-sonnet

Best for: Balanced performance and cost. Great for most production use cases.

anthropic/claude-3.5-haiku

Best for: Fast responses, simple tasks, high-volume applications. Most cost-effective.

anthropic/claude-opus-4

Best for: Maximum capability, research tasks, complex multi-step reasoning.

💡 Tip for small teams: Start with Claude 3.5 Haiku for development and testing. Upgrade to Sonnet for production when you need better quality. Reserve Opus for critical tasks where accuracy is paramount.