Extended context & vision capabilities
Anthropic Claude models excel at long-form content, vision analysis, and precise instruction following. Zaguán automatically handles Claude-specific requirements while maintaining OpenAI compatibility.
What Zaguán handles for you
- System prompts: automatically extracts system messages and converts them to Claude's dedicated system parameter.
- Token limits: sets model-specific defaults for max_tokens (which Claude requires).
- Tool translation: converts OpenAI function definitions to Claude's native tool-use format.
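Conceptually, the system-prompt extraction works like the sketch below. This is illustrative only: split_system is a hypothetical helper name, not part of Zaguán's API.

```python
def split_system(messages):
    """Pull OpenAI-style system messages out into Claude's dedicated field.

    Sketch of the translation Zaguán performs; multiple system messages
    are assumed to be joined with blank lines.
    """
    system = "\n\n".join(m["content"] for m in messages if m["role"] == "system")
    rest = [m for m in messages if m["role"] != "system"]
    return system, rest
```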
System prompts made easy
Claude uses a dedicated system parameter. Zaguán automatically extracts system messages from your OpenAI-style messages array and converts them.
You write (OpenAI format)
Standard OpenAI messages with system role
from openai import OpenAI

client = OpenAI(
    api_key="your-zaguan-api-key",
    base_url="https://api.zaguanai.com/v1",
)

response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to sort a list"}
    ]
)

Zaguán sends (Claude format)
Automatically converted to Claude's native format
{
  "system": "You are a helpful coding assistant.",
  "messages": [
    {"role": "user", "content": "Write a Python function to sort a list"}
  ]
}

Maximum output tokens
Claude requires the max_tokens parameter. Zaguán sets smart defaults based on the model, but you can override them:
- anthropic/claude-opus-4: default 32,000 tokens
- anthropic/claude-sonnet-4: default 64,000 tokens
- anthropic/claude-3.7-sonnet: default 64,000 tokens
- anthropic/claude-3.5-sonnet: default 8,192 tokens
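The defaults above amount to a per-model lookup with a caller override. A minimal sketch, assuming this shape: the helper name, dictionary, and the 8,192 fallback for unknown models are illustrative, not Zaguán's actual internals.

```python
# Mirrors the default table above; names here are illustrative only.
CLAUDE_MAX_TOKENS_DEFAULTS = {
    "anthropic/claude-opus-4": 32_000,
    "anthropic/claude-sonnet-4": 64_000,
    "anthropic/claude-3.7-sonnet": 64_000,
    "anthropic/claude-3.5-sonnet": 8_192,
}

def default_max_tokens(model, override=None):
    """Return the caller's max_tokens if given, else the model default.

    The 8_192 fallback for unrecognized models is an assumption.
    """
    if override is not None:
        return override
    return CLAUDE_MAX_TOKENS_DEFAULTS.get(model, 8_192)
```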
# Override the default
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[...],
    max_tokens=4096  # Custom limit
)

Vision & multimodal
Claude excels at analyzing images. Use OpenAI's standard image format—Zaguán handles the conversion.
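Converting an OpenAI image_url part into Claude's native image block mostly means parsing the data URL. A sketch of that step, with an illustrative helper name (the simple regex here only covers base64 data URLs, not plain HTTP URLs):

```python
import re

def to_claude_image_block(image_url):
    """Convert an OpenAI base64 data-URL into a Claude-style image block.

    Sketch only; the real conversion also handles plain URLs and more
    media types than this regex covers.
    """
    m = re.match(r"data:(image/\w+);base64,(.*)", image_url, re.S)
    if not m:
        raise ValueError("expected a base64 data URL")
    media_type, data = m.groups()
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }
```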
Image analysis example
Send images using base64 encoding or URLs
import base64

# Load and encode your image
with open("diagram.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{image_data}"
                    }
                }
            ]
        }
    ],
    max_tokens=8192
)
print(response.choices[0].message.content)

Zaguán converts OpenAI's image_url format to Claude's native image format, with proper media type detection and base64 data extraction.

Function calling (Tool use)
Claude's tool use is powerful and precise. Zaguán translates OpenAI's function format automatically.
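The core of that translation is renaming OpenAI's function.parameters to Claude's input_schema. A sketch of the mapping, with an illustrative helper name (not Zaguán's API):

```python
def to_claude_tool(openai_tool):
    """Map an OpenAI function-tool definition onto Claude's tool shape.

    Sketch of the translation Zaguán performs automatically: Claude tools
    use a flat {name, description, input_schema} structure.
    """
    fn = openai_tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],  # field renamed from "parameters"
    }
```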
Tool use example
Define tools using OpenAI format—works seamlessly with Claude
import json

response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "Calculate 15% tip on $87.50"}
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "calculate_tip",
                "description": "Calculate tip amount",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "amount": {"type": "number"},
                        "percentage": {"type": "number"}
                    },
                    "required": ["amount", "percentage"]
                }
            }
        }
    ],
    tool_choice="auto"
)

# Handle tool calls
if response.choices[0].message.tool_calls:
    for tool_call in response.choices[0].message.tool_calls:
        # Parse the model-supplied arguments and execute your function
        args = json.loads(tool_call.function.arguments)
        result = calculate_tip(**args)
        # Continue conversation...

Zaguán converts function.parameters to Claude's input_schema and handles tool_choice format differences automatically.

Streaming responses
Stream Claude responses for real-time user feedback. Works exactly like OpenAI streaming:
stream = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "Write a short story"}
    ],
    stream=True,
    max_tokens=4096
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Best practices
- Use system prompts: Claude performs best with clear system instructions. Always include a system message for consistent behavior.
- Set appropriate max_tokens: For long-form content, use higher limits. For quick responses, keep it lower to control costs.
- Leverage vision capabilities: Claude 3.5+ models excel at analyzing charts, diagrams, and screenshots. Use them for document analysis.
- Monitor finish_reason: Check response.choices[0].finish_reason to see if output was truncated ("length") or completed naturally ("stop").
- Use Haiku for speed: Claude Haiku is significantly faster and cheaper for simple tasks. Reserve Sonnet/Opus for complex reasoning.
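The finish_reason check can be wrapped in a small guard. A sketch using a stubbed response object (no API call is made here; a real response comes from client.chat.completions.create):

```python
from types import SimpleNamespace

def was_truncated(response):
    """True if the model stopped because it hit max_tokens ("length")
    rather than finishing naturally ("stop")."""
    return response.choices[0].finish_reason == "length"

# Stubbed response for illustration only.
stub = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
```

If was_truncated returns True, retry with a higher max_tokens or ask the model to continue.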
Model recommendations
- anthropic/claude-sonnet-4: Best for complex analysis, long-form content, and coding tasks. Up to 64K output tokens.
- anthropic/claude-3.5-sonnet: Best for balanced performance and cost. Great for most production use cases.
- anthropic/claude-3.5-haiku: Best for fast responses, simple tasks, and high-volume applications. Most cost-effective.
- anthropic/claude-opus-4: Best for maximum capability, research tasks, and complex multi-step reasoning.