Orchestrator API Reference

Complete API documentation for Zaguán Orchestrator. All endpoints follow OpenAI-compatible conventions, so existing OpenAI client libraries and tools integrate without modification.

Base URLs

Cloudflare-Proxied (Recommended)

https://orchestrator.zaguanai.com/v1

Global CDN with DDoS protection and automatic SSL/TLS encryption.

Direct Connection (EU-Finland)

https://orchestrator-eu-fi-01.zaguanai.com/v1

Direct connection to EU datacenter for lower latency and GDPR compliance.

Authentication

API Key Authentication

All API requests require authentication using your API key in the Authorization header with the Bearer scheme:

Authorization: Bearer YOUR_API_KEY

Security Note: Never expose your API key in client-side code or public repositories. Store it securely as an environment variable.
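As a minimal sketch in Python, the key can be read from an environment variable and attached to every request (the ZAGUAN_API_KEY variable name here is illustrative, not prescribed by the API):

```python
import os

def auth_headers():
    """Build request headers using an API key stored in the environment."""
    # Never hard-code the key; ZAGUAN_API_KEY is an example variable name.
    api_key = os.environ["ZAGUAN_API_KEY"]
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```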

Endpoints

Public Endpoints (No Authentication)

GET /health

Health check endpoint. Returns service status.

GET /healthz

Kubernetes-style health check.

GET /version

Returns API version information.

GET /metrics

Prometheus metrics endpoint.

List Models

GET /v1/models

Returns a list of all available personas (models) with their IDs, tiers, and descriptions.

Request Example

curl https://orchestrator.zaguanai.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

Response Example

{
  "object": "list",
  "data": [
    {
      "id": "promptshield/architect",
      "object": "model",
      "created": 1704067200,
      "owned_by": "promptshield",
      "description": "Software Engineer - Production-ready code with clean architecture"
    },
    {
      "id": "promptshield/wordsmith",
      "object": "model",
      "created": 1704067200,
      "owned_by": "promptshield",
      "description": "Marketing Copywriter - Persuasive, conversion-focused copy"
    }
    // ... more personas
  ]
}
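A small helper like the following (illustrative, not part of any official SDK) can pull the persona IDs out of the response body:

```python
def list_persona_ids(models_response):
    """Extract persona IDs from a /v1/models response body."""
    return [model["id"] for model in models_response["data"]]
```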

Create Chat Completion

POST /v1/chat/completions

Generate a completion using a specified persona. Supports both streaming and non-streaming responses.

Request Parameters

model (required)

The persona ID to use (e.g., promptshield/architect)

messages (required)

Array of message objects, each with a role and content

stream (optional)

Boolean. If true, returns a Server-Sent Events stream. Default: false

temperature (optional)

Number between 0 and 2. Controls randomness. Default: 1.0

max_tokens (optional)

Maximum number of tokens to generate. Default: varies by persona
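The parameters above can be assembled and sanity-checked client-side before sending; this sketch (the build_chat_request name is illustrative) enforces the documented temperature range:

```python
def build_chat_request(model, messages, stream=False, temperature=1.0, max_tokens=None):
    """Assemble a /v1/chat/completions request body, validating temperature."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    payload = {
        "model": model,
        "messages": messages,
        "stream": stream,
        "temperature": temperature,
    }
    if max_tokens is not None:  # omit to use the persona's default
        payload["max_tokens"] = max_tokens
    return payload
```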

Request Example

curl https://orchestrator.zaguanai.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "promptshield/architect",
    "messages": [
      {
        "role": "user",
        "content": "Create a Python function that validates email addresses"
      }
    ],
    "stream": false
  }'

Response Example (Non-Streaming)

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "promptshield/architect",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here's a robust email validation function..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
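Pulling the reply text and token usage out of a non-streaming response is straightforward; the helper below is an illustrative sketch, not an official client:

```python
def extract_reply(completion):
    """Return (assistant text, total tokens) from a non-streaming response body."""
    choice = completion["choices"][0]
    return choice["message"]["content"], completion["usage"]["total_tokens"]
```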

Streaming Response

When stream: true, the API returns Server-Sent Events (SSE) with incremental updates:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1704067200,"model":"promptshield/architect","choices":[{"index":0,"delta":{"role":"assistant","content":"Here"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1704067200,"model":"promptshield/architect","choices":[{"index":0,"delta":{"content":"'s a"},"finish_reason":null}]}

data: [DONE]
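A minimal SSE consumer can reassemble the full reply by concatenating the delta content of each chunk until the [DONE] sentinel; this sketch assumes the event lines have already been read from the response:

```python
import json

def accumulate_stream(sse_lines):
    """Reassemble the assistant reply from 'data:' lines of an SSE stream."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```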

Rate Limits

Rate limits vary by plan tier. All responses include rate limit headers:

Free Tier

10 requests/minute

Pro Tier

100 requests/minute

Enterprise

Custom limits

Response Headers

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1704067200
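These headers can drive client-side throttling; a sketch (header names as documented above, function name illustrative) that computes how long to pause once the remaining quota hits zero:

```python
def seconds_until_reset(headers, now):
    """Seconds to wait before retrying, based on rate limit headers.

    Returns 0 while quota remains; otherwise the time until the
    X-RateLimit-Reset Unix timestamp.
    """
    if int(headers.get("X-RateLimit-Remaining", 1)) > 0:
        return 0
    return max(0, int(headers["X-RateLimit-Reset"]) - now)
```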

Error Handling

Error Response Format

Orchestrator returns errors in OpenAI-compatible format with standard HTTP status codes:

{
  "error": {
    "message": "Invalid model name",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_found"
  }
}
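A helper like this (illustrative) turns an error body into a single readable log line:

```python
def describe_error(response_body):
    """Format an OpenAI-style error body for logging."""
    err = response_body["error"]
    return f'{err["type"]}: {err["message"]} (code={err["code"]})'
```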

Common Error Codes

401 Unauthorized

Invalid or missing API key

400 Bad Request

Invalid request parameters or malformed JSON

429 Too Many Requests

Rate limit exceeded. Implement exponential backoff and retry

500 Internal Server Error

Server error. Retry with exponential backoff

503 Service Unavailable

Temporary service disruption. Retry after a delay

Best Practices

  • Implement retry logic: Use exponential backoff for 429 and 500 errors
  • Monitor rate limits: Check response headers to avoid hitting limits
  • Use streaming for long tasks: Enable streaming to see progress and catch issues early
  • Secure your API key: Never expose keys in client-side code or version control
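The retry guidance above can be sketched as a small wrapper; send here stands in for any function that performs the HTTP call and returns a (status_code, body) pair:

```python
import time

RETRYABLE_STATUSES = {429, 500, 503}

def with_retries(send, max_attempts=5, base_delay=1.0):
    """Call send() until it succeeds or attempts run out.

    Statuses in RETRYABLE_STATUSES are retried with exponential backoff
    (base_delay, 2*base_delay, 4*base_delay, ...); anything else is
    returned immediately.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))
    return status, body  # last retryable result after exhausting attempts
```

In production, the sleep should also honor any Retry-After header and add jitter so that many clients do not retry in lockstep.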