Best Practices for Orchestrator

Learn how to get the most out of Zaguán Orchestrator. Follow these proven strategies to optimize quality, performance, and cost-efficiency.

1. Quality In = Quality Out

The most important rule for exceptional results: your output quality is directly proportional to your input quality. Even the most sophisticated AI personas cannot transform vague, incomplete, or poorly structured requests into exceptional results.

Why This Matters

Orchestrator uses a multi-step workflow (Understanding → Planning → Execution → Review → Refinement). If your initial request lacks clarity, context, or specificity, every subsequent step will be working with a flawed foundation.

Think of it like commissioning work from a world-class expert team. A vague brief like "Make our marketing better" produces generic output. A detailed brief with context, audience, constraints, and success criteria produces targeted, actionable strategy.

Provide Context

  • What problem are you solving?
  • Who is the audience?
  • What constraints exist?
  • What's the desired outcome?

Include Background

  • Key facts, data, or research
  • Existing work or standards
  • Industry context
  • Examples you like/dislike

Set Expectations

  • Format requirements
  • Technical constraints
  • Must-have vs. nice-to-have
  • Level of detail needed

Example Transformations

Poor:
"Write code for user authentication"
Good:
"Create a Python function for user authentication that validates email/password against a PostgreSQL database, returns JWT tokens with 24-hour expiration, includes rate limiting (5 attempts per minute), handles common errors gracefully, and follows OWASP security best practices. Include comprehensive docstrings and usage examples."

2. Choose the Right Persona

Match Task to Expert

The most important decision is selecting the right persona for your task. Each persona is optimized for specific types of work and will deliver significantly better results in their domain.

Good Practice

  • For code generation: promptshield/architect
  • For marketing copy: promptshield/wordsmith
  • For strategic advice: promptshield/sage

Avoid

  • Using a general-purpose model when a specialized persona exists for your task
  • Using wordsmith for technical documentation
  • Using architect for creative writing
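
In practice, the persona is simply the model name in an otherwise standard chat completion request. Below is a minimal sketch using the OpenAI Python SDK against the Orchestrator endpoint shown in the error-handling example later in this guide; the prompt text is illustrative.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://orchestrator.zaguanai.com/v1"
)

# Route a code-generation task to the architect persona
response = client.chat.completions.create(
    model="promptshield/architect",
    messages=[{"role": "user", "content": "Create a Python function for user authentication..."}]
)

print(response.choices[0].message.content)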

3. Be Specific in Your Requests

Provide Clear Context and Requirements

The more specific your request, the better the output. Include relevant context, constraints, and desired outcomes.

Vague Request

"Write code for user authentication"

Specific Request

"Create a Python function that validates email addresses using regex.
The function should:
- Accept an email string as input
- Return a boolean indicating validity
- Check for standard email format (e.g., user@example.com)
- Handle edge cases like multiple @ symbols
- Include detailed error messages for invalid formats
- Add comprehensive docstrings and type hints"
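
A brief at this level of detail is easiest to manage as a multi-line string in code, so nothing gets trimmed along the way. A short sketch, reusing the client configured in the persona example above:

# Keep the full brief in one multi-line prompt
prompt = """Create a Python function that validates email addresses using regex.
The function should:
- Accept an email string as input
- Return a boolean indicating validity
- Include detailed error messages for invalid formats
- Add comprehensive docstrings and type hints"""

response = client.chat.completions.create(
    model="promptshield/architect",
    messages=[{"role": "user", "content": prompt}]
)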

4. Optimize for Speed and Cost

Use the Right Tier for Each Stage

Different tiers are optimized for different use cases. Use this strategy to balance quality and efficiency:

Fast Tier (_fast)

  • Prototyping and rapid iteration
  • Internal drafts and brainstorming
  • High-volume, low-stakes tasks
  • Testing different approaches

Standard Tier (default)

  • Production content and code
  • Customer-facing outputs
  • Professional deliverables
  • Most day-to-day work

Pro Tier (_pro)

  • Institutional-grade outputs
  • High-stakes presentations
  • Complex strategic work
  • Advanced frameworks needed

Ultra Tier (_ultra)

  • Mission-critical work (legal, compliance, finance, strategy)
  • Board presentations and regulatory submissions
  • Work requiring zero tolerance for errors
  • Arbiter review ensures professional standards
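
When calling the API, the tier is selected through the model identifier. The sketch below assumes the tier suffix is appended to the persona name (for example promptshield/architect_fast), as the suffixes above suggest; check the exact model names exposed to your account. The client is configured as in the persona example above.

# Iterate cheaply on the Fast tier, then promote the same prompt for the final run
prompt = "Draft a migration plan for our authentication service."

draft = client.chat.completions.create(
    model="promptshield/architect_fast",  # assumed tier-suffix naming
    messages=[{"role": "user", "content": prompt}]
)

final = client.chat.completions.create(
    model="promptshield/architect_pro",   # assumed tier-suffix naming
    messages=[{"role": "user", "content": prompt}]
)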

5. Leverage Streaming for Long Tasks

Enable Streaming for Better UX

Streaming provides real-time feedback and allows you to see the AI's thinking process. This is especially valuable for:

  • Long-running tasks where users need progress updates
  • Interactive applications where immediate feedback improves UX
  • Debugging and development to see each workflow step
  • Catching issues early before the full response completes

// Enable streaming in your request
{
  "model": "promptshield/architect",
  "messages": [...],
  "stream": true  // Enable streaming
}
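
Because the endpoint follows the OpenAI-compatible format, the same SDK can consume the stream chunk by chunk. A minimal sketch, with the client configured as in the other examples:

# Print tokens as they arrive instead of waiting for the full response
stream = client.chat.completions.create(
    model="promptshield/architect",
    messages=[{"role": "user", "content": "Explain JWT-based session handling."}],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)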

6. Handle Errors Gracefully

Implement Robust Error Handling

Production applications should handle errors gracefully with retry logic and exponential backoff:

Python Example

import time
from openai import OpenAI, APIError, RateLimitError

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://orchestrator.zaguanai.com/v1"
)

def generate_with_retry(model, messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages
            )
            return response.choices[0].message.content
        
        except RateLimitError:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                print(f"Rate limit hit. Waiting {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise
        
        except APIError as e:
            # Retry only on server errors (5xx); connection errors carry no status code
            status = getattr(e, "status_code", 0) or 0
            if attempt < max_retries - 1 and status >= 500:
                wait_time = 2 ** attempt
                print(f"Server error. Retrying in {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise
    
    raise Exception("Max retries exceeded")
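
Once defined, the helper drops in wherever you would otherwise call the API directly:

content = generate_with_retry(
    model="promptshield/architect",
    messages=[{"role": "user", "content": "Write a unit-tested email validator in Python."}]
)
print(content)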

7. Monitor Rate Limits

Track Your Usage

Monitor rate limit headers to avoid hitting limits and optimize your request patterns:

# Check rate limit headers using the OpenAI SDK's raw-response interface
raw = client.chat.completions.with_raw_response.create(...)
response = raw.parse()  # Parsed ChatCompletion object

# Access rate limit headers
remaining = raw.headers.get('X-RateLimit-Remaining')
limit = raw.headers.get('X-RateLimit-Limit')
reset_time = raw.headers.get('X-RateLimit-Reset')

if remaining is not None and int(remaining) < 10:
    print(f"Warning: Only {remaining} requests remaining")
    # Consider implementing request queuing

8. Security Best Practices

Protect Your API Keys

Do

  • Store API keys in environment variables (see the sketch after these lists)
  • Use secret management services (AWS Secrets Manager, HashiCorp Vault)
  • Rotate keys regularly
  • Use different keys for development and production
  • Call the API from your backend, not client-side

Don't

  • Hardcode API keys in your source code
  • Commit API keys to version control
  • Expose API keys in client-side JavaScript
  • Share API keys in public forums or documentation
  • Use production keys for testing
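
In code, the first "Do" item looks something like the sketch below; the environment variable name ZAGUAN_API_KEY is illustrative.

import os
from openai import OpenAI

# Read the key from the environment instead of hardcoding it
client = OpenAI(
    api_key=os.environ["ZAGUAN_API_KEY"],  # illustrative variable name
    base_url="https://orchestrator.zaguanai.com/v1"
)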

9. Test and Iterate

Optimize Through Experimentation

Different personas and prompts can produce significantly different results. Test multiple approaches:

  • Try different personas for the same task to find the best fit (see the sketch after this list)
  • Experiment with prompt phrasing and specificity levels
  • Use Fast tier for rapid iteration, then switch to Standard/Pro
  • Collect feedback on outputs and refine your approach
  • Document what works well for your specific use cases
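
One lightweight way to run these experiments is to send the same prompt to each candidate persona on the Fast tier and compare the drafts side by side. A rough sketch, assuming the tier-suffix naming discussed earlier and a client configured as in the previous examples:

# Compare candidate personas on the same prompt using the Fast tier
candidates = ["promptshield/architect_fast", "promptshield/sage_fast"]
prompt = "Propose a rollout plan for passwordless login."

for model in candidates:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )
    print(f"--- {model} ---")
    print(response.choices[0].message.content)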

Quick Reference

For Best Quality

  • ✓ Use specialized personas
  • ✓ Be specific in requests
  • ✓ Use Standard or Pro tier
  • ✓ Provide clear context

For Best Performance

  • ✓ Use Fast tier for iteration
  • ✓ Enable streaming for long tasks
  • ✓ Implement retry logic
  • ✓ Monitor rate limits

For Best Security

  • ✓ Use environment variables
  • ✓ Call from backend only
  • ✓ Rotate keys regularly
  • ✓ Never commit keys to git

For Best Cost

  • ✓ Use Fast tier when possible
  • ✓ Be concise in prompts
  • ✓ Cache common responses
  • ✓ Monitor usage patterns