xAI Responses API - Stateful Conversations

Build stateful conversations with server-side storage and encrypted thinking content. The xAI Responses API maintains conversation context automatically, eliminating the need to send full conversation history with each request.

Server-Side Storage

30-day conversation storage with automatic context management

Encrypted Thinking

Access to model's reasoning process with secure encryption

Automatic Context

No need to send full conversation history with each request

Response IDs

Reference previous responses easily for conversation continuity

Quick Start

Enable the xAI Responses API by passing options through the OpenAI SDK's extra_body parameter.

Python Example - Stateful Conversation

import openai

client = openai.OpenAI(
    base_url="https://api.zaguanai.com/v1",
    api_key="your-zaguan-api-key"
)

# First message - creates a new conversation
response1 = client.chat.completions.create(
    model="xai/grok-beta",
    messages=[
        {"role": "user", "content": "What's the capital of France?"}
    ],
    extra_body={
        "use_responses_api": True,
        "store": True
    }
)

print(response1.choices[0].message.content)
# "The capital of France is Paris."

# Get the response ID for continuation
response_id = response1.id

# Continue the conversation - context is maintained server-side
response2 = client.chat.completions.create(
    model="xai/grok-beta",
    messages=[
        {"role": "user", "content": "What's its population?"}
    ],
    extra_body={
        "use_responses_api": True,
        "previous_response_id": response_id
    }
)

print(response2.choices[0].message.content)
# "Paris has approximately 2.2 million people in the city proper..."
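The response-ID chaining above can be wrapped in a small helper so callers never track IDs by hand. A minimal sketch (the ConversationSession class is illustrative, not part of any SDK):

```python
class ConversationSession:
    """Tracks the previous response ID so each call continues the same thread."""

    def __init__(self, client, model="xai/grok-beta"):
        self.client = client
        self.model = model
        self.previous_response_id = None  # None until the first response arrives

    def send(self, text):
        extra = {"use_responses_api": True}
        if self.previous_response_id is None:
            extra["store"] = True  # first turn: ask the server to store the conversation
        else:
            extra["previous_response_id"] = self.previous_response_id
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": text}],
            extra_body=extra,
        )
        self.previous_response_id = response.id  # chain the next turn off this one
        return response.choices[0].message.content
```

Each send() call carries only the new user turn; the server-side conversation supplies the rest of the context.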

TypeScript Example

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ZAGUAN_API_KEY,
  baseURL: "https://api.zaguanai.com/v1",
});

// Start a new conversation
// openai-node's request types don't declare extra_body, so cast the params object
const response1 = await client.chat.completions.create({
  model: "xai/grok-beta",
  messages: [
    { role: "user", content: "Tell me about quantum computing" }
  ],
  extra_body: {
    use_responses_api: true,
    store: true
  }
} as OpenAI.Chat.ChatCompletionCreateParamsNonStreaming);

const responseId = response1.id;

// Continue the conversation
const response2 = await client.chat.completions.create({
  model: "xai/grok-beta",
  messages: [
    { role: "user", content: "What are its main applications?" }
  ],
  extra_body: {
    use_responses_api: true,
    previous_response_id: responseId
  }
} as OpenAI.Chat.ChatCompletionCreateParamsNonStreaming);

Encrypted Thinking Access

Access the model's reasoning process to understand how it arrived at its answer.

# Access the model's reasoning process
response = client.chat.completions.create(
    model="xai/grok-beta",
    messages=[
        {"role": "user", "content": "Solve: 2x + 5 = 15"}
    ],
    extra_body={
        "use_responses_api": True,
        "include_thinking": True
    }
)

# Access encrypted thinking
if hasattr(response, 'metadata') and response.metadata:
    thinking = response.metadata.get('encrypted_thinking')
    print("Model's reasoning:", thinking)
    # Shows step-by-step problem solving process
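Since metadata is an extension field that may be absent from a response, a small accessor keeps the defensive checks in one place. A sketch (the function name is illustrative):

```python
def get_encrypted_thinking(response):
    """Return the encrypted thinking payload, or None if the response carries none."""
    metadata = getattr(response, "metadata", None)  # extension field; may not exist
    if not metadata:
        return None
    return metadata.get("encrypted_thinking")
```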

Configuration Options

Extra Body Parameters

use_responses_api (boolean)

Enable the xAI Responses API for stateful conversations.

store (boolean)

Store the conversation on the server for 30 days (default: true).

previous_response_id (string)

ID of the previous response to continue the conversation.

include_thinking (boolean)

Include the model's encrypted thinking in the response.
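The OpenAI Python SDK merges extra_body keys into the top level of the JSON request body, so the server receives these parameters as ordinary fields. A hand-built sketch of the resulting payload (build_payload is purely illustrative, not part of any SDK):

```python
def build_payload(model, messages, **extra):
    """Mimic how the SDK assembles the body: extra keys merge in at the top level."""
    payload = {"model": model, "messages": messages}
    payload.update(extra)  # no nested "extra_body" wrapper appears on the wire
    return payload

payload = build_payload(
    "xai/grok-beta",
    [{"role": "user", "content": "What's its population?"}],
    use_responses_api=True,
    previous_response_id="resp_abc123",
)
```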

Use Cases

Customer Support

Multi-turn conversations with automatic context preservation, eliminating the need to repeat information.

Educational Tutoring

Long-form educational conversations where the AI remembers previous questions and builds on them.

Research Assistance

Extended Q&A sessions for research projects with maintained context across multiple queries.

Code Debugging

Code review sessions with context preservation, allowing iterative debugging and improvements.

Interview Assessments

Multi-question assessments where follow-up questions build on previous answers.

Personal Assistants

AI assistants that remember user preferences and conversation history across sessions.

Benefits Over Traditional Chat

Traditional Approach

  • Send full conversation history with each request
  • Manage conversation state in your application
  • Higher token costs for repeated context
  • Complex state management code

xAI Responses API

  • Send only new messages, context maintained server-side
  • Automatic conversation state management
  • Lower token costs, pay only for new content
  • Simple API, just reference previous response ID
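For contrast, the traditional approach means the client owns the transcript and re-sends it on every turn. A minimal sketch (send_with_history is illustrative):

```python
def send_with_history(client, history, text, model="xai/grok-beta"):
    """Traditional flow: append the user turn, send the whole transcript, append the reply."""
    history.append({"role": "user", "content": text})
    response = client.chat.completions.create(model=model, messages=list(history))
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```

Every call re-transmits the full history, so token cost grows with conversation length; with the Responses API each request carries only the new message plus a response ID.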

Technical Details

  • Storage Duration: 30 days (configurable)
  • Endpoint: /v1/chat/completions (via extra_body)
  • Model: xai/grok-beta
  • Streaming: Fully supported with stateful conversations
  • Thinking: Encrypted reasoning content available via metadata
  • OpenAI Compatible: Works with standard OpenAI SDK
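Streaming combines with stateful continuation in the usual way; a sketch assuming the gateway accepts stream=True alongside the same extra_body parameters (stream_reply is an illustrative helper):

```python
def stream_reply(client, text, previous_response_id):
    """Stream a continuation turn, printing tokens as they arrive."""
    stream = client.chat.completions.create(
        model="xai/grok-beta",
        messages=[{"role": "user", "content": text}],
        stream=True,
        extra_body={
            "use_responses_api": True,
            "previous_response_id": previous_response_id,
        },
    )
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. the final one) carry no content
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)
```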