
Anthropic Messages API

LLM Resayil now supports the Anthropic Messages API format alongside the existing OpenAI-compatible endpoint. Use POST /v1/messages with the same API keys, same credits, and same backend models. If you already use the Anthropic SDK, point it at our base URL and start making requests immediately.

Overview

The Anthropic Messages API is an alternative request/response format for interacting with LLM Resayil models. While our OpenAI-compatible /v1/chat/completions endpoint remains fully supported, the Messages API provides a familiar interface for developers already using the Anthropic ecosystem.

Why Use the Messages API?

  • Drop-in compatibility: Use the official Anthropic Python or TypeScript SDK by changing only the base URL.
  • Same backend: Both endpoints route to the same models, use the same credit system, and share the same authentication.
  • Structured tool calling: The Messages API supports the Anthropic tool-use format for function calling.
  • Streaming with events: Server-sent events with typed event blocks (message_start, content_block_delta, etc.).

Note: You do not need separate API keys for the Messages API. Your existing LLM Resayil API keys work with both /v1/chat/completions and /v1/messages.

Authentication

The Messages API accepts the same authentication methods as all other LLM Resayil endpoints. You can use either the standard Authorization: Bearer header or the Anthropic-style x-api-key header.

Option 1: Bearer Token (Standard)

Header
Authorization: Bearer YOUR_API_KEY

Option 2: x-api-key Header (Anthropic-style)

Header
x-api-key: YOUR_API_KEY

Both methods are equivalent. The x-api-key header is supported for compatibility with the official Anthropic SDK, which sends credentials this way by default.

Anthropic SDK users: The SDK sends x-api-key automatically. You only need to configure the base_url and provide your LLM Resayil API key as the api_key parameter.

Basic Usage

Send a POST request to /v1/messages with a JSON body containing your model selection and messages.

Endpoint

Endpoint
POST https://llmapi.resayil.io/v1/messages

Simple Text Request

bash
curl -X POST https://llmapi.resayil.io/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "mistral",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Response

json
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "The capital of France is Paris."
    }
  ],
  "model": "mistral",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 14,
    "output_tokens": 9
  }
}

Request Format

The full request body specification for POST /v1/messages.

Parameter Type Required Description
model string Yes The model to use (e.g., "mistral", "llama3.1"). See Available Models.
messages array Yes Array of message objects with role and content fields.
max_tokens integer Yes Maximum number of tokens to generate. Must be greater than 0.
system string No System prompt. In the Anthropic format, the system message is a top-level field, not part of the messages array.
temperature float No Sampling temperature between 0.0 and 1.0. Default: model-dependent.
top_p float No Nucleus sampling parameter. Default: model-dependent.
top_k integer No Top-K sampling parameter. Only sample from the top K most likely tokens.
stop_sequences array No Array of strings that will cause the model to stop generating when encountered.
stream boolean No Set to true for streaming responses via SSE. Default: false.
tools array No Array of tool definitions for function calling. See Tool Calling.
tool_choice object No Controls tool selection: {"type": "auto"}, {"type": "any"}, or {"type": "tool", "name": "..."}.

System Message

Unlike the OpenAI format where the system prompt is a message with "role": "system", the Anthropic format uses a top-level system field:

json
{
  "model": "mistral",
  "max_tokens": 1024,
  "system": "You are a helpful assistant who speaks formally.",
  "messages": [
    {"role": "user", "content": "Greet me."}
  ]
}

Multi-turn Conversation

json
{
  "model": "llama3.1",
  "max_tokens": 2048,
  "messages": [
    {"role": "user", "content": "What is machine learning?"},
    {"role": "assistant", "content": "Machine learning is a subset of AI..."},
    {"role": "user", "content": "Can you give me a practical example?"}
  ]
}

Response Format

Non-streaming responses return a complete message object.

Field Type Description
id string Unique message identifier (prefixed with msg_).
type string Always "message".
role string Always "assistant".
content array Array of content blocks. Each block has a type field ("text" or "tool_use").
model string The model that generated the response.
stop_reason string Why generation stopped: "end_turn", "max_tokens", "stop_sequence", or "tool_use".
stop_sequence string|null The stop sequence that triggered stopping, if applicable.
usage object Token usage: input_tokens and output_tokens.

Full Response Example

json
{
  "id": "msg_01A2B3C4D5E6F7G8H9I0J1K2",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here is a Python function that calculates factorial:\n\n```python\ndef factorial(n):\n    if n <= 1:\n        return 1\n    return n * factorial(n - 1)\n```"
    }
  ],
  "model": "mistral",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 22,
    "output_tokens": 48
  }
}
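Because content is an array of blocks, a reply can contain more than one block; joining the text blocks is the robust way to read it. A minimal sketch (the helper name is ours, not part of the API):

```python
def extract_text(message: dict) -> str:
    """Concatenate all text blocks from a Messages API response body."""
    return "".join(
        block["text"]
        for block in message.get("content", [])
        if block.get("type") == "text"
    )

response = {
    "content": [
        {"type": "text", "text": "The capital of France "},
        {"type": "text", "text": "is Paris."},
    ]
}
print(extract_text(response))  # The capital of France is Paris.
```

Blocks of other types (such as tool_use) are skipped, so the same helper works on tool-calling responses.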

Streaming

Set "stream": true to receive the response as a stream of server-sent events (SSE). Each event has an event type and a data payload.

Streaming Request

bash
curl -X POST https://llmapi.resayil.io/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "mistral",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku about coding."}
    ]
  }'

Event Types

Event Description
message_start Initial event containing the message object with metadata (id, model, role, usage).
content_block_start Indicates a new content block is starting. Contains index and content_block with its type.
content_block_delta Incremental text content. Contains delta.text with the new text fragment.
content_block_stop Indicates the current content block has finished.
message_delta Final message metadata update. Contains stop_reason and updated usage.
message_stop Signals the end of the streamed message.
ping Keep-alive event. Can be safely ignored.

Streaming Response Example

SSE Stream
event: message_start
data: {"type":"message_start","message":{"id":"msg_01ABC","type":"message","role":"assistant","content":[],"model":"mistral","stop_reason":null,"usage":{"input_tokens":15,"output_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: ping
data: {"type":"ping"}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Lines"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" of"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" code"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":12}}

event: message_stop
data: {"type":"message_stop"}
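If you are not using an SDK, a client only needs to split the stream into event: / data: pairs and act on the delta events. A minimal parser sketch, shown against a fragment of the example stream above (function names are ours, not SDK API):

```python
import json

def iter_sse(lines):
    """Yield (event_type, payload_dict) pairs from an SSE line stream."""
    event = None
    for line in lines:
        if line.startswith("event: "):
            event = line[len("event: "):]
        elif line.startswith("data: "):
            yield event, json.loads(line[len("data: "):])

def collect_text(lines):
    """Accumulate the full reply text from content_block_delta events."""
    parts = []
    for event, data in iter_sse(lines):
        if event == "content_block_delta":
            parts.append(data["delta"]["text"])
    return "".join(parts)

sample = [
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Lines"}}',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" of code"}}',
    'event: message_stop',
    'data: {"type":"message_stop"}',
]
print(collect_text(sample))  # Lines of code
```

In a real client you would feed this the response's line iterator (for example, iter_lines() on a streaming HTTP response) and also watch message_delta for the final stop_reason and usage.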

Tool Calling

The Messages API supports tool (function) calling using the Anthropic tool-use format. Define tools in your request, and the model may respond with a tool_use content block.

Step 1: Define Tools

json
{
  "model": "mistral",
  "max_tokens": 1024,
  "tools": [
    {
      "name": "get_weather",
      "description": "Get the current weather for a given location.",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "City and country, e.g. London, UK"
          },
          "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "Temperature unit"
          }
        },
        "required": ["location"]
      }
    }
  ],
  "messages": [
    {"role": "user", "content": "What is the weather in Kuwait City?"}
  ]
}

Step 2: Handle tool_use Response

When the model decides to use a tool, the response includes a tool_use content block:

json response
{
  "id": "msg_01XYZ",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_01ABC",
      "name": "get_weather",
      "input": {
        "location": "Kuwait City, Kuwait",
        "unit": "celsius"
      }
    }
  ],
  "stop_reason": "tool_use",
  "usage": {"input_tokens": 85, "output_tokens": 42}
}

Step 3: Send Tool Result

Execute the tool on your end and send the result back:

json
{
  "model": "mistral",
  "max_tokens": 1024,
  "messages": [
    {"role": "user", "content": "What is the weather in Kuwait City?"},
    {
      "role": "assistant",
      "content": [
        {
          "type": "tool_use",
          "id": "toolu_01ABC",
          "name": "get_weather",
          "input": {"location": "Kuwait City, Kuwait", "unit": "celsius"}
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "tool_result",
          "tool_use_id": "toolu_01ABC",
          "content": "Currently 38C and sunny in Kuwait City."
        }
      ]
    }
  ]
}
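The three steps above can be wired together as a small loop: detect stop_reason == "tool_use", run the matching local function, and append both the assistant turn and a tool_result user turn before calling the API again. A sketch under those assumptions (the dispatch table and helper name are ours):

```python
def build_tool_result_turns(assistant_message: dict, tool_functions: dict) -> list:
    """Given a tool_use response, return the two turns to append to `messages`."""
    results = []
    for block in assistant_message["content"]:
        if block["type"] != "tool_use":
            continue
        output = tool_functions[block["name"]](**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the tool_use block's id
            "content": str(output),
        })
    return [
        {"role": "assistant", "content": assistant_message["content"]},
        {"role": "user", "content": results},
    ]

def get_weather(location, unit="celsius"):
    """Stand-in for a real weather lookup."""
    return f"Currently 38 {unit} and sunny in {location}."

assistant_message = {
    "content": [{
        "type": "tool_use",
        "id": "toolu_01ABC",
        "name": "get_weather",
        "input": {"location": "Kuwait City, Kuwait", "unit": "celsius"},
    }],
    "stop_reason": "tool_use",
}
turns = build_tool_result_turns(assistant_message, {"get_weather": get_weather})
```

Appending turns to the original messages list and re-sending the request completes the round trip; the model then answers using the tool output.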

Error Handling

Errors from the Messages API follow the Anthropic error format:

json error response
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens: field required"
  }
}

HTTP Status Error Type Description
400 invalid_request_error Malformed request body, missing required fields, or invalid parameter values.
401 authentication_error Missing or invalid API key.
403 permission_error Your API key does not have permission for the requested action.
404 not_found_error The requested model does not exist or is not available.
429 rate_limit_error Too many requests. Wait and retry with exponential backoff.
500 api_error Internal server error. Retry the request or contact support.
529 overloaded_error The API is temporarily overloaded. Retry after a brief wait.

Retry strategy: For 429 and 529 errors, implement exponential backoff starting at 1 second. For 500 errors, retry up to 3 times before failing.
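The retry policy above can be sketched as a small wrapper. The request callable and its (status, body) return shape are placeholders, since the real ones depend on your HTTP client:

```python
import random
import time

RETRYABLE = {429, 500, 529}

def request_with_backoff(send, max_retries=3, base_delay=1.0):
    """Call send() -> (status, body), retrying retryable statuses with
    exponential backoff plus jitter, starting at base_delay seconds."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_retries:
            return status, body
        time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

calls = {"n": 0}
def flaky():
    """Simulated endpoint: overloaded twice, then succeeds."""
    calls["n"] += 1
    return (529, None) if calls["n"] < 3 else (200, {"ok": True})

status, body = request_with_backoff(flaky, base_delay=0.01)
print(status)  # 200
```

The jitter term spreads out retries from concurrent clients so they do not hammer the API in lockstep.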

OpenAI vs Anthropic: Endpoint Comparison

Both endpoints coexist on the same backend. Here is a side-by-side comparison:

Feature OpenAI Format Anthropic Format
Endpoint /v1/chat/completions /v1/messages
HTTP Method POST POST
Auth Header Authorization: Bearer x-api-key (or Authorization: Bearer)
System Message In messages array: {"role": "system", ...} Top-level "system" field
max_tokens Optional Required
Response Content choices[0].message.content (string) content[0].text (array of blocks)
Stop Reason finish_reason: "stop", "length" stop_reason: "end_turn", "max_tokens"
Usage Tokens prompt_tokens / completion_tokens input_tokens / output_tokens
Streaming Format SSE with data: {} lines SSE with typed event: + data:
Tool Calling functions / tools parameter tools with input_schema
Credit System Same credits Same credits
API Keys Same keys Same keys
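The table maps one-to-one onto a mechanical request translation. A sketch of converting an OpenAI-format body into a Messages API body (the function is ours and covers only the fields in the table above):

```python
def openai_to_messages(body: dict, default_max_tokens: int = 1024) -> dict:
    """Translate an OpenAI chat.completions request body into an
    Anthropic-style /v1/messages body."""
    system_parts = [m["content"] for m in body["messages"] if m["role"] == "system"]
    out = {
        "model": body["model"],
        # max_tokens is optional in the OpenAI format but required here
        "max_tokens": body.get("max_tokens", default_max_tokens),
        "messages": [m for m in body["messages"] if m["role"] != "system"],
    }
    if system_parts:
        # the system prompt moves from the messages array to a top-level field
        out["system"] = "\n".join(system_parts)
    return out

openai_body = {
    "model": "mistral",
    "messages": [
        {"role": "system", "content": "Answer briefly."},
        {"role": "user", "content": "Hi"},
    ],
}
print(openai_to_messages(openai_body))
```

Translating responses back is similar: read content[0].text instead of choices[0].message.content, and map stop_reason values accordingly.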

Code Examples

Python (Anthropic SDK)

Install the official Anthropic SDK and point it at LLM Resayil:

bash
pip install anthropic

python
import anthropic
import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://llmapi.resayil.io/v1"
)

message = client.messages.create(
    model="mistral",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
)

print(message.content[0].text)

Python (Anthropic SDK with Streaming)

python
import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://llmapi.resayil.io/v1"
)

with client.messages.stream(
    model="mistral",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a short story about AI."}
    ]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Node.js (Anthropic SDK)

javascript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://llmapi.resayil.io/v1'
});

const message = await client.messages.create({
  model: 'mistral',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'What are the benefits of TypeScript?' }
  ]
});

console.log(message.content[0].text);

cURL with System Prompt

bash
curl -X POST https://llmapi.resayil.io/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "llama3.1",
    "max_tokens": 2048,
    "system": "You are a senior software architect. Answer concisely.",
    "messages": [
      {"role": "user", "content": "When should I use microservices vs monolith?"}
    ]
  }'

Feature Support Matrix

What our Messages API implementation currently supports, and what it does not:

Feature Status Notes
Text messages Supported Full support for text input/output.
System prompt Supported Top-level system field.
Multi-turn conversations Supported Alternating user/assistant messages.
Streaming (SSE) Supported All standard event types.
Tool calling Supported Tool definitions, tool_use responses, tool_result messages.
Temperature / top_p / top_k Supported Standard sampling parameters.
Stop sequences Supported Custom stop strings.
Image input (vision) Not Supported Image content blocks are not accepted. Use text-only messages.
PDF input Not Supported Document content blocks are not accepted.
Prompt caching Not Supported The cache_control parameter is ignored.
Batch API Not Supported Use individual requests.
Extended thinking Not Supported The thinking parameter is not available.

Integration Guide

If you already use the Anthropic SDK, switching to LLM Resayil requires minimal changes.

Environment Variable Setup

Set the following environment variables for your application:

env
ANTHROPIC_BASE_URL=https://llmapi.resayil.io/v1
ANTHROPIC_API_KEY=sk-your-llm-resayil-api-key

Credit System for Anthropic Users

Credits are deducted identically whether you use the OpenAI or Anthropic endpoint. Token usage reported in the response (input_tokens and output_tokens) maps directly to credit consumption based on the model tier:

  • Small models (e.g., phi3): 0.5 credits per 1K tokens
  • Medium models (e.g., mistral): 1.5 credits per 1K tokens
  • Large models (e.g., llama3.1): 3.0 credits per 1K tokens
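As a worked example, the simple request earlier used 14 input and 9 output tokens on mistral, which at 1.5 credits per 1K tokens comes to (14 + 9) / 1000 × 1.5 ≈ 0.0345 credits. A sketch of the estimate, assuming input and output tokens are priced at the same per-1K rate (the tiers above quote a single rate):

```python
CREDITS_PER_1K = {  # per-tier rates from the list above
    "phi3": 0.5,
    "mistral": 1.5,
    "llama3.1": 3.0,
}

def estimate_credits(model: str, usage: dict) -> float:
    """Estimate credit cost from a response's usage block, assuming the
    per-1K rate applies uniformly to input and output tokens."""
    total = usage["input_tokens"] + usage["output_tokens"]
    return total / 1000 * CREDITS_PER_1K[model]

print(estimate_credits("mistral", {"input_tokens": 14, "output_tokens": 9}))  # ≈ 0.0345
```

Treat this as an estimate only; your dashboard balance is authoritative.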

Check your credit balance on the dashboard or via the Credits documentation.

Switching from Anthropic API

If you are migrating from the official Anthropic API:

  1. Create an account at LLM Resayil and generate an API key.
  2. Change your base_url to https://llmapi.resayil.io/v1.
  3. Replace your Anthropic API key with your LLM Resayil API key.
  4. Update model to one of our available models (e.g., "mistral", "llama3.1").
  5. Remove any unsupported parameters (cache_control, image blocks).
  6. Test your integration on the dev environment at https://llmdev.resayil.io/v1.

That's it. Your existing code, error handling, and streaming logic all work unchanged. The response format is identical to what the official Anthropic API returns.

Troubleshooting

Common Issues

Problem Cause Solution
"max_tokens: field required" Missing max_tokens in request body Add "max_tokens": 1024 (required for Messages API, unlike OpenAI).
401 authentication_error Invalid or missing API key Check that x-api-key or Authorization: Bearer is set correctly.
"model not found" Using an Anthropic model name like "claude-3-sonnet" Use LLM Resayil model names. See Available Models.
Empty response content max_tokens set too low Increase max_tokens to give the model room to respond.
Streaming not working Missing "stream": true or incorrect SSE parsing Ensure stream is a boolean (not string). Parse event: and data: lines.
Tool result not accepted Mismatched tool_use_id The tool_use_id in your tool_result must match the id from the tool_use response.

Frequently Asked Questions

Can I use both endpoints in the same application?

Yes. The OpenAI /v1/chat/completions and Anthropic /v1/messages endpoints coexist independently. You can call either one using the same API key. Credits are shared across both.

Do I need a separate API key for the Messages API?

No. Your existing LLM Resayil API key works for both endpoints. No additional configuration is needed.

Which models can I use with the Messages API?

All models available on LLM Resayil work with both endpoints. The model names are the same (e.g., "mistral", "llama3.1"). Note that Anthropic-specific model names like "claude-3-sonnet" are not available.

Is the credit cost different?

No. Credits are calculated identically regardless of which endpoint you use. The same model with the same token count costs the same credits.

Can I send images or PDFs?

Not currently. Only text content blocks are supported. Image and PDF input will be rejected with an error.

Is the anthropic-version header required?

It is recommended but not strictly required. If omitted, the API defaults to the latest supported version. For production use, pin to 2023-06-01 for consistent behavior.