LLM Resayil now supports the Anthropic Messages API format alongside the existing OpenAI-compatible endpoint.
Use POST /v1/messages with the same API keys, same credits, and same backend models.
If you already use the Anthropic SDK, point it at our base URL and start making requests immediately.
The Anthropic Messages API is an alternative request/response format for interacting with LLM Resayil models.
While our OpenAI-compatible /v1/chat/completions endpoint remains fully supported, the Messages API
provides a familiar interface for developers already using the Anthropic ecosystem.
Streaming is supported via typed server-sent events (message_start, content_block_delta, etc.).
Note: You do not need separate API keys for the Messages API. Your existing LLM Resayil API keys
work with both /v1/chat/completions and /v1/messages.
The Messages API accepts the same authentication methods as all other LLM Resayil endpoints.
You can use either the standard Authorization: Bearer header or the Anthropic-style x-api-key header.
Authorization: Bearer YOUR_API_KEY
x-api-key: YOUR_API_KEY
Both methods are equivalent. The x-api-key header is supported for compatibility with the official
Anthropic SDK, which sends credentials this way by default.
Anthropic SDK users: The SDK sends x-api-key automatically. You only need to
configure the base_url and provide your LLM Resayil API key as the api_key parameter.
Send a POST request to /v1/messages with a JSON body
containing your model selection and messages.
POST https://llmapi.resayil.io/v1/messages
curl -X POST https://llmapi.resayil.io/v1/messages \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "mistral",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "What is the capital of France?"}
]
}'
{
"id": "msg_01XFDUDYJgAACzvnptvVoYEL",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "The capital of France is Paris."
}
],
"model": "mistral",
"stop_reason": "end_turn",
"stop_sequence": null,
"usage": {
"input_tokens": 14,
"output_tokens": 9
}
}
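Note that content is an array of blocks rather than a single string, so extract text by filtering for text blocks. A minimal sketch in Python, where resp stands in for the parsed JSON response above:

```python
# `resp` stands in for the parsed JSON response shown above.
resp = {
    "content": [
        {"type": "text", "text": "The capital of France is Paris."}
    ],
    "stop_reason": "end_turn",
}

# Concatenate all text blocks; tool_use blocks carry no text.
text = "".join(
    block["text"] for block in resp["content"] if block["type"] == "text"
)
print(text)
```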
The full request body specification for POST /v1/messages.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The model to use (e.g., "mistral", "llama3.1"). See Available Models. |
| messages | array | Yes | Array of message objects with role and content fields. |
| max_tokens | integer | Yes | Maximum number of tokens to generate. Must be greater than 0. |
| system | string | No | System prompt. In the Anthropic format, the system message is a top-level field, not part of the messages array. |
| temperature | float | No | Sampling temperature between 0.0 and 1.0. Default: model-dependent. |
| top_p | float | No | Nucleus sampling parameter. Default: model-dependent. |
| top_k | integer | No | Top-K sampling parameter. Only sample from the top K most likely tokens. |
| stop_sequences | array | No | Array of strings that will cause the model to stop generating when encountered. |
| stream | boolean | No | Set to true for streaming responses via SSE. Default: false. |
| tools | array | No | Array of tool definitions for function calling. See Tool Calling. |
| tool_choice | object | No | Controls tool selection: {"type": "auto"}, {"type": "any"}, or {"type": "tool", "name": "..."}. |
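Because max_tokens is required here (unlike the OpenAI format), a small preflight check can catch the most common 400 error before sending. A hypothetical helper, not part of the API:

```python
def validate_messages_request(body: dict) -> list[str]:
    """Return a list of problems with a /v1/messages request body."""
    problems = []
    # model, messages, and max_tokens are the required fields.
    for field in ("model", "messages", "max_tokens"):
        if field not in body:
            problems.append(f"{field}: field required")
    if isinstance(body.get("max_tokens"), int) and body["max_tokens"] <= 0:
        problems.append("max_tokens: must be greater than 0")
    return problems

# An OpenAI-style request that forgot max_tokens:
errors = validate_messages_request({
    "model": "mistral",
    "messages": [{"role": "user", "content": "Hi"}],
})
print(errors)
```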
Unlike the OpenAI format where the system prompt is a message with "role": "system",
the Anthropic format uses a top-level system field:
{
"model": "mistral",
"max_tokens": 1024,
"system": "You are a helpful assistant who speaks formally.",
"messages": [
{"role": "user", "content": "Greet me."}
]
}
{
"model": "llama3.1",
"max_tokens": 2048,
"messages": [
{"role": "user", "content": "What is machine learning?"},
{"role": "assistant", "content": "Machine learning is a subset of AI..."},
{"role": "user", "content": "Can you give me a practical example?"}
]
}
Non-streaming responses return a complete message object.
| Field | Type | Description |
|---|---|---|
| id | string | Unique message identifier (prefixed with msg_). |
| type | string | Always "message". |
| role | string | Always "assistant". |
| content | array | Array of content blocks. Each block has a type field ("text" or "tool_use"). |
| model | string | The model that generated the response. |
| stop_reason | string | Why generation stopped: "end_turn", "max_tokens", "stop_sequence", or "tool_use". |
| stop_sequence | string or null | The stop sequence that triggered stopping, if applicable. |
| usage | object | Token usage: input_tokens and output_tokens. |
{
"id": "msg_01A2B3C4D5E6F7G8H9I0J1K2",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "Here is a Python function that calculates factorial:\n\n```python\ndef factorial(n):\n if n <= 1:\n return 1\n return n * factorial(n - 1)\n```"
}
],
"model": "mistral",
"stop_reason": "end_turn",
"stop_sequence": null,
"usage": {
"input_tokens": 22,
"output_tokens": 48
}
}
Set "stream": true to receive the response as a stream of server-sent events (SSE).
Each event has an event type and a data payload.
curl -X POST https://llmapi.resayil.io/v1/messages \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "mistral",
"max_tokens": 1024,
"stream": true,
"messages": [
{"role": "user", "content": "Write a haiku about coding."}
]
}'
| Event | Description |
|---|---|
| message_start | Initial event containing the message object with metadata (id, model, role, usage). |
| content_block_start | Indicates a new content block is starting. Contains index and content_block with its type. |
| content_block_delta | Incremental text content. Contains delta.text with the new text fragment. |
| content_block_stop | Indicates the current content block has finished. |
| message_delta | Final message metadata update. Contains stop_reason and updated usage. |
| message_stop | Signals the end of the streamed message. |
| ping | Keep-alive event. Can be safely ignored. |
event: message_start
data: {"type":"message_start","message":{"id":"msg_01ABC","type":"message","role":"assistant","content":[],"model":"mistral","stop_reason":null,"usage":{"input_tokens":15,"output_tokens":0}}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: ping
data: {"type":"ping"}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Lines"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" of"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" code"}}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":12}}
event: message_stop
data: {"type":"message_stop"}
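The deltas above can be reassembled client-side by accumulating delta.text from each content_block_delta event. A minimal parser sketch, operating on a captured stream for illustration (real code would read lines from the HTTP response):

```python
import json

def assemble_text(sse_body: str) -> str:
    """Accumulate text deltas from a raw Messages API SSE stream."""
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip event: lines and blank separators
        payload = json.loads(line[len("data: "):])
        if payload.get("type") == "content_block_delta":
            parts.append(payload["delta"]["text"])
    return "".join(parts)

stream = """\
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Lines"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" of"}}

event: message_stop
data: {"type":"message_stop"}
"""
print(assemble_text(stream))
```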
The Messages API supports tool (function) calling using the Anthropic tool-use format.
Define tools in your request, and the model may respond with a tool_use content block.
{
"model": "mistral",
"max_tokens": 1024,
"tools": [
{
"name": "get_weather",
"description": "Get the current weather for a given location.",
"input_schema": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City and country, e.g. London, UK"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature unit"
}
},
"required": ["location"]
}
}
],
"messages": [
{"role": "user", "content": "What is the weather in Kuwait City?"}
]
}
When the model decides to use a tool, the response includes a tool_use content block:
{
"id": "msg_01XYZ",
"type": "message",
"role": "assistant",
"content": [
{
"type": "tool_use",
"id": "toolu_01ABC",
"name": "get_weather",
"input": {
"location": "Kuwait City, Kuwait",
"unit": "celsius"
}
}
],
"stop_reason": "tool_use",
"usage": {"input_tokens": 85, "output_tokens": 42}
}
Execute the tool on your end and send the result back:
{
"model": "mistral",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "What is the weather in Kuwait City?"},
{
"role": "assistant",
"content": [
{
"type": "tool_use",
"id": "toolu_01ABC",
"name": "get_weather",
"input": {"location": "Kuwait City, Kuwait", "unit": "celsius"}
}
]
},
{
"role": "user",
"content": [
{
"type": "tool_result",
"tool_use_id": "toolu_01ABC",
"content": "Currently 38C and sunny in Kuwait City."
}
]
}
]
}
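The round trip above can be automated: scan the assistant response for tool_use blocks, run each tool, and append a user message of tool_result blocks whose tool_use_id matches. A hypothetical sketch (run_tool stands in for your own dispatcher):

```python
def build_tool_result_message(response: dict, run_tool) -> dict:
    """Turn tool_use blocks from a /v1/messages response into the
    follow-up user message expected by the API."""
    results = []
    for block in response["content"]:
        if block["type"] != "tool_use":
            continue
        output = run_tool(block["name"], block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the tool_use id
            "content": str(output),
        })
    return {"role": "user", "content": results}

# Example with a stub weather tool:
response = {
    "content": [{
        "type": "tool_use",
        "id": "toolu_01ABC",
        "name": "get_weather",
        "input": {"location": "Kuwait City, Kuwait", "unit": "celsius"},
    }],
    "stop_reason": "tool_use",
}
msg = build_tool_result_message(
    response,
    run_tool=lambda name, args: "Currently 38C and sunny in Kuwait City.",
)
```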
Errors from the Messages API follow the Anthropic error format:
{
"type": "error",
"error": {
"type": "invalid_request_error",
"message": "max_tokens: field required"
}
}
| HTTP Status | Error Type | Description |
|---|---|---|
| 400 | invalid_request_error | Malformed request body, missing required fields, or invalid parameter values. |
| 401 | authentication_error | Missing or invalid API key. |
| 403 | permission_error | Your API key does not have permission for the requested action. |
| 404 | not_found_error | The requested model does not exist or is not available. |
| 429 | rate_limit_error | Too many requests. Wait and retry with exponential backoff. |
| 500 | api_error | Internal server error. Retry the request or contact support. |
| 529 | overloaded_error | The API is temporarily overloaded. Retry after a brief wait. |
Retry strategy: For 429 and 529 errors, implement exponential backoff
starting at 1 second. For 500 errors, retry up to 3 times before failing.
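That strategy can be sketched as a small wrapper. This is illustrative only; send_request is a placeholder for your own HTTP call, and the delays follow the guidance above:

```python
import time
import random

RETRYABLE = {429, 500, 529}

def with_backoff(send_request, max_attempts=4, base_delay=1.0):
    """Retry on 429/500/529, doubling the delay each attempt with jitter."""
    for attempt in range(max_attempts):
        status, body = send_request()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return status, body

# Example with a fake transport that fails twice, then succeeds.
calls = iter([(529, None), (429, None), (200, {"type": "message"})])
status, body = with_backoff(lambda: next(calls), base_delay=0.01)
```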
Both endpoints coexist on the same backend. Here is a side-by-side comparison:
| Feature | OpenAI Format | Anthropic Format |
|---|---|---|
| Endpoint | /v1/chat/completions | /v1/messages |
| HTTP Method | POST | POST |
| Auth Header | Authorization: Bearer | x-api-key or Authorization: Bearer |
| System Message | In messages array: {"role": "system", ...} | Top-level "system" field |
| max_tokens | Optional | Required |
| Response Content | choices[0].message.content (string) | content[0].text (array of blocks) |
| Stop Reason | finish_reason: "stop", "length" | stop_reason: "end_turn", "max_tokens" |
| Usage Tokens | prompt_tokens / completion_tokens | input_tokens / output_tokens |
| Streaming Format | SSE with data: {} lines | SSE with typed event: + data: |
| Tool Calling | functions / tools parameter | tools with input_schema |
| Credit System | Same credits | Same credits |
| API Keys | Same keys | Same keys |
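The mapping in the table can be expressed as a small converter from an OpenAI-style request body to the Anthropic format. A sketch covering only the fields listed above (the default max_tokens value is an arbitrary choice, since the OpenAI format allows omitting it):

```python
def openai_to_anthropic(body: dict, default_max_tokens: int = 1024) -> dict:
    """Convert an OpenAI chat.completions body to a /v1/messages body."""
    system_parts = [m["content"] for m in body["messages"]
                    if m["role"] == "system"]
    out = {
        "model": body["model"],
        # max_tokens is optional in the OpenAI format but required here.
        "max_tokens": body.get("max_tokens", default_max_tokens),
        "messages": [m for m in body["messages"] if m["role"] != "system"],
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)  # system becomes top-level
    return out

converted = openai_to_anthropic({
    "model": "mistral",
    "messages": [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hi"},
    ],
})
```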
Install the official Anthropic SDK and point it at LLM Resayil:
pip install anthropic
import anthropic
client = anthropic.Anthropic(
api_key="YOUR_API_KEY",
base_url="https://llmapi.resayil.io/v1"
)
message = client.messages.create(
model="mistral",
max_tokens=1024,
messages=[
{"role": "user", "content": "Explain quantum computing in simple terms."}
]
)
print(message.content[0].text)
import anthropic
client = anthropic.Anthropic(
api_key="YOUR_API_KEY",
base_url="https://llmapi.resayil.io/v1"
)
with client.messages.stream(
model="mistral",
max_tokens=1024,
messages=[
{"role": "user", "content": "Write a short story about AI."}
]
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
apiKey: 'YOUR_API_KEY',
baseURL: 'https://llmapi.resayil.io/v1'
});
const message = await client.messages.create({
model: 'mistral',
max_tokens: 1024,
messages: [
{ role: 'user', content: 'What are the benefits of TypeScript?' }
]
});
console.log(message.content[0].text);
curl -X POST https://llmapi.resayil.io/v1/messages \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "llama3.1",
"max_tokens": 2048,
"system": "You are a senior software architect. Answer concisely.",
"messages": [
{"role": "user", "content": "When should I use microservices vs monolith?"}
]
}'
What is and is not currently supported in our Messages API implementation:
| Feature | Status | Notes |
|---|---|---|
| Text messages | Supported | Full support for text input/output. |
| System prompt | Supported | Top-level system field. |
| Multi-turn conversations | Supported | Alternating user/assistant messages. |
| Streaming (SSE) | Supported | All standard event types. |
| Tool calling | Supported | Tool definitions, tool_use responses, tool_result messages. |
| Temperature / top_p / top_k | Supported | Standard sampling parameters. |
| Stop sequences | Supported | Custom stop strings. |
| Image input (vision) | Not Supported | Image content blocks are not accepted. Use text-only messages. |
| PDF input | Not Supported | Document content blocks are not accepted. |
| Prompt caching | Not Supported | The cache_control parameter is ignored. |
| Batch API | Not Supported | Use individual requests. |
| Extended thinking | Not Supported | The thinking parameter is not available. |
If you already use the Anthropic SDK, switching to LLM Resayil requires minimal changes.
Set the following environment variables for your application:
ANTHROPIC_BASE_URL=https://llmapi.resayil.io/v1
ANTHROPIC_API_KEY=sk-your-llm-resayil-api-key
Credits are deducted identically whether you use the OpenAI or Anthropic endpoint.
Token usage reported in the response (input_tokens and output_tokens)
maps directly to credit consumption based on the model tier.
Check your credit balance on the dashboard or via the Credits documentation.
If you are migrating from the official Anthropic API:
1. Change base_url to https://llmapi.resayil.io/v1.
2. Change model to one of our available models (e.g., "mistral", "llama3.1").
3. Remove unsupported parameters (e.g., cache_control, image blocks).
4. For the development environment, use https://llmdev.resayil.io/v1.
That's it. Your existing code, error handling, and streaming logic all work unchanged. The response format is identical to what the official Anthropic API returns.
| Problem | Cause | Solution |
|---|---|---|
| "max_tokens: field required" | Missing max_tokens in request body | Add "max_tokens": 1024 (required for the Messages API, unlike OpenAI). |
| 401 authentication_error | Invalid or missing API key | Check that x-api-key or Authorization: Bearer is set correctly. |
| "model not found" | Using an Anthropic model name like "claude-3-sonnet" | Use LLM Resayil model names. See Available Models. |
| Empty response content | max_tokens set too low | Increase max_tokens to give the model room to respond. |
| Streaming not working | Missing "stream": true or incorrect SSE parsing | Ensure stream is a boolean (not a string). Parse event: and data: lines. |
| Tool result not accepted | Mismatched tool_use_id | The tool_use_id in your tool_result must match the id from the tool_use response. |
Yes. The OpenAI /v1/chat/completions and Anthropic /v1/messages endpoints
coexist independently. You can call either one using the same API key. Credits are shared across both.
No. Your existing LLM Resayil API key works for both endpoints. No additional configuration is needed.
All models available on LLM Resayil work with both endpoints. The model names are the same
(e.g., "mistral", "llama3.1"). Note that Anthropic-specific model names
like "claude-3-sonnet" are not available.
No. Credits are calculated identically regardless of which endpoint you use. The same model with the same token count costs the same credits.
Not currently. Only text content blocks are supported. Image and PDF input will be rejected with an error.
It is recommended but not strictly required. If omitted, the API defaults to the latest supported version.
For production use, pin to 2023-06-01 for consistent behavior.