Available Models

36+ AI Models at Your Fingertips

OpenAI-compatible API — switch models instantly by changing a single field.


Credit Multiplier System

Every request consumes credits equal to the number of tokens processed multiplied by the model's credit multiplier. The base rate is 1.0×; multipliers range from 0.5× for lightweight models up to 4.0× for the largest frontier models in the catalog below.

| Multiplier | Tier | Description |
|---|---|---|
| 0.5× | Standard (Lightweight) | Small, ultra-fast models ideal for simple tasks |
| 1.0× | Frontier (Embedding) | Lightweight embedding models |
| 1.5× | Standard (Mid) | Mid-size models with strong performance |
| 2.5× | Frontier (Mid) | Balanced frontier models for quality and cost |
| 3.5× | Frontier (Large) | Highest-performance, largest-scale models |

Example: A request consuming 1,000 tokens on a 1.5× model deducts 1,500 credits. Monitor exact consumption via the usage field in every response.
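The deduction rule can be sketched as a one-line calculation. This is a minimal illustration of the documented formula; the function name is ours, not part of the API:

```python
def credit_cost(tokens: int, multiplier: float) -> int:
    """Credits deducted for a request: tokens processed x model multiplier."""
    return round(tokens * multiplier)

# The worked example from the docs: 1,000 tokens on a 1.5x model.
print(credit_cost(1000, 1.5))  # 1500
```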

Standard Models (0 listed)

Models optimized for fast performance and efficient credit usage. The base rate is 1 credit per token, adjusted by the model's multiplier. No standard models are currently listed.

Frontier Models (36 listed)

Frontier models provide access to some of the most powerful AI models available, with tens or hundreds of billions of parameters. Multipliers in the current catalog range from 1.5× to 4.0×.

| Model | Description | Type | Size | Multiplier | Context |
|---|---|---|---|---|---|
| nemotron-3-super | NVIDIA Nemotron 3 Super model | chat | Small | 3.5× | 128K |
| qwen3.5:397b | Qwen 3.5 with 397B parameters (MoE) | thinking | Large | 3.5× | 33K |
| gpt-oss:120b | OpenAI open-source 120B flagship model | chat | Large | 3.5× | 128K |
| qwen3-next:80b | Qwen 3 Next with 80B parameters | thinking | Large | 3.0× | 128K |
| gpt-oss:20b | OpenAI open-source 20B model | chat | Medium | 2.5× | 128K |
| qwen3-vl:235b-instruct | Qwen3 Vision-Language 235B flagship multimodal model | vision | Large | 4.0× | 128K |
| devstral-2:123b | Mistral Devstral 2 with 123B parameters | code | Large | 3.5× | 128K |
| deepseek-v3.1:671b | DeepSeek V3.1 with 671B parameters | thinking | Large | 3.5× | 128K |
| kimi-k2.5 | Moonshot Kimi K2.5 reasoning model (~1T MoE) | thinking | Small | 3.5× | 128K |
| kimi-k2:1t | Moonshot Kimi K2 1-trillion parameter MoE model | thinking | Small | 4.0× | 128K |
| kimi-k2-thinking | Moonshot Kimi K2 extended thinking variant | thinking | Small | 4.0× | 128K |
| minimax-m2.7 | MiniMax M2.7 with 1M context window | chat | Small | 3.5× | 1M |
| minimax-m2.5 | MiniMax M2.5 long-context model | chat | Small | 3.0× | 1M |
| minimax-m2.1 | MiniMax M2.1 long-context model | chat | Small | 2.5× | 1M |
| minimax-m2 | MiniMax M2 original long-context model | chat | Small | 2.5× | 1M |
| glm-5.1 | Zhipu AI GLM-5.1 latest flagship multimodal model | vision | Small | 4.0× | 128K |
| glm-5 | Zhipu AI GLM-5 multimodal model | vision | Small | 3.5× | 128K |
| glm-4.7 | Zhipu AI GLM-4.7 multimodal model | vision | Medium | 2.5× | 128K |
| glm-4.6 | Zhipu AI GLM-4.6 multimodal model | vision | Small | 2.0× | 128K |
| deepseek-v3.2 | DeepSeek V3.2 latest version | thinking | Large | 3.5× | 128K |
| gemma4:31b | Google Gemma 4 with 31B parameters | chat | Large | 3.5× | 128K |
| gemma3:27b | Google Gemma 3 with 27B parameters | chat | Medium | 3.0× | 128K |
| gemma3:12b | Google Gemma 3 with 12B parameters | chat | Medium | 2.5× | 128K |
| gemma3:4b | Google Gemma 3 with 4B parameters | chat | Small | 1.5× | 128K |
| gemini-3-flash-preview | Google Gemini 3 Flash preview model | chat | Small | 3.5× | 1M |
| qwen3-coder:480b | Qwen 3 Coder 480B flagship coding model | code | Large | 4.0× | 128K |
| qwen3-coder-next | Qwen 3 Coder next-generation coding model | code | Small | 3.5× | 128K |
| qwen3-vl:235b | Qwen3 Vision-Language 235B base model | vision | Large | 4.0× | 128K |
| mistral-large-3:675b | Mistral Large 3 with 675B parameters | chat | Large | 4.0× | 128K |
| ministral-3:14b | Mistral Ministral 3 with 14B parameters | chat | Medium | 2.5× | 128K |
| ministral-3:8b | Mistral Ministral 3 with 8B parameters | chat | Medium | 2.0× | 128K |
| ministral-3:3b | Mistral Ministral 3 with 3B parameters, compact and fast | chat | Small | 1.5× | 128K |
| devstral-small-2:24b | Mistral Devstral Small 2 with 24B parameters | code | Medium | 2.5× | 128K |
| nemotron-3-nano:30b | NVIDIA Nemotron 3 Nano with 30B parameters | chat | Large | 3.0× | 128K |
| cogito-2.1:671b | Cogito 2.1 reasoning model with 671B parameters | thinking | Large | 4.0× | 128K |
| rnj-1:8b | RNJ-1 conversational model with 8B parameters | chat | Medium | 2.0× | 33K |

Models API Endpoint

Retrieve the full list of available models via the following OpenAI-compatible endpoint:

GET https://llmapi.resayil.io/v1/models

```bash
curl https://llmapi.resayil.io/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```
```json
{
  "object": "list",
  "data": [
    {
      "id": "llama3.2:3b",
      "object": "model",
      "created": 1700000000,
      "owned_by": "llm-resayil"
    },
    {
      "id": "qwen3.5:397b",
      "object": "model",
      "created": 1700000000,
      "owned_by": "llm-resayil"
    }
  ]
}
```
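Calling the endpoint from code follows the same shape. A minimal Python sketch using only the standard library might look like this; the helper names are ours, while the endpoint and response fields are as documented above:

```python
import json
import urllib.request

API_BASE = "https://llmapi.resayil.io/v1"

def extract_model_ids(payload: dict) -> list[str]:
    """Pull the model ids out of a /v1/models response body."""
    return [m["id"] for m in payload.get("data", [])]

def list_models(api_key: str) -> list[str]:
    """GET /v1/models and return the available model ids."""
    req = urllib.request.Request(
        f"{API_BASE}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_model_ids(json.load(resp))

# Parsing the sample response shown above (no network call needed):
sample = {
    "object": "list",
    "data": [
        {"id": "llama3.2:3b", "object": "model"},
        {"id": "qwen3.5:397b", "object": "model"},
    ],
}
print(extract_model_ids(sample))  # ['llama3.2:3b', 'qwen3.5:397b']
```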

Note: The endpoint is also accessible at GET /api/v1/models. Both paths return the same list.

Sending Requests

All models share a single OpenAI-compatible endpoint. Simply change the model field to switch models:

POST https://llmapi.resayil.io/v1/chat/completions

```json
{
  "model": "ministral-3:14b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain quantum computing in simple terms."}
  ],
  "temperature": 0.7,
  "top_p": 0.9,
  "max_tokens": 500,
  "stream": false
}
```
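The same request can be issued from Python with the standard library, and switching models really is a one-field change. This is a hedged sketch, with helper names of our own choosing; the payload shape matches the documented request above:

```python
import json
import urllib.request

ENDPOINT = "https://llmapi.resayil.io/v1/chat/completions"

def build_payload(model: str, user_prompt: str, **params) -> dict:
    """Assemble an OpenAI-style chat completion body; only `model` varies per model."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
        **params,
    }

def chat(api_key: str, payload: dict) -> dict:
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Switching models is a one-field change:
p1 = build_payload("gemma3:4b", "Explain quantum computing in simple terms.",
                   temperature=0.7, max_tokens=500)
p2 = {**p1, "model": "deepseek-v3.2"}
```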

Every response includes a usage field showing exact token consumption:

```json
"usage": {
  "prompt_tokens": 15,
  "completion_tokens": 142,
  "total_tokens": 157
}
```
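Combining the usage field with a model's multiplier gives the exact credit deduction. A small illustration; the multiplier values come from the catalog above, and the helper name is ours:

```python
def credits_used(usage: dict, multiplier: float) -> float:
    """Credits deducted = total tokens processed x the model's credit multiplier."""
    return usage["total_tokens"] * multiplier

# The sample usage block above, on a 2.5x model:
usage = {"prompt_tokens": 15, "completion_tokens": 142, "total_tokens": 157}
print(credits_used(usage, 2.5))  # 392.5
```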

Model Availability & Status

Full Access Across All Tiers

All subscription tiers have immediate access to all 36 models with no restrictions. The only differentiator is your available credit balance.

Model Updates

We continuously update the model catalog to include the latest and most capable models. New models appear immediately in GET /v1/models results and are ready to use.

Deprecations

If a model is deprecated, at least 30 days' notice is provided, along with migration guidance. Notifications are sent via email and dashboard alerts.

Related Resources

Ready to start building?

Learn about the credit system and billing to understand costs.

Go to Billing & Credits →