Available Models

36+ AI Models at Your Fingertips

OpenAI-compatible API — switch models instantly by changing a single field.


Credit Multiplier System

Every request consumes credits equal to the number of tokens processed multiplied by the model's credit multiplier. The base rate is 1.0×; multipliers range from 0.5× for lightweight models up to 4.0× for the largest frontier models in the catalog below.

| Multiplier | Tier | Description |
|---|---|---|
| 0.5× | Standard (Lightweight) | Small, ultra-fast models ideal for simple tasks |
| 1.0× | Frontier (Embedding) | Lightweight embedding models |
| 1.5× | Standard (Mid) | Mid-size models with strong performance |
| 2.5× | Frontier (Mid) | Balanced frontier models for quality and cost |
| 3.5× | Frontier (Large) | Highest-performance, largest-scale models |

Example: A request consuming 1,000 tokens on a 1.5× model deducts 1,500 credits. Monitor exact consumption via the usage field in every response.
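The deduction rule can be sketched as a one-line calculation. This is a minimal illustration of the documented formula; the function name is ours, not part of the API:

```python
def credit_cost(tokens: int, multiplier: float) -> int:
    """Credits deducted for a request: tokens processed x model multiplier."""
    return round(tokens * multiplier)

# The worked example from the docs: 1,000 tokens on a 1.5x model.
print(credit_cost(1000, 1.5))  # 1500
```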

Standard Models (0 listed)

Models optimized for fast performance and efficient credit usage. The base rate is 1 credit per token, adjusted by the model's multiplier. No standard models are currently listed.

Frontier Models (36 listed)

Frontier models provide access to some of the most powerful AI models available, with tens or hundreds of billions of parameters. Multipliers in the current catalog range from 1.5× to 4.0×.

| Model | Description | Type | Size | Multiplier | Context |
|---|---|---|---|---|---|
| nemotron-3-super | NVIDIA Nemotron 3 Super model | chat | Small | 3.5× | 128K |
| qwen3.5:397b | Qwen 3.5 with 397B parameters (MoE) | thinking | Large | 3.5× | 33K |
| gpt-oss:120b | OpenAI open-source 120B flagship model | chat | Large | 3.5× | 128K |
| qwen3-next:80b | Qwen 3 Next with 80B parameters | thinking | Large | 3.0× | 128K |
| gpt-oss:20b | OpenAI open-source 20B model | chat | Medium | 2.5× | 128K |
| qwen3-vl:235b-instruct | Qwen3 Vision-Language 235B flagship multimodal model | vision | Large | 4.0× | 128K |
| devstral-2:123b | Mistral Devstral 2 with 123B parameters | code | Large | 3.5× | 128K |
| deepseek-v3.1:671b | DeepSeek V3.1 with 671B parameters | thinking | Large | 3.5× | 128K |
| kimi-k2.5 | Moonshot Kimi K2.5 reasoning model (~1T MoE) | thinking | Small | 3.5× | 128K |
| kimi-k2:1t | Moonshot Kimi K2 1-trillion parameter MoE model | thinking | Small | 4.0× | 128K |
| kimi-k2-thinking | Moonshot Kimi K2 extended thinking variant | thinking | Small | 4.0× | 128K |
| minimax-m2.7 | MiniMax M2.7 with 1M context window | chat | Small | 3.5× | 1M |
| minimax-m2.5 | MiniMax M2.5 long-context model | chat | Small | 3.0× | 1M |
| minimax-m2.1 | MiniMax M2.1 long-context model | chat | Small | 2.5× | 1M |
| minimax-m2 | MiniMax M2 original long-context model | chat | Small | 2.5× | 1M |
| glm-5.1 | Zhipu AI GLM-5.1 latest flagship multimodal model | vision | Small | 4.0× | 128K |
| glm-5 | Zhipu AI GLM-5 multimodal model | vision | Small | 3.5× | 128K |
| glm-4.7 | Zhipu AI GLM-4.7 multimodal model | vision | Medium | 2.5× | 128K |
| glm-4.6 | Zhipu AI GLM-4.6 multimodal model | vision | Small | 2.0× | 128K |
| deepseek-v3.2 | DeepSeek V3.2 latest version | thinking | Large | 3.5× | 128K |
| gemma4:31b | Google Gemma 4 with 31B parameters | chat | Large | 3.5× | 128K |
| gemma3:27b | Google Gemma 3 with 27B parameters | chat | Medium | 3.0× | 128K |
| gemma3:12b | Google Gemma 3 with 12B parameters | chat | Medium | 2.5× | 128K |
| gemma3:4b | Google Gemma 3 with 4B parameters | chat | Small | 1.5× | 128K |
| gemini-3-flash-preview | Google Gemini 3 Flash preview model | chat | Small | 3.5× | 1M |
| qwen3-coder:480b | Qwen 3 Coder 480B flagship coding model | code | Large | 4.0× | 128K |
| qwen3-coder-next | Qwen 3 Coder next-generation coding model | code | Small | 3.5× | 128K |
| qwen3-vl:235b | Qwen3 Vision-Language 235B base model | vision | Large | 4.0× | 128K |
| mistral-large-3:675b | Mistral Large 3 with 675B parameters | chat | Large | 4.0× | 128K |
| ministral-3:14b | Mistral Ministral 3 with 14B parameters | chat | Medium | 2.5× | 128K |
| ministral-3:8b | Mistral Ministral 3 with 8B parameters | chat | Medium | 2.0× | 128K |
| ministral-3:3b | Mistral Ministral 3 with 3B parameters, compact and fast | chat | Small | 1.5× | 128K |
| devstral-small-2:24b | Mistral Devstral Small 2 with 24B parameters | code | Medium | 2.5× | 128K |
| nemotron-3-nano:30b | NVIDIA Nemotron 3 Nano with 30B parameters | chat | Large | 3.0× | 128K |
| cogito-2.1:671b | Cogito 2.1 reasoning model with 671B parameters | thinking | Large | 4.0× | 128K |
| rnj-1:8b | RNJ-1 conversational model with 8B parameters | chat | Medium | 2.0× | 33K |

Models API Endpoint

Retrieve the full list of available models via the following OpenAI-compatible endpoint:

GET https://llmapi.resayil.io/v1/models

```bash
curl https://llmapi.resayil.io/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```
```json
{
  "object": "list",
  "data": [
    {
      "id": "llama3.2:3b",
      "object": "model",
      "created": 1700000000,
      "owned_by": "llm-resayil"
    },
    {
      "id": "qwen3.5:397b",
      "object": "model",
      "created": 1700000000,
      "owned_by": "llm-resayil"
    }
  ]
}
```
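Calling the endpoint from code follows the same shape. A minimal Python sketch using only the standard library might look like this; the helper names are ours, while the endpoint and response fields are as documented above:

```python
import json
import urllib.request

API_BASE = "https://llmapi.resayil.io/v1"

def extract_model_ids(payload: dict) -> list[str]:
    """Pull the model ids out of a /v1/models response body."""
    return [m["id"] for m in payload.get("data", [])]

def list_models(api_key: str) -> list[str]:
    """GET /v1/models and return the available model ids."""
    req = urllib.request.Request(
        f"{API_BASE}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_model_ids(json.load(resp))

# Parsing the sample response shown above (no network call needed):
sample = {
    "object": "list",
    "data": [
        {"id": "llama3.2:3b", "object": "model"},
        {"id": "qwen3.5:397b", "object": "model"},
    ],
}
print(extract_model_ids(sample))  # ['llama3.2:3b', 'qwen3.5:397b']
```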

Note: The endpoint is also accessible at GET /api/v1/models. Both paths return the same list.

Sending Requests

All models share a single OpenAI-compatible endpoint. Simply change the model field to switch models:

POST https://llmapi.resayil.io/v1/chat/completions

```json
{
  "model": "ministral-3:14b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain quantum computing in simple terms."}
  ],
  "temperature": 0.7,
  "top_p": 0.9,
  "max_tokens": 500,
  "stream": false
}
```
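The same request can be issued from Python with the standard library, and switching models really is a one-field change. This is a hedged sketch, with helper names of our own choosing; the payload shape matches the documented request above:

```python
import json
import urllib.request

ENDPOINT = "https://llmapi.resayil.io/v1/chat/completions"

def build_payload(model: str, user_prompt: str, **params) -> dict:
    """Assemble an OpenAI-style chat completion body; only `model` varies per model."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
        **params,
    }

def chat(api_key: str, payload: dict) -> dict:
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Switching models is a one-field change:
p1 = build_payload("gemma3:4b", "Explain quantum computing in simple terms.",
                   temperature=0.7, max_tokens=500)
p2 = {**p1, "model": "deepseek-v3.2"}
```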

Every response includes a usage field showing exact token consumption:

```json
"usage": {
  "prompt_tokens": 15,
  "completion_tokens": 142,
  "total_tokens": 157
}
```
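Combining the usage field with a model's multiplier gives the exact credit deduction. A small illustration; the multiplier values come from the catalog above, and the helper name is ours:

```python
def credits_used(usage: dict, multiplier: float) -> float:
    """Credits deducted = total tokens processed x the model's credit multiplier."""
    return usage["total_tokens"] * multiplier

# The sample usage block above, on a 2.5x model:
usage = {"prompt_tokens": 15, "completion_tokens": 142, "total_tokens": 157}
print(credits_used(usage, 2.5))  # 392.5
```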

Model Availability & Status

Full Access Across All Tiers

All subscription tiers have immediate access to all 36 models with no restrictions. The only differentiator is your available credit balance.

Model Updates

We continuously update the model catalog to include the latest and most capable models. New models appear immediately in GET /v1/models results and are ready to use.

Deprecations

If a model is deprecated, at least 30 days' notice is provided, along with migration guidance. Notifications are sent via email and dashboard alerts.

Related Resources

Ready to start building?

Learn about the credit system and billing to understand costs.

Go to Billing & Credits →