Introduction
Developers building conversational agents often need the model to remember what was said a few turns earlier, handle user corrections, and switch languages on the fly. Multi‑turn conversations—the back‑and‑forth exchange where each request carries the full dialogue history—are the backbone of modern chatbots, virtual assistants, and customer‑support bots. LLM Resayil’s API is designed for exactly this pattern, offering OpenAI‑compatible and Anthropic‑compatible endpoints, streaming responses, function calling, vision, and full Arabic language support.
Comparison Table
| Feature | LLM Resayil | OpenAI |
|---------|--------------|---------|
| API compatibility | OpenAI‑compatible and Anthropic‑compatible | OpenAI‑compatible only |
| Arabic language support | Built‑in Arabic & multi‑language handling | Limited Arabic support (mostly English‑centric) |
| Streaming | Yes, via stream=true | Yes |
| Function calling / tool use | Yes, supports function calling and tool use | Yes (function calling) |
| Vision (image input) | Yes, vision models available | Yes (image inputs in chat requests, e.g., GPT‑4V) |
| Thinking models | Available (e.g., deepseek-v4-pro) | Available (o‑series reasoning models) |
| Pay‑per‑use pricing | Credit‑based, billed in USD | Credit‑based, billed in USD |
| Payment methods | Stripe, PayPal | Credit card, PayPal |
| Hosting | USA | Multiple regions |
| Integrations | n8n, LangChain, LiteLLM, OpenAI SDK, Anthropic SDK, Python, JavaScript, cURL | Official SDKs, LangChain, etc. |
What LLM Resayil Offers
LLM Resayil delivers a single API that speaks the languages of both OpenAI and Anthropic, letting you switch providers without changing code. Every request can include a full messages array, so the model sees the entire conversation context. The platform supports Arabic language out of the box, meaning tokenisation, diacritics and right‑to‑left script are handled natively.
In addition to standard chat, Resayil provides function calling, tool use, vision (image inputs), and thinking models that excel at step‑by‑step reasoning. All of these capabilities are accessible through the same endpoints, simplifying development and reducing integration overhead.
What OpenAI Offers
OpenAI provides a mature suite of models, extensive documentation, and a large ecosystem of libraries. Its API is widely adopted, with strong support for streaming, function calling, and vision models (e.g., GPT‑4V). Pricing is credit‑based and billed in USD, and the platform integrates with many cloud services.
Why LLM Resayil Wins for Multi‑Turn Arabic Chatbots
When your bot must converse fluently in Arabic over many turns, Resayil’s native Arabic language support eliminates the need for work‑arounds or post‑processing. Because the same endpoint can handle OpenAI‑style chat, Anthropic‑style messages, and vision inputs, you can prototype complex flows—like asking a user for a photo, analysing it, then calling an external API—all without changing the request format.
The pay‑per‑use model lets you scale from a prototype to production without upfront licensing costs, and the credit system is transparent: each token consumed is deducted from your balance.
What You Get by Using LLM Resayil
- OpenAI and Anthropic compatibility – drop‑in replacements for existing SDKs.
- Full Arabic language handling – right‑to‑left script, diacritics, and cultural nuances are processed natively.
- Streaming responses – deliver real‑time replies to users.
- Function calling & tool use – integrate external services within the conversation.
- Vision capabilities – accept image inputs for richer interactions.
- Thinking models – leverage models optimized for reasoning across turns.
- Pay‑per‑use credits – only pay for the tokens you actually consume.
- Easy integration – ready‑made adapters for n8n, LangChain, LiteLLM, Python, JavaScript, cURL, and the official SDKs.
What Are Multi‑Turn Conversations and Why They Matter
A multi‑turn conversation is a dialogue where each request includes the full history of prior exchanges. Instead of treating each user message as an isolated prompt, the model receives a messages array:
[
  {"role": "system", "content": "You are a helpful Arabic assistant."},
  {"role": "user", "content": "ما هو الطقس اليوم؟"},
  {"role": "assistant", "content": "الطقس مشمس مع درجة حرارة 28°"},
  {"role": "user", "content": "هل سيستمر هذا الطقس غداً؟"}
]
The model can reference earlier turns, understand follow‑up questions, and keep track of user intent. This capability is essential for:
- Customer support – agents can ask clarifying questions without losing context.
- Virtual assistants – users can refine commands (“play the song”, then “increase the volume”).
- Educational bots – step‑by‑step tutoring benefits from remembering previous answers.
LLM Resayil’s endpoints accept this array directly, making multi‑turn handling a matter of appending new messages to the list and sending the updated payload.
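The append-and-resend pattern can be sketched in a few lines of Python. This is a minimal illustration, not part of any Resayil SDK: the helper names `start_conversation` and `add_turn` are hypothetical, and the messages are the ones from the example array above.

```python
from typing import Dict, List

Message = Dict[str, str]


def start_conversation(system_prompt: str) -> List[Message]:
    """Create a new history that begins with the system prompt."""
    return [{"role": "system", "content": system_prompt}]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended (input is not mutated)."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role}")
    return history + [{"role": role, "content": content}]


history = start_conversation("You are a helpful Arabic assistant.")
history = add_turn(history, "user", "ما هو الطقس اليوم؟")
history = add_turn(history, "assistant", "الطقس مشمس مع درجة حرارة 28°")
history = add_turn(history, "user", "هل سيستمر هذا الطقس غداً؟")
```

The resulting `history` list is what goes into the `messages` field of the next request.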
Getting Started: API Endpoints for Multi‑Turn Conversations
Resayil provides two primary endpoints for conversational workloads:
- /v1/chat/completions – OpenAI‑compatible. Use the messages field to pass the conversation history. The response contains an assistant message that you append to the array for the next turn.
- /v1/messages – Anthropic‑compatible. The payload is similar, with role values (assistant, user, system).
Basic Request Flow
- Create the initial payload with a system prompt that defines the bot’s behaviour (e.g., Arabic assistance).
- Add the user message as the latest entry in the messages array.
- POST the payload to the chosen endpoint.
- Append the returned assistant message to the array for the next request.
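The four steps above could look roughly like this in Python, using only the standard library. Treat it as a sketch under assumptions: `post_chat` and `take_turn` are hypothetical helper names, and the response is assumed to follow the OpenAI shape (`choices[0].message.content`) described for /v1/chat/completions.

```python
import json
import urllib.request

API_URL = "https://llm.resayil.io/v1/chat/completions"  # endpoint named in this article


def post_chat(payload: dict, api_key: str = "YOUR_API_KEY") -> dict:
    """POST a payload to the chat endpoint and return the decoded JSON body."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def take_turn(messages: list, user_text: str,
              model: str = "deepseek-v4-flash", send=post_chat) -> str:
    """Run one turn: add the user message, call the API, store the reply."""
    messages.append({"role": "user", "content": user_text})
    body = send({"model": model, "messages": messages})
    reply = body["choices"][0]["message"]["content"]  # assumed OpenAI-style shape
    messages.append({"role": "assistant", "content": reply})
    return reply
```

Passing the transport in as `send` keeps the turn logic testable without a network call; in production the default `post_chat` is used.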
Streaming for Real‑Time Replies
Add "stream": true to the JSON body (or ?stream=true as a query parameter). The API will emit Server‑Sent Events (SSE), where each chunk carries a small fragment of the reply. This is useful for chat UIs that display text as it arrives.
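On the client side, the streamed chunks have to be stitched back into one assistant message before it is appended to the history. The sketch below assumes the stream follows the OpenAI chunk format (`data:` lines holding JSON with `choices[0].delta.content`, ended by `data: [DONE]`); that format is an assumption here, not something this article confirms for Resayil, and `collect_stream` is a hypothetical helper name.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> str:
    """Join the text deltas from OpenAI-style SSE lines into the full reply."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)


# Hand-written sample lines standing in for a real stream.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "القاهرة"}}]}',
    'data: {"choices": [{"delta": {"content": " هي العاصمة."}}]}',
    "data: [DONE]",
]
full_reply = collect_stream(sample)
```

The joined `full_reply` is what gets stored as the assistant message for the next turn.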
Example (cURL – OpenAI style)
curl https://llm.resayil.io/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "system", "content": "You are an Arabic‑speaking assistant."},
      {"role": "user", "content": "ما هي عاصمة مصر؟"}
    ],
    "stream": false
  }'
The response includes choices[0].message.content which you store and add to the next request's messages array.
Advanced Features for Rich Multi‑Turn Interactions
Beyond plain chat, Resayil’s API supports several advanced capabilities that can be woven into a multi‑turn flow.
Function Calling & Tool Use
Define a JSON schema for a function (e.g., get_weather) and include it in the request under tools. When the model decides a function is needed, it returns a tool_calls array in its assistant message. Your application appends that assistant message to the history, executes the function, and then sends the result back as a new message with the tool role (referencing the call's id, as in the OpenAI‑compatible format), preserving the conversation context.
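A rough sketch of that round trip in Python follows. It assumes the OpenAI-compatible tool format, where arguments arrive as a JSON string and results are conventionally returned as messages with role `tool` and a matching `tool_call_id`. The `get_weather` body, the `HANDLERS` table, and `run_tool_calls` are hypothetical stand-ins for illustration.

```python
import json

# Hypothetical tool definition in the OpenAI-style "tools" format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def get_weather(city: str) -> dict:
    """Stand-in implementation; a real one would query a weather service."""
    return {"city": city, "condition": "sunny", "temp_c": 28}


HANDLERS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and build the follow-up 'tool' messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        handler = HANDLERS[call["function"]["name"]]
        arguments = json.loads(call["function"]["arguments"])  # JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(handler(**arguments), ensure_ascii=False),
        })
    return results


# Example assistant message shaped like an OpenAI-style tool request.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Cairo\"}"},
    }],
}
tool_messages = run_tool_calls(assistant_message)
```

Both `assistant_message` and the returned `tool_messages` are appended to the history before the next request, and `TOOLS` is sent under the `tools` field.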
Vision (Image Inputs)
Select a vision‑enabled model such as qwen3-vl:235b-instruct. Send a base64‑encoded image in the content array alongside text. The model can analyse the picture, then continue the dialogue based on the visual analysis.
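One way to build such a message in Python is shown below. It assumes the OpenAI-style content-part layout (a `text` part plus an `image_url` part carrying a base64 data URL); whether Resayil expects exactly this layout is an assumption, and `image_message` is a hypothetical helper. The image bytes are a placeholder, not a real picture.

```python
import base64


def image_message(text: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a user message holding a text part and an inline base64 image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


payload = {
    "model": "qwen3-vl:235b-instruct",  # vision model named in this article
    "messages": [
        image_message("What is the total on this receipt?", b"placeholder image bytes"),
    ],
}
```

In a real application the bytes would come from the uploaded file, and later turns are appended to the same `messages` list as usual.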
Thinking Models
For tasks that require step‑by‑step reasoning—like math or planning—choose a thinking model (e.g., deepseek-v4-pro). These models are tuned for chain‑of‑thought prompting, which works well when you break a complex request into multiple turns.
Putting It All Together
A typical rich flow might look like:
- User uploads a receipt image.
- Vision model extracts the total amount.
- Assistant asks for clarification (multi‑turn).
- Function call retrieves the user's loyalty points.
- Final response combines image analysis, loyalty data, and natural language.
All steps reuse the same messages array, so the model always has the full conversational context.
Arabic Language Support in Multi‑Turn Dialogues
Resayil’s Arabic language support is a core feature. The tokeniser understands right‑to‑left scripts, diacritics, and common Arabic punctuation. This means you can send Arabic text directly without any preprocessing.
Example Conversation
[
  {"role": "system", "content": "أنت مساعد ذكي يتحدث العربية بطلاقة."},
  {"role": "user", "content": "ما هو تعريف الذكاء الاصطناعي؟"},
  {"role": "assistant", "content": "الذكاء الاصطناعي هو فرع من علوم الحاسوب يهدف إلى إنشاء أنظمة تستطيع التفكير والتعلم."},
  {"role": "user", "content": "هل يمكنك إعطائي مثالاً عملياً؟"}
]
The model retains the earlier definition and can build on it in the next turn. Because 39 active models are available, developers can experiment with different sizes and specialties (chat, vision, thinking) while keeping Arabic handling consistent.
Cultural Nuance
While the platform does not claim custom fine‑tuning for cultural nuance, the built‑in Arabic support ensures that common expressions, honorifics, and regional terminology are processed correctly, reducing the likelihood of nonsensical output.
Pricing and Scalability for Multi‑Turn Use Cases
Resayil uses a pay‑per‑use credit model billed in USD. Credits are deducted based on token consumption—both input and output tokens. To estimate costs before sending a request, call the /v1/messages/count_tokens endpoint with your messages array; the response tells you how many tokens will be counted.
Example Token Count Request
curl https://llm.resayil.io/v1/messages/count_tokens \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "ما هو سعر الصرف اليوم؟"}]}'
The returned token count helps you forecast credit usage for long, multi‑turn sessions.
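Because every request resends the whole history, input tokens accumulate faster than the length of the conversation suggests. The arithmetic can be sketched as follows; the per-message token counts are made-up illustrative numbers (in practice they would come from the count endpoint), the system prompt and any per-message overhead are ignored, and `cumulative_input_tokens` is a hypothetical helper.

```python
from typing import List


def cumulative_input_tokens(tokens_per_message: List[int]) -> int:
    """Total input tokens billed when every request resends the whole history.

    tokens_per_message lists the messages in order (user, assistant, user, ...).
    A request is sent after each user message, i.e. at the even indexes.
    """
    total = 0
    for index in range(0, len(tokens_per_message), 2):
        total += sum(tokens_per_message[: index + 1])
    return total


# Three user turns of 20 tokens each, with two 50-token replies in between.
session = [20, 50, 20, 50, 20]
billed_input = cumulative_input_tokens(session)  # 20 + 90 + 160 = 270
```

The three user messages total only 60 tokens, yet 270 input tokens are sent across the session, which is why long conversations are often trimmed or summarised.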
Scalability is straightforward: as your chatbot grows, you simply purchase more credits via the /v1/pricing/topups endpoint. Payments are accepted through Stripe or PayPal. The platform runs on infrastructure hosted in the USA, so measure latency from the regions where your users are located.
Integration Examples: SDKs and Frameworks
Resayil integrates with the most popular developer tools.
Python (OpenAI SDK compatible)
from openai import OpenAI

# Point the official OpenAI SDK (v1 or later) at the Resayil endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://llm.resayil.io/v1",
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "system", "content": "أنت مساعد عربي محترف."},
        {"role": "user", "content": "ما هو أحدث فيلم عربي؟"},
    ],
)
print(response.choices[0].message.content)
JavaScript (Node.js)
const { OpenAI } = require("openai");

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://llm.resayil.io/v1"
});

(async () => {
  const chat = await client.chat.completions.create({
    model: "deepseek-v4-flash",
    messages: [
      { role: "system", content: "أنت مساعد ذكي يتحدث العربية." },
      { role: "user", content: "ما هي آخر أخبار التقنية في الشرق الأوسط؟" }
    ]
  });
  console.log(chat.choices[0].message.content);
})();
cURL (Anthropic style)
curl https://llm.resayil.io/v1/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [
      {"role": "system", "content": "أنت خبير في الاقتصاد العربي."},
      {"role": "user", "content": "ما هو معدل النمو الاقتصادي في السعودية لعام 2023؟"}
    ]
  }'
Low‑code (n8n) & Workflow (LangChain)
Both n8n (through its OpenAI credentials) and LangChain (through its OpenAI chat model classes) accept a custom base URL. Point them to https://llm.resayil.io/v1 and select any of the 39 models. You can then chain together function calls, vision analysis, and chat steps without writing additional HTTP code.
Frequently Asked Questions
Q: How do I maintain conversation context across multiple turns with LLM Resayil?
A: Include the full messages array in each request. The array can contain system, user, and assistant roles. Append the assistant’s reply to the array before sending the next turn, ensuring the model sees the entire dialogue history.
Q: Can I use streaming with multi‑turn conversations?
A: Yes. Set stream=true (or the equivalent query parameter) on either the /v1/chat/completions or /v1/messages endpoint. The API will return incremental token chunks, allowing you to display the response in real time while preserving the same messages payload for subsequent turns.
Q: Does LLM Resayil support function calling in multi‑turn dialogues?
A: Yes. Function calling and tool use are part of the platform’s feature set. Define your functions in the request, let the model invoke them, execute the function on your side, then send the result back as a new message with the tool role (as in the OpenAI‑compatible format) within the same conversation.
Q: How do I count tokens for a multi‑turn conversation?
A: Use the /v1/messages/count_tokens endpoint. Send the messages array you plan to use, and the response will tell you the total token count, helping you estimate credit usage before making the actual call.
Q: What models are best for multi‑turn conversations in Arabic?
A: All 39 active models support Arabic, but chat‑oriented models such as deepseek-v4-flash or nemotron-3-super are commonly used for dialogue. You can browse the full list via the /v1/models endpoint and filter by category (chat, vision, thinking) to pick the one that matches your latency and reasoning needs.
Take the Next Step
Ready to build an Arabic‑enabled, multi‑turn chatbot? Register at LLM Resayil, explore the pricing page for credit packages, and dive into the documentation for full API details.