Vision-Enabled Chat on LLM Resayil

Developers building multilingual chatbots often need more than plain text. Adding image understanding opens doors to visual assistants, document analysis, and richer user experiences—especially when the application must also handle Arabic content. LLM Resayil offers a vision‑enabled chat capability that works through the same OpenAI‑compatible endpoint used for text, letting you send pictures alongside prompts. In this guide we explain what vision‑enabled chat means on Resayil, how to call it, which models can process images, pricing, advanced features, and how it stacks up against OpenAI and Anthropic.

Introduction

Quick Comparison

| Feature | LLM Resayil | OpenAI |
|---|---|---|
| API compatibility | OpenAI & Anthropic compatible | OpenAI native |
| Arabic language support | ✅ (built‑in) | ✅ (via prompts) |
| Vision (image input) | ✅ (supported feature) | ✅ (GPT‑4‑Vision) |
| Streaming responses | ✅ | ✅ |
| Function calling | ✅ | ✅ |
| Tool use | ✅ | ✅ |
| Pay‑per‑use billing | ✅ (credits) | ✅ (pay‑as‑you‑go) |
| Supported currency | USD only | Multiple |
| Payment methods | Stripe, PayPal | Credit card, PayPal |
| Hosting location | USA | Multi‑region |

What We Offer

LLM Resayil is positioned as a single API for developers who need flexible, multilingual LLM access. The platform provides:

OpenAI and Anthropic compatibility – you can use the official SDKs from either ecosystem by pointing them at the Resayil base URL, with no other code changes.
Arabic language support – the models understand and generate Arabic out of the box, which is essential for Middle‑East markets.
Vision capability – image data can be included in /v1/chat/completions or /v1/messages calls, enabling visual reasoning.
Streaming, function calling, and tool use – all of these advanced features work together with vision, allowing real‑time image‑based responses and structured output.
Pay‑per‑use credits – you are billed only for the tokens you consume, with no hidden tiers.
Integrations – ready‑to‑go connectors for n8n, LangChain, LiteLLM, the OpenAI SDK, the Anthropic SDK, as well as direct use from Python, JavaScript, or cURL.

What OpenAI Offers

OpenAI provides a robust set of models, including the GPT‑4‑Vision series, which accept image inputs and return text or JSON. Its API natively supports streaming and function calling. OpenAI also supports a wider range of billing currencies and runs a global infrastructure that can reduce latency for users outside the United States.

Why LLM Resayil Wins for Vision‑Enabled Chat

If your primary requirement is Arabic language support combined with image understanding, Resayil delivers a unified endpoint that already handles both. You avoid the overhead of managing separate providers for text and vision, and you keep all usage under a single credit‑based billing model. The ability to call functions and stream responses while processing images means you can build sophisticated conversational agents without stitching together multiple services.

What You Get by Using LLM Resayil

Single API surface for text, code, vision, and tool use.
Unified billing in USD via Stripe or PayPal, simplifying accounting.
Access to 39 active models, many of which are tuned for different tasks (chat, thinking, code, vision).
Developer‑friendly integrations that let you drop in the OpenAI or Anthropic SDKs and start sending images immediately.
Full‑stack Arabic support, from prompt understanding to generated output.

What Does Vision‑Enabled Chat Mean on LLM Resayil?

Vision‑enabled chat refers to the ability to include an image as part of a conversational request. When you call the /v1/chat/completions endpoint, you can attach a content block of type image_url (or a base64‑encoded image) alongside regular text messages. The model receives the visual context, processes it, and can generate a response that references objects, text, or scenes in the picture. This feature is listed among Resayil’s supported capabilities, alongside streaming, function calling, and tool use. It works the same way for Arabic prompts, allowing you to ask questions like “ما معنى هذا النص في الصورة؟” and receive an Arabic answer.

How to Use Image Input with LLM Resayil’s API

Below is a step‑by‑step guide for sending an image to the chat endpoint.

Choose the endpoint – either /v1/chat/completions (OpenAI‑style) or /v1/messages (Anthropic‑style). Both accept image data inside the message content; the examples in this guide use /v1/chat/completions.
Prepare the image – you can provide a public URL or embed the image as a base64 string. The field name is image_url for URLs or image_base64 for inline data.
Build the request body – include a messages array in which each entry has a role (user, assistant, or system) and a content array. The content array can contain a text part and an image part.
Set the model – pick any model that supports vision. You can discover which models have this ability by querying /v1/models and inspecting the returned metadata.
Add optional parameters – stream to receive incremental chunks, functions to enable function calling, or tools for tool use.
Send the request – use your preferred HTTP client. Below are examples in cURL and Python.

cURL Example

curl https://api.llm.resayil.io/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-vl:235b",
    "messages": [
      {"role": "user", "content": [
        {"type": "text", "text": "Describe what is happening in this picture in Arabic."},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]}
    ],
    "stream": false
  }'

Python (OpenAI SDK) Example

import openai

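# Point the official SDK at the Resayil endpoint; replace the placeholder with your own key.
client = openai.OpenAI(base_url="https://api.llm.resayil.io/v1", api_key="YOUR_API_KEY")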
response = client.chat.completions.create(
    model="qwen3-vl:235b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Explain the diagram in Arabic."},
                {"type": "image_url", "image_url": {"url": "https://example.com/diagram.png"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)

Both snippets demonstrate how the image is attached and how the response can be received in Arabic. The same pattern works for base64 images – just replace the image_url block with an image_base64 block containing the encoded string.
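
For inline images, the OpenAI Chat Completions convention is to keep the image_url block and put a base64 data URL in its url field. The sketch below builds such a content part from raw bytes. It assumes Resayil follows that convention, which this guide does not confirm, and the helper names image_part_from_bytes and build_messages are illustrative only.

```python
import base64
import mimetypes


def image_part_from_bytes(data: bytes, filename: str) -> dict:
    """Build an OpenAI-style image content part from raw image bytes."""
    # Guess the MIME type from the file name; fall back to a generic type.
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    # ASSUMPTION: inline images travel as a data URL inside the image_url block.
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }


def build_messages(prompt: str, data: bytes, filename: str) -> list:
    """Combine a text prompt and an inline image into one user message."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                image_part_from_bytes(data, filename),
            ],
        }
    ]
```

The list returned by build_messages can be passed as the messages argument in the Python example above, for instance after reading a local file with open(path, "rb").read().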

Which Models Support Vision on LLM Resayil?

Resayil’s catalog contains 39 active models covering chat, thinking, code, and vision categories. To find the models that accept images, call the /v1/models endpoint. The response includes metadata for each model, and models that have vision capability will be flagged accordingly. While we do not list the individual model names here, you can programmatically filter the list to get only the vision‑enabled options. Once you have the slug (e.g., qwen3-vl:235b), plug it into the request payload as shown in the earlier examples.
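
The filtering step might look like the sketch below. It assumes the /v1/models response follows the OpenAI list shape (a data array of objects with an id) and that vision support appears as a "vision" entry in a capabilities list. The capabilities field name is a guess made for illustration, so inspect the real metadata and adjust it before relying on this.

```python
def vision_model_ids(models_response: dict, flag: str = "vision") -> list:
    """Return the ids of models whose metadata lists the given capability."""
    ids = []
    for model in models_response.get("data", []):
        # ASSUMPTION: capabilities are exposed as a list under "capabilities".
        capabilities = model.get("capabilities") or []
        if flag in capabilities:
            ids.append(model["id"])
    return ids


# Hand-written sample in the assumed shape; not real API output.
sample = {
    "object": "list",
    "data": [
        {"id": "qwen3-vl:235b", "capabilities": ["chat", "vision"]},
        {"id": "example-text-model", "capabilities": ["chat"]},
        {"id": "example-no-metadata"},
    ],
}
print(vision_model_ids(sample))
```

In practice the dictionary would come from a GET request to /v1/models with your API key rather than from a hard-coded sample.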

Pricing and Billing for Vision Requests

Vision requests are billed the same way as text‑only requests: on a pay‑per‑use credit basis, with no separate surcharge for image processing. Credits are deducted based on the tokens consumed, including, where applicable, tokens counted for the image payload. All billing is in USD, payable via Stripe or PayPal. You can view the current rates on the /v1/pricing endpoint or the pricing page in the dashboard.
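
As a rough illustration of token‑based billing, the sketch below estimates the cost of one request from its token counts. The field names prompt_tokens and completion_tokens follow the OpenAI usage object. The rates are made‑up placeholders rather than Resayil prices, and the idea that image tokens are counted inside prompt_tokens is an assumption.

```python
def estimate_cost_usd(usage: dict, input_rate: float, output_rate: float) -> float:
    """Estimate request cost from token counts and per-million-token rates."""
    # ASSUMPTION: any image tokens are already included in prompt_tokens.
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    total = prompt_tokens * input_rate + completion_tokens * output_rate
    return total / 1_000_000


# Placeholder numbers for illustration only; look up the real rates.
example_usage = {"prompt_tokens": 1200, "completion_tokens": 300}
cost = estimate_cost_usd(example_usage, input_rate=2.0, output_rate=6.0)
print(f"Estimated cost: ${cost:.4f}")
```

With the OpenAI Python SDK, the token counts for a non‑streamed call are available on response.usage, so the same arithmetic can be applied to real responses once the actual rates are known.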

Streaming, Function Calling, and Tool Use with Vision

One of Resayil’s strengths is the ability to mix vision with other advanced features.

Streaming – Set stream: true in your request and receive Server‑Sent Events (SSE) that deliver the model’s answer chunk by chunk. This works even when the request includes an image, allowing you to display partial results as soon as they are generated.
Function Calling – Define a function schema in the functions array. When the model decides that a function should be invoked (for example, to extract structured data from a chart image), it will return a function_call object that you can execute on your backend.
Tool Use – Similar to function calling, tools let you hook external services into the conversation. An image of a receipt could trigger a tool that parses the total amount and returns a structured receipt object.

By combining these, you can build sophisticated agents such as a visual customer‑support bot that receives a screenshot, streams a helpful answer, and automatically creates a support ticket via a tool.
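
On the client side, consuming a streamed answer mostly means joining text deltas as they arrive. The sketch below assumes chunks shaped like those yielded by the OpenAI Python SDK (chunk.choices[0].delta.content). The helper name collect_stream and its on_text callback are illustrative.

```python
from typing import Callable, Iterable


def collect_stream(chunks: Iterable, on_text: Callable[[str], None] = print) -> str:
    """Join the text deltas of a Chat Completions stream into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks (for example a trailing usage chunk) carry no choices.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        text = getattr(delta, "content", None)
        if text:
            on_text(text)  # Surface the partial output right away.
            parts.append(text)
    return "".join(parts)
```

With the OpenAI SDK, the object returned by client.chat.completions.create(..., stream=True) can be passed in as chunks, and on_text can be swapped for whatever updates your UI.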

Comparing LLM Resayil’s Vision Support with OpenAI and Anthropic

Both OpenAI and Anthropic have released vision‑enabled models (e.g., GPT‑4‑Vision, Claude‑3‑Opus with image input), and each API uses its own endpoint and request structure for images. With Resayil, you get a single, OpenAI‑compatible endpoint that also respects Anthropic’s request format, so you can switch between model families without rewriting your integration. Resayil’s built‑in Arabic language handling also reduces the need for the prompt engineering often required on other platforms to get high‑quality Arabic output. Unified billing and US‑based hosting simplify compliance for teams that want a single point of contact.
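
To make the difference in request formats concrete, the sketch below builds the same user turn in both styles. The block shapes follow the public OpenAI Chat Completions and Anthropic Messages formats. Whether Resayil's /v1/messages endpoint accepts exactly this Anthropic‑style block is an assumption based on its stated compatibility, and the function names are illustrative.

```python
def openai_style_message(prompt: str, image_url: str) -> dict:
    """User message in the OpenAI Chat Completions format."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def anthropic_style_message(
    prompt: str, image_b64: str, media_type: str = "image/jpeg"
) -> dict:
    """User message in the Anthropic Messages format (base64 image source)."""
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": media_type,
                    "data": image_b64,
                },
            },
            {"type": "text", "text": prompt},
        ],
    }
```

The first shape matches the cURL and Python examples earlier in this guide; the second would be sent to /v1/messages, for example through the Anthropic SDK.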

Code Example: Sending an Image to a Vision Model

{
  "model": "qwen3-vl:235b",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "ما هو النص الموجود في هذه الصورة؟"},
        {"type": "image_url", "image_url": {"url": "https://example.com/arabic-sign.jpg"}}
      ]
    }
  ],
  "stream": true,
  "functions": [
    {
      "name": "extract_text",
      "description": "Extract Arabic text from the image",
      "parameters": {
        "type": "object",
        "properties": {
          "text": {"type": "string", "description": "The extracted Arabic text"}
        },
        "required": ["text"]
      }
    }
  ]
}

The JSON payload above can be sent directly to /v1/chat/completions. It streams the answer, and if the model decides the extract_text function should be called, it will return a function_call object you can act on.
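
Acting on a function_call object could look like the sketch below. It assumes the OpenAI convention that the object carries a name and a JSON‑encoded arguments string. The local extract_text handler and the HANDLERS registry are hypothetical stand‑ins for your own backend code.

```python
import json


def extract_text(text: str) -> dict:
    """Hypothetical backend handler matching the extract_text schema."""
    return {"text": text.strip()}


# Map function names from the request payload to local handlers.
HANDLERS = {"extract_text": extract_text}


def dispatch_function_call(function_call: dict) -> dict:
    """Run the local handler named in a function_call object."""
    name = function_call["name"]
    # Arguments arrive as a JSON-encoded string, not as a parsed object.
    args = json.loads(function_call.get("arguments") or "{}")
    if name not in HANDLERS:
        raise KeyError(f"No handler registered for function {name!r}")
    return HANDLERS[name](**args)
```

When stream is true, the arguments string typically arrives in fragments across several chunks, so concatenate the fragments before passing the completed object to dispatch_function_call.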

Frequently Asked Questions

Q: Can I send images to LLM Resayil using the OpenAI SDK?

A: Yes. Because the Resayil API is OpenAI compatible, you can use the official OpenAI Python SDK. Include an image_url (or image_base64) block inside the content array of a user message, as shown in the Python example above.

Q: Does LLM Resayil support streaming with vision inputs?

A: Absolutely. The stream flag works for any request, including those that contain images. Responses are delivered via Server‑Sent Events, allowing you to display partial output as soon as it is generated.

Q: How do I check which models on LLM Resayil support image input?

A: Call the /v1/models endpoint. The returned model list includes metadata that indicates whether a model has vision capability. Filter the list for those entries to obtain the slugs you can use for image‑based calls.

Q: Is there a separate cost for vision requests on LLM Resayil?

A: No. Vision requests are billed on the same pay‑per‑use credit model as text requests. Pricing is based on the tokens consumed, which may include tokens counted for the image payload, but there is no separate vision surcharge.

Q: Can I use function calling with image inputs on LLM Resayil?

A: Yes. Define functions in the request payload, and the model can decide to invoke them after analyzing the image. The response will contain a function_call object that you can execute on your backend.

Take the Next Step

Ready to add visual intelligence to your Arabic chatbot? Create a Resayil account, explore the pricing page, and dive into the documentation for detailed API references. With vision, streaming, and function calling all under a single, pay‑per‑use model, LLM Resayil gives you the tools to build next‑generation conversational experiences.