Custom API
Also known as BYOK (Bring Your Own Key). This is a step-by-step guide to obtaining a custom AI API token for WriteTex. If you are already familiar with the process, you can skip this guide.
Disclaimer
We (WriteTex) are not responsible for any costs incurred by using a custom API. You must follow the terms of service of your AI API service provider and the governing laws of your jurisdiction. We are not responsible for the results generated by the models from your provider, and we are not affiliated with these companies. For more information, refer to our Terms of Use.
Definitions
- OpenAI Compatible API: An API that follows OpenAI's API specification. WriteTex expects your API endpoint to be in this format.
- API Endpoint/Base URL: The root URL where your API is hosted, for example `https://api.openai.com/v1`.
- API Token/Key: A unique identifier used to authenticate requests to your API. It should be kept secret and never shared publicly. For example, `sk-1234567890`.
- Model Name/ID: The exact name or ID of the model you want to use with WriteTex, for example `gpt-5.1` or `qwen/qwen3-vl-8b-instruct`. Vague inputs such as `gpt` or `qwen` will not work.
- API Version: The version of the API you are using. WriteTex expects version `v1` on Windows, Android, and macOS. If your Base URL ends with `/v1`, remove the `/v1` from the Base URL on those devices. For more information on Base URLs and API versions, refer to the platform-specific guide.
- API Service Provider: The service provider that hosts your API. These are usually well-known tech companies such as OpenAI, Alibaba, Anthropic, Tencent, Google, and ByteDance.
- Vision Language Models / Multi-Modal Models: Models capable of understanding both text and images. They are usually more powerful than text-only language models. For example, `gpt-5.1` is a multi-modal model.
- Tokens: The basic units of text that the model processes. Each request consumes tokens based on your input and the model's output, and the cost of a request is proportional to the tokens consumed. Input and output tokens are normally priced differently; for example, `gpt-5.1` is priced at $1.25 / 1M tokens for input and $10 / 1M tokens for output.
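To make the Base URL and API version concrete, here is a minimal Python sketch of how a chat completions URL is assembled. The helper is hypothetical (it is not WriteTex's actual code); it only illustrates why a trailing `/v1` must be removed from the Base URL on Windows, Android, and macOS:

```python
def chat_completions_url(base_url: str) -> str:
    # Hypothetical helper: models a client (like WriteTex on Windows,
    # Android, and macOS) that appends the /v1 API version itself.
    base = base_url.rstrip("/")
    if base.endswith("/v1"):
        # A Base URL that already ends in /v1 would otherwise be doubled.
        base = base[: -len("/v1")]
    return base + "/v1/chat/completions"

print(chat_completions_url("https://api.openai.com"))
print(chat_completions_url("https://api.openai.com/v1"))  # same result
```

Both calls produce `https://api.openai.com/v1/chat/completions`, which is why keeping the `/v1` in the Base URL on those platforms leads to a broken `/v1/v1/...` URL.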
Choose Provider and Model
When choosing a provider and model, consider the following factors:
- Performance: Look for models that perform well on OCR tasks. Refer to the LMArena Leaderboard for more information.
- Cost: Compare the cost of using different models; some models may be more expensive than others. Costs are calculated as: `Input Token Count * Input Pricing + Output Token Count * Output Pricing`.
- Model Capabilities: Your model must support vision input, which means it must be a multi-modal model such as `gpt-5.1`. Refer to the provider's website for more information.
How to check if a model supports vision input?
- Look for models with "vision" or "multi-modal" in their description.
- Check the model card on the provider's website. Usually there is an icon indicating that the model supports image input.
Here are some providers and models to consider:
| Provider | Value Model | Pricing In/Out | Performance Model | Pricing In/Out |
|---|---|---|---|---|
| OpenAI | gpt 5 mini | $0.25/$2 | gpt 5.1 | $1.25/$10 |
| Anthropic | claude sonnet 4.5 | $3/$15 | claude opus 4.5 | $5/$25 |
| Google | gemini 2.5 flash | $0.3/$2.5 | gemini 3 flash | $0.5/$3 |
| Openrouter | nvidia/nemotron-nano-12b-v2-vl:free | 0 | grok 4 | $3/$15 |
| Alibaba | qwen3 vl flash | ¥0.15/¥1.5 | qwen3 vl plus | ¥1/¥10 |
| Tencent | hunyuan turbos vision | ¥3/¥9 | hunyuan t1 vision | ¥3/¥9 |
| ByteDance | doubao seed 1.6 flash | ¥0.15/¥1.5 | doubao seed 1.6 vision | ¥0.8/¥8 |
- USD $1 ≈ CNY ¥7.1
- Prices are shown per million tokens.
For each request made at WriteTex, you typically consume around 300 to 1000 input tokens and 10 to 100 output tokens.
Many providers offer a free tier, so you can often sign up for a free account and start using the models without any cost.
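Combining the pricing formula with the typical token counts above gives a rough per-request cost estimate. A minimal Python sketch, using the gpt-5.1 prices from the table (real usage varies with image size and output length):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price: float, output_price: float) -> float:
    # Cost = Input Token Count * Input Pricing + Output Token Count * Output Pricing
    # Prices are quoted per million tokens.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Worst case of a typical WriteTex request with gpt-5.1 ($1.25 in / $10 out):
# 1000 input tokens and 100 output tokens.
print(round(request_cost_usd(1000, 100, 1.25, 10.0), 6))  # 0.00225
```

Even at performance-tier pricing this is a fraction of a cent per request; the value-tier models in the table cost several times less.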
Obtain API Config
An API config consists of three components:
- API Endpoint
- API Key
- Model ID
The Base URL is usually found in the provider's documentation. Obtaining the API Key and Model ID is also fairly simple:
- Register an account at a provider.
- Read the provider's documentation.
- Create an API token on the provider's platform.
- Choose your model and get model ID.
OpenAI
- Register an OpenAI account.
- Read the OpenAI API documentation
- Create a token at OpenAI Platform.
- Choose your model, for example GPT 5.1 with model ID `gpt-5.1`.
Base URL: https://api.openai.com/v1
Anthropic
- Register an Anthropic console account.
- Read the Anthropic API documentation
- Create a token at Anthropic Platform.
- Choose your model at Models overview, for example Claude Sonnet 4.5 with model ID `claude-sonnet-4-5`.
Base URL: https://api.anthropic.com/v1
Google
- Register a Google account and log into Google AI Studio.
- Read the Gemini API Docs
- Create a token at Google AI Studio
- Choose your model, for example Gemini 2.5 Flash with model ID `gemini-2.5-flash`.
Base URL: https://generativelanguage.googleapis.com/v1beta/openai
You can also use the Gemini API through Google Vertex AI.
Google offers Gemini 2.5 Flash for free with a daily limit of 20 requests at Google AI Studio.
Openrouter
Openrouter is a model router that allows you to use multiple models from different providers. You can refer to the Openrouter documentation for more information.
There are often free models available on Openrouter. For example, `nvidia/nemotron-nano-12b-v2-vl:free` is a free model that you can use without any cost.
Base URL: https://openrouter.ai/api/v1
Alibaba
- Register an Aliyun account at Aliyun
- Read the Aliyun API documentation
- Follow this guide to create an API Key at API Key
- Choose your model from the Model Market, for example Qwen3 VL Plus with model ID `qwen3-vl-plus`.
Base URL: https://dashscope.aliyuncs.com/compatible-mode/v1
For newly registered users, there is a free quota of 1M tokens per model for the first three months.
Tencent
- Register a Tencent Cloud account
- Read the Tencent API documentation
- Create an API Key at Tencent Cloud Console
- Choose your model from the Model Square, for example Hunyuan Turbos Vision with model ID `hunyuan-turbos-vision`.
Base URL: https://api.hunyuan.cloud.tencent.com/v1
For newly registered users, there is a free quota of 1M tokens shared across all models.
ByteDance
- Register a Volc Engine account
- Read the Volc Engine API documentation
- Create an API Key at Volc Engine Console
- Choose your model from the Model Square, for example Doubao Seed 1.6 Vision with model ID `doubao-seed-1-6-vision-250815`.
Base URL: https://ark.cn-beijing.volces.com/api/v3
For newly registered users, there is a free quota of 0.5M tokens per model.
About Deepseek
The models provided directly by DeepSeek do not support vision input. Open-source models like DeepSeek-OCR can recognize math equations, but require self-hosting or finding a separate provider.
Test your API (if needed)
Now suppose you have obtained the API Key and Model ID. You can test your API with a tool like CherryStudio: download the Cherry Studio app and enter your API settings to send a test message. Try inserting a picture to check that the model supports vision input.
If you don't feel like downloading an app, you can also test your API by sending a simple request. Here is an example request for `qwen3-vl-plus` using curl. Copy and paste this command into your terminal or CMD:
```shell
curl https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
  -H "Authorization: Bearer sk-1234567890" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-vl-plus",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Hello"
          }
        ]
      }
    ]
  }'
```

Replace https://dashscope.aliyuncs.com/compatible-mode/v1 with your own Base URL, but keep the /chat/completions part the same.
Replace qwen3-vl-plus with your own Model ID.
Replace sk-1234567890 with your own API Key.
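If you prefer Python to curl, the same test request can be built with only the standard library. This is a sketch with the same placeholder values; swap in your own Base URL, API Key, and Model ID:

```python
import json
import urllib.request

BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1"  # placeholder
API_KEY = "sk-1234567890"                                       # placeholder
MODEL_ID = "qwen3-vl-plus"                                      # placeholder

def build_request(text: str) -> urllib.request.Request:
    # Same OpenAI-compatible payload as the curl example above.
    payload = {
        "model": MODEL_ID,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": text}]}
        ],
    }
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

# Sending it requires a valid key, so the call is left commented out:
# with urllib.request.urlopen(build_request("Hello")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```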
Successful Response:
```json
{
  "choices": [
    {
      "message": {
        "content": "Hello! How can I help you today?😊",
        "reasoning_content": "",
        "role": "assistant"
      }
    }
  ]
}
```

Configure the settings in WriteTex
Advanced
In this part, the authors assume the reader knows what they are doing. Hosting a custom model is beyond the scope of this guide and is not recommended for beginners or users without a computer science background.
Self Hosting
ollama is a platform that lets you run large language models on your own machine. It provides a simple API, including an OpenAI compatible endpoint (by default at http://localhost:11434/v1). Refer to the ollama documentation for more information. As an example, DeepSeek-OCR is a model you can run locally for WriteTex.
vLLM is a high-performance inference engine for large language models. It supports a variety of models and provides an OpenAI compatible API; refer to the vLLM documentation for more information. I recommend trying HunyuanOCR and DeepSeek-OCR for fast, lightweight local LaTeX OCR in WriteTex.