OriginalPoint API
The OriginalPoint API provides a single, unified endpoint to access over 50 AI models from 9 providers. It is fully compatible with the OpenAI SDK, so you can switch with a single line of code. All requests are routed intelligently based on your configuration for cost, latency, and reliability.
This reference covers authentication, available endpoints, request and response formats, streaming, error handling, and rate limits.
Base URL
https://api.originalpoint.ai/v1
Authentication
All API requests must include your API key in the Authorization header using the Bearer scheme. API keys are prefixed with op_sk_ and can be created from your dashboard.
Authorization: Bearer op_sk_your_api_key
Server-side only
Never expose your API key in client-side code, browser JavaScript, or public repositories. All API calls should be made from your server or a secure backend environment.
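One common way to keep the key server-side is to load it from an environment variable at startup and fail fast if it looks wrong. A minimal Python sketch (the variable name ORIGINALPOINT_API_KEY and the require_key helper are illustrative, not part of the API):

```python
import os

def require_key(api_key: str) -> str:
    """Fail fast if the key does not carry the documented op_sk_ prefix."""
    if not api_key.startswith("op_sk_"):
        raise ValueError("expected an OriginalPoint key prefixed with op_sk_")
    return api_key

def make_client():
    """Build an OpenAI-SDK client whose key comes from the environment,
    never from source code or a client-side bundle."""
    import openai  # deferred import so require_key stays dependency-free
    return openai.OpenAI(
        base_url="https://api.originalpoint.ai/v1",
        api_key=require_key(os.environ["ORIGINALPOINT_API_KEY"]),
    )
```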
Quickstart
Send your first request in under a minute. Install the OpenAI SDK for your language, point it at the OriginalPoint base URL, and make a chat completion request.
import openai
client = openai.OpenAI(
base_url="https://api.originalpoint.ai/v1",
api_key="op_sk_your_api_key",
)
response = client.chat.completions.create(
model="claude-sonnet-4",
messages=[
{"role": "user", "content": "Explain quantum computing in one paragraph."}
],
)
print(response.choices[0].message.content)
Models
List all models available through the API. The response follows the OpenAI list models format and includes models from every supported provider.
List models
/v1/models
{
"object": "list",
"data": [
{
"id": "gpt-4o",
"object": "model",
"created": 1700000000,
"owned_by": "openai"
},
{
"id": "claude-sonnet-4",
"object": "model",
"created": 1700000000,
"owned_by": "anthropic"
}
]
}
See the Model Directory for a full list of supported models, pricing, and capabilities.
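Because the payload follows the OpenAI list format, it is easy to post-process. A small helper (a sketch, not part of the API) that groups model IDs by their owned_by provider:

```python
from collections import defaultdict

def models_by_provider(list_response: dict) -> dict:
    """Group model IDs from a /v1/models payload by their owned_by field."""
    grouped = defaultdict(list)
    for model in list_response["data"]:
        grouped[model["owned_by"]].append(model["id"])
    return dict(grouped)
```

With the OpenAI SDK, the same payload is returned by client.models.list().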
Chat Completions
Create a chat completion by sending a list of messages. The API will return the model's response. This is the primary endpoint for interacting with language models.
/v1/chat/completions
Request body
{
"model": "claude-sonnet-4",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "What is the capital of France?"
}
],
"max_tokens": 256,
"temperature": 0.7,
"stream": false
}
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The model ID to use for completion. |
| messages | array | Yes | An array of message objects with role and content. |
| max_tokens | integer | No | Maximum number of tokens to generate. Defaults to the model limit. |
| temperature | number | No | Sampling temperature between 0 and 2. Default is 1. |
| stream | boolean | No | If true, responses are streamed as server-sent events. |
Response
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1700000000,
"model": "claude-sonnet-4",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 8,
"total_tokens": 32
}
}
Streaming
Enable streaming by setting stream: true in your request. The API returns server-sent events (SSE) where each event contains a delta of the response. Streaming works across all supported models and providers.
const stream = await client.chat.completions.create({
model: "claude-sonnet-4",
messages: [{ role: "user", content: "Write a haiku about APIs." }],
stream: true,
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content;
if (content) process.stdout.write(content);
}
Each streamed chunk follows the OpenAI format with a delta object containing incremental content. The stream ends with a [DONE] message.
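The same loop in Python, with chunk accumulation factored into a helper so the full reply can be reassembled. A sketch: collect_stream and stream_haiku are illustrative names; only the stream=True flag and the delta shape come from the API above.

```python
def collect_stream(chunks) -> str:
    """Concatenate the incremental delta content from streamed chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
    return "".join(parts)

def stream_haiku(client) -> str:
    """client is an OpenAI-SDK client pointed at the OriginalPoint base URL."""
    stream = client.chat.completions.create(
        model="claude-sonnet-4",
        messages=[{"role": "user", "content": "Write a haiku about APIs."}],
        stream=True,
    )
    return collect_stream(stream)
```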
Error Handling
The API uses standard HTTP status codes to indicate the success or failure of a request. Errors include a JSON body with a descriptive message and error type.
{
"error": {
"message": "Invalid API key provided.",
"type": "authentication_error",
"code": 401
}
}
| Code | Name | Description |
|---|---|---|
| 400 | Bad Request | The request body is malformed or missing required fields. |
| 401 | Unauthorized | The API key is missing or invalid. |
| 403 | Forbidden | The API key does not have permission to access the requested resource. |
| 429 | Too Many Requests | You have exceeded your rate limit. Retry after the period indicated in the Retry-After header. |
| 500 | Internal Server Error | An unexpected error occurred on our servers. If this persists, contact support. |
| 502 | Bad Gateway | The upstream model provider returned an invalid response. The request was not billed. |
| 503 | Service Unavailable | The requested model or provider is temporarily unavailable. Failover may be attempted automatically. |
Rate Limits
Rate limits are applied per API key and vary by plan. When you exceed your rate limit, the API returns a 429 status code with a Retry-After header indicating how many seconds to wait before retrying.
| Plan | Requests / min | Tokens / day | Concurrent |
|---|---|---|---|
| Free | 60 | 100,000 | 5 |
| Pro | 600 | Unlimited | 50 |
| Enterprise | Custom | Unlimited | Custom |
Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset.
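These headers can be read to throttle proactively instead of waiting for a 429. A sketch: anything beyond the three header names above, such as the format of X-RateLimit-Reset, is an assumption, so its value is left unparsed here.

```python
def parse_rate_limit(headers: dict) -> dict:
    """Pull the three documented rate-limit headers, case-insensitively."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return {
        "limit": int(lowered["x-ratelimit-limit"]),
        "remaining": int(lowered["x-ratelimit-remaining"]),
        "reset": lowered["x-ratelimit-reset"],  # format not specified above; left raw
    }

def should_pause(headers: dict, floor: int = 1) -> bool:
    """True when the remaining request budget drops below a chosen floor."""
    return parse_rate_limit(headers)["remaining"] < floor
```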
SDKs
OriginalPoint works with OpenAI-compatible SDKs in every major language, including the community-maintained Go and Ruby libraries below. Simply set the base URL to https://api.originalpoint.ai/v1 and use your OriginalPoint API key.
Python
openai
pip install openai
TypeScript / Node.js
openai
npm install openai
Go
github.com/sashabaranov/go-openai
go get github.com/sashabaranov/go-openai
Ruby
ruby-openai
gem install ruby-openai