Documentation - Ship in 60 Seconds


Integrate Global Intelligence in Seconds.

Freeference provides an OpenAI-compatible API that intelligently manages model selection and provider failover for you.

🚀 Quick Start

1. Base URL

All requests target our global gateway:

https://api.freeference.idey.click/v1

2. Authentication

Requests use standard Bearer tokens:

Authorization: Bearer YOUR_FREEFERENCE_KEY

🛠️ Smart Integration

By default, Freeference handles model selection automatically based on your query's intent. Providing a model name is optional for most users.

cURL (Auto-Routing)

curl https://api.freeference.idey.click/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $FREEFERENCE_API_KEY" \
  -d '{
    "messages": [{"role": "user", "content": "How do I implement a binary search in Rust?"}]
  }'

Python (Managed Intelligence)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.freeference.idey.click/v1",
    api_key="your_freeference_key"
)

# No model required - Freeference routes based on intent
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Analyze these financial statistics."}]
)
print(response.choices[0].message.content)

🛰️ Smart Features

🧠 Automatic Intent Classification

Our routing layer uses lightweight heuristics to classify your task:

  • Code: Routes to specialized coding models (Qwen-2.5-Coder, etc.)
  • Reasoning: Routes to high-logic models (Gemini-1.5, Command-R)
  • General: Fast, cost-efficient chat models

🔄 Multi-Provider Failover

If the selected model is experiencing high latency or an outage, we automatically re-route the request to an equivalent or better model, so your application sees no downtime.
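Failover happens server-side, so no client changes are required. Even so, a thin client-side retry for transient failures pairs well with it. The sketch below is a generic helper of our own, not part of the Freeference API; the function name and backoff values are illustrative:

```python
import time

# Hypothetical client-side helper: call `fn`, retrying on exceptions
# (e.g. transient HTTP 429/502/503 errors) with exponential backoff.
def with_retries(fn, attempts=3, base_delay=0.5):
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; let the caller handle it
            time.sleep(base_delay * 2 ** attempt)

# Demo with a stub that fails twice, then succeeds.
attempt_count = {"n": 0}
def flaky():
    attempt_count["n"] += 1
    if attempt_count["n"] < 3:
        raise RuntimeError("502 Bad Gateway")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # ok
```

Wrap your `client.chat.completions.create(...)` call in `with_retries` to smooth over brief provider hiccups.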

📋 Models Endpoint

List All Models

GET /v1/models

Returns all available models with their capabilities and context windows.

Response:

{
  "object": "list",
  "data": [
    {
      "id": "text-basic",
      "object": "model",
      "owned_by": "freeference",
      "context_window": 8192,
      "capabilities": ["chat"]
    }
  ]
}
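The listing above can be filtered client-side. A small sketch (the helper name is ours) that picks out model IDs supporting a given capability, using the response shape documented here:

```python
# Hypothetical helper: extract model IDs that support a capability
# from a GET /v1/models response body.
def models_with_capability(listing, capability):
    return [
        m["id"]
        for m in listing["data"]
        if capability in m.get("capabilities", [])
    ]

# Sample data mirroring the documented response shape.
sample = {
    "object": "list",
    "data": [
        {
            "id": "text-basic",
            "object": "model",
            "owned_by": "freeference",
            "context_window": 8192,
            "capabilities": ["chat"],
        }
    ],
}

print(models_with_capability(sample, "chat"))  # ['text-basic']
```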

Random Model (Recommended for Resilience)

GET /v1/models/random

Perfect for:

  • Fallback logic when your preferred model is unavailable
  • Testing with different models
  • Building resilient applications that never break

Rate Limit: 10 requests/hour per API key

Response:

{
  "modelId": "text-basic",
  "name": "General Chat Model",
  "capabilityMapping": "text-basic",
  "contextWindow": 8192,
  "providerId": "openrouter"
}

Usage Example: Resilient App

import requests
from openai import OpenAI

client = OpenAI(
    base_url="https://api.freeference.idey.click/v1",
    api_key="your_freeference_key"
)

# Get a random working model
response = requests.get(
    "https://api.freeference.idey.click/v1/models/random",
    headers={"Authorization": "Bearer your_freeference_key"}
)
model = response.json()

# Use it in your chat completion
completion = client.chat.completions.create(
    model=model["modelId"],
    messages=[{"role": "user", "content": "Hello!"}]
)
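Since /models/random is limited to 10 requests/hour, it's worth reusing the result rather than calling it per request. A hypothetical caching wrapper (class name and TTL are our own choices):

```python
import time

# Hypothetical cache: /models/random allows 10 requests/hour, so reuse
# the last result for a while instead of fetching on every request.
class RandomModelCache:
    def __init__(self, fetch, ttl_seconds=600):
        self._fetch = fetch        # callable returning the endpoint's JSON
        self._ttl = ttl_seconds
        self._value = None
        self._expires = 0.0

    def get(self):
        now = time.monotonic()
        if self._value is None or now >= self._expires:
            self._value = self._fetch()
            self._expires = now + self._ttl
        return self._value

# Demo with a stub fetcher instead of a live HTTP call.
fetch_calls = {"n": 0}
def fake_fetch():
    fetch_calls["n"] += 1
    return {"modelId": "text-basic"}

cache = RandomModelCache(fake_fetch, ttl_seconds=600)
cache.get()
cache.get()
print(fetch_calls["n"])  # 1 — second call served from cache
```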

🎯 Best Practices

For Beginners

Don't specify a model. Let Freeference auto-route based on your prompt.

# Just send your message
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Your question"}]
)

For Production Apps

Use the random model endpoint as a fallback when your primary model fails.

try:
    response = client.chat.completions.create(
        model="your-preferred-model",
        messages=messages
    )
except Exception:
    # Fall back to a random available model
    # (get_random_model() is a helper wrapping GET /v1/models/random)
    fallback_model = get_random_model()
    response = client.chat.completions.create(
        model=fallback_model["modelId"],
        messages=messages
    )
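`get_random_model()` above is a placeholder. One possible implementation against the /models/random endpoint, using only the standard library (error handling kept minimal for brevity):

```python
import json
import urllib.request

BASE_URL = "https://api.freeference.idey.click/v1"

def get_random_model(api_key, base_url=BASE_URL):
    """Fetch a random available model from GET /v1/models/random."""
    req = urllib.request.Request(
        f"{base_url}/models/random",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # e.g. {"modelId": "text-basic", ...}

# Usage (requires a valid key):
#   model = get_random_model("YOUR_FREEFERENCE_KEY")
#   print(model["modelId"])
```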

For Advanced Users

Specify exact models when you need specific capabilities (Pro tier).

response = client.chat.completions.create(
    model="qwen/qwen-2.5-coder-32b-instruct:free",
    messages=[{"role": "user", "content": "Write a Rust function"}]
)

📊 Error Handling

  • 200: Success
  • 429: Rate limit reached
  • 502: All available providers failed (extremely rare)
  • 503: No models currently available (check /models/random)
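Applied to the codes above, a dispatch helper might look like the sketch below. The function name and retry policies are our own suggestions, not part of the API:

```python
# Hypothetical mapping of the status codes above to a client-side action.
def action_for_status(status):
    if status == 200:
        return "ok"
    if status == 429:
        return "backoff"    # rate limit reached: wait, then retry
    if status in (502, 503):
        return "fallback"   # try /models/random and retry with that model
    return "raise"          # anything else: surface the error

print(action_for_status(429))  # backoff
```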

🚨 Why This Matters

Traditional Approach:

  • Hardcode model name → Model gets deprecated → App breaks → Manual fix required

Freeference Approach:

  • Use auto-routing or random fallback → Model changes happen transparently → App never breaks

Ship once. Run forever.