# How to run inference on an AI model

The Phoeniqs Model Service exposes an OpenAI-compatible API for every Active Model. To send an inference request you need three pieces of information: a Base URL, a Model Name, and an API Key.


# Make your first call

## Get the Base API URL

All inference requests are sent to the Phoeniqs Model Service endpoint.

| Environment | Base API URL |
| --- | --- |
| Production | https://maas.phoeniqs.com/ |
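Endpoint URLs are formed by appending an OpenAI-compatible path under `v1/` to this base. A minimal sketch with Python's standard library, using the chat completions path that appears later in this guide:

```python
from urllib.parse import urljoin

# Join the production base URL with the OpenAI-compatible endpoint path.
base_url = "https://maas.phoeniqs.com/"
chat_endpoint = urljoin(base_url, "v1/chat/completions")
print(chat_endpoint)  # https://maas.phoeniqs.com/v1/chat/completions
```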

## Pick a Model Name

Choose a model from the Active Models table. The value in the Model Name column is what you pass as the `model` field in your request body.

Examples:

  • inference-llama4-maverick
  • inference-bge-m3
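The model name is passed verbatim in the request body. A minimal sketch of building that body with the standard library (`chat_body` is a hypothetical helper for illustration, not part of the service):

```python
import json

# Hypothetical helper: serialize a chat completion request body.
# The model value is copied verbatim from the Active Models table.
def chat_body(model: str, prompt: str) -> str:
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })

print(chat_body("inference-llama4-maverick", "Hello!"))
```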

## Get your API Key

Your API Key authenticates every request. Pass it in the Authorization HTTP header:

```
Authorization: Bearer YOUR_API_KEY
```
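In practice, keep the key out of source code. A minimal sketch that reads it from an environment variable (`PHOENIQS_API_KEY` is a hypothetical name, not one the service prescribes):

```python
import os

# PHOENIQS_API_KEY is a hypothetical variable name; any name works.
# Fall back to a placeholder so the example runs even when it is unset.
api_key = os.environ.get("PHOENIQS_API_KEY", "YOUR_API_KEY")
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
print(headers["Authorization"])
```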

## Send the request

With the three pieces above you can call any of the OpenAI-compatible endpoints. A minimal chat completion call looks like this in cURL, Python, and JavaScript:

cURL:

```shell
curl --location 'https://maas.phoeniqs.com/v1/chat/completions' \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "inference-llama4-maverick",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
Python (official `openai` SDK):

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://maas.phoeniqs.com/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="inference-llama4-maverick",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)
```
JavaScript (official `openai` SDK):

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://maas.phoeniqs.com/v1",
  apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create({
  model: "inference-llama4-maverick",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
```
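If you prefer not to install an SDK, the same request can be assembled with only Python's standard library. This sketch builds, but does not send, the request from the cURL example; with a real key you would pass `req` to `urllib.request.urlopen` to send it:

```python
import json
import urllib.request

# Assemble the same POST request the cURL example sends.
# YOUR_API_KEY is a placeholder for a real key.
payload = {
    "model": "inference-llama4-maverick",
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = urllib.request.Request(
    url="https://maas.phoeniqs.com/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)
print(req.get_method(), req.full_url)
```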

# See also

- [Sample API calls](../sample-api-calls/)
- [Active Models](../../active-models/)