# Phoeniqs Model Service
The Model Serving hosting solution within the Phoeniqs portal provides access to a suite of open-source Large Language Models (LLMs) with full API integration, hosted entirely on Phoeniqs infrastructure in Switzerland.
This service also supports Bring-Your-Own-Model (BYOM) subscriptions, allowing you to deploy custom models, such as those from Hugging Face, within our secure, sovereign Swiss-based infrastructure.
We use vLLM for optimized inference and provide API compatibility with the OpenAI API, including support for streamed responses. Our pricing model is based on token-per-minute throughput, calibrated as a percentage of GPU utilization for each model.
An Evaluation Plan is available to try all models at no cost. For production workloads, you can select which models to use and define the throughput you require.
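The streamed responses mentioned above follow the OpenAI server-sent-events format, where each line carries a `data:` payload with an incremental delta. A minimal sketch of consuming such a stream (the chunk payloads below are illustrative, not captured from the service):

```python
import json

def parse_stream_line(line: str):
    """Extract the text delta from one OpenAI-style SSE line, or None."""
    line = line.strip()
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":  # sentinel marking the end of the stream
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")

# Illustrative chunks in the OpenAI streaming format:
sample = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
text = "".join(t for t in (parse_stream_line(l) for l in sample) if t)
print(text)  # Hello, world
```

Because the format matches the OpenAI API, existing OpenAI client libraries can also consume these streams directly.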
# Accessing and Using Models
1. Create an account in the Phoeniqs portal.
   Tip: If you don't have one yet, follow our Getting Started guide.
2. Buy a Subscription.
   In the Phoeniqs Portal, select your desired plan.
   Tip: We recommend starting with the Evaluation Plan. If you need help, see our Buy a Subscription guide.
3. Use your token in your AI application.
   Tip: See our API usage example.
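The steps above end with passing your token to an OpenAI-compatible endpoint. A minimal sketch using only the Python standard library; the base URL, token, and model name are placeholders to be replaced with the values from your subscription:

```python
import json
import urllib.request

# Placeholders: substitute the endpoint and token from your Phoeniqs subscription.
BASE_URL = "https://api.example.com/v1"
TOKEN = "YOUR_PHOENIQS_TOKEN"

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request carrying the token."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_request("my-model", "Hello!")  # model name is illustrative
# urllib.request.urlopen(req) would send the request; omitted here.
```

Any OpenAI-compatible client works the same way: point it at your subscription's base URL and pass the token as the API key.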
# Related Resources
- OpenAI API Reference – For integrating with applications
- vLLM Documentation – Details on optimized model serving
- Hugging Face Model Hub – Find and download models for BYOM