Active Orchestration

GPU orchestration for the mid-market rack.

CALSTIQ delivers bare-metal efficiency for fleets from a quarter-rack to a few racks of H100s — the operators the hyperscaler platforms quietly ignore. No abstraction tax, no platform team required.

LLM inference: 32 req/s
Fine-tune batch: 8 jobs
Vision pipeline: 12 streams
RAG embedding: 240 req/s
Code generation: 54 req/s
Audio transcribe: 18 streams
CalstIQ
Fleet Status · Rack-01 · 98.2% Efficiency
H100-01
H100-02
H100-03
H100-04
H100-05
H100-06
H100-07
H100-08
Macro view of a GPU server rack with green status LEDs
Thermal Map: 42.1°C · Nominal
The platform

Infrastructure-first capabilities.

Every layer designed for small-to-mid scale clusters where every cycle, watt, and dollar is accounted for.

01/ Telemetry

Real-time fleet observability

Per-core utilization, thermal envelope, power draw, and NVLink topology streamed at sub-frame intervals.

4 ms Update Latency
02/ Orchestration

Fractional GPU slicing

Schedule multiple tenants on a single H100 with MIG-aware isolation, fair-share queueing, and zero hypervisor tax.

1/7th Minimum Slice
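
As a rough illustration of fair-share queueing over MIG slices, the sketch below hands out the seven slices of one H100 to the tenant with the lowest weighted usage. It is not CALSTIQ's scheduler: the `Tenant` fields, the weights, and `assign_slices` are hypothetical names chosen for this example, and every job is assumed to need exactly one 1/7th slice.

```python
from dataclasses import dataclass

SLICES_PER_GPU = 7  # an H100 in MIG mode exposes at most seven instances


@dataclass
class Tenant:
    name: str
    weight: float     # hypothetical fair-share weight (higher = larger share)
    pending: int      # queued jobs, each assumed to need one slice
    running: int = 0  # slices currently held


def assign_slices(tenants, free_slices=SLICES_PER_GPU):
    """Give free slices, one at a time, to the tenant with the lowest running/weight ratio."""
    placements = []
    for _ in range(free_slices):
        waiting = [t for t in tenants if t.pending > 0]
        if not waiting:
            break  # nothing left to schedule
        # Lowest weighted usage goes first; the name breaks ties deterministically.
        nxt = min(waiting, key=lambda t: (t.running / t.weight, t.name))
        nxt.pending -= 1
        nxt.running += 1
        placements.append(nxt.name)
    return placements


if __name__ == "__main__":
    tenants = [Tenant("team-a", weight=2.0, pending=10),
               Tenant("team-b", weight=1.0, pending=10)]
    print(assign_slices(tenants))
```

With these assumed weights, the tenant weighted 2.0 ends up with roughly twice as many slices as the tenant weighted 1.0. A production scheduler would also have to account for MIG profile sizes, reservations, and preemption.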
03/ Networking

Direct-to-metal RDMA

Optimized drivers for InfiniBand and RoCE v2 tuned for small-cluster topologies — no fabric controller required.

400G Per-Node Throughput
04/ Multi-tenancy

Tenant-aware billing

Metered usage by slice, project, and namespace. Export to your billing stack or run the built-in invoice engine.

USD/hr Per-Slice Granularity
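
To make per-slice metering concrete, here is a minimal sketch that rolls slice-level usage up to namespace and project and prices it in USD per slice-hour. The record fields, the `charges_by_project` helper, and the rate used below are assumptions for illustration only, not the product's export format or pricing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    namespace: str
    project: str
    slice_id: str  # hypothetical identifier for one GPU slice
    seconds: int   # how long the slice was held


def charges_by_project(records, usd_per_slice_hour):
    """Sum slice-hours per (namespace, project) and convert to USD."""
    totals = defaultdict(float)
    for r in records:
        totals[(r.namespace, r.project)] += r.seconds / 3600 * usd_per_slice_hour
    # Round only at the end so small records are not lost to rounding.
    return {key: round(usd, 2) for key, usd in totals.items()}


if __name__ == "__main__":
    records = [
        UsageRecord("team-a", "llm", "gpu0-s1", 3600),
        UsageRecord("team-a", "llm", "gpu0-s2", 1800),
        UsageRecord("team-b", "vision", "gpu1-s1", 7200),
    ]
    # 0.50 USD per slice-hour is an illustrative rate, not a quoted price.
    print(charges_by_project(records, usd_per_slice_hour=0.50))
```

Keeping the slice identifier on each record means the same data can be regrouped by slice or by namespace alone when exporting to an external billing stack.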
05/ Deployment

PXE-to-production in minutes

Bare-metal provisioning, image management, and driver pinning. No Kubernetes operator engineering required.

18 s Median Provision
06/ Security

Air-gap and sovereign ready

Runs fully offline. SSO, SCIM, audit logs, and signed control-plane updates suitable for regulated workloads.

SOC 2 Ready Architecture
Who it's for

Built for operators the hyperscaler platforms overlook.

CALSTIQ is the orchestration layer for teams with real hardware and finite footprints — not 100-node clusters and a six-person platform team.

Regional clouds & neoclouds

Stand up a billable GPU product on a quarter-rack of H100s without rebuilding a hyperscaler control plane.

Enterprise on-prem AI

Give internal ML teams self-service access to the GPUs sitting in your datacenter — with quotas, projects, and chargeback.

AI labs & research groups

Share scarce H100/A100 capacity across teams with fair-share scheduling, reservations, and live preemption.

Colocation operators

Offer managed GPU-as-a-Service to colo tenants — multi-tenant isolation, metering, and a customer-facing portal included.

Positioning

Right-sized for your rack.

Hyperscaler orchestrators like Rafay are engineered for fleets that don't look like yours. CALSTIQ is the inverse.

Metric | CALSTIQ | Hyperscaler orchestrators
Ideal Fleet Size | 0.25 – 10 racks | 100+ nodes, multi-region
Control-plane Overhead | ~1.2% native | 8 – 15% abstraction tax
Setup Time | Hours, PXE bootstrap | Weeks of platform engineering
Cost Structure | Fixed per rack, per year | Consumption-weighted, tiered
Air-gap Deployment | First-class | Limited or unsupported
Tenant Billing | Built-in metering | Bring-your-own integration
Get started

Ready to operationalize the hardware you already own?

We'll walk through your fleet, your workloads, and what a CALSTIQ deployment looks like — usually inside a week.