LLM inference: 32 req/s

Fine-tune batch: 8 jobs

Vision pipeline: 12 streams

RAG embedding: 240 req/s

Code generation: 54 req/s

Audio transcribe: 18 streams

CalstIQ

Active Orchestration

GPU orchestration for the mid-market rack.

CALSTIQ delivers bare-metal efficiency for fleets from a quarter-rack to a few racks of H100s — the operators the hyperscaler platforms quietly ignore. No abstraction tax, no platform team required.

Book a technical demo · View platform


Fleet Status · Rack-01 · 98.2% Efficiency

H100-01

H100-02

H100-03

H100-04

H100-05

H100-06

H100-07

H100-08

Macro view of a GPU server rack with green status LEDs

Thermal Map · 42.1°C · Nominal

The platform

Infrastructure-first capabilities.

Every layer designed for small-to-mid scale clusters where every cycle, watt, and dollar is accounted for.

01/ Telemetry

Real-time fleet observability

Per-core utilization, thermal envelope, power draw, and NVLink topology streamed at sub-frame intervals.

4 ms Update Latency

02/ Orchestration

Fractional GPU slicing

Schedule multiple tenants on a single H100 with MIG-aware isolation, fair-share queueing, and zero hypervisor tax.

1/7th Minimum Slice
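As a rough illustration of what a 1/7th minimum slice means for scheduling, the sketch below packs tenant requests onto GPUs that each expose seven slices. It is a simplified first-fit model, not CALSTIQ's scheduler: the `Gpu` class, the `place` function, and the tenant names are hypothetical, and real MIG profiles carry placement constraints that are ignored here.

```python
from dataclasses import dataclass, field

SLICES_PER_GPU = 7  # an H100 exposes up to seven MIG compute slices


@dataclass
class Gpu:
    name: str
    free: int = SLICES_PER_GPU
    tenants: dict = field(default_factory=dict)  # tenant -> slices held


def place(gpus, tenant, slices):
    """First-fit: put the request on the first GPU with enough free slices."""
    if not 1 <= slices <= SLICES_PER_GPU:
        raise ValueError("request must be between 1 and 7 slices")
    for gpu in gpus:
        if gpu.free >= slices:
            gpu.free -= slices
            gpu.tenants[tenant] = gpu.tenants.get(tenant, 0) + slices
            return gpu.name
    return None  # no capacity; a real scheduler would queue the request


# Hypothetical two-GPU fleet and three tenant requests.
fleet = [Gpu("H100-01"), Gpu("H100-02")]
requests = [("team-a", 3), ("team-b", 4), ("team-c", 2)]
placements = [place(fleet, tenant, n) for tenant, n in requests]
print(placements)
```

In this toy run the first two tenants fill H100-01 exactly (3 + 4 slices) and the third spills onto H100-02. Fair-share queueing and isolation would sit on top of a placement step like this.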

03/ Networking

Direct-to-metal RDMA

Optimized drivers for InfiniBand and RoCE v2 tuned for small-cluster topologies — no fabric controller required.

400G Per-Node Throughput

04/ Multi-tenancy

Tenant-aware billing

Metered usage by slice, project, and namespace. Export to your billing stack or run the built-in invoice engine.

USD/hr Per-Slice Granularity
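To make per-slice metering concrete, here is a minimal sketch of how usage records might roll up into charges by namespace and project. The record layout, the `invoice` function, and the flat rate of 0.50 USD per slice-hour are all assumptions for illustration, not CALSTIQ's actual billing schema or pricing.

```python
from collections import defaultdict
from decimal import Decimal

RATE_PER_SLICE_HOUR = Decimal("0.50")  # hypothetical rate in USD

# Hypothetical usage records: (namespace, project, slices, hours)
usage = [
    ("acme", "llm-inference", 2, Decimal("10")),
    ("acme", "rag-embedding", 1, Decimal("4.5")),
    ("globex", "vision", 7, Decimal("1")),
]


def invoice(records, rate):
    """Sum slice-hours times rate for each (namespace, project) pair."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for namespace, project, slices, hours in records:
        totals[(namespace, project)] += slices * hours * rate
    return dict(totals)


charges = invoice(usage, RATE_PER_SLICE_HOUR)
for (namespace, project), amount in sorted(charges.items()):
    print(f"{namespace}/{project}: {amount:.2f} USD")
```

`Decimal` is used instead of floats so that currency totals do not pick up binary rounding error. An export to an external billing stack would consume a structure like `charges`.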

05/ Deployment

PXE-to-production in minutes

Bare-metal provisioning, image management, and driver pinning. No Kubernetes operator engineering required.

18 s Median Provision

06/ Security

Air-gap and sovereign ready

Runs fully offline. SSO, SCIM, audit logs, and signed control-plane updates suitable for regulated workloads.

SOC 2 Ready Architecture

Who it's for

Built for operators the hyperscaler platforms overlook.

CALSTIQ is the orchestration layer for teams with real hardware and finite footprints — not 100-node clusters and a six-person platform team.

Regional clouds & neoclouds

Stand up a billable GPU product on a quarter-rack of H100s without rebuilding a hyperscaler control plane.

Enterprise on-prem AI

Give internal ML teams self-service access to the GPUs sitting in your datacenter — with quotas, projects, and chargeback.

AI labs & research groups

Share scarce H100/A100 capacity across teams with fair-share scheduling, reservations, and live preemption.

Colocation operators

Offer managed GPU-as-a-Service to colo tenants — multi-tenant isolation, metering, and a customer-facing portal included.

Positioning

Right-sized for your rack.

Hyperscaler orchestrators like Rafay are engineered for fleets that don't look like yours. CALSTIQ is the inverse.

Metric	CALSTIQ	Hyperscaler orchestrators
Ideal Fleet Size	0.25 – 10 racks	100+ nodes, multi-region
Control-plane Overhead	~1.2% native	8 – 15% abstraction tax
Setup Time	Hours, PXE bootstrap	Weeks of platform engineering
Cost Structure	Fixed per rack, per year	Consumption-weighted, tiered
Air-gap Deployment	First-class	Limited or unsupported
Tenant Billing	Built-in metering	Bring-your-own integration

Get started

Ready to operationalize the hardware you already own?

We'll walk through your fleet, your workloads, and what a CALSTIQ deployment looks like — usually inside a week.

Book a technical deep dive · Request pricing