A Kubernetes-native control plane for AI model inference workloads, built with Go and controller-runtime. Extends the Kubernetes API with a custom AIDeployment resource that declaratively manages deployment orchestration, autoscaling, drift detection, and observability — controllable through a CLI, HTTP tool server, or natural language interface.
The AIDeployment CRD lets you declaratively define model name, replica count, service type, resource limits, and HPA config — Kubernetes reconciles the rest.
Controller continuously compares desired vs actual state across owned Deployment, Service, and HPA resources — safely updating without disrupting autoscaler-managed replicas.
Exposes reconciliation loop metrics and business-level inference metrics to Prometheus. Deployment readiness propagated into CRD status conditions.
Manage model lifecycle via aictl CLI, REST HTTP tool server, or a natural language interface — enabling both traditional ops and AI agent-driven workflows.