Dashboard Overview

May 1 – May 11, 2026
Inference Calls: 2.84M (+12.8% vs last week)
Avg Latency: 47ms (-3.2% vs last week)
Active Models: 24 (+2 deployed today)
Uptime: 99.97% (SLA: 99.9%)

Inference Volume (live)

[Chart: daily inference volume, May 4 to May 11; y-axis 0 to 3M; series: API Calls, Batch Jobs]

Recent Activity

  • Model gpt-v4-turbo deployed to production (4m ago)
  • Batch inference job #18472 completed, 1.2M rows (12m ago)
  • New dataset customer_sentiment_v3 registered (28m ago)
  • Fine-tuning run ft-lr2e5-epoch3 finished with 0.931 F1 (1h ago)
  • Auto-scaling: +4 GPU nodes activated in us-east-1 (2h ago)
  • Alert: latency spike on classifier-v2 resolved (3h ago)
  • Team member sarah.kim created new A/B test (4h ago)

Deployed Models

Model Name               Version      Status   Throughput   Latency P99  Accuracy
gpt-v4-turbo-instruct    v2.4.1       Live     1,200 req/s  85ms         96.2%
sentiment-classifier-v3  v3.0.0       Live     3,400 req/s  22ms         94.7%
entity-extractor-pro     v1.8.2       Live     890 req/s    64ms         92.1%
image-caption-xl         v2.0.0-beta  Staging  420 req/s    310ms        88.9%
text-embed-v2            v2.3.0       Live     5,200 req/s  12ms         99.1%

Model Performance

[Chart: accuracy (%) by model: GPT4 96.2, Sent. 94.7, NER 92.1, Caption 88.9, Embed 99.1]

Resource Usage

GPU Cluster A: 78%
GPU Cluster B: 62%
Inference API: 45%
Storage: 91%