Degraded Performance in Orchestrate

Incident Report for SuperAnnotate AI

Resolved

Immediate mitigation was completed by 05:15 UTC: Orchestrate service resources were scaled up, which stabilized the system and stopped further degradation. Additional safeguards, including payload size limits and an upcoming SDK fix, have been prepared to prevent recurrence.

Posted Apr 09, 2026 - 05:15 UTC

Identified

At 04:00 UTC on April 9, 2026, degraded performance was observed in the Orchestrate service due to increased system load. The cause was a retry mechanism that repeatedly reprocessed failed jobs. The jobs failed because custom actions generated oversized payloads (~6 MB) that exceeded Kubernetes API server limits. Each failure triggered another retry, which placed continuous additional strain on the system.

Posted Apr 09, 2026 - 04:00 UTC
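
For custom action authors, the failure mode described above can be illustrated with a minimal Python sketch: check the payload size before submitting a job, treat an oversized payload as a permanent failure, and cap retries for everything else. This is not the Orchestrate or SDK implementation. All names (validate_payload, run_with_bounded_retries, PayloadTooLargeError) are hypothetical, and the 1 MiB threshold is an illustrative assumption, since the actual limit is not stated in this report.

```python
import json
import time

# Illustrative threshold only (assumption); the real limit is not stated in this report.
MAX_PAYLOAD_BYTES = 1 * 1024 * 1024
# Hypothetical cap on attempts, so a failing job cannot retry forever.
MAX_ATTEMPTS = 3


class PayloadTooLargeError(Exception):
    """Permanent failure: resubmitting the same payload cannot succeed."""


def payload_size(payload):
    """Return the size in bytes of the JSON-serialized payload."""
    return len(json.dumps(payload).encode("utf-8"))


def validate_payload(payload, limit=MAX_PAYLOAD_BYTES):
    """Raise PayloadTooLargeError if the serialized payload exceeds the limit."""
    size = payload_size(payload)
    if size > limit:
        raise PayloadTooLargeError(f"payload is {size} bytes, limit is {limit} bytes")
    return size


def run_with_bounded_retries(submit, payload, max_attempts=MAX_ATTEMPTS,
                             base_delay=0.5, sleep=time.sleep):
    """Call submit(payload), retrying failures at most max_attempts times.

    Oversized payloads are rejected before the first attempt and never retried.
    """
    validate_payload(payload)
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return submit(payload)
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                # Exponential backoff between attempts: base, 2x base, 4x base, ...
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"job failed after {max_attempts} attempts") from last_error
```

In this sketch, a ~6 MB payload is rejected by validate_payload before submit is ever called, so it never enters the retry loop. Other failures stop after the attempt cap instead of adding load indefinitely.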

This incident affected: SuperAnnotate Web Platform.