Summary
When a Worker Deployment's backing system workflow (temporal-sys-worker-deployment:*) is in a damaged or divergent state, CLI operations (set-ramping-version --delete, set-current-version --unversioned, delete-version) return success messages but don't actually persist state changes. describe shows identical values and unchanged timestamps before and after.
Note: This may be a Temporal server bug rather than a controller bug, but filing here since it was discovered in the context of the controller's reconcile failures and affects the same recovery workflow.
Environment
- Temporal CLI: latest (as of 2026-05-27)
- Temporal server: self-hosted (Cassandra-backed)
- Controller image: v1.5.x
What happened
After the controller's plan executor bug left a Worker Deployment in a corrupted state (see: plan executor delete+update bug), we attempted manual cleanup:
$ temporal worker deployment set-ramping-version --deployment-name '...' --build-id '17.233.1-497c' --delete --yes
Successfully set the ramping worker deployment version
$ temporal worker deployment describe --name '...' --namespace account
RampingVersionBuildID: 17.233.1-497c # unchanged
RampingVersionPercentage: 50 # unchanged
RampingVersionPercentageChangedTime: 4 days ago # unchanged
This pattern reproduced across multiple operations:
set-ramping-version --delete → "Successfully set" → no change
set-ramping-version --percentage 0 → "Successfully set" → no change
set-current-version --unversioned → "Successfully set" → no change
delete-version → "Successfully deleted" → version still present in describe
The deployment had no active pollers (workers had been sunset). The --allow-no-pollers and --ignore-missing-task-queues flags were used.
The system workflow (temporal-sys-worker-deployment:*) existed in the namespace and was reachable via temporal workflow describe, but its internal state was divergent from what deployment describe reported.
Recovery
Deleted the backing system workflow directly:
temporal workflow delete
--workflow-id 'temporal-sys-worker-deployment:'
--namespace account
This scrubbed the deployment from worker deployment list and allowed clean re-registration.
Expected behavior
Operations should either persist the change or return an error. Returning success without mutation is dangerous during incident recovery.
Summary
When a Worker Deployment's backing system workflow (
temporal-sys-worker-deployment:*) is in a damaged or divergent state, CLI operations (set-ramping-version --delete,set-current-version --unversioned,delete-version) return success messages but don't actually persist state changes.describeshows identical values and unchanged timestamps before and after.Note: This may be a Temporal server bug rather than a controller bug, but filing here since it was discovered in the context of the controller's reconcile failures and affects the same recovery workflow.
Environment
What happened
After the controller's plan executor bug left a Worker Deployment in a corrupted state (see: plan executor delete+update bug), we attempted manual cleanup:
Successfully set the ramping worker deployment version
This pattern reproduced across multiple operations:
set-ramping-version --delete→ "Successfully set" → no changeset-ramping-version --percentage 0→ "Successfully set" → no changeset-current-version --unversioned→ "Successfully set" → no changedelete-version→ "Successfully deleted" → version still present in describeThe deployment had no active pollers (workers had been sunset). The
--allow-no-pollersand--ignore-missing-task-queuesflags were used.The system workflow (
temporal-sys-worker-deployment:*) existed in the namespace and was reachable viatemporal workflow describe, but its internal state was divergent from whatdeployment describereported.Recovery
Deleted the backing system workflow directly:
This scrubbed the deployment from
worker deployment listand allowed clean re-registration.Expected behavior
Operations should either persist the change or return an error. Returning success without mutation is dangerous during incident recovery.