Scheduled & batched Lagoon redeploys (upgrade rollouts) #205

Open
dan2k3k4 wants to merge 2 commits into dev from feature/scheduled-lagoon-redeploys

dan2k3k4 (Member) commented Jul 1, 2026

Roll out Lagoon redeploys to running app instances — manually in bulk, and automatically on a per-app cadence — tracked in the admin panel. Redeploy-latest only; instances stay in RUNNING_HEALTHY_* (does not use the UPGRADE state machine).

Built from plans/feature/ (001–004; 005 deployment-windows is deferred). One commit per plan.

What's included

  • 001 Data model: polydock_deployment_runs table + model, cached per-instance deploy fields + indexed next_redeploy_at, store-app cadence columns, user_groups.is_beta.
  • 002 Service + poll: PolydockDeploymentService (single deploy code path: filters eligible/non-in-flight, one bulkDeployEnvironments, claims instances only after a successful trigger; poll maps deployments back by project+branch, a failed build never mutates instance status). PollDeploymentRunJob + polydock:deployments:poll (every 5m).
  • 003 Cadence: polydock:dispatch-scheduled-redeploys (every 10m): due selection, per-run cap, group-by-store-app, beta cadence override, deterministic jitter, trials excluded.
  • 004 Admin UI: manage_polydock_deployments permission, instance list Last Deploy / Next Redeploy columns, gated "Redeploy selected" bulk action, store-app "Redeploy Schedule" form + "Redeploy all", is_beta toggle, read-only Deployments dashboard.
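
The description does not show how the deterministic jitter in plan 003 is computed. As a rough sketch only, assuming the offset comes from a stable hash of the instance ID bounded by a jitter window (the helper names `jitterSeconds` and `nextRedeployAt`, the use of `crc32`, and the window parameter are all hypothetical and not taken from this PR):

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derive a stable per-instance offset in [0, $windowSeconds).
 * The same instance ID always yields the same offset, so schedules spread out
 * without shifting between dispatcher runs.
 */
function jitterSeconds(int $instanceId, int $windowSeconds): int
{
    if ($windowSeconds <= 0) {
        return 0;
    }

    // crc32() is non-negative on 64-bit PHP builds, so the modulo stays in range.
    return crc32('redeploy-jitter:'.$instanceId) % $windowSeconds;
}

/** Hypothetical helper: next due time = now + cadence + stable jitter. */
function nextRedeployAt(DateTimeImmutable $now, int $cadenceHours, int $instanceId, int $windowSeconds): DateTimeImmutable
{
    $offset = $cadenceHours * 3600 + jitterSeconds($instanceId, $windowSeconds);

    return $now->setTimestamp($now->getTimestamp() + $offset);
}
```

Hashing the ID rather than calling a random generator keeps the offset reproducible, which is what makes the jitter testable and keeps an instance's slot stable across runs.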

Tests

Full suite green (316 passed); ~30 new tests cover eligibility, idempotency, failed-build safety, poll transitions, cadence/beta/jitter, and the admin gate.

Deploy notes

  • Run migrations, and php artisan db:seed --class=SuperAdminRoleSeeder to create/grant the new permission.
  • redeploy_enabled defaults off — opt store apps in per app (recommend exercising on the pre-warm pool before enabling for claimed apps).

Greptile Summary

This PR introduces scheduled and manual bulk Lagoon redeploys ("upgrade rollouts") for running app instances, tracked via a new polydock_deployment_runs table and surfaced in the admin panel. It does not use the existing UPGRADE state machine — instances remain in their RUNNING_HEALTHY_* statuses throughout.

  • Data model (001): New polydock_deployment_runs table + model, cached deploy columns on polydock_app_instances with next_redeploy_at index, cadence columns on polydock_store_apps, and user_groups.is_beta.
  • Service + polling (002): PolydockDeploymentService is the single redeploy code path — filters ineligible/in-flight instances, claims instances only after a successful trigger, and tracks run status via PollDeploymentRunJob. A failed build never mutates instance lifecycle status.
  • Cadence dispatch (003): polydock:dispatch-scheduled-redeploys (every 10 min) selects due non-trial instances, caps per-run volume, groups by store app, applies beta cadence overrides, and spreads schedules with deterministic per-instance jitter.
  • Admin UI (004): manage_polydock_deployments permission gate on bulk "Redeploy selected" action, store-app cadence form, "Redeploy all" action, is_beta toggle on user groups, and a read-only Deployments dashboard.

Confidence Score: 4/5

The feature is well-structured and safe to merge for initial rollout with redeploy_enabled off by default; one config key documents a chunking behaviour that was never implemented.

The core redeploy path is solid: post-trigger instance claiming prevents ghost in-flight locks, failed builds never mutate instance lifecycle status, and the poll loop is correctly bounded. One gap stands out: config/polydock.php defines bulk_chunk_size with an explicit comment that it prevents oversized single mutations to the Lagoon API, and the feature plans describe chunking by that value — but PolydockDeploymentService::redeploy() never reads the key and sends all environments in a single call. With default config (max_per_run = 50, bulk_chunk_size = 50) this is invisible, but operators raising POLYDOCK_DEPLOY_MAX_PER_RUN via env var would produce uncapped single mutations.

Files needing attention: app/Services/PolydockDeploymentService.php (missing bulk chunk loop), app/Filament/Admin/Resources/PolydockStoreAppResource.php (missing with('deploymentRun') in redeploy_all action)

Important Files Changed

| Filename | Overview |
| --- | --- |
| app/Services/PolydockDeploymentService.php | Core redeploy service: solid eligibility filtering, post-success instance claiming, and fail-safe poll exhaustion logic. bulk_chunk_size config is documented but never read; all environments go in a single Lagoon API call. |
| app/Console/Commands/DispatchScheduledRedeploysCommand.php | Cadence-based scheduler: clean trial exclusion, beta-group override, and deterministic jitter. The per-instance refresh() in the confirmation loop is an N+1 pattern; a single batch query would be cleaner. |
| app/Models/PolydockDeploymentRun.php | New model for deployment run tracking; well-structured with UUID, soft terminal-state guard, and correct relationship definitions. |
| app/Models/PolydockAppInstance.php | Adds deployment tracking fields, isRedeployEligible(), hasInFlightDeployment(), and the deploymentRun BelongsTo relation. All new behaviour is additive and non-breaking. |
| app/Filament/Admin/Resources/PolydockStoreAppResource.php | Adds redeploy schedule form section and redeploy_all action. Instance fetch for the action is missing with('deploymentRun'), triggering N+1 queries via hasInFlightDeployment() in the service. |
| app/Filament/Admin/Resources/PolydockDeploymentRunResource.php | Read-only admin dashboard for deployment runs; canCreate/Edit/Delete all return false, visibility gated behind currentUserCanManage(). Clean implementation. |
| app/Jobs/PollDeploymentRunJob.php | Thin job that guards against polling a terminal run and delegates to the service. Correct and straightforward. |
| app/Console/Commands/PollDeploymentRunsCommand.php | Polls in-flight runs on schedule, respecting poll_attempts < maxAttempts guard and last_polled_at backoff. Correct and well-bounded. |
| database/migrations/2026_07_01_090001_create_polydock_deployment_runs_table.php | Migration for polydock_deployment_runs table: appropriate nullable FKs with nullOnDelete, composite index on (status, last_polled_at) for the poll command query, and reversible down(). |
| database/migrations/2026_07_01_090002_add_deployment_tracking_to_polydock_app_instances_table.php | Adds cached deploy fields and deployment_run_id FK to polydock_app_instances with nullOnDelete. Composite index on (polydock_store_app_id, next_redeploy_at) supports the cadence query. Fully reversible. |
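
The poll command is described above as bounded by a poll_attempts < maxAttempts guard plus a last_polled_at backoff. A minimal sketch of such a guard, assuming Unix timestamps and a fixed minimum gap between polls (the function name `isDueForPoll` and the `$minGapSeconds` parameter are hypothetical; the PR may implement the backoff differently):

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical guard mirroring the two conditions named in the overview:
 * a bounded attempt count and a minimum gap since the last poll.
 */
function isDueForPoll(int $pollAttempts, ?int $lastPolledAt, int $now, int $maxAttempts, int $minGapSeconds): bool
{
    if ($pollAttempts >= $maxAttempts) {
        return false; // exhausted: the run should be failed rather than polled again
    }

    // Never polled yet, or the gap since the last poll has elapsed.
    return $lastPolledAt === null || ($now - $lastPolledAt) >= $minGapSeconds;
}
```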

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Scheduler as Cron / Admin UI
    participant Cmd as DispatchScheduledRedeploysCommand
    participant Svc as PolydockDeploymentService
    participant Lagoon as Lagoon API
    participant DB as Database

    Scheduler->>Cmd: polydock:dispatch-scheduled-redeploys (every 10m)
    Cmd->>DB: SELECT eligible, due, non-trial instances (limit max_per_run)
    DB-->>Cmd: instances grouped by store_app_id

    loop per store-app group
        Cmd->>Svc: redeploy(group, SCHEDULED)
        Svc->>Svc: filter ineligible + in-flight
        Svc->>DB: INSERT polydock_deployment_runs (RUNNING)
        Svc->>Lagoon: bulkDeployEnvironments(environments[])
        Lagoon-->>Svc: bulkDeployEnvironmentLatest (bulk_id)
        Svc->>DB: UPDATE instances SET deployment_run_id
        Svc->>DB: dispatch PollDeploymentRunJob
        Svc-->>Cmd: PolydockDeploymentRun
        Cmd->>DB: UPDATE next_redeploy_at for claimed instances
    end

    note over Scheduler: polydock:deployments:poll (every 5m)
    Scheduler->>DB: SELECT non-terminal runs due for poll
    DB-->>Scheduler: in-flight runs
    loop per run
        Scheduler->>Lagoon: getDeploymentsByBulkId(bulk_id)
        Lagoon-->>Scheduler: deployment statuses
        Scheduler->>DB: UPDATE instances cached deploy state
        alt all terminal
            Scheduler->>DB: UPDATE run status COMPLETED / PARTIAL_FAILED / FAILED
        else poll exhausted
            Scheduler->>DB: UPDATE run status FAILED
        end
    end
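
The `alt all terminal` branch in the diagram resolves a run to COMPLETED, PARTIAL_FAILED, or FAILED. A minimal sketch of how that aggregation could work, assuming each polled deployment reduces to a status string and that `complete`, `failed`, and `cancelled` are the terminal values (the function name `resolveRunStatus` and this status vocabulary are assumptions, not taken from the PR):

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical aggregation: returns null while any deployment is still in flight.
 *
 * @param list<string> $statuses one status per deployment in the bulk run
 */
function resolveRunStatus(array $statuses): ?string
{
    $terminal = ['complete', 'failed', 'cancelled']; // assumed terminal vocabulary

    if ($statuses === []) {
        return null; // nothing reported yet, keep polling
    }

    foreach ($statuses as $status) {
        if (!in_array($status, $terminal, true)) {
            return null; // at least one deployment is still running
        }
    }

    $succeeded = count(array_filter($statuses, fn (string $s): bool => $s === 'complete'));

    if ($succeeded === count($statuses)) {
        return 'COMPLETED';
    }

    return $succeeded === 0 ? 'FAILED' : 'PARTIAL_FAILED';
}
```

Returning null for the non-terminal case leaves the "poll exhausted" decision to the caller, which matches the separate `else` branch in the diagram.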

Reviews (1): Last reviewed commit: "feat(deploy): add deploy management"

Greptile also left 3 inline comments on this PR.

dan2k3k4 force-pushed the feature/scheduled-lagoon-redeploys branch from 0690ee7 to 4d32c02 on July 1, 2026 18:59
dan2k3k4 changed the base branch from main to dev on July 1, 2026 19:00
dan2k3k4 force-pushed the feature/scheduled-lagoon-redeploys branch 2 times, most recently from 0690ee7 to 2609166, on July 1, 2026 19:03
Comment on lines +91 to +107
        $client = $this->lagoon->getAuthenticatedClient();
        $result = $client->bulkDeployEnvironments(
            environments: $environments,
            name: 'Polydock redeploy '.$run->uuid,
            buildVariables: $buildVariables,
        );
    } catch (\Throwable $e) {
        Log::error('Redeploy trigger failed', ['run' => $run->uuid, 'error' => $e->getMessage()]);
        $this->failRun($run, 'Trigger failed: '.$e->getMessage());

        return $run;
    }

    if (isset($result['error'])) {
        $error = is_array($result['error']) ? json_encode($result['error']) : (string) $result['error'];
        Log::error('Redeploy trigger returned error', ['run' => $run->uuid, 'error' => $error]);
        $this->failRun($run, 'Trigger error: '.$error);

P1 bulk_chunk_size config is defined but never applied

config/polydock.php documents polydock.deploy.bulk_chunk_size as "Max environments per bulkDeployEnvironments mutation (avoids one huge call)", and the feature plan docs describe chunking by this value. However, PolydockDeploymentService::redeploy() sends all environments in a single bulkDeployEnvironments call without reading the config key or chunking the $environments array. With default settings (max_per_run = 50, bulk_chunk_size = 50) this is benign, but raising POLYDOCK_DEPLOY_MAX_PER_RUN via env var would produce oversized single mutations. The chunk config is effectively dead code.
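
One way the missing chunk loop could look is sketched below. This is an illustration under stated assumptions, not the service's actual code: `triggerInChunks` is a hypothetical helper, and the `$trigger` callable stands in for the `$client->bulkDeployEnvironments(...)` call so the example runs without a Lagoon client.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: split environments into chunks and trigger one bulk
 * mutation per chunk. $trigger stands in for the Lagoon client call.
 *
 * @param list<array<string, mixed>> $environments
 * @param callable(list<array<string, mixed>>): array<string, mixed> $trigger
 * @return list<array<string, mixed>> one result per chunk
 */
function triggerInChunks(array $environments, int $chunkSize, callable $trigger): array
{
    $results = [];

    // max(1, ...) guards against a zero or negative config value, which array_chunk() rejects.
    foreach (array_chunk($environments, max(1, $chunkSize)) as $chunk) {
        $results[] = $trigger($chunk);
    }

    return $results;
}

// Usage with a stub in place of the real bulk deploy call:
$environments = array_map(
    fn (int $i): array => ['project' => "p{$i}", 'branch' => 'main'],
    range(1, 120)
);
$sizes = [];
$results = triggerInChunks($environments, 50, function (array $chunk) use (&$sizes): array {
    $sizes[] = count($chunk);

    return ['id' => 'bulk-'.count($sizes)];
});
// $sizes is now [50, 50, 20]
```

Each chunk would produce its own bulk ID, so a real fix would also need to decide whether a run tracks several bulk IDs or whether each chunk becomes its own run; the sketch leaves that open.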

Comment thread app/Filament/Admin/Resources/PolydockStoreAppResource.php
Comment thread app/Console/Commands/DispatchScheduledRedeploysCommand.php
dan2k3k4 force-pushed the feature/scheduled-lagoon-redeploys branch from 9ab756b to 850b27c on July 1, 2026 19:42
- docs: add scheduled Lagoon redeploy feature plans
- feat(deploy): deployment-tracking data model (plan 001)
- feat(deploy): redeploy service + poll job (plan 002)
- feat(deploy): scheduled cadence redeploy dispatch (plan 003)
- feat(deploy): admin UI + permission for redeploys (plan 004)
- test(deploy): update trigger-deploy command test for service refactor
dan2k3k4 force-pushed the feature/scheduled-lagoon-redeploys branch from 850b27c to 91c2740 on July 1, 2026 19:43