Pass app-lifetime context to controller, orchestrator, repomanager#115
Conversation
|
|
e0fc957 to
641a958
Compare
4d8e23a to
1866682
Compare
1866682 to
f7d9283
Compare
6bd1655 to
12fbcd7
Compare
| // is cancelled on shutdown. Per-operation deadlines are the storage | ||
| // backend's responsibility — the controller is backend-agnostic and | ||
| // must not encode any one implementation's I/O budget. | ||
| treehash1 := readTreehash(c.appCtx, c.storage, logger, request.GetFirstRevision()) |
There was a problem hiding this comment.
the context here should be detached from existing one, so that when c.appctx is cancelled, the upload still continues on best effort
There was a problem hiding this comment.
App context is cancelled when application shuts down as it is signalled by the host management platform. SIGKILL is very likely to follow soon after. How should "best effort" look like in this case?
There was a problem hiding this comment.
I'm thinking best effort of uploading results to survive both client disconnect and app shutdown -- nothing cancels it, upload whatever it can.
There was a problem hiding this comment.
there are few severe anti-patterns in this decision as I see of.
https://dave.cheney.net/2016/12/22/never-start-a-goroutine-without-knowing-how-it-will-stop
The async cache-write goroutine in GetChangedTargets used context.Background(), so it had no way to honor server shutdown — a rolling restart could leak in-flight cache writes past the point where storage was being torn down. Streaming handlers also only watched the gRPC stream context, so an in-flight request kept running until the client disconnected even after the server began shutting down. Make all three top-level constructors (NewController, NewNativeOrchestrator, NewRepoManager) accept an appCtx context.Context as their first argument. The controller routes its async cache write through appCtx; the orchestrator and repomanager store it for future fire-and-forget goroutines, with comments documenting the convention. Add a controller.linkRequestCtx(reqCtx) helper that returns a context cancelled by either the request's stream context (client disconnect) or the controller's appCtx (server shutdown), backed by context.AfterFunc so the watcher handle is released on the request's defer. Each streaming handler now derives its working context from this helper and threads it through downstream calls. example/main.go wires appCtx to signal.NotifyContext on SIGINT/SIGTERM and propagates it to all three constructors, so background work and in-flight requests are cancelled at process shutdown rather than left to leak.
12fbcd7 to
ed8a2e3
Compare
Summary
Streaming handlers in the controller only watched the gRPC stream context, so an in-flight request kept running until the client disconnected even after the server began shutting down. This PR adds an app-lifetime context so handlers — and the async cache-write goroutine — can also be cancelled at server teardown.
Changes
All three top-level constructors now accept an app-lifetime context as their first argument:
controller.NewController(appCtx, Params)orchestrator.NewNativeOrchestrator(appCtx, Params)repomanager.NewRepoManager(appCtx, Params)controller.linkRequestCtx(reqCtx)helper returns a context cancelled by either the request's stream context (client disconnect) or the controller'sappCtx(server shutdown), backed bycontext.AfterFuncso the watcher handle is released by the deferred cancel —appCtxitself is never cancelled by this helper. Each streaming handler (GetChangedTargets,GetTargetGraph) now derives its working context from this helper and threads it through all downstream calls.Goroutines launched from handlers inherit the linked context. The per-job graph-fetch workers in
GetChangedTargetsare derived viacontext.WithCancel(ctx), so they abort on either client disconnect or app shutdown. The async cache-write goroutine instead passesc.appCtxdirectly to storage — the write must survive client disconnect (so detaching from the request context is the whole point) but must NOT outlive the server, andc.appCtxsatisfies both with no extra plumbing. Per-operation deadlines are the storage backend's responsibility.`example/main.go` wires `appCtx` from `signal.NotifyContext(ctx, SIGINT, SIGTERM)` and propagates it to all three constructors.
Test plan