Cog Server Rewrite #2641

tempusfrangit · 2026-01-16T17:51:21Z

This is a complete rewrite of the cog http server moving away from python and to a rust pyo3, ABI3 wheel. The aim is to provide a significant uplift in control of how we run predictors and isolate the predictor run from the core interface used to run the predictor.

═══════════════════════════════════════════════════════════════════════════════════
                           COMPONENT OWNERSHIP
═══════════════════════════════════════════════════════════════════════════════════
  PredictionSupervisor (DashMap - lock-free concurrent access)
  ├── owns: prediction state (id, status, input, output, logs, error, timestamps)
  ├── owns: webhook sender (sends terminal webhook, then cleans up entry)
  ├── owns: cancel_token (propagates cancellation to worker)
  └── owns: completion notifier (for waiting on result)
  PredictionSlot (RAII container)
  ├── owns: Prediction (logs, outputs from worker event loop)
  ├── owns: Permit (concurrency token, returns to pool on drop)
  └── Drop: marks permit idle, releases back to pool
  PredictionHandle (returned to route handler)
  ├── has: reference to supervisor (for state queries)
  ├── has: cancel_token clone (for cancellation)
  ├── has: completion notifier clone (for waiting)
  └── method: sync_guard() → SyncPredictionGuard (cancels on drop)
═══════════════════════════════════════════════════════════════════════════════════
                         WORKER SUBPROCESS PROTOCOL
═══════════════════════════════════════════════════════════════════════════════════
  Control Channel (stdin/stdout - JSON framed)
  ┌─────────────────────────────────────────────────────────────────────────────┐
  │  Parent → Child                    Child → Parent                           │
  │  ──────────────                    ──────────────                           │
  │  Init { predictor_ref, slots }     Ready { schema }                         │
  │  Cancel                            Cancelled                                │
  │  Shutdown                          Failed { error }                         │
  │                                    Idle                                     │
  └─────────────────────────────────────────────────────────────────────────────┘
  Slot Channel (Unix socket per slot - JSON framed)
  ┌─────────────────────────────────────────────────────────────────────────────┐
  │  Parent → Child                    Child → Parent                           │
  │  ──────────────                    ──────────────                           │
  │  Predict { id, input }             Log { data }                             │
  │                                    Output { value }  (streaming)            │
  │                                    Done { output }                          │
  │                                    Failed { error }                         │
  │                                    Cancelled                                │
  └─────────────────────────────────────────────────────────────────────────────┘
═══════════════════════════════════════════════════════════════════════════════════
                            HEALTH STATE MACHINE
═══════════════════════════════════════════════════════════════════════════════════
                    ┌──────────────┐
                    │   STARTING   │  (HTTP serves 503)
                    └──────┬───────┘
                           │
              ┌────────────┴────────────┐
              ▼                         ▼
     ┌────────────────┐        ┌───────────────┐
     │     READY      │        │ SETUP_FAILED  │  (setup() threw)
     └────────┬───────┘        └───────────────┘
              │
              │ (all slots busy)
              ▼
     ┌────────────────┐
     │     BUSY       │  (409 for new predictions)
     └────────┬───────┘
              │
              │ (slot freed)
              ▼
     ┌────────────────┐
     │     READY      │
     └────────┬───────┘
              │
              │ (fatal error / worker crash)
              ▼
     ┌────────────────┐
     │    DEFUNCT     │  (HTTP serves 503)
     └────────────────┘
═══════════════════════════════════════════════════════════════════════════════════
                              FILE STRUCTURE
═══════════════════════════════════════════════════════════════════════════════════
  crates/coglet/src/
  ├── lib.rs                    # Public API exports
  ├── service.rs                # PredictionService (orchestrates everything)
  ├── supervisor.rs             # PredictionSupervisor (lifecycle + webhooks)
  ├── prediction.rs             # Prediction state (logs, outputs, status)
  ├── health.rs                 # Health enum + SetupResult
  ├── orchestrator.rs           # Worker subprocess management
  ├── permit/
  │   ├── mod.rs
  │   ├── pool.rs               # PermitPool (concurrency control)
  │   └── slot.rs               # PredictionSlot (Prediction + Permit RAII)
  ├── bridge/
  │   ├── mod.rs
  │   ├── protocol.rs           # Control/Slot request/response types
  │   ├── codec.rs              # JSON length-delimited framing
  │   └── transport.rs          # Unix socket transport
  ├── transport/
  │   └── http/
  │       ├── mod.rs
  │       ├── server.rs         # Axum server setup
  │       └── routes.rs         # HTTP handlers (uses supervisor)
  ├── webhook.rs                # WebhookSender (retry logic, trace context)
  ├── worker/
  │   ├── mod.rs
  │   ├── manager.rs            # Worker spawn/lifecycle
  │   └── worker.rs             # Worker main loop (child process side)
  └── version.rs                # VersionInfo
  crates/coglet-python/src/
  └── lib.rs                    # PyO3 bindings (coglet.serve())

┌─────────────────────────────────────────────────────────────────────────────────┐
│                              HTTP Transport (axum)                               │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐ │
│  │ POST        │  │ PUT         │  │ POST        │  │ GET                     │ │
│  │ /predictions│  │ /predictions│  │ /cancel     │  │ /health-check           │ │
│  │             │  │ /{id}       │  │             │  │ /openapi.json           │ │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └───────────┬─────────────┘ │
└─────────┼────────────────┼────────────────┼─────────────────────┼───────────────┘
        │                │                │                     │
        ▼                ▼                ▼                     ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                            PredictionService                                     │
│  ┌────────────────────────────────────────────────────────────────────────────┐ │
│  │                        PredictionSupervisor (DashMap)                      │ │
│  │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐            │ │
│  │  │ PredictionEntry │  │ PredictionEntry │  │ PredictionEntry │  ...       │ │
│  │  │ ─────────────── │  │ ─────────────── │  │ ─────────────── │            │ │
│  │  │ state           │  │ state           │  │ state           │            │ │
│  │  │ cancel_token    │  │ cancel_token    │  │ cancel_token    │            │ │
│  │  │ webhook         │  │ webhook         │  │ webhook         │            │ │
│  │  │ completion      │  │ completion      │  │ completion      │            │ │
│  │  └─────────────────┘  └─────────────────┘  └─────────────────┘            │ │
│  └────────────────────────────────────────────────────────────────────────────┘ │
│                                                                                  │
│  ┌────────────────────────────────────────────────────────────────────────────┐ │
│  │                           PermitPool                                       │ │
│  │  ┌────────┐  ┌────────┐  ┌────────┐                                       │ │
│  │  │ Permit │  │ Permit │  │ Permit │  (concurrency control)                │ │
│  │  │ slot_0 │  │ slot_1 │  │ slot_2 │                                       │ │
│  │  └────────┘  └────────┘  └────────┘                                       │ │
│  └────────────────────────────────────────────────────────────────────────────┘ │
│                                                                                  │
│  ┌────────────────────────────────────────────────────────────────────────────┐ │
│  │                        OrchestratorHandle                                  │ │
│  │  (slot_ids, control_tx for worker comms)                                   │ │
│  └────────────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────┬──────────────────────────────────────────────┘
                                 │
                  Unix Socket (slot) + stdin/stdout (control)
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                         Worker Subprocess (Python)                               │
│  ┌────────────────────────────────────────────────────────────────────────────┐ │
│  │                              Predictor                                     │ │
│  │  ┌─────────────────────────────────────────────────────────────────────┐  │ │
│  │  │  setup()    →  runs once at startup                                 │  │ │
│  │  │  predict()  →  handles SlotRequest::Predict                         │  │ │
│  │  └─────────────────────────────────────────────────────────────────────┘  │ │
│  └────────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────────┘
═══════════════════════════════════════════════════════════════════════════════════
                            PREDICTION FLOW
═══════════════════════════════════════════════════════════════════════════════════
Sync Request (POST /predictions)
─────────────────────────────────
  Client                    Routes                Supervisor           Worker
     │                        │                       │                   │
     │──POST /predictions────▶│                       │                   │
     │                        │──submit(id,input)────▶│                   │
     │                        │◀──PredictionHandle────│                   │
     │                        │                       │                   │
     │                        │──acquire permit──────▶│                   │
     │                        │                       │                   │
     │     (SyncPredictionGuard held - cancels on connection drop)       │
     │                        │                       │                   │
     │                        │──predict(slot,input)─────────────────────▶│
     │                        │                       │                   │
     │                        │◀─────────────result──────────────────────│
     │                        │                       │                   │
     │                        │──update_status(terminal)─▶│              │
     │                        │                       │──send webhook───▶│
     │                        │                       │──cleanup entry───│
     │                        │                       │                   │
     │◀──200 {output}─────────│                       │                   │
Async Request (Prefer: respond-async)
─────────────────────────────────────
  Client                    Routes                Supervisor           Worker
     │                        │                       │                   │
     │──POST + respond-async─▶│                       │                   │
     │                        │──submit(id,input)────▶│                   │
     │                        │◀──PredictionHandle────│                   │
     │                        │                       │                   │
     │◀──202 {starting}───────│                       │                   │
     │                        │                       │                   │
     │   (no guard - prediction continues independently)                 │
     │                        │                       │                   │
     │                        │    ┌──────────────────────────────────┐  │
     │                        │    │  spawned task:                   │  │
     │                        │    │    predict(slot, input)          │──▶│
     │                        │    │    update_status(result)         │  │
     │                        │    │    → triggers webhook + cleanup  │  │
     │                        │    └──────────────────────────────────┘  │
     │                        │                       │                   │
     │◀─────────────────────webhook (completed)───────│                   │
Idempotent PUT (PUT /predictions/{id})
──────────────────────────────────────
  Client                    Routes                Supervisor
     │                        │                       │
     │──PUT /predictions/X───▶│                       │
     │                        │──get_state("X")──────▶│
     │                        │                       │
     │         ┌──────────────┴──────────────┐       │
     │         │ if exists:                  │       │
     │◀────────│   return 202 + full state   │◀──────│
     │         │ else:                       │       │
     │         │   submit + run prediction   │       │
     │         └─────────────────────────────┘       │
Connection Drop (Sync Mode)
───────────────────────────
  Client                    Routes                Supervisor           Worker
     │                        │                       │                   │
     │──POST /predictions────▶│                       │                   │
     │                        │   (SyncPredictionGuard armed)            │
     │                        │──predict(slot)───────────────────────────▶│
     │                        │                       │                   │
     │ ✕ (connection drops)   │                       │                   │
     │                        │                       │                   │
     │                   guard.drop()                 │                   │
     │                        │──cancel_token.cancel()▶│                  │
     │                        │                       │──Cancel──────────▶│
     │                        │                       │◀──Cancelled───────│
     │                        │                       │                   │

═══════════════════════════════════════════════════════════════════════════════════
                         COG HTTP → COGLET INVOCATION
═══════════════════════════════════════════════════════════════════════════════════
  ┌─────────────────────────────────────────────────────────────────────────────┐
  │                        cog predict / cog run                                │
  │                               (CLI)                                         │
  └─────────────────────────────────┬───────────────────────────────────────────┘
                                    │
                                    ▼
  ┌─────────────────────────────────────────────────────────────────────────────┐
  │                     python -m cog.server.http                               │
  │                                                                             │
  │   if USE_COGLET env var:                                                    │
  │       import coglet                                                         │
  │       coglet.serve(predictor_ref, port=5000)  ──────────────────────────┐   │
  │   else:                                                                 │   │
  │       # original Python FastAPI server                                  │   │
  │       uvicorn.run(app, port=5000)                                       │   │
  └─────────────────────────────────────────────────────────────────────────┼───┘
                                                                            │
                                                                            ▼
  ┌─────────────────────────────────────────────────────────────────────────────┐
  │                          coglet (Rust)                                      │
  │                                                                             │
  │   ┌───────────────────────────────────────────────────────────────────┐     │
  │   │  HTTP Server (axum)  :5000                                        │     │
  │   │    /predictions, /health-check, etc.                              │     │
  │   └───────────────────────────────────────────────────────────────────┘     │
  │                              │                                              │
  │                              ▼                                              │
  │   ┌───────────────────────────────────────────────────────────────────┐     │
  │   │  PredictionService + Supervisor                                   │     │
  │   └───────────────────────────────────────────────────────────────────┘     │
  │                              │                                              │
  │                    Unix socket + pipes                                      │
  │                              │                                              │
  │                              ▼                                              │
  │   ┌───────────────────────────────────────────────────────────────────┐     │
  │   │  Worker subprocess (Python)                                       │     │
  │   │    - loads predictor_ref                                          │     │
  │   │    - runs setup()                                                 │     │
  │   │    - handles predict() requests                                   │     │
  │   └───────────────────────────────────────────────────────────────────┘     │
  └─────────────────────────────────────────────────────────────────────────────┘
  Why?
  ────
  • Rust HTTP server (axum) is faster, handles backpressure better
  • Worker isolation: Python crash doesn't kill server
  • Same API surface as original cog (drop-in replacement)
  • Subprocess reuse: predictor stays loaded between requests

michaeldwan · 2026-01-16T20:15:53Z

in an effort to keep this as small as it can be (lololol) can we leave the coglet dir in the repo for now. Or if necessary, remove the go code and leave the python code? Would be nice to merge this as an addition only and then start wiring it up to the build/wheel selection system. I'd also like to salvage the python sdk portion of coglet since the complex type system and schema worked well for pipelines.

AGENTS.local.md

CONTRIBUTING.md

tempusfrangit · 2026-01-17T15:47:56Z

in an effort to keep this as small as it can be (lololol) can we leave the coglet dir in the repo for now. Or if necessary, remove the go code and leave the python code? Would be nice to merge this as an addition only and then start wiring it up to the build/wheel selection system. I'd also like to salvage the python sdk portion of coglet since the complex type system and schema worked well for pipelines.

The python section of the code is the problem as it conflicts with the new wheel. I'll get some rename I. Place instead of removal.

Tasks for coglet development: - build:rust / build:rust:release - Build Rust crates - build:coglet - Build Python wheel with maturin (dev) - build:coglet:wheel* - Build release wheels for various platforms - test:rust - Run Rust tests with nextest - test:coglet-python - Run Python integration tests - fmt:rust / fmt:rust:check - Format Rust code - clippy - Run lints - deny - Check licenses/advisories Tools added: cargo-binstall, cargo-deny, cargo-insta, cargo-nextest, maturin, ruff, ty

Workspace structure: - crates/Cargo.toml - workspace root with shared dependencies - crates/coglet/ - main Rust library (empty scaffold) - crates/deny.toml - cargo-deny license/advisory config - crates/.gitignore - ignore build artifacts Shared workspace dependencies defined for async runtime (tokio), HTTP (axum, reqwest), serialization (serde), and error handling.

Runtime: tokio, tokio-util (codec), futures, async-trait Serialization: serde, serde_json HTTP: axum (server), reqwest + ureq (webhooks) Utils: uuid, chrono, thiserror, anyhow, dashmap, tracing Unix: nix (signal handling) Dev: insta, wiremock, tower, http-body-util

Health enum: Unknown, Starting, Ready, Busy, SetupFailed, Defunct SetupStatus enum with SetupResult for tracking setup phase VersionInfo for runtime version reporting Includes serde serialization with appropriate casing and tests.

Protocol types for parent-worker communication: - ControlRequest/Response for control channel (cancel, shutdown, ready) - SlotRequest/Response for per-slot data channel (predict, logs, output) - SlotId using UUID for unique slot identification - SlotOutcome for type-safe completion handling JsonCodec wraps LengthDelimitedCodec with serde_json serialization.

NamedSocketTransport: filesystem sockets (all Unix platforms) AbstractSocketTransport: Linux abstract namespace (no filesystem) Adds ControlRequest::Init variant for worker initialization with transport info, predictor ref, slot count, and async/train flags.

PredictionStatus enum with terminal state detection. PredictionOutput for single values or streamed chunks. Prediction struct tracks lifecycle: status, logs, outputs, timing. Completion notification via tokio Notify for async waiting. CancellationToken re-exported from tokio_util for cancel propagation.

PredictionResult aggregates output, timing, and logs. PredictionGuard provides RAII timing with cancellation support. PredictionMetrics for timing data collection. PredictionError for typed prediction failures. PredictFn/AsyncPredictFn type aliases for predictor function signatures. Also fixes PredictionOutput to use untagged serde serialization.

PermitInUse/PermitIdle/PermitPoisoned enforce valid state transitions at compile time - poisoned permits cannot become idle. PermitPool manages slot permits with RAII semantics: - Idle permits return to pool on drop - Poisoned permits are orphaned, reducing capacity AnyPermit enum for dynamic state storage.

PredictionSlot holds both together with separate concerns: - Prediction behind Mutex for concurrent updates - Permit for RAII pool return on drop Slot transitions: into_idle() returns permit, into_poisoned() orphans it. Drop handler fails non-terminal predictions if slot leaked.

WebhookEventType enum for filtering (start, output, logs, completed). WebhookConfig with throttle interval, retry settings, status codes. TraceContext for W3C distributed tracing headers. WebhookSender implements: - send(): fire-and-forget for non-terminal events with throttling - send_terminal(): async with exponential backoff retries - send_terminal_sync(): blocking version for Drop contexts (uses ureq) Bearer auth via WEBHOOK_AUTH_TOKEN env var.

PredictionState snapshot for API responses with to_response() builder. PredictionHandle for waiting, state queries, and cancellation. SyncPredictionGuard for cancel-on-drop behavior in sync predictions. DashMap provides lock-free concurrent access - no deadlock risks. Terminal status triggers webhook send via spawned task.

- input.rs: Runtime detection (Pydantic vs Coglet), URLPath download with parallel ThreadPoolExecutor, PreparedInput RAII for temp file cleanup - output.rs: Output processing using cog.json.make_encodeable() and upload_files() for base64 data URL conversion - audit.rs: TeeWriter for protecting sys.stdout/stderr with audit hooks that wrap user replacements while preserving slot routing - log_writer.rs: Full SlotLogWriter with ContextVar-based prediction routing, line buffering, setup log sender for health-check logs - cancel.rs: SIGUSR1 signal handling with CancelationException, CancelableGuard RAII for sync prediction cancellation

- predictor.rs: Complete PythonPredictor with runtime detection, async/sync support, generator handling, input preparation, output processing, and schema generation - worker_bridge.rs: Full PredictHandler with asyncio event loop, per-slot cancellation tracking, log context management, SIGUSR1 signal cancellation for sync, future.cancel() for async

- coglet.pyi: Type stubs for active(), serve(), _run_worker(), _is_cancelable() - tests/test_coglet.py: Integration tests for sync/async/generator predictors, health check, cancellation endpoints

- Setup log hook for routing setup logs via SlotLogWriter - Export audit helpers (_is_slot_log_writer, _is_tee_writer, etc.) - Export SlotLogWriter and TeeWriter classes for isinstance checks

- Use abi3-py38 feature for stable ABI compatibility - Replace Python::with_gil with Python::attach (0.27 API) - Replace py.allow_threads with py.detach (0.27 API) - Add auto-initialize feature for standalone testing

Add missing install_slot_log_writers() and install_audit_hook() calls in _run_worker() to enable stdout/stderr routing and protection. Remove module-level #![allow(dead_code)] since code is now integrated.

Replace Result<(), String> in PredictHandler::setup() with typed SetupError enum. Variants distinguish load vs setup vs internal errors, enabling future error code mapping without changing behavior.

- crates/README.md: Overall architecture, prediction flow, startup sequence, bridge protocol, directory structure - crates/coglet/README.md: Detailed coglet internals, components, health states, behaviors (shutdown, cancellation, slot poisoning) - crates/coglet-python/README.md: Python bindings details, active() flag, single async loop, stdout/stderr routing, audit hook, cancel

Production always uses orchestrator. The predict_fn/async_predict_fn path was only used in tests and never ran in production. - Remove PredictFn, AsyncPredictFn, PredictFuture types - Remove predict_via_function method - Remove legacy_pool field - Remove PredictionService::new(pool) constructor - Enforce READY requires orchestrator (silently ignores otherwise) - Update tests to not create impossible states

Introduce Orchestrator trait with register_prediction() method: - Enables mocking orchestrator in service unit tests - OrchestratorHandle implements trait, keeps cancel/shutdown as inherent methods - Service now takes Arc<dyn Orchestrator> instead of Arc<OrchestratorHandle> - Export trait from coglet crate for downstream use

- Add MockOrchestrator in both service and routes test modules - Service tests: orchestrator integration, prediction flow, capacity - HTTP routes tests: full request/response cycle with mock predictor - Fix race in predict() by checking terminal before waiting on notify - 15 new tests covering prediction lifecycle end-to-end

- into_idle() now returns Option<IdleToken> instead of panicking - into_poisoned() now returns bool instead of panicking - Invalid state transitions log errors and use debug_assert for dev builds - Callers updated to handle new return types gracefully

- stdin/stdout take() now returns OrchestratorError instead of panicking - Prediction mutex locks use try_lock_prediction() helper that: - On poison: fails the prediction, logs error, returns None - Caller removes poisoned predictions from tracking - Further data from poisoned predictions is ignored

- Add try_lock_prediction() helper that fails prediction on mutex poison - predict() returns PredictionError on mutex poison instead of panicking

- worker_bridge: init_async_loop() and PythonPredictHandler::new() return Result - input: detect_runtime() returns Result, PreparedInput uses Py<PyDict> - log_writer: mutex locks use expect() with clear messages - predictor: handle detect_runtime Result propagation Worker mutex poisoning now panics with clear message - orchestrator handles worker exit and fails predictions appropriately.

- webhook: WebhookSender::new() returns Result for HTTP client creation - webhook: Mutex locks use into_inner() to recover from poison - routes: Document timestamp unwrap safety (can't fail after UNIX_EPOCH) - server: Improve signal handler expect messages for clarity Callers handle webhook creation failure by logging and continuing without webhooks, which is acceptable graceful degradation.

- input.rs: get_item().unwrap() -> ok_or_else(PyKeyError) for missing keys - log_writer.rs: OnceLock.get().unwrap() -> ok_or_else(PyRuntimeError) Both are now proper Python exceptions rather than Rust panics.

- Add .github/workflows/rust.yaml for Rust CI (fmt, clippy, nextest) - Add mise tasks: stubs:generate, stubs:check, stubs:typecheck - Add mise tasks: ci:rust, ci:coglet for local CI simulation - Add scripts/generate_stubs.py to auto-generate coglet.pyi from module - Update coglet.pyi with all module exports (classes + internal functions) CI runs: - Pure Rust checks (fmt, clippy, tests) without Python - coglet-python checks on Python 3.9/3.12/3.13 (build, stub check, typecheck)

- Install cargo-binstall before mise so mise uses binstall for cargo tools - Add uv sync to setup Python environment before maturin build - Order: Rust -> cargo-binstall -> rust-cache -> mise -> uv sync

- Add #[allow(dead_code)] on AbstractSocketTransport.prefix (kept for debugging) - Add 'uv venv' step before maturin build in CI - Move virtualenv PATH setup to individual steps after checkout to prevent "Unable to locate executable file: tar" error during actions/checkout@v4

markphelps

most of my comments are enum cleanup related because I just read that chapter in the Book of Rust

crates/coglet-python/src/predictor.rs

crates/coglet-python/src/worker_bridge.rs

crates/coglet/src/prediction.rs

crates/coglet/src/worker.rs

mfainberg-cf · 2026-01-20T17:11:53Z

Yes the enum change was lost in translation. A number of other places I tried to ensure we had enums. I'll get an addon patch to make those conversions you highlighted unless you're interested in doing it.

Replace boolean soup with strongly-typed enums across four areas: - PredictorKind enum (Class/StandaloneFunction) replacing 5 boolean flags in predictor.rs (is_async, is_async_gen, has_train, is_train_async, is_standalone_function) - PredictionOutcome enum (Success/Failed/Cancelled) replacing success bool and error string matching in worker.rs - HandlerMode enum (Predict/Train) replacing is_train boolean in worker_bridge.rs - SlotState enum (Idle/SyncPrediction/AsyncPrediction) replacing 4 fields with state machine in worker_bridge.rs Benefits: - Invalid states impossible at compile time - Eliminates string matching for cancellation - Clearer semantics and self-documenting code - Easier to extend with new variants Addresses PR #2641 review feedback.

* refactor: replace boolean flags with Rust enums Replace boolean soup with strongly-typed enums across four areas: - PredictorKind enum (Class/StandaloneFunction) replacing 5 boolean flags in predictor.rs (is_async, is_async_gen, has_train, is_train_async, is_standalone_function) - PredictionOutcome enum (Success/Failed/Cancelled) replacing success bool and error string matching in worker.rs - HandlerMode enum (Predict/Train) replacing is_train boolean in worker_bridge.rs - SlotState enum (Idle/SyncPrediction/AsyncPrediction) replacing 4 fields with state machine in worker_bridge.rs Benefits: - Invalid states impossible at compile time - Eliminates string matching for cancellation - Clearer semantics and self-documenting code - Easier to extend with new variants Addresses PR #2641 review feedback. * fix: apply clippy suggestions Use derive(Default) with #[default] attribute instead of manual impl. * style: apply rustfmt formatting * test: update tests to use PredictionOutcome enum Replace direct field access (.success, .error) with pattern matching on the PredictionOutcome enum. * chore: rm deadcode, simplify start_sync_prediction logic Signed-off-by: Mark Phelps <[email protected]> --------- Signed-off-by: Mark Phelps <[email protected]>

tempusfrangit requested review from markphelps and michaeldwan January 16, 2026 18:12

markphelps reviewed Jan 16, 2026

View reviewed changes

AGENTS.local.md Outdated Show resolved Hide resolved

markphelps reviewed Jan 16, 2026

View reviewed changes

CONTRIBUTING.md Show resolved Hide resolved

mfainberg-cf force-pushed the FFI branch from 26a9ea7 to 2488ab1 Compare January 19, 2026 03:43

tempusfrangit added 23 commits January 18, 2026 21:48

Ignore local agent and mise configuration files

eda37ba

Add health and version foundation types

1266447

Health enum: Unknown, Starting, Ready, Busy, SetupFailed, Defunct SetupStatus enum with SetupResult for tracking setup phase VersionInfo for runtime version reporting Includes serde serialization with appropriate casing and tests.

Add Orchestrator for worker subprocess lifecycle

7367158

Add PredictionService for transport-agnostic prediction lifecycle

5982ff7

Add HTTP transport layer with axum server and routes

d6e614f

Add worker subprocess types for PyO3 bindings

977b8e8

Add coglet-python crate with PyO3 bindings

c3305df

Add type stubs and Python tests

0bfe616

- coglet.pyi: Type stubs for active(), serve(), _run_worker(), _is_cancelable() - tests/test_coglet.py: Integration tests for sync/async/generator predictors, health check, cancellation endpoints

Add setup log hook and audit helper exports to coglet module

0c332ab

- Setup log hook for routing setup logs via SlotLogWriter - Export audit helpers (_is_slot_log_writer, _is_tee_writer, etc.) - Export SlotLogWriter and TeeWriter classes for isinstance checks

Upgrade PyO3 from 0.24 to 0.27

de70c6a

- Use abi3-py38 feature for stable ABI compatibility - Replace Python::with_gil with Python::attach (0.27 API) - Replace py.allow_threads with py.detach (0.27 API) - Add auto-initialize feature for standalone testing

tempusfrangit added 15 commits January 18, 2026 21:48

Install log writers and audit hook in worker startup

e258d4d

Add missing install_slot_log_writers() and install_audit_hook() calls in _run_worker() to enable stdout/stderr routing and protection. Remove module-level #![allow(dead_code)] since code is now integrated.

Add SetupError type for typed setup failures

9902b38

Replace Result<(), String> in PredictHandler::setup() with typed SetupError enum. Variants distinguish load vs setup vs internal errors, enabling future error code mapping without changing behavior.

Remove panics from service.rs prediction locks

f1b01da

- Add try_lock_prediction() helper that fails prediction on mutex poison - predict() returns PredictionError on mutex poison instead of panicking

Replace remaining panics with PyErr in coglet-python

3321554

- input.rs: get_item().unwrap() -> ok_or_else(PyKeyError) for missing keys - log_writer.rs: OnceLock.get().unwrap() -> ok_or_else(PyRuntimeError) Both are now proper Python exceptions rather than Rust panics.

Apply cargo fmt formatting

2e79406

Fix Rust CI: add cargo-binstall, uv sync for Python

3de1f1a

- Install cargo-binstall before mise so mise uses binstall for cargo tools - Add uv sync to setup Python environment before maturin build - Order: Rust -> cargo-binstall -> rust-cache -> mise -> uv sync

tempusfrangit force-pushed the FFI branch from 2488ab1 to fe20808 Compare January 19, 2026 05:49

tempusfrangit force-pushed the FFI branch from fe20808 to 03a0158 Compare January 19, 2026 05:52

markphelps reviewed Jan 20, 2026

View reviewed changes

markphelps mentioned this pull request Jan 20, 2026

Replace boolean flags with Rust enums #2643

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Cog Server Rewrite #2641

Cog Server Rewrite #2641

tempusfrangit commented Jan 16, 2026 •

edited

Loading

Uh oh!

michaeldwan commented Jan 16, 2026

Uh oh!

Uh oh!

Uh oh!

tempusfrangit commented Jan 17, 2026

Uh oh!

markphelps left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mfainberg-cf commented Jan 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Cog Server Rewrite #2641

Are you sure you want to change the base?

Cog Server Rewrite #2641

Conversation

tempusfrangit commented Jan 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

michaeldwan commented Jan 16, 2026

Uh oh!

Uh oh!

Uh oh!

tempusfrangit commented Jan 17, 2026

Uh oh!

markphelps left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mfainberg-cf commented Jan 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

tempusfrangit commented Jan 16, 2026 •

edited

Loading