vLLM OpenAI API server supports recording experience data by pan-x-c · Pull Request #591 · agentscope-ai/Trinity-RFT

pan-x-c · 2026-06-24T12:12:44Z

Description

As the title says

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

…th legacy)

Refactor experience production so heavy data (tokens/logprobs/routed_experts) no longer rides runner->scheduler->coordinator as serialized bytes. The vLLM recorder now captures it in-process into a MemoryStore keyed by task_id, and the coordinator pulls it at finalize time via /records/consume_task. Runners ship only a small reward map. Both paths coexist behind `explorer.use_recorded_experience` (default off = legacy).

Recording module (trinity/common/models/vllm_patch/recording/):
- store: drop SqlStore; MemoryStore.update_reward_by_task_id stamps reward/run/task on a whole task-id group, pops and returns it (the in-memory replacement for the SQL HistoryRecorder join).
- recorder: track in-flight record tasks; add flush() (await pending + queue.join) so a consume sees a quiesced store; honor skip_recording_ctx.
- models: build_experience emits one Experience per completion (n>1) with info["sample_index"]; eid.suffix=request_id kept for traceability.
- context: add skip_recording_ctx; task_id already flows via api_key (RecordingIdentityMiddleware) and now also via VLLMModel.chat (Ray entry).
- query: POST /records/consume_task (flush -> update_reward_by_task_id -> serialize_many); drop the SqlStore 503 branch.
- config/server: remove RecordingConfig entirely; the logprob width is a recorder-internal constant (we store only the chosen token, which vLLM force-includes at logprobs=1). No static config threaded through launch.

task_id propagation (Ray entry, same contextvar as the HTTP middleware):
- vllm_model: chat/generate accept task_id_key, set task_id_ctx around _generate_internal; logprobs sets skip_recording_ctx (auxiliary forward).
- model: ModelWrapper.chat/chat_async forward task_id_key; SGLang.chat accepts-and-ignores it (recording is vLLM-only).

Coordinator + runner + workflow:
- rollout_coordinator: _resolve_rank_urls (ray.get_actor per engine) and a recording-mode finalize that fans out /records/consume_task per engine, deserializes, and feeds objects to the pipeline (no re-serialization).
- experience_pipeline: process_experiences(exps) public object entry.
- workflow_runner: recording mode returns a pickled reward map keyed by the per-sample task_id_key the workflow stamped; legacy path unchanged.
- workflow: SimpleWorkflow/AsyncSimpleWorkflow run a per-sample n=1 loop in recording mode (distinct task_id_key per sample == reward unit for GRPO), legacy n=repeat_times single-call path unchanged.
- config: ExplorerConfig.use_recorded_experience flag.

SQL path removal (MemoryStore only):
- delete proxy/recorder.py (HistoryRecorder) and proxy_test.py; proxy service/app drop /feedback, /commit, record_feedback, submit_experiences, ready_experiences (keep allocate_model + weight sync); allocator no longer fills record_db_url; drop InferenceModelConfig.record_db_url and the dead ExplorerConfig.db_url field; RecordingConfig deleted.

Serve-mode external reward reporting is intentionally left unimplemented this version (proxy /feedback//commit removed); the affected serve integration tests (TestServeWithTrainer, ServeTest) are skipped with a pointer to the recording refactor plan. convert_messages_to_experience redirect (multi-turn) is deferred with TODOs at its call sites.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
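For readers unfamiliar with the pattern, the store and context behavior described in the commit message can be sketched roughly as below. This is a minimal illustration, not the code in this PR: only the names `MemoryStore`, `update_reward_by_task_id`, `task_id_ctx`, and `skip_recording_ctx` come from the commit message. The `Record` class, the `add` method, the `recording_task` helper, and all signatures are hypothetical, and the sketch stamps only the reward and task id rather than the full reward/run/task set.

```python
import contextvars
from collections import defaultdict
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed stand-ins for the contextvars named in the commit message.
task_id_ctx = contextvars.ContextVar("task_id", default=None)
skip_recording_ctx = contextvars.ContextVar("skip_recording", default=False)


@dataclass
class Record:
    """Hypothetical, simplified placeholder for a recorded Experience."""
    tokens: List[int]
    logprobs: List[float]
    reward: Optional[float] = None
    info: Dict[str, object] = field(default_factory=dict)


class MemoryStore:
    """In-memory store that groups records by task id (illustrative only)."""

    def __init__(self) -> None:
        self._by_task: Dict[str, List[Record]] = defaultdict(list)

    def add(self, record: Record) -> bool:
        """Store a record under the current task id; return False if skipped."""
        task_id = task_id_ctx.get()
        if skip_recording_ctx.get() or task_id is None:
            return False
        self._by_task[task_id].append(record)
        return True

    def update_reward_by_task_id(self, task_id: str, reward: float) -> List[Record]:
        """Stamp the reward on the whole task-id group, then pop and return it."""
        group = self._by_task.pop(task_id, [])
        for rec in group:
            rec.reward = reward
            rec.info["task_id"] = task_id
        return group


@contextmanager
def recording_task(task_id: str):
    """Hypothetical helper: set task_id_ctx for the duration of one generation call."""
    token = task_id_ctx.set(task_id)
    try:
        yield
    finally:
        task_id_ctx.reset(token)


# Usage: record one sample under a task id, then consume the group with its reward.
store = MemoryStore()
with recording_task("task-0"):
    store.add(Record(tokens=[1, 2], logprobs=[-0.1, -0.2]))
consumed = store.update_reward_by_task_id("task-0", reward=1.0)
```

Because the group is popped when the reward is stamped, a second consume for the same task id returns an empty list in this sketch, which is one plausible way to keep the store from growing across steps.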

…-x-c/Trinity-RFT into feature/model_self_record_experience

openai api server record experience data

776ae71

pan-x-c marked this pull request as draft June 24, 2026 12:26

pan-x-c and others added 28 commits June 25, 2026 11:15

simplify

5fe9053

add config

180b416

use api key as session id

f27988b

clean stale header

40b53e8

simplify code

4ae9392

unify vllm experience recording

4f0f0a7

add tests

8ce1f4b

add log

02eb4e4

fix middleware

db49262

update interface

e6ae7dc

fix prompt text

63a5404

fix streaming recorder

5ff21b2

add delta stream

6a0ad44

refactor history recording

bb2f22c

sglang self recording experiences

bf931de

add sglang tests

a274ed3

fix sglang tests

35952ae

remove enable recording

6d6ce69

Merge branch 'feature/model_self_record_experience' of github.com:pan…

034fb96

…-x-c/Trinity-RFT into feature/model_self_record_experience

fix models

82d75ba

add recording server

5cc3f05

fix sglang

936da53

remove redundant fields

2228d65

fix vllm test

3122b55

fix tests

579d189

add store

052562a

refactor store

d8f3de6

pan-x-c wants to merge 29 commits into agentscope-ai:main from pan-x-c:feature/model_self_record_experience
