feat(monitor): unify explorer/trainer wandb logging into one run by MengsD · Pull Request #590 · agentscope-ai/Trinity-RFT

MengsD · 2026-06-24T04:03:56Z

When monitor.monitor_type is wandb and monitor.shared_run is True, explorer and trainer log their metrics to the same run; metrics are labelled by {role}/step.

Description

Background

Closes #556

Why this design

A wandb run has a single global _step — a per-run counter assigned in arrival order. With two roles writing one shared run, their rows interleave, so each role lands on a gapped _step (e.g. explorer 0,2,4…, trainer 1,3,5…) and can't show a clean 1..N. The only fix is to give each metric its own x-axis via define_metric(key, step_metric="{role}/step").
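The interleaving described above can be illustrated with a toy model that needs no wandb install. The function names and the strictly alternating arrival order are assumptions for illustration only; real arrival order depends on timing between the two processes.

```python
def assign_global_steps(arrivals):
    """Give each arriving row the next value of one run-wide counter.

    Returns a mapping role -> list of global steps that role's rows landed on.
    """
    steps_by_role = {}
    for global_step, role in enumerate(arrivals):
        steps_by_role.setdefault(role, []).append(global_step)
    return steps_by_role


def per_role_steps(arrivals):
    """Give each role its own counter starting at 1 (the {role}/step axis)."""
    counters = {}
    steps_by_role = {}
    for role in arrivals:
        counters[role] = counters.get(role, 0) + 1
        steps_by_role.setdefault(role, []).append(counters[role])
    return steps_by_role


# Assumed arrival order: the two roles alternate strictly.
arrivals = ["explorer", "trainer"] * 3

global_axis = assign_global_steps(arrivals)
role_axis = per_role_steps(arrivals)

print(global_axis)  # {'explorer': [0, 2, 4], 'trainer': [1, 3, 5]}
print(role_axis)    # {'explorer': [1, 2, 3], 'trainer': [1, 2, 3]}
```

Plotting against the shared counter gives each role a gapped axis, while a per-role counter gives both a contiguous 1..N.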

Three constraints then shaped the implementation:

No glob. wandb's define_metric glob doesn't cross / (issue #9549), and a bare * can't map the two roles to different axes — so bindings must be declared per full key, and keys appear dynamically during the run.
Multi-process. explorer / trainer / launcher are separate processes, so writing one run requires wandb shared mode (one primary = the launcher, secondaries = the actors).
Config is overwritten, not merged. In shared mode each writer's config sync replaces the run's metric defs (last writer wins). So a single declarer is unstable: if only the primary declares, the secondaries clobber it (the live axis falls back to the gapped _step); if each secondary declares only its own role, the two secondaries clobber each other.
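Because of the first constraint, bindings have to be declared per full key as keys show up. A minimal sketch of that bookkeeping follows. RoleAxisBinder and RecordingRun are hypothetical names, not code from this PR; RecordingRun stands in for a wandb run so the example runs without wandb, and the only wandb call assumed is define_metric(name, step_metric=...). The sketch covers key discovery only, not the re-declaration on each flush that the third constraint requires.

```python
class RoleAxisBinder:
    """Bind each newly seen metric key to its role's step axis, one key at a time."""

    def __init__(self, run, role):
        self.run = run  # any object exposing define_metric(name, step_metric=...)
        self.role = role
        self.step_key = f"{role}/step"
        self.declared = set()

    def bind_new_keys(self, metrics):
        """Declare bindings for keys not seen before; return the newly bound keys."""
        new_keys = []
        if self.step_key not in self.declared:
            # The axis itself is declared once, with no step_metric of its own.
            self.run.define_metric(self.step_key)
            self.declared.add(self.step_key)
        for key in metrics:
            if key == self.step_key or key in self.declared:
                continue
            # No glob: every full key is bound explicitly.
            self.run.define_metric(key, step_metric=self.step_key)
            self.declared.add(key)
            new_keys.append(key)
        return new_keys


class RecordingRun:
    """Hypothetical stand-in for a wandb run; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def define_metric(self, name, step_metric=None):
        self.calls.append((name, step_metric))


run = RecordingRun()
binder = RoleAxisBinder(run, "explorer")
first = binder.bind_new_keys({"explorer/step": 1, "explorer/reward": 0.1})
second = binder.bind_new_keys(
    {"explorer/step": 2, "explorer/reward": 0.2, "explorer/length": 35}
)
print(first)   # ['explorer/reward']
print(second)  # ['explorer/length']
```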

Approach: make every writer carry the full binding set, so no sync can drop anything.

Secondaries declare the full set on each flush (their own keys + all roles' keys read from shared binding files) → axis is correct during the run.
The launcher (primary) declares the full set once before finalize → axis is correct after the run (the primary's config wins at finalize).
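Why the full set matters can be seen in a toy model of last-writer-wins syncing. This is a simplified abstraction of the behavior described above, not wandb's actual sync code; final_defs and the example metric keys are hypothetical.

```python
def final_defs(sync_sequence):
    """Model a run whose metric definitions are replaced wholesale on every sync."""
    run_defs = {}
    for _writer, defs in sync_sequence:
        run_defs = dict(defs)  # overwrite, never merge
    return run_defs


explorer_only = {"explorer/reward": "explorer/step"}
trainer_only = {"trainer/loss": "trainer/step"}
full_set = {**explorer_only, **trainer_only}

# Each secondary declares only its own role: the last sync drops the other role.
partial = final_defs([("explorer", explorer_only), ("trainer", trainer_only)])

# Every writer carries the full set: any sync order leaves the same result.
stable_a = final_defs(
    [("explorer", full_set), ("trainer", full_set), ("launcher", full_set)]
)
stable_b = final_defs(
    [("trainer", full_set), ("launcher", full_set), ("explorer", full_set)]
)

print(partial)               # {'trainer/loss': 'trainer/step'}
print(stable_a == stable_b)  # True
```

With partial declarations the outcome depends on who syncs last; with the full set carried by every writer, the order stops mattering.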

Changes made

Add monitor.shared_run (opt-in, default False) and an internal monitor.run_id to MonitorConfig.
The launcher creates one primary shared run and injects run_id; explorer/trainer join it as secondary writers (wandb shared mode).
Each role's metrics are prefixed {role}/ and bound (via define_metric) to their own {role}/step axis, so explorer and trainer each show a clean 1..N x-axis within the same run — both during the run and after it finishes.
The config validator downgrades shared_run to False for backends that don't support it (only wandb does); shared_run=False keeps the original behavior (separate runs) byte-for-byte.
Note: the per-role step bindings are exchanged via files under checkpoint_job_dir/monitor/, so multi-node runs require that dir on shared storage (e.g., NAS).
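The file-based exchange mentioned in the note could look roughly like the sketch below. The file name pattern ({role}_bindings.json), the JSON layout, and both function names are assumptions for illustration; the PR only states that bindings are exchanged via files under checkpoint_job_dir/monitor/. The write goes through a temporary file and a rename so a reader does not pick up a half-written file on a local filesystem; rename behavior on a given shared storage backend should be verified separately.

```python
import json
import os
import tempfile


def write_bindings(monitor_dir, role, keys):
    """Write one role's metric keys to <monitor_dir>/<role>_bindings.json."""
    os.makedirs(monitor_dir, exist_ok=True)
    path = os.path.join(monitor_dir, f"{role}_bindings.json")
    fd, tmp_path = tempfile.mkstemp(dir=monitor_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"role": role, "keys": sorted(keys)}, f)
    os.replace(tmp_path, path)  # rename into place after the write completes
    return path


def read_all_bindings(monitor_dir):
    """Return {metric_key: step_axis} merged across every role's binding file."""
    bindings = {}
    if not os.path.isdir(monitor_dir):
        return bindings
    for name in sorted(os.listdir(monitor_dir)):
        if not name.endswith("_bindings.json"):
            continue  # skips temporary files and anything unrelated
        with open(os.path.join(monitor_dir, name)) as f:
            data = json.load(f)
        for key in data["keys"]:
            bindings[key] = f"{data['role']}/step"
    return bindings


# A temporary directory stands in for checkpoint_job_dir.
with tempfile.TemporaryDirectory() as job_dir:
    monitor_dir = os.path.join(job_dir, "monitor")
    write_bindings(monitor_dir, "explorer", {"explorer/reward"})
    write_bindings(monitor_dir, "trainer", {"trainer/loss"})
    all_bindings = read_all_bindings(monitor_dir)

print(all_bindings)
# {'explorer/reward': 'explorer/step', 'trainer/loss': 'trainer/step'}
```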

How to test

In an example config (e.g. examples/gsm8k-quick.yaml) set:
monitor:
  monitor_type: wandb
  shared_run: true
Run a both job: trinity run --config examples/gsm8k-quick.yaml.
Open the wandb run and confirm: explorer and trainer metrics live in one run; each role's panels use its own {role}/step axis showing 1..N (verified both mid-run and after finish).
With shared_run: false (default) or a non-wandb backend, behavior is unchanged (two separate runs).
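The fallback in the last step can be sketched as a small validator. MonitorConfigSketch, validate_monitor, and the tensorboard default are hypothetical stand-ins, not the project's actual MonitorConfig or validator; only the field names monitor_type, shared_run, and run_id come from the PR description.

```python
from dataclasses import dataclass
from typing import Optional

# Per the PR description, only the wandb backend supports a shared run.
SHARED_RUN_BACKENDS = {"wandb"}


@dataclass
class MonitorConfigSketch:
    """Hypothetical stand-in holding only the fields this example needs."""

    monitor_type: str = "tensorboard"  # assumed default, for illustration only
    shared_run: bool = False
    run_id: Optional[str] = None


def validate_monitor(cfg):
    """Downgrade shared_run to False when the backend cannot share a run."""
    if cfg.shared_run and cfg.monitor_type not in SHARED_RUN_BACKENDS:
        cfg.shared_run = False
    return cfg


wandb_cfg = validate_monitor(MonitorConfigSketch(monitor_type="wandb", shared_run=True))
other_cfg = validate_monitor(
    MonitorConfigSketch(monitor_type="tensorboard", shared_run=True)
)

print(wandb_cfg.shared_run)  # True
print(other_cfg.shared_run)  # False
```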

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

MengsD · 2026-06-24T04:07:35Z

/unittest-diff

github-actions · 2026-06-24T04:48:09Z

unittest: Run #1780

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Pending ⏳	Other ❓	Flaky 🍂	Duration ⏱️
203	201	0	2	0	0	0	39m 46s

🎉 All tests passed!

Github Test Reporter by CTRF 💚

feat(monitor): unify explorer/trainer wandb logging into one run

4af020d

MengsD wants to merge 1 commit into agentscope-ai:main from MengsD:feat/unified-wandb-run
