Commit fea9871

Merge pull request #325 from Emerge-Lab/ricky/merge_conflicts_3.0_beta_and_3.0

- Merge 3.0 into 3.0_beta
- Resolve conflicts on merge
- Adopt the resample_maps() refactoring in drive.py
- Make the scenario_id string length a named constant

2 parents 3430228 + bca5dca, commit fea9871

23 files changed (785 additions, 441 deletions)

README.md (2 additions, 0 deletions)

@@ -1,5 +1,7 @@
 # PufferDrive
 
+[![Unit Tests](https://github.com/Emerge-Lab/PufferDrive/actions/workflows/utest.yml/badge.svg)](https://github.com/Emerge-Lab/PufferDrive/actions/workflows/utest.yml)
+
 <img align="left" style="width:260px" src="https://github.com/Emerge-Lab/PufferDrive/blob/main/pufferlib/resources/drive/pufferdrive_20fps_long.gif" width="288px">
 
 **PufferDrive is a fast and friendly driving simulator to train and test RL-based models.**

docs/src/interact-with-agents.md (24 additions, 0 deletions)

@@ -18,6 +18,30 @@ then launch:
 
 This will run `demo()` with an existing model checkpoint.
 
+## Arguments & Configuration
+
+The `drive` tool supports CLI arguments similar to the visualizer's for controlling the environment and rendering. It also reads the `pufferlib/config/ocean/drive.ini` file for default environment settings.
+
+### Command Line Arguments
+
+| Argument | Description | Default |
+| :--- | :--- | :--- |
+| `--map-name <path>` | Path to the map binary file (e.g., `resources/drive/binaries/training/map_000.bin`). If omitted, picks a random map out of `num_maps` from `map_dir` in `drive.ini`. | Random |
+| `--policy-name <path>` | Path to the policy weights file (`.bin`). | `resources/drive/puffer_drive_weights.bin` |
+| `--view <mode>` | Selects which views to render: `agent`, `topdown`, or `both`. | `both` |
+| `--frame-skip <n>` | Renders every Nth frame to speed up simulation (framerate remains 30 fps). | `1` |
+| `--num-maps <n>` | Overrides the number of maps to sample from if `--map-name` is not set. | `drive.ini` value |
+
+### Visualization Flags
+
+| Flag | Description |
+| :--- | :--- |
+| `--show-grid` | Draws the underlying nav-graph/grid on the map. |
+| `--obs-only` | Hides objects not currently visible to the agent's sensors (fog of war). |
+| `--lasers` | Visualizes the raycast sensor lines from the agent. |
+| `--log-trajectories` | Draws the ground-truth "human" expert trajectories as green lines. |
+| `--zoom-in` | Zooms the camera in on the active region rather than the full map bounds. |
+
 ### Controls
 
 **General:**
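As a quick reference for the tables added above, here is a hypothetical invocation. The flags are the ones documented in this commit, but the `./drive` entry point is an assumption (by analogy with the `./visualize` binary built via `scripts/build_ocean.sh`); adjust it to however the tool is launched in your checkout:

```bash
# Hypothetical entry point ./drive (assumed, not confirmed by this commit).
# Run a specific map in agent view only, with sensor rays drawn and
# objects outside the agent's field of view hidden.
./drive --map-name resources/drive/binaries/training/map_000.bin \
    --view agent --lasers --obs-only
```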

docs/src/simulator.md (19 additions, 2 deletions)

@@ -22,11 +22,28 @@ A high-performance autonomous driving simulator in C with Python bindings.
 
 - `control_vehicles`: Only vehicles
 - `control_agents`: All agent types (vehicles, cyclists, pedestrians)
-- `control_tracks_to_predict`: WOMD evaluation mode
+- `control_wosac`: WOSAC evaluation mode (controls all valid agents, ignoring the expert flag and the start-to-goal distance)
 - `control_sdc_only`: Self-driving car only
 
 > [!NOTE]
-> `control_vehicles` filters out agents marked as "expert" and those too close to their goal (<2m). For full WOMD evaluation, use `control_tracks_to_predict`.
+> `control_vehicles` filters out agents marked as "expert" and those too close to their goal (<2m). For full WOMD evaluation, use `control_wosac`.
+
+> [!IMPORTANT]
+> **Agent Dynamics:** The simulator supports three types of agents:
+> 1. **Policy-Controlled:** Stepped by your model's actions.
+> 2. **Experts:** Stepped using ground-truth log trajectories.
+> 3. **Static:** Remain frozen in place.
+>
+> In the simulator, agents not selected for policy control will be treated as **Static** by default. To make them follow their **Expert trajectories**, you must set `mark_as_expert=true` for those agents in the JSONs. This is critical for `control_sdc_only`, to ensure the environment behaves realistically around the policy-controlled agents.
+
+### Init modes
+
+- **`create_all_valid`** (default): Initializes every valid agent present in the map file. This includes policy-controlled agents, experts (if marked), and static agents.
+
+- **`create_only_controlled`**: Initializes **only** the agents that are directly controlled by the policy.
+
+> [!NOTE]
+> In `create_only_controlled` mode, the environment will contain **no static or expert agents**. Only the policy-controlled agents will exist.
 
 ### Goal behaviors
 
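The control and init modes above map onto `drive.ini` options; here is a minimal sketch. `control_mode` is a key documented elsewhere in this commit, but `init_mode` is a hypothetical stand-in, since the diff names the init modes without showing the actual option key:

```ini
[env]
; One of the control modes listed above
control_mode = "control_vehicles"
; Hypothetical key name for the init mode; the diff does not show the real option
init_mode = "create_all_valid"
```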
docs/src/train.md (6 additions, 6 deletions)

@@ -1,14 +1,14 @@
-## Training
+# Training
 
-### Basic training
+## Basic training
 
 Launch a training run with Weights & Biases logging:
 
 ```bash
 puffer train puffer_drive --wandb --wandb-project "pufferdrive"
 ```
 
-### Environment configurations
+## Environment configurations
 
 **Default configuration (Waymo maps)**
 
@@ -33,11 +33,11 @@ resample_frequency = 100000 # No resampling needed (there are only a few Carla m
 termination_mode = 0 # 0: terminate at episode_length, 1: terminate after all agents reset
 
 # Map settings
-map_dir = "resources/drive/binaries/carla"
-num_maps = 2
+map_dir = "resources/drive/binaries"
+num_maps = 2 # Number of Carla maps you're training on
 ```
 
-this should give a good starting point. With these settings, you'll need about 2-3 billion steps to get an agent that reaches most of it's goals (> 95%) and has a combined collsion / off-road rate of 3 % per episode of 300 steps.
+This should give a good starting point. With these settings, you'll need about 2-3 billion steps to get an agent that reaches most of its goals (>95%) and has a combined collision/off-road rate of 3% per episode of 300 steps in towns 1 and 2, which can be found [here](https://github.com/Emerge-Lab/PufferDrive/tree/2.0/data_utils/carla/carla_data). Before launching your experiment, run `drive.py` on the folder containing the Carla towns to process them into binaries, then make sure `map_dir` above points to these binaries.
 
 > [!Note]
 > The default training hyperparameters work well for both configurations and typically don't need adjustment.
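The Carla preprocessing step is described only in prose above; a hedged sketch of what that invocation might look like, where both the script location and the flags are hypothetical (this commit does not show `drive.py`'s CLI):

```bash
# Hypothetical flags; check drive.py's argument parser for the real interface.
python drive.py --data-dir data_utils/carla/carla_data \
    --output-dir resources/drive/binaries
```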

docs/src/visualizer.md (36 additions, 2 deletions)

@@ -25,8 +25,8 @@ bash scripts/build_ocean.sh visualize local
 
 If you need to force a rebuild, remove the cached binary first (`rm ./visualize`).
 
-## Run headless
-Launch the visualizer with a virtual display and export an `.mp4`:
+## Rendering a Video
+Launch the visualizer with a virtual display and export an `.mp4` for the binary scenario:
 
 ```bash
 xvfb-run -s "-screen 0 1280x720x24" ./visualize

@@ -43,3 +43,37 @@ puffer render puffer_drive
 ```
 
 This mode parallelizes rendering based on `vec.num_workers`.
+
+## Arguments & Configuration
+
+The `visualize` tool supports several CLI arguments to control the rendering output. It also reads the `pufferlib/config/ocean/drive.ini` file for default environment settings (for more details on these settings, refer to [Configuration](simulator.md#configuration)).
+
+### Command Line Arguments
+
+| Argument | Description | Default |
+| :--- | :--- | :--- |
+| `--map-name <path>` | Path to the map binary file (e.g., `resources/drive/binaries/training/map_000.bin`). If omitted, picks a random map out of `num_maps` from `map_dir` in `drive.ini`. | Random |
+| `--policy-name <path>` | Path to the policy weights file (`.bin`). | `resources/drive/puffer_drive_weights.bin` |
+| `--view <mode>` | Selects which views to render: `agent`, `topdown`, or `both`. | `both` |
+| `--output-agent <path>` | Output filename for the agent-view video. | `<policy>_agent.mp4` |
+| `--output-topdown <path>` | Output filename for the top-down-view video. | `<policy>_topdown.mp4` |
+| `--frame-skip <n>` | Renders every Nth frame to speed up generation (framerate remains 30 fps). | `1` |
+| `--num-maps <n>` | Overrides the number of maps to sample from if `--map-name` is not set. | `drive.ini` value |
+
+### Visualization Flags
+
+| Flag | Description |
+| :--- | :--- |
+| `--show-grid` | Draws the underlying nav-graph/grid on the map. |
+| `--obs-only` | Hides objects not currently visible to the agent's sensors (fog of war). |
+| `--lasers` | Visualizes the raycast sensor lines from the agent. |
+| `--log-trajectories` | Draws the ground-truth "human" expert trajectories as green lines. |
+| `--zoom-in` | Zooms the camera in on the active region rather than the full map bounds. |
+
+### Key `drive.ini` Settings
+The visualizer initializes the environment using `pufferlib/config/ocean/drive.ini`. Important settings include:
+
+- `[env] dynamics_model`: `classic` or `jerk`. Must match the trained policy.
+- `[env] episode_length`: Duration of the playback; defaults to 91 if set to 0.
+- `[env] control_mode`: Determines which agents are active (`control_vehicles` vs `control_sdc_only`).
+- `[env] goal_behavior`: Defines agent behavior upon reaching goals (respawn vs stop).
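The new flags compose with the headless workflow shown earlier. For example, to render a custom-named top-down video of one map with expert trajectories overlaid (the flags are those documented above; this particular combination is an illustrative sketch, not a command taken from the commit):

```bash
# Headless render of a single map: top-down view only, every 2nd frame,
# ground-truth expert trajectories drawn in green.
xvfb-run -s "-screen 0 1280x720x24" ./visualize \
    --map-name resources/drive/binaries/training/map_000.bin \
    --view topdown --output-topdown map_000_topdown.mp4 \
    --frame-skip 2 --log-trajectories
```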

docs/src/wosac.md (7 additions, 19 deletions)

@@ -55,30 +55,18 @@ We provide baselines on a small curated dataset from the WOMD validation set wit
 
 | Method | Realism meta-score | Kinematic metrics | Interactive metrics | Map-based metrics | minADE | ADE |
 |--------|-------------------|-------------------|---------------------|-------------------|--------|------|
-| Ground-truth (UB) | 0.832 | 0.606 | 0.846 | 0.961 | 0 | 0 |
-| π_Base self-play RL | 0.737 | 0.319 | 0.789 | 0.938 | 10.834 | 11.317 |
-| [SMART-tiny-CLSFT](https://arxiv.org/abs/2412.05334) | 0.805 | 0.534 | 0.830 | 0.949 | 1.124 | 3.123 |
-| π_Random | 0.485 | 0.214 | 0.657 | 0.408 | 6.477 | 18.286 |
+| Ground-truth (UB) | 0.8179 | 0.6070 | 0.9590 | 0.8722 | 0 | 0 |
+| Self-play RL agent | 0.6750 | 0.2798 | 0.7966 | 0.7811 | 10.8057 | 11.4108 |
+| [SMART-tiny-CLSFT](https://arxiv.org/abs/2412.05334) | 0.7818 | 0.5200 | 0.8914 | 0.8378 | 1.1236 | 3.1231 |
+| Random | 0.4459 | 0.0506 | 0.7843 | 0.4704 | 23.5936 | 25.0097 |
 
 *Table: WOSAC baselines in PufferDrive on 229 selected clean held-out validation scenarios.*
 
-
-> ✏️ Download the dataset from [Hugging Face](https://huggingface.co/datasets/daphne-cornelisse/pufferdrive_wosac_val_clean) to reproduce these results or benchmark your policy.
-
-
-| Method | Realism meta-score | Kinematic metrics | Interactive metrics | Map-based metrics | minADE | ADE |
-| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
-| Ground-truth (UB) | 0.833 | 0.574 | 0.864 | 0.958 | 0 | 0 |
-| π_Base self-play RL | 0.737 | 0.323 | 0.792 | 0.930 | 8.530 | 9.088 |
-| [SMART-tiny-CLSFT](https://arxiv.org/abs/2412.05334) | 0.795 | 0.504 | 0.832 | 0.932 | 1.182 | 2.857 |
-| π_Random | 0.497 | 0.238 | 0.656 | 0.430 | 6.395 | 18.617 |
-
-*Table: WOSAC baselines in PufferDrive on validation 10k dataset.*
-
-
-> ✏️ Download the dataset from [Hugging Face](https://huggingface.co/datasets/daphne-cornelisse/pufferdrive_womd_val) to reproduce these results or benchmark your policy.
+- **Random agent:** Following the [WOSAC 2023 paper](https://arxiv.org/abs/2305.12032), the random agent samples future trajectories by independently sampling (x, y, θ) at each timestep from a Gaussian distribution in the AV coordinate frame `(mu=1.0, sigma=0.1)`, producing uncorrelated random motion over the 80-step horizon.
+- **Goal-conditioned self-play RL agent:** An agent trained through self-play RL to reach its end points ("goals") without colliding or going off-road. The baseline can be reproduced using the default settings in the `drive.ini` file with the Waymo dataset. We also open-source the weights of this policy; see `pufferlib/resources/drive/puffer_drive_weights` (`.bin` and `.pt`).
 
 
+> ✏️ Download the dataset from [Hugging Face](https://huggingface.co/datasets/daphne-cornelisse/pufferdrive_wosac_val_clean) to reproduce these results or benchmark your policy.
 
 ## Evaluating trajectories
 
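The random baseline description above is concrete enough to sketch. Here is a minimal numpy reading of it; this is our interpretation of the prose, not the benchmark's actual code, and the seed and array layout are arbitrary:

```python
import numpy as np

HORIZON = 80          # 80-step rollout, as described above
MU, SIGMA = 1.0, 0.1  # Gaussian parameters quoted above

rng = np.random.default_rng(seed=0)
# Independently sample (x, y, theta) at every timestep in the AV coordinate
# frame; successive steps are uncorrelated, hence jittery random motion.
trajectory = rng.normal(loc=MU, scale=SIGMA, size=(HORIZON, 3))
x, y, theta = trajectory.T
```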

docs/theme/extra.css (5 additions, 0 deletions)

@@ -463,3 +463,8 @@ blockquote {
   margin: 1rem 0;
   border-radius: 0 8px 8px 0;
 }
+
+/* Fix table visibility - remove alternating row colors */
+table tr:nth-child(2n) {
+  background-color: transparent !important;
+}

pufferlib/config/ocean/drive.ini (12 additions, 2 deletions)

@@ -177,8 +177,14 @@ render_map = none
 eval_interval = 1000
 ; Path to dataset used for evaluation
 map_dir = "resources/drive/binaries/training"
-; Evaluation will run on the first num_maps maps in the map_dir directory
-num_maps = 20
+; Number of scenarios to process per batch
+wosac_batch_size = 32
+; Target number of unique scenarios to evaluate on
+wosac_target_scenarios = 64
+; Total pool of scenarios to sample from
+wosac_scenario_pool_size = 1000
+; Max batches, used as a timeout to prevent an infinite loop
+wosac_max_batches = 100
 backend = PufferEnv
 ; WOSAC (Waymo Open Sim Agents Challenge) evaluation settings
 ; If True, enables evaluation on realism metrics each time we save a checkpoint

@@ -198,10 +204,14 @@ wosac_goal_radius = 2.0
 wosac_sanity_check = False
 ; Only return aggregate results across all scenes
 wosac_aggregate_results = True
+; Evaluation mode: "policy" or "ground_truth"
+wosac_eval_mode = "policy"
 ; If True, enable human replay evaluation (pair policy-controlled agent with human replays)
 human_replay_eval = False
 ; Control only the self-driving car
 human_replay_control_mode = "control_sdc_only"
+; Number of scenarios for human replay evaluation equals the number of agents
+human_replay_num_agents = 16
 
 [render]
 ; Mode to render a bunch of maps with a given policy
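The four new `wosac_*` keys describe a batched sampling loop. A hedged Python sketch of the control flow they imply, based only on the config comments above (the actual evaluator loop is not part of this diff and may differ):

```python
import random

# Values mirror the new drive.ini keys; the loop itself is our reading of the comments.
wosac_batch_size = 32            # scenarios processed per batch
wosac_target_scenarios = 64      # stop once this many unique scenarios are evaluated
wosac_scenario_pool_size = 1000  # total pool of scenarios to sample from
wosac_max_batches = 100          # timeout to prevent an infinite loop

evaluated, batches = set(), 0
while len(evaluated) < wosac_target_scenarios and batches < wosac_max_batches:
    batch = random.sample(range(wosac_scenario_pool_size), wosac_batch_size)
    evaluated.update(batch)  # repeats across batches collapse, hence the timeout
    batches += 1
print(f"Evaluated {len(evaluated)} unique scenarios in {batches} batches")
```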

pufferlib/ocean/benchmark/evaluate_imported_trajectories.py (31 additions, 55 deletions)

@@ -1,49 +1,30 @@
 import sys
 import pickle
 import numpy as np
-from scipy.spatial import cKDTree
 import pufferlib.pufferl as pufferl
 from pufferlib.ocean.benchmark.evaluator import WOSACEvaluator
 
 
-def align_trajectories_by_initial_position(simulated, ground_truth, tolerance=1e-4):
-    """
-    If the trajectories where generated using the same dataset, then regardless of the algorithm the initial positions should be the same.
-    We use this information to align the trajectories for WOSAC evaluation.
-
-    Ideally we would not have to use a tolerance, but the preprocessing in SMART shifts some values by around 2-e5 for some agents.
-
-    Also, the preprocessing in SMART messes up some heading values, so I decided not to include heading.
-
-    Idea of this script, use a nearest neighbor algorithm to associate all initial positions in gt to positions in simulated,
-    and check that everyone matching respects the tolerance and there are no duplicates.
-    """
-
-    sim_pos = np.stack([simulated["x"][:, 0, 0], simulated["y"][:, 0, 0], simulated["z"][:, 0, 0]], axis=1).astype(
-        np.float64
-    )
+def align_trajectories(simulated, ground_truth):
+    # Idea: use the (scenario_id, id) pair to reindex simulated_trajectories so it aligns with the GT
+    gt_scenario_ids = ground_truth["scenario_id"][:, 0]
+    sim_scenario_ids = simulated["scenario_id"][:, 0, 0]
 
-    gt_pos = np.stack(
-        [ground_truth["x"][:, 0, 0], ground_truth["y"][:, 0, 0], ground_truth["z"][:, 0, 0]], axis=1
-    ).astype(np.float64)
+    gt_ids = ground_truth["id"][:, 0]
+    sim_ids = simulated["id"][:, 0, 0]
 
-    tree = cKDTree(sim_pos)
+    lookup = {(s_id, a_id): idx for idx, (s_id, a_id) in enumerate(zip(sim_scenario_ids, sim_ids))}
 
-    dists, indices = tree.query(gt_pos, k=1)
+    try:
+        indices = [lookup[(s, i)] for (s, i) in zip(gt_scenario_ids, gt_ids)]
+        indices = np.array(indices, dtype=int)
+    except KeyError:
+        print("An agent present in the GT is missing in your simulation")
+        raise
 
-    tol_check = dists <= tolerance
+    sim_traj = {k: v[indices] for k, v in simulated.items()}
 
-    if not np.all(tol_check):
-        max_dist = np.max(dists)
-        raise ValueError(f"Didn't find a match for {np.sum(~tol_check)} agents, tolerance broken by {max_dist}m.")
-
-    if len(set(indices)) != len(indices):
-        raise ValueError("Duplicate matching found, I am sorry but this likely indicates that your data is wrong")
-
-    reordered_sim = {}
-    for key, val in simulated.items():
-        reordered_sim[key] = val[indices]
-    return reordered_sim
+    return sim_traj
 
 
 def check_alignment(simulated, ground_truth, tolerance=1e-4):

@@ -72,8 +53,7 @@ def evaluate_trajectories(simulated_trajectory_file, args):
     """
     env_name = "puffer_drive"
     args["env"]["map_dir"] = args["eval"]["map_dir"]
-    args["env"]["num_maps"] = args["eval"]["num_maps"]
-    args["env"]["use_all_maps"] = True
+    args["env"]["num_maps"] = args["eval"]["wosac_num_maps"]
     dataset_name = args["env"]["map_dir"].split("/")[-1]
 
     print(f"Running WOSAC realism evaluation with {dataset_name} dataset. \n")

@@ -97,30 +77,26 @@
 
     print(f"Number of scenarios: {len(np.unique(gt_trajectories['scenario_id']))}")
     print(f"Number of controlled agents: {num_agents_gt}")
-    print(f"Number of evaluated agents: {np.sum(gt_trajectories['id'] >= 0)}")
+    print(f"Number of evaluated agents: {gt_trajectories['is_track_to_predict'].sum()}")
 
     print(f"Loading simulated trajectories from {simulated_trajectory_file}...")
     with open(simulated_trajectory_file, "rb") as f:
         sim_trajectories = pickle.load(f)
 
-    if sim_trajectories["x"].shape[0] != gt_trajectories["x"].shape[0]:
-        print("\nThe number of agents in simulated and ground truth trajectories do not match.")
-        print("This is okay if you are running this script on a subset of the val dataset")
-        print("But please also check that in drive.h MAX_AGENTS is set to 256 and recompile")
-
-    if not check_alignment(sim_trajectories, gt_trajectories):
-        print("\nTrajectories are not aligned, trying to align them, if it fails consider changing the tolerance.")
-        sim_trajectories = align_trajectories_by_initial_position(sim_trajectories, gt_trajectories)
-        assert check_alignment(sim_trajectories, gt_trajectories), (
-            "There might be an issue with the way you generated your data."
-        )
-        print("Alignment successful")
-    else:
-        sim_trajectories = {k: v[:num_agents_gt] for k, v in sim_trajectories.items()}
-
-    # Evaluator code expects to have matching ids between gt and sim trajectories
-    # Since alignment is checked it is safe to do that
-    sim_trajectories["id"][:] = gt_trajectories["id"][..., None]
+    num_agents_sim = sim_trajectories["x"].shape[0]
+    assert num_agents_sim >= num_agents_gt, (
+        "There are fewer agents in your simulation than in the GT, so the computation won't be valid"
+    )
+
+    if num_agents_sim > num_agents_gt:
+        print("If you are evaluating on a subset of your trajectories, this is fine.")
+        print("\nOtherwise, consider changing the value of MAX_AGENTS in drive.h and recompiling.")
+
+    sim_trajectories = align_trajectories(sim_trajectories, gt_trajectories)
+
+    assert check_alignment(sim_trajectories, gt_trajectories), (
+        "There might be an issue with the way you generated your data."
+    )
 
     agent_state = vecenv.driver_env.get_global_agent_state()
     road_edge_polylines = vecenv.driver_env.get_road_edge_polylines()
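The new `align_trajectories` is easiest to see on toy data. Below is a self-contained sketch of the same `(scenario_id, id)` lookup-and-reindex idea; the arrays are invented and flattened to a single key dimension for brevity, whereas the real function slices keys out of full trajectory tensors via `[:, 0]` / `[:, 0, 0]`:

```python
import numpy as np

# Ground truth lists agents in this (scenario_id, agent_id) order:
gt_keys = [("a", 2), ("a", 1), ("b", 7)]
# The simulation produced the same agents, but in a different row order:
sim_keys = [("a", 1), ("b", 7), ("a", 2)]
sim_x = np.array([[0.1], [0.2], [0.3]])  # one sim array, rows in sim order

# Same idea as align_trajectories: map each key to its sim row index,
# then gather the sim rows in GT order. A missing key raises KeyError.
lookup = {key: idx for idx, key in enumerate(sim_keys)}
indices = np.array([lookup[key] for key in gt_keys], dtype=int)

aligned_x = sim_x[indices]
print(indices)    # [2 0 1]
print(aligned_x)  # rows now follow the GT agent order
```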
