LiveRL

Results & Artifacts

Where logs, checkpoints, and trajectories land

A run produces several streams of output, all under the repo root (the runtime artifacts are gitignored).

Per-run outputs

Output          Path                                                                  Notes
Training log    logs/<exp>.log                                                        verl + Harbor integration log (contains the step: metric lines)
Throughput log  logs/<exp>_vllm.log                                                   filtered throughput-only log (dashboard skips it)
Trajectories    harbor_trials/<project>/<exp>/step_*/<session>/proxy_trajectory.json  per-trial agent traffic (token ids / masks / logprobs)
Checkpoints     checkpoints/<project>/<exp>/global_step_N/                            actor FSDP shards, every save_freq steps
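The layout above can be resolved programmatically. The following is a minimal sketch, not part of LiveRL itself: the helper names run_paths and latest_checkpoint are hypothetical, and it assumes only the directory conventions listed in the table.

```python
import re
from pathlib import Path
from typing import Dict, Optional


def run_paths(root: Path, project: str, exp: str) -> Dict[str, Path]:
    """Map each per-run output to its location under the repo root.

    Hypothetical helper: it only mirrors the path conventions in the table.
    """
    return {
        "training_log": root / "logs" / f"{exp}.log",
        "throughput_log": root / "logs" / f"{exp}_vllm.log",
        "trajectories": root / "harbor_trials" / project / exp,
        "checkpoints": root / "checkpoints" / project / exp,
    }


def latest_checkpoint(ckpt_dir: Path) -> Optional[Path]:
    """Return the global_step_N directory with the largest N, or None."""
    if not ckpt_dir.is_dir():
        return None
    best: Optional[Path] = None
    best_step = -1
    for child in ckpt_dir.iterdir():
        match = re.fullmatch(r"global_step_(\d+)", child.name)
        # Compare N numerically so global_step_10 sorts after global_step_2.
        if match and child.is_dir() and int(match.group(1)) > best_step:
            best, best_step = child, int(match.group(1))
    return best
```

Comparing the step number as an integer matters here: a plain lexicographic sort would place global_step_10 before global_step_2.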

Live state

  • The process table — pgrep -af 'sync_1node_cc|fully_async|main_ppo' tells you whether a run is live right now.
  • vLLM readiness — query the Ray-registered vllm_server_* actors (see Inference Stack).
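The process-table check can also be scripted. This is a sketch under assumptions: it wraps the same pgrep -af pattern shown above, assumes the procps-ng pgrep found on Linux (where -a prints the full command line), and the names parse_pgrep and live_runs are hypothetical.

```python
import subprocess
from typing import List, Tuple

# Same pattern as the pgrep command in the list above.
RUN_PATTERN = "sync_1node_cc|fully_async|main_ppo"


def parse_pgrep(output: str) -> List[Tuple[int, str]]:
    """Turn `pgrep -af` output ("<pid> <command line>" per line) into pairs."""
    procs = []
    for line in output.splitlines():
        pid, _, cmd = line.strip().partition(" ")
        if pid.isdigit():
            procs.append((int(pid), cmd))
    return procs


def live_runs(pattern: str = RUN_PATTERN) -> List[Tuple[int, str]]:
    """Return matching (pid, command) pairs; an empty list means no live run."""
    # pgrep exits with status 1 when nothing matches, so do not use check=True.
    result = subprocess.run(
        ["pgrep", "-af", pattern],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_pgrep(result.stdout)
```

Calling live_runs() and testing whether the list is empty answers the same question as the shell command, with the PIDs available for follow-up inspection.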

Do not commit checkpoints

FSDP shards are large and stay where verl wrote them. Archive metadata and small trajectory samples — never the checkpoints themselves.
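One way to archive a small trajectory sample is sketched below. It is an illustration only: archive_trajectory_sample, the per_step limit, and the destination directory are hypothetical, and it assumes the step_*/<session>/proxy_trajectory.json layout from the table. It never touches the checkpoints directory.

```python
import shutil
from pathlib import Path
from typing import List


def archive_trajectory_sample(traj_dir: Path, dest: Path, per_step: int = 2) -> List[Path]:
    """Copy at most `per_step` trajectory files from each step_* directory.

    The relative layout (step_*/<session>/proxy_trajectory.json) is preserved
    under `dest`, so the sample stays browsable.
    """
    copied = []
    for step_dir in sorted(traj_dir.glob("step_*")):
        sample = sorted(step_dir.glob("*/proxy_trajectory.json"))[:per_step]
        for src in sample:
            target = dest / src.relative_to(traj_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, target)
            copied.append(target)
    return copied
```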

Visualizing

The dashboard reads the training logs and renders reward, KL, MFU, entropy, and response-length curves, plus a browser over the per-trial trajectory JSON.
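For ad hoc inspection outside the dashboard, the step: metric lines can be parsed directly. The sketch below is not the dashboard's parser. It assumes a line format of step:<N> followed by " - key:value" pairs, which may differ from what a given run actually logs, and the metric names in the comment are placeholders.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed format (hypothetical example line):
#   step:3 - actor/kl_loss:0.002 - critic/rewards/mean:0.5
STEP_RE = re.compile(r"step:(\d+)\s+-\s+(.*)")


def parse_step_line(line: str) -> Optional[Tuple[int, Dict[str, float]]]:
    """Return (step, {metric: value}) for a metric line, else None."""
    match = STEP_RE.search(line)
    if not match:
        return None
    metrics: Dict[str, float] = {}
    for part in match.group(2).split(" - "):
        # Split on the last colon so keys may themselves contain colons.
        key, sep, value = part.strip().rpartition(":")
        if not sep:
            continue
        try:
            metrics[key] = float(value)
        except ValueError:
            continue  # skip non-numeric fields
    return int(match.group(1)), metrics
```

Applying parse_step_line to every line of logs/<exp>.log and keeping the non-None results yields one metrics dictionary per step, which is enough to plot a single curve such as reward or KL.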
