These standards are lightweight and repo-specific. They are meant to help keep training code, utilities, and Android code aligned.
- Prefer small, explicit command surfaces over hidden behavior.
- Keep filenames, manifest formats, and output paths stable unless there is a clear migration plan.
- Update docs when changing user-facing commands, paths, or asset names.
- Keep CLI scripts argument-driven and runnable from the repo root.
- Prefer readable orchestration over clever metaprogramming in data and training workflows.
- Preserve CSV schema compatibility when possible.
- Fail early on missing files, invalid manifests, or dependency gaps.
- Keep processing behavior explicit and configuration-driven.
- Preserve CPU fallback behavior when touching ModelManager.
- Treat budget-device memory limits as a first-class constraint.
- Keep WorkManager payload keys and repository mapping logic in sync.
- Avoid mixing curated datasets with staged raw corpora by accident.
- Keep benchmark outputs clearly named and grouped.
- Use manifests instead of hand-assembled pair lists where possible.
- Prefer ASCII unless a file already intentionally uses Unicode.
- Remove stale examples instead of leaving placeholders that no longer run.
- Link docs to code that exists now, not code that used to exist.
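
As a rough illustration of the CLI, manifest, and fail-early points above, a script might look something like the sketch below. It is only a sketch under stated assumptions: the column names `input_path` and `target_path`, the `--manifest` flag, and the helper `load_manifest` are hypothetical and do not describe this repo's actual manifest schema or scripts.

```python
import argparse
import csv
import sys
from pathlib import Path

# Hypothetical schema: these column names are illustrative only,
# not the repo's real manifest format.
REQUIRED_COLUMNS = ("input_path", "target_path")


def load_manifest(manifest_path: Path) -> list:
    """Read a CSV manifest and raise before any work starts if it is unusable."""
    if not manifest_path.is_file():
        raise FileNotFoundError(f"manifest not found: {manifest_path}")
    with manifest_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in REQUIRED_COLUMNS if name not in header]
        if missing:
            raise ValueError(f"manifest {manifest_path} is missing columns: {missing}")
        rows = list(reader)
    # Check every referenced file up front rather than failing mid-run.
    # Relative paths resolve against the current directory (assumed: repo root).
    for row in rows:
        for name in REQUIRED_COLUMNS:
            value = row[name]
            if not value or not Path(value).is_file():
                raise FileNotFoundError(f"{name} does not exist: {value!r}")
    return rows


def parse_args(argv=None):
    # All behavior is driven by explicit arguments, with no hidden defaults.
    parser = argparse.ArgumentParser(description="Validate a pair manifest.")
    parser.add_argument("--manifest", type=Path, required=True)
    return parser.parse_args(argv)


def main(argv=None) -> int:
    args = parse_args(argv)
    try:
        rows = load_manifest(args.manifest)
    except (FileNotFoundError, ValueError) as error:
        print(f"error: {error}", file=sys.stderr)
        return 1
    print(f"{len(rows)} pairs validated")
    return 0

# A real script would end with: raise SystemExit(main())
```

Validating the header and every referenced path before doing any work means a bad manifest stops the run immediately instead of partway through a long job. The required columns sit in one place, so any schema change is visible in a single diff.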