Add analysis/replicate.sh — single-command driver for the primary analysis #1
Merged
…lysis

Mirrors the revisions/replicate.sh layout (phases + DRY_RUN + PHASES filter) but drives the original-submission pipeline under analysis/strains/ (00–04) and analysis/cell_lines/ (00–07), via the shared analysis/_downloads/ download.

OPENAI_API_KEY is resolved from the macOS Keychain with ~/openai/access_key.txt and pre-set env var as fallbacks (inst/gpt.py already supports the file path).

The 02 / 05 R scripts that issue OpenAI Batch jobs already block on completion via R/run_batches.R, so the bash driver does not need a poll loop of its own.

Smoke-tested under DRY_RUN=1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
Adds a new top-level driver script (analysis/replicate.sh) intended to run the “primary/original-submission” analysis pipeline end-to-end from the repo root, including key verification, shared downloads, strain pipeline, and cell-line pipeline.
Changes:
- Introduces `analysis/replicate.sh` with phase-based execution (`keys`, `downloads`, `strains`, `cells`) and a `PHASES` filter.
- Adds a `DRY_RUN` mode to print the ordered command list instead of executing.
- Implements a macOS Keychain helper for resolving an OpenAI key into `OPENAI_API_KEY`.
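
As a rough illustration of the phase-based execution and `PHASES` filter listed above, the sketch below walks the four phases in a fixed order and skips any that are not named in `PHASES`. The helper names `phase_enabled` and `run_phases` and the stub phase bodies are hypothetical; only the phase names and the default `PHASES` value come from the PR.

```bash
#!/usr/bin/env bash

# Default matches the PR; callers may narrow it, e.g. PHASES=strains,cells.
PHASES="${PHASES:-keys,downloads,strains,cells}"

# Return 0 when the named phase appears in the comma-separated PHASES list.
# Wrapping both sides in commas prevents partial matches such as "cell".
phase_enabled() {
  [[ ",${PHASES}," == *",$1,"* ]]
}

# Stub phase bodies (hypothetical); a real driver would run the R scripts here.
phase_keys()      { echo "phase: keys"; }
phase_downloads() { echo "phase: downloads"; }
phase_strains()   { echo "phase: strains"; }
phase_cells()     { echo "phase: cells"; }

# Always walk the canonical order, so PHASES filters but never reorders.
run_phases() {
  local phase
  for phase in keys downloads strains cells; do
    if phase_enabled "$phase"; then
      "phase_${phase}"
    fi
  done
}

run_phases
```

Under this design, `PHASES=cells,strains` would still run `strains` before `cells`, which matters if later phases depend on earlier outputs.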
Comment on lines +2 to +5
# replicate.sh — single-command end-to-end replication of the *primary*
# analysis (the one reported in the original submission). This is kept
# separate from revisions/replicate.sh, which drives the revision-round
# pipeline. Run from the repository root.
Comment on lines +55 to +61
run() {
  if [[ "${DRY_RUN:-0}" == "1" ]]; then
    printf " [dry-run] %s\n" "$*"
  else
    printf " [run] %s\n" "$*"
    eval "$@"
  fi
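
For context on the `eval "$@"` line in this hunk: `eval` joins its arguments into one string and parses that string again as shell source, so an argument containing spaces or metacharacters can be split or interpreted a second time. A minimal sketch of an alternative that executes the words directly is shown below. It is an illustration rather than the PR's code, and it assumes callers pass a command and its arguments as separate words instead of a single string containing pipes or redirections.

```bash
#!/usr/bin/env bash

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [[ "${DRY_RUN:-0}" == "1" ]]; then
    # %q quotes each word so the printed line can be pasted back into a shell.
    printf ' [dry-run]'
    printf ' %q' "$@"
    printf '\n'
  else
    printf ' [run] %s\n' "$*"
    # Execute the words as given: no second round of parsing, unlike eval.
    "$@"
  fi
}
```

A call such as `run "$RSCRIPT" "$ANA/strains/some script.R"` would then pass the path as one argument even if it contained a space (the path here is made up for the example).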
Comment on lines +99 to +116
# keys: ensure the OpenAI key is reachable. inst/gpt.py reads (in order)
# ~/openai/access_key.txt, then OPENAI_API_KEY env. We resolve from
# Keychain into OPENAI_API_KEY when neither is already set.
# ---------------------------------------------------------------------------
verify_keys() {
  if [[ "${DRY_RUN:-0}" == "1" ]]; then
    printf " [dry-run] resolve OPENAI_API_KEY (Keychain / env / ~/openai/access_key.txt)\n"
    return
  fi
  if keychain_export OPENAI_API_KEY \
      "${OPENAI_KEYCHAIN_ENTRY:-}" \
      "OPENAI_API_KEY" "openai" "OpenAI" "openai-api-key"; then
    echo " ok: OPENAI_API_KEY resolved"
    return
  fi
  if [[ -f "$HOME/openai/access_key.txt" ]]; then
    echo " ok: ~/openai/access_key.txt present (inst/gpt.py will read it)"
    return
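
The body of `keychain_export` is not part of this hunk. The sketch below shows one way a helper with that call shape could work, as an assumption rather than the PR's implementation: it treats the first argument as the variable name and the remaining arguments as candidate Keychain service names, and it looks each one up with the macOS `security find-generic-password -s <service> -w` command. Checking the environment before the Keychain is also an assumption, based on the comment in the hunk above.

```bash
#!/usr/bin/env bash

# Hypothetical sketch of a keychain_export-style helper.
# Usage: keychain_export VAR_NAME service1 [service2 ...]
keychain_export() {
  local var="$1"; shift
  # A value already present in the environment wins (assumed ordering).
  if [[ -n "${!var:-}" ]]; then
    return 0
  fi
  # `security` ships with macOS only; elsewhere report failure so the
  # caller can fall back to other sources.
  command -v security >/dev/null 2>&1 || return 1
  local service value
  for service in "$@"; do
    [[ -n "$service" ]] || continue   # skip an empty optional entry name
    # -s selects the service name; -w prints only the stored password.
    if value="$(security find-generic-password -s "$service" -w 2>/dev/null)" \
        && [[ -n "$value" ]]; then
      export "$var=$value"
      return 0
    fi
  done
  return 1
}
```

Returning non-zero instead of exiting lets a caller like `verify_keys` continue to the `~/openai/access_key.txt` check when no Keychain entry is found.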

ANA="$REPO_ROOT/analysis"
RSCRIPT="${RSCRIPT:-Rscript}"

PHASES="${PHASES:-keys,downloads,strains,cells}"
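
To illustrate how an overridable `RSCRIPT` could drive the numbered scripts in order, here is a minimal sketch. The function name `run_numbered_scripts` and the two-digit-prefix `.R` file pattern are assumptions; the PR only states that the strain scripts are numbered 00–04 and the cell-line scripts 00–07. Because the PR description says the R scripts block until their batch jobs finish, plain sequential invocation is enough here.

```bash
#!/usr/bin/env bash

RSCRIPT="${RSCRIPT:-Rscript}"

# Run every two-digit-prefixed .R script in a directory, in the sorted
# order that glob expansion produces. Stops at the first failure so later
# steps never run against partial inputs.
run_numbered_scripts() {
  local dir="$1" script
  for script in "$dir"/[0-9][0-9]*.R; do
    [[ -e "$script" ]] || continue   # glob matched nothing
    "$RSCRIPT" "$script" || return 1
  done
}
```

Setting `RSCRIPT=echo` turns the function into a listing of what would run, which is similar in spirit to the `DRY_RUN` mode.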
Summary
Adds `analysis/replicate.sh`, a single-command driver for the original-submission pipeline. It mirrors the layout of `revisions/replicate.sh` (phases, `DRY_RUN`, `PHASES` filter, Keychain-based key resolution) but drives the scripts under `analysis/strains/` (00–04) and `analysis/cell_lines/` (00–07), preceded by the shared `analysis/_downloads/download.R`.

- `keys` → `downloads` → `strains` → `cells`.
- `OPENAI_API_KEY` resolves from the macOS Keychain, then a pre-set env var, then `~/openai/access_key.txt` (the existing `inst/gpt.py` already supports the file path).
- The 02 / 05 R scripts that issue OpenAI Batch jobs already block on completion via `R/run_batches.R`, so the bash driver does not need its own poll loop.
- … `revisions/replicate.sh` so the original and revision-round drivers do not drift.

Test plan
- `DRY_RUN=1 ./analysis/replicate.sh` prints the full ordered command list across all four phases.
- … `data-raw/strain_data/main_frame.rds` and `data-raw/cell_line_data/cell_line_inputs_second_pass.rds` from a cold start.

🤖 Generated with Claude Code