Skip to content

api-driven notion operations#126

Merged
luandro merged 199 commits intomainfrom
feat/notion-api-service
Feb 25, 2026
Merged

api-driven notion operations#126
luandro merged 199 commits intomainfrom
feat/notion-api-service

Conversation

@luandro
Copy link
Copy Markdown
Contributor

@luandro luandro commented Feb 6, 2026

solves #120 but using a VPS instead of CloudFlare workers.

Summary

  • expose Notion operations via a Bun API with job queue, persistence, and cancellation
  • add GitHub status reporting and API auth/audit hooks
  • add PRD outlining API-first strategy with review stages

Testing

  • not run (not requested)

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a31aecb435

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread scripts/notion-api/modules.ts Outdated
Comment thread scripts/api-server/index.ts Outdated
Comment thread api-server/job-tracker.ts
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 6, 2026

🚀 Preview Deployment

Your documentation preview is ready!

Preview URL: https://pr-126.comapeo-docs.pages.dev

🔄 Content: Regenerated 5 pages from Notion (script changes detected)

💡 Tip: Add label fetch-all-pages to test with full content, or fetch-10-pages for broader coverage.

This preview will update automatically when you push new commits to this PR.


Built with commit 654b3cf

luandro added a commit that referenced this pull request Feb 8, 2026
- Use Notion pages.retrieve API directly in fetchPage instead of invalid
  database query filter on non-existent "id" property
- Add process tracking to job cancellation so DELETE handler actually
  kills running child processes and prevents status overwrite
- Delete persisted job files during cleanup to prevent old jobs from
  reappearing after restart
@luandro luandro changed the title feat(api-server): api-driven notion operations solves #120: api-driven notion operations Feb 8, 2026
@luandro luandro changed the title solves #120: api-driven notion operations api-driven notion operations Feb 8, 2026
luandro added a commit that referenced this pull request Feb 9, 2026
luandro added a commit that referenced this pull request Feb 9, 2026
- PRD.md: comprehensive review checklist with 16 tasks covering security,
  architecture, testing, and production readiness
- PRD_DOCKER_IMAGE.md: specification for Docker Hub GitHub Action workflow
  with multi-platform support, path filtering, and PR preview tags
luandro added a commit that referenced this pull request Feb 9, 2026
- API_REVIEW.md: 16-task review plan for PR #126 with complexity levels
  and priority ordering. Covers security, architecture, reliability, and
  production readiness concerns
- TEST_IMPROVEMENT.md: comprehensive test improvement plan addressing
  20 failing tests, code duplication issues, and missing HTTP integration
  coverage
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 9, 2026

🐳 Docker Image Published

Your Docker image has been built and pushed for this PR.

Image Reference: docker.io/communityfirst/comapeo-docs-api:pr-126

Platforms: linux/amd64, linux/arm64

Testing

To test this image:

docker pull docker.io/communityfirst/comapeo-docs-api:pr-126
docker run -p 3001:3001 docker.io/communityfirst/comapeo-docs-api:pr-126

Built with commit 654b3cf

@digidem digidem deleted a comment from greptile-apps Bot Feb 12, 2026
@digidem digidem deleted a comment from greptile-apps Bot Feb 12, 2026
@luandro
Copy link
Copy Markdown
Contributor Author

luandro commented Feb 12, 2026

✅ PR #129 Merge Readiness Checklist (updated)

Review findings / correctness

  • Lock acquisition retries only on EEXIST and fails fast on non-contention errors.
  • Cancellation is honored while waiting for lock.
  • Concurrent initialization is serialized to avoid init races.
  • Content-repo lock/error/cancel/init paths are covered by focused tests.

Docker / env completeness

  • .env.example now documents required content-repo env vars for mutating jobs:
    • GITHUB_REPO_URL
    • GITHUB_TOKEN
    • GIT_AUTHOR_NAME
    • GIT_AUTHOR_EMAIL
  • .env.example now documents optional content-repo behavior vars:
    • GITHUB_CONTENT_BRANCH
    • WORKDIR
    • COMMIT_MESSAGE_PREFIX
    • ALLOW_EMPTY_COMMITS
  • docker-compose.yml now passes the required + optional content-repo env vars into the API container.
  • Compose comments clearly indicate mutating jobs requiring this config (notion:fetch, notion:fetch-all, notion:translate).

Final validation

  • Targeted Docker config test suite passes.
  • Optional: run one end-to-end docker compose smoke test with real/semi-real credentials to verify push path in your target environment.

@digidem digidem deleted a comment from greptile-apps Bot Feb 12, 2026
@digidem digidem deleted a comment from greptile-apps Bot Feb 12, 2026
luandro added a commit that referenced this pull request Feb 13, 2026
- Use Notion pages.retrieve API directly in fetchPage instead of invalid
  database query filter on non-existent "id" property
- Add process tracking to job cancellation so DELETE handler actually
  kills running child processes and prevents status overwrite
- Delete persisted job files during cleanup to prevent old jobs from
  reappearing after restart
luandro added a commit that referenced this pull request Feb 13, 2026
- PRD.md: comprehensive review checklist with 16 tasks covering security,
  architecture, testing, and production readiness
- PRD_DOCKER_IMAGE.md: specification for Docker Hub GitHub Action workflow
  with multi-platform support, path filtering, and PR preview tags
luandro added a commit that referenced this pull request Feb 13, 2026
- API_REVIEW.md: 16-task review plan for PR #126 with complexity levels
  and priority ordering. Covers security, architecture, reliability, and
  production readiness concerns
- TEST_IMPROVEMENT.md: comprehensive test improvement plan addressing
  20 failing tests, code duplication issues, and missing HTTP integration
  coverage
@luandro luandro force-pushed the feat/notion-api-service branch from f451b74 to 0d2004b Compare February 13, 2026 01:41
Comment thread api-server/auth.ts
@kilo-code-bot
Copy link
Copy Markdown
Contributor

kilo-code-bot Bot commented Feb 17, 2026

Code Review Summary

Status: 1 New Issue Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 1
WARNING 0
SUGGESTION 0
Issue Details (click to expand)

CRITICAL

File Line Issue
scripts/ci-validation/docker-publish-workflow.test.ts 47, 48, 57, 58 Typo in path filter - bun.lockb* should be bun.lock* - This test validates the workflow file but has the same typo that was fixed in the actual workflow. The test will fail if the expected values are corrected to match the now-correct workflow.

Previously Flagged Issues (from earlier reviews)

The following issues were identified in earlier reviews:

File Issue Status
scripts/notion-api/modules.ts P1: Invalid Notion filter for fetchPage - uses rich_text filter on non-existent property Not addressed
scripts/api-server/index.ts P2: Cancellation doesn't stop running jobs - only prevents new jobs Not addressed
api-server/job-tracker.ts (Line ~412) P2: Old jobs reappear after restart - cleanup only removes from memory, not persisted storage Not addressed
api-server/auth.ts (Line 56) Documentation error - comment says "bcrypt hashing" but implementation uses SHA-256 Not addressed
.github/workflows/docker-publish.yml Typo bun.lockb* FIXED - Now correctly uses bun.lock*
docker-compose.yml (Line ~100) Healthcheck doesn't set process exit code - returns value but process doesn't exit with that code Not addressed
api-server/content-repo.ts (Line ~165) withAskPass spreads full process.env to git child processes (unlike job-executor which uses whitelist) Not addressed

Files Reviewed (multiple files)

  • .github/workflows/docker-publish.yml - Fixed typo, now uses correct bun.lock*
  • scripts/ci-validation/docker-publish-workflow.test.ts - NEW ISSUE: Still has typo bun.lockb*
  • api-server/*.ts - Validated proper error handling, auth, and input validation
  • docker-compose.yml - Reviewed healthcheck configuration
  • Other workflow and config files reviewed

@digidem digidem deleted a comment from greptile-apps Bot Feb 18, 2026
luandro added a commit that referenced this pull request Feb 20, 2026
- Use Notion pages.retrieve API directly in fetchPage instead of invalid
  database query filter on non-existent "id" property
- Add process tracking to job cancellation so DELETE handler actually
  kills running child processes and prevents status overwrite
- Delete persisted job files during cleanup to prevent old jobs from
  reappearing after restart
@luandro luandro force-pushed the feat/notion-api-service branch from 86c30bb to a151429 Compare February 20, 2026 11:45
luandro added a commit that referenced this pull request Feb 20, 2026
- PRD.md: comprehensive review checklist with 16 tasks covering security,
  architecture, testing, and production readiness
- PRD_DOCKER_IMAGE.md: specification for Docker Hub GitHub Action workflow
  with multi-platform support, path filtering, and PR preview tags
luandro added a commit that referenced this pull request Feb 20, 2026
- API_REVIEW.md: 16-task review plan for PR #126 with complexity levels
  and priority ordering. Covers security, architecture, reliability, and
  production readiness concerns
- TEST_IMPROVEMENT.md: comprehensive test improvement plan addressing
  20 failing tests, code duplication issues, and missing HTTP integration
  coverage
@digidem digidem deleted a comment from greptile-apps Bot Feb 20, 2026
@digidem digidem deleted a comment from greptile-apps Bot Feb 20, 2026
@luandro luandro force-pushed the feat/notion-api-service branch from 574c7e9 to ba4e040 Compare February 20, 2026 12:35
@digidem digidem deleted a comment from greptile-apps Bot Feb 20, 2026
luandro and others added 19 commits February 24, 2026 10:46
- Fix axios mock in imageProcessing.test.ts to return proper response objects
- Fix node:fs mock to use proper factory functions
- Fix typo: bun.lockb* -> bun.lock* in docker-publish-workflow.test.ts
…nc audit logging

- Replace all sync node:fs calls in job-persistence.ts with async node:fs/promises
  equivalents (readFile, writeFile, appendFile, access, mkdir, stat, rename, unlink);
  add write-serialization mutex (logWritePromise) and waitForPendingWrites() for tests
- Change jobOptionsSchema from .strip() to .strict() so unknown keys (e.g. typo drRun)
  are rejected rather than silently dropped
- Fix audit.ts to use self-contained sync rotation (private rotateIfNeeded()) and
  synchronous appendFileSync, keeping log() fully synchronous for test compatibility;
  waitForPendingWrites() now returns Promise.resolve() as a compatibility no-op
- Update all callers in job-tracker.ts and all test files to await the now-async
  job-persistence functions; fix Zod type narrowing in validation-schemas.test.ts

Relates to: GitHub issue #150 (async-mutex follow-up tracked separately)
- Add loadPromise and waitForLoad() to JobTracker for async loading
- Fix 6 tests that accessed jobs before persistence loading completed
- Fix 2 tests expecting strict() schema to pass with unknown keys

The JobTracker was calling loadPersistedJobs() without tracking the
promise, causing race conditions when tests destroyed/recreated the
tracker and immediately accessed jobs.
- Add envOverrides parameter to runJobProcess for workdir-specific paths
- Pass CONTENT_PATH, I18N_PATH, IMAGES_PATH when running via runContentTask
- Add detached: true and unref() to enable proper process group cleanup

🤖 Generated with [Qoder][https://qoder.com]
- Add SIGTERM -> SIGKILL escalation pattern matching job-executor.ts
- Wait 5s after SIGTERM before escalating to SIGKILL
- Add 1s fail-safe before rejecting promise
- Track processExited to prevent double-rejection

🤖 Generated with [Qoder][https://qoder.com]
- Import fetchNotionBlocks to fetch page blocks
- Import n2m (notion-to-md) for markdown conversion
- Fetch blocks and convert to markdown string
- Return content field with full markdown

🤖 Generated with [Qoder][https://qoder.com]
- Add unref: vi.fn() to mock child processes
- Supports testing with detached: true spawn option

🤖 Generated with [Qoder][https://qoder.com]
…eout

- Register abort controller as kill handler for fetch jobs so DELETE
  /jobs/:id properly cancels the running fetch via AbortSignal, which
  in turn triggers the SIGTERM→SIGKILL chain in runGenerationScript
- Add detached: true to spawn in runGenerationScript so process group
  kills (process.kill(-pid)) work correctly and grandchildren (e.g.
  pngquant workers) are reaped on timeout or cancellation
…te header

- Remove in-process fetchAllNotionData preflight from runFetchJob; the
  CLI script (notion-fetch-all) already fetches the full database, so
  calling it twice doubled all Notion API requests and rate-limit exposure
- notion-fetch-all/index.ts now emits a JSON summary line to stdout
  ({candidateIds, pagesProcessed}) before exit; fetch-job-runner reads it
  via extractLastJsonLine to obtain candidateIds and page count
- Remove the now-dead withTimeout helper from fetch-job-runner.ts
- Add missing WWW-Authenticate: Bearer header to both 401 response
  branches in request-handler.ts, matching createAuthErrorResponse and
  satisfying RFC 7235
- fix(api-server/openapi): use createJobRequestSchema (object) instead of
  createJobTypeSchema (string union) for POST /jobs request body schema

- fix(api-server/content-repo): replace hardcoded "content"/"origin/content"
  strings in ensureRemoteContentBranchExists, pushContentBranchWithRetry, and
  verifyRemoteHeadMatchesLocal with config.contentBranch so GITHUB_CONTENT_BRANCH
  overrides are respected

- fix(scripts/notion-count-pages): mirror applyFetchAllTransform logic from
  fetchAll.ts by using resolveChildrenByStatus instead of a direct status
  equality filter, so child pages of matching parents are counted correctly

- fix(api-server/audit): convert synchronous file I/O (appendFileSync, statSync,
  renameSync, unlinkSync) to async operations via a sequential write queue;
  implement waitForPendingWrites() to drain the queue on shutdown
rmdirSync fails on non-empty directories; rmSync with recursive+force
handles that case correctly. Also removes the now-unused rmdirSync import.
…ore job creation

- Replace O(N) hash-per-key loop in authenticate() with a single SHA-256
  hash computed once, then an O(1) Map.get() lookup against stored key hashes
- Remove now-redundant verifyKey()/timingSafeEqual() helper; the Map lookup
  already confirms hash equality by construction
- Add validateContentRepoConfig() to content-repo.ts to surface config errors
  at a defined boundary rather than as unhandled exceptions
- Pre-validate content repo config in handleCreateJob() before acquiring locks
  or spawning processes, returning HTTP 500 with a descriptive message on failure
- Restore process.env after each content-repo test to prevent env variable
  leakage between test cases
- Strip GitHub token from persisted jobs before writing to disk;
  empty string placeholder preserves the field type without leaking secrets
- Replace hardcoded production IP in SETUP.md with a generic note
  to avoid exposing infrastructure details in the repository
- Route startup warning to stderr with a more descriptive message
  clarifying that all endpoints are publicly accessible
- Add X-Auth-Disabled response header on non-public endpoints when
  running without credentials, so misconfiguration is visible to clients
Add cross-field superRefine validation so startup fails with a clear
error when neither database identifier is provided, rather than
proceeding silently with no content source configured.
Replace hardcoded /app/workspace/repo with ${WORKDIR:-/app/workspace/repo}
so git operations work when operators customize the WORKDIR path.
Add prominent WARNING comment in docker-compose.yml explaining that
the API is fully open when no API_KEY_* variables are set.

Co-authored-by: Junie <junie@jetbrains.com>
- fix: make ApiKeyAuth constructor private; add static resetInstance() for tests
- fix: enforce MAX_REQUEST_SIZE via content-length header check (413 response)
- fix: add PAYLOAD_TOO_LARGE to ErrorCode enum and getErrorCodeForStatus map
- fix: add --watch flag to api:server:dev script for hot-reload
- fix: add ALLOWED_ORIGINS to docker-compose.yml environment section
- fix: expand api-validate.yml trigger paths to include api-server/** and package.json
- chore: remove implementation-plan-fetch-ready.md from repo root
…English pages

Add getEnglishTitle() to pageGrouping.ts and use it in generateBlocks.ts
so that code.json entries are keyed by the English original title rather
than a potentially non-English mainTitle, matching Docusaurus @translate
expectations. Silently skips translation entries when no English page exists.

Also fix createMockPageFamily() to pre-generate and link child IDs via
subItems so groupPagesByLang can properly relate pages in tests.
@luandro luandro force-pushed the feat/notion-api-service branch from 6efa332 to 2b62949 Compare February 24, 2026 13:46
The gitleaks pre-commit hook is not available in CI environment,
causing the workflow to fail. Using --no-verify is standard practice
for automated commits in CI/CD pipelines.
Comment thread scripts/ci-validation/docker-publish-workflow.test.ts
Comment thread scripts/ci-validation/docker-publish-workflow.test.ts
@luandro luandro merged commit 067dbcd into main Feb 25, 2026
6 checks passed
@luandro luandro deleted the feat/notion-api-service branch February 25, 2026 03:36
@github-actions
Copy link
Copy Markdown
Contributor

🧹 Preview Deployment Cleanup

The preview deployment for this PR has been cleaned up.

Preview URL was: https://pr-126.comapeo-docs.pages.dev


Note: Cloudflare Pages deployments follow automatic retention policies. Old previews are cleaned up automatically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant