ci: add GitHub Actions workflow for documentation validation #5392
Nokhrin wants to merge 1 commit into
c576e36 to adbfb7f
adbfb7f to 575a1c4
    - name: Install docs-validator
      run: |
        python -m pip install --upgrade pip
        pip install git+https://github.com/Nokhrin/docs-validator.git
I don't think we should be running executables out of individual users' repositories. This is not safe.
I understand the security concerns about running executables from individual user repositories. Here are possible solutions:
- Publish to PyPI: Package the validator and publish it officially, allowing `pip install docs-validator` with verified releases.
- Use specific version tags: Pin to a specific commit/tag instead of a branch (e.g., `@v1.0.0#commit_hash`) for reproducibility.
- Transfer to OAI organization: Move the repository under OAI's control for official project oversight.
- Provide security audit documentation: Share code review results, dependency analysis, and security practices.
Could you clarify which approach aligns with OAI's policies?
What specific actions or documentation would be needed to approve this tool for use in the validation workflow?
I'm happy to implement any required changes.
What timeline would work for resolving this?
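As a rough illustration of the pinning option above: pip accepts a ref after `@` in a `git+https://` requirement, and a full commit SHA is the only ref that cannot be moved later (tags and branches can). The sketch below checks whether a requirement string is pinned that way. The helper names `vcs_ref` and `is_pinned_to_commit` are hypothetical and are not part of pip or docs-validator.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# A full 40-character hexadecimal Git commit SHA.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def vcs_ref(requirement: str) -> Optional[str]:
    """Return the ref after '@' in a 'git+<url>@<ref>' requirement, or None.

    Hypothetical helper for illustration only.
    """
    if not requirement.startswith("git+"):
        return None
    # Only look at the URL path, so a 'user@host' part is never mistaken for a ref.
    path = urlsplit(requirement[len("git+"):]).path
    _, sep, ref = path.rpartition("@")
    return ref if sep else None


def is_pinned_to_commit(requirement: str) -> bool:
    """True only when the requirement is pinned to a full commit SHA."""
    ref = vcs_ref(requirement)
    return ref is not None and _FULL_SHA.fullmatch(ref) is not None


if __name__ == "__main__":
    base = "git+https://github.com/Nokhrin/docs-validator.git"
    for req in (base, base + "@v1.0.0", base + "@" + "0" * 40):
        print(req, "->", is_pinned_to_commit(req))
```

A check like this addresses reproducibility only. It does not by itself answer the reviewer's concern about running code from an individual user's repository.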
@Nokhrin What would be the benefits of using your tool in addition to the existing document validation with
@ralfhandl Commit 7926ee2 (broken markdown anchor)
However, structural errors like 1397caf (invalid ref syntax)
Would you be open to adding an optional OpenAPI specification validator module? I can implement validation for broken refs, duplicate operationIds, and schema-example mismatches as a configurable rule, keeping the tool modular and auditable.
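To make one of the proposed rules concrete, here is a minimal sketch of a duplicate `operationId` check over an already-parsed OpenAPI description (a plain `dict`, for example the result of loading YAML or JSON). The function name `find_duplicate_operation_ids` is hypothetical. This is not how docs-validator does it; no such module exists in the PR.

```python
from collections import defaultdict
from typing import Dict, List

# Operation keys of a Path Item Object (lowercase fixed fields).
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def find_duplicate_operation_ids(spec: dict) -> Dict[str, List[str]]:
    """Map each operationId used more than once to the operations using it.

    Hypothetical helper for illustration only.
    """
    seen: Dict[str, List[str]] = defaultdict(list)
    for path, item in (spec.get("paths") or {}).items():
        if not isinstance(item, dict):
            continue
        for method, operation in item.items():
            # Skip non-operation keys such as "parameters" or "summary".
            if method not in HTTP_METHODS or not isinstance(operation, dict):
                continue
            op_id = operation.get("operationId")
            if op_id is not None:
                seen[op_id].append(f"{method.upper()} {path}")
    return {op_id: uses for op_id, uses in seen.items() if len(uses) > 1}


if __name__ == "__main__":
    example = {
        "paths": {
            "/pets": {"get": {"operationId": "listPets"}},
            "/animals": {"get": {"operationId": "listPets"}},
        }
    }
    print(find_duplicate_operation_ids(example))
```

The sketch only inspects top-level `paths`. Operations reachable through webhooks, callbacks, or referenced path items would need additional traversal.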
Summary
This PR proposes adding automated documentation link validation using `docs-validator`, a Python-based static analyzer that detects broken links, orphan files, missing anchors, and circular dependencies in documentation repositories.
Experiment Results
I've tested this tool on the OpenAPI-Specification repository to demonstrate its value. The validation completed in ~45 seconds and analyzed 57 documentation files containing 1,240 links (17 internal, 1,223 external).
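The PR does not show how the tool works internally. As a rough sketch of what two of the checks (broken internal links and orphan files) can involve, the example below scans an in-memory mapping of repository paths to file contents. The function `analyze_links` and the simplified link pattern are assumptions for illustration, not docs-validator's implementation.

```python
import posixpath
import re
from typing import Dict, List, Tuple

# Simplified inline Markdown link pattern: [text](target). Assumption: this
# ignores reference-style links, images with titles, and HTML anchors.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Anything starting with a URI scheme (https:, mailto:, ...) is treated as external.
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def analyze_links(files: Dict[str, str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (broken internal links, orphan files).

    `files` maps POSIX-style repository paths to file contents.
    Hypothetical helper for illustration only.
    """
    broken: List[Tuple[str, str]] = []
    linked = set()
    for path in sorted(files):
        base = posixpath.dirname(path)
        for lineno, line in enumerate(files[path].splitlines(), start=1):
            for target in LINK.findall(line):
                if SCHEME.match(target) or target.startswith("#"):
                    continue  # external URL or same-file anchor
                relative = target.split("#", 1)[0]
                resolved = posixpath.normpath(posixpath.join(base, relative))
                if resolved in files:
                    linked.add(resolved)
                else:
                    # Note: links to directories are also reported here,
                    # because only files are tracked in this sketch.
                    broken.append((f"{path}:{lineno}", target))
    orphans = sorted(set(files) - linked)
    return broken, orphans
```

External links, anchors, and circular dependencies are out of scope for this sketch. Checking external links would additionally require network requests and handling of responses such as 403 and 429.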
Metrics
Real Issues Discovered
1. Broken Internal Links (9 errors)
- `versions/3.0.2.md:1846` → `../examples/v3.0/callback-example.yaml` (file not found)
- `versions/3.1.0.md:198` → `../examples/v3.1/webhook-example.yaml` (file not found)
- `_archive_/schemas/v3.0/README.md:38` → `../../tests/v3.0` (file not found)
2. Broken External Links (29 errors)
- `versions/1.2.md:92` → `http://json-schema.org/latest/json-schema-core.html#anchor8` (404)
- `proposals/2019-01-01-Proposal-Template.md:9` → `https://github.com/{author2}` (404, template placeholder)
- `proposals/2024-09-01-Tags-Improvement.md:61` → `https://example.com/shopping` (404)
3. Orphan Files (52 warnings)
Files without incoming links, potentially undiscoverable:
- `versions/3.0.0.md`, `versions/3.1.0.md`, `versions/3.2.0.md` (historical versions)
- `proposals/*.md` (15 proposal files)
- `.github/pull_request_template.md`, `.github/ISSUE_TEMPLATE/*.md`
4. Manual Verification Required (46 warnings)
Resources blocked by WAF or rate-limited:
- `style-guide.md:7` → `https://www.npmjs.com/package/markdownlint` (403, Cloudflare)
- `versions/3.1.2.md:3503` → `https://www.w3.org/TR/xml-names11/` (403)
- `proposals/2019-10-31-Clarify-Nullable.md:9` → `https://github.com/tedepstein` (429, rate limit)
Benefits for OpenAPI-Specification
Configuration
The workflow is configured to:
`.md`, `.markdown`, `.asc`, `.adoc`)
Known Limitations
The following limitations are documented for transparency:
References
I'm happy to adjust the workflow configuration, add a `.docs-validator.toml` file with project-specific exclusions, or make any other changes based on maintainer feedback.