zacharykeeping (Member) commented Jan 14, 2026

#984

This adds improvements to how we normalise and compare URLs, and fixes an issue where relative links might not be handled correctly.

Also adds tests covering URL parsing and comparison.

Figure: A new scan with these changes shows fewer broken links, and relative links are no longer reported incorrectly

Copilot AI left a comment

Pull request overview

This PR improves URL normalization and fixes relative link handling in the SSW Link Auditor. The changes address issues where URLs with similar prefixes were incorrectly matched (e.g., /api matching /api-v2) and where relative links were not resolved correctly when the base URL didn't end with a trailing slash.
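The trailing-slash behaviour described above can be illustrated with Go's standard net/url package. This is a minimal sketch, not the PR's parseUrl implementation: resolveLink and the example.com URLs are hypothetical names chosen for illustration. It shows that a base without a trailing slash is treated as a document, so a relative link replaces its last path segment, while a base with a trailing slash is treated as a directory.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink is a hypothetical helper (not taken from the PR) that resolves
// href against base using the RFC 3986 rules implemented by net/url.
func resolveLink(base, href string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	r, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return b.ResolveReference(r).String(), nil
}

func main() {
	// The first base is a document (no trailing slash), the second a directory.
	bases := []string{"https://example.com/docs", "https://example.com/docs/"}
	for _, base := range bases {
		resolved, err := resolveLink(base, "guide")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s + guide => %s\n", base, resolved)
	}
}
```

With the document base the relative link resolves to https://example.com/guide, and with the directory base it resolves to https://example.com/docs/guide.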

Changes:

  • Fixed isSameOriginAndPath function to properly handle path prefix matching by checking for path separators, query strings, or fragments
  • Improved parseUrl function to correctly handle relative URLs when the base URL represents a document (no trailing slash) versus a directory
  • Added comprehensive test coverage for URL parsing and comparison logic
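The prefix-matching fix in the first bullet can be sketched as follows. This is an illustration under assumptions, not the code from docker/sswlinkauditor.go: hasPathPrefix is a hypothetical name, and the real isSameOriginAndPath may also compare scheme and host. The idea is that a plain prefix check is not enough, because the character immediately after the prefix must be a path separator, the start of a query string, or the start of a fragment.

```go
package main

import (
	"fmt"
	"strings"
)

// hasPathPrefix is a hypothetical helper (not the PR's function). It reports
// whether target starts with prefix AND the match ends on a boundary:
// the end of target, '/', '?', or '#'.
func hasPathPrefix(target, prefix string) bool {
	if !strings.HasPrefix(target, prefix) {
		return false
	}
	// An exact match, or a prefix that already ends in '/', is always a boundary.
	if len(target) == len(prefix) || strings.HasSuffix(prefix, "/") {
		return true
	}
	switch target[len(prefix)] {
	case '/', '?', '#':
		return true
	}
	return false
}

func main() {
	for _, target := range []string{"/api", "/api/users", "/api?page=2", "/api#top", "/api-v2"} {
		fmt.Printf("%-12s under /api: %v\n", target, hasPathPrefix(target, "/api"))
	}
}
```

Only /api-v2 is rejected here, which matches the false-positive case the overview mentions.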

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| docker/sswlinkauditor.go | Updated isSameOriginAndPath to prevent false positive matches and improved parseUrl to handle relative URLs from documents vs directories |
| docker/sswlinkauditor_test.go | Added new test file with comprehensive test cases for URL parsing and path comparison functions |
Comments suppressed due to low confidence (1)

docker/sswlinkauditor.go:313

  • Using strings.Index(url, "#") > 0 will not remove fragments at position 0 (e.g., #section). Although such URLs are likely filtered elsewhere, this should use >= 0 or strings.Contains for correctness. However, since fragments at position 0 represent pure anchor links and are probably already filtered by the caller, this may be intentional.
	if strings.Index(url, "#") > 0 {
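For comparison, here is a minimal sketch of the >= 0 variant the comment suggests. stripFragment is a hypothetical name and not a function from the PR, and whether pure anchor links such as #section should be stripped at this point depends on how the caller filters them, which the review leaves open.

```go
package main

import (
	"fmt"
	"strings"
)

// stripFragment is a hypothetical helper (not from the PR). It removes the
// first '#' and everything after it, including when '#' is at index 0.
func stripFragment(rawURL string) string {
	if i := strings.Index(rawURL, "#"); i >= 0 {
		return rawURL[:i]
	}
	return rawURL
}

func main() {
	for _, u := range []string{"/page#top", "#section", "/page"} {
		fmt.Printf("%q => %q\n", u, stripFragment(u))
	}
}
```

With > 0 instead of >= 0, the input #section would be returned unchanged rather than reduced to an empty string.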
