-
-
Notifications
You must be signed in to change notification settings - Fork 621
Expose delims #596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose delims #596
Conversation
Add functions to retrieve delimiter character and length for EMPH,
STRONG, and CODE nodes. This enables use cases like rendering markdown
with delimiters included inside styled elements.
New API:
- cmark_parse_document_for_delimiters(): Parse with source position tracking
- cmark_node_get_delimiter_info(): Get delimiter char and length
- cmark_node_get_delim_char(): Convenience for just the character
- cmark_node_get_delim_length(): Get length without source text
Key design decisions:
- Computes delimiter info on-demand from source positions (no parsing changes)
- Requires CMARK_OPT_SOURCEPOS for accurate multi-line position tracking
- All new code isolated in cmark_delim.{h,c} to minimize merge conflicts
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a new public API to retrieve delimiter information (character and length) for inline markdown nodes (EMPH, STRONG, CODE), enabling use cases where delimiters need to be included in the rendered output. The implementation computes delimiter info on-demand from source positions, avoiding any modifications to the core parsing logic.
Key changes:
- New delimiter query API with three functions for different use cases
- On-demand computation from source positions using CMARK_OPT_SOURCEPOS
- Comprehensive test coverage including edge cases and different line endings
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/cmark_delim.h | New header defining public API for delimiter information retrieval with comprehensive documentation |
| src/cmark_delim.c | Implementation of delimiter query functions including helper for line/column to index conversion |
| src/CMakeLists.txt | Build configuration updates to include new source and header files |
| api_test/main.c | Integration of delimiter tests into the test suite |
| api_test/delim_test.h | Test header declaring delimiter test runner function |
| api_test/delim_test.c | Comprehensive test suite covering all node types, edge cases, and line ending styles |
| api_test/CMakeLists.txt | Build configuration for including delimiter test files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * | ||
| * For EMPH, always returns 1. | ||
| * For STRONG, always returns 2. | ||
| * For CODE, returns 0 (use cmark_node_get_delimiter_info to get actual backtick count). |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation states "use cmark_node_get_delimiter_info to get actual backtick count" but the phrasing could be clearer. It should explicitly state that for CODE nodes, the delimiter length varies depending on the number of backticks used in the source, which is why source text is required. Consider rephrasing to: "For CODE, returns 0 because the backtick count varies and requires examining the source text (use cmark_node_get_delimiter_info instead)."
| * For CODE, returns 0 (use cmark_node_get_delimiter_info to get actual backtick count). | |
| * For CODE, returns 0 because the backtick count varies and requires examining the source text (use cmark_node_get_delimiter_info instead). |
| * @return The delimiter char ('*', '_', or '`'), or 0 if not applicable or invalid options | ||
| */ | ||
| CMARK_EXPORT int cmark_node_get_delim_char( | ||
| cmark_node *node, | ||
| const char *source, | ||
| size_t source_len, | ||
| int options); |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function returns 0 to indicate both an error condition and when the character at the delimiter position is NUL. While this is unlikely to cause issues in practice (since valid delimiters are '*', '_', or '`'), it creates ambiguity in the API. Consider documenting this limitation more explicitly, or having the function return -1 for errors (though this would require changing the return type or using a different convention).
| void test_delimiters(test_batch_runner *runner) { | ||
| test_emph_asterisk(runner); | ||
| test_emph_underscore(runner); | ||
| test_strong_asterisk(runner); | ||
| test_strong_underscore(runner); | ||
| test_code_single_backtick(runner); | ||
| test_code_double_backtick(runner); | ||
| test_delim_length_convenience(runner); | ||
| test_delim_char_convenience(runner); | ||
| test_null_handling(runner); | ||
| test_nested_emphasis(runner); | ||
| test_multiline(runner); | ||
| test_multiline_code_then_emph(runner); | ||
| test_cr_line_endings(runner); | ||
| test_crlf_line_endings(runner); | ||
| } |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a test case for code spans with more than 2 backticks (e.g., "code") to ensure the backwards counting logic correctly handles longer delimiter sequences. While the logic appears sound, explicit test coverage would increase confidence.
No description provided.