feat(csv): make parseLine the synchronous primitive (refs #3765) #7118
Open
MukundaKatta wants to merge 2 commits into denoland:main from
Refactor the CSV parser so a single synchronous parseLine handles all field-level rules, with parse() (sync) and CsvParseStream (async) becoming thin line-iteration shells on top of it.

- _io.ts: introduce sync parseLine; rewrite the existing async parseRecord as a thin reader.readLine accumulator that delegates to parseLine. Error column tracking now resolves through embedded newlines so error messages stay correct for multi-line quoted records.
- parse.ts: drop the duplicate field-parsing loop that lived inside Parser.#parseRecord; both Parser and the new public parseLine share the same primitive. Public parseLine has the simple (line, options) -> string[] signature requested in denoland#3765, including BOM strip and trailing CR/LF/CRLF normalization.
- parse_test.ts: add 12 parseLine-specific tests covering happy path, custom separator, escapes, BOM, trailing newlines, multi-line quoted body, lazyQuotes, comment lines, and unclosed-field error.

All 133 existing parse + parse_stream tests still pass; new tests bring the total to 145.
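The layering described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in _io.ts or parse.ts: the names parseLineSketch and parseSketch are hypothetical, quote handling is simplified, and lazyQuotes, comment, trim, and error positions are left out. The point is only that the field rules live in one synchronous function and the multi-line logic is a small loop around it.

```typescript
// Hypothetical sketch; names and behavior are assumptions, not the PR's code.

/** Synchronous primitive: turns one complete record into its fields. */
function parseLineSketch(line: string, separator = ","): string[] {
  const fields: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        field += '"'; // a doubled quote inside quotes is an escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // may be an embedded newline
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === separator) {
      fields.push(field);
      field = "";
    } else {
      field += ch;
    }
  }
  if (inQuotes) throw new SyntaxError("unclosed quoted field");
  fields.push(field);
  return fields;
}

/** Thin shell: joins physical lines until the primitive accepts the record. */
function parseSketch(input: string): string[][] {
  const records: string[][] = [];
  let pending: string | null = null;
  for (const line of input.split(/\r?\n/)) {
    const candidate: string = pending === null ? line : pending + "\n" + line;
    try {
      // Blank lines between records are skipped.
      if (candidate !== "") records.push(parseLineSketch(candidate));
      pending = null;
    } catch (err) {
      if (!(err instanceof SyntaxError)) throw err;
      pending = candidate; // the quoted field continues on the next line
    }
  }
  if (pending !== null) throw new SyntaxError("unclosed quoted field at end of input");
  return records;
}
```

An async shell would look the same, with the loop pulling lines from a reader instead of splitting a string up front.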
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #7118 +/- ##
=======================================
Coverage 94.61% 94.61%
=======================================
Files 634 634
Lines 51799 51769 -30
Branches 9329 9327 -2
=======================================
- Hits 49009 48982 -27
+ Misses 2216 2211 -5
- Partials 574 576 +2
The deno_lint no-unused-vars check flagged an unused parameter on parseLine (and the matching one on parseRecord and Parser.#parseRecord). It was threaded through but never read inside the function bodies, because the locate() helper computes line offsets from embedded newlines in the joined fullLine instead. Removing the param simplifies the call sites without changing behavior: all 145 parse + parse_stream + parseLine tests still pass.
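For readers unfamiliar with the approach, the idea of deriving a position from embedded newlines can be sketched as below. This is an assumption-laden illustration, not the PR's locate() helper: the name locateSketch, its signature, the 0-based line offset, and the 1-based column are all hypothetical choices made for the example.

```typescript
// Hypothetical sketch of mapping an absolute offset in a joined multi-line
// record back to a (line offset, column) pair by counting embedded newlines.
function locateSketch(
  fullLine: string,
  offset: number,
): { lineOffset: number; column: number } {
  let lineOffset = 0; // number of newlines seen before `offset`
  let lineStart = 0; // index of the first character of the current line
  for (let i = 0; i < offset && i < fullLine.length; i++) {
    if (fullLine[i] === "\n") {
      lineOffset++;
      lineStart = i + 1;
    }
  }
  // Column is 1-based in this sketch (an assumption).
  return { lineOffset, column: offset - lineStart + 1 };
}
```

Because the position is recovered from the joined string itself, a caller would only need to add lineOffset to the line number where the record started, which is consistent with the commit's point that no extra parameter has to be passed down.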
Summary

- Make parseLine the actual internal CSV primitive that both parse() and CsvParseStream build on, addressing the design feedback from #7114 (feat(csv): add parseLine() convenience for single-line CSV records, closed) and aligning with the intent of #3765 (suggestion: investigate simpler CSV-parsing APIs).
- Remove the duplicate field-parsing loop from Parser.#parseRecord in parse.ts: both parse() (sync) and the streaming path now share one set of field/quote rules.
- parseLine(line, options) -> string[] is the simple shape #3765 asked for, with BOM strip and trailing CR/LF/CRLF normalization.

What changed
- csv/_io.ts: new sync parseLine carries the whole field-parsing state machine (separator, quotes, escapes, lazyQuotes, comment, trim). The existing async parseRecord becomes a small wrapper that pulls more lines from the LineReader and re-calls parseLine until a record completes. Error column tracking maps absolute positions in the joined input back to (line, column) so multi-line quoted records still report the right line.
- csv/parse.ts: drop Parser.#parseRecord's duplicate field loop; Parser now defers to parseLine from _io.ts. Add the public parseLine export with a clean (line, options) signature.
- csv/parse_test.ts: 12 new tests pin parseLine behavior (happy path, custom separator, escaped quotes, BOM, trailing newline, multi-line quoted body, lazyQuotes, comment, unclosed-field error).

Test plan
- Confirm that parseLine's public surface matches the spirit of #3765 (suggestion: investigate simpler CSV-parsing APIs) and that the (line, options) shape is what was wanted.

cc @bartlomieju: this replaces #7114 with the design you sketched in the review there.
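To make the BOM strip and trailing CR/LF/CRLF normalization mentioned in the summary concrete, here is a minimal sketch. It is an assumption, not the PR's code: normalizeLineSketch is a hypothetical name, and stripping exactly one trailing line terminator is a choice made for this example.

```typescript
// Hypothetical sketch of input normalization before field parsing.
function normalizeLineSketch(line: string): string {
  // Drop a leading UTF-8 byte order mark (U+FEFF) if present.
  const withoutBom = line.charCodeAt(0) === 0xfeff ? line.slice(1) : line;
  // Remove a single trailing CRLF, LF, or CR; inner newlines are untouched.
  return withoutBom.replace(/(\r\n|\n|\r)$/, "");
}
```

With this in place, a caller could pass a raw line read from a file straight to a single-line parser without trimming it first.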