Skip to content

feat(csv): make parseLine the synchronous primitive (refs #3765)#7118

Open
MukundaKatta wants to merge 2 commits intodenoland:mainfrom
MukundaKatta:feat/csv-parse-line-primitive
Open

feat(csv): make parseLine the synchronous primitive (refs #3765)#7118
MukundaKatta wants to merge 2 commits intodenoland:mainfrom
MukundaKatta:feat/csv-parse-line-primitive

Conversation

@MukundaKatta
Copy link
Copy Markdown

Summary

What changed

  • csv/_io.ts: new sync parseLine carries the whole field-parsing state machine (separator, quotes, escapes, lazyQuotes, comment, trim). The existing async parseRecord becomes a small wrapper that pulls more lines from the LineReader and re-calls parseLine until a record completes. Error column tracking maps absolute positions in the joined input back to (line, column) so multi-line quoted records still report the right line.
  • csv/parse.ts: drop Parser.#parseRecord's duplicate field loop; Parser now defers to parseLine from _io.ts. Add the public parseLine export with a clean (line, options) signature.
  • csv/parse_test.ts: 12 new tests pin parseLine behavior (happy path, custom separator, escaped quotes, BOM, trailing newline, multi-line quoted body, lazyQuotes, comment, unclosed-field error).

Test plan

  • All 133 existing parse + parse_stream steps still pass (145 total with the new parseLine tests).
  • Existing error-message assertions (StartLine1, StartLine2, ParseErrorLine, OddQuotes, etc.) preserved with no changes to test expectations.
  • Reviewer to confirm parseLine's public surface matches suggestion: investigate simpler CSV-parsing APIs #3765's spirit and that the (line, options) shape is what was wanted.

cc @bartlomieju — this replaces #7114 with the design you sketched in the review there.

Refactor the CSV parser so a single synchronous parseLine handles all
field-level rules, with parse() (sync) and CsvParseStream (async)
becoming thin line-iteration shells on top of it.

- _io.ts: introduce sync parseLine; rewrite the existing async
  parseRecord as a thin reader.readLine accumulator that delegates to
  parseLine. Error column tracking now resolves through embedded
  newlines so error messages stay correct for multi-line quoted records.
- parse.ts: drop the duplicate field-parsing loop that lived inside
  Parser.#parseRecord; both Parser and the new public parseLine share
  the same primitive. Public parseLine has the simple
  (line, options) -> string[] signature requested in denoland#3765, including
  BOM strip and trailing CR/LF/CRLF normalization.
- parse_test.ts: add 12 parseLine-specific tests covering happy path,
  custom separator, escapes, BOM, trailing newlines, multi-line quoted
  body, lazyQuotes, comment lines, and unclosed-field error.

All 133 existing parse + parse_stream tests still pass; new tests bring
the total to 145.
@github-actions github-actions Bot added the csv label Apr 28, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

❌ Patch coverage is 91.96429% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.61%. Comparing base (cd03740) to head (ccb627d).

Files with missing lines Patch % Lines
csv/parse.ts 82.60% 6 Missing and 2 partials ⚠️
csv/_io.ts 98.48% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7118   +/-   ##
=======================================
  Coverage   94.61%   94.61%           
=======================================
  Files         634      634           
  Lines       51799    51769   -30     
  Branches     9329     9327    -2     
=======================================
- Hits        49009    48982   -27     
+ Misses       2216     2211    -5     
- Partials      574      576    +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

The deno_lint no-unused-vars check flagged the parameter on parseLine
(and the matching one on parseRecord and Parser.#parseRecord) — it was
threaded through but never read inside the function bodies because the
locate() helper computes line offsets from embedded newlines in the
joined fullLine instead.

Removing the param simplifies the call sites without changing behavior:
all 145 parse + parse_stream + parseLine tests still pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant