Fix #210: handle percent-suffixed table/cell widths in WmlToHtmlConverter #211
Merged
…rter
convertDocxToHtml threw FormatException ("Format_InvalidStringWithValue,
100%") for any table whose w:tblW or w:tcW used w:type="pct" with a
percent-suffixed value such as w:w="100%" / w:w="50%". This is the form the
`docx` npm library emits for WidthType.PERCENTAGE, and it is valid per the
OOXML ST_TblWidth / ST_MeasurementOrPercent schema, which permits either a
plain integer in fiftieths-of-a-percent OR a "<number>%" string. The
converter cast the w:w attribute straight to int, which throws on "100%";
DXA (twips) widths were unaffected because they are always plain integers.
Width parsing for w:tblW and w:tcW now routes through a single
ParseTblWidthValue helper that tolerates the percent-suffixed form: an
explicit "100%" is treated as a literal percentage, while a bare integer
under pct is still interpreted as fiftieths of a percent (5000 -> 100%).
Non-numeric values are ignored gracefully instead of throwing.
Tests: HcTablePercentageWidthTests (in-memory docx exercising pct string,
pct integer, dxa, and garbage widths). Documents the corner case in
docs/ooxml_corner_cases.md and adds a CHANGELOG entry.
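The failure mode described in this commit message can be reproduced in isolation. The following is a minimal sketch, not code from the converter: `PercentWidthCastDemo` and `IntCastThrows` are hypothetical names, and the attribute is built by hand rather than read from a real `w:tblW` element. It relies only on the explicit `XAttribute` to `int` conversion in `System.Xml.Linq`, which throws `FormatException` when the value is not a valid integer.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical demo class; not part of WmlToHtmlConverter.
public static class PercentWidthCastDemo
{
    // Returns true when the explicit XAttribute -> int conversion
    // throws FormatException for the given raw w:w value.
    public static bool IntCastThrows(string rawWidth)
    {
        // Stand-in for the w:w attribute on w:tblW / w:tcW (namespace omitted).
        var attr = new XAttribute("w", rawWidth);
        try
        {
            _ = (int)attr; // the same kind of cast the converter used
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }
}
```

Under these assumptions, `IntCastThrows("100%")` reports a throw while `IntCastThrows("5000")` does not, which matches the observation that integer and DXA widths were unaffected.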
The previous commit landed the tests, CHANGELOG, and corner-case doc but the WmlToHtmlConverter.cs edits did not take (the file had been touched between read and write), so the actual percentage-width fix was missing and the new HcTablePercentageWidthTests would have failed. This applies the intended change: route w:tblW / w:tcW width parsing through ParseTblWidthValue so a percent-suffixed value such as w:w="100%" no longer throws FormatException.
…nt-width fixtures
The in-memory docx built by CreateDocxWithTableWidth lacked a StyleDefinitionsPart and DocumentSettingsPart, so ConvertToHtml threw ArgumentNullException (FormattingAssembler dereferences StyleDefinitionsPart; CalculateSpanWidthForTabs dereferences DocumentSettingsPart) before any table-width parsing ran. All four HcTablePercentageWidthTests failed in CI for this reason, not because of the width fix. Supply both parts so the tests exercise the actual percent/dxa/integer/garbage width paths.
Summary
Fixes #210.
`convertDocxToHtml` (i.e. `WmlToHtmlConverter`) threw `FormatException` (`Conversion failed: Format_InvalidStringWithValue, 100%`) for any document containing a table whose width (table-level `w:tblW` or cell-level `w:tcW`) was expressed as a percentage with `w:type="pct"` and a percent-suffixed value such as `w:w="100%"` / `w:w="50%"`. DXA (twips) widths converted fine.

Root cause
The `w:w` attribute on `w:tblW` / `w:tcW` has OOXML schema type `ST_TblWidth` (a union over `ST_MeasurementOrPercent` + `ST_DecimalNumber`). Under `w:type="pct"` the value may be expressed in two schema-valid ways:
- `w:w="5000"`: a plain integer in fiftieths of a percent
- `w:w="100%"`: a percent-suffixed string

Microsoft Word writes the integer-fiftieths form. The widely used `docx` JS library (used in the issue's repro) writes the percent-suffixed string form for `WidthType.PERCENTAGE`. Docxodus only handled the integer form: it cast the attribute straight to `int` (`(int)tblW.Attribute(W._w)` / `(int) tcPr...tcW...`), and `(int)"100%"` throws. DXA widths were unaffected because they are always plain integers.

Fix
Width parsing for `w:tblW` and `w:tcW` now routes through a single helper, `ParseTblWidthValue(XAttribute, out bool isExplicitPercent)`, that:
- strips a trailing `%` and reports it via `isExplicitPercent`,
- parses with `decimal.TryParse` (invariant culture) so a non-numeric/garbage value returns `null` and is skipped instead of throwing,
- treats an explicit `"100%"` as a literal percentage, while a bare integer under `pct` is still divided by 50 (fiftieths → percent), preserving existing behavior (`5000` → `100%`).

The three call sites (table `pct`, table `dxa`, cell `dxa`+`pct`) were updated to use it. DXA output (`{n}pt`) and integer-`pct` output are unchanged.

Tests
New `Docxodus.Tests/HtmlConverterTablePercentageWidthTests.cs` (`HcTablePercentageWidthTests`) builds minimal in-memory `.docx` documents (no fixtures on disk) with a 2×2 table and asserts:
- `100%` / `50%` percent-suffixed strings: conversion does not throw (the #210 regression) and the HTML carries `width: 100%` (table) and `width: 50.0%` (cells).
- `5000` / `2500` integer-fiftieths: still yields `width: 100%` / `width: 50.0%`.
- `9000` / `4500` DXA: still yields `width: 450pt` / `width: 225pt` (regression guard).

Docs
- `CHANGELOG.md`: entry under `[Unreleased] → Fixed`.
- `docs/ooxml_corner_cases.md`: documents the percent-suffixed `w:w` corner case with a minimal reproducer and a Word/LibreOffice/Docxodus comparison table.

Notes on similar patterns
Swept the codebase for the same risky cast. `tblInd` already uses a safe `(decimal?)` cast and is DXA-only; `gridCol` / `tcW` reads in `RevisionProcessor.cs` are DXA-only in practice. The user-facing crash was confined to the three `WmlToHtmlConverter` width casts, which now share one tolerant parser.
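
For readers without the diff open, a tolerant parser of the kind described under Fix could look roughly like the sketch below. This is an assumption-laden illustration, not the merged code: only the `ParseTblWidthValue(XAttribute, out bool isExplicitPercent)` signature, the use of `decimal.TryParse` with the invariant culture, and the divide-by-50 rule come from this PR. The `TblWidthSketch` class, the `ToPercent` helper, and the chosen `NumberStyles` flags are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical sketch; the real helper lives in WmlToHtmlConverter.cs and may differ.
public static class TblWidthSketch
{
    // Parses a w:w value, tolerating a trailing '%'.
    // Returns null (instead of throwing) for missing or non-numeric input.
    public static decimal? ParseTblWidthValue(XAttribute attr, out bool isExplicitPercent)
    {
        isExplicitPercent = false;
        if (attr == null) return null;

        string raw = attr.Value.Trim();
        if (raw.EndsWith("%", StringComparison.Ordinal))
        {
            isExplicitPercent = true;
            raw = raw.Substring(0, raw.Length - 1).Trim();
        }

        // Assumed number styles: optional sign and decimal point only.
        const NumberStyles styles = NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint;
        return decimal.TryParse(raw, styles, CultureInfo.InvariantCulture, out decimal value)
            ? value
            : (decimal?)null;
    }

    // Hypothetical caller for a w:type="pct" width: an explicit "100%" is a
    // literal percentage, a bare integer is fiftieths of a percent.
    public static decimal? ToPercent(XAttribute attr)
    {
        decimal? value = ParseTblWidthValue(attr, out bool isExplicitPercent);
        if (value == null) return null;
        return isExplicitPercent ? value : value / 50m;
    }
}
```

With this sketch, `ToPercent` maps both `w:w="100%"` and `w:w="5000"` to 100, and returns `null` for a garbage value so the caller can skip emitting a width.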