Skip to content

Commit 6085226

Browse files
committed
Add range checks for narrowing integer casts in xml::Reader
1 parent 25179bc commit 6085226

3 files changed

Lines changed: 429 additions & 8 deletions

File tree

MR_DESCRIPTION.md

Lines changed: 385 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,385 @@
1+
# Fix 30 bugs found during code review
2+
3+
## Summary
4+
5+
This MR fixes 30 bugs across core types, JSON, XML, TOML, MsgPack, BSON, Avro, Cap'n Proto, and CLI modules. Each bug has a regression test that fails before the fix and passes after.
6+
7+
**Stats:** 39 files changed, 1197 insertions, 52 deletions.
8+
9+
---
10+
11+
## Discussion: Breaking API Changes
12+
13+
The following 3 fixes change public API behavior. They are correct from a safety standpoint, but may break downstream code. Please review whether a deprecation period or alternative approach is preferred.
14+
15+
### 1. `Validator::get()`/`value()` return mutable reference
16+
17+
> **Breaking change:** Removes non-const overloads of `get()`, `value()`, `operator*()`, `operator()()`.
18+
> Code like `validator.get() = new_value` will no longer compile.
19+
>
20+
> **Conflicts with API convention:** PR #613 established that wrapper types should provide `operator*`, `.value()`, `.get()`, and `operator()`. This fix removes the non-const variants to prevent mutation that bypasses validation. An alternative would be to keep non-const overloads but route them through re-validation (i.e. a validating setter).
21+
22+
**File:** `include/rfl/Validator.hpp:98-99, 114-115`
23+
24+
**Problem:** Non-const `get()`, `value()`, `operator*()`, `operator()()` return `T&`, allowing direct write that bypasses validation.
25+
26+
**Test:** `tests/generic/test_validator.cpp``validator_get_does_not_allow_invalid_mutation`. Uses `std::is_assignable_v` to verify `get()` returns `const T&`.
27+
28+
**Fix:** Removed all non-const overloads (`get()`, `value()`, `operator*()`, `operator()()`).
29+
30+
---
31+
32+
### 2. `Literal(const std::string&)` is implicit
33+
34+
> **Breaking change:** Adds `explicit` to the string constructor.
35+
> Code like `rfl::Literal<"a","b"> lit = my_string;` will no longer compile; must use `rfl::Literal<"a","b">(my_string)` instead.
36+
37+
**File:** `include/rfl/Literal.hpp:46`
38+
39+
**Problem:** Non-explicit constructor allows implicit `std::string``Literal` conversion, which throws on invalid input. Surprising for users.
40+
41+
**Test:** `tests/generic/test_literal.cpp``literal_constructor_is_explicit`. Uses `std::is_convertible_v` to verify implicit conversion is disabled.
42+
43+
**Fix:** Added `explicit` keyword.
44+
45+
---
46+
47+
### 3. `TaggedUnion` — typo `discrimininator_`
48+
49+
> **Breaking change:** Renames the public static member `discrimininator_``discriminator_`.
50+
> Code referencing `TaggedUnion<...>::discrimininator_` will no longer compile.
51+
> A `[[deprecated]]` alias could be added to ease migration if desired.
52+
53+
**File:** `include/rfl/TaggedUnion.hpp:15`
54+
55+
**Problem:** Extra 'n' — `discrimininator_` instead of `discriminator_`.
56+
57+
**Test:** `tests/generic/test_tagged_union.cpp``tagged_union_discriminator_spelling`. Uses concepts to check member name existence.
58+
59+
**Fix:** Renamed to `discriminator_`. Also fixed the typo in the error message string literal `"discrimininator"``"discriminator"` in `Parser_tagged_union.hpp:151`.
60+
61+
---
62+
63+
## Bugs Fixed
64+
65+
### 4. `Tuple::operator<=>` always returns `equivalent`
66+
67+
**File:** `include/rfl/Tuple.hpp:107`
68+
69+
**Problem:** The condition `_ordering != equivalent` is checked on first iteration where `_ordering` is initialized to `equivalent` — the body never executes. The operator returns `equivalent` for any two tuples.
70+
71+
**Test:** `tests/generic/test_tuple.cpp``tuple_spaceship_not_always_equivalent`, `tuple_spaceship_less`. Verifies `Tuple<int>(1) <=> Tuple<int>(2)` returns `less`, not `equivalent`.
72+
73+
**Fix:** Changed `!=` to `==` so the comparison body executes when ordering is still undetermined.
74+
75+
---
76+
77+
### 5. `Result::operator=(const Result<U>&)` does not update `success_`
78+
79+
**File:** `include/rfl/Result.hpp:210-216`
80+
81+
**Problem:** The converting assignment copies `t_or_err_` but never updates `success_`. After assigning an error `Result<U>` into a value `Result<T>`, `success_` remains `true``.value()` reads garbage.
82+
83+
**Test:** `tests/generic/test_result.cpp``result_cross_type_assign_error_clears_success`. Assigns error `Result<long>` to value `Result<int>`, checks the target is falsy.
84+
85+
**Fix:** Replaced raw union copy with proper `destroy()` + `success_` update + `move_from_other()`.
86+
87+
---
88+
89+
### 6. `Validator::operator=(U&&)` is `noexcept` but throws
90+
91+
**File:** `include/rfl/Validator.hpp:77`
92+
93+
**Problem:** The operator calls `.value()` which throws `std::runtime_error` on validation failure. Marked `noexcept` — any validation error calls `std::terminate`.
94+
95+
**Test:** `tests/generic/test_validator.cpp``validator_assign_rvalue_operator_is_noexcept_but_can_throw`. Uses `std::is_nothrow_assignable_v` to verify the operator is no longer `noexcept`.
96+
97+
**Fix:** Removed `noexcept` specifier.
98+
99+
---
100+
101+
### 7. `Generic::to_int()` silently truncates `int64_t` to `int`
102+
103+
**File:** `include/rfl/Generic.hpp:175-188`
104+
105+
**Problem:** `static_cast<int>(_v)` with no range check. Values outside `[INT_MIN, INT_MAX]` are silently truncated.
106+
107+
**Test:** `tests/generic/test_generic.cpp``generic_to_int_rejects_overflow`, `generic_to_int_rejects_underflow`. Passes `INT_MAX+1` and `INT_MIN-1`, expects failure.
108+
109+
**Fix:** Added range check before the cast, returning an error for out-of-range values.
110+
111+
---
112+
113+
### 8. `Box`/`Ref` `operator<=>` compares pointer addresses
114+
115+
**Files:** `include/rfl/Box.hpp:130`, `include/rfl/Ref.hpp:118`
116+
117+
**Problem:** `_b1.ptr() <=> _b2.ptr()` compares memory addresses instead of pointed-to values. Two `Box<int>(42)` with different allocations compare as unequal.
118+
119+
**Test:** `tests/generic/test_box.cpp``box_spaceship_compares_values_not_pointers`. `tests/generic/test_ref.cpp``ref_spaceship_compares_values_not_pointers`.
120+
121+
**Fix:** Changed to `*_b1 <=> *_b2` (dereference before compare).
122+
123+
---
124+
125+
### 9. `Field`/`Flatten`/`Skip` cross-type move constructor copies instead of moving
126+
127+
**Files:** `include/rfl/Field.hpp:37`, `include/rfl/Flatten.hpp:32`, `include/rfl/internal/Skip.hpp:43`
128+
129+
**Problem:** `value_(_field.get())``get()` returns lvalue reference, so the copy constructor is called instead of move.
130+
131+
**Test:** `tests/generic/test_field.cpp``field_cross_type_move_does_not_copy`. `tests/generic/test_flatten.cpp``flatten_cross_type_move_does_not_copy`. `tests/generic/test_skip.cpp``skip_cross_type_move_does_not_copy`. Use `MoveTracker` types that count copies vs moves.
132+
133+
**Fix:** Changed to `value_(std::move(_field.get()))` in all three types.
134+
135+
---
136+
137+
### 10. `Result::value_or(T&&)` returns dangling rvalue reference
138+
139+
**File:** `include/rfl/Result.hpp:295`
140+
141+
**Problem:** Returns `T&&` instead of `T`. If the default argument is a temporary, the caller gets a dangling reference.
142+
143+
**Test:** `tests/generic/test_result.cpp``result_value_or_returns_value_not_rvalue_ref`. Checks return type is `T`, not `T&&`.
144+
145+
**Fix:** Changed return type from `T&&` to `T`.
146+
147+
---
148+
149+
### 11. `Result::error() &&` returns `Error&` instead of `Error&&`
150+
151+
**File:** `include/rfl/Result.hpp:335`
152+
153+
**Problem:** The rvalue-qualified overload returns `Error&` but `get_err() &&` returns `Error&&`. Prevents move semantics on rvalue Results.
154+
155+
**Test:** `tests/generic/test_result.cpp``result_error_rvalue_qualified_returns_rvalue_ref`. Checks return type is `Error&&`.
156+
157+
**Fix:** Changed return type from `Error&` to `Error&&`.
158+
159+
---
160+
161+
### 12. `Object::difference_type` is unsigned
162+
163+
**File:** `include/rfl/Object.hpp:29`
164+
165+
**Problem:** `using difference_type = typename DataType::size_type``size_type` is unsigned, but the C++ standard requires `difference_type` to be signed.
166+
167+
**Test:** `tests/generic/test_object.cpp``object_difference_type_is_signed`. Uses `std::is_signed_v`.
168+
169+
**Fix:** Changed to `typename DataType::difference_type`.
170+
171+
---
172+
173+
### 13. `transform_camel_case` prepends `_` to names starting with uppercase
174+
175+
**File:** `include/rfl/internal/transform_case.hpp:60-73`
176+
177+
**Problem:** When `_i == 0` and the first character is uppercase, the function unconditionally prepends `'_'`. `"MyField"` becomes `"_my_field"` instead of `"my_field"`.
178+
179+
**Test:** `tests/json/test_regression_bugs.cpp``camel_case_to_snake_case_no_leading_underscore`. Serializes a struct with `CamelCaseToSnakeCase` and checks output.
180+
181+
**Fix:** Added `sizeof...(chars) > 0` guard so `'_'` is only inserted between characters, not at the beginning.
182+
183+
---
184+
185+
### 14. `Timestamp::to_time_t()` double-subtracts timezone offset
186+
187+
**File:** `include/rfl/Timestamp.hpp:120`
188+
189+
**Problem:** `timegm(&tm) - tm_.tm_gmtoff``timegm()` already returns UTC. Subtracting `tm_gmtoff` applies the timezone correction twice. Non-Windows only.
190+
191+
**Test:** `tests/json/test_regression_bugs.cpp``timestamp_to_time_t_no_double_timezone_correction`. Sets `tm_gmtoff = 3600` and verifies `to_time_t()` returns the correct UTC epoch.
192+
193+
**Fix:** Removed `- tm_.tm_gmtoff`.
194+
195+
---
196+
197+
### 15. `FieldVariantParser` — wrong error message
198+
199+
**File:** `include/rfl/parsing/FieldVariantParser.hpp:47-50`
200+
201+
**Problem:** When zero matching fields are found, the error says "found more than one" instead of "found none".
202+
203+
**Test:** `tests/json/test_regression_bugs.cpp``field_variant_empty_object_error_message`. Parses `"{}"` and checks the error message does not contain "more than one".
204+
205+
**Fix:** Changed message to "found none".
206+
207+
---
208+
209+
### 16. `json::Writer` reads past `string_view` bounds
210+
211+
**File:** `src/rfl/json/Writer.cpp:68, :91, :113`
212+
213+
**Problem:** `yyjson_mut_strcpy(_name.data())` reads until `\0`, but `string_view::data()` is not null-terminated. Reads past buffer for substrings.
214+
215+
**Test:** `tests/json/test_regression_bugs.cpp``json_writer_handles_non_null_terminated_field_names`. Creates a `string_view` substring and verifies the JSON key matches the view's length.
216+
217+
**Fix:** Changed all three call sites from `yyjson_mut_strcpy` to `yyjson_mut_strncpy` with explicit size.
218+
219+
---
220+
221+
### 17. `cli::Reader``stoull` accepts negative numbers for unsigned types
222+
223+
**File:** `include/rfl/cli/Reader.hpp:77`
224+
225+
**Problem:** `std::stoull("-1")` wraps to `ULLONG_MAX` without error. `static_cast<uint16_t>` wraps again to 65535.
226+
227+
**Test:** `tests/cli/test_regression_bugs.cpp``cli_rejects_negative_for_unsigned`. Passes `--port=-1` and expects failure.
228+
229+
**Fix:** Added explicit check for leading `-` character, returning an error for negative input.
230+
231+
---
232+
233+
### 18. `cli::Reader` — no range check on unsigned narrowing cast
234+
235+
**File:** `include/rfl/cli/Reader.hpp:77`
236+
237+
**Problem:** `static_cast<T>(stoull(str))` silently truncates. `stoull("99999")` for `uint16_t` produces 34463.
238+
239+
**Test:** `tests/cli/test_regression_bugs.cpp``cli_rejects_out_of_range_for_narrow_type`. Passes `--port=99999` for `uint16_t` and expects failure.
240+
241+
**Fix:** Added `std::numeric_limits<T>::max()` bounds check before the cast.
242+
243+
---
244+
245+
### 19. `cli::Reader` — no range check on signed narrowing cast
246+
247+
**File:** `include/rfl/cli/Reader.hpp:101-102`
248+
249+
**Problem:** `static_cast<T>(stoll(str))` silently truncates for `int8_t`/`int16_t`/`int32_t`. `stoll("200")` cast to `int8_t` wraps to -56.
250+
251+
**Test:** `tests/cli/test_regression_bugs.cpp``cli_rejects_out_of_range_for_signed_narrow_type`, `cli_rejects_large_negative_for_signed_narrow_type`. Passes `--level=200` and `--level=-200` for `int8_t` and expects failure.
252+
253+
**Fix:** Added `std::numeric_limits<T>::min()`/`max()` bounds check before the cast.
254+
255+
---
256+
257+
### 20. `cli::Reader``std::stod` is locale-dependent
258+
259+
**File:** `include/rfl/cli/Reader.hpp:64`
260+
261+
**Problem:** `std::stod` uses the C locale. In locales with comma decimal separator (e.g. `de_DE`), `"3.14"` parses as `3.0`.
262+
263+
**Test:** `tests/cli/test_regression_bugs.cpp``cli_float_parsing_ignores_locale`. Sets `de_DE` locale, parses `--rate=3.14`, checks result is `3.14`.
264+
265+
**Fix:** Replaced `std::stod` with `std::from_chars` which is locale-independent.
266+
267+
---
268+
269+
### 21. `capnproto::read(istream, schema)` — infinite recursion
270+
271+
**File:** `include/rfl/capnproto/read.hpp:79-82`
272+
273+
**Problem:** The overload `read(std::istream&, const Schema<T>&)` calls `read<T, Ps...>(_stream, _schema)` — resolves to itself. Stack overflow.
274+
275+
**Test:** `tests/capnproto/compile_fail/capnproto_read_istream_schema.cpp` — instantiates the overload. Before the fix, GCC rejects it as "use of auto before deduction" (infinite recursion detected at compile time). After the fix, the file compiles successfully, verifying the overload resolves correctly.
276+
277+
**Fix:** Read bytes from stream into a vector, then call `read(bytes.data(), bytes.size(), _schema)`.
278+
279+
---
280+
281+
### 22. `avro::SchemaImpl` move-assignment — double-decrement / use-after-free
282+
283+
**File:** `src/rfl/avro/SchemaImpl.cpp:34-42`
284+
285+
**Problem:** Move-assignment neither releases the old `iface_` (leak) nor nullifies `_other.iface_` (double-decrement when `_other`'s destructor runs).
286+
287+
**Test:** `tests/avro/test_regression_bugs.cpp``schema_impl_move_assignment_no_double_free`. Move-assigns two `SchemaImpl` objects, verifies source `iface_` is null after move.
288+
289+
**Fix:** Added `avro_value_iface_decref`/`avro_schema_decref` for the old iface before reassignment, and `_other.iface_ = nullptr` after.
290+
291+
---
292+
293+
### 23. `xml::Reader` uses `stoi` for all integer types
294+
295+
**File:** `include/rfl/xml/Reader.hpp:99-105`
296+
297+
**Problem:** `std::stoi` returns `int` (32-bit). For `int64_t`/`uint64_t` fields, values beyond `INT_MAX` throw `out_of_range` or are silently truncated. Additionally, `static_cast<T>` after `stoll`/`stoull` silently truncates for narrow types like `int16_t`/`uint16_t`.
298+
299+
**Test:** `tests/xml/test_regression_bugs.cpp``read_int64_above_int32_max`, `read_int64_large_negative`, `read_uint64_large_value`, `read_rejects_out_of_range_for_int16`, `read_rejects_negative_for_uint16`, `read_rejects_large_negative_for_int16`.
300+
301+
**Fix:** Replaced with `std::from_chars` which parses directly into the target type, handling overflow, negative-for-unsigned, and narrowing without manual checks.
302+
303+
---
304+
305+
### 24. `toml::Writer``at_path()` dereferences as path, crashes on dotted keys
306+
307+
**File:** `src/rfl/toml/Writer.cpp:26, :40`
308+
309+
**Problem:** After `emplace("a.b", ...)`, `at_path("a.b")` interprets the dot as a path separator and looks for `table["a"]["b"]` which doesn't exist — returns invalid node — `as_table()`/`as_array()` returns nullptr.
310+
311+
**Test:** `tests/toml/test_regression_bugs.cpp``field_name_with_dot_does_not_crash`, `array_field_name_with_dot_does_not_crash`. Use `rfl::Rename<"a.b", ...>` to create dotted field names.
312+
313+
**Fix:** Replaced `at_path(_name)` with `operator[](_name)` which does literal key lookup.
314+
315+
---
316+
317+
### 25. `msgpack::read` ignores `msgpack_unpack` return value
318+
319+
**File:** `include/rfl/msgpack/read.hpp:35-36`
320+
321+
**Problem:** If `msgpack_unpack` fails, the return value is not checked. `deserialized` remains uninitialized — garbage is parsed, producing misleading downstream errors like "Could not cast to a map".
322+
323+
**Test:** `tests/msgpack/test_regression_bugs.cpp``read_single_invalid_byte_returns_meaningful_error`, `read_truncated_data_returns_meaningful_error`. Pass invalid/truncated data and verify the error doesn't mention "cast".
324+
325+
**Fix:** Check return value; return `"Failed to unpack msgpack data."` error on failure. Accept both `MSGPACK_UNPACK_SUCCESS` and `MSGPACK_UNPACK_EXTRA_BYTES`.
326+
327+
---
328+
329+
### 26. `msgpack::Writer` stores `uint64_t` as signed `int64`
330+
331+
**File:** `include/rfl/msgpack/Writer.hpp:133-134`
332+
333+
**Problem:** All integers are packed via `msgpack_pack_int64(static_cast<int64_t>(...))`. For `uint64_t` values above `INT64_MAX`, the cast wraps to negative — msgpack stores them as `NEGATIVE_INTEGER`.
334+
335+
**Test:** `tests/msgpack/test_regression_bugs.cpp``uint64_above_int64_max_stored_as_positive`, `uint64_max_stored_as_positive`. Verify raw msgpack wire type is `POSITIVE_INTEGER`.
336+
337+
**Fix:** Added a separate `std::is_unsigned` branch that calls `msgpack_pack_uint64`.
338+
339+
---
340+
341+
### 27. `bson::Reader` rejects integer BSON values for float fields
342+
343+
**File:** `include/rfl/bson/Reader.hpp:127-132`
344+
345+
**Problem:** The float branch checks `btype != BSON_TYPE_DOUBLE` and rejects everything else — even though the error message claims `int32`/`int64`/`date_time` are valid.
346+
347+
**Test:** `tests/bson/test_regression_bugs.cpp``read_int64_into_double_field`, `read_large_int64_into_double_field`. Write an int struct, read it back as a double struct.
348+
349+
**Fix:** Changed to a `switch` accepting `BSON_TYPE_DOUBLE`, `INT32`, `INT64`, and `DATE_TIME`.
350+
351+
---
352+
353+
### 28. `delete`/`delete[]` mismatch in test
354+
355+
**File:** `tests/json/test_pointer_fields.cpp`
356+
357+
**Problem:** A raw pointer field allocated with `new[]` was freed with `delete` instead of `delete[]`.
358+
359+
**Test:** Pre-existing test; fix prevents undefined behavior under sanitizers.
360+
361+
**Fix:** Changed `delete` to `delete[]`.
362+
363+
---
364+
365+
### 29. `Flatten(U&&)` universal reference constructor copies instead of forwarding
366+
367+
**File:** `include/rfl/Flatten.hpp:40`
368+
369+
**Problem:** `value_(_value)``_value` is a named parameter (lvalue), so the copy constructor is always called even when an rvalue is passed. Should be `value_(std::forward<U>(_value))`.
370+
371+
**Test:** `tests/generic/test_flatten.cpp``flatten_universal_ref_ctor_forwards_rvalue`. Constructs `Flatten<FlatBase>` from `FlatDerived&&` and verifies rvalue conversion is used.
372+
373+
**Fix:** Changed to `value_(std::forward<U>(_value))`.
374+
375+
---
376+
377+
### 30. `Flatten::operator=(Flatten<U>&&)` forwards `Flatten<U>` instead of `U`
378+
379+
**File:** `include/rfl/Flatten.hpp:103-105`
380+
381+
**Problem:** `std::forward<U>(_f)` forwards the `Flatten<U>` object, not `U`. Assignment `value_ = Flatten<U>` looks for implicit conversion which uses `get()` (lvalue ref) — copy instead of move. Causes hard compile error for types without `Flatten<U>``Type` conversion.
382+
383+
**Test:** `tests/generic/compile_fail/flatten_cross_type_move_assign.cpp` — compile test that instantiates the cross-type move assignment. Verifies compilation succeeds after the fix.
384+
385+
**Fix:** Changed to `value_ = std::move(_f.get())`.

0 commit comments

Comments
 (0)