62 changes: 8 additions & 54 deletions docs/followups.md
@@ -7,8 +7,12 @@ IN-PROGRESS / DONE.

These are not bugs that produce wrong output today; they are seams worth
closing when the surrounding code is next touched. Resolved items are moved to
`docs/old/old-followups.md` once they ship (most recently the device codegen-
quality gap vs `nvcc` — predication, FMA, alignment — which was item 2 here).
`docs/old/old-followups.md` once they ship. Most recently moved: the MAXWORD32 /
MAXWORD64 parity constants (item 5 here), where the wide unsigned types now
predeclare `MAXWORD32` / `MAXWORD64` alongside `MAXINT32` / `MAXINT64`, gated on
`wide-integers`; and before that the wide same-width WORD/INTEGER signedness mix
(item 6), where `_check_word_int_mix` now covers `WORD32`/`INTEGER32` and
`WORD64`/`INTEGER64` at equal rank under the same `strict-word-int` discipline.


---
@@ -51,7 +55,7 @@ variant-record long-form `NEW` behavior.

---

## 3. WORD/INTEGER constant exemption: fold constant expressions [OPEN]
## 2. WORD/INTEGER constant exemption: fold constant expressions [OPEN]

**Where.** `type_checker.py::_is_constant_integer_expr` (consulted by
`_check_word_int_assign` and `_check_word_int_mix`).
@@ -81,7 +85,7 @@ confirm `tests/test_word_int_strictness.py` still rejects genuine variables.

---

## 4. ODD(WORD) is rejected but should be accepted [OPEN]
## 3. ODD(WORD) is rejected but should be accepted [OPEN]

**Where.** `builtins_registry.py` registers `ODD` as `FunctionType('ODD',
[('n', INTEGER_TYPE)], BOOLEAN_TYPE)`; the argument check rejects a WORD actual.
@@ -103,53 +107,3 @@ signedness-independent.
**How to verify.** Flip `TestManualKnownGaps::test_odd_accepts_word_is_a_known_gap`
to assert ACCEPT (and add a build-and-run parity check for `ODD(WORD)` vs
`ODD(INTEGER)`).

---

## 5. MAXWORD32 / MAXWORD64 parity constants [OPEN]

**Where.** `builtins_registry.py` (constant registration) and
`codegen/base.py` (`self.constants` seeding), alongside `MAXINT32`/`MAXINT64`.

**What.** The wide *signed* types ship with `MAXINT32`/`MAXINT64`, but the new
wide *unsigned* types `WORD32`/`WORD64` do not yet have `MAXWORD32`
(`4294967295`) / `MAXWORD64` (`18446744073709551615`) predeclared constants.

**Why it matters.** Minor parity gap only. The types are fully usable without
them (literals, variables, widening, arithmetic, and unsigned `WRITE` all work);
this is a convenience constant, deferred to avoid the unsigned-constant width
selection in the codegen const path (`MAXWORD64` needs an i64 whose value exceeds
the signed i64 max).

**Suggested resolution.** Seed `MAXWORD32`/`MAXWORD64` in both the checker
registry (gated on `wide-integers`, as `MAXINT32`/`MAXINT64` are) and the codegen
constants, verifying `_const_ir` emits them at i32/i64 with the correct unsigned
bit pattern (follow how `MAXWORD = 65535` is already emitted as an i16).

**How to verify.** Add a build-and-run check that `WRITELN(MAXWORD32)` prints
`4294967295` and `WRITELN(MAXWORD64)` prints `18446744073709551615`.

---

## 6. WORD32/INTEGER32 (same-width) signedness mix is undiagnosed [OPEN]

**Where.** `type_checker.py::_check_word_int_mix` (fires only for the 16-bit
`WORD`/`INTEGER` pair); `type_system.py::binary_op_result_type` resolves a
same-width unsigned/signed mix to the unsigned type.

**What.** The vintage WORD/INTEGER (16-bit) expression mix warns (and errors
under `-f strict-word-int`). The analogous *wide* same-width mixes
(`WORD32`/`INTEGER32`, `WORD64`/`INTEGER64`) silently resolve to the unsigned
type with no diagnostic.

**Why it matters.** These are extension types outside the 1981 manual, so there
is no vintage rule to conform to; leaving them undiagnosed is a deliberate, safe
default. But a user who opted into `-f strict-word-int` might reasonably expect
the same signedness discipline at all widths.

**Suggested resolution.** If desired, generalize `_check_word_int_mix` to the
full WORD-family/INTEGER-family at equal rank, keeping the INTEGER-constant
exemption, behind the existing `strict-word-int` flag.

**How to verify.** Add matrix rows for `WORD32 + INTEGER32 variable` asserting a
warning by default and an error under `strict-word-int`.
84 changes: 84 additions & 0 deletions docs/old/old-followups.md
@@ -369,3 +369,87 @@ already rejects an addrspace mismatch), and it is now hardened to raise loudly
under `is_device_module` instead of silently dropping a segment — implementing
this item's "treat the seg→flat path as a type error in device context"
resolution. Covered by `tests/test_device_ads_no_segment.py`.

---

## 6. WORD32/INTEGER32 (same-width) signedness mix is undiagnosed [DONE]

**Where.** `type_checker.py::_check_word_int_mix` (the same-width unsigned/signed
diagnostic) and `type_system.py::binary_op_result_type` (which resolves a
same-width mix to the unsigned type).

**What.** The vintage WORD/INTEGER (16-bit) expression mix warned (and errored
under `-f strict-word-int`), but the analogous *wide* same-width mixes
(`WORD32`/`INTEGER32`, `WORD64`/`INTEGER64`) silently resolved to the unsigned
type with no diagnostic — even under `strict-word-int`. The check was hard-wired
to the rank-0 pair (`a_t == WORD_TYPE and b_t == INTEGER_TYPE`), so it never fired
for the wide extension types.

**Why it mattered.** The wide types are extensions outside the 1981 manual, so
leaving them undiagnosed was a safe default rather than a wrong result. But a
same-width unsigned/signed mix carries the identical "which signedness does the
arithmetic use?" ambiguity at every width, and a user who opted into
`-f strict-word-int` could reasonably expect the same signedness discipline
across the whole integer family.

**Resolution.** `_check_word_int_mix` now generalizes to the full
WORD-family/INTEGER-family at **equal rank** (WORD/INTEGER, WORD32/INTEGER32,
WORD64/INTEGER64) via small `_WORD_FAMILY_BY_RANK`/`_INT_FAMILY_BY_RANK` tuples. The
behavior is uniform across widths: a warning by default, a hard error under
`-f strict-word-int`, and the INTEGER-constant exemption preserved at every
width. Unequal-width mixes are deliberately **not** flagged — there the wider
operand's signedness wins unambiguously (e.g. `WORD(16) + INTEGER32 ->
INTEGER32`), so there is no coin-flip to warn about. The 16-bit behavior is
byte-for-byte unchanged. The stale "wide-type mixes are not diagnosed" comment in
`binary_op_result_type` was corrected.
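
The equal-rank rule described above can be sketched in isolation. This is a
minimal illustration, not the checker's code: the string type names stand in for
the real type singletons, `family_rank` and `is_flagged_mix` are hypothetical
helper names, and the INTEGER-constant exemption and the warning/error split are
left out.

```python
from typing import Optional, Sequence

# Hypothetical stand-ins for the checker's type singletons, ordered by rank
# (rank 0 = 16-bit, rank 1 = 32-bit, rank 2 = 64-bit).
WORD_FAMILY = ("WORD", "WORD32", "WORD64")
INT_FAMILY = ("INTEGER", "INTEGER32", "INTEGER64")


def family_rank(type_name: str, family: Sequence[str]) -> Optional[int]:
    """Return the rank of type_name within family, or None if not a member."""
    for rank, member in enumerate(family):
        if type_name == member:
            return rank
    return None


def is_flagged_mix(left: str, right: str) -> bool:
    """True when one operand is unsigned and the other signed at equal rank."""
    # Try both operand orders so WORD32 + INTEGER32 and INTEGER32 + WORD32
    # are treated the same way.
    for unsigned, signed in ((left, right), (right, left)):
        u_rank = family_rank(unsigned, WORD_FAMILY)
        s_rank = family_rank(signed, INT_FAMILY)
        if u_rank is not None and u_rank == s_rank:
            return True
    return False


print(is_flagged_mix("WORD32", "INTEGER32"))  # same width: flagged
print(is_flagged_mix("WORD", "INTEGER32"))    # unequal width: not flagged
```

Comparing `u_rank is not None` first matters: without it, two non-members would
both yield `None` and compare equal.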

**How verified.** New rows in `tests/test_conversion_matrix.py`
(`word32_plus_int32_var_default` ACCEPT-with-warning, `word32_plus_int32_var_strict`
REJECT, `word64_plus_int64_var_strict` REJECT, the constant-exemption row, and an
unequal-width clean row) plus a new `TestWideSameWidthMix` class in
`tests/test_word_int_strictness.py` (warns by default, errors under strict, holds
the constant exemption, leaves unequal-width mixes clean). The existing 16-bit
WORD/INTEGER strictness tests remain green, confirming no regression.

---

## 5. MAXWORD32 / MAXWORD64 parity constants [DONE]

**Where.** `builtins_registry.py` (constant registration), `codegen/base.py`
(`self.constants` / `self.constant_types` seeding), `codegen/constfold.py`
(`_const_ir` width selection), and `codegen/io_write_read.py` (`_pas_type`, the
WRITE signed/unsigned format selector).

**What.** The wide *signed* types shipped with `MAXINT32`/`MAXINT64`, but the wide
*unsigned* types `WORD32`/`WORD64` had no `MAXWORD32` (`4294967295`) /
`MAXWORD64` (`18446744073709551615`) predeclared constants. They are now seeded
on both the checker and codegen sides, gated on `wide-integers` exactly like
`MAXINT32`/`MAXINT64`, and carry full `WORD32`/`WORD64` type identity.

**Why it mattered.** Minor parity gap only — the types were already fully usable
without them. The deferral was about the unsigned-constant width selection in the
codegen const path: `MAXWORD64` is `2**64-1`, which exceeds the signed i64 max,
so it cannot fall through to the i32 default in `_const_ir`.
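
The width arithmetic behind that deferral can be checked directly. The snippet
below is plain Python integer arithmetic, independent of the compiler; the
constant names mirror the Pascal ones only for readability, and
`bits_needed_unsigned` is a hypothetical helper.

```python
I32_MAX = 2**31 - 1   # largest signed 32-bit value
I64_MAX = 2**63 - 1   # largest signed 64-bit value
MAXWORD32 = 2**32 - 1
MAXWORD64 = 2**64 - 1


def bits_needed_unsigned(value: int) -> int:
    """Smallest unsigned width, in bits, that holds a non-negative value."""
    return max(1, value.bit_length())


# MAXWORD32 needs all 32 bits and MAXWORD64 all 64, so neither fits the
# signed range of its own width, and MAXWORD64 does not fit in 32 bits at all.
print(bits_needed_unsigned(MAXWORD32), MAXWORD32 > I32_MAX)
print(bits_needed_unsigned(MAXWORD64), MAXWORD64 > I64_MAX)
```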

**Resolution.** `_is...`/registration mirrors `MAXINT32`/`MAXINT64`:
`builtins_registry.py` registers both constants under `wide-integers` (with the
WORD32/WORD64 types now imported), and `codegen/base.py` seeds their magnitudes
into `self.constants`. `_const_ir` emits `MAXWORD64` at i64 alongside `MAXINT64`
(the all-ones bit pattern); `MAXWORD32` emits at the i32 default, which already
held its value. One step beyond the original touchpoints was required: WRITE
picks signed vs unsigned formatting from the argument's Pascal type via
`_pas_type`, and builtin constants are not seeded into the codegen scope, so
`_pas_type` returned `None` and both constants formatted signed (printing `-1`).
A small `self.constant_types` tag map (seeded alongside `self.constants`, gated
identically) now lets `_pas_type` recover the `WORD32`/`WORD64` tag so the wide
unsigned max constants print unsigned. (`MAXWORD` only ever printed correctly by
luck — `65535` fits in a positive signed i32 — which is why the high-bit-set wide
constants exposed the gap.)
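
The "printed `-1`" symptom follows from reading an all-ones bit pattern as a
two's-complement signed value. A small sketch of that reinterpretation, using a
hypothetical `as_signed` helper rather than anything from the WRITE path:

```python
def as_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    # If the high bit is set, the signed reading is value - 2**bits.
    return value - (1 << bits) if value & sign_bit else value


print(as_signed(65535, 32))        # MAXWORD in an i32: high bit clear, unchanged
print(as_signed(2**32 - 1, 32))    # MAXWORD32 read as signed i32
print(as_signed(2**64 - 1, 64))    # MAXWORD64 read as signed i64
```

Only values with the high bit of the emitted width set change under the signed
reading, which is why the 16-bit maximum stored in a 32-bit integer was
unaffected.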

**How verified.** New `TestWideMaxConstants` (gating + WORD32/WORD64 type
identity: same-type ACCEPT, WORD32->WORD64 widen ACCEPT, INTEGER assign REJECT,
WORD64->WORD32 narrow REJECT) and `TestWideMaxConstantsRun` (build-and-run:
`WRITELN(MAXWORD32)` prints `4294967295`, `WRITELN(MAXWORD64)` prints
`18446744073709551615`, and a round-trip through WORD32/WORD64 variables) in
`tests/test_wide_unsigned_types.py`. Full suite green: `971 passed, 1 skipped,
115 subtests passed`.
7 changes: 6 additions & 1 deletion src/pascal1981/builtins_registry.py
@@ -7,7 +7,7 @@

from .symbol_table import Symbol
from .features import is_extended
from .type_system import (BOOLEAN_TYPE, CHAR_TYPE, INTEGER32_TYPE, INTEGER64_TYPE, INTEGER_TYPE, REAL_TYPE, WORD_TYPE, EnumType, FileType, FunctionType, LStringType, PointerType,
from .type_system import (BOOLEAN_TYPE, CHAR_TYPE, INTEGER32_TYPE, INTEGER64_TYPE, INTEGER_TYPE, REAL_TYPE, WORD32_TYPE, WORD64_TYPE, WORD_TYPE, EnumType, FileType, FunctionType, LStringType, PointerType,
ProcedureType, RecordType, StringType)

# Lists of all built-in function and procedure names
Expand Down Expand Up @@ -150,6 +150,11 @@ def define_builtin(name: str, symbol_type, kind: str):
    if features and features.get('wide-integers', False):
        define_builtin('MAXINT32', INTEGER32_TYPE, 'const')
        define_builtin('MAXINT64', INTEGER64_TYPE, 'const')
        # Unsigned siblings of MAXINT32/MAXINT64 for the wide WORD types.
        # MAXWORD32 = 2**32-1, MAXWORD64 = 2**64-1. Gated identically so the
        # wide signed/unsigned max-constant surfaces never drift apart.
        define_builtin('MAXWORD32', WORD32_TYPE, 'const')
        define_builtin('MAXWORD64', WORD64_TYPE, 'const')
    define_builtin('NULL', LStringType(0), 'const')
    filemodes_type = EnumType(['SEQUENTIAL', 'TERMINAL', 'DIRECT'], name='FILEMODES')
    define_builtin('SEQUENTIAL', filemodes_type, 'const')
15 changes: 15 additions & 0 deletions src/pascal1981/codegen/base.py
@@ -168,6 +168,21 @@ def __init__(self,
        if self.feature_enabled('wide-integers'):
            self.constants['MAXINT32'] = 2147483647
            self.constants['MAXINT64'] = 9223372036854775807
            # Unsigned wide max constants (see builtins_registry). Stored as
            # their true positive magnitudes; _const_ir emits MAXWORD64 at i64
            # (the value exceeds the signed i64 max, so it must not fall through
            # to the i32 default) and MAXWORD32 at i32.
            self.constants['MAXWORD32'] = 4294967295
            self.constants['MAXWORD64'] = 18446744073709551615
        # Pascal-type tags for builtin constants that are not seeded into the
        # codegen scope as symbols. Consulted by the WRITE path (_pas_type) to
        # pick signed vs unsigned formatting: MAXWORD32/MAXWORD64 have the high
        # bit set, so they must format unsigned or they print as -1. Keyed
        # UPPER; values are Pascal type names matched by the formatter.
        self.constant_types: Dict[str, str] = {}
        if self.feature_enabled('wide-integers'):
            self.constant_types['MAXWORD32'] = 'WORD32'
            self.constant_types['MAXWORD64'] = 'WORD64'
        self.type_aliases: Dict[str, Type] = {}  # compile-time type aliases, keyed UPPER
        # Seed the C-ABI fixed-width aliases (Phase 1 of the C-FFI plan) so a
        # foreign `[C]` routine spelled with CINT/CLONG/CPTR/etc. lowers through
5 changes: 4 additions & 1 deletion src/pascal1981/codegen/constfold.py
@@ -27,7 +27,10 @@ def _const_ir(self, name_upper: str) -> ir.Constant:
            return ir.Constant(ir.DoubleType(), v)
        if name_upper == 'MAXINT':
            return ir.Constant(ir.IntType(16), int(v))
        if name_upper == 'MAXINT64':
        if name_upper in ('MAXINT64', 'MAXWORD64'):
            # MAXWORD64 = 2**64-1 exceeds the signed i64 max; it must be emitted
            # at i64 width (the bit pattern is all-ones) rather than falling
            # through to the i32 default, which would not hold the value.
            return ir.Constant(ir.IntType(64), int(v))
        return ir.Constant(ir.IntType(32), int(v))

8 changes: 8 additions & 0 deletions src/pascal1981/codegen/io_write_read.py
@@ -41,6 +41,14 @@ def _pas_type(self, expr) -> Optional[object]:
        if isinstance(expr, (Identifier, Designator)):
            sym = self.scope.lookup(expr.name) or self.scope.lookup(expr.name.upper())
            ty = getattr(sym, 'type_expr', None) if sym else None
            if ty is None and (not isinstance(expr, Designator) or not expr.selectors):
                # Builtin constants (e.g. MAXWORD32/MAXWORD64) are not seeded
                # into the codegen scope, so fall back to their recorded Pascal
                # type tag. This drives unsigned WRITE formatting for the wide
                # unsigned max constants, which would otherwise print as -1.
                tag = self.constant_types.get(expr.name.upper())
                if tag is not None:
                    return NamedType(tag, None)
            if isinstance(expr, Designator) and expr.selectors and ty is not None:
                cur = ty
                for sel in expr.selectors:
52 changes: 44 additions & 8 deletions src/pascal1981/type_checker.py
@@ -2303,22 +2303,58 @@ def _check_word_int_assign(self, value_type, target_type, value_expr, node) -> N
"constants change to WORD; convert a signed INTEGER value with "
"WRD(...)", node)

def _check_word_int_mix(self, left_type, right_type, left_expr, right_expr, op, node) -> None:
"""Diagnose a WORD/INTEGER mix in an arithmetic or bitwise expression.
# Equal-rank unsigned/signed integer pairs that carry the WORD/INTEGER
# signedness ambiguity. Rank == index into each tuple: rank 0 is the vintage
# 16-bit WORD/INTEGER pair the manual actually rules on; ranks 1 and 2 are the
# wide extension types (WORD32/INTEGER32, WORD64/INTEGER64), which the manual
# does not cover but which inherit the same "which signedness does the
# arithmetic use?" hazard. `binary_op_result_type` resolves every one of
# these same-width mixes to the unsigned member, so the diagnostic below is
# what makes that silent choice visible (and, under -f strict-word-int,
# refusable). Membership is tested with ``==`` rather than a dict keyed by
# type instance, because some operand types (e.g. SetType) are unhashable.
_WORD_FAMILY_BY_RANK = (WORD_TYPE, WORD32_TYPE, WORD64_TYPE)
_INT_FAMILY_BY_RANK = (INTEGER_TYPE, INTEGER32_TYPE, INTEGER64_TYPE)

@staticmethod
def _family_rank(t, family) -> Optional[int]:
"""Rank (0/1/2) of ``t`` within ``family``, or None if it is not a member.

Allowed when the INTEGER operand is a constant (it changes to WORD).
Otherwise a warning by default (the vintage compiler arbitrarily picks
signed or unsigned arithmetic), promoted to a hard error under
-f strict-word-int.
Uses equality, not a hash lookup, so unhashable operand types (SetType,
ArrayType, ...) simply compare unequal instead of raising.
"""
for rank, member in enumerate(family):
if t == member:
return rank
return None

def _check_word_int_mix(self, left_type, right_type, left_expr, right_expr, op, node) -> None:
"""Diagnose an unsigned/signed (WORD-family/INTEGER-family) mix at equal
width in an arithmetic or bitwise expression.

Covers the vintage 16-bit WORD/INTEGER pair *and* the equal-width wide
extension pairs WORD32/INTEGER32 and WORD64/INTEGER64. A mix at unequal
width is not flagged here: there the wider operand's signedness
unambiguously wins (see `binary_op_result_type`), so there is no
signedness coin-flip to warn about.

Allowed when the signed (INTEGER-family) operand is a constant (it
changes to the unsigned type). Otherwise a warning by default (the
vintage compiler arbitrarily picks signed or unsigned arithmetic),
promoted to a hard error under -f strict-word-int.
"""
ARITH_BITWISE = {'PLUS', 'MINUS', 'MUL', 'DIV', 'MOD', 'AND', 'OR', 'XOR'}
if op not in ARITH_BITWISE:
return
for a_t, b_t, b_e in ((left_type, right_type, right_expr),
(right_type, left_type, left_expr)):
if a_t == WORD_TYPE and b_t == INTEGER_TYPE:
# a_t is the unsigned (WORD-family) operand, b_t the signed
# (INTEGER-family) operand; only flag them at equal width/rank.
a_rank = self._family_rank(a_t, self._WORD_FAMILY_BY_RANK)
b_rank = self._family_rank(b_t, self._INT_FAMILY_BY_RANK)
if a_rank is not None and a_rank == b_rank:
if self._is_constant_integer_expr(b_e):
return # constant INTEGER changes to WORD: clean
return # constant INTEGER changes to the WORD type: clean
msg = ("WORD and INTEGER values cannot be mixed in an expression "
"unless the INTEGER operand is a constant; convert "
"explicitly with WRD(...) or ORD(...)")
6 changes: 4 additions & 2 deletions src/pascal1981/type_system.py
@@ -492,8 +492,10 @@ def binary_op_result_type(left_type: Type, op: str, right_type: Type) -> Optiona
# value zero-extends into the i32); when they are the SAME width, an unsigned
# operand makes the result unsigned (WORD + INTEGER -> WORD, WORD32 +
# INTEGER32 -> WORD32), consistent with the vintage rank-0 WORD/INTEGER rule.
# The vintage WORD/INTEGER (16-bit) mix is additionally diagnosed in the type
# checker; the wide-type mixes are extension territory and are not diagnosed.
# Every such same-width unsigned/signed mix (the vintage 16-bit pair and the
# wide WORD32/INTEGER32, WORD64/INTEGER64 pairs alike) is additionally
# diagnosed in the type checker (_check_word_int_mix): a warning by default,
# an error under -f strict-word-int, with the INTEGER-constant exemption.
int_rank = {IntegerType: 0, WordType: 0,
Integer32Type: 1, Word32Type: 1,
Integer64Type: 2, Word64Type: 2}