[BUG] Flaky cuOpt status when used as PuLP backend solver #960

@j3soon

Describe the bug

When cuOpt is used as the PuLP backend solver, it non-deterministically reports either Optimal or FeasibleFound for the same model, even though the returned solution is optimal in both cases (the objective is identical). PuLP maps these two statuses to Optimal and Not Solved, respectively.

Steps/Code to reproduce bug

Create pulp_cuopt_flaky_status_repro.py:

#!/usr/bin/env python3
"""Pure-PuLP repro: same model, mixed cuOpt statuses, same objective."""

import argparse
from collections import Counter

import pulp


class T(pulp.CUOPT):
    """pulp.CUOPT wrapper that records cuOpt's raw termination status and reason in self.raw."""
    def actualSolve(self, lp, callback=None):
        self.buildSolverModel(lp)
        s = self.callSolver(lp, callback=callback)
        self.raw = int(s.get_termination_status()), str(s.get_termination_reason())
        r = self.findSolutionValues(lp, s)
        for v in lp._variables:
            v.modified = False
        for c in lp.constraints.values():
            c.modified = False
        return r


def eq(prob, name, expr, target, bounds):
    """Return a binary v with v == 1 iff expr == target, for integer expr within bounds (lo, hi)."""
    v = pulp.LpVariable(name, cat="Binary")
    lo, hi = bounds
    if target < lo or target > hi:
        prob += v == 0
        return v
    d = expr - target
    s = pulp.LpVariable(f"{name}_eq_side", cat="Binary")
    prob += d <= (hi - target) * (1 - v)
    prob += d >= (lo - target) * (1 - v)
    prob += d <= -1 + (hi - target + 1) * s + v
    prob += d >= 1 - (target - lo + 1) * (1 - s) - v
    return v


def build():
    p = pulp.LpProblem("cuopt_status_repro", pulp.LpMaximize)
    x = {(d, s, n): pulp.LpVariable(f"x_{d}_{s}_{n}", cat="Binary") for d in range(7) for s in range(3) for n in range(4)}
    for d in range(7):
        for n in range(4):
            if (d, n) != (3, 2):
                eq(p, f"off_{d}_{n}", sum(x[d, s, n] for s in range(3)), 0, (0, 3))
            if (d, n) not in {(0, 0), (6, 3)}:
                p += sum(x[d, s, n] for s in range(3)) <= 1
        for s in range(3):
            if (d, s) != (0, 0):
                p += sum(x[d, s, n] for n in range(4)) == 1
    p += sum(x[d, 0, 0] + x[d, 1, 1] + x[d, 2, 2] for d in range(7)) - sum(
        eq(p, f"m_{d}", x[d, 2, 2] + x[d + 1, 2, 2], 2, (0, 2)) for d in range(6)
    )
    return p


def main(repeat):
    mapped, raw = Counter(), Counter()
    for i in range(repeat):
        p = build()
        s = T(msg=False, gapRel=0.0, optimality_tolerance=1e-3)
        r = p.solve(s)
        mapped[pulp.LpStatus[r]] += 1
        raw[s.raw] += 1
        print(f"run={i} raw_code={s.raw[0]} raw_reason={s.raw[1]} pulp_status={pulp.LpStatus[r]} objective={pulp.value(p.objective)}")
    print(f"mapped_counts={dict(mapped)}")
    print(f"raw_counts={dict(raw)}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--repeat", type=int, default=100)
    main(parser.parse_args().repeat)
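
As a sanity check on the model itself, the four constraints added by eq() can be enumerated by brute force. The sketch below is not part of the repro; it uses only the standard library and a hypothetical helper name, feasible_v, to confirm that for integer values of expr the indicator v is forced to 1 exactly when expr equals target, for the two (target, bounds) pairs used in build().

```python
from itertools import product


def feasible_v(expr, target, lo, hi):
    """Return the values of v in {0, 1} for which some s in {0, 1}
    satisfies all four constraints that eq() adds to the model."""
    d = expr - target
    ok = set()
    for v, s in product((0, 1), repeat=2):
        if (
            d <= (hi - target) * (1 - v)
            and d >= (lo - target) * (1 - v)
            and d <= -1 + (hi - target + 1) * s + v
            and d >= 1 - (target - lo + 1) * (1 - s) - v
        ):
            ok.add(v)
    return ok


# The two (target, bounds) combinations that build() passes to eq().
for target, (lo, hi) in [(0, (0, 3)), (2, (0, 2))]:
    for expr in range(lo, hi + 1):
        # v must be uniquely determined: 1 if expr == target, else 0.
        assert feasible_v(expr, target, lo, hi) == {int(expr == target)}
```

If the assertions pass, the encoding leaves no freedom in v, which suggests the mixed statuses are not caused by an ambiguous model.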

Then run the script in a container (both the CUDA 12.9 and CUDA 13.0 images reproduce the issue):

# can also switch to: nvidia/cuopt:latest-cuda12.9-py3.13
docker run --rm -it --gpus all --entrypoint bash \
  -w /app \
  -v "$(pwd)":/app \
  nvidia/cuopt:latest-cuda13.0-py3.13
# in the container, run:
pip install pulp==3.3.0
python pulp_cuopt_flaky_status_repro.py

Sample output on my Quadro RTX 6000 GPU:

/usr/local/lib/python3.13/dist-packages/cuopt/linear_programming/data_model/data_model.py:307: UserWarning: Casting variable_lower_bounds from int64 to float64
  super().set_variable_lower_bounds(variable_lower_bounds)
/usr/local/lib/python3.13/dist-packages/cuopt/linear_programming/data_model/data_model.py:325: UserWarning: Casting variable_upper_bounds from int64 to float64
  super().set_variable_upper_bounds(variable_upper_bounds)
/usr/local/lib/python3.13/dist-packages/cuopt/linear_programming/data_model/data_model.py:194: UserWarning: Casting A_values from int64 to float64
  super().set_csr_constraint_matrix(A_values, A_indices, A_offsets)
run=0 raw_code=1 raw_reason=Optimal pulp_status=Optimal objective=18.0 
...
run=95 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=96 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=97 raw_code=1 raw_reason=Optimal pulp_status=Optimal objective=18.0
run=98 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=99 raw_code=1 raw_reason=Optimal pulp_status=Optimal objective=18.0
mapped_counts={'Optimal': 57, 'Not Solved': 43}
raw_counts={(1, 'Optimal'): 57, (8, 'FeasibleFound'): 43}

Sample output on my DGX Spark:

/usr/local/lib/python3.13/dist-packages/cuopt/linear_programming/data_model/data_model.py:307: UserWarning: Casting variable_lower_bounds from int64 to float64
  super().set_variable_lower_bounds(variable_lower_bounds)
/usr/local/lib/python3.13/dist-packages/cuopt/linear_programming/data_model/data_model.py:325: UserWarning: Casting variable_upper_bounds from int64 to float64
  super().set_variable_upper_bounds(variable_upper_bounds)
/usr/local/lib/python3.13/dist-packages/cuopt/linear_programming/data_model/data_model.py:194: UserWarning: Casting A_values from int64 to float64
  super().set_csr_constraint_matrix(A_values, A_indices, A_offsets)
run=0 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=1 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=2 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=3 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=4 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=5 raw_code=1 raw_reason=Optimal pulp_status=Optimal objective=18.0
...
run=99 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
mapped_counts={'Not Solved': 98, 'Optimal': 2}
raw_counts={(8, 'FeasibleFound'): 98, (1, 'Optimal'): 2}
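
To confirm from captured output that the objective does not depend on the reported status, the per-run lines can be grouped by status. This is a minimal standard-library sketch that assumes the exact line format printed by the repro script; the helper name objectives_by_status and the sample string (three lines copied from the output above) are for illustration only.

```python
import re
from collections import defaultdict

# Matches the per-run lines printed by the repro script.
LINE = re.compile(
    r"run=(?P<run>\d+) raw_code=(?P<code>\d+) raw_reason=(?P<reason>\S+) "
    r"pulp_status=(?P<status>.+?) objective=(?P<obj>\S+)"
)


def objectives_by_status(log_text):
    """Map (raw_code, raw_reason, pulp_status) to the set of objectives seen."""
    out = defaultdict(set)
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m:  # warnings and summary lines are skipped
            key = (int(m["code"]), m["reason"], m["status"])
            out[key].add(float(m["obj"]))
    return dict(out)


sample = """\
run=95 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
run=97 raw_code=1 raw_reason=Optimal pulp_status=Optimal objective=18.0
run=98 raw_code=8 raw_reason=FeasibleFound pulp_status=Not Solved objective=18.0
"""
summary = objectives_by_status(sample)
for key, objs in summary.items():
    print(key, sorted(objs))
```

Every status group should contain the single objective 18.0 if the flakiness is limited to the status and does not affect the solution.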

Expected behavior

The cuOpt solver status should always be Optimal:

...
run=99 raw_code=1 raw_reason=Optimal pulp_status=Optimal objective=18.0
mapped_counts={'Optimal': 100}
raw_counts={(1, 'Optimal'): 100}
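
The expectation above can also be phrased as a check on the script's mapped_counts summary line. The following sketch assumes that line format and uses a hypothetical helper, non_optimal_fraction, to compute the share of runs that PuLP did not report as Optimal; the expected value is 0.

```python
import ast


def non_optimal_fraction(summary_line):
    """Parse a 'mapped_counts={...}' line and return the share of runs
    whose PuLP status was anything other than Optimal."""
    _, _, payload = summary_line.partition("=")
    counts = ast.literal_eval(payload)  # the payload is a plain dict literal
    total = sum(counts.values())
    return (total - counts.get("Optimal", 0)) / total


# Summary line from the Quadro RTX 6000 run versus the expected one.
observed = non_optimal_fraction("mapped_counts={'Optimal': 57, 'Not Solved': 43}")
expected = non_optimal_fraction("mapped_counts={'Optimal': 100}")
print(observed, expected)
```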

Environment details (please complete the following information):

  • Environment location: Docker
  • Method of cuOpt install: Docker

Additional context

The repro code was condensed by Codex. Removing further constraints makes the issue much harder to reproduce (it requires more runs).

Labels: awaiting response, bug