[BugFix][Metax] release/2.5: TypeError calling post_process with line_break_id kwarg #7940

@linkeLi0421

Summary

On release/2.5, launching the FastDeploy server with the MetaX backend
fails with:

TypeError: post_process() got an unexpected keyword argument 'line_break_id'

The bug is already fixed on develop (commit c369f71, PR #6521) but
was not cherry-picked to release/2.5.

Environment

  • FastDeploy: release/2.5 @ a60a3e630
  • Hardware: MetaX C500 64 GiB
  • Driver: MACA 3.3.0.15, MX-SMI 2.2.9
  • Python 3.10.10
  • paddlepaddle 3.4.0.dev20260127
  • paddle-metax-gpu 3.3.0.dev20260128+maca3.3.0.15

Steps to Reproduce

python -m fastdeploy.entrypoints.openai.api_server \
    --model /path/to/PaddleOCR-VL-1.5 \
    --port 8288 \
    --tensor-parallel-size 1 \
    --max-model-len 2048 \
    --max-num-seqs 4 \
    --gpu-memory-utilization 0.7 \
    --quantization wint8 \
    --enable-mm

Observed

The worker subprocess dies during graph_optimize_and_warm_up_model.
The full traceback is in /root/log/workerlog.0:

File ".../fastdeploy/worker/worker_process.py", line 1244, in run_worker_proc
  worker_proc.graph_optimize_and_warm_up_model()
File ".../fastdeploy/worker/metax_worker.py", line 241, in graph_optimize_and_warm_up_model
  self.model_runner.capture_model()
File ".../fastdeploy/worker/metax_model_runner.py", line 2032, in capture_model
  self._dummy_run(...)
File ".../fastdeploy/worker/metax_model_runner.py", line 1903, in _dummy_run
  self._dummy_sampler_run(...)
File ".../fastdeploy/worker/metax_model_runner.py", line 1803, in _dummy_sampler_run
  post_process(
TypeError: post_process() got an unexpected keyword argument 'line_break_id'

Root Cause

metax_model_runner.py on release/2.5 calls
post_process(line_break_id=self.model_config.line_break_id, ...) at
lines 1708, 1813 and 2460. The signature of
fastdeploy/model_executor/pre_and_post_process.post_process() on the
same branch (line 621) does not include line_break_id, only
think_end_id, splitwise_role_is_decode, enable_entropy,
routing_replay_manager and sampling_mask_zmq_client.
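
The failure mode can be reproduced without MetaX hardware. The sketch below uses a stand-in function, post_process_stub, that accepts only the keyword names listed in this issue. The default values are assumptions, and the real post_process() takes further arguments that are omitted here.

```python
# Stand-in for post_process(). Keyword names come from this issue;
# the defaults and the return value are assumptions for illustration.
def post_process_stub(
    think_end_id=-1,
    splitwise_role_is_decode=False,
    enable_entropy=False,
    routing_replay_manager=None,
    sampling_mask_zmq_client=None,
):
    return {"splitwise_role_is_decode": splitwise_role_is_decode}


def call_with(**kwargs):
    """Call the stub and return the TypeError message, or None on success."""
    try:
        post_process_stub(**kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# Old MetaX-style call: passes a keyword the signature no longer has.
old_call_error = call_with(line_break_id=0)

# GPU-runner-style call: passes a keyword the signature accepts.
new_call_error = call_with(splitwise_role_is_decode=False)

print(old_call_error)
print(new_call_error)
```

The first call fails the same way as the traceback above, with the stub's name in the message. The second call completes.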

The sibling gpu_model_runner.py on the same release/2.5 correctly
uses splitwise_role_is_decode=... at lines 1802, 1908 and 2567. The
MetaX runner was apparently missed when the post_process signature
changed.

This was already corrected on develop in commit c369f7139
(PR #6521). The fix simply swaps the kwarg.
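
Drift of this kind between a caller and a signature can be caught in a hardware-free test. The following is a minimal sketch, not part of FastDeploy: unexpected_kwargs is a hypothetical helper, and stub stands in for post_process() with an assumed, shortened parameter list.

```python
import inspect


def unexpected_kwargs(func, kwarg_names):
    """Return the names in kwarg_names that func would reject as keywords."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all accepts any keyword, so nothing is unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {name for name, p in params.items() if p.kind in keyword_kinds}
    return [name for name in kwarg_names if name not in accepted]


# Hypothetical stand-in with two of the keyword names mentioned in this issue.
def stub(think_end_id=-1, splitwise_role_is_decode=False):
    return None


before_fix = unexpected_kwargs(stub, ["think_end_id", "line_break_id"])
after_fix = unexpected_kwargs(stub, ["think_end_id", "splitwise_role_is_decode"])

print(before_fix)
print(after_fix)
```

A test that asserts an empty result for each runner's keyword set would flag a runner left behind by a signature change before it reaches warm-up.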

Proposed Fix

Backport the 3-line portion of #6521 to release/2.5. A PR is up: [PR #7939](https://github.com/PaddlePaddle/FastDeploy/pull/7939)
