Integration-Automation · JE-Chen · Jun 23, 2026
diff --git a/README/WHATS_NEW_zh-CN.md b/README/WHATS_NEW_zh-CN.md
@@ -1,5 +1,23 @@
 # 本次更新 — AutoControl
 
+## 本次更新 (2026-06-24) — 逐步评审特征 + 规则式步骤评分
+
+把为代理步骤评分所需的证据打包,并内建规则式评分器。完整参考:[`docs/source/Zh/doc/new_features/v177_features_doc.rst`](../docs/source/Zh/doc/new_features/v177_features_doc.rst)。
+
+- **`build_critic_record` / `score_step_rule_based` / `to_judge_prompt`**(`AC_build_critic_record`、`AC_score_step`):`trajectory_eval` 对整条轨迹评分而无逐步证据;`agent_trace` 发出 span 而非质量;`agent_replay` 保存步骤却不评分。本功能把 `action_effect` + `observation_delta` + `postcondition` 组合成单一逐步记录,接着 `score_step_rule_based` 给出确定性的 `{outcome, process_score, reasons}`(不需模型),`to_judge_prompt` 把它渲染给可选的 LLM-as-judge。纯标准库聚合器;不导入 `PySide6`。
+
+## 本次更新 (2026-06-24) — 标题与正文分类 + 文档大纲
+
+以高度区分标题与正文,并建立文档大纲。完整参考:[`docs/source/Zh/doc/new_features/v176_features_doc.rst`](../docs/source/Zh/doc/new_features/v176_features_doc.rst)。
+
+- **`classify_lines` / `outline`**(`AC_classify_lines`、`AC_outline`):框架中没有功能把行高对应到标题层级或建立章节大纲——`ocr/structure` / `element_parse` 纯属位置性,`text_blocks` 不排序。本功能套用标准启发法:行高超过 `heading_ratio` × 中位行高者为标题,不同标题高度成为层级(最高 = 1)。`classify_lines` 为每行标记 `{box, text, role, level}`;`outline` 依序返回标题作为目录。纯标准库,作用于行字典;不导入 `PySide6`。
+
+## 本次更新 (2026-06-24) — 变化量序列的稳定检测
+
+判断 UI 何时安定下来——以纯粹、可测试的函数作用于变化序列。完整参考:[`docs/source/Zh/doc/new_features/v175_features_doc.rst`](../docs/source/Zh/doc/new_features/v175_features_doc.rst)。
+
+- **`settle_point` / `is_settled` / `SettleTracker`**(`AC_settle_point`):`smart_waits.wait_until_screen_stable` 把稳定逻辑包在 `time.sleep` 循环内、作用于实时帧——你无法喂记录好的序列,也无法单元测试该决策。本功能把它抽离:给定一串*变化量*(像素差 / 元素数差 / 0-1 digest 是否变),在变化量连续 `quiet_samples` 次维持 ≤ `max_churn` 时报告稳定(尖峰重置 run)。`settle_point` 返回稳定索引,`SettleTracker` 为供实时循环的增量形式。纯标准库,不需时钟、不需捕获;不导入 `PySide6`。
+
 ## 本次更新 (2026-06-24) — OCR 行的段落与列表分组
 
 把 OCR 行分组成段落,并检测项目符号 / 编号列表。完整参考:[`docs/source/Zh/doc/new_features/v174_features_doc.rst`](../docs/source/Zh/doc/new_features/v174_features_doc.rst)。

diff --git a/README/WHATS_NEW_zh-TW.md b/README/WHATS_NEW_zh-TW.md
@@ -1,5 +1,23 @@
 # 本次更新 — AutoControl
 
+## 本次更新 (2026-06-24) — 逐步評審特徵 + 規則式步驟評分
+
+把為代理步驟評分所需的證據打包,並內建規則式評分器。完整參考:[`docs/source/Zh/doc/new_features/v177_features_doc.rst`](../docs/source/Zh/doc/new_features/v177_features_doc.rst)。
+
+- **`build_critic_record` / `score_step_rule_based` / `to_judge_prompt`**(`AC_build_critic_record`、`AC_score_step`):`trajectory_eval` 對整條軌跡評分而無逐步證據;`agent_trace` 發出 span 而非品質;`agent_replay` 保存步驟卻不評分。本功能把 `action_effect` + `observation_delta` + `postcondition` 組合成單一逐步記錄,接著 `score_step_rule_based` 給出確定性的 `{outcome, process_score, reasons}`(不需模型),`to_judge_prompt` 把它渲染給可選的 LLM-as-judge。純標準函式庫聚合器;不匯入 `PySide6`。
+
+## 本次更新 (2026-06-24) — 標題與內文分類 + 文件大綱
+
+以高度區分標題與內文,並建立文件大綱。完整參考:[`docs/source/Zh/doc/new_features/v176_features_doc.rst`](../docs/source/Zh/doc/new_features/v176_features_doc.rst)。
+
+- **`classify_lines` / `outline`**(`AC_classify_lines`、`AC_outline`):框架中沒有功能把行高對應到標題層級或建立章節大綱——`ocr/structure` / `element_parse` 純屬位置性,`text_blocks` 不排序。本功能套用標準啟發法:行高超過 `heading_ratio` × 中位行高者為標題,不同標題高度成為層級(最高 = 1)。`classify_lines` 為每行標記 `{box, text, role, level}`;`outline` 依序回傳標題作為目錄。純標準函式庫,作用於行字典;不匯入 `PySide6`。
+
+## 本次更新 (2026-06-24) — 變化量序列的穩定偵測
+
+判斷 UI 何時安定下來——以純粹、可測試的函式作用於變化序列。完整參考:[`docs/source/Zh/doc/new_features/v175_features_doc.rst`](../docs/source/Zh/doc/new_features/v175_features_doc.rst)。
+
+- **`settle_point` / `is_settled` / `SettleTracker`**(`AC_settle_point`):`smart_waits.wait_until_screen_stable` 把穩定邏輯包在 `time.sleep` 迴圈內、作用於即時幀——你無法餵記錄好的序列,也無法單元測試該決策。本功能把它抽離:給定一串*變化量*(像素差 / 元素數差 / 0-1 digest 是否變),在變化量連續 `quiet_samples` 次維持 ≤ `max_churn` 時回報穩定(尖峰重置 run)。`settle_point` 回傳穩定索引,`SettleTracker` 為供即時迴圈的增量形式。純標準函式庫,不需時鐘、不需擷取;不匯入 `PySide6`。
+
 ## 本次更新 (2026-06-24) — OCR 行的段落與清單分組
 
 把 OCR 行分組成段落,並偵測項目符號 / 編號清單。完整參考:[`docs/source/Zh/doc/new_features/v174_features_doc.rst`](../docs/source/Zh/doc/new_features/v174_features_doc.rst)。

diff --git a/WHATS_NEW.md b/WHATS_NEW.md
@@ -1,5 +1,23 @@
 # What's New — AutoControl
 
+## What's new (2026-06-24) — Per-Step Critic Features + Rule-Based Step Scorer
+
+Bundle the evidence to score an agent step, with a built-in rule-based scorer. Full reference: [`docs/source/Eng/doc/new_features/v177_features_doc.rst`](docs/source/Eng/doc/new_features/v177_features_doc.rst).
+
+- **`build_critic_record` / `score_step_rule_based` / `to_judge_prompt`** (`AC_build_critic_record`, `AC_score_step`): `trajectory_eval` scores a whole trajectory with no per-step evidence; `agent_trace` emits spans, not quality signals; `agent_replay` stores steps but does not score them. This feature composes `action_effect` + `observation_delta` + `postcondition` into one per-step record; `score_step_rule_based` then returns a deterministic `{outcome, process_score, reasons}` (no model needed), and `to_judge_prompt` renders the record for an optional LLM-as-judge. Pure-stdlib aggregator; imports no `PySide6`.
+
+## What's new (2026-06-24) — Heading vs Body Classification + Document Outline
+
+Tell headings from body text by height and build a document outline. Full reference: [`docs/source/Eng/doc/new_features/v176_features_doc.rst`](docs/source/Eng/doc/new_features/v176_features_doc.rst).
+
+- **`classify_lines` / `outline`** (`AC_classify_lines`, `AC_outline`): nothing in the framework mapped line height to heading levels or built a section outline; `ocr/structure` / `element_parse` are purely positional, and `text_blocks` groups paragraphs / lists but does not rank them. This feature applies the standard heuristic: a line taller than `heading_ratio` × the median line height is a heading, and distinct heading heights become levels (tallest = 1). `classify_lines` tags each line with `{box, text, role, level}`; `outline` returns the headings in order as a table of contents. Pure-stdlib over line dicts; imports no `PySide6`.
+
+## What's new (2026-06-24) — Settle Detection Over a Churn Series
+
+Decide when the UI has gone quiet — as a pure, testable function over a change series. Full reference: [`docs/source/Eng/doc/new_features/v175_features_doc.rst`](docs/source/Eng/doc/new_features/v175_features_doc.rst).
+
+- **`settle_point` / `is_settled` / `SettleTracker`** (`AC_settle_point`): `smart_waits.wait_until_screen_stable` bakes the settle logic inside a `time.sleep` loop over live frames, so you can't feed it a recorded series or unit-test the decision. This feature extracts that decision: given a stream of *churn* values (pixel delta / element-count delta / 0-or-1 digest-changed flag), it reports when churn has stayed ≤ `max_churn` for `quiet_samples` samples in a row (a spike resets the run). `settle_point` returns the settle index; `SettleTracker` is the incremental form for a live loop. Pure-stdlib, no clock, no capture; imports no `PySide6`.
+
 ## What's new (2026-06-24) — Paragraph & List Grouping of OCR Lines
 
 Group OCR lines into paragraphs and detect bulleted / numbered lists. Full reference: [`docs/source/Eng/doc/new_features/v174_features_doc.rst`](docs/source/Eng/doc/new_features/v174_features_doc.rst).

diff --git a/docs/source/Eng/doc/new_features/v175_features_doc.rst b/docs/source/Eng/doc/new_features/v175_features_doc.rst
@@ -0,0 +1,43 @@
+Settle Detection Over a Churn Series
+====================================
+
+``smart_waits.wait_until_screen_stable`` and ``actionability``'s stability check bake the
+settle logic *inside* a ``time.sleep`` polling loop over live pixel frames — you cannot feed
+them a recorded series of a11y-element counts or screen-diff metrics, and you cannot unit-test
+the *decision* independently of capture. ``settle_detector`` extracts that decision: it takes a
+stream of *churn* values (how much changed each sample — a pixel delta, an element-count delta,
+a digest-changed 0/1, anything) and reports when the churn has stayed at or below ``max_churn``
+for ``quiet_samples`` in a row. A spike resets the quiet run, so "settled then changed again"
+is handled.
+
+Pure-stdlib; deterministic and unit-testable on an injected series with no capture and no
+clock. Imports no ``PySide6``.
+
+Headless API
+------------
+
+.. code-block:: python
+
+    from je_auto_control import settle_point, is_settled, SettleTracker
+
+    churns = [5, 4, 0.5, 0.3, 0.2]          # per-frame change metric
+    settle_point(churns, quiet_samples=3, max_churn=1.0)   # -> 4
+    is_settled(churns, quiet_samples=3, max_churn=1.0)     # -> True
+
+    # incremental, for a live loop (you supply the churn each tick)
+    tracker = SettleTracker(quiet_samples=3, max_churn=1.0)
+    for current_churn in churns:           # live: measure churn once per tick
+        if tracker.update(current_churn).settled:
+            break                          # quiet long enough; safe to observe
+
+``settle_point`` returns the index of the sample that completes the first quiet run (or
+``None`` if the series never settles); ``is_settled`` is the boolean form. ``SettleTracker``
+is the incremental form: ``update(churn)`` returns a ``SettleState`` (``settled`` /
+``quiet_run`` / ``churn``); ``reset`` clears the run (e.g. right after acting again).
+
+Executor command
+----------------
+
+``AC_settle_point`` (``churns`` / ``quiet_samples`` / ``max_churn`` → ``{settled, index}``) is
+exposed as the MCP tool ``ac_settle_point`` (read-only) and as the Script Builder command
+**Settle Point (churn series)** under **Flow**.
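Note on v175_features_doc.rst: the decision rule it describes can be illustrated with a minimal pure-Python sketch. This is not the ``settle_detector`` source. The name ``settle_point_sketch`` is hypothetical, and the only behavior assumed is what the doc states: a sample is quiet when its churn is at or below ``max_churn``, a spike resets the run, and the result is the index of the sample that completes the first run of ``quiet_samples`` quiet samples.

```python
from typing import Iterable, Optional


def settle_point_sketch(churns: Iterable[float], quiet_samples: int,
                        max_churn: float) -> Optional[int]:
    """Return the index of the sample that completes the first quiet run."""
    if quiet_samples < 1:
        raise ValueError("quiet_samples must be at least 1")
    quiet_run = 0
    for index, churn in enumerate(churns):
        if churn <= max_churn:
            quiet_run += 1          # another quiet sample extends the run
        else:
            quiet_run = 0           # a spike resets the run
        if quiet_run >= quiet_samples:
            return index
    return None                     # the series never settled


# Same series as the doc example: samples 2, 3 and 4 are quiet, so the run
# of three completes at index 4.
print(settle_point_sketch([5, 4, 0.5, 0.3, 0.2], quiet_samples=3, max_churn=1.0))  # 4
```

Because the function only consumes an iterable of numbers, the same rule can be tested against recorded pixel deltas or element-count deltas with no clock and no capture, which is the property the doc highlights.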
diff --git a/docs/source/Eng/doc/new_features/v176_features_doc.rst b/docs/source/Eng/doc/new_features/v176_features_doc.rst
@@ -0,0 +1,37 @@
+Heading vs Body Classification + Document Outline
+=================================================
+
+Nothing in the framework maps line height to heading levels or builds a section outline —
+``ocr/structure`` and ``element_parse`` are purely positional, and ``text_blocks`` groups
+paragraphs / lists but does not rank them. ``heading_segment`` adds the standard heuristic:
+a line whose height exceeds ``heading_ratio`` times the median line height is a heading, and
+distinct heading heights become heading *levels* (the tallest is level 1). From that it emits
+a flat document outline.
+
+Pure-stdlib over plain line dicts (text + bbox); fully unit-testable with no image and no OCR
+engine. Reuses ``table_grid_fill``'s box-bounds reader. Imports no ``PySide6``.
+
+Headless API
+------------
+
+.. code-block:: python
+
+    from je_auto_control import classify_lines, outline
+
+    for item in classify_lines(ocr_lines, heading_ratio=1.2):
+        print(item["role"], item["level"], item["text"])
+
+    for heading in outline(ocr_lines):
+        print("  " * (heading["level"] - 1) + heading["text"])
+
+``classify_lines`` tags each line ``{box, text, role, level}`` — ``role`` is ``"heading"`` or
+``"body"``, ``level`` is the heading level (1 = tallest, 0 for body). ``outline`` returns just
+the headings in top-to-bottom order as ``{level, text, top}`` — a document table of contents.
+
+Executor commands
+-----------------
+
+``AC_classify_lines`` (``lines`` / ``heading_ratio`` → ``{count, lines}``) and ``AC_outline``
+(``lines`` / ``heading_ratio`` → ``{count, headings}``) are exposed as the MCP tools
+``ac_classify_lines`` / ``ac_outline`` (read-only) and as the Script Builder commands
+**Classify Headings vs Body** / **Document Outline** under **OCR**.
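Note on v176_features_doc.rst: the height heuristic can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the ``heading_segment`` implementation. The function names are hypothetical, the box layout ``(left, top, right, bottom)`` is assumed (the real box-bounds reader may accept other shapes), and the default ``heading_ratio`` of 1.2 is taken from the doc's usage example rather than from a documented default.

```python
from statistics import median
from typing import Dict, List


def classify_lines_sketch(lines: List[dict], heading_ratio: float = 1.2) -> List[dict]:
    """Tag each line as heading or body by comparing its height to the median."""
    if not lines:
        return []
    # Assumed box layout: (left, top, right, bottom) in pixels.
    heights = [line["box"][3] - line["box"][1] for line in lines]
    threshold = heading_ratio * median(heights)
    # Distinct heading heights, tallest first, become levels 1, 2, 3, ...
    heading_heights = sorted({h for h in heights if h > threshold}, reverse=True)
    level_of: Dict[float, int] = {h: i + 1 for i, h in enumerate(heading_heights)}
    tagged = []
    for line, height in zip(lines, heights):
        level = level_of.get(height, 0)         # 0 marks body text
        tagged.append({"box": line["box"], "text": line["text"],
                       "role": "heading" if level else "body", "level": level})
    return tagged


def outline_sketch(lines: List[dict], heading_ratio: float = 1.2) -> List[dict]:
    """Return only the headings, top to bottom, as {level, text, top}."""
    headings = [item for item in classify_lines_sketch(lines, heading_ratio)
                if item["role"] == "heading"]
    headings.sort(key=lambda item: item["box"][1])
    return [{"level": h["level"], "text": h["text"], "top": h["box"][1]}
            for h in headings]


sample = [
    {"text": "Title", "box": (0, 0, 200, 40)},
    {"text": "body a", "box": (0, 50, 200, 70)},
    {"text": "Section", "box": (0, 80, 200, 110)},
    {"text": "body b", "box": (0, 120, 200, 140)},
    {"text": "body c", "box": (0, 150, 200, 170)},
]
for heading in outline_sketch(sample):
    print("  " * (heading["level"] - 1) + heading["text"])
```

In the sample the median height is 20, so the threshold is 24: the 40-pixel line becomes level 1 and the 30-pixel line level 2. The sketch matches heights exactly; real OCR boxes jitter by a pixel or two, so a production version would probably need to bucket near-equal heights before assigning levels.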
diff --git a/docs/source/Eng/doc/new_features/v177_features_doc.rst b/docs/source/Eng/doc/new_features/v177_features_doc.rst
@@ -0,0 +1,46 @@
+Per-Step Critic Features + Rule-Based Step Scorer
+=================================================
+
+Scoring an agent's step needs the evidence in one place — what the action was, what changed,
+whether it landed on target, whether the declared postcondition held. ``trajectory_eval``
+scores a *finished whole trajectory* against a static rubric and has no per-step evidence;
+``agent_trace`` emits OTel spans (tokens / latency), not decision quality; ``agent_replay``
+persists ``{obs, action, result}`` but does no scoring. ``critic_features`` is the missing
+per-step layer: it composes ``action_effect`` (did it do anything, where),
+``observation_delta`` (how much changed) and ``postcondition`` (did the expected outcome hold)
+into one compact record, and ships a deterministic rule-based scorer so the feature works fully
+headless — leaving the optional LLM-as-judge to the integrator.
+
+Pure-stdlib; composes existing pure modules; deterministic and unit-testable with no device.
+Imports no ``PySide6``.
+
+Headless API
+------------
+
+.. code-block:: python
+
+    from je_auto_control import (build_critic_record, score_step_rule_based,
+                                 to_judge_prompt)
+
+    record = build_critic_record({"type": "click", "x": 480, "y": 260},
+                                 before_elements, after_elements,
+                                 postcondition={"appears": {"role": "dialog"}})
+    score = score_step_rule_based(record)
+    # {"outcome": True, "process_score": 1.0, "reasons": [...]}
+
+    prompt = to_judge_prompt(record)   # compact text for an LLM-as-judge
+
+``build_critic_record`` returns ``{action, effect, delta_counts}`` plus a ``postcondition``
+report when a spec is given. ``score_step_rule_based`` returns ``{outcome, process_score,
+reasons}``: ``outcome`` is a binary success flag (the action did something *and* any
+postcondition held); ``process_score`` is a quality score from 0 to 1 derived from the effect
+class (halved if the postcondition failed). ``to_judge_prompt`` renders it for an external judge.
+
+Executor commands
+-----------------
+
+``AC_build_critic_record`` (``action`` / ``before`` / ``after`` / ``postcondition`` /
+``radius`` → the record) and ``AC_score_step`` (``record`` → ``{outcome, process_score,
+reasons}``) are exposed as the MCP tools ``ac_build_critic_record`` / ``ac_score_step``
+(read-only) and as the Script Builder commands **Build Critic Record** / **Score Step
+(rule-based)** under **Native UI**.
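Note on v177_features_doc.rst: the scoring rule it states (success requires an effect and a held postcondition; the process score comes from the effect class and is halved on a failed postcondition) can be sketched as below. Everything concrete here is a placeholder: the doc does not list the effect classes, their weights, or the inner shape of ``effect`` and ``postcondition``, so the ``class`` and ``held`` keys, the three class names, and the weights are all hypothetical, as is the name ``score_step_sketch``.

```python
from typing import Any, Dict, List

# Hypothetical effect classes and weights; the real module's classes are not
# listed in the doc, so these names and numbers are placeholders.
EFFECT_WEIGHTS = {"on_target": 1.0, "off_target": 0.5, "no_effect": 0.0}


def score_step_sketch(record: Dict[str, Any]) -> Dict[str, Any]:
    """Score one step record deterministically, with no model involved."""
    effect_class = record.get("effect", {}).get("class", "no_effect")
    did_something = effect_class != "no_effect"
    reasons: List[str] = [f"effect class: {effect_class}"]

    post = record.get("postcondition")          # absent when no spec was given
    post_held = True if post is None else bool(post.get("held"))
    if post is not None:
        reasons.append("postcondition held" if post_held else "postcondition failed")

    process_score = EFFECT_WEIGHTS.get(effect_class, 0.0)
    if not post_held:
        process_score /= 2                      # halved on a failed postcondition
    return {"outcome": did_something and post_held,
            "process_score": process_score, "reasons": reasons}


good = {"action": {"type": "click"}, "effect": {"class": "on_target"},
        "postcondition": {"held": True}}
bad = {"action": {"type": "click"}, "effect": {"class": "on_target"},
       "postcondition": {"held": False}}
print(score_step_sketch(good)["process_score"],
      score_step_sketch(bad)["process_score"])  # 1.0 0.5
```

Keeping the scorer a pure function of the record is what lets it run headless and be unit-tested without a device; an LLM judge, if used, would consume the same record through the rendered prompt.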
diff --git a/docs/source/Eng/eng_index.rst b/docs/source/Eng/eng_index.rst
@@ -197,6 +197,9 @@ Comprehensive guides for all AutoControl features.
    doc/new_features/v172_features_doc
    doc/new_features/v173_features_doc
    doc/new_features/v174_features_doc
+   doc/new_features/v175_features_doc
+   doc/new_features/v176_features_doc
+   doc/new_features/v177_features_doc
    doc/ocr_backends/ocr_backends_doc
    doc/observability/observability_doc
    doc/operations_layer/operations_layer_doc

diff --git a/docs/source/Zh/doc/new_features/v175_features_doc.rst b/docs/source/Zh/doc/new_features/v175_features_doc.rst
@@ -0,0 +1,39 @@
+變化量序列的穩定偵測
+====================
+
+``smart_waits.wait_until_screen_stable`` 與 ``actionability`` 的穩定檢查把穩定邏輯包在
+``time.sleep`` 輪詢迴圈內、作用於即時像素幀——你無法餵給它一段記錄好的 a11y 元素數或畫面
+差異指標序列,也無法獨立於擷取去單元測試那個*決策*。``settle_detector`` 把該決策抽離:它接收
+一串*變化量*(churn,每個樣本變了多少——像素差、元素數差、digest 是否變的 0/1,皆可),並在
+變化量連續 ``quiet_samples`` 次維持在 ``max_churn`` 以下時回報穩定。尖峰會重置 quiet run,因此
+「穩定後又變動」也能處理。
+
+純標準函式庫;確定性、可在注入序列上單元測試,不需擷取、不需時鐘。不匯入 ``PySide6``。
+
+無頭 API
+--------
+
+.. code-block:: python
+
+    from je_auto_control import settle_point, is_settled, SettleTracker
+
+    churns = [5, 4, 0.5, 0.3, 0.2]          # 每幀變化量指標
+    settle_point(churns, quiet_samples=3, max_churn=1.0)   # -> 4
+    is_settled(churns, quiet_samples=3, max_churn=1.0)     # -> True
+
+    # 增量版,供即時迴圈(你每 tick 提供 churn)
+    tracker = SettleTracker(quiet_samples=3, max_churn=1.0)
+    for current_churn in churns:           # 即時迴圈中改為每 tick 量測 churn
+        if tracker.update(current_churn).settled:
+            break                          # 已連續安靜足夠次數,可進行觀察
+
+``settle_point`` 回傳序列首次穩定的索引(或 ``None``);``is_settled`` 為布林。``SettleTracker``
+為增量形式:``update(churn)`` 回傳 ``SettleState``(``settled`` / ``quiet_run`` / ``churn``);
+``reset`` 清除 run(例如在再次動作後)。
+
+執行器指令
+----------
+
+``AC_settle_point``(``churns`` / ``quiet_samples`` / ``max_churn`` → ``{settled, index}``)
+以 MCP 工具 ``ac_settle_point``(唯讀)及 Script Builder 指令 **Settle Point (churn series)**
+(位於 **Flow** 分類下)形式提供。
diff --git a/docs/source/Zh/doc/new_features/v176_features_doc.rst b/docs/source/Zh/doc/new_features/v176_features_doc.rst
@@ -0,0 +1,35 @@
+標題與內文分類 + 文件大綱
+==========================
+
+框架中沒有任何功能把行高對應到標題層級或建立章節大綱——``ocr/structure`` 與 ``element_parse``
+純屬位置性,``text_blocks`` 把段落 / 清單分組但不對其排序。``heading_segment`` 補上標準啟發法:
+行高超過 ``heading_ratio`` 乘以中位行高者為標題,且不同的標題高度成為標題*層級*(最高為第 1 級)。
+由此輸出扁平的文件大綱。
+
+純標準函式庫,作用於純行字典(text + bbox);可在無影像、無 OCR 引擎下完整單元測試。重用
+``table_grid_fill`` 的框邊界讀取器。不匯入 ``PySide6``。
+
+無頭 API
+--------
+
+.. code-block:: python
+
+    from je_auto_control import classify_lines, outline
+
+    for item in classify_lines(ocr_lines, heading_ratio=1.2):
+        print(item["role"], item["level"], item["text"])
+
+    for heading in outline(ocr_lines):
+        print("  " * (heading["level"] - 1) + heading["text"])
+
+``classify_lines`` 為每行標記 ``{box, text, role, level}``——``role`` 為 ``"heading"`` 或
+``"body"``,``level`` 為標題層級(1 = 最高,內文為 0)。``outline`` 只回傳依上到下順序的標題,
+為 ``{level, text, top}``——即文件目錄。
+
+執行器指令
+----------
+
+``AC_classify_lines``(``lines`` / ``heading_ratio`` → ``{count, lines}``)與 ``AC_outline``
+(``lines`` / ``heading_ratio`` → ``{count, headings}``)。兩者以 MCP 工具 ``ac_classify_lines`` /
+``ac_outline``(唯讀)及 Script Builder 指令 **Classify Headings vs Body** / **Document Outline**
+(位於 **OCR** 分類下)形式提供。