18 changes: 18 additions & 0 deletions README/WHATS_NEW_zh-CN.md
@@ -1,5 +1,23 @@
# 本次更新 — AutoControl

## 本次更新 (2026-06-24) — 逐步评审特征 + 规则式步骤评分

把为代理步骤评分所需的证据打包,并内建规则式评分器。完整参考:[`docs/source/Zh/doc/new_features/v177_features_doc.rst`](../docs/source/Zh/doc/new_features/v177_features_doc.rst)。

- **`build_critic_record` / `score_step_rule_based` / `to_judge_prompt`**(`AC_build_critic_record`、`AC_score_step`):`trajectory_eval` 对整条轨迹评分而无逐步证据;`agent_trace` 发出 span 而非质量;`agent_replay` 保存步骤却不评分。本功能把 `action_effect` + `observation_delta` + `postcondition` 组合成单一逐步记录,接着 `score_step_rule_based` 给出确定性的 `{outcome, process_score, reasons}`(不需模型),`to_judge_prompt` 把它渲染给可选的 LLM-as-judge。纯标准库聚合器;不导入 `PySide6`。

## 本次更新 (2026-06-24) — 标题与正文分类 + 文档大纲

以高度区分标题与正文,并建立文档大纲。完整参考:[`docs/source/Zh/doc/new_features/v176_features_doc.rst`](../docs/source/Zh/doc/new_features/v176_features_doc.rst)。

- **`classify_lines` / `outline`**(`AC_classify_lines`、`AC_outline`):框架中没有功能把行高对应到标题层级或建立章节大纲——`ocr/structure` / `element_parse` 纯属位置性,`text_blocks` 不排序。本功能套用标准启发法:行高超过 `heading_ratio` × 中位行高者为标题,不同标题高度成为层级(最高 = 1)。`classify_lines` 为每行标记 `{box, text, role, level}`;`outline` 依序返回标题作为目录。纯标准库,作用于行字典;不导入 `PySide6`。

## 本次更新 (2026-06-24) — 变化量序列的稳定检测

判断 UI 何时安定下来——以纯粹、可测试的函数作用于变化序列。完整参考:[`docs/source/Zh/doc/new_features/v175_features_doc.rst`](../docs/source/Zh/doc/new_features/v175_features_doc.rst)。

- **`settle_point` / `is_settled` / `SettleTracker`**(`AC_settle_point`):`smart_waits.wait_until_screen_stable` 把稳定逻辑包在 `time.sleep` 循环内、作用于实时帧——你无法喂记录好的序列,也无法单元测试该决策。本功能把它抽离:给定一串*变化量*(像素差 / 元素数差 / 0-1 digest 是否变),在变化量连续 `quiet_samples` 次维持 ≤ `max_churn` 时报告稳定(尖峰重置 run)。`settle_point` 返回稳定索引,`SettleTracker` 为供实时循环的增量形式。纯标准库,不需时钟、不需捕获;不导入 `PySide6`。

## 本次更新 (2026-06-24) — OCR 行的段落与列表分组

把 OCR 行分组成段落,并检测项目符号 / 编号列表。完整参考:[`docs/source/Zh/doc/new_features/v174_features_doc.rst`](../docs/source/Zh/doc/new_features/v174_features_doc.rst)。
18 changes: 18 additions & 0 deletions README/WHATS_NEW_zh-TW.md
@@ -1,5 +1,23 @@
# 本次更新 — AutoControl

## 本次更新 (2026-06-24) — 逐步評審特徵 + 規則式步驟評分

把為代理步驟評分所需的證據打包,並內建規則式評分器。完整參考:[`docs/source/Zh/doc/new_features/v177_features_doc.rst`](../docs/source/Zh/doc/new_features/v177_features_doc.rst)。

- **`build_critic_record` / `score_step_rule_based` / `to_judge_prompt`**(`AC_build_critic_record`、`AC_score_step`):`trajectory_eval` 對整條軌跡評分而無逐步證據;`agent_trace` 發出 span 而非品質;`agent_replay` 保存步驟卻不評分。本功能把 `action_effect` + `observation_delta` + `postcondition` 組合成單一逐步記錄,接著 `score_step_rule_based` 給出確定性的 `{outcome, process_score, reasons}`(不需模型),`to_judge_prompt` 把它渲染給可選的 LLM-as-judge。純標準函式庫聚合器;不匯入 `PySide6`。

## 本次更新 (2026-06-24) — 標題與內文分類 + 文件大綱

以高度區分標題與內文,並建立文件大綱。完整參考:[`docs/source/Zh/doc/new_features/v176_features_doc.rst`](../docs/source/Zh/doc/new_features/v176_features_doc.rst)。

- **`classify_lines` / `outline`**(`AC_classify_lines`、`AC_outline`):框架中沒有功能把行高對應到標題層級或建立章節大綱——`ocr/structure` / `element_parse` 純屬位置性,`text_blocks` 不排序。本功能套用標準啟發法:行高超過 `heading_ratio` × 中位行高者為標題,不同標題高度成為層級(最高 = 1)。`classify_lines` 為每行標記 `{box, text, role, level}`;`outline` 依序回傳標題作為目錄。純標準函式庫,作用於行字典;不匯入 `PySide6`。

## 本次更新 (2026-06-24) — 變化量序列的穩定偵測

判斷 UI 何時安定下來——以純粹、可測試的函式作用於變化序列。完整參考:[`docs/source/Zh/doc/new_features/v175_features_doc.rst`](../docs/source/Zh/doc/new_features/v175_features_doc.rst)。

- **`settle_point` / `is_settled` / `SettleTracker`**(`AC_settle_point`):`smart_waits.wait_until_screen_stable` 把穩定邏輯包在 `time.sleep` 迴圈內、作用於即時幀——你無法餵記錄好的序列,也無法單元測試該決策。本功能把它抽離:給定一串*變化量*(像素差 / 元素數差 / 0-1 digest 是否變),在變化量連續 `quiet_samples` 次維持 ≤ `max_churn` 時回報穩定(尖峰重置 run)。`settle_point` 回傳穩定索引,`SettleTracker` 為供即時迴圈的增量形式。純標準函式庫,不需時鐘、不需擷取;不匯入 `PySide6`。

## 本次更新 (2026-06-24) — OCR 行的段落與清單分組

把 OCR 行分組成段落,並偵測項目符號 / 編號清單。完整參考:[`docs/source/Zh/doc/new_features/v174_features_doc.rst`](../docs/source/Zh/doc/new_features/v174_features_doc.rst)。
18 changes: 18 additions & 0 deletions WHATS_NEW.md
@@ -1,5 +1,23 @@
# What's New — AutoControl

## What's new (2026-06-24) — Per-Step Critic Features + Rule-Based Step Scorer

Bundle the evidence to score an agent step, with a built-in rule-based scorer. Full reference: [`docs/source/Eng/doc/new_features/v177_features_doc.rst`](docs/source/Eng/doc/new_features/v177_features_doc.rst).

- **`build_critic_record` / `score_step_rule_based` / `to_judge_prompt`** (`AC_build_critic_record`, `AC_score_step`): `trajectory_eval` scores a whole trajectory with no per-step evidence; `agent_trace` emits spans not quality; `agent_replay` stores steps but doesn't score. This composes `action_effect` + `observation_delta` + `postcondition` into one per-step record, then `score_step_rule_based` gives a deterministic `{outcome, process_score, reasons}` (no model needed) and `to_judge_prompt` renders it for an optional LLM-as-judge. Pure-stdlib aggregator; no `PySide6`.

## What's new (2026-06-24) — Heading vs Body Classification + Document Outline

Tell headings from body text by height and build a document outline. Full reference: [`docs/source/Eng/doc/new_features/v176_features_doc.rst`](docs/source/Eng/doc/new_features/v176_features_doc.rst).

- **`classify_lines` / `outline`** (`AC_classify_lines`, `AC_outline`): nothing mapped line height to heading levels or built a section outline — `ocr/structure` / `element_parse` are positional and `text_blocks` doesn't rank. This applies the standard heuristic: a line taller than `heading_ratio` × the median line height is a heading, and distinct heading heights become levels (tallest = 1). `classify_lines` tags each line `{box, text, role, level}`; `outline` returns the headings in order as a table of contents. Pure-stdlib over line dicts; no `PySide6`.

## What's new (2026-06-24) — Settle Detection Over a Churn Series

Decide when the UI has gone quiet — as a pure, testable function over a change series. Full reference: [`docs/source/Eng/doc/new_features/v175_features_doc.rst`](docs/source/Eng/doc/new_features/v175_features_doc.rst).

- **`settle_point` / `is_settled` / `SettleTracker`** (`AC_settle_point`): `smart_waits.wait_until_screen_stable` bakes the settle logic inside a `time.sleep` loop over live frames — you can't feed it a recorded series or unit-test the decision. This extracts it: given a stream of *churn* values (pixel delta / element-count delta / 0-1 digest-changed), it reports when churn stayed ≤ `max_churn` for `quiet_samples` in a row (a spike resets the run). `settle_point` returns the settle index, `SettleTracker` is the incremental form for a live loop. Pure-stdlib, no clock, no capture; no `PySide6`.

## What's new (2026-06-24) — Paragraph & List Grouping of OCR Lines

Group OCR lines into paragraphs and detect bulleted / numbered lists. Full reference: [`docs/source/Eng/doc/new_features/v174_features_doc.rst`](docs/source/Eng/doc/new_features/v174_features_doc.rst).
43 changes: 43 additions & 0 deletions docs/source/Eng/doc/new_features/v175_features_doc.rst
@@ -0,0 +1,43 @@
Settle Detection Over a Churn Series
====================================

``smart_waits.wait_until_screen_stable`` and ``actionability``'s stability check bake the
settle logic *inside* a ``time.sleep`` polling loop over live pixel frames — you cannot feed
them a recorded series of a11y-element counts or screen-diff metrics, and you cannot unit-test
the *decision* independently of capture. ``settle_detector`` extracts that decision: it takes a
stream of *churn* values (how much changed each sample — a pixel delta, an element-count delta,
a digest-changed 0/1, anything) and reports when the churn has stayed at or below ``max_churn``
for ``quiet_samples`` in a row. A spike resets the quiet run, so "settled then changed again"
is handled.

Pure-stdlib; deterministic and unit-testable on an injected series with no capture and no
clock. Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import settle_point, is_settled, SettleTracker

    churns = [5, 4, 0.5, 0.3, 0.2]                          # per-frame change metric
    settle_point(churns, quiet_samples=3, max_churn=1.0)    # -> 4
    is_settled(churns, quiet_samples=3, max_churn=1.0)      # -> True

    # incremental, for a live loop (you supply the churn each tick);
    # current_churn and observe_now() are placeholders for caller code
    tracker = SettleTracker(quiet_samples=3, max_churn=1.0)
    state = tracker.update(current_churn)
    if state.settled:
        observe_now()

``settle_point`` returns the index at which the series first settles (or ``None`` if it never
does); ``is_settled`` is the boolean form. ``SettleTracker`` is the incremental form:
``update(churn)`` returns a ``SettleState`` (``settled`` / ``quiet_run`` / ``churn``), and
``reset`` clears the run (e.g. right after acting again).
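
The quiet-run rule described above can be sketched in a few lines. This is a minimal illustration, not the module's actual source: ``settle_index`` and ``QuietRunTracker`` are hypothetical names, and the sketch assumes the reported index is that of the sample that completes the quiet run, which agrees with the ``-> 4`` result in the example above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


def settle_index(churns: Iterable[float], quiet_samples: int = 3,
                 max_churn: float = 0.0) -> Optional[int]:
    """Hypothetical stand-in for settle_point (not the library's code).

    Returns the index of the sample that completes the first run of
    `quiet_samples` consecutive values <= max_churn, or None.
    """
    if quiet_samples < 1:
        raise ValueError("quiet_samples must be >= 1")
    quiet_run = 0
    for index, churn in enumerate(churns):
        # A quiet sample extends the run; a spike resets it to zero.
        quiet_run = quiet_run + 1 if churn <= max_churn else 0
        if quiet_run >= quiet_samples:
            return index
    return None


@dataclass
class QuietRunTracker:
    """Hypothetical incremental form: feed one churn value per tick."""
    quiet_samples: int = 3
    max_churn: float = 0.0
    quiet_run: int = 0

    def update(self, churn: float) -> bool:
        self.quiet_run = self.quiet_run + 1 if churn <= self.max_churn else 0
        return self.quiet_run >= self.quiet_samples

    def reset(self) -> None:
        # Call after acting again so earlier quiet samples do not count.
        self.quiet_run = 0


print(settle_index([5, 4, 0.5, 0.3, 0.2], quiet_samples=3, max_churn=1.0))    # 4
print(settle_index([0.1, 0.2, 9, 0.1, 0.2], quiet_samples=3, max_churn=1.0))  # None
```

Because the function only consumes numbers, a recorded churn series from a failed run can be replayed through it in a unit test without any screen capture.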

Executor command
----------------

``AC_settle_point`` (``churns`` / ``quiet_samples`` / ``max_churn`` → ``{settled, index}``) is
exposed as the MCP tool ``ac_settle_point`` (read-only) and as the Script Builder command
**Settle Point (churn series)** under **Flow**.
37 changes: 37 additions & 0 deletions docs/source/Eng/doc/new_features/v176_features_doc.rst
@@ -0,0 +1,37 @@
Heading vs Body Classification + Document Outline
=================================================

Nothing in the framework maps line height to heading levels or builds a section outline —
``ocr/structure`` and ``element_parse`` are purely positional, and ``text_blocks`` groups
paragraphs / lists but does not rank them. ``heading_segment`` adds the standard heuristic:
a line whose height exceeds ``heading_ratio`` times the median line height is a heading, and
distinct heading heights become heading *levels* (the tallest is level 1). From that it emits
a flat document outline.

Pure-stdlib over plain line dicts (text + bbox); fully unit-testable with no image and no OCR
engine. Reuses ``table_grid_fill``'s box-bounds reader. Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import classify_lines, outline

    for item in classify_lines(ocr_lines, heading_ratio=1.2):
        print(item["role"], item["level"], item["text"])

    for heading in outline(ocr_lines):
        print(" " * (heading["level"] - 1) + heading["text"])

``classify_lines`` tags each line ``{box, text, role, level}`` — ``role`` is ``"heading"`` or
``"body"``, ``level`` is the heading level (1 = tallest, 0 for body). ``outline`` returns just
the headings in top-to-bottom order as ``{level, text, top}`` — a document table of contents.
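
The height heuristic can be illustrated with a short standalone sketch. It is not the module's implementation: ``classify_by_height`` and ``headings_outline`` are hypothetical names, the ``box`` layout ``(left, top, right, bottom)`` is an assumption, and levels are assigned from exact distinct heights, whereas a real implementation may well tolerate small OCR height jitter.

```python
from statistics import median
from typing import Any, Dict, List


def classify_by_height(lines: List[Dict[str, Any]],
                       heading_ratio: float = 1.2) -> List[Dict[str, Any]]:
    """Hypothetical sketch: tag each line as heading or body by its height.

    Assumes each line is {"text": str, "box": (left, top, right, bottom)}.
    """
    if not lines:
        return []
    heights = [line["box"][3] - line["box"][1] for line in lines]
    threshold = heading_ratio * median(heights)
    # Distinct heading heights, tallest first, become levels 1, 2, ...
    heading_heights = sorted({h for h in heights if h > threshold}, reverse=True)
    level_of = {h: rank for rank, h in enumerate(heading_heights, start=1)}
    tagged = []
    for line, height in zip(lines, heights):
        level = level_of.get(height, 0)  # 0 marks body text
        tagged.append({"box": line["box"], "text": line["text"],
                       "role": "heading" if level else "body", "level": level})
    return tagged


def headings_outline(lines: List[Dict[str, Any]],
                     heading_ratio: float = 1.2) -> List[Dict[str, Any]]:
    """Hypothetical sketch: headings only, ordered top to bottom."""
    headings = [{"level": t["level"], "text": t["text"], "top": t["box"][1]}
                for t in classify_by_height(lines, heading_ratio)
                if t["role"] == "heading"]
    return sorted(headings, key=lambda h: h["top"])


sample = [
    {"text": "User Guide", "box": (0, 0, 200, 40)},        # height 40
    {"text": "Welcome.", "box": (0, 50, 200, 62)},         # height 12
    {"text": "Setup", "box": (0, 70, 200, 94)},            # height 24
    {"text": "Install it.", "box": (0, 100, 200, 112)},    # height 12
    {"text": "Then run it.", "box": (0, 115, 200, 127)},   # height 12
]
for heading in headings_outline(sample):
    print(heading["level"], heading["text"])
# 1 User Guide
# 2 Setup
```

Here the median height is 12, so the threshold is 14.4; the 40-pixel and 24-pixel lines qualify as headings and are ranked by height.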

Executor commands
-----------------

``AC_classify_lines`` (``lines`` / ``heading_ratio`` → ``{count, lines}``) and ``AC_outline``
(``lines`` / ``heading_ratio`` → ``{count, headings}``). They are exposed as the MCP tools
``ac_classify_lines`` / ``ac_outline`` (read-only) and as the Script Builder commands
**Classify Headings vs Body** / **Document Outline** under **OCR**.
46 changes: 46 additions & 0 deletions docs/source/Eng/doc/new_features/v177_features_doc.rst
@@ -0,0 +1,46 @@
Per-Step Critic Features + Rule-Based Step Scorer
=================================================

Scoring an agent's step needs the evidence in one place — what the action was, what changed,
whether it landed on target, whether the declared postcondition held. ``trajectory_eval``
scores a *finished whole trajectory* against a static rubric and has no per-step evidence;
``agent_trace`` emits OTel spans (tokens / latency), not decision quality; ``agent_replay``
persists ``{obs, action, result}`` but does no scoring. ``critic_features`` is the missing
per-step layer: it composes ``action_effect`` (did it do anything, where),
``observation_delta`` (how much changed) and ``postcondition`` (did the expected outcome hold)
into one compact record, and ships a deterministic rule-based scorer so the feature works fully
headless — leaving the optional LLM-as-judge to the integrator.

Pure-stdlib; composes existing pure modules; deterministic and unit-testable with no device.
Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import (build_critic_record, score_step_rule_based,
                                 to_judge_prompt)

    record = build_critic_record({"type": "click", "x": 480, "y": 260},
                                 before_elements, after_elements,
                                 postcondition={"appears": {"role": "dialog"}})
    score = score_step_rule_based(record)
    # {"outcome": True, "process_score": 1.0, "reasons": [...]}

    prompt = to_judge_prompt(record)   # compact text for an LLM-as-judge

``build_critic_record`` returns ``{action, effect, delta_counts}`` plus a ``postcondition``
report when a spec is given. ``score_step_rule_based`` returns ``{outcome, process_score,
reasons}``: ``outcome`` is a binary success flag (the action did something *and* any
postcondition held), and ``process_score`` is a quality score between 0 and 1 derived from the
effect class (halved if the postcondition failed). ``to_judge_prompt`` renders the record for
an external judge.
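
The scoring rule can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the function name ``score_step``, the effect class names (``on_target``, ``off_target``, ``no_effect``), their base scores, and the record keys ``effect["class"]`` and ``postcondition["ok"]`` are hypothetical, since this document does not list the real ones. Only the overall rule (binary outcome, class-based score, halved on a failed postcondition) comes from the description above.

```python
from typing import Any, Dict, List

# Hypothetical effect classes and base scores; the real module's names and
# weights are not given in this document.
BASE_SCORE = {"on_target": 1.0, "off_target": 0.5, "no_effect": 0.0}


def score_step(record: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical rule-based scorer over an assumed record shape."""
    effect_class = record.get("effect", {}).get("class", "no_effect")
    did_something = effect_class != "no_effect"
    post = record.get("postcondition")            # absent when no spec was given
    post_ok = True if post is None else bool(post.get("ok"))

    reasons: List[str] = [f"effect={effect_class}"]
    score = BASE_SCORE.get(effect_class, 0.0)
    if post is not None:
        reasons.append("postcondition held" if post_ok else "postcondition failed")
        if not post_ok:
            score /= 2                            # halve on a failed postcondition
    return {"outcome": did_something and post_ok,
            "process_score": score,
            "reasons": reasons}


print(score_step({"effect": {"class": "on_target"}, "postcondition": {"ok": True}}))
# {'outcome': True, 'process_score': 1.0, 'reasons': ['effect=on_target', 'postcondition held']}
print(score_step({"effect": {"class": "on_target"}, "postcondition": {"ok": False}}))
# {'outcome': False, 'process_score': 0.5, 'reasons': ['effect=on_target', 'postcondition failed']}
```

Keeping ``outcome`` binary and ``process_score`` graded lets a caller separate "did the step succeed" from "how cleanly did it get there", and the ``reasons`` list gives an optional LLM judge something concrete to read.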

Executor commands
-----------------

``AC_build_critic_record`` (``action`` / ``before`` / ``after`` / ``postcondition`` /
``radius`` → the record) and ``AC_score_step`` (``record`` → ``{outcome, process_score,
reasons}``). They are exposed as the MCP tools ``ac_build_critic_record`` / ``ac_score_step``
(read-only) and as the Script Builder commands **Build Critic Record** / **Score Step
(rule-based)** under **Native UI**.
3 changes: 3 additions & 0 deletions docs/source/Eng/eng_index.rst
@@ -197,6 +197,9 @@ Comprehensive guides for all AutoControl features.
doc/new_features/v172_features_doc
doc/new_features/v173_features_doc
doc/new_features/v174_features_doc
doc/new_features/v175_features_doc
doc/new_features/v176_features_doc
doc/new_features/v177_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
39 changes: 39 additions & 0 deletions docs/source/Zh/doc/new_features/v175_features_doc.rst
@@ -0,0 +1,39 @@
變化量序列的穩定偵測
====================

``smart_waits.wait_until_screen_stable`` 與 ``actionability`` 的穩定檢查把穩定邏輯包在
``time.sleep`` 輪詢迴圈內、作用於即時像素幀——你無法餵給它一段記錄好的 a11y 元素數或畫面
差異指標序列,也無法獨立於擷取去單元測試那個*決策*。``settle_detector`` 把該決策抽離:它接收
一串*變化量*(churn,每個樣本變了多少——像素差、元素數差、digest 是否變的 0/1,皆可),並在
變化量連續 ``quiet_samples`` 次維持在 ``max_churn`` 以下時回報穩定。尖峰會重置 quiet run,因此
「穩定後又變動」也能處理。

純標準函式庫;確定性、可在注入序列上單元測試,不需擷取、不需時鐘。不匯入 ``PySide6``。

無頭 API
--------

.. code-block:: python

    from je_auto_control import settle_point, is_settled, SettleTracker

    churns = [5, 4, 0.5, 0.3, 0.2]                          # 每幀變化量指標
    settle_point(churns, quiet_samples=3, max_churn=1.0)    # -> 4
    is_settled(churns, quiet_samples=3, max_churn=1.0)      # -> True

    # 增量版,供即時迴圈(你每 tick 提供 churn)
    tracker = SettleTracker(quiet_samples=3, max_churn=1.0)
    state = tracker.update(current_churn)
    if state.settled:
        observe_now()

``settle_point`` 回傳序列首次穩定的索引(或 ``None``);``is_settled`` 為布林。``SettleTracker``
為增量形式:``update(churn)`` 回傳 ``SettleState``(``settled`` / ``quiet_run`` / ``churn``);
``reset`` 清除 run(例如在再次動作後)。

執行器指令
----------

``AC_settle_point``(``churns`` / ``quiet_samples`` / ``max_churn`` → ``{settled, index}``)
以 MCP 工具 ``ac_settle_point``(唯讀)及 Script Builder 指令 **Settle Point (churn series)**
(位於 **Flow** 分類下)形式提供。
35 changes: 35 additions & 0 deletions docs/source/Zh/doc/new_features/v176_features_doc.rst
@@ -0,0 +1,35 @@
標題與內文分類 + 文件大綱
==========================

框架中沒有任何功能把行高對應到標題層級或建立章節大綱——``ocr/structure`` 與 ``element_parse``
純屬位置性,``text_blocks`` 把段落 / 清單分組但不對其排序。``heading_segment`` 補上標準啟發法:
行高超過 ``heading_ratio`` 乘以中位行高者為標題,且不同的標題高度成為標題*層級*(最高為第 1 級)。
由此輸出扁平的文件大綱。

純標準函式庫,作用於純行字典(text + bbox);可在無影像、無 OCR 引擎下完整單元測試。重用
``table_grid_fill`` 的框邊界讀取器。不匯入 ``PySide6``。

無頭 API
--------

.. code-block:: python

    from je_auto_control import classify_lines, outline

    for item in classify_lines(ocr_lines, heading_ratio=1.2):
        print(item["role"], item["level"], item["text"])

    for heading in outline(ocr_lines):
        print(" " * (heading["level"] - 1) + heading["text"])

``classify_lines`` 為每行標記 ``{box, text, role, level}``——``role`` 為 ``"heading"`` 或
``"body"``,``level`` 為標題層級(1 = 最高,內文為 0)。``outline`` 只回傳依上到下順序的標題,
為 ``{level, text, top}``——即文件目錄。

執行器指令
----------

``AC_classify_lines``(``lines`` / ``heading_ratio`` → ``{count, lines}``)與 ``AC_outline``
(``lines`` / ``heading_ratio`` → ``{count, headings}``)。兩者以 MCP 工具 ``ac_classify_lines`` /
``ac_outline``(唯讀)及 Script Builder 指令 **Classify Headings vs Body** / **Document Outline**
(位於 **OCR** 分類下)形式提供。