6 changes: 6 additions & 0 deletions README/WHATS_NEW_zh-CN.md
@@ -1,5 +1,11 @@
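# What's New - AutoControl

## What's new (2026-06-24) - Whitespace-Projection Column Inference (Borderless Tables)

Read borderless tables by inferring columns from the whitespace gaps. Full reference: [`docs/source/Zh/doc/new_features/v165_features_doc.rst`](../docs/source/Zh/doc/new_features/v165_features_doc.rst).

- **`detect_borderless_table` / `column_gutters` / `assign_columns` / `vertical_projection`** (`AC_detect_borderless_table`, `AC_column_gutters`): `ocr/structure` only detects a table when the cell left-edge x matches in every row, so it fails on ragged / borderless / right-aligned columns; `edge_lines.find_grid` needs ruling lines, which a whitespace table does not have. This feature finds columns by the *gaps*: it projects the OCR boxes onto the x-axis, reads the persistently empty vertical bands as gutters, assigns column indices, groups rows by spacing, and outputs `{n_rows, n_cols, rows, columns}`. Pure-stdlib difference-array projection (no numpy needed); reuses `table_grid_fill`'s box reader. Does not import `PySide6`.

## What's new (2026-06-24) - Auto-Thresholded Template Matching (Otsu on the Score Map)

No more hand-tuning `min_score`: the match threshold is derived from the score map. Full reference: [`docs/source/Zh/doc/new_features/v164_features_doc.rst`](../docs/source/Zh/doc/new_features/v164_features_doc.rst).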
6 changes: 6 additions & 0 deletions README/WHATS_NEW_zh-TW.md
@@ -1,5 +1,11 @@
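# What's New - AutoControl

## What's new (2026-06-24) - Whitespace-Projection Column Detection (Borderless Tables)

Read borderless tables by deriving columns from the whitespace gaps. Full reference: [`docs/source/Zh/doc/new_features/v165_features_doc.rst`](../docs/source/Zh/doc/new_features/v165_features_doc.rst).

- **`detect_borderless_table` / `column_gutters` / `assign_columns` / `vertical_projection`** (`AC_detect_borderless_table`, `AC_column_gutters`): `ocr/structure` only detects a table when the cell left-edge x matches in every row, so it fails on ragged / borderless / right-aligned columns; `edge_lines.find_grid` needs ruling lines, which a whitespace table does not have. This feature finds columns by the *gaps*: it projects the OCR boxes onto the x-axis, reads the persistently empty vertical bands as gutters, assigns column indices, groups rows by spacing, and outputs `{n_rows, n_cols, rows, columns}`. Pure-stdlib difference-array projection (no numpy needed); reuses `table_grid_fill`'s box reader. Does not import `PySide6`.

## What's new (2026-06-24) - Auto-Thresholded Template Matching (Otsu on the Score Map)

No more hand-tuning `min_score`: the match threshold is derived from the score map. Full reference: [`docs/source/Zh/doc/new_features/v164_features_doc.rst`](../docs/source/Zh/doc/new_features/v164_features_doc.rst).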
6 changes: 6 additions & 0 deletions WHATS_NEW.md
@@ -1,5 +1,11 @@
# What's New — AutoControl

## What's new (2026-06-24) — Whitespace-Projection Columns (Borderless Tables)

Read borderless tables by inferring columns from the whitespace gaps. Full reference: [`docs/source/Eng/doc/new_features/v165_features_doc.rst`](docs/source/Eng/doc/new_features/v165_features_doc.rst).

- **`detect_borderless_table` / `column_gutters` / `assign_columns` / `vertical_projection`** (`AC_detect_borderless_table`, `AC_column_gutters`): `ocr/structure` only detects a table when every row's cell-left-x matches — it fails on ragged / borderless / right-aligned columns; `edge_lines.find_grid` needs ruling lines a whitespace table doesn't have. This finds columns by the *gaps*: project OCR boxes onto the x-axis, read the persistent empty vertical bands as gutters, assign column indices, bucket rows by spacing, and emit `{n_rows, n_cols, rows, columns}`. Pure-stdlib difference-array projection (no numpy); reuses `table_grid_fill`'s box reader. No `PySide6`.

## What's new (2026-06-24) — Auto-Thresholded Template Matching (Otsu on the Score Map)

No more hand-tuned `min_score` — derive the match threshold from the score map. Full reference: [`docs/source/Eng/doc/new_features/v164_features_doc.rst`](docs/source/Eng/doc/new_features/v164_features_doc.rst).
46 changes: 46 additions & 0 deletions docs/source/Eng/doc/new_features/v165_features_doc.rst
@@ -0,0 +1,46 @@
Whitespace-Projection Columns (Borderless Tables)
=================================================

``ocr/structure`` detects tables only when *every* row's cell-left-x matches within a
tolerance — it collapses on ragged or borderless tables, right-aligned numeric columns, or
any row with a missing cell. ``edge_lines.find_grid`` needs ruling lines, so a table drawn
purely with whitespace has no grid at all. ``column_layout`` finds columns the robust way the
layout-analysis literature uses: by the *gaps*. It projects the OCR boxes onto the x-axis (an
ink-density profile), reads off the persistent empty vertical bands as column gutters, assigns
each box a column index, and buckets rows by vertical spacing to emit a borderless table.

Pure-stdlib over plain box dicts (a difference-array projection — no numpy), so it is fully
unit-testable with no image and no OCR engine. Reuses ``table_grid_fill``'s box-bounds reader.
Imports no ``PySide6``.
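
The difference-array projection mentioned above can be illustrated with a minimal, self-contained sketch. It is not the module's code: the helper name ``projection`` and the use of plain ``(left, right)`` spans instead of box dicts are assumptions made for illustration. Each span adds ``+1`` at its left edge and ``-1`` at its right edge, and a running sum then gives the number of spans covering every x.

```python
from typing import List, Sequence, Tuple


def projection(spans: Sequence[Tuple[int, int]], width: int) -> List[int]:
    """Count how many half-open [left, right) spans cover each x in [0, width)."""
    diff = [0] * (width + 1)
    for left, right in spans:
        # Clamp to the page so out-of-range spans cannot index past the array.
        left = max(0, min(width, left))
        right = max(0, min(width, right))
        if right > left:
            diff[left] += 1    # coverage starts here
            diff[right] -= 1   # coverage stops here
    profile, running = [], 0
    for x in range(width):
        running += diff[x]     # prefix sum turns edge marks into coverage counts
        profile.append(running)
    return profile


# Hypothetical spans: two overlapping boxes and one separate box.
spans = [(2, 5), (3, 8), (10, 12)]
profile = projection(spans, width=12)
print(profile)  # [0, 0, 1, 2, 2, 1, 1, 1, 0, 0, 1, 1]
```

The cost is one pass over the boxes plus one pass over the page width, which is why no array library is required.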

Headless API
------------

.. code-block:: python

    from je_auto_control import (detect_borderless_table, column_gutters,
                                 assign_columns, vertical_projection)

    table = detect_borderless_table(ocr_boxes)
    # {"n_rows": 3, "n_cols": 2, "rows": [["Name","Age"],["Ann","30"],["Bob","25"]],
    #  "columns": [{"start": 70, "end": 120, "width": 50}]}

    gutters = column_gutters(ocr_boxes, min_gap=8)   # empty vertical bands
    tagged = assign_columns(ocr_boxes)               # each box + "column" index
    profile = vertical_projection(ocr_boxes)         # ink density per x

``vertical_projection`` returns the per-x ink-density profile; ``column_gutters`` returns the
interior empty bands ``[{start, end, width}]`` at least ``min_gap`` wide; ``assign_columns``
tags every box with a 0-based ``column``; ``detect_borderless_table`` combines columns (from
gutters) with rows (from vertical spacing) into ``{n_rows, n_cols, rows, columns}``, or
``None`` when fewer than ``min_cols`` columns / ``min_rows`` rows are found. Boxes accept
``{x, y, width, height}`` or ``{left, top, right, bottom}`` plus an optional ``text``.
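
As a rough illustration of how gutters and column indices relate, the sketch below reads interior empty bands off a hand-built profile and assigns columns by gutter mid-points. The helper names ``gutters_from_profile`` and ``column_of`` are hypothetical stand-ins, and the profile is constructed by hand for two boxes spanning x = 10..70 and x = 120..160 rather than computed by the library.

```python
from typing import Dict, List, Optional


def gutters_from_profile(profile: List[int], min_gap: int = 8) -> List[Dict[str, int]]:
    """Return interior zero-runs of the profile that are at least min_gap wide."""
    found: List[Dict[str, int]] = []
    start: Optional[int] = None
    for x, value in enumerate(profile):
        if value == 0:
            if start is None:
                start = x          # an empty run begins
            continue
        # A covered x closes the run; start > 0 skips the left page margin.
        if start is not None and start > 0 and x - start >= min_gap:
            found.append({"start": start, "end": x, "width": x - start})
        start = None
    return found


def column_of(center_x: int, cuts: List[int]) -> int:
    """0-based column index: the number of cuts at or to the left of center_x."""
    return sum(1 for cut in cuts if center_x >= cut)


# Margin 0..10, first box 10..70, empty band 70..120, second box 120..160.
profile = [0] * 10 + [1] * 60 + [0] * 50 + [1] * 40
gutters = gutters_from_profile(profile)
cuts = [(g["start"] + g["end"]) // 2 for g in gutters]
columns = [column_of(cx, cuts) for cx in (40, 140)]  # box centres

print(gutters)  # [{'start': 70, 'end': 120, 'width': 50}]
print(cuts)     # [95]
print(columns)  # [0, 1]
```

A trailing empty band is never closed by a covered x, so it is not reported; only bands between two columns count as gutters in this sketch.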

Executor commands
-----------------

``AC_detect_borderless_table`` (``boxes`` / ``page_width`` / ``min_gap`` / ``min_cols`` /
``min_rows`` → ``{found, table}``) and ``AC_column_gutters`` (``boxes`` / ``page_width`` /
``min_gap`` → ``{count, gutters}``). They are exposed as the MCP tools
``ac_detect_borderless_table`` / ``ac_column_gutters`` (read-only) and as the Script Builder
commands **Detect Borderless Table** / **Column Gutters (whitespace)** under **OCR**.
1 change: 1 addition & 0 deletions docs/source/Eng/eng_index.rst
@@ -187,6 +187,7 @@ Comprehensive guides for all AutoControl features.
    doc/new_features/v162_features_doc
    doc/new_features/v163_features_doc
    doc/new_features/v164_features_doc
    doc/new_features/v165_features_doc
    doc/ocr_backends/ocr_backends_doc
    doc/observability/observability_doc
    doc/operations_layer/operations_layer_doc
42 changes: 42 additions & 0 deletions docs/source/Zh/doc/new_features/v165_features_doc.rst
@@ -0,0 +1,42 @@
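Whitespace-Projection Column Detection (Borderless Tables)
==========================================================

``ocr/structure`` detects a table only when the cell left-edge x of *every* row matches within
a tolerance, so it fails on ragged or borderless tables, right-aligned numeric columns, or any
row with a missing cell. ``edge_lines.find_grid`` needs ruling lines, so a table drawn purely
with whitespace has no grid at all. ``column_layout`` finds columns with the robust method
common in the layout-analysis literature: by the *gaps*. It projects the OCR boxes onto the
x-axis (an ink-density profile), reads the persistently empty vertical bands as column gutters,
assigns each box a column index, and groups rows by vertical spacing to output a borderless table.

Pure-stdlib over plain box dicts (a difference-array projection, no numpy needed), so it is
fully unit-testable with no image and no OCR engine. Reuses ``table_grid_fill``'s box-bounds
reader. Does not import ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import (detect_borderless_table, column_gutters,
                                 assign_columns, vertical_projection)

    table = detect_borderless_table(ocr_boxes)
    # {"n_rows": 3, "n_cols": 2, "rows": [["Name","Age"],["Ann","30"],["Bob","25"]],
    #  "columns": [{"start": 70, "end": 120, "width": 50}]}

    gutters = column_gutters(ocr_boxes, min_gap=8)   # empty vertical bands
    tagged = assign_columns(ocr_boxes)               # each box + "column" index
    profile = vertical_projection(ocr_boxes)         # ink density per x

``vertical_projection`` returns the per-x ink-density profile; ``column_gutters`` returns the
interior empty bands ``[{start, end, width}]`` that are at least ``min_gap`` wide;
``assign_columns`` tags every box with a 0-based ``column``; ``detect_borderless_table``
combines columns (from gutters) with rows (from vertical spacing) into
``{n_rows, n_cols, rows, columns}``, or returns ``None`` when fewer than ``min_cols`` columns or
``min_rows`` rows are found. Boxes accept ``{x, y, width, height}`` or
``{left, top, right, bottom}`` plus an optional ``text``.

Executor commands
-----------------

``AC_detect_borderless_table`` (``boxes`` / ``page_width`` / ``min_gap`` / ``min_cols`` /
``min_rows`` → ``{found, table}``) and ``AC_column_gutters`` (``boxes`` / ``page_width`` /
``min_gap`` → ``{count, gutters}``). Both are exposed as the MCP tools
``ac_detect_borderless_table`` / ``ac_column_gutters`` (read-only) and as the Script Builder
commands **Detect Borderless Table** / **Column Gutters (whitespace)** under the **OCR** category.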
1 change: 1 addition & 0 deletions docs/source/Zh/zh_index.rst
@@ -187,6 +187,7 @@ Complete usage guide for all AutoControl features.
    doc/new_features/v162_features_doc
    doc/new_features/v163_features_doc
    doc/new_features/v164_features_doc
    doc/new_features/v165_features_doc
    doc/ocr_backends/ocr_backends_doc
    doc/observability/observability_doc
    doc/operations_layer/operations_layer_doc
8 changes: 8 additions & 0 deletions je_auto_control/__init__.py
@@ -303,6 +303,10 @@
from je_auto_control.utils.observation_delta import (
    delta_index, delta_observation, summarize_delta,
)
# Infer columns from vertical whitespace (borderless tables)
from je_auto_control.utils.column_layout import (
    assign_columns, column_gutters, detect_borderless_table, vertical_projection,
)
# Locate on-screen regions by colour (mask + connected components)
from je_auto_control.utils.color_region import (
    find_color_region, find_color_regions,
@@ -1230,6 +1234,10 @@ def start_autocontrol_gui(*args, **kwargs):
    "delta_index",
    "delta_observation",
    "summarize_delta",
    "vertical_projection",
    "column_gutters",
    "assign_columns",
    "detect_borderless_table",
    "find_color_region",
    "find_color_regions",
    "ssim_compare",
21 changes: 21 additions & 0 deletions je_auto_control/gui/script_builder/command_schema.py
@@ -708,6 +708,27 @@ def _add_ocr_specs(specs: List[CommandSpec]) -> None:
        ),
        description="Drop OCR text boxes into a ruling-line grid → addressable table.",
    ))
    specs.append(CommandSpec(
        "AC_detect_borderless_table", "OCR", "Detect Borderless Table",
        fields=(
            FieldSpec("boxes", FieldType.STRING,
                      placeholder='[{"x":10,"y":0,"width":60,"height":18,'
                                  '"text":"Name"}]'),
            FieldSpec("min_gap", FieldType.INT, optional=True, default=8),
            FieldSpec("page_width", FieldType.INT, optional=True),
        ),
        description="Infer a borderless table from OCR boxes via whitespace columns.",
    ))
    specs.append(CommandSpec(
        "AC_column_gutters", "OCR", "Column Gutters (whitespace)",
        fields=(
            FieldSpec("boxes", FieldType.STRING,
                      placeholder='[{"x":10,"y":0,"width":60,"height":18}]'),
            FieldSpec("min_gap", FieldType.INT, optional=True, default=8),
            FieldSpec("page_width", FieldType.INT, optional=True),
        ),
        description="Find borderless-table column separators by whitespace projection.",
    ))
    specs.append(CommandSpec(
        "AC_scroll_to_find", "OCR", "Scroll Until Visible",
        fields=(
9 changes: 9 additions & 0 deletions je_auto_control/utils/column_layout/__init__.py
@@ -0,0 +1,9 @@
"""Infer columns from vertical whitespace, for borderless tables."""
from je_auto_control.utils.column_layout.column_layout import (
    assign_columns, column_gutters, detect_borderless_table, vertical_projection,
)

__all__ = [
    "vertical_projection", "column_gutters",
    "assign_columns", "detect_borderless_table",
]
141 changes: 141 additions & 0 deletions je_auto_control/utils/column_layout/column_layout.py
@@ -0,0 +1,141 @@
"""Infer columns from vertical whitespace, for borderless tables.

``ocr/structure`` detects tables only when *every* row's cell-left-x matches within a
tolerance — it collapses on ragged or borderless tables, right-aligned numeric columns, or
any row with a missing cell. ``edge_lines.find_grid`` needs ruling lines, so a table drawn
purely with whitespace (no borders) has no grid at all. ``column_layout`` finds columns the
robust way the layout-analysis literature uses: by the *gaps*. It projects the OCR boxes onto
the x-axis (an ink-density profile), reads off the persistent empty vertical bands as column
gutters, and assigns each box a column index — then buckets rows by vertical spacing to emit a
borderless table.

Pure-stdlib over plain box dicts (no numpy needed — a difference-array projection), so it is
fully unit-testable with no image and no OCR engine. Reuses ``table_grid_fill``'s box-bounds
reader. Imports no ``PySide6``.
"""
from typing import Any, Dict, List, Optional, Sequence

from je_auto_control.utils.table_grid_fill.table_grid_fill import _box_bounds

Box = Dict[str, Any]


def _center_x(box: Box) -> int:
    left, _, right, _ = _box_bounds(box)
    return (left + right) // 2


def _center_y(box: Box) -> int:
    _, top, _, bottom = _box_bounds(box)
    return (top + bottom) // 2


def vertical_projection(boxes: Sequence[Box], *,
                        page_width: Optional[int] = None) -> List[int]:
    """Return the per-column ink-density profile (how many boxes cover each x)."""
    bounds = [_box_bounds(box) for box in boxes]
    width = int(page_width) if page_width else max((r for _, _, r, _ in bounds),
                                                   default=0)
    diff = [0] * (width + 1)
    for left, _, right, _ in bounds:
        left, right = max(0, min(width, left)), max(0, min(width, right))
        if right > left:
            diff[left] += 1
            diff[right] -= 1
    profile, running = [], 0
    for x in range(width):
        running += diff[x]
        profile.append(running)
    return profile


def column_gutters(boxes: Sequence[Box], *, page_width: Optional[int] = None,
                   min_gap: int = 8) -> List[Dict[str, int]]:
    """Return the interior empty vertical bands (column separators) >= ``min_gap`` wide."""
    profile = vertical_projection(boxes, page_width=page_width)
    gutters: List[Dict[str, int]] = []
    start: Optional[int] = None
    for x, value in enumerate(profile):
        if value == 0:
            start = x if start is None else start
            continue
        if start is not None and start > 0 and x - start >= int(min_gap):
            gutters.append({"start": start, "end": x, "width": x - start})
        start = None
    return gutters


def _column_cuts(gutters: Sequence[Dict[str, int]]) -> List[int]:
    """Return the x mid-points of the gutters — the column boundaries."""
    return [(gutter["start"] + gutter["end"]) // 2 for gutter in gutters]


def _column_of(center_x: int, cuts: Sequence[int]) -> int:
    """Return the 0-based column index of a centre-x given the boundary cuts."""
    index = 0
    for cut in cuts:
        if center_x < cut:
            break
        index += 1
    return index


def assign_columns(boxes: Sequence[Box], *, page_width: Optional[int] = None,
                   min_gap: int = 8) -> List[Box]:
    """Return each box tagged with a ``column`` index derived from the gutters."""
    cuts = _column_cuts(column_gutters(boxes, page_width=page_width, min_gap=min_gap))
    return [dict(box, column=_column_of(_center_x(box), cuts)) for box in boxes]


def _row_gap(boxes: Sequence[Box], row_gap: Optional[int]) -> int:
    """Pick the row-split gap: caller value, or half the median box height."""
    if row_gap is not None:
        return int(row_gap)
    heights = sorted((_box_bounds(b)[3] - _box_bounds(b)[1]) for b in boxes)
    return max(1, heights[len(heights) // 2] // 2)


def _bucket_rows(boxes: Sequence[Box], gap: int) -> List[List[Box]]:
    """Group boxes into rows by vertical spacing; sort each row by column."""
    ordered = sorted(boxes, key=_center_y)
    rows: List[List[Box]] = [[ordered[0]]]
    last = _center_y(ordered[0])
    for box in ordered[1:]:
        center = _center_y(box)
        if center - last > gap:
            rows.append([box])
        else:
            rows[-1].append(box)
        last = center
    for row in rows:
        row.sort(key=lambda item: item.get("column", 0))
    return rows


def detect_borderless_table(boxes: Sequence[Box], *,
                            page_width: Optional[int] = None, min_gap: int = 8,
                            row_gap: Optional[int] = None,
                            min_cols: int = 2,
                            min_rows: int = 2) -> Optional[Dict[str, Any]]:
    """Infer a borderless table from OCR boxes, or ``None`` if it is not tabular.

    Columns come from whitespace gutters, rows from vertical spacing. Returns
    ``{n_rows, n_cols, rows:[[text]], columns:[gutter]}`` when at least ``min_cols``
    columns and ``min_rows`` rows are found.
    """
    if not boxes:
        return None
    tagged = assign_columns(boxes, page_width=page_width, min_gap=min_gap)
    n_cols = max(box["column"] for box in tagged) + 1
    if n_cols < int(min_cols):
        return None
    rows = _bucket_rows(tagged, _row_gap(tagged, row_gap))
    if len(rows) < int(min_rows):
        return None
    table = [["" for _ in range(n_cols)] for _ in rows]
    for r, row in enumerate(rows):
        for box in row:
            col = box["column"]
            table[r][col] = f'{table[r][col]} {box.get("text", "")}'.strip()
    return {"n_rows": len(rows), "n_cols": n_cols, "rows": table,
            "columns": column_gutters(boxes, page_width=page_width, min_gap=min_gap)}
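
The row-bucketing step in the module above can be tried in isolation with a simplified sketch. It assumes boxes that use only the ``{y, height}`` form and already carry a ``column`` tag; the standalone ``bucket_rows`` helper and the sample boxes are hypothetical, and the gap is taken as half the median box height, as in ``_row_gap``.

```python
from typing import Any, Dict, List

Box = Dict[str, Any]


def center_y(box: Box) -> int:
    """Vertical centre of a box given as {y, height}."""
    return box["y"] + box["height"] // 2


def bucket_rows(boxes: List[Box], gap: int) -> List[List[Box]]:
    """Start a new row whenever the centre-y jumps by more than gap."""
    ordered = sorted(boxes, key=center_y)
    rows: List[List[Box]] = [[ordered[0]]]
    last = center_y(ordered[0])
    for box in ordered[1:]:
        center = center_y(box)
        if center - last > gap:
            rows.append([box])
        else:
            rows[-1].append(box)
        last = center  # compare each box with its predecessor, not the row start
    return rows


# Hypothetical pre-tagged boxes: three text lines, two columns each.
boxes = [
    {"y": 0, "height": 18, "column": 0, "text": "Name"},
    {"y": 0, "height": 18, "column": 1, "text": "Age"},
    {"y": 30, "height": 18, "column": 0, "text": "Ann"},
    {"y": 30, "height": 18, "column": 1, "text": "30"},
    {"y": 60, "height": 18, "column": 0, "text": "Bob"},
    {"y": 60, "height": 18, "column": 1, "text": "25"},
]
heights = sorted(box["height"] for box in boxes)
gap = max(1, heights[len(heights) // 2] // 2)  # half the median height

rows = bucket_rows(boxes, gap)
table = [["", ""] for _ in rows]
for r, row in enumerate(rows):
    for box in row:
        table[r][box["column"]] = box["text"]

print(gap)    # 9
print(table)  # [['Name', 'Age'], ['Ann', '30'], ['Bob', '25']]
```

Because each box is compared with the previous centre rather than the first centre of the row, slightly skewed text lines still chain into one row as long as neighbouring boxes stay within the gap.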
28 changes: 28 additions & 0 deletions je_auto_control/utils/executor/action_executor.py
@@ -3407,6 +3407,32 @@ def _populate_table(grid: Any, text_boxes: Any, overlap: Any = 0.4) -> Dict[str,
    return populate_table(grid, text_boxes, overlap=float(overlap))


def _column_gutters(boxes: Any, page_width: Any = None,
                    min_gap: Any = 8) -> Dict[str, Any]:
    """Adapter: interior whitespace column gutters from OCR boxes."""
    import json
    from je_auto_control.utils.column_layout import column_gutters
    if isinstance(boxes, str):
        boxes = json.loads(boxes)
    gutters = column_gutters(boxes, page_width=int(page_width) if page_width
                             else None, min_gap=int(min_gap))
    return {"count": len(gutters), "gutters": gutters}


def _detect_borderless_table(boxes: Any, page_width: Any = None, min_gap: Any = 8,
                             min_cols: Any = 2, min_rows: Any = 2) -> Dict[str, Any]:
    """Adapter: infer a borderless table from OCR boxes via whitespace columns."""
    import json
    from je_auto_control.utils.column_layout import detect_borderless_table
    if isinstance(boxes, str):
        boxes = json.loads(boxes)
    table = detect_borderless_table(boxes,
                                    page_width=int(page_width) if page_width else None,
                                    min_gap=int(min_gap), min_cols=int(min_cols),
                                    min_rows=int(min_rows))
    return {"found": table is not None, "table": table}


def _find_color_region(rgb: Any, tolerance: Any = 20, min_area: Any = 50,
                       region: Any = None) -> Dict[str, Any]:
    """Adapter: locate coloured regions on the screen, largest first."""
@@ -5858,6 +5884,8 @@ def __init__(self):
            "AC_cell_for_point": _cell_for_point,
            "AC_point_for_cell": _point_for_cell,
            "AC_populate_table": _populate_table,
            "AC_column_gutters": _column_gutters,
            "AC_detect_borderless_table": _detect_borderless_table,
            "AC_ssim_compare": _ssim_compare,
            "AC_ssim_changed_regions": _ssim_changed_regions,
            "AC_feature_match": _feature_match,