diff --git a/WHATS_NEW.md b/WHATS_NEW.md index d82b808b..25fe4b24 100644 --- a/WHATS_NEW.md +++ b/WHATS_NEW.md @@ -2,6 +2,12 @@ ## What's new (2026-06-26) +### Theme-Invariant Matching (Light Template, Dark Mode) + +Find a button captured in light mode even after the app switches to dark mode. Full reference: [`docs/source/Eng/doc/new_features/v217_features_doc.rst`](docs/source/Eng/doc/new_features/v217_features_doc.rst). + +- **`normalize_theme` / `match_theme`** (`AC_match_theme`): `match_template` correlates raw pixel intensities, so a light-mode template scores poorly against the same control in dark mode because the polarity is inverted. The fix is to compare *structure*. `normalize_theme` maps an image to a normalised single channel: `sobel` gradient magnitude or absolute `laplacian` response (both identical for an image and its colour inverse), or `zscore`, which standardises intensity but is not inversion-safe. `match_theme` normalizes both the template and the screen, then locates the template via `visual_match.match_template`, finding it across a light/dark flip that defeats raw matching. cv2/numpy are imported lazily so the module stays importable everywhere. Fourth feature of the ROUND-15 perception lane. Does not import `PySide6`. + ### Sample a Region's Text Contrast (WCAG) Grade the legibility of on-screen text when you only have a region, not the two colours. Full reference: [`docs/source/Eng/doc/new_features/v216_features_doc.rst`](docs/source/Eng/doc/new_features/v216_features_doc.rst). diff --git a/docs/source/Eng/doc/new_features/v217_features_doc.rst b/docs/source/Eng/doc/new_features/v217_features_doc.rst new file mode 100644 index 00000000..dccaecca --- /dev/null +++ b/docs/source/Eng/doc/new_features/v217_features_doc.rst @@ -0,0 +1,49 @@ +Theme-Invariant Matching (Light Template, Dark Mode) +==================================================== + +``match_template`` correlates raw pixel intensities, so a template captured in +light mode scores poorly against the same control in dark mode because the polarity +is inverted. 
The fix is to compare *structure* (edges, gradients), which is the +same regardless of which way the colours run. The ``theme_normalize`` module turns an image +into a polarity-invariant representation before matching. + +* :func:`normalize_theme`: map an image to a normalised single-channel image. + ``sobel`` (default) and ``laplacian`` use edge magnitude, which is + identical for an image and its colour-inverse; ``zscore`` standardises + intensity (not inversion-safe). +* :func:`match_theme`: apply :func:`normalize_theme` to both the template and the + haystack (the screen by default), then locate the template, finding it across + a light/dark theme flip that defeats raw matching. + +``cv2`` / ``numpy`` are imported lazily, so importing the module never requires +them, and the locating logic reuses :func:`visual_match.match_template`. Imports +no ``PySide6``. + +Headless API +------------ + +.. code-block:: python + + from je_auto_control import match_theme, normalize_theme + + # Light-mode button template, found in the dark-mode app (click = your own click helper): + hit = match_theme("save_button_light.png", method="sobel", min_score=0.4) + if hit and hit["score"] >= 0.5: + click(hit["x"] + hit["width"] // 2, hit["y"] + hit["height"] // 2) + + # The transform itself (e.g. to feed your own matcher): + edges = normalize_theme("template.png", method="sobel") + +Because gradient magnitude is identical for an image and its inverse, +``normalize_theme(img, method="sobel")`` equals ``normalize_theme(255 - img, method="sobel")``; +that invariance is exactly what lets one template match both themes. Use a +lower ``min_score`` than for raw matching (structure correlation typically scores lower). + +Executor commands +----------------- + +``AC_match_theme`` (``template``, plus optional ``region`` ``[x, y, w, h]`` / ``method`` / +``min_score`` → ``{found, x, y, width, height, score}``) locates a template +across a theme flip. It is also exposed as the read-only ``ac_match_theme`` MCP tool and as +a Script Builder command under **Image**. 
:func:`normalize_theme` (which returns +an image array) is the Python-API surface. diff --git a/docs/source/Zh/doc/new_features/v217_features_doc.rst b/docs/source/Zh/doc/new_features/v217_features_doc.rst new file mode 100644 index 00000000..83c82f8c --- /dev/null +++ b/docs/source/Zh/doc/new_features/v217_features_doc.rst @@ -0,0 +1,41 @@ +主題不變比對(淺色模板、深色模式) +================================== + +``match_template`` 以原始像素強度相關比對,故在淺色模式擷取的模板,對深色模式下同一控制項評分極差—— +極性反轉了。修法是比較*結構*(邊緣、梯度),不論顏色走向如何皆相同。``theme_normalize`` 在比對前 +把影像轉成極性不變的表示。 + +* :func:`normalize_theme` ——把影像映射為正規化的單通道影像。``sobel``(預設)與 ``laplacian`` + 使用梯度幅值,對影像與其顏色反相版本相同;``zscore`` 將強度標準化。 +* :func:`match_theme` ——對模板與 haystack(預設為螢幕)都做 :func:`normalize_theme`,再定位模板—— + 即使在會擊敗原始比對的淺/深主題切換下也能找到。 + +``cv2`` / ``numpy`` 採延遲匯入,故匯入本模組永遠不需要它們,定位邏輯則重用 +:func:`visual_match.match_template`。不匯入 ``PySide6``。 + +無頭 API +-------- + +.. code-block:: python + + from je_auto_control import match_theme, normalize_theme + + # 淺色模式擷取的按鈕模板,在深色模式的 app 中找到: + hit = match_theme("save_button_light.png", method="sobel", min_score=0.4) + if hit and hit["score"] >= 0.5: + click(hit["x"] + hit["width"] // 2, hit["y"] + hit["height"] // 2) + + # 轉換本身(例如餵給你自己的比對器): + edges = normalize_theme("template.png", method="sobel") + +由於梯度幅值對影像與其反相版本相同,``normalize_theme(img, "sobel")`` 等於 +``normalize_theme(255 - img, "sobel")``——正是這個不變性讓單一模板能比對兩種主題。 +``min_score`` 請設得比原始比對低(結構相關分數較低)。 + +執行器指令 +---------- + +``AC_match_theme``(``template`` 加上 ``region`` ``[x, y, w, h]`` / ``method`` / +``min_score`` → ``{found, x, y, width, height, score}``)跨主題切換定位模板。以對應的唯讀 +``ac_match_theme`` MCP 工具及 Script Builder 指令(位於 **Image** 分類下)形式提供。 +:func:`normalize_theme`(回傳影像陣列)則是 Python API 介面。 diff --git a/je_auto_control/__init__.py b/je_auto_control/__init__.py index 4c0193c0..bfb1a716 100644 --- a/je_auto_control/__init__.py +++ b/je_auto_control/__init__.py @@ -141,6 +141,8 @@ from je_auto_control.utils.contrast_map import ( dominant_pair, grade_contrast, 
region_contrast, ) +# Theme-invariant matching so a light template matches dark mode +from je_auto_control.utils.theme_normalize import match_theme, normalize_theme # Rich clipboard formats — RTF + CSV/TSV codecs and Windows get / set from je_auto_control.utils.clipboard_rich_formats import ( build_rtf, csv_to_rows, get_clipboard_csv, get_clipboard_rtf, rows_to_csv, @@ -1768,6 +1770,7 @@ def start_autocontrol_gui(*args, **kwargs): "simulate_cvd", "colors_collide", "color_distance", "place_labels", "label_color", "grade_contrast", "dominant_pair", "region_contrast", + "normalize_theme", "match_theme", "build_rtf", "rtf_to_text", "rows_to_csv", "csv_to_rows", "set_clipboard_rtf", "get_clipboard_rtf", "set_clipboard_csv", "get_clipboard_csv", diff --git a/je_auto_control/gui/script_builder/command_schema.py b/je_auto_control/gui/script_builder/command_schema.py index 42fcfc56..1d1f6f0f 100644 --- a/je_auto_control/gui/script_builder/command_schema.py +++ b/je_auto_control/gui/script_builder/command_schema.py @@ -4591,6 +4591,21 @@ def _add_work_queue_specs(specs: List[CommandSpec]) -> None: ), description="Sample a screen region and grade its text contrast.", )) + specs.append(CommandSpec( + "AC_match_theme", "Image", "Match (Theme-Invariant)", + fields=( + FieldSpec("template", FieldType.STRING, + placeholder="template image path"), + FieldSpec("region", FieldType.STRING, optional=True, + placeholder="[x, y, w, h]"), + FieldSpec("method", FieldType.STRING, optional=True, + default="sobel", + placeholder="sobel / laplacian / zscore"), + FieldSpec("min_score", FieldType.FLOAT, optional=True, + default=0.5), + ), + description="Locate a template across a light/dark theme flip.", + )) specs.append(CommandSpec( "AC_normalize_ext", "Shell", "Normalize Extension", fields=( diff --git a/je_auto_control/utils/executor/action_executor.py b/je_auto_control/utils/executor/action_executor.py index 4456d9ae..6ba236f4 100644 --- a/je_auto_control/utils/executor/action_executor.py 
+++ b/je_auto_control/utils/executor/action_executor.py @@ -2900,6 +2900,18 @@ def _region_contrast(region: Any = None) -> Dict[str, Any]: return region_contrast(region=_coerce_region(region)) +def _match_theme(template: Any, region: Any = None, method: Any = "sobel", + min_score: Any = 0.5) -> Dict[str, Any]: + """Adapter: locate a template across a light/dark theme flip (device).""" + from je_auto_control.utils.theme_normalize import match_theme + match = match_theme(str(template), method=str(method), + min_score=float(min_score), + region=_coerce_region(region)) + if match is None: + return {"found": False} + return {"found": True, **match} + + def _normalize_ext(target: str) -> Dict[str, Any]: """Adapter: the lowercased extension of a path / bare ext (pure).""" from je_auto_control.utils.file_assoc import normalize_ext @@ -6938,6 +6950,7 @@ def __init__(self): "AC_grade_contrast": _grade_contrast, "AC_dominant_pair": _dominant_pair, "AC_region_contrast": _region_contrast, + "AC_match_theme": _match_theme, "AC_normalize_ext": _normalize_ext, "AC_file_association": _file_association, "AC_get_control_text": _get_control_text, diff --git a/je_auto_control/utils/mcp_server/tools/_factories.py b/je_auto_control/utils/mcp_server/tools/_factories.py index ab53fde9..4369522b 100644 --- a/je_auto_control/utils/mcp_server/tools/_factories.py +++ b/je_auto_control/utils/mcp_server/tools/_factories.py @@ -4091,6 +4091,22 @@ def img_histogram_tools() -> List[MCPTool]: handler=h.region_contrast, annotations=READ_ONLY, ), + MCPTool( + name="ac_match_theme", + description=("Locate a 'template' image on screen across a " + "light/dark theme flip, by matching gradient " + "structure ('method' sobel/laplacian/zscore). " + "'region' [x,y,w,h] clips the search. 
Returns {found, " + "x, y, width, height, score}."), + input_schema=schema({"template": {"type": "string"}, + "region": {"type": "array", + "items": {"type": "integer"}}, + "method": {"type": "string"}, + "min_score": {"type": "number"}}, + required=["template"]), + handler=h.match_theme, + annotations=READ_ONLY, + ), ] diff --git a/je_auto_control/utils/mcp_server/tools/_handlers.py b/je_auto_control/utils/mcp_server/tools/_handlers.py index ed3bd986..5006b182 100644 --- a/je_auto_control/utils/mcp_server/tools/_handlers.py +++ b/je_auto_control/utils/mcp_server/tools/_handlers.py @@ -769,6 +769,11 @@ def region_contrast(region=None): return _region_contrast(region) +def match_theme(template, region=None, method="sobel", min_score=0.5): + from je_auto_control.utils.executor.action_executor import _match_theme + return _match_theme(template, region, method, min_score) + + def normalize_ext(target): from je_auto_control.utils.executor.action_executor import _normalize_ext return _normalize_ext(target) diff --git a/je_auto_control/utils/theme_normalize/__init__.py b/je_auto_control/utils/theme_normalize/__init__.py new file mode 100644 index 00000000..1c75d90e --- /dev/null +++ b/je_auto_control/utils/theme_normalize/__init__.py @@ -0,0 +1,6 @@ +"""Theme-invariant image normalisation so light templates match dark mode.""" +from je_auto_control.utils.theme_normalize.theme_normalize import ( + THEME_METHODS, match_theme, normalize_theme, +) + +__all__ = ["normalize_theme", "match_theme", "THEME_METHODS"] diff --git a/je_auto_control/utils/theme_normalize/theme_normalize.py b/je_auto_control/utils/theme_normalize/theme_normalize.py new file mode 100644 index 00000000..742d9d67 --- /dev/null +++ b/je_auto_control/utils/theme_normalize/theme_normalize.py @@ -0,0 +1,86 @@ +"""Theme-invariant image normalisation so light templates match dark mode. 
+ +``match_template`` correlates raw pixel intensities, so a template captured in +light mode scores poorly against the same control in dark mode because the polarity +is inverted. The fix is to compare *structure* (edges, gradients), which is the +same regardless of which way the colours run. ``theme_normalize`` turns an image +into a polarity-invariant representation before matching: + +* :func:`normalize_theme`: map an image to a normalised single-channel image. + ``sobel`` (default) and ``laplacian`` use edge magnitude, identical for an + image and its inverse; ``zscore`` standardises intensity (not inversion-safe). +* :func:`match_theme`: apply :func:`normalize_theme` to both the template and the + haystack (the screen by default), then locate the template, finding it across + a light/dark theme flip that defeats raw matching. + +cv2 / numpy are imported lazily, so importing this module never requires them +(the package stays importable everywhere) and the locating logic reuses +:func:`visual_match.match_template`. Imports no ``PySide6``. +""" +from typing import Any, Dict, Optional, Sequence + +# Supported normalisation method names. +THEME_METHODS = ("sobel", "laplacian", "zscore") + + +def _to_uint8(array: Any) -> Any: + """Rescale a float array to a 0..255 uint8 image.""" + import cv2 + return cv2.normalize(array, None, 0, 255, cv2.NORM_MINMAX).astype("uint8") + + +def _zscore(gray: Any) -> Any: + """Standardise intensity to zero mean / unit variance (not inversion-safe).""" + import numpy as np + std = float(gray.std()) + if std < 1e-9: + return np.zeros_like(gray) + return (gray - gray.mean()) / std + + +def normalize_theme(source: Any, *, method: str = "sobel") -> Any: + """Return ``source`` as a theme-normalised single-channel ``uint8`` image. + + ``sobel`` / ``laplacian`` return edge magnitude, identical for an image and + its colour-inverted (dark-mode) twin; ``zscore`` standardises intensity and + is not inversion-safe. Raises ``ValueError`` for an unknown ``method``. 
+ """ + import cv2 + import numpy as np + from je_auto_control.utils.visual_match.visual_match import _to_gray + gray = _to_gray(source).astype("float64") + if method == "sobel": + gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3) + gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3) + result = np.sqrt(gx * gx + gy * gy) + elif method == "laplacian": + result = np.abs(cv2.Laplacian(gray, cv2.CV_64F, ksize=3)) + elif method == "zscore": + result = _zscore(gray) + else: + raise ValueError(f"unknown theme-normalize method: {method!r}") + return _to_uint8(result) + + +def match_theme(template: Any, *, haystack: Optional[Any] = None, + method: str = "sobel", min_score: float = 0.5, + region: Optional[Sequence[int]] = None + ) -> Optional[Dict[str, Any]]: + """Locate ``template`` in ``haystack`` after theme-normalising both. + + ``haystack`` defaults to a fresh screen grab (optionally clipped to + ``region``). Returns ``{x, y, width, height, score}`` for the best match at + or above ``min_score``, or ``None``. Robust to a light/dark theme flip that + defeats raw :func:`visual_match.match_template`. 
+ """ + from je_auto_control.utils.visual_match import match_template + from je_auto_control.utils.visual_match.visual_match import _grab_gray + raw_haystack = haystack if haystack is not None else _grab_gray(region) + norm_template = normalize_theme(template, method=method) + norm_haystack = normalize_theme(raw_haystack, method=method) + match = match_template(norm_template, haystack=norm_haystack, + min_score=float(min_score)) + if match is None: + return None + return {"x": match.x, "y": match.y, "width": match.width, + "height": match.height, "score": round(float(match.score), 4)} diff --git a/test/unit_test/headless/test_theme_normalize_batch.py b/test/unit_test/headless/test_theme_normalize_batch.py new file mode 100644 index 00000000..10174c38 --- /dev/null +++ b/test/unit_test/headless/test_theme_normalize_batch.py @@ -0,0 +1,72 @@ +"""Headless tests for theme_normalize (cv2 behaviour + cv2-free wiring).""" +import pytest + +import je_auto_control as ac +from je_auto_control.utils.theme_normalize import match_theme, normalize_theme + + +# --- cv2 behaviour (gated per-function so wiring still runs without cv2) --- + +def test_normalize_theme_polarity_invariant(): + np = pytest.importorskip("numpy") + pytest.importorskip("cv2") + rng = np.random.default_rng(0) + image = rng.integers(0, 256, (60, 80)).astype("uint8") + light = normalize_theme(image, method="sobel") + dark = normalize_theme(255 - image, method="sobel") + assert light.shape == image.shape + # gradient magnitude is identical for an image and its colour inverse + assert np.array_equal(light, dark) + + +def test_normalize_theme_zscore_shape(): + np = pytest.importorskip("numpy") + pytest.importorskip("cv2") + rng = np.random.default_rng(1) + image = rng.integers(0, 256, (40, 40)).astype("uint8") + out = normalize_theme(image, method="zscore") + assert out.shape == image.shape + assert out.dtype == np.uint8 + + +def test_normalize_theme_unknown_method_raises(): + np = pytest.importorskip("numpy") + 
pytest.importorskip("cv2") + image = np.zeros((10, 10), dtype="uint8") + with pytest.raises(ValueError): + normalize_theme(image, method="bogus") + + +def test_match_theme_finds_template_across_inversion(): + np = pytest.importorskip("numpy") + pytest.importorskip("cv2") + haystack = np.full((100, 120), 128, dtype="uint8") + template = np.full((20, 20), 220, dtype="uint8") + template[5:15, 5:15] = 40 # internal edge structure + haystack[30:50, 40:60] = template # place at x=40, y=30 + dark_haystack = 255 - haystack # dark-mode: colours inverted + result = match_theme(template, haystack=dark_haystack, method="sobel", + min_score=0.3) + assert result is not None + assert abs(result["x"] - 40) <= 5 + assert abs(result["y"] - 30) <= 5 + + +# --- wiring (cv2-free: the module imports cv2 lazily) ---------------------- + +def test_wiring(): + known = set(ac.executor.known_commands()) + assert "AC_match_theme" in known + from je_auto_control.utils.mcp_server.tools import ( + build_default_tool_registry, + ) + names = {t.name for t in build_default_tool_registry()} + assert "ac_match_theme" in names + from je_auto_control.gui.script_builder.command_schema import _build_specs + specs = {s.command for s in _build_specs()} + assert "AC_match_theme" in specs + + +def test_facade_exports(): + for name in ("normalize_theme", "match_theme"): + assert hasattr(ac, name) and name in ac.__all__
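
Note on the polarity-invariance claim the tests above rely on: it can be checked without cv2 or numpy. The sketch below is a minimal pure-Python illustration, not the module's implementation (which calls `cv2.Sobel`). The helper names `sobel_magnitude` and `invert` are hypothetical and exist only for this example, and border pixels are skipped rather than padded. Because both Sobel kernels sum to zero, inverting the image only flips the sign of each gradient component, so the magnitude is unchanged.

```python
import math
from typing import List

Image = List[List[int]]

# Standard 3x3 Sobel kernels; each sums to zero.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image: Image) -> List[List[float]]:
    """Gradient magnitude at each interior pixel (border pixels are skipped)."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


def invert(image: Image) -> Image:
    """Colour-invert an 8-bit grayscale image (a crude stand-in for dark mode)."""
    return [[255 - p for p in row] for row in image]


# A light "button": bright background with a darker square in the middle.
light = [[220] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        light[y][x] = 40
dark = invert(light)

same_pixels = light == dark                                  # raw pixels differ
same_edges = sobel_magnitude(light) == sobel_magnitude(dark)  # edges agree
```

Here `same_pixels` is false while `same_edges` is true, which is the property `normalize_theme(..., method="sobel")` exploits. A real dark theme is rarely an exact inverse of the light one, which is presumably why the documentation recommends a lower `min_score` than for raw matching.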