Merged
6 changes: 6 additions & 0 deletions WHATS_NEW.md
@@ -1,5 +1,11 @@
# What's New — AutoControl

## What's new (2026-06-24) — Keyboard Focus Order (Tab sequence / WCAG audit / set-focus)

Reason about keyboard navigation: the Tab order, a WCAG focus-order audit, and set-focus. Full reference: [`docs/source/Eng/doc/new_features/v184_features_doc.rst`](docs/source/Eng/doc/new_features/v184_features_doc.rst).

- **`is_interactive_role` / `tab_order` / `audit_focus_order` / `focus_control`** (`AC_tab_order`, `AC_audit_focus_order`, `AC_focus_control`): until now nothing reasoned about *keyboard* navigation, only mouse coordinates and element values. This adds the keyboard layer: `tab_order` returns the focusable elements in the order Tab is expected to visit them (reading order), `audit_focus_order` produces a WCAG 2.4.x report (the sequence plus flagged problems such as a focusable element with no visible area), and `focus_control` sets keyboard focus via UIA `SetFocus`. The first three are pure functions over `AccessibilityElement` lists: `tab_order` reuses `element_parse.reading_order` and `is_interactive_role` reuses `ax_tree_walk.humanize_role`, so no logic is duplicated; `focus_control` dispatches through the injectable backend seam (the real `SetFocus` call lives in the Windows backend). No `PySide6` import.

## What's new (2026-06-24) — Readable, Addressable Accessibility Tree (role names + node paths)

Turn a raw `ControlType_50000` tree dump into readable roles with a stable path per node. Full reference: [`docs/source/Eng/doc/new_features/v183_features_doc.rst`](docs/source/Eng/doc/new_features/v183_features_doc.rst).
50 changes: 50 additions & 0 deletions docs/source/Eng/doc/new_features/v184_features_doc.rst
@@ -0,0 +1,50 @@
Keyboard Focus Order (Tab sequence / WCAG audit / set-focus)
============================================================

Nothing in the toolkit reasoned about *keyboard* navigation — only mouse
coordinates and element values. ``focus_order`` adds the keyboard layer:

* :func:`is_interactive_role`: whether a role normally takes keyboard focus,
* :func:`tab_order`: the focusable elements in the order ``Tab`` is expected to
  visit them (approximated by reading order: top-to-bottom, left-to-right),
* :func:`audit_focus_order`: a WCAG 2.4.x focus-order report over a flat element
  list (the sequence plus flagged problems, e.g. a focusable element with no
  visible area, where focus would land somewhere unseen),
* :func:`focus_control`: set the keyboard focus on a control (UIA ``SetFocus``).

The first three are pure functions over ``AccessibilityElement`` lists:
``tab_order`` reuses ``element_parse.reading_order`` for row banding and
``is_interactive_role`` reuses ``ax_tree_walk.humanize_role``, so no logic is
duplicated. ``focus_control`` is a thin dispatch onto the injectable
``accessibility.backends.get_backend()`` seam; the real ``SetFocus`` lives in the
Windows backend. Imports no ``PySide6``.
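
The reading-order rule behind ``tab_order`` can be illustrated with a small, self-contained sketch. ``reading_order_sketch`` below is a hypothetical stand-in, not the project's ``element_parse.reading_order``, whose exact banding rule is not shown in this change. It assumes boxes are dicts with ``x`` / ``y`` keys and that a box joins a row when its top edge is within ``row_tol`` pixels of the first box in that row.

```python
from typing import Dict, List


def reading_order_sketch(boxes: List[Dict], row_tol: int = 12) -> List[Dict]:
    """Order boxes top-to-bottom, then left-to-right inside each row band.

    Hypothetical illustration only; the real helper may band rows differently.
    """
    rows: List[List[Dict]] = []
    for box in sorted(boxes, key=lambda b: (b["y"], b["x"])):
        # Join the current row when the top edge is close to the row's first box.
        if rows and abs(box["y"] - rows[-1][0]["y"]) <= row_tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    # Within a row, the leftmost control is visited first.
    return [box for row in rows for box in sorted(row, key=lambda b: b["x"])]


boxes = [
    {"name": "OK", "x": 200, "y": 96},
    {"name": "Username", "x": 10, "y": 10},
    {"name": "Cancel", "x": 100, "y": 100},
    {"name": "Password", "x": 10, "y": 50},
]
order = [box["name"] for box in reading_order_sketch(boxes)]
print(order)
```

Here "OK" sits four pixels higher than "Cancel", so a plain sort on ``y`` would visit it first. Banding puts both buttons in one row and orders them left to right.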

Headless API
------------

.. code-block:: python

    from je_auto_control import (list_accessibility_elements, tab_order,
                                 audit_focus_order, focus_control)

    elements = list_accessibility_elements(app_name="myapp.exe")
    for el in tab_order(elements):            # the Tab visiting order
        print(el.name, el.role)

    report = audit_focus_order(elements)
    # {"order": [...], "issues": [...], "focusable_count": N, "issue_count": M}

    focus_control(name="Username", role="edit")   # put the cursor in the field

Focusability is role-based (the interactive roles: Button, Edit, CheckBox,
ComboBox, RadioButton, Hyperlink, ListItem, MenuItem, Slider, Tab/TabItem,
TreeItem, …). ``focus_control`` locates by ``name`` / ``role`` / ``app_name`` /
``automation_id`` like the other native-control actions and returns ``bool``.
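
As a rough illustration of how the audit report might be consumed, the sketch below formats the ``issues`` list into readable lines. The report keys follow the shape built by ``audit_focus_order`` in this change. ``sample_report`` is made-up data and ``summarize_issues`` is a hypothetical helper, not part of the package.

```python
from typing import Any, Dict, List

# Made-up sample in the documented report shape; bounds are (left, top, width, height).
sample_report: Dict[str, Any] = {
    "order": [
        {"tab_index": 0, "name": "Username", "role": "Edit",
         "bounds": [10, 10, 200, 24]},
        {"tab_index": 1, "name": "Hidden", "role": "Button",
         "bounds": [0, 0, 0, 0]},
    ],
    "issues": [
        {"tab_index": 1, "name": "Hidden", "role": "Button",
         "issue": "zero_area_focusable", "wcag": "2.4.7 Focus Visible"},
    ],
    "focusable_count": 2,
    "issue_count": 1,
}


def summarize_issues(report: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: one human-readable line per flagged element."""
    return [
        f'#{issue["tab_index"]} {issue["role"]} "{issue["name"]}": '
        f'{issue["issue"]} (WCAG {issue["wcag"]})'
        for issue in report["issues"]
    ]


lines = summarize_issues(sample_report)
for line in lines:
    print(line)
```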

Executor commands
-----------------

``AC_tab_order`` / ``AC_audit_focus_order`` (``app_name`` / ``max_results``) list
and audit the live app; ``AC_focus_control`` sets focus. They are exposed as the
matching ``ac_*`` MCP tools (the two reads read-only, ``ac_focus_control``
destructive) and as Script Builder commands under **Native UI**.
44 changes: 44 additions & 0 deletions docs/source/Zh/doc/new_features/v184_features_doc.rst
@@ -0,0 +1,44 @@
Keyboard Focus Order (Tab sequence / WCAG audit / set-focus)
============================================================

The toolkit previously did no reasoning about *keyboard* navigation; it handled
only mouse coordinates and element values. ``focus_order`` adds the keyboard layer:

* :func:`is_interactive_role`: whether a role normally accepts keyboard focus,
* :func:`tab_order`: the focusable elements in the order the ``Tab`` key visits
  them (that is, their reading order: top-to-bottom, left-to-right),
* :func:`audit_focus_order`: a WCAG 2.4.x focus-order report over a flat element
  list (the sequence plus flagged problems, e.g. a focusable element with no
  visible area, where focus would land somewhere unseen),
* :func:`focus_control`: set the keyboard focus on a control (UIA ``SetFocus``).

The first three are pure functions over ``AccessibilityElement`` lists:
``tab_order`` reuses ``element_parse.reading_order`` for row grouping and
``is_interactive_role`` reuses ``ax_tree_walk.humanize_role``, so no logic is
duplicated. ``focus_control`` is a thin dispatch onto the injectable
``accessibility.backends.get_backend()`` seam; the real ``SetFocus`` lives in the
Windows backend. ``PySide6`` is not imported.

Headless API
------------

.. code-block:: python

    from je_auto_control import (list_accessibility_elements, tab_order,
                                 audit_focus_order, focus_control)

    elements = list_accessibility_elements(app_name="myapp.exe")
    for el in tab_order(elements):            # the Tab visiting order
        print(el.name, el.role)

    report = audit_focus_order(elements)
    # {"order": [...], "issues": [...], "focusable_count": N, "issue_count": M}

    focus_control(name="Username", role="edit")   # put the cursor in the field

Focusability is decided by role (the interactive roles: Button, Edit, CheckBox,
ComboBox, RadioButton, Hyperlink, ListItem, MenuItem, Slider, Tab/TabItem,
TreeItem, ...). Like the other native-control actions, ``focus_control`` locates
by ``name`` / ``role`` / ``app_name`` / ``automation_id`` and returns ``bool``.

Executor commands
-----------------

``AC_tab_order`` / ``AC_audit_focus_order`` (``app_name`` / ``max_results``) list
and audit the live application; ``AC_focus_control`` sets focus. All three are
provided as the matching ``ac_*`` MCP tools (the two reads are read-only,
``ac_focus_control`` is destructive) and as Script Builder commands (under the
**Native UI** category).
5 changes: 5 additions & 0 deletions je_auto_control/__init__.py
@@ -62,6 +62,10 @@
    assign_node_paths, control_type_name, find_by_path, humanize_role,
    humanize_tree,
)
# Keyboard focus order (tab sequence / WCAG audit / set-focus)
from je_auto_control.utils.focus_order import (
    audit_focus_order, focus_control, is_interactive_role, tab_order,
)
# VLM element locator (headless)
from je_auto_control.utils.vision import (
    VLMNotAvailableError, click_by_description, locate_by_description,
@@ -1629,6 +1633,7 @@ def start_autocontrol_gui(*args, **kwargs):
    "get_control_text", "get_selected_text", "get_visible_text",
    "control_type_name", "humanize_role", "humanize_tree",
    "assign_node_paths", "find_by_path",
    "is_interactive_role", "tab_order", "audit_focus_order", "focus_control",
    # VLM locator
    "VLMNotAvailableError", "locate_by_description", "click_by_description",
    "verify_description",
18 changes: 18 additions & 0 deletions je_auto_control/gui/script_builder/command_schema.py
@@ -1550,6 +1550,24 @@ def _add_native_control_specs(specs: List[CommandSpec]) -> None:
        fields=(FieldSpec("role", FieldType.STRING),),
        description="Translate a raw UIA role (ControlType_50000) to a name.",
    ))
    tree_fields = (FieldSpec("app_name", FieldType.STRING, optional=True),
                   FieldSpec("max_results", FieldType.INT, optional=True,
                             default=500))
    specs.append(CommandSpec(
        "AC_tab_order", "Native UI", "Keyboard Tab Order",
        fields=tree_fields,
        description="List focusable controls in keyboard Tab (reading) order.",
    ))
    specs.append(CommandSpec(
        "AC_audit_focus_order", "Native UI", "Audit Focus Order (WCAG)",
        fields=tree_fields,
        description="WCAG 2.4.x focus-order audit: tab sequence + flagged issues.",
    ))
    specs.append(CommandSpec(
        "AC_focus_control", "Native UI", "Set Keyboard Focus",
        fields=fields,
        description="Set keyboard focus on a control natively (UIA SetFocus).",
    ))


def _add_misc_specs(specs: List[CommandSpec]) -> None:
8 changes: 8 additions & 0 deletions je_auto_control/utils/accessibility/backends/base.py
@@ -123,6 +123,14 @@
        """Return only the on-screen text of the control (TextPattern), or None."""
        self._unsupported("visible_text")

    # --- keyboard focus ----------------------------------------------------

    def set_focus(self, name: Optional[str] = None, role: Optional[str] = None,
                  app_name: Optional[str] = None,
                  automation_id: Optional[str] = None) -> bool:
        """Set keyboard focus on the matched control (SetFocus); True on success."""
        self._unsupported("set_focus")

    def _unsupported(self, operation: str):
        """Raise a clear error for an action this backend can't perform."""
        raise AccessibilityNotAvailableError(

Check warnings on lines 128-130 in je_auto_control/utils/accessibility/backends/base.py (SonarQubeCloud / SonarCloud Code Analysis):

- Line 128: Remove the unused function parameter "role". See more on https://sonarcloud.io/project/issues?id=Integration-Automation_AutoControlGUI&issues=AZ73kdrgkXMtfYKNhvyj&open=AZ73kdrgkXMtfYKNhvyj&pullRequest=400
- Line 128: Remove the unused function parameter "name". See more on https://sonarcloud.io/project/issues?id=Integration-Automation_AutoControlGUI&issues=AZ73kdrgkXMtfYKNhvyk&open=AZ73kdrgkXMtfYKNhvyk&pullRequest=400
- Line 129: Remove the unused function parameter "app_name". See more on https://sonarcloud.io/project/issues?id=Integration-Automation_AutoControlGUI&issues=AZ73kdrgkXMtfYKNhvyl&open=AZ73kdrgkXMtfYKNhvyl&pullRequest=400
- Line 130: Remove the unused function parameter "automation_id". See more on https://sonarcloud.io/project/issues?id=Integration-Automation_AutoControlGUI&issues=AZ73kdrgkXMtfYKNhvyi&open=AZ73kdrgkXMtfYKNhvyi&pullRequest=400
11 changes: 11 additions & 0 deletions je_auto_control/utils/accessibility/backends/windows_backend.py
@@ -308,6 +308,17 @@ def visible_text(self, name=None, role=None, app_name=None,
        except (OSError, AttributeError):
            return None

    def set_focus(self, name=None, role=None, app_name=None,
                  automation_id=None) -> bool:
        raw = self._find_raw(name, role, app_name, automation_id)
        if not raw:
            return False
        try:
            raw.SetFocus()
            return True
        except (OSError, AttributeError):
            return False
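
Because ``set_focus`` sits behind a backend method, the dispatch can be exercised without a real UI. The sketch below is a minimal illustration under stated assumptions: ``FakeBackend`` and ``focus_with`` are hypothetical names. How ``get_backend()`` accepts an injected backend is not shown in this change, so the sketch passes the backend in explicitly.

```python
from typing import List, Optional, Set, Tuple


class FakeBackend:
    """Hypothetical test double with the same ``set_focus`` signature."""

    def __init__(self, known_names: Set[str]) -> None:
        self.known_names = known_names
        self.calls: List[Tuple[Optional[str], Optional[str]]] = []

    def set_focus(self, name: Optional[str] = None, role: Optional[str] = None,
                  app_name: Optional[str] = None,
                  automation_id: Optional[str] = None) -> bool:
        # Record the locator and report success only for known controls,
        # mirroring the "False when nothing matched" contract.
        self.calls.append((name, role))
        return name in self.known_names


def focus_with(backend: FakeBackend, **locator: Optional[str]) -> bool:
    """Thin dispatch: forward the locator to whichever backend is supplied."""
    return backend.set_focus(**locator)


backend = FakeBackend({"Username"})
found = focus_with(backend, name="Username", role="edit")
missing = focus_with(backend, name="Nope")
print(found, missing, backend.calls)
```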

    @staticmethod
    def _read_row(pattern, row: int, cols: int):
        """Read one grid row into a list of cell strings."""
32 changes: 32 additions & 0 deletions je_auto_control/utils/executor/action_executor.py
@@ -199,6 +199,35 @@ def _humanize_role(role: str) -> Dict[str, Any]:
    return {"role": humanize_role(role)}


def _tab_order(app_name: Optional[str] = None,
               max_results: int = 500) -> Dict[str, Any]:
    """Executor adapter: focusable elements in keyboard Tab order."""
    from je_auto_control.utils.accessibility import list_accessibility_elements
    from je_auto_control.utils.focus_order import tab_order
    elements = list_accessibility_elements(app_name=app_name,
                                           max_results=int(max_results))
    return {"order": [el.to_dict() for el in tab_order(elements)]}


def _audit_focus_order(app_name: Optional[str] = None,
                       max_results: int = 500) -> Dict[str, Any]:
    """Executor adapter: WCAG focus-order audit over the app's elements."""
    from je_auto_control.utils.accessibility import list_accessibility_elements
    from je_auto_control.utils.focus_order import audit_focus_order
    elements = list_accessibility_elements(app_name=app_name,
                                           max_results=int(max_results))
    return audit_focus_order(elements)


def _focus_control(name: Optional[str] = None, role: Optional[str] = None,
                   app_name: Optional[str] = None,
                   automation_id: Optional[str] = None) -> bool:
    """Executor adapter: set keyboard focus on a control (UIA SetFocus)."""
    from je_auto_control.utils.focus_order import focus_control
    return focus_control(name=name, role=role, app_name=app_name,
                         automation_id=automation_id)


def _a11y_record_start(app_name: Optional[str] = None,
                       poll_interval_s: float = 0.25,
                       min_movement_px: int = 8) -> Dict[str, Any]:
@@ -6127,6 +6156,9 @@ def __init__(self):
            "AC_a11y_dump": _a11y_dump,
            "AC_walk_tree": _walk_tree,
            "AC_humanize_role": _humanize_role,
            "AC_tab_order": _tab_order,
            "AC_audit_focus_order": _audit_focus_order,
            "AC_focus_control": _focus_control,
            "AC_control_get_value": _control_get_value,
            "AC_control_set_value": _control_set_value,
            "AC_control_invoke": _control_invoke,
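
The command registration above is a name-to-callable table. A minimal sketch of that dispatch pattern follows. The stub adapters and ``run_command`` are hypothetical and only imitate the shape of the real adapters; the executor's actual lookup and error handling are not shown in this change.

```python
from typing import Any, Callable, Dict, Optional


def _tab_order_stub(app_name: Optional[str] = None,
                    max_results: int = 500) -> Dict[str, Any]:
    # Stand-in for the real adapter: echoes its arguments instead of walking a UI.
    return {"order": [], "app_name": app_name, "max_results": int(max_results)}


def _focus_control_stub(name: Optional[str] = None, role: Optional[str] = None,
                        app_name: Optional[str] = None,
                        automation_id: Optional[str] = None) -> bool:
    # Stand-in: pretend focus succeeds whenever a name was supplied.
    return name is not None


COMMANDS: Dict[str, Callable[..., Any]] = {
    "AC_tab_order": _tab_order_stub,
    "AC_focus_control": _focus_control_stub,
}


def run_command(name: str, params: Optional[Dict[str, Any]] = None) -> Any:
    """Look up a command by name and call it with keyword parameters."""
    try:
        handler = COMMANDS[name]
    except KeyError:
        raise ValueError(f"unknown command: {name}") from None
    return handler(**(params or {}))


result = run_command("AC_tab_order", {"app_name": "myapp.exe", "max_results": "50"})
focused = run_command("AC_focus_control", {"name": "Username"})
print(result, focused)
```

Coercing ``max_results`` with ``int()`` in the adapter, as the real adapters do, lets string values from a script or form field pass through.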
8 changes: 8 additions & 0 deletions je_auto_control/utils/focus_order/__init__.py
@@ -0,0 +1,8 @@
"""Keyboard focus order: expected Tab sequence, WCAG audit, and set-focus."""
from je_auto_control.utils.focus_order.focus_order import (
audit_focus_order, focus_control, is_interactive_role, tab_order,
)

__all__ = [
"is_interactive_role", "tab_order", "audit_focus_order", "focus_control",
]
87 changes: 87 additions & 0 deletions je_auto_control/utils/focus_order/focus_order.py
@@ -0,0 +1,87 @@
"""Keyboard focus order: expected Tab sequence, a WCAG audit, and set-focus.

Nothing in the toolkit reasons about *keyboard* navigation. ``focus_order`` adds:

* :func:`is_interactive_role` — is a role one that normally takes keyboard focus,
* :func:`tab_order` — the focusable elements in the order ``Tab`` will visit them
(their reading order: top-to-bottom, left-to-right),
* :func:`audit_focus_order` — a WCAG 2.4.x focus-order report over a flat element
list (the sequence plus flagged problems, e.g. a focusable element with no
visible area),
* :func:`focus_control` — set the keyboard focus on a control (device action).

The first three are pure functions over :class:`AccessibilityElement` lists —
``tab_order`` reuses :func:`element_parse.reading_order` for row banding and
``is_interactive_role`` reuses :func:`ax_tree_walk.humanize_role`, so no logic is
duplicated. ``focus_control`` is a thin dispatch onto the injectable
``accessibility.backends.get_backend()`` seam; the real ``SetFocus`` call lives in
the Windows backend. Imports no ``PySide6``.
"""
from typing import Any, Dict, List, Optional, Sequence, Union

from je_auto_control.utils.accessibility.element import AccessibilityElement
from je_auto_control.utils.ax_tree_walk import humanize_role
from je_auto_control.utils.element_parse import reading_order

# Roles that conventionally participate in keyboard tab navigation.
_INTERACTIVE_ROLES = frozenset({
    "Button", "Calendar", "CheckBox", "ComboBox", "Edit", "Hyperlink",
    "ListItem", "MenuItem", "RadioButton", "ScrollBar", "Slider", "Spinner",
    "SplitButton", "Tab", "TabItem", "TreeItem", "DataItem", "Thumb",
})


def is_interactive_role(role: Union[str, int]) -> bool:
    """Return True if ``role`` is one that normally accepts keyboard focus."""
    return humanize_role(role) in _INTERACTIVE_ROLES


def _box(element: AccessibilityElement, index: int) -> Dict[str, Any]:
    left, top, width, height = element.bounds
    return {"x": left, "y": top, "width": width, "height": height, "_idx": index}


def tab_order(elements: Sequence[AccessibilityElement], *,
              row_tol: int = 12) -> List[AccessibilityElement]:
    """Return the focusable elements in the order ``Tab`` would visit them.

    Filters to :func:`is_interactive_role` then orders by reading order (rows
    within ``row_tol`` px share a row, ordered left-to-right).
    """
    interactive = [el for el in elements if is_interactive_role(el.role)]
    boxes = [_box(el, index) for index, el in enumerate(interactive)]
    ordered = reading_order(boxes, row_tol=int(row_tol))
    return [interactive[box["_idx"]] for box in ordered]


def audit_focus_order(elements: Sequence[AccessibilityElement], *,
                      row_tol: int = 12) -> Dict[str, Any]:
    """Return a WCAG 2.4.x focus-order report over a flat element list.

    ``order`` is the expected Tab sequence (``tab_index`` / ``name`` / ``role`` /
    ``bounds``); ``issues`` flags focusable elements with no visible area
    (WCAG 2.4.7 Focus Visible; focus would land somewhere unseen).
    """
    order = tab_order(elements, row_tol=row_tol)
    sequence: List[Dict[str, Any]] = []
    issues: List[Dict[str, Any]] = []
    for tab_index, element in enumerate(order):
        role = humanize_role(element.role)
        _left, _top, width, height = element.bounds
        sequence.append({"tab_index": tab_index, "name": element.name,
                         "role": role, "bounds": list(element.bounds)})
        if width <= 0 or height <= 0:
            issues.append({"tab_index": tab_index, "name": element.name,
                           "role": role, "issue": "zero_area_focusable",
                           "wcag": "2.4.7 Focus Visible"})
    return {"order": sequence, "issues": issues,
            "focusable_count": len(order), "issue_count": len(issues)}


def focus_control(name: Optional[str] = None, role: Optional[str] = None,
                  app_name: Optional[str] = None,
                  automation_id: Optional[str] = None) -> bool:
    """Set keyboard focus on the matched control (UIA SetFocus); True on success."""
    from je_auto_control.utils.accessibility.backends import get_backend
    return get_backend().set_focus(name=name, role=role, app_name=app_name,
                                   automation_id=automation_id)
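
The role check can be illustrated on its own. The sketch below assumes that raw roles arrive either as UIA control-type IDs or as ``ControlType_<id>`` strings and are normalised to a readable name before the set lookup. ``humanize_role_sketch`` is a hypothetical stand-in for ``ax_tree_walk.humanize_role`` and covers only three IDs (50000 Button, 50004 Edit, 50020 Text); the interactive set is trimmed to match.

```python
from typing import Union

# Small subset of UIA control-type IDs, for illustration only.
_NAMES = {50000: "Button", 50004: "Edit", 50020: "Text"}
_INTERACTIVE = frozenset({"Button", "Edit"})
_PREFIX = "ControlType_"


def humanize_role_sketch(role: Union[str, int]) -> str:
    """Hypothetical normaliser: map a raw ID or ``ControlType_<id>`` to a name."""
    if isinstance(role, int):
        return _NAMES.get(role, str(role))
    suffix = role[len(_PREFIX):]
    if role.startswith(_PREFIX) and suffix.isdigit():
        # Unknown IDs fall back to the raw string unchanged.
        return _NAMES.get(int(suffix), role)
    return role


def is_interactive_sketch(role: Union[str, int]) -> bool:
    """True when the normalised role is in the interactive set."""
    return humanize_role_sketch(role) in _INTERACTIVE


checks = {
    "ControlType_50000": is_interactive_sketch("ControlType_50000"),
    "Edit": is_interactive_sketch("Edit"),
    "50020": is_interactive_sketch(50020),
}
print(checks)
```

Static text normalises to ``Text``, which is not in the set, so it is excluded from the Tab sequence.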
38 changes: 38 additions & 0 deletions je_auto_control/utils/mcp_server/tools/_factories.py
@@ -1223,6 +1223,44 @@ def a11y_tree_tools() -> List[MCPTool]:
            handler=h.humanize_role,
            annotations=READ_ONLY,
        ),
        MCPTool(
            name="ac_tab_order",
            description=("List the focusable controls in the order the keyboard "
                         "Tab key would visit them (reading order): "
                         "{order:[{name,role,bounds,center,...}]}."),
            input_schema=schema({
                "app_name": {"type": "string"},
                "max_results": {"type": "integer"},
            }),
            handler=h.tab_order,
            annotations=READ_ONLY,
        ),
        MCPTool(
            name="ac_audit_focus_order",
            description=("WCAG 2.4.x focus-order audit over an app's controls: "
                         "{order, issues, focusable_count, issue_count}. Flags "
                         "focusable controls with no visible area."),
            input_schema=schema({
                "app_name": {"type": "string"},
                "max_results": {"type": "integer"},
            }),
            handler=h.audit_focus_order,
            annotations=READ_ONLY,
        ),
        MCPTool(
            name="ac_focus_control",
            description=("Set keyboard focus on a control natively (UIA "
                         "SetFocus), located by name/role/app_name/"
                         "automation_id. Returns True on success."),
            input_schema=schema({
                "name": {"type": "string"},
                "role": {"type": "string"},
                "app_name": {"type": "string"},
                "automation_id": {"type": "string"},
            }),
            handler=h.focus_control,
            annotations=DESTRUCTIVE,
        ),
        MCPTool(
            name="ac_a11y_record_start",
            description=("Start the polling accessibility recorder. "
15 changes: 15 additions & 0 deletions je_auto_control/utils/mcp_server/tools/_handlers.py
@@ -2897,6 +2897,21 @@ def humanize_role(role):
    return _humanize_role(role)


def tab_order(app_name=None, max_results: int = 500):
    from je_auto_control.utils.executor.action_executor import _tab_order
    return _tab_order(app_name, max_results)


def audit_focus_order(app_name=None, max_results: int = 500):
    from je_auto_control.utils.executor.action_executor import _audit_focus_order
    return _audit_focus_order(app_name, max_results)


def focus_control(name=None, role=None, app_name=None, automation_id=None):
    from je_auto_control.utils.executor.action_executor import _focus_control
    return _focus_control(name, role, app_name, automation_id)


def a11y_record_start(app_name: Optional[str] = None,
                      poll_interval_s: float = 0.25,
                      min_movement_px: int = 8) -> Dict[str, Any]: