SikuliX API Docs Audit

This audit compares the documented behavior in the official SikuliX API docs with the current Go port implementation in packages/api/pkg/sikuli and packages/api/internal/grpcv1.

Scope:

Executive Summary

The largest parity gap is architectural: SikuliX documents Screen, Region, and Match as live, action-capable desktop objects. The current Go port exposes image-scoped search primitives in pkg/sikuli and moves live desktop work into a separate gRPC server plus controller types.

That means the Go port currently differs from the SikuliX docs in four material ways:

  1. The public Go Region API is image-driven, not live-screen-driven.
  2. The public Go Screen type is only a descriptor, not a live monitor/search/input surface.
  3. Match is a value object, not a Region-like object with direct action methods.
  4. Several documented SikuliX convenience APIs are either absent or split across separate controllers/RPCs.

Findings

| Area | SikuliX Docs Describe | Current Go Port | Impact |
| --- | --- | --- | --- |
| Region runtime model | Region is a live search and action surface on the desktop | Region methods require a source *Image; live-screen behavior exists only behind gRPC screen RPCs | High |
| Screen surface | Screen extends region behavior, represents monitors, supports capture and monitor selection | Screen is only {ID, Bounds} in pkg/sikuli; no public capture/search/action methods | High |
| Match semantics | Match behaves like a Region and can be clicked/hovered/used directly | Match is a plain struct (Rect, Score, Target, Index) | High |
| Action methods | Direct click, doubleClick, rightClick, hover, dragDrop, type, paste, wheel/mouse/key state operations | Input is split into InputController plus gRPC RPCs; many direct convenience methods are absent | High |
| Iteration model | Finder.findAll() uses hasNext()/next() and destroy() lifecycle | Go returns []Match; no iterator state or destroy() | Medium |
| Exception/null semantics | find() throws FindFailed; exists() returns null; setThrowException changes behavior | Go uses (Match, bool, error) and sentinel errors; ThrowException / FindFailedThrows exist but are not wired into search behavior | High |
| Multi-target helpers | findAnyList, findBestList, getAll, related helper families are documented | No equivalent helpers in pkg/sikuli or gRPC surface | Medium |
| OCR helper surface | Docs describe text(), findText(), collectLines(), collectWords() and related text workflows | Go exposes ReadText and FindText only | Medium |
| App/window model | Docs describe richer App / Window workflows (window(), focusedWindow(), allWindows()) | Go exposes Open, Focus, Close, IsRunning, ListWindows only | Medium |
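
To make the exception/null semantics finding concrete, the following is a minimal sketch of how a throw-on-miss flag could be wired into a search call. It is not the port's implementation: the Searcher type, its Lookup field, ErrFindFailed, and IsFindFailed are hypothetical names introduced for illustration, and the Match struct is trimmed to two fields.

```go
package main

import (
	"errors"
	"fmt"
)

// Match is a trimmed-down, hypothetical stand-in for the port's match value.
type Match struct {
	Score float64
	Index int
}

// ErrFindFailed is a hypothetical sentinel playing the role of FindFailed.
var ErrFindFailed = errors.New("find failed")

// Searcher is a hypothetical wrapper; Lookup stands in for the real search.
type Searcher struct {
	ThrowException bool
	Lookup         func(target string) (Match, bool, error)
}

// Exists never treats a miss as an error: ok is false and err is nil.
func (s Searcher) Exists(target string) (Match, bool, error) {
	return s.Lookup(target)
}

// Find reports a miss as ErrFindFailed only when ThrowException is set.
func (s Searcher) Find(target string) (Match, bool, error) {
	m, ok, err := s.Lookup(target)
	if err != nil {
		return Match{}, false, err
	}
	if !ok && s.ThrowException {
		return Match{}, false, fmt.Errorf("%w: %q", ErrFindFailed, target)
	}
	return m, ok, nil
}

// IsFindFailed reports whether err wraps ErrFindFailed.
func IsFindFailed(err error) bool { return errors.Is(err, ErrFindFailed) }

// demoLookup only "finds" the target named "ok-button".
func demoLookup(target string) (Match, bool, error) {
	if target == "ok-button" {
		return Match{Score: 0.97}, true, nil
	}
	return Match{}, false, nil
}

func main() {
	strict := Searcher{ThrowException: true, Lookup: demoLookup}
	lenient := Searcher{ThrowException: false, Lookup: demoLookup}

	_, _, err := strict.Find("missing")
	fmt.Println("strict miss is FindFailed:", IsFindFailed(err))

	_, ok, err := lenient.Find("missing")
	fmt.Println("lenient miss:", ok, err)
}
```

In this sketch Exists ignores the flag entirely, which mirrors the documented split between a throwing find() and a non-throwing exists().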

Detailed Differences

1. Region is not a live desktop object in the public Go API

SikuliX documentation presents Region as the primary live desktop abstraction: search, wait, click, hover, type, paste, drag/drop, OCR, and observe all hang directly off the region or screen.
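
One way to picture the gap is a thin adapter that binds desktop bounds to a capture source, so callers no longer pass a source image themselves. The sketch below is illustrative only: Capturer, SearchFunc, LiveRegion, and ErrNotFound are hypothetical names, not types from pkg/sikuli, and the fake capture and search functions exist only to make the example runnable.

```go
package main

import (
	"errors"
	"fmt"
	"image"
)

// Capturer is a hypothetical abstraction over "grab the pixels of a desktop
// rectangle". It is not part of pkg/sikuli.
type Capturer interface {
	Capture(bounds image.Rectangle) (image.Image, error)
}

// SearchFunc is a hypothetical stand-in for an image-scoped search primitive.
// It returns a hit rectangle in the coordinate space of the source image.
type SearchFunc func(source image.Image, pattern string) (image.Rectangle, bool, error)

// ErrNotFound is a hypothetical sentinel for a search miss.
var ErrNotFound = errors.New("pattern not found")

// LiveRegion binds bounds to a capturer so callers do not supply a source image.
type LiveRegion struct {
	Bounds image.Rectangle
	Screen Capturer
	Search SearchFunc
}

// Find captures the region, searches the capture, and translates the hit
// from capture-local coordinates back to desktop coordinates.
func (r LiveRegion) Find(pattern string) (image.Rectangle, error) {
	src, err := r.Screen.Capture(r.Bounds)
	if err != nil {
		return image.Rectangle{}, fmt.Errorf("capture: %w", err)
	}
	hit, ok, err := r.Search(src, pattern)
	if err != nil {
		return image.Rectangle{}, err
	}
	if !ok {
		return image.Rectangle{}, ErrNotFound
	}
	return hit.Add(r.Bounds.Min), nil
}

// fakeScreen returns a blank image sized to the requested bounds.
type fakeScreen struct{}

func (fakeScreen) Capture(b image.Rectangle) (image.Image, error) {
	return image.NewRGBA(image.Rect(0, 0, b.Dx(), b.Dy())), nil
}

// fakeSearch pretends that only "ok-button" is visible.
func fakeSearch(_ image.Image, pattern string) (image.Rectangle, bool, error) {
	if pattern == "ok-button" {
		return image.Rect(10, 20, 30, 40), true, nil
	}
	return image.Rectangle{}, false, nil
}

// newDemoRegion wires the fakes together for the example.
func newDemoRegion() LiveRegion {
	return LiveRegion{
		Bounds: image.Rect(100, 100, 500, 400),
		Screen: fakeScreen{},
		Search: fakeSearch,
	}
}

func main() {
	hit, err := newDemoRegion().Find("ok-button")
	fmt.Println(hit, err)
}
```

The coordinate translation is the part that an image-scoped API cannot do on its own, because it has no notion of where the source image sits on the desktop.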

Current Go behavior:

Why this matters:

Relevant implementation:

2. Screen is only a descriptor, not a live action surface

SikuliX documentation describes Screen as a specialized region representing monitors, with monitor enumeration, monitor IDs, primary screen semantics, and capture helpers.
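
As a rough illustration of what monitor selection could look like on top of a descriptor-only type, the sketch below looks screens up by ID and by desktop point. The Screen field types, the helper names ScreenByID and ScreenAt, and the two-monitor layout are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"image"
)

// Screen mirrors the descriptor shape named in this audit ({ID, Bounds}).
// The field types are assumptions.
type Screen struct {
	ID     int
	Bounds image.Rectangle
}

// ScreenByID returns the descriptor with the given ID, if any.
func ScreenByID(screens []Screen, id int) (Screen, bool) {
	for _, s := range screens {
		if s.ID == id {
			return s, true
		}
	}
	return Screen{}, false
}

// ScreenAt returns the screen whose bounds contain the desktop point (x, y).
func ScreenAt(screens []Screen, x, y int) (Screen, bool) {
	p := image.Pt(x, y)
	for _, s := range screens {
		if p.In(s.Bounds) {
			return s, true
		}
	}
	return Screen{}, false
}

// demoScreens describes two side-by-side 1920x1080 monitors.
func demoScreens() []Screen {
	return []Screen{
		{ID: 0, Bounds: image.Rect(0, 0, 1920, 1080)},
		{ID: 1, Bounds: image.Rect(1920, 0, 3840, 1080)},
	}
}

func main() {
	screens := demoScreens()
	if s, ok := ScreenByID(screens, 1); ok {
		fmt.Println("screen 1 bounds:", s.Bounds)
	}
	if s, ok := ScreenAt(screens, 2000, 500); ok {
		fmt.Println("point is on screen", s.ID)
	}
}
```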

Current Go behavior:

Why this matters:

Relevant implementation:

3. Match is not Region-like

SikuliX documentation treats Match as a region-like result object that can be acted on directly.
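
A value-object match can be given Region-like behavior by pairing it with an input dependency. The sketch below shows one possible shape. The Match fields follow the names listed in this audit (Rect, Score, Target, Index), but their types are assumed, and Clicker, ActionableMatch, and Bind are hypothetical names that do not come from the port.

```go
package main

import (
	"fmt"
	"image"
)

// Match uses the field names listed in this audit; the types are assumptions.
type Match struct {
	Rect   image.Rectangle
	Score  float64
	Target image.Point
	Index  int
}

// Clicker is a hypothetical slice of an input controller.
type Clicker interface {
	Click(x, y int) error
}

// ActionableMatch pairs a plain match with the input dependency it needs
// in order to be acted on directly.
type ActionableMatch struct {
	Match
	input Clicker
}

// Bind attaches an input dependency to a value-object match.
func Bind(m Match, input Clicker) ActionableMatch {
	return ActionableMatch{Match: m, input: input}
}

// Click clicks the match's target point.
func (a ActionableMatch) Click() error {
	return a.input.Click(a.Target.X, a.Target.Y)
}

// recorder is a test double that records click positions.
type recorder struct{ clicks []image.Point }

func (r *recorder) Click(x, y int) error {
	r.clicks = append(r.clicks, image.Pt(x, y))
	return nil
}

// demoMatch returns a fixed match for the example.
func demoMatch() Match {
	return Match{Rect: image.Rect(10, 10, 40, 30), Score: 0.93, Target: image.Pt(25, 20)}
}

func main() {
	rec := &recorder{}
	m := Bind(demoMatch(), rec)
	if err := m.Click(); err != nil {
		fmt.Println("click failed:", err)
		return
	}
	fmt.Println("clicked at", rec.clicks[0], "score", m.Score)
}
```

Embedding keeps the plain fields accessible on the wrapper, so existing code that reads Score or Rect would not need to change under this design.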

Current Go behavior:

Why this matters:

Relevant implementation:

4. Direct action parity is incomplete

The Region docs describe direct action methods such as:

Current Go behavior:

Why this matters:

Relevant implementation:

5. Finder iteration and lifecycle differ

SikuliX Finder docs describe iterator-style behavior with findAll(), hasNext(), next(), and destroy().
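
For comparison, iterator-style access can be layered over a slice result. The following is a minimal sketch under that assumption: MatchIterator and its methods are hypothetical names, and the Match struct is reduced to two fields for brevity.

```go
package main

import "fmt"

// Match is a trimmed-down, hypothetical stand-in for the port's match value.
type Match struct {
	Index int
	Score float64
}

// MatchIterator adapts a []Match result to hasNext()/next()/destroy()-style use.
type MatchIterator struct {
	matches []Match
	pos     int
}

// NewMatchIterator wraps an already-computed slice of matches.
func NewMatchIterator(matches []Match) *MatchIterator {
	return &MatchIterator{matches: matches}
}

// HasNext reports whether another match is available.
func (it *MatchIterator) HasNext() bool { return it.pos < len(it.matches) }

// Next returns the next match, or false once the iterator is exhausted.
func (it *MatchIterator) Next() (Match, bool) {
	if !it.HasNext() {
		return Match{}, false
	}
	m := it.matches[it.pos]
	it.pos++
	return m, true
}

// Destroy drops the reference to the underlying slice and resets the cursor.
func (it *MatchIterator) Destroy() { it.matches, it.pos = nil, 0 }

func main() {
	it := NewMatchIterator([]Match{{Index: 0, Score: 0.99}, {Index: 1, Score: 0.91}})
	defer it.Destroy()
	for it.HasNext() {
		m, _ := it.Next()
		fmt.Println(m.Index, m.Score)
	}
}
```

In Go the Destroy step only releases a slice reference; it is included here to mirror the documented lifecycle rather than because garbage-collected results need it.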

Current Go behavior:

Why this matters:

Relevant implementation:

6. Miss and timeout semantics differ from the documented exception model

SikuliX docs describe:

Current Go behavior:

Why this matters:

Relevant implementation:

7. Multi-target search helpers documented in SikuliX are absent

The Region docs describe helper families such as findAnyList, findBestList, and getAll-style multi-target workflows.
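
Helpers in this family can be approximated by looping a single-target search over a list of targets. The sketch below assumes a hypothetical FindFunc with the (Match, bool, error) shape mentioned in this audit. FindAnyList and FindBestList here are illustrative Go names, not existing functions in the port, and the convention of storing the target's list position in Index is an assumption about the intended semantics.

```go
package main

import "fmt"

// Match is a trimmed-down, hypothetical stand-in for the port's match value.
type Match struct {
	Score float64
	Index int
}

// FindFunc is a hypothetical single-target search.
type FindFunc func(target string) (Match, bool, error)

// FindAnyList returns one match per target that was found. Each match's
// Index is set to the position of its target in the input list.
func FindAnyList(find FindFunc, targets []string) ([]Match, error) {
	var found []Match
	for i, t := range targets {
		m, ok, err := find(t)
		if err != nil {
			return nil, fmt.Errorf("target %d (%q): %w", i, t, err)
		}
		if ok {
			m.Index = i
			found = append(found, m)
		}
	}
	return found, nil
}

// FindBestList returns the highest-scoring match across all targets.
func FindBestList(find FindFunc, targets []string) (Match, bool, error) {
	found, err := FindAnyList(find, targets)
	if err != nil || len(found) == 0 {
		return Match{}, false, err
	}
	best := found[0]
	for _, m := range found[1:] {
		if m.Score > best.Score {
			best = m
		}
	}
	return best, true, nil
}

// demoFind "finds" two of the three targets used in main.
func demoFind(target string) (Match, bool, error) {
	scores := map[string]float64{"save": 0.81, "ok": 0.95}
	s, ok := scores[target]
	return Match{Score: s}, ok, nil
}

func main() {
	targets := []string{"save", "cancel", "ok"}
	all, _ := FindAnyList(demoFind, targets)
	best, ok, _ := FindBestList(demoFind, targets)
	fmt.Println(len(all), best.Index, ok)
}
```

This version searches sequentially; a parallel variant would need to consider whether the underlying search is safe for concurrent use.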

Current Go behavior:

Why this matters:

Relevant implementation:

8. OCR helper surface is narrower than the docs

The Text and OCR docs describe a broader helper vocabulary, including plain text extraction, search, and collection helpers such as collectLines() and collectWords().
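
Part of this vocabulary can be approximated from plain text alone. The sketch below splits an OCR result string into lines and words. It is a text-only approximation: the function names LinesText and WordsText are hypothetical, the input is assumed to be a plain string such as a text-reading call might return, and no bounding boxes are produced, which the documented collection helpers would normally carry.

```go
package main

import (
	"fmt"
	"strings"
)

// LinesText splits OCR output into trimmed, non-empty lines.
func LinesText(text string) []string {
	var lines []string
	for _, raw := range strings.Split(text, "\n") {
		// TrimSpace also removes a trailing '\r' from CRLF input.
		if line := strings.TrimSpace(raw); line != "" {
			lines = append(lines, line)
		}
	}
	return lines
}

// WordsText splits OCR output into whitespace-separated words.
func WordsText(text string) []string {
	return strings.Fields(text)
}

func main() {
	ocr := "File  Edit  View\r\n\n  Save As...  \n"
	fmt.Printf("%q\n", LinesText(ocr))
	fmt.Printf("%q\n", WordsText(ocr))
}
```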

Current Go behavior:

Why this matters:

Relevant implementation:

9. App and window APIs are narrower than the documented SikuliX model

The App docs describe a richer app/window model around app instances, focused windows, and window-specific selection.
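
Window-specific selection could, in principle, be built as small helpers over a window listing. The sketch below assumes a hypothetical Window struct with Title and Focused fields; the actual shape returned by ListWindows is not described in this audit, so treat the types and the helper names FocusedWindow and WindowAt as illustrative.

```go
package main

import "fmt"

// Window is a hypothetical window descriptor; the real listing may differ.
type Window struct {
	Title   string
	Focused bool
}

// FocusedWindow returns the first window marked as focused, if any.
func FocusedWindow(windows []Window) (Window, bool) {
	for _, w := range windows {
		if w.Focused {
			return w, true
		}
	}
	return Window{}, false
}

// WindowAt returns the n-th window of the listing, with bounds checking.
func WindowAt(windows []Window, n int) (Window, bool) {
	if n < 0 || n >= len(windows) {
		return Window{}, false
	}
	return windows[n], true
}

// demoWindows returns a fixed listing for the example.
func demoWindows() []Window {
	return []Window{
		{Title: "Untitled - Editor"},
		{Title: "Preferences", Focused: true},
	}
}

func main() {
	windows := demoWindows()
	if w, ok := FocusedWindow(windows); ok {
		fmt.Println("focused:", w.Title)
	}
	if w, ok := WindowAt(windows, 0); ok {
		fmt.Println("first:", w.Title)
	}
}
```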

Current Go behavior:

Why this matters:

Relevant implementation:

Areas Where the Go Port Exposes Different or Additional Behavior

These are not gaps in the strict sense, but they are still behavior differences relative to the SikuliX docs:

Those differences are already covered at a higher level in behavioral-differences.md; this audit is focused on the public API behavior mismatches.

Recommendations

  1. Decide whether pkg/sikuli is intended to be a true SikuliX-shaped API or only a compatibility-oriented core for server/client wrappers.
  2. If true public parity is desired, prioritize:
  3. Keep java-to-go-mapping.md and this audit aligned. The generated mapping currently overstates parity in a few places where the core Go surface still differs materially from the docs.