This document explains what each benchmark scenario is intended to validate. Source manifest: find-on-screen-scenarios.example.json.
How to read this
- Goal: What reliability/performance behavior we are testing.
- Stressors: What kinds of visual or geometric difficulty are intentionally introduced.
- What should pass: Expected matcher behavior when the scenario is healthy.
Scenario Types (Manifest)
vector_ui_baseline (kind=vector_ui)
- Goal: Validate baseline precision on clean, UI-like shapes.
- Stressors: Flat fills, crisp edges, light photometric jitter, structured decoys.
- What should pass: High IoU and tight bounding boxes with low false positives.
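IoU (intersection over union) appears as a pass criterion above, so a minimal sketch of how it can be computed for two axis-aligned boxes may help. The `Box` type and `IoU` function are illustrative names assumed for this example; they are not taken from the benchmark code, which may represent boxes differently.

```go
package main

import (
	"fmt"
	"math"
)

// Box is an axis-aligned bounding box in pixels (hypothetical type for this sketch).
// (X, Y) is the top-left corner.
type Box struct {
	X, Y, W, H float64
}

// IoU returns the intersection area divided by the union area, in [0, 1].
func IoU(a, b Box) float64 {
	// Overlap along each axis, clamped at zero when the boxes do not touch.
	overlapW := math.Max(0, math.Min(a.X+a.W, b.X+b.W)-math.Max(a.X, b.X))
	overlapH := math.Max(0, math.Min(a.Y+a.H, b.Y+b.H)-math.Max(a.Y, b.Y))
	inter := overlapW * overlapH
	union := a.W*a.H + b.W*b.H - inter
	if union <= 0 {
		return 0 // degenerate boxes with no area
	}
	return inter / union
}

func main() {
	truth := Box{X: 100, Y: 100, W: 50, H: 50}
	found := Box{X: 110, Y: 100, W: 50, H: 50} // shifted 10 px to the right
	fmt.Printf("IoU = %.3f\n", IoU(truth, found)) // prints "IoU = 0.667"
}
```

A 10 px shift on a 50 px box already drops IoU to about 0.67, which is why a tight bounding box matters for this scenario.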
photo_clutter (kind=photographic)
- Goal: Validate matching against photographic content under clutter.
- Stressors: Real asset content, blur/noise/compression variation, mixed decoy placement, optional occlusion.
- What should pass: Stable positive matches even when local texture and photometric conditions drift.
repetitive_grid_camouflage (kind=repetitive_grid)
- Goal: Detect the true target among many near-repeating distractors.
- Stressors: Dense repeated structures with high similarity and camouflage-like layouts.
- What should pass: Correct localization with minimized false positives in repetitive fields.
noise_stress_random (kind=noise_stress)
- Goal: Evaluate robustness under heavy corruption.
- Stressors: Gaussian + salt/pepper + compression artifacts, stronger photometric shifts, random decoy placement, occlusion.
- What should pass: Best-effort matching with controlled miss/false-positive rates in degraded inputs.
scale_rotate_sweep (kind=scale_rotate)
- Goal: Measure scale and rotation tolerance.
- Stressors: Broad scale range and large rotation range, mixed decoys, moderate occlusion.
- What should pass: Engines that claim transform robustness should maintain a usable success rate at nontrivial scale/rotation offsets.
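One way to make "success at nontrivial offsets" measurable is to compute the success rate only over cases whose scale or rotation differs meaningfully from the identity transform. The sketch below assumes a hypothetical `SweepResult` record and caller-chosen thresholds; neither the type nor the threshold values come from the benchmark itself.

```go
package main

import (
	"fmt"
	"math"
)

// SweepResult is a hypothetical per-case record for this sketch.
type SweepResult struct {
	Scale       float64 // applied scale factor; 1.0 means unchanged
	RotationDeg float64 // applied rotation in degrees; 0 means unchanged
	Matched     bool    // whether the engine found the target
}

// SuccessRateAtOffsets returns the success rate over cases whose scale or
// rotation offset reaches the given thresholds, plus the number of such cases.
func SuccessRateAtOffsets(results []SweepResult, minScaleDelta, minRotationDeg float64) (rate float64, n int) {
	matched := 0
	for _, r := range results {
		nontrivial := math.Abs(r.Scale-1.0) >= minScaleDelta ||
			math.Abs(r.RotationDeg) >= minRotationDeg
		if !nontrivial {
			continue // near-identity cases would inflate the rate
		}
		n++
		if r.Matched {
			matched++
		}
	}
	if n == 0 {
		return 0, 0
	}
	return float64(matched) / float64(n), n
}

func main() {
	results := []SweepResult{
		{Scale: 1.0, RotationDeg: 0, Matched: true},   // identity, excluded
		{Scale: 1.5, RotationDeg: 0, Matched: true},   // scaled up
		{Scale: 1.0, RotationDeg: 45, Matched: false}, // rotated
		{Scale: 0.5, RotationDeg: 90, Matched: true},  // scaled down and rotated
	}
	// Thresholds are assumptions chosen for illustration.
	rate, n := SuccessRateAtOffsets(results, 0.25, 15)
	fmt.Printf("%d nontrivial cases, success rate %.2f\n", n, rate)
}
```

Excluding near-identity cases keeps an engine without real transform tolerance from passing on the easy samples alone.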
perspective_skew_sweep (kind=perspective_skew)
- Goal: Measure geometric invariance beyond pure rotation/scale.
- Stressors: Perspective corner shifts, skew, and additional photometric variation.
- What should pass: Feature-based/hybrid approaches should localize targets under projective distortions better than pure template-only modes.
orb_feature_rich (kind=orb_feature_rich)
- Goal: Validate feature detector/descriptor strengths in textured scenes.
- Stressors: Feature-rich synthetic textures, wider transform/photometric variation, clustered decoys.
- What should pass: ORB-family/feature engines should outperform the template engine in success rate while maintaining acceptable latency.
template_control_exact (kind=template_control)
- Goal: Provide a control scenario for near-exact template matching.
- Stressors: Minimal geometric change, constrained photometric variation, cleaner layout.
- What should pass: The template engine should perform strongly (high success rate, low false positives, low latency).
hybrid_gate_conflicts (kind=hybrid_gate)
- Goal: Verify hybrid policy selection logic under conflicting evidence across engines.
- Stressors: Mixed transforms/noise/decoys where some engines may disagree.
- What should pass: Hybrid should avoid obvious misses when at least one underlying engine has a strong, geometrically valid candidate.
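The pass criterion above can be read as a check on each benchmark case: hybrid has made an "obvious miss" if it returned no match while some underlying engine had a strong, geometrically valid candidate. The following is a sketch of that check only, not of the hybrid policy itself. The `Candidate` type, its fields, and the score threshold are assumptions made for illustration.

```go
package main

import "fmt"

// Candidate is a hypothetical per-engine result for this sketch.
type Candidate struct {
	Engine        string
	Score         float64 // engine confidence, assumed normalized to [0, 1]
	GeometryValid bool    // whether the candidate passed a geometric sanity check
}

// ObviousMiss reports whether hybrid failed to match even though at least one
// engine produced a geometrically valid candidate scoring at or above strongScore.
func ObviousMiss(hybridMatched bool, cands []Candidate, strongScore float64) bool {
	if hybridMatched {
		return false
	}
	for _, c := range cands {
		if c.GeometryValid && c.Score >= strongScore {
			return true
		}
	}
	return false
}

func main() {
	cands := []Candidate{
		{Engine: "template", Score: 0.95, GeometryValid: false}, // strong but invalid
		{Engine: "orb", Score: 0.85, GeometryValid: true},       // strong and valid
	}
	// 0.8 is an assumed threshold, not a value from the benchmark.
	fmt.Println(ObviousMiss(false, cands, 0.8)) // prints "true"
}
```

Requiring geometric validity alongside the score keeps a high-scoring but implausible candidate from counting against hybrid when rejecting it was the right call.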
multi_monitor_dpi_shift (kind=multi_monitor_dpi)
- Goal: Validate behavior across monitor profile differences (DPI/gamma/sharpness/color shift).
- Stressors: Round-robin monitor profiles with differing scale and photometric signatures.
- What should pass: Stable matching across profile changes without large regression in localization quality.
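Round-robin assignment of monitor profiles can be sketched as indexing the profile list by case number modulo its length. The `MonitorProfile` fields and the sample values below are hypothetical; the real profiles may carry more parameters (for example sharpness and color shift).

```go
package main

import "fmt"

// MonitorProfile is a hypothetical profile description for this sketch.
type MonitorProfile struct {
	Name  string
	Scale float64 // DPI scale factor, e.g. 1.0 or 2.0
	Gamma float64
}

// ProfileFor picks the profile for a case by cycling through the list.
// It expects a non-empty slice and a non-negative case index.
func ProfileFor(caseIndex int, profiles []MonitorProfile) MonitorProfile {
	return profiles[caseIndex%len(profiles)]
}

func main() {
	// Sample values are illustrative assumptions.
	profiles := []MonitorProfile{
		{Name: "standard", Scale: 1.0, Gamma: 2.2},
		{Name: "hidpi", Scale: 2.0, Gamma: 2.2},
		{Name: "dim", Scale: 1.25, Gamma: 1.8},
	}
	for i := 0; i < 5; i++ {
		fmt.Println(i, ProfileFor(i, profiles).Name)
	}
}
```

Cycling deterministically gives every profile roughly equal coverage, so a regression on one profile shows up as a periodic pattern in per-case results.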
Default Fallback Pack (when no manifest is provided)
Fallback scenarios are generated in resolutionScenarioPack(...) inside packages/api/internal/grpcv1/find_on_screen_benchmark_test.go.
Per resolution, six scenarios are generated:
- vector_r0_<res>: Baseline vector-like matching.
- photo_r90_<res>: Photo-style matching with mixed decoys.
- ui_r180_<res>: UI-style matching on darker palette.
- mix_resize_<xx>_<res> (down): Scale-down robustness.
- mix_resize_<xx>_<res> (up): Scale-up robustness.
- mix_rotate_<deg>_<res>: Rotation robustness with partial-acceptance behavior.
These fallback cases are intentionally simpler than manifest-driven scenarios and mainly serve as a deterministic baseline set.