-
Notifications
You must be signed in to change notification settings - Fork 1
Anthropic CUA Template update (Playwright -> Computer Controls) #72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Verified the new templates work with recorded replays. Similar resulting video for python and typescript templates. replays.3.mp4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice migration overall — the templates read a lot simpler without Playwright/CDP plumbing and the replay recording flow is a good addition.
Main things I called out inline:
- Make session teardown a bit more robust (avoid leaving stale state around in TS; ensure Python cleanup always deletes the browser even if replay polling fails).
- Small TS tool polish: remove now-unused import, and default scroll coordinates to the tracked mouse position instead of hardcoding (0,0).
- Preserve unexpected Anthropic content blocks in
_response_to_paramsto avoid silently dropping future/new block types.
None of these block the PR, just small hardening/maintainability tweaks.
pkg/templates/typescript/anthropic-computer-use/tools/computer.ts
Outdated
Show resolved
Hide resolved
pkg/templates/typescript/anthropic-computer-use/tools/computer.ts
Outdated
Show resolved
Hide resolved
pkg/templates/typescript/anthropic-computer-use/tools/computer.ts
Outdated
Show resolved
Hide resolved
|
working on tembo / cursor comments now |
pkg/templates/typescript/anthropic-computer-use/tools/computer.ts
Outdated
Show resolved
Hide resolved
pkg/templates/typescript/anthropic-computer-use/tools/computer.ts
Outdated
Show resolved
Hide resolved
pkg/templates/typescript/anthropic-computer-use/tools/computer.ts
Outdated
Show resolved
Hide resolved
rgarcia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good refactor from Playwright to Kernel Computer Controls API. The Python implementation looks solid. Main issues are in the TypeScript side:
- Key mappings need to use X11 keysym names (Python has correct mappings to reference)
- Minor naming/cleanup items
The existing comments from previous review cover the session cleanup and scroll behavior edge cases well.
@rgarcia all fixed and tested the templates. tagging in case you want to do another check |
pkg/templates/typescript/anthropic-computer-use/tools/computer.ts
Outdated
Show resolved
Hide resolved
tnsardesai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. should we consider adding an api to get cursor position?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
Fix remaining TS items + update Python template for Anthropic CUA to utilize computer controls instead of Playwright. Still do to: optimize click location issues.
…8 viewport and Claude Sonnet 4.5 Updates both TypeScript and Python Anthropic Computer Use templates: - Set viewport to 1024x768@60Hz (Anthropic recommended size) - Update model to claude-sonnet-4-5-20250929 - Fix coordinate alignment between browser viewport and computer tool dimensions Changes: - pkg/templates/typescript/anthropic-computer-use/ - tools/computer.ts: display_width_px=1024, display_height_px=768 - session.ts: viewport 1024x768@60Hz - index.ts: model updated to claude-sonnet-4-5-20250929 - pkg/templates/python/anthropic-computer-use/ - tools/computer.py: width=1024, height=768 - session.py: viewport 1024x768@60Hz - main.py: model updated to claude-sonnet-4-5-20250929 Test replays (magnitasks.com Kanban drag test - moved 5 items to Done): - TypeScript: https://proxy.iad-awesome-blackwell.onkernel.com:8443/browser/replays?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE4MDAwMTgyNTYsInNlc3Npb24iOnsiaWQiOiJmZDA3NGRxZjY5bnNlcjk4aDliNGtrb3giLCJjZHBQb3J0Ijo5MjIyLCJjZHBXc1BhdGgiOiIiLCJpbnN0YW5jZU5hbWUiOiJicm93c2VyLXN0ZWFsdGgtcHJvZHVjdGlvbi01LWFsbG93ZWQtaGFtbWVyaGVhZC00MjcxIiwiZnFkbiI6InF1aWV0LXRyZWUtM3kybnd6c2EucHJvZC1pYWQtdWtwLWJyb3dzZXJzLTAub25rZXJuZWwuYXBwIiwibWV0cm8iOiJodHRwczovL2FwaS5wcm9kLWlhZC11a3AtYnJvd3NlcnMtMC5vbmtlcm5lbC5ydW4vdjEiLCJ1c2VySWQiOiJ3ODdoNHd1dTRoazNmeHFyZW5iNzFrMnAiLCJvcmdJZCI6ImlxMnRmMjUzbWlsOWptOWhmZjI3bDhyMiIsInN0ZWFsdGgiOnRydWUsImhlYWRsZXNzIjpmYWxzZSwicmVwbGF5UHJlZml4IjoiczM6Ly9rZXJuZWwtYXBpLXByb2Qvc2Vzc2lvbnJlcGxheXMvaXEydGYyNTNtaWw5am05aGZmMjdsOHIyL2ZkMDc0ZHFmNjluc2VyOThoOWI0a2tveCIsImtlcm5lbEh0dHBTZXJ2ZXJQb3J0Ijo0NDQsInRpbWVvdXRTZWNvbmRzIjozMDAsImNyZWF0ZWRBdCI6IjIwMjYtMDEtMTVUMTM6MDQ6MTYuNzc2OTEwOTc5WiIsImltYWdlIjoib25rZXJuZWwva2VybmVsLWN1LXYyNTo5NmYzOGU0Iiwic3RlYWx0aFByb3h5SWRlbnRpZmllciI6Ijg3NTY1X25YREZGQDIxNi4yNDcuMTAyLjE1MDo2MTIzMiIsImxpdmVTbHVnIjoia3c5b0lBc1VzRkxlIiwicHJpdmF0ZUlQIjoiMTcyLjE2LjIuMjAxIiwidmlld3BvcnRXaWR0aCI6MTAyNCwidmlld3BvcnRIZWlnaHQiOjc2OCwidmlld3BvcnRSZWZyZXNoUmF0ZSI6NjB9fQ.GHE2BXg6qrtNMoqO6NvuJ9fbHTW15igfmXl7W-ls3Qg&replay_id=wipxrn813lmajv7ukdkuykoa - Python: https://proxy.iad-awesome-blackwell.onkernel.com:8443/browser/replays?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE4MDAwMTc4OTUsInNlc3Npb24iOnsiaWQiOiJseTVxOXQxa3F6YXR3NzE1N3lpYzl2M3IiLCJjZHBQb3J0Ijo5MjIyLCJjZHBXc1BhdGgiOiIiLCJpbnN0YW5jZU5hbWUiOiJicm93c2VyLXN0ZWFsdGgtcHJvZHVjdGlvbi01LXJlYWwtd2F0Y2htZW4tNTUxNCIsImZxZG4iOiJ0d2lsaWdodC1ib25vYm8tZGFvZTd5ZngucHJvZC1pYWQtdWtwLWJyb3dzZXJzLTAub25rZXJuZWwuYXBwIiwibWV0cm8iOiJodHRwczovL2FwaS5wcm9kLWlhZC11a3AtYnJvd3NlcnMtMC5vbmtlcm5lbC5ydW4vdjEiLCJ1c2VySWQiOiJ3ODdoNHd1dTRoazNmeHFyZW5iNzFrMnAiLCJvcmdJZCI6ImlxMnRmMjUzbWlsOWptOWhmZjI3bDhyMiIsInN0ZWFsdGgiOnRydWUsImhlYWRsZXNzIjpmYWxzZSwicmVwbGF5UHJlZml4IjoiczM6Ly9rZXJuZWwtYXBpLXByb2Qvc2Vzc2lvbnJlcGxheXMvaXEydGYyNTNtaWw5am05aGZmMjdsOHIyL2x5NXE5dDFrcXphdHc3MTU3eWljOXYzciIsImtlcm5lbEh0dHBTZXJ2ZXJQb3J0Ijo0NDQsInRpbWVvdXRTZWNvbmRzIjozMDAsImNyZWF0ZWRBdCI6IjIwMjYtMDEtMTVUMTI6NTg6MTUuMzk0MjQyNTc3WiIsImltYWdlIjoib25rZXJuZWwva2VybmVsLWN1LXYyNTo5NmYzOGU0Iiwic3RlYWx0aFByb3h5SWRlbnRpZmllciI6Ijg3NTY1X25YREZGQDE0MC4yMzMuMjQ5LjE3NDo2MTIzNCIsImxpdmVTbHVnIjoiak5DdGdpdHRreGtrIiwicHJpdmF0ZUlQIjoiMTcyLjE2LjcuMTMzIiwidmlld3BvcnRXaWR0aCI6MTAyNCwidmlld3BvcnRIZWlnaHQiOjc2OCwidmlld3BvcnRSZWZyZXNoUmF0ZSI6NjB9fQ._AhzTu1HwawrWwDgo66K3FZkEh4dpiOEVPmBTO4A21A&replay_id=pa0ha28zodehf1e1jyv1qibn Resolves KERNEL-725
Updated invokecommand example for the anthropic templates
…sistency TypeScript template: - Add xdotool-format key mappings for consistency with Python template - Rename methods from convertToOnKernelKey to convertToKernelKey - Fix scroll fallback to use lastMousePosition instead of [0, 0] - Fix scroll amount using ?? operator to handle zero correctly - Remove unused KeyboardUtils import - Fix error message: "OnKernel" → "Kernel" - Reset all state fields (liveViewUrl, replayViewUrl) on session stop - Handle replay recording failures gracefully with try/catch Python template: - Wrap cleanup in try/finally to ensure browser deletion on errors - Handle replay recording failures gracefully with try/except - Preserve unexpected Anthropic content block types in loop
…otool behavior Anthropic's reference implementation uses xdotool where each scroll_amount unit equals one scroll wheel click (~120 pixels). Previously: - TypeScript used the value directly - Python used a 10x multiplier Both now use 120x to match Anthropic's expected behavior for AI agents.
Wrap replay stopping logic in try/finally to ensure browser session is always deleted even if stopReplay() fails. This prevents resource leaks on the Kernel platform when replay recording is enabled and stopping fails. Matches the existing Python implementation behavior.
d8b7ec1 to
ac3baaa
Compare

Anthropic Computer Use Template Overhaul
This PR overhauls both the TypeScript and Python Anthropic Computer Use templates to use Kernel's Computer Controls API instead of Playwright for all browser interactions.
Why This Change
The previous implementation used Playwright directly, which required maintaining browser connections and handling lower-level browser automation. By migrating to Kernel's Computer Controls API, users get:
Architecture Overview
File Structure (TypeScript)
The Python template follows the same structure with equivalent modules.
Key Components
1. KernelBrowserSession (
session.ts/session.py)Manages the browser lifecycle as a context manager:
Features:
2. ComputerTool (
tools/computer.ts/tools/computer.py)Maps Anthropic's computer use actions to Kernel's Computer Controls API:
left_click,right_click,double_clickcomputer.clickMouse()mouse_movecomputer.moveMouse()left_click_dragcomputer.dragMouse()typecomputer.typeText()keycomputer.pressKey()scrollcomputer.scroll()screenshotcomputer.captureScreenshot()Key implementation details:
lastMousePositionto support drag operations from current positioncomputer_use_20241022andcomputer_use_20250124API versions3. Sampling Loop (
loop.ts/loop.py)Implements the Anthropic computer use prompt loop:
New Features
Replay Recording
Users can enable video replay recording by passing
record_replay: truein the payload:kernel invoke ts-anthropic-cua cua-task --payload '{"query": "...", "record_replay": true}'The response includes a
replay_urlfield with a link to view the recorded session.Known Limitations
Cursor Position: The
cursor_positionaction is not supported with Kernel's Computer Controls API. If the model attempts to use this action, an error is returned. This is a known limitation that does not significantly impact most workflows, as the model tracks cursor position through screenshots.Testing
Both templates have been tested with the magnitasks.com Kanban board task, which exercises:
left_click_drag)Updated Documentation
Note
Overhauls the Anthropic Computer Use templates to use Kernel’s Computer Controls API instead of Playwright, with built-in browser session management and optional replay recording.
tools/computer.ts,loop.ts,session.ts,index.ts) and Python (tools/computer.py,loop.py,session.py,main.py) templatesKernelBrowserSessionto manage browser lifecycle, live view, and replays (configurable viewport1024x768@60Hz; stop/poll forreplay_url)ToolCollectionwith Kernel client andsessionId; handle thinking blocks and tool_use routing; enable prompt cachingcursor_position; track last mouse position for drags; standardize typing delay and screenshot flow@onkernel/sdk/kernelto0.24.0); remove Playwright deps and code pathsrecord_replayflagpkg/create/templates.goto new invoke payloadsWritten by Cursor Bugbot for commit ac3baaa. This will update automatically on new commits. Configure here.