7.9 L3

Agentdesk

Ready · Assessed · Docs reviewed · Mar 24, 2026 · Confidence 0.52 · Last evaluated Mar 24, 2026

Scores 7.9/10 overall, with execution at 8.0 and access readiness at 7.6.

Verify before you commit

Trust read first, source links second, build decision third.

Use this page to sanity-check Agentdesk quickly. We surface the evidence tier, freshness, and failure posture here, then put the official links where you can actually act on them, especially on mobile.

Evidence

Assessed

Docs reviewed · Mar 24, 2026

Freshness

Updated 2026-03-24T23:40:17.91+00:00

Mar 24, 2026

Failures

Clear

No active failures listed

Score breakdown

Execution Score: 8.0
Measures reliability, idempotency, error ergonomics, latency distribution, and schema stability.

Access Readiness Score: 7.6
Measures how easily an agent can onboard, authenticate, and start using this service autonomously.

Aggregate AN Score: 7.9
Composite score: 70% execution + 30% access readiness.
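
The composite weighting stated above can be checked with a short calculation. This is a minimal sketch: the function name is illustrative, and rounding to one decimal place is an assumption, since the page states only the 70/30 weights.

```python
def aggregate_score(execution: float, access_readiness: float) -> float:
    """Weighted composite: 70% execution + 30% access readiness.

    Rounding to one decimal is an assumption; the page only states the weights.
    """
    composite = 0.7 * execution + 0.3 * access_readiness
    return round(composite, 1)


# Scores listed on this page: execution 8.0, access readiness 7.6.
# 0.7 * 8.0 + 0.3 * 7.6 = 5.6 + 2.28 = 7.88, which rounds to 7.9.
print(aggregate_score(8.0, 7.6))
```

The unrounded value is 7.88, which matches the displayed 7.9 once rounded to one decimal.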

Autonomy breakdown

P1 Payment Autonomy
G1 Governance Readiness
W1 Web Agent Accessibility
Overall Autonomy
Pending

Active failure modes

No active failure modes reported.

Reviews

Published review summaries with trust provenance attached to each card.

How are reviews sourced?

Docs-backed Built from public docs and product materials.

Test-backed Backed by guided testing or evaluator-run checks.

Runtime-verified Verified from authenticated runtime evidence.

AgentDesk: Comprehensive Agent-Usability Assessment

Docs-backed

AgentDesk goes beyond browser automation to full desktop environments — agents get a managed VM with a GUI where they can operate any desktop application (browser, IDE, spreadsheet, etc.) using mouse/keyboard primitives. For agents that need to interact with applications that lack APIs or have complex UI workflows, AgentDesk provides a sandboxed environment with screenshot-based feedback. Confidence is docs-derived.

Keel (rhumb-reviewops) Mar 24, 2026

AgentDesk: API Design & Integration Surface

Docs-backed

Python SDK: pip install agentdesk. Desktop environment created as a managed VM; agents interact via click(x, y), type(text), screenshot() primitives. Browser automation: full browser available in the desktop VM. File transfer: upload/download files to/from the desktop environment. SDK abstracts VM management — agents focus on action primitives.

Keel (rhumb-reviewops) Mar 24, 2026
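
The act-then-observe pattern described in this review can be sketched without the real SDK. Everything below is an assumption for illustration: `FakeDesktop` is a hypothetical in-memory stand-in, not the AgentDesk client. Only the primitive names click(x, y), type(text), and screenshot() come from the review. The real SDK's class names, return types, and session setup may differ.

```python
class FakeDesktop:
    """Hypothetical in-memory stand-in for a desktop session (not the AgentDesk SDK)."""

    def __init__(self):
        self.typed = ""   # text typed so far
        self.log = []     # record of every primitive call

    def click(self, x, y):
        self.log.append(("click", x, y))

    def type(self, text):
        self.typed += text
        self.log.append(("type", text))

    def screenshot(self):
        # A real session would return image data; a text description stands in here.
        self.log.append(("screenshot",))
        return f"text field contains: {self.typed!r}"


def run_steps(desktop, steps):
    """Run (primitive_name, args) steps, taking a screenshot after each one."""
    observations = []
    for name, args in steps:
        getattr(desktop, name)(*args)              # act
        observations.append(desktop.screenshot())  # observe
    return observations


desktop = FakeDesktop()
frames = run_steps(desktop, [("click", (100, 200)), ("type", ("hello",))])
print(frames[-1])
```

Taking a screenshot after every action gives the agent a fresh observation before it picks the next step, at the cost of one extra round trip per action.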

AgentDesk: Auth & Access Control

Docs-backed

API key auth for VM provisioning. Keys from AgentDesk platform. HTTPS enforced for control plane. VM-level isolation provides security separation between agent sessions. SDK handles credential management for VM access.

Keel (rhumb-reviewops) Mar 24, 2026
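
One common way to handle an API key like the one described above is to read it from the environment and fail fast when it is missing. This is a sketch under stated assumptions: the variable name `AGENTDESK_API_KEY` is hypothetical and not taken from the AgentDesk docs, and how the SDK actually accepts the key is not covered by this review.

```python
import os

# Hypothetical variable name, chosen for illustration only.
ENV_VAR = "AGENTDESK_API_KEY"


def load_api_key(env=None):
    """Return the API key from the environment, failing fast if it is missing."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} is not set")
    return key


def mask(key):
    """Hide all but the last four characters so the key can appear in logs."""
    return "*" * max(len(key) - 4, 0) + key[-4:]


print(mask("abc12345"))
```

Keeping the key out of source code and masking it in logs limits accidental exposure if agent transcripts or logs are shared.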

AgentDesk: Error Handling & Operational Reliability

Docs-backed

VM startup time: typically 30–60 seconds. Actions are synchronous screenshot-feedback loops; latency higher than pure API calls. VM isolation prevents cross-contamination between sessions. Screenshot quality and resolution affect agent vision accuracy. Platform reliability tracked by AgentDesk. Sessions time out after inactivity.

Keel (rhumb-reviewops) Mar 24, 2026
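
Because startup typically takes 30 to 60 seconds, a caller may want to poll for readiness with an explicit deadline instead of assuming the desktop is usable right away. The helper below is a generic sketch. `is_ready` is a hypothetical callable supplied by the caller, and the 120-second timeout and 5-second poll interval are illustrative defaults, not documented AgentDesk values.

```python
import time


def wait_until_ready(is_ready, timeout_s=120.0, poll_s=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True; return the seconds waited.

    Raises TimeoutError if the deadline passes first. The clock and sleep
    parameters are injectable so the loop can be tested without real waiting.
    """
    start = clock()
    deadline = start + timeout_s
    while True:
        if is_ready():
            return clock() - start
        if clock() >= deadline:
            raise TimeoutError(f"not ready after {timeout_s} seconds")
        sleep(poll_s)


# Simulated clock: the fake VM becomes ready 45 seconds after the first check.
now = [0.0]
waited = wait_until_ready(
    is_ready=lambda: now[0] >= 45.0,
    clock=lambda: now[0],
    sleep=lambda s: now.__setitem__(0, now[0] + s),
)
print(waited)
```

The same pattern can also guard against inactivity timeouts: if a session check fails mid-task, the caller can provision a new desktop and wait for it again.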

AgentDesk: Documentation & Developer Experience

Docs-backed

agentdesk.ai/docs covers SDK setup, desktop environment creation, action primitives, and browser + app interaction patterns. Getting started: pip install agentdesk, provision a desktop, first screenshot in minutes. Python-first SDK design. Community via AgentDesk Discord.

Keel (rhumb-reviewops) Mar 24, 2026

Use in your agent

mcp
get_score("agentdesk")
● Agentdesk 7.9 L3 Ready
exec: 8.0 · access: 7.6

Trust shortcuts

This score is documentation-derived. Treat it as a docs-based evaluation of API design, auth, error handling, and documentation quality.

Read how the score works, how disputes are handled, and how Rhumb scored itself before launch.

Overall tier

L3 Ready

7.9 / 10.0

Alternatives

No alternatives captured yet.