Let the AI do the coding

There are two ways an agent consumes this API. Connecting the Rails MCP lets an agent operate the procurement workflow at runtime. This page is about the other one: an agent that builds the integration — writes the client, the webhook receiver, the UI — from the documentation alone.

That second case is the real bar. If a coding agent can ship a working app having read nothing but your docs, your docs are your developer experience. So the artifact ships a concrete task to measure exactly that.

The “Parts Desk” task

A coding agent, in an empty folder, is given only this prompt and the sandbox URL — no endpoints, no method names, no field formats:

“Done” is judged by a human clicking through the running app — six user-visible outcomes:

1. Start a job from a vehicle: open a repair job from a VIN, with the app obtaining its own API access on first run (nothing hand-pasted).
2. Show the parts to replace, and let the manager add to the list.
3. Show supplier offers with prices: grouped by supplier, each with a price, selectable.
4. Place and lock in the order: committed by the supplying side, not just a draft.
5. Live, trustworthy status: the order state updates on its own, and the app proves each update genuinely came from the platform (a visible trust signal), not a forged poll.
6. Show the settled invoice: itemized, each line matched against what was ordered.

Outcomes 1 (provision without a human) and 5 (prove a live update is authentic) are deliberately not pre-chewed — they’re the two places human-first docs usually lose an integrator. The docs are sufficient to clear both; they just aren’t smoothed.
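
Outcome 5 comes down to checking a signature before trusting an update. The platform's real signing scheme is defined in its own docs, not on this page. The sketch below assumes a hex-encoded HMAC-SHA256 over the raw request body with a shared secret, which is a common webhook convention. The names `sign` and `is_authentic` and the example secret are hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (assumed scheme, not confirmed)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_authentic(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Return True only when the received signature matches the body.

    compare_digest runs in constant time, so an attacker cannot learn the
    expected signature byte by byte from response timing.
    """
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())


# Hypothetical values for illustration only.
secret = b"whsec_example"
body = b'{"order_id": "ord_1", "state": "confirmed"}'
signature = sign(secret, body)

print(is_authentic(secret, body, signature))         # genuine delivery
print(is_authentic(secret, body + b" ", signature))  # tampered body
```

The check must run over the exact bytes received, before any JSON parsing or re-serialization, because even a whitespace change alters the digest. The result of `is_authentic` is what the app would surface as its visible trust signal.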

Try it yourself

The fastest way to feel the thesis is to point Claude Code at the sandbox and watch it build:

1. Give it the docs as tools (optional but recommended): connect the public Docs MCP so the agent can call `search_docs` instead of crawling:
   ```
   claude mcp add --transport http partifact-docs \
     "https://partifact-docs-mcp.thanhvuttv.workers.dev/mcp"
   ```
2. In an empty folder, hand Claude Code the prompt above, with the sandbox URL https://partifact-mock-rails.thanhvuttv.workers.dev. Let it read /llms.txt, the pages, and /openapi.json, then write the client.
3. Drive the result. It should provision its own credentials, open the seeded job, fetch offers, place and confirm an order, verify a signed webhook, and show the reconciled invoice.
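
The last check, a reconciled invoice, means comparing invoice lines against order lines one by one. A minimal sketch of that comparison follows. The `Line` fields (`sku`, `quantity`, `unit_price_cents`) and the status labels are hypothetical and are not taken from the API schema. It also assumes each SKU appears at most once per document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    sku: str               # hypothetical part identifier
    quantity: int
    unit_price_cents: int  # integer cents avoid float rounding errors


def reconcile(order_lines, invoice_lines):
    """Return one (sku, status) row per SKU seen on either document."""
    ordered = {line.sku: line for line in order_lines}
    invoiced = {line.sku: line for line in invoice_lines}
    rows = []
    for sku in sorted(ordered.keys() | invoiced.keys()):
        order_line = ordered.get(sku)
        invoice_line = invoiced.get(sku)
        if order_line is None:
            status = "not_ordered"    # billed but never ordered
        elif invoice_line is None:
            status = "not_invoiced"   # ordered but missing from the invoice
        elif order_line == invoice_line:
            status = "matched"        # same quantity and unit price
        else:
            status = "mismatch"       # quantity or price differs
        rows.append((sku, status))
    return rows


order = [Line("A", 2, 1000), Line("B", 1, 500)]
invoice = [Line("A", 2, 1000), Line("B", 1, 550), Line("C", 1, 100)]
for row in reconcile(order, invoice):
    print(row)
```

Reporting a status for every SKU on either side, rather than only for the invoiced ones, is what lets a human reviewer see both overbilling and missing lines at a glance.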

The experiment behind it

The same task is run three times against the same sandbox, varying only the documentation surface the server exposes (`DOCS_MODE`, enforced server-side, not by prompt):

| Arm | `DOCS_MODE` | What the agent gets |
| --- | --- | --- |
| 1 (control) | `html` | Human-rendered HTML pages only. The machine surfaces genuinely 404. |
| 2 (agent-navigable) | `markdown` | Arm 1 + raw markdown per page, `/llms.txt`, the full-corpus digest, and `/openapi.json`. |
| 3 (full) | `full` | Arm 2 + the Docs MCP to search and navigate the docs as tools. |
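
The cumulative structure of the three arms can be sketched as a lookup from mode to exposed surfaces. This is an illustration of the idea only, not the sandbox's actual implementation. The surface labels (`html_pages`, `raw_markdown`, and so on) and the function `is_exposed` are hypothetical names.

```python
# Each arm adds surfaces on top of the previous one, mirroring the arms above.
HTML_SURFACES = frozenset({"html_pages"})
MARKDOWN_SURFACES = HTML_SURFACES | {
    "raw_markdown", "llms_txt", "corpus_digest", "openapi_json",
}
FULL_SURFACES = MARKDOWN_SURFACES | {"docs_mcp"}

SURFACES_BY_MODE = {
    "html": HTML_SURFACES,
    "markdown": MARKDOWN_SURFACES,
    "full": FULL_SURFACES,
}


def is_exposed(docs_mode: str, surface: str) -> bool:
    """True if the server would serve this surface under the given DOCS_MODE."""
    if docs_mode not in SURFACES_BY_MODE:
        raise ValueError(f"unknown DOCS_MODE: {docs_mode!r}")
    return surface in SURFACES_BY_MODE[docs_mode]


for mode in ("html", "markdown", "full"):
    print(mode, sorted(SURFACES_BY_MODE[mode]))
```

Deciding exposure on the server rather than in the prompt means an agent in the control arm cannot reach a machine surface even if it guesses the URL, which keeps the three runs comparable.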

The headline question: which documentation investments actually move a coding agent from prose to a working, secure integration — and by how much? Reporting “llms.txt was nearly as good as a docs MCP” would be just as valid a result as the opposite — it’s a measured hypothesis, not an assumed win.

The task spec (repo): the full Parts Desk definition, rules, and sufficiency note.

Protocol & metrics (repo): the 3-arm protocol, the six-column metric schema, and pass criteria.