Stop Wrapping Every Endpoint: Practical Patterns for MCP Servers

TL;DR: Early MCP servers that expose one tool per API endpoint work for small surfaces—then collapse under their own weight. As your API grows, switch to task‑first, opinionated, composite tools and sprinkle in utility tools that help agents work faster. Build for workflows, not for endpoints.

From endpoint wrappers to useful tools

When you first learn about MCP servers and agents, it’s tempting to mirror your API: generate a tool for every endpoint, wire it up, and let the agent figure it out. This feels productive—and it is, for small APIs.

But with large surfaces like OpenProject or Duende IdentityServer, the tool-per-endpoint approach turns into noise: hundreds of tools, context exhaustion, and long chains of brittle calls. You’re spending tokens describing plumbing rather than solving the user’s task.

Design principles that scale

  • Task first, not endpoint first. Start from the job-to-be-done and design a tool that completes it end-to-end.
  • Compose inside the server. Call multiple API endpoints from one tool. Keep coordination local; return a compact result.
  • Be opinionated. Accept sane defaults and hide incidental complexity. Provide escape hatches only when needed.
  • Add utilities, not just wrappers. Tools like “generate GUID”, “scan open docs”, or “summarize build logs” unlock workflows without touching external APIs.
  • Design for idempotency. Make reruns safe: dedupe by natural keys, tolerate partial progress, and support dry-runs.
  • Invest in observability. Log sub-steps, classify errors, and surface actionable summaries to the agent.
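The idempotency principle above can be sketched as an "ensure" helper keyed by a natural key. The in-memory store and function names below are illustrative, not from any real SDK:

```typescript
// Sketch: idempotent "ensure" semantics keyed by a natural key (username).
// The Map stands in for a real API; reruns never create duplicates.
type User = { id: string; username: string };

const store = new Map<string, User>(); // keyed by the natural key: username
let nextId = 1;

function ensureUser(username: string, dryRun = false): User {
  const existing = store.get(username);
  if (existing) return existing;          // rerun-safe: return the prior result
  const user = { id: `u${nextId++}`, username };
  if (!dryRun) store.set(username, user); // dry-run reports without writing
  return user;
}
```

Because lookups go through the natural key rather than a server-generated id, an agent can safely retry the same call after a timeout or partial failure.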

Pattern: Composite workflow tool

Example: creating a test user in Duende IdentityServer typically requires several steps—user creation, claims/roles, client/scopes, and secrets. Instead of exposing each step as its own tool, expose identity.create_test_user that orchestrates everything.

// Pseudo-TS for an MCP tool handler; `ids` is an assumed IdentityServer admin client
export async function createTestUser(input: {
  username: string;
  roles?: string[];
  clientId?: string;
  scopes?: string[];
  rotateSecret?: boolean;
}): Promise<{
  userId: string;
  clientId: string;
  secret?: string;
  appliedScopes: string[];
}> {
  // Resolve defaults once, so the summary reflects what was actually applied
  const scopes = input.scopes ?? ["openid"];

  // 1) Create or upsert the user
  const user = await ids.users.upsert({ username: input.username });

  // 2) Attach roles/claims
  if (input.roles?.length) {
    await ids.users.assignRoles(user.id, input.roles);
  }

  // 3) Ensure the client exists with the requested scopes
  const clientId = input.clientId ?? `test-${user.id}`;
  await ids.clients.upsert({ clientId, allowedScopes: scopes });

  // 4) Optionally rotate the client secret
  let secret: string | undefined;
  if (input.rotateSecret) {
    secret = await ids.clients.rotateSecret(clientId);
  }

  // 5) Return a compact summary
  return { userId: user.id, clientId, secret, appliedScopes: scopes };
}

Benefits:

  • Fewer tool calls and tighter prompts.
  • Single place to enforce defaults, validation, and rollback.
  • Predictable output that agents can chain.

Pattern: Opinionated batches for project tools

In OpenProject, instead of exposing dozens of atomic actions, provide tools like:

  • openproject.create_sprint_skeleton: creates a version, a board, standard swimlanes, and pre-linked tasks.
  • openproject.bootstrap_feature: creates an epic, child tasks, relations, default assignee, and due dates.

These tools encode your team’s “how we work here” while keeping inputs minimal.
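A minimal sketch of such a batch tool is below; the `pm` client, task names, and defaults are stand-ins for illustration, not the real OpenProject API:

```typescript
// Sketch: one opinionated tool call fans out into several API calls.
// `pm` is an in-memory stand-in for a project-management client.
type Task = { id: number; title: string; versionId: number };

const pm = {
  ids: 0,
  versions: [] as { id: number; name: string }[],
  tasks: [] as Task[],
  createVersion(name: string) {
    const v = { id: ++this.ids, name };
    this.versions.push(v);
    return v;
  },
  createTask(title: string, versionId: number) {
    const t: Task = { id: ++this.ids, title, versionId };
    this.tasks.push(t);
    return t;
  },
};

// Defaults encode "how we work here"; inputs stay minimal.
function createSprintSkeleton(input: { name: string; standardTasks?: string[] }) {
  const titles = input.standardTasks ?? ["Planning", "Review", "Retro"];
  const version = pm.createVersion(input.name);
  const created = titles.map((t) => pm.createTask(t, version.id));
  // Compact summary the agent can chain, not raw payloads
  return { versionId: version.id, taskIds: created.map((t) => t.id) };
}
```

One tool call, one compact result—the agent never has to coordinate the underlying version/task calls itself.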

When to keep 1:1 endpoint tools

  • Long-tail maintenance/debugging where bespoke calls are rare but useful.
  • Exploratory or schema-driven generators when you need coverage but gate usage (e.g., behind a debug.* namespace).
  • Internal scaffolding used by your composite tools, not exposed to agents directly.
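Gating long-tail tools behind a debug.* namespace can be as simple as filtering the advertised tool list; the tool names and flag below are illustrative:

```typescript
// Sketch: hide debug.* tools unless explicitly enabled,
// keeping the default surface small for the agent.
type Tool = { name: string; description: string };

const allTools: Tool[] = [
  { name: "identity.create_test_user", description: "Composite workflow" },
  { name: "debug.raw_get", description: "Raw GET against any endpoint" },
  { name: "debug.raw_post", description: "Raw POST against any endpoint" },
];

function listTools(debugEnabled: boolean): Tool[] {
  return allTools.filter((t) => debugEnabled || !t.name.startsWith("debug."));
}
```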

Practical checklist

  • What’s the user’s task? Name the tool with a verb: create_test_user, bootstrap_feature, archive_project.
  • What defaults unlock the 80% case? Make optional inputs truly optional.
  • What’s idempotent? Define natural keys and de-duplicate work.
  • What happens on step 3/5 failing? Plan partial results and compensating actions.
  • What should be logged or returned to help the next step? Prefer structured summaries over raw payloads.
  • How will you limit calls? Use timeouts, backoffs, pagination, and batch endpoints.
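For the last checklist item, a bounded exponential backoff schedule is a simple starting point; the base and cap values here are illustrative defaults:

```typescript
// Sketch: attempt n waits min(base * 2^n, cap) milliseconds.
function backoffDelays(attempts: number, baseMs = 250, capMs = 4000): number[] {
  return Array.from({ length: attempts }, (_, n) =>
    Math.min(baseMs * 2 ** n, capMs)
  );
}
```

Pair this with per-step timeouts so one slow upstream call can't stall the whole composite tool.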

Utilities that earn their keep

Not everything needs an external API. High-leverage internal tools include:

  • generate_guid / shortid / slugify
  • scan_open_documents to build a context index for the agent
  • summarize_logs and extract_errors to reduce token usage
  • template_expand for config or code scaffolds
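As an example of how small these utilities can be, here is a dependency-free slugify sketch:

```typescript
// Sketch: normalize arbitrary text into a URL/identifier-safe slug.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-") // collapse non-alphanumerics into single hyphens
    .replace(/^-+|-+$/g, "");    // strip leading/trailing hyphens
}
```

No external API, no tokens spent on plumbing—just a fast local call the agent can rely on.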

Wrap-up

As MCP ecosystems grow, the winning strategy isn’t more tools—it’s better tools. Design around outcomes, compose steps inside the server, and provide a few sharp utilities that make agents effective. Fewer calls, clearer contracts, happier users.

Building My Own RaidBots Service

A little background: I played World of Warcraft during the Shadowlands expansion, and there came a time when I began to care about min/maxing my character. For context, min/maxing is the process of minimizing undesirable qualities and maximizing desirable ones. Typically, when min/maxing a character, you pursue best-in-slot (BiS) items and use item attribute weights (a.k.a. stat weights) to inform choices for preparing your character for various challenges. A typical tool used to calculate this information is an open-source project called SimulationCraft. Since this application can be somewhat involved and requires a moderately powerful computer, not all players can make use of it, leaving them unable to explore min/maxing. Enter RaidBots, a service that lets a player run SimulationCraft simulations to calculate their character’s ideal items and stat weights (among many other things) through a website. It’s a well-built and convenient tool offering free services as well as a paid tier for players who want more.

With that context, let’s talk about me. I’m cheap… I investigated RaidBots and took advantage of their free tier. On the free tier, when you request service, you get in line with all the other free-tier users. During low-traffic times, the wait is about 1–2 minutes. During high-traffic times, however, I’ve waited as long as 20 minutes for a simulation to start. If I bumped up to the paid tier, I could skip this line. As a note, I understand why there’s a line for the free tier, and the cost of the paid tiers is acceptable for the value you get. Meanwhile, I’d been working on a personal project for a while and needed a way to distribute Android application packages (APKs) to my testing friends, because I didn’t want to wait until I finished the laundry list of requirements to distribute a beta application on the Play Store. Additionally, sending the APKs through Discord, Google Drive, or email was clunky and unreliable at best. One night, while waiting ~15 minutes in the RaidBots public line, I got the idea to build my own RaidBots-like service and use it as a reason to prove out a content delivery network (CDN), which would also solve my APK delivery woes.

Fast forward to the end: I built it, and it has saved many hours of sims and provided lots of fun, interesting problems to solve and technologies to study.

I’ll be adding more posts later to further describe the system if there’s interest or if I get bored. For now, included below is a high-level diagram of the application. The services shown, except for the IdentityServer service and the TrueNAS server, are automatically built and deployed to a Kubernetes (k8s) cluster (this is honestly a meaningful use for k8s, as opposed to this other project of mine).

High level overview of the Atriarch Simc Runner architecture

Additional technology not listed in the diagram that helps bring this application together:

  • ELK stack
  • Kubernetes
  • NexusOSS
  • ArgoCD
  • GitHub
  • Jenkins

I hope this was interesting at the very least. If you have questions about problems I may have tackled, leave them in the comments below and I’ll see what I can answer.