Insight federation: how mureo borrows practitioner know-how without leaking it

mureo’s bundled skills capture a respectable amount of paid-search craft, but they are not, and will never be, a substitute for the working knowledge of an experienced practitioner. The interesting question is whether an agent can consult that practitioner during a diagnosis without the practitioner having to publish their playbook.

The previous answer was: not really. The two obvious paths both cost the practitioner the thing that makes them valuable. Bundle their notes into a skill and the notes are now public. Hand the agent an API that returns the full corpus on each call and you have published the same notes, one query at a time.

mureo 0.9.19 ships a third path. With mureo_consult_advisor, the agent asks a question and mureo forwards it to one or more external advisor servers. Each server runs a vector search over its own private corpus, and the operator-side Claude only ever sees the top-k snippets that matched the query. The corpus stays on the advisor. The know-how stays the advisor’s. The agent gets just enough context to reason about the specific account it is looking at.

This post walks the design — what was tried first, why it leaked, and the shape that replaced it.

The version that leaked: v0.9.18 text-return

The first sketch, tracked in #163, had the advisor server simply return its own knowledge text to the agent on each consultation. Clean MCP signature, easy to implement, easy to reason about.

The failure mode is the one you would expect after stating it out loud. Every call leaked the corpus, or some fraction of it, to the operator side. After enough consultations a determined operator (or a careless one with a logging agent) would accumulate the whole thing. The advisor would have given away the asset.

The text-return design was never released. It was rebuilt before 0.9.19 was cut.

The version that does not: retrieval-pattern federation

The 0.9.19 design moves the work to the advisor side and tightens what crosses the boundary:

The advisor server holds the corpus, the embedder, and the vector store. None of these are shipped or exposed.
mureo, on the operator side, builds a query string from the agent’s question plus optional campaign context (campaign name, status, daily budget, recent action_log entries).
The advisor embeds the query, runs vector search over its private store, and returns the top-k matching fragments with similarity scores. Default top_k is 3–5 per source.
The agent reasons over the fragments and applies what is relevant. No LLM lives on the advisor server — it is a retrieval endpoint, not a generation endpoint.
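
The advisor-side half of these steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not mureo’s or any advisor’s actual code: the toy bag-of-words embed function, the in-memory CORPUS, and the body of vector_search are stand-ins for a real embedder and vector store.

```python
import math
from collections import Counter

# Hypothetical private corpus; in the design above it never leaves the advisor.
CORPUS = [
    "Use micro-conversions when conversion volume is sparse",
    "Brand search CPA inflation often tracks competitor bidding",
    "Pause broad match keywords before raising the daily budget",
]


def embed(text: str) -> Counter:
    """Toy embedder (assumption): bag-of-words counts instead of a trained model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query: str, top_k: int = 3) -> list:
    """Return only the top_k best-matching fragments, never the whole corpus."""
    q = embed(query)
    scored = [
        {"text": doc, "similarity": round(cosine(q, embed(doc)), 3)}
        for doc in CORPUS
    ]
    scored.sort(key=lambda frag: frag["similarity"], reverse=True)
    return scored[:top_k]
```

Calling vector_search("sparse conversion volume", top_k=2) returns two fragments with the micro-conversions note ranked first. Nothing in the function runs a language model, which matches the retrieval-only role described above.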

What crosses the wire on a single consultation is a query in and a handful of fragments out. The corpus the fragments came from never moves. Over many calls the operator can only accumulate the fragments the agent actually asked about — which is exactly the material that the advisor wanted surfaced for those situations.
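
On the operator side, the query string might be assembled along the following lines. The context fields (campaign name, status, daily budget, recent action_log entries) come from the description above; the function name build_query, the dictionary keys, and the line-per-field layout are assumptions made for illustration.

```python
from typing import Optional


def build_query(question: str, campaign: Optional[dict] = None,
                max_log_entries: int = 3) -> str:
    """Combine the agent's question with optional campaign context."""
    parts = [question.strip()]
    if campaign:
        # Hypothetical key names; only fields that are present are included.
        for key in ("name", "status", "daily_budget"):
            if campaign.get(key) is not None:
                parts.append(f"{key}: {campaign[key]}")
        log = campaign.get("action_log", [])
        # Keep only the most recent entries so the query stays short.
        recent = log[-max_log_entries:] if max_log_entries > 0 else []
        parts.extend(f"recent action: {entry}" for entry in recent)
    return "\n".join(parts)
```

The point of the sketch is that the outbound payload is small and operator-controlled: a question plus a few lines of account context, nothing more.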

The retrieval pattern does not make the corpus impossible to exfiltrate: a sufficiently motivated operator can run many queries and stitch the results together. What it does is bound each call’s exposure by query relevance, so what leaks depends on what was asked rather than simply on how many calls were made. That is the right shape for “I will let you consult my notes on the specific problem you have right now.”

Why “federation” — multiple advisors, isolated per call

A single agent often benefits from more than one advisor: a niche practitioner who knows B2B paid search in JP, an industry benchmarks service, an in-house data-science team’s playbook. insight_sources is a list:

{
  "sources": [
    {
      "name": "acme",
      "transport": "stdio",
      "command": "acme-advisor-mcp",
      "tool": "vector_search",
      "top_k": 5
    },
    {
      "name": "benchmarks",
      "transport": "http",
      "url": "https://benchmarks.example/mcp",
      "headers": {"Authorization": "Bearer PASTE_TOKEN_LITERAL_HERE"},
      "tool": "vector_search",
      "top_k": 3,
      "timeout_sec": 8
    }
  ]
}

The operator side fans the query out to every configured source in parallel and concatenates the results, one section per source, into a single Markdown block. Per-source isolation is mechanical: a slow, crashed, or malformed advisor never blocks the others; timeout_sec caps each call; failures collapse to “no fragments from that source” without aborting the workflow.
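
The fan-out with per-source isolation can be sketched with asyncio. This is a minimal illustration, not mureo’s implementation: consult_all, the call_source callback, the fake_call stand-in, and the 10-second fallback timeout are all assumptions.

```python
import asyncio


async def consult_all(sources, query, call_source):
    """Query every source concurrently; a failing source yields an empty list."""

    async def one(source):
        try:
            fragments = await asyncio.wait_for(
                call_source(source, query),
                timeout=source.get("timeout_sec", 10),  # assumed fallback
            )
            return source["name"], list(fragments)
        except Exception:
            # Timeout, crash, or malformed reply: no fragments from this source.
            return source["name"], []

    return dict(await asyncio.gather(*(one(s) for s in sources)))


async def fake_call(source, query):
    """Stand-in for a real transport call, used only for this demo."""
    if source["name"] == "slow":
        await asyncio.sleep(1)  # exceeds the 0.05 s timeout configured below
    if source["name"] == "broken":
        raise RuntimeError("advisor crashed")
    return [{"text": f"answer from {source['name']}", "similarity": 0.9}]


sources = [
    {"name": "acme"},
    {"name": "slow", "timeout_sec": 0.05},
    {"name": "broken"},
]
results = asyncio.run(consult_all(sources, "why did CPA rise?", fake_call))
```

In this demo, results holds one fragment for acme and empty lists for slow and broken. Catching errors inside each per-source task, rather than around the gather call, is what keeps one bad advisor from affecting the others.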

Three transports are supported (stdio, sse, http) so an advisor can be a local subprocess, an internal HTTP server, or a hosted MCP endpoint. The shape of the advisor is a single tool that takes {query, top_k} and returns a JSON list of {text, similarity, ...} fragments; extra fields (tags, case IDs, source URLs) are forwarded verbatim to the agent.
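
A sketch of how an operator-side client might check a reply against that contract follows. The function name parse_fragments is hypothetical, and the choice to skip malformed entries (rather than reject the whole reply) is an assumption; the post does not say which mureo does.

```python
import json


def parse_fragments(raw: str, top_k: int) -> list:
    """Parse an advisor reply: a JSON list of {text, similarity, ...} objects."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("advisor reply must be a JSON list")
    fragments = []
    for item in data:
        if not isinstance(item, dict):
            continue
        text, sim = item.get("text"), item.get("similarity")
        is_number = isinstance(sim, (int, float)) and not isinstance(sim, bool)
        if not isinstance(text, str) or not is_number:
            continue  # skip entries missing either required field
        # Extra keys (tags, case IDs, source URLs) pass through untouched.
        fragments.append(item)
    return fragments[:top_k]
```

Because the fragment dictionaries are appended as received, any additional fields the advisor attaches reach the agent unchanged.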

What the agent actually sees

A consultation comes back as a Markdown block with one section per responding advisor:

## acme
- (similarity 0.92) Use micro-conversions when CV volume is sparse...
- (similarity 0.81) Brand search CPA inflation typically tracks...

---

## benchmarks
- (similarity 0.78) Median CPA for B2B Search in JP is ~4,200 JPY...

If no sources are configured, the tool returns a pointer to the docs rather than silently failing. If every source returned nothing, the tool says so explicitly — the agent does not silently fall back to its own prior, which is exactly the failure mode “consult an advisor” was meant to avoid.
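
The rendering rules above can be sketched as a small function. The name render_consultation and the exact wording of the two fallback messages are assumptions; only the section layout and the docs path are taken from this post.

```python
def render_consultation(results: dict) -> str:
    """Render {source name: [fragments]} as one Markdown block."""
    if not results:
        # No sources configured: point at the docs instead of failing silently.
        return "No insight sources configured. See docs/insight-federation.md."
    sections = []
    for name, fragments in results.items():
        if not fragments:
            continue  # only responding advisors get a section
        lines = [f"## {name}"]
        for frag in fragments:
            lines.append(f"- (similarity {frag['similarity']:.2f}) {frag['text']}")
        sections.append("\n".join(lines))
    if not sections:
        # Say so explicitly so the agent knows the consultation came back empty.
        return "All configured advisors returned no fragments for this query."
    return "\n\n---\n\n".join(sections)
```

Keeping the "nothing configured" and "nothing found" cases as distinct messages lets the agent tell a setup problem apart from a genuine lack of matching advice.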

Where the operator pays attention

Two things in the operator setup are easy to get wrong because they behave differently from shell-style defaults:

Secrets are passed verbatim. mureo does not expand ${VAR} references inside env or headers. Paste the literal value, then chmod 600 ~/.mureo/insight_sources.json.
"env": {} is a sealed subprocess, not “inherit parent env”. An advisor configured that way will not see your shell’s OPENAI_API_KEY or cloud credentials. The parent environment leaks only when env is omitted entirely. This matches the principle of least privilege; if you want inheritance, you have to ask for it.

These are not the only sharp edges, but they are the two that a careful operator should check before pointing the agent at a production advisor.
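
The sealed-versus-inherited distinction is the same one Python’s own subprocess module makes between env={} and env left unset, which makes it easy to demonstrate. The sketch below shows Python’s behaviour only; whether mureo launches stdio advisors this way is an assumption, and DEMO_SECRET is a made-up variable.

```python
import os
import subprocess
import sys

# Made-up secret standing in for something like a shell-exported API key.
os.environ["DEMO_SECRET"] = "from-parent-shell"

# Child process that reports whether it can see the variable.
probe = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('DEMO_SECRET', 'unset'))",
]

# env omitted: the child inherits the parent's environment.
inherited = subprocess.run(probe, capture_output=True, text=True).stdout.strip()

# env={}: the child starts with an empty environment and cannot see the secret.
# (On Windows an empty environment may prevent the interpreter from starting.)
sealed = subprocess.run(probe, env={}, capture_output=True, text=True).stdout.strip()
```

On a Unix-like system the first run reports the secret and the second reports it as unset, which is the behaviour to expect from an advisor configured with an empty env block.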

How this sits next to /learn

/learn (and the new mureo_learning_insights_get MCP tool that closes its read-side gap in 0.9.x) captures this account’s notes on the operator’s local machine — corrections, observations, the “don’t repeat that mistake” record. mureo_consult_advisor is the opposite direction: other people’s know-how, queried at the point of use, never copied.

Both feed into the same diagnostic skills. As of a recent change (PR #169), the seven core diagnostic skills treat mureo_consult_advisor as the primary practitioner-knowledge channel: the agent is told, in the skill prompt, that consulting an advisor is the right first move when it is missing context. The default behaviour with no advisors configured is unchanged; the guidance kicks in when there is something to consult.

Why the shape matters

The control-plane architecture from the 0.9.x post is about operations — the audit log, the throttle, the rollback path applied uniformly across providers. Insight federation is the same shape applied to knowledge: a small, standard contract ({query, top_k} → fragments), a federation surface that any practitioner or benchmarks service can plug into, and a privacy posture that lets them plug in without giving up the thing that makes them worth consulting.

This is the boring infrastructure work that makes the interesting question — should the agent be drawing on outside expertise here — answerable in code, per call, per source, with the receipts that mureo’s audit pipeline keeps for everything else.

The insight-federation design is documented end-to-end in docs/insight-federation.md (EN) and docs/insight-federation.ja.md (JA), versioned with the OSS release. The MCP tool ships in mureo 0.9.19; PR references in this post are #163 (the text-return design that was withdrawn), #167 (federation), and #169 (embedding it as the primary knowledge channel in diagnostic skills).