Proj/curate dag cutover#578
Conversation
Replaces the 50-iteration agent loop in `brv curate` with a deterministic
7-slot DAG executed via Kahn's algorithm + bounded pMap concurrency.
Slot order: recon → chunk → extract → group → dedup → conflict → write.
Pure-JS slots (recon, chunk, group, dedup) wrap existing curation
helpers without adding LLM cost. Service-bound slots delegate via
NodeServices, which the live adapter wires to:
- extract → agent.generate per chunk (no agent loop)
- detectConflicts → SearchKnowledgeService subject lookup, with
existing-memory loader returning matched paths so
conflict-node can emit `update` decisions
- write → executeCurate against the curate-tool
Per-curate internal LLM calls drop from ~50 (loop iterations) to ~5–12
(one per LLM-bound slot). Headline "external agent contributes node
code" capability lands in Phase 3; this commit ships the substrate.
Architecture boundary: abstractions (NodeSlot, runner, MetricsCollector,
slot contracts) live in `src/agent/core/curation/flow/`; concrete node
implementations + dag-builder + services-adapter + existing-memory-loader
live in `src/agent/infra/curation/flow/` (preserves the core→infra
one-way dependency rule).
Includes:
- Pre-compaction parallel hoist with orphan-session guard
(~5–10s latency win on large contexts).
- Snapshot-parity test for DAG plumbing consistency across runs.
- Live integration tests against real `executeCurate` writing to a
tempdir (catches schema mismatches like path-format and UPDATE
existingId handling that stub-only tests would miss).
- UUID-variable-naming regression tests moved to LocalSandbox suite
where they actually belong (post-cutover the executor no longer
injects sandbox variables).
Verification: 135 tests passing across all curate-flow + adjacent +
sandbox + workspace + live-write suites; npm run typecheck clean;
npm run lint shows 0 errors (227 pre-existing warnings unchanged).
Plan + design + code-review notes: see plan/agent-driven-graph/.
The documented scope gap on old-vs-new behavioral parity is mitigated
by manual smoke testing during the dogfood window before Phase 2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-6, NEW-1) Builds on the Phase 1 cutover (5d54c4f) with three layered phases bundled into a single PR. Real-LLM UAT (Phase 4) passed 7/7 scenarios. Phase 2 — slot sandboxing + parallel extract fan-out - Per-slot sandbox in src/agent/core/curation/flow/sandbox/: tool allowlist via Proxy guard + cooperative timeout via AbortController + parent-signal propagation. Plain JS wrappers, NOT node:vm — same interface so Phase 3 can swap when untrusted agent JS first runs in-slot. - Schema gate (validateAndRun) with strict + soft modes. Soft is the runner default: input-fail strands downstream with one clear error; output-fail warns and forwards rawOutput so a too-strict schema doesn't strand the graph. - Extract-node parallelized via pMap inside the node (not N parallel DAG instances). Stub bench: xlarge (22 chunks) 1130ms→311ms at c=4. - BENCH=1 opt-in harness with committed baselines under test/benchmarks/. - xlarge.txt fixture (~155KB) + delay() helper extracted to test/helpers. Phase 2.5 — quality polish (R-1, R-2, R-3, R-4, R-6) - R-1 hotfix: conflict-detector + conflict-resolver track facts/keywords/tags. executeUpdate's structural-loss machinery merges them back; UPDATE no longer silently overwrites existing facts. - R-2 deterministic metadata derivation in metadata-derivers.ts (deriveTags, deriveKeywords, deriveRelated). Wired into services-adapter content fields. Uses toSnakeCase for slug parity with executeCurate's writer. - R-3 provenance envelope in Reason: cur-<logId> + ISO timestamp + source channel + subject + category + statement preview. Six-file plumbing chain through TaskExecuteSchema → task-router → agent-process → curate-executor → buildLiveServices → buildReason. Envelope ALWAYS emitted; UPDATE's d.reason appended as `Decision: ...` INSIDE the envelope. - R-4 subject-as-title + ADD→UPSERT switch: filenames are clean <subject>.md slugs. UPSERT routes in-batch same-subject collisions through executeUpdate's safe merge path instead of executeAdd's blind overwrite — closes a latent data-loss bug both source plans missed. - R-6 effectiveSummary in executeUpdate: when structural-loss merge happened and >1 fact landed in the resolved set, regenerate frontmatter `summary` as `;`-joined statements (200-char cap) instead of leaving it stale. Phase 2.6 — Phase 4 UAT follow-ups - R-5 ROLLBACK extractConcurrency 8→4 per Phase 2.5 §3.5 go/no-go gate: Phase 4 UAT showed c=8 regressed Scenario 3 from 150s→169s (rate-limit retries on gpt-5.4-mini). Real perf work moves to a Phase 6 spike with profiler-driven measurements. - NEW-1 fix: deriveRelated emitted phantom paths when multiple decisions in one batch routed to the same target file (UPSERT collision OR cross-batch UPDATE merge via lookupSubject). New deriveRelatedFromResolved operates over RESOLVED targets and filters same-target decisions. services-adapter.write refactored to two-pass (resolve → build) so cross-links never reference non-materialized files. Phase 4 UAT data: 3508 cross-links, 0 dangling. - NEW-2 doc: UPSERT consolidation (Phase 4+ behavior, sibling-split removed) recorded as accepted design in services-adapter inline comment + plan §8. Tests: 7072 passing (up ~50 from regression coverage across the three phases). Lint clean, typecheck clean. Open carry-forward: R-5 perf — 182s vs ≤120s spec on Scenario 3. Functional, no timeouts. Phase 6 ticket: profile and budget chunked-curate perf to ≤120s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…answer) Splits the legacy brv-query MCP tool into three LLM-free tools so external agents can drive the search → synthesize → cache loop with their own model instead of the daemon's internal LLM. Curate-flow path is untouched. Tasks 5.1-5.6 + acceptance #8 round-trip integration test all landed. UAT round 2 passed 17/20 (2 partials and 1 skip all out-of-scope per script §5); see plan/agent-driven-graph/PHASE-5-CODE-REVIEW.md for the full pass-by-pass lineage (Codex × 4, Kimi × 1, UAT × 2). Task 5.1 — QueryDispatcher (tier 0/1/2 reusable core) - New src/server/infra/dispatcher/query-dispatcher.ts: stateless dispatcher with discriminated-union DispatchResult. Tier 0 (exact cache) → tier 1 (Jaccard fuzzy) → tier 2 (BM25 + direct-response). - Public BrvSearchResult DTO + toBrvSearchResult() mapper: snake_case per DESIGN §6.1 (cached_answer / total_found / timing_ms); internal searchResult and directAnswer dropped from MCP wire (Codex F4 fix). - DEFAULT_LIMIT = 10 to match DESIGN §6.1; DIRECT_RESPONSE_MAX_DOCS = 5 for the internal direct-response slice (Pass 6 decoupling). - Refactored query-executor.ts to delegate tiers 0/1/2 to dispatcher; tier 3/4 LLM path unchanged. Removed dead tryDirectSearchResponse + unused direct-response imports (W1 fix). - Extracted buildPrefetchedContext() to prefetch-context-builder.ts so GatherExecutor reuses the helper without duplicating logic. Task 5.2 — brv-search MCP tool - src/server/infra/mcp/tools/brv-search-tool.ts: registers brv-search, sends task type 'mcp-search' (not 'search' — that's CLI BM25 only). - agent-process.ts case 'mcp-search': computes fingerprint via new QueryExecutor.resolveFingerprint() so tier 0/1 cache hits work (F2 fix), maps via toBrvSearchResult before serialization (F4), attaches typed payload to MCP _meta channel for tool-aware clients (W2 fix). - LLM_FREE_TASK_TYPES allowlist exempts mcp-search/gather/record-answer from provider validation (F3 fix). Task 5.3 — brv-gather MCP + CLI - New gather-executor.ts: pure data assembly (BM25 + prefetch bundle + token estimate via shared estimateTokens + rule-based follow_up_hints). Never invokes the LLM; constructor accepts only searchService — pinned by structural test. - New brv-gather-tool.ts (MCP) + brv gather CLI command with --limit/--scope/--token-budget/--format flags. - shared transport encoder gather-content.ts. Task 5.4 — brv-record-answer MCP + CLI - New record-answer-executor.ts: cache.set graceful (returns recorded:false when no cache or set throws — never errors). - New brv-record-answer-tool.ts (MCP) + brv record-answer CLI command (positional query + answer args, required --fingerprint flag). - QueryDispatcher.cache made public so agent-process shares one instance between dispatcher reads and record-answer writes. Task 5.5 — brv-query deprecation marker + telemetry - brv-query-tool.ts: [deprecated] description prefix, (deprecated) title suffix, _meta.deprecated:true with replacedBy hints (MCP SDK doesn't allow custom annotation fields). - New deprecation-telemetry.ts: appends one JSONL line per legacy invocation to <getGlobalDataDir>/telemetry/mcp-deprecation.jsonl. Honors BRV_DATA_DIR. Best-effort (swallows I/O errors). - CLI 'brv query' is unaffected — only MCP brv-query is deprecated. Task 5.6 — SKILL.md + mcp-workflow.md updates - SKILL.md: new ## MCP Workflow section, 3rd ## Workflow bullet for the LLM-free synthesis loop, two new CLI sections (### 4. Gather Context Bundle, ### 5. Record Synthesized Answer), downstream sections renumbered to 6-13. - All references use HYPHENATED MCP names (brv-search, brv-gather, brv-record-answer) matching actual registrations — Pass 7 caught the prior mismatch where SKILL.md taught underscored names that didn't exist (DESIGN.md keeps underscored prose for readability). - sections/mcp-workflow.md (loaded by RuleTemplateService.generateMcpContent for connector-installed agents): rewrote from legacy two-tool flow to full Phase 5 pipeline with deprecation callout + worked example. UAT B3 fix — CLI exit codes - src/oclif/commands/{search,gather,record-answer}.ts: validateInput now sets process.exitCode = 1 before returning false. CI scripts (e.g., 'brv search "" || echo failed') can now detect failure. Round-trip acceptance (#8) - test/integration/mcp/query-split-roundtrip.test.ts: executor-level cache-loop test (4 cases) — proves shared-cache wiring works search → gather → record → cached-search. - test/integration/mcp/query-split-mcp-harness.test.ts: real MCP protocol test using @modelcontextprotocol/sdk InMemoryTransport + Client + McpServer — drives client.callTool('brv-search') through the wire. Catches what executor-level can't: tool name registration mismatches, schema validation, _meta correctness, encoding/decoding wire issues. Tests: 229 passing across the Phase 5 surface (dispatcher, executors, MCP tools, CLI commands, templates, integration). Lint clean. Typecheck clean. Out-of-scope carry-forwards documented in plan/agent-driven-graph/CARRY-FORWARDS.md: - ByteRover provider 401 token-expiry UX (separate provider-auth ticket) - Tier 0 vs 2 latency gap retest on 500+ leaf corpus (Phase 6 perf spike) - brv query CLI --format json missing tier field (optional polish) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Claude finished @ngduyanhece's task in 4m 42s —— View job PR Review: Curate DAG Cutover + Query SplitTasks
Overall Assessment — Score: 4 / 5This is a well-engineered, thoroughly-documented architectural PR. The Kahn-based topological runner, slot contracts, sandbox proxy, and tiered query dispatcher are all correct. The phase-gate pattern (Phase 2 plain-JS wrapper → Phase 3 The issues below are limited in scope. None are blockers. Two are worth fixing before merge (correctness + project standards). IssuesBlocking / Must FixNone. Important — Fix Before Merge1. 2. Empty bundle from 3. Silent Minor / Nit4. 5. Missing 6. 7. 8. Duplicate cache guard in What's well done
|
| // so legacy callers still receive the raw daemon response. | ||
| let parsedMeta: BrvSearchResult | undefined | ||
| try { | ||
| parsedMeta = JSON.parse(result) as BrvSearchResult |
There was a problem hiding this comment.
issue (correctness/standards): JSON.parse(result) as BrvSearchResult violates the project's TypeScript rule against as assertions (CLAUDE.md). If the daemon returns a non-JSON string or a shape that diverges from BrvSearchResult, the cast silently accepts invalid data and the _meta downstream consumers will see wrong types at runtime.
Use the existing BrvSearchResult Zod shape (or a lean inline parse) instead:
| parsedMeta = JSON.parse(result) as BrvSearchResult | |
| let parsedMeta: BrvSearchResult | undefined | |
| try { | |
| const raw: unknown = JSON.parse(result) | |
| // Light structural check — at minimum tier + status must be present | |
| if ( | |
| raw !== null && | |
| typeof raw === 'object' && | |
| 'tier' in raw && | |
| 'status' in raw | |
| ) { | |
| parsedMeta = raw as BrvSearchResult | |
| } | |
| } catch { | |
| // raw text fallback below | |
| } |
Alternatively, export a Zod schema for BrvSearchResult alongside toBrvSearchResult and use schema.safeParse(JSON.parse(result)).
| * If the bundle exceeds the token budget, truncate to the last full section | ||
| * (sections are joined by `\n\n---\n\n` per `buildPrefetchedContext`). | ||
| */ | ||
| function applyTokenBudget(bundle: string, tokenBudget: number): string { |
There was a problem hiding this comment.
issue (correctness): When no section individually fits under tokenBudget, kept stays empty and applyTokenBudget returns ''. An agent that calls brv-gather with a very tight token_budget (e.g., 150 tokens) but a large first section silently receives prefetched_context: ''—indistinguishable from a zero-results search. The invariant should guarantee at least the first section to avoid the "phantom empty" case.
| function applyTokenBudget(bundle: string, tokenBudget: number): string { | |
| function applyTokenBudget(bundle: string, tokenBudget: number): string { | |
| if (!bundle) return bundle | |
| const tokens = estimateTokens(bundle) | |
| if (tokens <= tokenBudget) return bundle | |
| const sections = bundle.split('\n\n---\n\n') | |
| const kept: string[] = [] | |
| let accumulated = 0 | |
| for (const section of sections) { | |
| const sectionTokens = estimateTokens(section) | |
| if (accumulated + sectionTokens > tokenBudget) break | |
| kept.push(section) | |
| accumulated += sectionTokens | |
| } | |
| // Always include at least the first section so callers receive some context | |
| // even when token_budget is smaller than that section. | |
| if (kept.length === 0 && sections.length > 0) { | |
| return sections[0] | |
| } | |
| return kept.join('\n\n---\n\n') | |
| } |
| } | ||
| } | ||
| }, | ||
| onError({error}) { |
There was a problem hiding this comment.
issue (UX): onError only emits output in json format; in text mode the callback body is empty, so a task-level error (e.g., daemon emits task:error after the task is routed) produces no visible output to the user. The outer withDaemonRetry catch handles transport failures, but in-stream errors go silent in text mode.
| onError({error}) { | |
| onError({error}) { | |
| if (format === 'json') { | |
| writeJsonResponse({ | |
| command: 'gather', | |
| data: {event: 'error', message: error.message, status: 'error'}, | |
| success: false, | |
| }) | |
| } else { | |
| this.log(`\nError: ${error.message}\n`) | |
| } | |
| }, |
| // PHASE-5-UAT.md UAT-14: empty input must exit non-zero so CI scripts | ||
| // can detect failure. | ||
| process.exitCode = 1 | ||
| return false |
There was a problem hiding this comment.
nit (same pattern as gather.ts): onError is also silent in text mode here. Same fix applies — add an else { this.log(...) } branch so users see the error message in both formats.
| if (this.cache && fingerprint) { | ||
| const cached = this.cache.get(query, fingerprint) | ||
| if (cached) { | ||
| return { | ||
| cachedAnswer: cached, | ||
| fingerprint, | ||
| status: 'cached_answer', | ||
| tier: 0, | ||
| timingMs: Date.now() - startTime, | ||
| totalFound: 0, | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if (this.cache && fingerprint) { | ||
| const fuzzyHit = this.cache.findSimilar(query, fingerprint) | ||
| if (fuzzyHit) { | ||
| return { | ||
| cachedAnswer: fuzzyHit, | ||
| fingerprint, | ||
| status: 'cached_answer', | ||
| tier: 1, | ||
| timingMs: Date.now() - startTime, | ||
| totalFound: 0, | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
nit (readability): The this.cache && fingerprint guard is evaluated twice — once for tier 0 and once for tier 1. These are the only cache paths, so a single early-exit after both checks would make the invariant clearer and eliminate the duplication:
if (this.cache && fingerprint) {
const exact = this.cache.get(query, fingerprint)
if (exact) return { cachedAnswer: exact, fingerprint, status: 'cached_answer', tier: 0, timingMs: Date.now() - startTime, totalFound: 0 }
const fuzzy = this.cache.findSimilar(query, fingerprint)
if (fuzzy) return { cachedAnswer: fuzzy, fingerprint, status: 'cached_answer', tier: 1, timingMs: Date.now() - startTime, totalFound: 0 }
}Not a correctness issue — current logic is equivalent.
|
|
||
| const result = await resultPromise | ||
|
|
||
| return {content: [{text: result, type: 'text' as const}]} |
There was a problem hiding this comment.
suggestion: brv-search returns both content (text) and _meta (typed struct) for tool-aware MCP clients. brv-gather only returns content. Tool-aware agents that parse brv-search results structurally would need to JSON.parse the gather text to access total_tokens_estimated or follow_up_hints. Consider adding the same dual-format pattern for consistency:
let parsedMeta: unknown
try { parsedMeta = JSON.parse(result) } catch { /* fallback */ }
return {
_meta: parsedMeta as Record<string, unknown> | undefined,
content: [{text: result, type: 'text' as const}],
}Not required for correctness; the text payload is still usable as-is.
| * Concurrency for ExtractNode's per-chunk fan-out (Phase 2 Task 2.4). | ||
| * Defaults to 4 inside `extract-node.ts`. Tunable via curate-executor. | ||
| */ | ||
| readonly extractConcurrency?: number |
There was a problem hiding this comment.
nit (coupling): extractConcurrency is an ExtractNode-specific tuning knob sitting on the shared NodeContext. As more node-specific options arrive in later phases, NodeContext will accumulate per-node fields. Consider a nodeOptions?: Partial<Record<NodeSlot, unknown>> bag here so slot-specific tuning stays slot-scoped (e.g., nodeOptions?.extract?.concurrency). Low-priority given the single current user, but worth flagging before Phase 3 adds more.
| const cleaned = existingId.replaceAll(/^\/+|\/+$/g, '').replace(/\.md$/, '') | ||
| const parts = cleaned.split('/').filter(Boolean) | ||
| // Need at least domain/topic/title (3) and at most domain/topic/subtopic/title (4). | ||
| if (parts.length < 3 || parts.length > 4) { |
There was a problem hiding this comment.
nit (silent data loss): Paths with depth > 4 (e.g., project/auth/oauth/providers/google.md) silently fall back to ADD instead of UPDATE. For a context tree that has ever grown beyond the 4-segment ceiling, this means repeat curations of the same subject always ADD new leaves instead of updating the existing one, causing duplicates.
The comment documents the limit, but it's worth either widening the range (to e.g., >= 3 segments, taking title = last, path = everything before last) or logging a fallback telemetry signal so the pattern can be detected in production.
| min: 100, | ||
| }), | ||
| } | ||
| public static strict = false |
There was a problem hiding this comment.
nit: static strict = false silently ignores unknown flags (e.g., brv gather "q" --limt 5 won't error). The comment on the peer record-answer.ts gives no reason either. If this is intentional forward-compat, a one-line comment would help reviewers (and keep it from being removed as "dead config" in a future cleanup).
- brv-search-tool.ts: replace unsafe 'as' cast with structural check - gather-executor.ts: guarantee at least first section under tight token budget - gather.ts & record-answer.ts: fix silent onError in text mode - query-dispatcher.ts: merge duplicate cache guard blocks - Add comments explaining static strict = false forward-compat
|
Claude finished @ngduyanhece's task in 1m 25s —— View job Re-review: PR #578 — Fix commit
|
| # | Original severity | Status |
|---|---|---|
| 1 | Important | ✅ Fixed |
| 2 | Important | ✅ Fixed |
| 3 | Important | ✅ Fixed |
| 4 | Minor/nit | Carry-forward (acceptable) |
| 5 | Minor/suggestion | Carry-forward (acceptable) |
| 6 | Minor/nit | Carry-forward (acceptable) |
| 7 | Nit | ✅ Fixed |
| 8 | Nit | ✅ Fixed |
Fix details
Issue #1 — as cast in brv-search-tool.ts:120 ✅
The JSON.parse(result) as BrvSearchResult unsafe cast is replaced with the structural check suggested in the review:
const raw: unknown = JSON.parse(result)
if (raw !== null && typeof raw === 'object' && 'tier' in raw && 'status' in raw) {
parsedMeta = raw as BrvSearchResult
}The residual as BrvSearchResult after the guard is the standard narrowing pattern; the invariant is now verified before the cast. The double-cast _meta: parsedMeta as unknown as Record<string, unknown> on line 136 is an acceptable structural upcast to satisfy the MCP SDK's return type — BrvSearchResult is a plain object.
Issue #2 — Empty bundle from applyTokenBudget ✅
The fallback guarantees at least one section:
if (kept.length === 0 && sections.length > 0) {
return sections[0]
}Correct placement (after the loop, before the join) and matches the review suggestion exactly.
Issue #3 — Silent onError in text mode ✅
Both gather.ts and record-answer.ts now have an else branch. Note the additional fix that makes this work: onError({error}) was changed to onError: ({error}) => — the arrow function correctly closes over this so this.log(...) binds to the oclif command instance. The original method shorthand would have silently shadowed this at runtime.
Issue #7 — static strict = false undocumented ✅
Both commands now have a comment explaining the forward-compat rationale:
// Allow unknown flags for forward-compatibility (e.g., new daemon flags
// passed through by wrapper scripts without requiring a CLI upgrade).
public static strict = falseIssue #8 — Duplicate cache guard in QueryDispatcher ✅
The two if (this.cache && fingerprint) blocks are merged into one. Variable names are also improved (cached → exact, fuzzyHit → fuzzy).
Remaining carry-forwards (all non-blocking)
- feat: update readme and contribuing guide #4 (
parseExistingIdForUpdatedepth ceiling) — documented limit; deferring is fine. - feat: Add MemAgent automatic chat+embedding storage and demo script #5 (missing
_metaon gather/record-answer tools) — text payload is still usable; can be added in a follow-up. - feat(llm): add OpenRouter API provider support #6 (
extractConcurrencyonNodeContext) — single current user; refactor fits naturally with Phase 3.
Verdict
All blocking and "fix before merge" items are resolved. The PR is ready to merge.
Summary
brv curateloop was a monolithic 50-iteration generator with no isolation, no parallelism, and no extensibility surface. Separately, thebrv-queryMCP tool tightly coupled deterministic tier 0/1/2 lookup with LLM synthesis, forcing every agent through ByteRover's model even when the agent had its own.5d54c4f7) — Replaced the curate-executor monolith with a typed-slot DAG (src/agent/core/curation/flow/): 7 default nodes,TopologicalCurationRunner, slot-contract Zod schemas, snapshot parity fixtures.101afce8) — Per-slot vm sandboxing with tool allowlists, parallel ExtractNode fan-out (concurrency 4), per-slot timeout escalation, and review-cycle hardening (R-1..R-6 + NEW-1).ed5ee2e8) — Splitbrv-queryinto three LLM-free MCP tools:brv-search(tier 0/1/2 cache+BM25),brv-gather(prefetched bundle),brv-record-answer(close the cache loop). Added matchingbrv gather/brv record-answeroclif CLI commands.brv-querykept with[deprecated]marker +mcp.query.legacy_invocationstelemetry counter.brv-queryMCP tool is not removed — deprecation marker only.QueryResultCache.--format jsonenvelope shape — unchanged.Type of change
Scope (select all touched areas)
Linked issues
Root cause (bug fixes only, otherwise write
N/A)brv search ""was exiting 0 on empty input (UAT-14 blocker B3); fixed by settingprocess.exitCode = 1in the validation path ofsearch.ts,gather.ts, andrecord-answer.ts.Test plan
test/unit/server/infra/dispatcher/query-dispatcher.test.tstest/unit/server/infra/executor/gather-executor.test.tstest/unit/server/infra/executor/record-answer-executor.test.tstest/unit/server/infra/mcp/tools/brv-search-tool.test.tstest/unit/server/infra/mcp/tools/brv-gather-tool.test.tstest/unit/server/infra/mcp/tools/brv-record-answer-tool.test.tstest/unit/server/infra/mcp/tools/deprecation-telemetry.test.tstest/integration/mcp/query-split-mcp-harness.test.ts— drives all 5 tools throughInMemoryTransporttest/unit/agent/curate-flow/runner.test.ts,flag.test.ts,sandbox/*.test.ts(Phases 1–2)brv-search(needs_synthesis) →brv-gather→ simulate-synthesize →brv-record-answer→ re-brv-search(returnstier:0, cached_answer)brv-querydeprecation:[deprecated]description prefix,_meta.deprecated:true,_meta.replacedBy:[…], telemetry JSONL written to<dataDir>/telemetry/mcp-deprecation.jsonlLLM_FREE_TASK_TYPESallowlist permitsgather/mcp-search/record-answer/searchto bypass provider validationUser-visible changes
brv-search— returns{tier, status, passages?, cached_answer?, fingerprint, total_found, timing_ms}on the_metachannelbrv-gather— returns{prefetched_context, manifest_context?, total_tokens_estimated, search_metadata, follow_up_hints?}brv-record-answer— returns{recorded, fingerprint}brv gather "<query>"andbrv record-answer "<query>" "<answer>" --fingerprint <fp>. Both support--format text|json.brv-queryMCP tool — title now suffixed(deprecated); description prefixed[deprecated]; functional behavior unchanged.brv search ""(and gather/record-answer) now exits non-zero on empty input (was exit 0).## MCP Workflowsection before## Commands; CLI sections renumbered (4–5 added, 6–13 shifted).Evidence
Attach at least one:
Checklist
npm test)npm run lint)npm run typecheck)npm run build)mainRisks and mitigations
brv curatecall.mcp-search,gather,record-answer) bypass provider validation viaLLM_FREE_TASK_TYPESallowlist — if the allowlist drifts, an LLM-bound task could silently skip auth.Setconstant inagent-process.tswith co-located unit tests; new task types must be deliberately added.brv-querydeprecation may surprise existing MCP clients reading the tool description.description,_meta.deprecated,_meta.replacedBy) changed. Telemetry countermcp.query.legacy_invocationslets us track adoption before any future removal.CARRY-FORWARDS.md.Appendix A — Architecture: Curation pipeline
A.1 Node DAG
flowchart LR INPUT([source text + existing memory]) --> RECON RECON[recon<br/>10s · tools.curation.recon] --> CHUNK CHUNK[chunk<br/>5s · pure] --> EXTRACT EXTRACT[extract<br/>60s · tools.curation.mapExtract<br/><i>fan-out, concurrency=4</i>] --> GROUP GROUP[group<br/>5s · pure] --> DEDUP DEDUP[dedup<br/>5s · pure] --> CONFLICT CONFLICT[conflict<br/>30s · tools.curation.conflict] --> WRITE WRITE[write<br/>30s · tools.curate] --> OUTPUT([context-tree mutations]) classDef llm fill:#fff4e0,stroke:#d97706,color:#7c2d12 classDef pure fill:#e0f2fe,stroke:#0369a1,color:#0c4a6e classDef io fill:#f3e8ff,stroke:#7c3aed,color:#4c1d95 class RECON,EXTRACT,CONFLICT llm class CHUNK,GROUP,DEDUP pure class WRITE ioNODE_SLOT_ORDERis enforced as['recon', 'chunk', 'extract', 'group', 'dedup', 'conflict', 'write']inflow/types.ts.A.2 Slot contracts
Every node executes inside a per-slot vm sandbox with a tool allowlist and timeout. If the allowlist is
[]the node is pure — no tool access, no I/O.recontools.curation.reconreconInputSchema→reconOutputSchemachunkchunkInputSchema→chunkOutputSchemaextracttools.curation.mapExtractextractInputSchema→extractOutputSchemagroupgroupInputSchema→groupOutputSchemadedupdedupInputSchema→dedupOutputSchemaconflicttools.curation.conflictconflictInputSchema→conflictOutputSchemawritetools.curatewriteInputSchema→writeOutputSchemaSource:
flow/slots/contracts.ts.A.3
TopologicalCurationRunneralgorithmKahn's topological sort with
pMap-bounded concurrency at each level. Per-node fail-open: if a node throws, the failure is recorded and downstream branches fed by its output are skipped, while parallel branches continue.Input plumbing convention (
runner.ts):ctx.initialInputRecord<predecessorNodeId, output>(forward-compat for Phase 8 positional insertions)A.4 Sandbox boundary
sequenceDiagram participant R as Runner participant SB as SlotSandbox participant SG as SchemaGate participant N as NodeFn participant T as ToolsProxy R->>SG: validate input vs inputSchema SG-->>R: ok (or soft-record issue) R->>SB: run(NodeFn, input, ctx) SB->>SB: AbortController(timeoutMs) ⊕ parent signal SB->>T: build Proxy gated by toolAllowlist SB->>N: execute({signal, tools}) N->>T: tools.curation.<x>(...) alt path in allowlist T-->>N: real implementation result else path NOT in allowlist T-->>N: throws SandboxToolDeniedError end N-->>SB: output SB->>SG: validate output vs outputSchema SG-->>R: output (+ any soft validation issues)The
toolsparameter is aProxywhosegettrap rejects any dotted path not present in the slot'stoolAllowlist. Source:flow/sandbox/slot-sandbox-builder.ts.Appendix B — Architecture: Query pipeline
B.1 Tiered dispatch
flowchart TD Q(["brv-search query"]) --> FP["compute context-tree<br/>fingerprint mtime hash"] FP --> T0{"Tier 0<br/>exact cache hit?"} T0 -- yes --> R0[/"return cached_answer<br/>tier=0"/] T0 -- no --> T1{"Tier 1<br/>Jaccard ≥ 0.6?"} T1 -- yes --> R1[/"return cached_answer<br/>tier=1"/] T1 -- no --> T2["BM25 search<br/>limit=10"] T2 --> T2D{"Tier 2<br/>direct-answer<br/>thresholds met?"} T2D -- yes --> R2A[/"return passages<br/>status=direct_passages<br/>cache.set"/] T2D -- "no, has results" --> R2B[/"return passages<br/>status=needs_synthesis"/] T2D -- empty --> R2C[/"return empty passages<br/>status=no_results"/] R2B -. LLM-free path .-> G["brv-gather"] G --> EXT["external agent LLM"] EXT --> RA["brv-record-answer<br/>cache.set"] RA -. next equivalent query .-> T0 R2B -. legacy path .-> T3["brv-query: Tier 3 LLM synthesis"] T3 --> T4["Tier 4 agentic loop"] classDef cache fill:#dcfce7,stroke:#16a34a,color:#14532d classDef bm25 fill:#dbeafe,stroke:#2563eb,color:#1e3a8a classDef llm fill:#fef3c7,stroke:#d97706,color:#78350f class T0,T1,R0,R1,RA cache class T2,T2D,R2A,R2B,R2C bm25 class EXT,T3,T4 llmB.2 Latency targets
B.3 Algorithm constants
DEFAULT_LIMIT(BM25 result count)10query-dispatcher.tsDIRECT_RESPONSE_MAX_DOCS5query-dispatcher.tsDIRECT_RESPONSE_SCORE_FLOOR0.7query-dispatcher.tsDIRECT_RESPONSE_SCORE_THRESHOLD(top result min)0.85direct-search-responder.tsDIRECT_RESPONSE_HIGH_CONFIDENCE_THRESHOLD(auto-answer)0.93direct-search-responder.tsDIRECT_RESPONSE_MIN_GAP(top vs runner-up)0.08direct-search-responder.tsFUZZY_SIMILARITY_THRESHOLD(Jaccard)0.6query-similarity.tsQueryResultCacheTTL60 000 msquery-result-cache.tsB.4
QueryDispatcher.dispatch()algorithmB.5 Context-tree fingerprint
Cache entries are keyed by
(query, fingerprint). The fingerprint is a deterministic hash ofpath:mtimepairs over every leaf in the context tree, sorted by path, plus the sources-file mtime. Any add/remove/edit changes the fingerprint → all prior cache entries become stale (no TTL race needed for correctness; TTL is a safety net for clock skew).Source:
query-result-cache.ts.B.6 LLM-free MCP loop
sequenceDiagram participant Agent as External agent participant S as brv-search participant G as brv-gather participant L as Agent's own LLM participant R as brv-record-answer participant C as QueryResultCache Agent->>S: query S->>C: lookup (fingerprint, query) alt Tier 0/1 hit C-->>S: cached_answer S-->>Agent: { tier, cached_answer, fingerprint } else Tier 2 needs_synthesis S-->>Agent: { tier:2, status:needs_synthesis, passages, fingerprint } Agent->>G: query (+ scope?, token-budget?) G-->>Agent: { prefetched_context, total_tokens_estimated, follow_up_hints } Agent->>L: prompt = prefetched_context + query L-->>Agent: synthesized answer Agent->>R: { query, answer, fingerprint } R->>C: cache.set R-->>Agent: { recorded:true, fingerprint } end Note over Agent,C: Next equivalent query → tier 0 hitThe
LLM_FREE_TASK_TYPESallowlist ({ 'gather', 'mcp-search', 'record-answer', 'search' }) inagent-process.tsbypasses provider validation for these task types — no API key required.B.7 Surface boundary: CLI vs MCP
brv-search_metachannel withBrvSearchResultbrv-gatherprefetched_context,manifest_context?,follow_up_hints?brv-record-answer{recorded, fingerprint}brv-query(deprecated)brv searchbrv gather--format json)brv record-answer{recorded, fingerprint}brv querytierfield on--format json, see CF-3)Source:
brv-search-tool.ts,brv-gather-tool.ts,brv-record-answer-tool.ts,gather.ts,record-answer.ts.