diff --git a/AGENTS.md b/AGENTS.md
index 02d9014..1cc8dad 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -13,7 +13,7 @@
 * **Lore DB uses incremental auto_vacuum to prevent free-page bloat**: Lore's SQLite DB uses incremental auto_vacuum (schema version 3 migration) to prevent free-page bloat from deletions. The migration sets PRAGMA auto_vacuum = INCREMENTAL then VACUUM outside a transaction. temporal_messages is the primary storage consumer (~51MB); knowledge table is tiny.
 
-* **Lore search pipeline: FTS5 with AND-then-OR fallback and RRF fusion**: Lore's search overhaul (planned/in-progress) replaces three independent search systems with a unified pipeline in `src/search.ts`. Key design: `ftsQuery()` builds AND queries (primary), `ftsQueryOr()` builds OR queries (fallback only when AND returns zero results). Blanket OR was rejected empirically — it adds noise even with stopword filtering. Conservative stopword list excludes domain terms like 'handle', 'state', 'type'. FTS5 rank is negative (more negative = better); `ORDER BY rank` sorts best first. `bm25()` with column weights (title=6, content=2, category=3) verified working in Bun's SQLite. Recall tool uses Reciprocal Rank Fusion (k=60) across knowledge, temporal, and distillation sources. `forSession()` scoring uses OR (not AND-then-OR) because it's ranking all candidates, not searching for exact matches — BM25 naturally weights multi-term matches higher.
+* **Lore search pipeline: FTS5 with AND-then-OR fallback and RRF fusion**: Lore's search pipeline (`src/search.ts`) uses FTS5 with AND-then-OR fallback and RRF fusion. `ftsQuery()` builds AND queries (primary), `ftsQueryOr()` builds OR fallback (only when AND returns zero results). Conservative stopword list excludes domain terms like 'handle', 'state', 'type'. FTS5 rank is negative (more negative = better). `bm25()` column weights: title=6, content=2, category=3. `extractTopTerms()` extracts top-40 frequency-ranked terms with stopword filtering. Recall tool uses `reciprocalRankFusion(lists, k=60)` across knowledge, temporal, and distillation sources into a single ranked list with source-type annotations. `forSession()` uses OR-based FTS5 BM25 scoring (not AND-then-OR) because it ranks all candidates — BM25 naturally weights multi-term matches higher. Safety net: top-5 project entries by confidence are always included.
 
 * **Lore temporal pruning runs after distillation and curation on session.idle**: In src/index.ts, session.idle awaits backgroundDistill and backgroundCurate sequentially before running temporal.prune(). Ordering is critical: pruning must not delete unprocessed messages. Pruning defaults: 120-day retention, 1GB max storage (in .lore.json under pruning.retention and pruning.maxStorage). These generous defaults were chosen because the system was new — earlier proposals of 7d/200MB were based on insufficient data.
@@ -21,6 +21,9 @@
 * **LTM injection pipeline: system transform → forSession → formatKnowledge → gradient deduction**: LTM injected via experimental.chat.system.transform hook. getLtmBudget() computes ceiling as (contextLimit - outputReserved - overhead) * ltmFraction (default 10%, configurable 2-30%). forSession() loads project-specific entries unconditionally + cross-project entries scored by term overlap, greedy-packs into budget. formatKnowledge() renders as markdown. setLtmTokens() records consumption so gradient deducts it. Key: LTM goes into output.system (system prompt) — invisible to tryFit(), counts against overhead budget.
+
+* **OpenCode plugin SDK has no embedding API — vector search blocked**: The OpenCode plugin SDK (`@opencode-ai/plugin`, `@opencode-ai/sdk`) exposes only session/chat/tool operations. There is no `client.embed()`, embeddings endpoint, or raw model inference API. The only LLM access is `client.session.prompt()`, which creates full chat roundtrips through the agentic loop. This means Lore cannot do vector/embedding search without either (1) OpenCode adding an embedding API, or (2) direct `fetch()` to provider APIs bypassing the SDK (fragile — requires key extraction from `client.config.providers()`). The FTS5 + RRF search infrastructure is designed to be additive — vector search would layer on top as another RRF input list, not replace BM25.
+
 ### Decision
@@ -38,7 +41,7 @@
 * **Lore auto-recovery can infinite-loop without re-entrancy guard**: Three v0.5.2 bugs causing excessive background LLM requests: (1) Auto-recovery loop — session.error handler injected recovery prompt → could overflow again → loop. Fix: recoveringSessions Set as re-entrancy guard. (2) Curator ran every idle — `onIdle || afterTurns` short-circuited (onIdle=true). Fix: `||` → `&&`. Lesson: a boolean flag gating a numeric threshold needs AND, not OR. (3) shouldSkip() fell back to session.list() on unknown sessions. Fix: remove list fallback, cache in activeSessions.
 
-* **Lore knowledge FTS search was sorted by updated_at, not BM25 relevance**: In `ltm.search()`, knowledge FTS results were ordered by `k.updated_at DESC` instead of FTS5 BM25 rank — most recently edited won over most relevant. Fix: replace the `WHERE k.rowid IN (SELECT rowid FROM knowledge_fts ...)` subquery pattern with a JOIN that exposes `rank`, then `ORDER BY bm25(knowledge_fts, 6.0, 2.0, 3.0)`. Also: distillations had no FTS table at all (LIKE-only search), fixed by adding `distillation_fts` in schema migration v7 with backfill and sync triggers.
+* **Lore knowledge FTS search was sorted by updated_at, not BM25 relevance**: Three FTS search bugs fixed in the search overhaul: (1) Knowledge FTS sorted by `updated_at DESC`, not BM25 — fix: JOIN knowledge_fts, `ORDER BY bm25(knowledge_fts, 6, 2, 3)`. (2) Distillations had no FTS table (LIKE-only search) — fix: `distillation_fts` virtual table in schema migration v7 with backfill + sync triggers. (3) `forSession()` used coarse bag-of-words term overlap (top 30 terms >3 chars, no stemming) — fix: replaced `scoreEntries()` with `scoreEntriesFTS()` using FTS5 BM25 with OR semantics. All search functions now use the AND-then-OR fallback pattern. `ftsQuery()`/`ftsQueryOr()` are centralized in `src/search.ts` with stopword filtering and single-char removal.
 
 * **Test DB isolation via LORE_DB_PATH and Bun test preload**: Lore test suite uses isolated temp DB via test/setup.ts preload (bunfig.toml). Preload sets LORE_DB_PATH to mkdtempSync path before any imports of src/db.ts; afterAll cleans up. src/db.ts checks LORE_DB_PATH first. agents-file.test.ts needs beforeEach cleanup for intra-file isolation and TEST_UUIDS cleanup in afterAll (shared with ltm.test.ts). Individual test files don't need close() calls — preload handles DB lifecycle.
diff --git a/src/config.ts b/src/config.ts
index cc019eb..baa801d 100644
--- a/src/config.ts
+++ b/src/config.ts
@@ -50,6 +50,28 @@ export const LoreConfig = z.object({
       maxStorage: z.number().min(50).default(1024),
     })
     .default({ retention: 120, maxStorage: 1024 }),
+  search: z
+    .object({
+      /** BM25 column weights for knowledge FTS5 [title, content, category]. */
+      ftsWeights: z
+        .object({
+          title: z.number().min(0).default(6.0),
+          content: z.number().min(0).default(2.0),
+          category: z.number().min(0).default(3.0),
+        })
+        .default({ title: 6.0, content: 2.0, category: 3.0 }),
+      /** Max results per source in recall tool before fusion. Default: 10. */
+      recallLimit: z.number().min(1).max(50).default(10),
+      /** Enable LLM-based query expansion for the recall tool. Default: false.
+       * When enabled, the configured model generates 2–3 alternative query phrasings
+       * before search, improving recall for ambiguous queries.
+       */
+      queryExpansion: z.boolean().default(false),
+    })
+    .default({
+      ftsWeights: { title: 6.0, content: 2.0, category: 3.0 },
+      recallLimit: 10,
+      queryExpansion: false,
+    }),
   crossProject: z.boolean().default(false),
   agentsFile: z
     .object({
diff --git a/src/index.ts b/src/index.ts
index 90b531a..8346c57 100644
--- a/src/index.ts
+++ b/src/index.ts
@@ -236,6 +236,10 @@ export const LorePlugin: Plugin = async (ctx) => {
         hidden: true,
         description: "Lore knowledge curator worker",
       },
+      "lore-query-expand": {
+        hidden: true,
+        description: "Lore query expansion worker",
+      },
     };
   },
@@ -660,7 +664,12 @@ End with "I'm ready to continue." so the agent knows to pick up where it left of
       // Register the recall tool
       tool: {
-        recall: createRecallTool(projectPath, config().knowledge.enabled),
+        recall: createRecallTool(
+          projectPath,
+          config().knowledge.enabled,
+          ctx.client,
+          config().search,
+        ),
       },
     };
diff --git a/src/ltm.ts b/src/ltm.ts
index 59aba41..660dc0f 100644
--- a/src/ltm.ts
+++ b/src/ltm.ts
@@ -1,5 +1,6 @@
 import { uuidv7 } from "uuidv7";
 import { db, ensureProject } from "./db";
+import { config } from "./config";
 import { ftsQuery, ftsQueryOr, EMPTY_QUERY, extractTopTerms } from "./search";
 
 // ~3 chars per token — validated as best heuristic against real API data.
@@ -153,8 +154,11 @@ export function forProject(
 type Scored = { entry: KnowledgeEntry; score: number };
 
-/** BM25 column weights for knowledge_fts: title, content, category. */
-const FTS_WEIGHTS = { title: 6.0, content: 2.0, category: 3.0 };
+/** BM25 column weights for knowledge_fts: title, content, category.
+ * Reads from config().search.ftsWeights, falling back to defaults. */
+function ftsWeights() {
+  return config().search.ftsWeights;
+}
 
 /** Max entries per pool to include on first turn when no session context exists. */
 const NO_CONTEXT_FALLBACK_CAP = 10;
@@ -180,7 +184,7 @@ function scoreEntriesFTS(sessionContext: string): Map<string, number> {
   if (!terms.length) return new Map();
   const q = terms.map((t) => `${t}*`).join(" OR ");
-  const { title, content, category } = FTS_WEIGHTS;
+  const { title, content, category } = ftsWeights();
 
   try {
     const results = db()
@@ -410,7 +414,7 @@ export function search(input: {
          AND k.confidence > 0.2
        ORDER BY bm25(knowledge_fts, ?, ?, ?)
        LIMIT ?`;
-  const { title, content, category } = FTS_WEIGHTS;
+  const { title, content, category } = ftsWeights();
   const ftsParams = pid
     ? [q, pid, title, content, category, limit]
     : [q, title, content, category, limit];
@@ -452,7 +456,7 @@ export function searchScored(input: {
   if (q === EMPTY_QUERY) return [];
   const pid = input.projectPath ? ensureProject(input.projectPath) : null;
-  const { title, content, category } = FTS_WEIGHTS;
+  const { title, content, category } = ftsWeights();
   const ftsSQL = pid
     ? `SELECT k.*, bm25(knowledge_fts, ?, ?, ?) as rank
        FROM knowledge k
diff --git a/src/prompt.ts b/src/prompt.ts
index 2e4fc90..a637c54 100644
--- a/src/prompt.ts
+++ b/src/prompt.ts
@@ -431,3 +431,18 @@ export function formatKnowledge(
 
   return serialize(root(...children));
 }
+
+// ---------------------------------------------------------------------------
+// Query expansion (Phase 4)
+// ---------------------------------------------------------------------------
+
+export const QUERY_EXPANSION_SYSTEM = `You are a search query expander for a code knowledge base. Given a search query, generate 2–3 alternative queries that would help find relevant results. Focus on:
+- Synonyms and related technical terms
+- Different phrasings of the same concept
+- Broader or narrower scopes
+
+Return ONLY a JSON array of strings. No explanation, no markdown.
+
+Example:
+Input: "SQLite FTS5 ranking"
+Output: ["full text search scoring SQLite", "BM25 relevance ranking database", "FTS5 match order by rank"]`;
diff --git a/src/reflect.ts b/src/reflect.ts
index 0a266c6..284fd43 100644
--- a/src/reflect.ts
+++ b/src/reflect.ts
@@ -1,10 +1,14 @@
 import { tool } from "@opencode-ai/plugin/tool";
+import type { createOpencodeClient } from "@opencode-ai/sdk";
 import * as temporal from "./temporal";
 import * as ltm from "./ltm";
 import * as log from "./log";
 import { db, ensureProject } from "./db";
-import { ftsQuery, ftsQueryOr, EMPTY_QUERY, reciprocalRankFusion } from "./search";
+import { ftsQuery, ftsQueryOr, EMPTY_QUERY, reciprocalRankFusion, expandQuery } from "./search";
 import { serialize, inline, h, p, ul, lip, liph, t, root } from "./markdown";
+import type { LoreConfig } from "./config";
+
+type Client = ReturnType<typeof createOpencodeClient>;
 
 type Distillation = {
   id: string;
@@ -186,7 +190,12 @@ function formatFusedResults(
   return serialize(root(h(2, "Recall Results"), ul(items)));
 }
 
-export function createRecallTool(projectPath: string, knowledgeEnabled = true): ReturnType<typeof tool> {
+export function createRecallTool(
+  projectPath: string,
+  knowledgeEnabled = true,
+  client?: Client,
+  searchConfig?: LoreConfig["search"],
+): ReturnType<typeof tool> {
   return tool({
     description:
       "Search your persistent memory for this project. Your visible context is a trimmed window — older messages, decisions, and details may not be visible to you even within the current session. Use this tool whenever you need information that isn't in your current context: file paths, past decisions, user preferences, prior approaches, or anything from earlier in this conversation or previous sessions. Always prefer recall over assuming you don't have the information. Searches long-term knowledge, distilled history, and raw message archives.",
@@ -206,84 +215,103 @@ export function createRecallTool(projectPath: string, knowledgeEnabled = true):
     async execute(args, context) {
       const scope = args.scope ?? "all";
       const sid = context.sessionID;
+      const limit = searchConfig?.recallLimit ?? 10;
 
       // If the query is all stopwords / single chars, short-circuit with guidance
       if (ftsQuery(args.query) === EMPTY_QUERY) {
         return "Query too vague — try using specific keywords, file names, or technical terms.";
       }
 
-      // Run scored searches across all sources
-      const knowledgeResults: ltm.ScoredKnowledgeEntry[] = [];
-      if (knowledgeEnabled && scope !== "session") {
+      // Optional query expansion: generate alternative phrasings via LLM
+      let queries = [args.query];
+      if (searchConfig?.queryExpansion && client && sid) {
         try {
-          knowledgeResults.push(
-            ...ltm.searchScored({
-              query: args.query,
-              projectPath,
-              limit: 10,
-            }),
-          );
+          queries = await expandQuery(client, args.query, sid);
         } catch (err) {
-          log.error("recall: knowledge search failed:", err);
+          log.info("recall: query expansion failed, using original:", err);
         }
       }
 
-      const distillationResults: ScoredDistillation[] = [];
-      if (scope !== "knowledge") {
-        try {
-          distillationResults.push(
-            ...searchDistillationsScored({
-              projectPath,
-              query: args.query,
-              sessionID: scope === "session" ? sid : undefined,
-              limit: 10,
-            }),
-          );
-        } catch (err) {
-          log.error("recall: distillation search failed:", err);
-        }
-      }
+      // Run scored searches for each query variant.
+      // The original query is always first in `queries`; each variant
+      // contributes its own set of lists to the RRF fusion below.
+      const allRrfLists: Array<{ items: TaggedResult[]; key: (r: TaggedResult) => string }> = [];
+
+      for (const query of queries) {
+        const knowledgeResults: ltm.ScoredKnowledgeEntry[] = [];
+        if (knowledgeEnabled && scope !== "session") {
+          try {
+            knowledgeResults.push(
+              ...ltm.searchScored({
+                query,
+                projectPath,
+                limit,
+              }),
+            );
+          } catch (err) {
+            log.error("recall: knowledge search failed:", err);
+          }
+        }
-
-      const temporalResults: temporal.ScoredTemporalMessage[] = [];
-      if (scope !== "knowledge") {
-        try {
-          temporalResults.push(
-            ...temporal.searchScored({
-              projectPath,
-              query: args.query,
-              sessionID: scope === "session" ? sid : undefined,
-              limit: 10,
-            }),
-          );
-        } catch (err) {
-          log.error("recall: temporal search failed:", err);
-        }
-      }
+        const distillationResults: ScoredDistillation[] = [];
+        if (scope !== "knowledge") {
+          try {
+            distillationResults.push(
+              ...searchDistillationsScored({
+                projectPath,
+                query,
+                sessionID: scope === "session" ? sid : undefined,
+                limit,
+              }),
+            );
+          } catch (err) {
+            log.error("recall: distillation search failed:", err);
+          }
+        }
+
+        const temporalResults: temporal.ScoredTemporalMessage[] = [];
+        if (scope !== "knowledge") {
+          try {
+            temporalResults.push(
+              ...temporal.searchScored({
+                projectPath,
+                query,
+                sessionID: scope === "session" ? sid : undefined,
+                limit,
+              }),
+            );
+          } catch (err) {
+            log.error("recall: temporal search failed:", err);
+          }
+        }
+
+        allRrfLists.push(
+          {
+            items: knowledgeResults.map((item) => ({
+              source: "knowledge" as const,
+              item,
+            })),
+            key: (r) => `k:${r.item.id}`,
+          },
+          {
+            items: distillationResults.map((item) => ({
+              source: "distillation" as const,
+              item,
+            })),
+            key: (r) => `d:${r.item.id}`,
+          },
+          {
+            items: temporalResults.map((item) => ({
+              source: "temporal" as const,
+              item,
+            })),
+            key: (r) => `t:${r.item.id}`,
+          },
+        );
+      }
 
-      // Fuse results using Reciprocal Rank Fusion
-      const fused = reciprocalRankFusion([
-        {
-          items: knowledgeResults.map((item) => ({
-            source: "knowledge" as const,
-            item,
-          })),
-          key: (r) => `k:${r.item.id}`,
-        },
-        {
-          items: distillationResults.map((item) => ({
-            source: "distillation" as const,
-            item,
-          })),
-          key: (r) => `d:${r.item.id}`,
-        },
-        {
-          items: temporalResults.map((item) => ({
-            source: "temporal" as const,
-            item,
-          })),
-          key: (r) => `t:${r.item.id}`,
-        },
-      ]);
+      // Fuse results using Reciprocal Rank Fusion across all query variants
+      const fused = reciprocalRankFusion(allRrfLists);
 
       return formatFusedResults(fused, 20);
     },
diff --git a/src/search.ts b/src/search.ts
index e5ad559..8881ca1 100644
--- a/src/search.ts
+++ b/src/search.ts
@@ -266,3 +266,113 @@ export function reciprocalRankFusion(
 
   return [...scores.values()].sort((a, b) => b.score - a.score);
 }
+
+// ---------------------------------------------------------------------------
+// LLM query expansion (Phase 4)
+// ---------------------------------------------------------------------------
+
+import type { createOpencodeClient } from "@opencode-ai/sdk";
+import { workerSessionIDs } from "./distillation";
+import { QUERY_EXPANSION_SYSTEM } from "./prompt";
+import * as log from "./log";
+
+type Client = ReturnType<typeof createOpencodeClient>;
+
+// Worker sessions for query expansion — keyed by parent session ID
+const expansionWorkerSessions = new Map<string, string>();
+
+async function ensureExpansionWorkerSession(
+  client: Client,
+  parentID: string,
+): Promise<string> {
+  const existing = expansionWorkerSessions.get(parentID);
+  if (existing) return existing;
+  const session = await client.session.create({
+    body: { parentID, title: "lore query expansion" },
+  });
+  const id = session.data!.id;
+  expansionWorkerSessions.set(parentID, id);
+  workerSessionIDs.add(id);
+  return id;
+}
+
+/**
+ * Expand a user query into multiple search variants using the configured LLM.
+ * Returns `[original, ...expanded]`. The original is always first.
+ *
+ * Uses a 3-second timeout — if the LLM is slow, returns only the original query.
+ * Errors are caught silently (logged) and the original query is returned.
+ *
+ * @param client OpenCode client for LLM calls
+ * @param query The original user query
+ * @param sessionID Parent session ID (for worker session creation)
+ * @param model Optional model override
+ */
+export async function expandQuery(
+  client: Client,
+  query: string,
+  sessionID: string,
+  model?: { providerID: string; modelID: string },
+): Promise<string[]> {
+  const TIMEOUT_MS = 3000;
+
+  try {
+    const workerID = await ensureExpansionWorkerSession(client, sessionID);
+    const parts = [
+      {
+        type: "text" as const,
+        text: `${QUERY_EXPANSION_SYSTEM}\n\nInput: "${query}"`,
+      },
+    ];
+
+    // Race the LLM call against a timeout
+    const result = await Promise.race([
+      client.session.prompt({
+        path: { id: workerID },
+        body: {
+          parts,
+          agent: "lore-query-expand",
+          ...(model ? { model } : {}),
+        },
+      }),
+      new Promise<null>((resolve) => setTimeout(() => resolve(null), TIMEOUT_MS)),
+    ]);
+
+    // Rotate worker session so the next call starts fresh
+    expansionWorkerSessions.delete(sessionID);
+
+    if (!result) {
+      log.info("query expansion timed out, using original query");
+      return [query];
+    }
+
+    // Read the response
+    const msgs = await client.session.messages({
+      path: { id: workerID },
+      query: { limit: 2 },
+    });
+    const last = msgs.data?.at(-1);
+    if (!last || last.info.role !== "assistant") return [query];
+
+    const responsePart = last.parts.find((p) => p.type === "text");
+    if (!responsePart || responsePart.type !== "text") return [query];
+
+    // Parse JSON array from response (strip an optional ``` / ```json fence)
+    const cleaned = responsePart.text
+      .trim()
+      .replace(/^```(?:json)?\s*/i, "")
+      .replace(/\s*```$/i, "");
+    const parsed = JSON.parse(cleaned);
+    if (!Array.isArray(parsed)) return [query];
+
+    const expanded = parsed.filter(
+      (q): q is string => typeof q === "string" && q.trim().length > 0,
+    );
+    if (!expanded.length) return [query];
+
+    return [query, ...expanded.slice(0, 3)]; // cap at 3 expansions
+  } catch (err) {
+    log.info("query expansion failed, using original query:", err);
+    return [query];
+  }
+}
diff --git a/test/config.test.ts b/test/config.test.ts
index 60d4253..4975964 100644
--- a/test/config.test.ts
+++ b/test/config.test.ts
@@ -76,6 +76,55 @@ describe("LoreConfig — curator schema", () => {
   });
 });
 
+describe("LoreConfig — search schema", () => {
+  test("search defaults: ftsWeights, recallLimit, queryExpansion", () => {
+    const cfg = LoreConfig.parse({});
+    expect(cfg.search.ftsWeights.title).toBe(6.0);
+    expect(cfg.search.ftsWeights.content).toBe(2.0);
+    expect(cfg.search.ftsWeights.category).toBe(3.0);
+    expect(cfg.search.recallLimit).toBe(10);
+    expect(cfg.search.queryExpansion).toBe(false);
+  });
+
+  test("search.ftsWeights can be customised", () => {
+    const cfg = LoreConfig.parse({
+      search: { ftsWeights: { title: 10.0, content: 1.0, category: 0.5 } },
+    });
+    expect(cfg.search.ftsWeights.title).toBe(10.0);
+    expect(cfg.search.ftsWeights.content).toBe(1.0);
+    expect(cfg.search.ftsWeights.category).toBe(0.5);
+  });
+
+  test("search.recallLimit can be customised", () => {
+    const cfg = LoreConfig.parse({ search: { recallLimit: 25 } });
+    expect(cfg.search.recallLimit).toBe(25);
+  });
+
+  test("search.recallLimit rejects values over 50", () => {
+    expect(() => LoreConfig.parse({ search: { recallLimit: 100 } })).toThrow();
+  });
+
+  test("search.queryExpansion can be enabled", () => {
+    const cfg = LoreConfig.parse({ search: { queryExpansion: true } });
+    expect(cfg.search.queryExpansion).toBe(true);
+  });
+
+  test("search section is optional — omitting it uses defaults", () => {
+    const cfg = LoreConfig.parse({ curator: { enabled: false } });
+    expect(cfg.search.ftsWeights.title).toBe(6.0);
+    expect(cfg.search.recallLimit).toBe(10);
+    expect(cfg.search.queryExpansion).toBe(false);
+  });
+
+  test("partial search config merges with defaults", () => {
+    const cfg = LoreConfig.parse({ search: { recallLimit: 20 } });
+    // ftsWeights should still have defaults
+    expect(cfg.search.ftsWeights.title).toBe(6.0);
+    expect(cfg.search.recallLimit).toBe(20);
+    expect(cfg.search.queryExpansion).toBe(false);
+  });
+});
+
 describe("load — reads config from .lore.json", () => {
   test("loads agentsFile.enabled=false from .lore.json", async () => {
     mkdirSync(TMP, { recursive: true });
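The fusion step in the recall tool can be illustrated in isolation. This is a minimal sketch of Reciprocal Rank Fusion with k=60, not Lore's actual `reciprocalRankFusion` implementation; the `RankedList` shape is assumed from the call sites in the diff:

```typescript
type RankedList<T> = { items: T[]; key: (item: T) => string };

// RRF: score(d) = sum over lists of 1 / (k + rank(d)).
// k = 60 dampens the impact of top ranks, so an item appearing in
// several lists outranks an item that tops only one list.
function reciprocalRankFusion<T>(
  lists: RankedList<T>[],
  k = 60,
): { item: T; score: number }[] {
  const scores = new Map<string, { item: T; score: number }>();
  for (const list of lists) {
    list.items.forEach((item, rank) => {
      const id = list.key(item);
      const entry = scores.get(id) ?? { item, score: 0 };
      entry.score += 1 / (k + rank + 1); // rank is 0-based here
      scores.set(id, entry);
    });
  }
  return [...scores.values()].sort((a, b) => b.score - a.score);
}

// "auth" appears in both lists, so it wins despite never ranking first.
const fused = reciprocalRankFusion([
  { items: ["login", "auth"], key: (s) => s },
  { items: ["auth", "tokens"], key: (s) => s },
]);
// → fused[0].item === "auth"
```

This is also why query expansion can simply push three more lists per variant into `allRrfLists`: RRF treats every list symmetrically and rewards cross-list agreement.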
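The AND-then-OR fallback described in the AGENTS.md notes can be sketched as follows. These are hypothetical helpers with a deliberately tiny stopword list; Lore's real `ftsQuery()`/`ftsQueryOr()` in `src/search.ts` carry a larger conservative list and additional handling:

```typescript
// Small stopword list: domain terms like "handle", "state", "type"
// are intentionally NOT stopwords, because they matter in code search.
const STOPWORDS = new Set(["the", "a", "an", "is", "of", "to", "in"]);

// Tokenize, drop stopwords and single characters, quote each term
// so FTS5 treats it as a literal string rather than query syntax.
function ftsTerms(query: string): string[] {
  return query
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((t) => t.length > 1 && !STOPWORDS.has(t));
}

// Primary query: all terms must match (adjacent terms are ANDed in FTS5).
function ftsQueryAnd(query: string): string {
  return ftsTerms(query).map((t) => `"${t}"`).join(" ");
}

// Fallback query: issued only when the AND query returns zero rows,
// since blanket OR adds noise even with stopword filtering.
function ftsQueryOr(query: string): string {
  return ftsTerms(query).map((t) => `"${t}"`).join(" OR ");
}

const andQ = ftsQueryAnd("the FTS5 ranking state"); // '"fts5" "ranking" "state"'
const orQ = ftsQueryOr("the FTS5 ranking state");   // '"fts5" OR "ranking" OR "state"'
```

The caller runs the AND string against the FTS5 `MATCH` clause first and retries with the OR string only on an empty result set, which preserves precision for specific queries while keeping recall for vague ones.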
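The fragile part of `expandQuery` is parsing the model's reply: the prompt demands a bare JSON array, but models routinely wrap output in a code fence. A self-contained sketch of that cleanup path (hypothetical `parseExpansions` helper mirroring the diff's logic, with the fence regex matching `json` as a whole word):

```typescript
// Strip an optional ``` / ```json fence, parse the JSON array, keep only
// non-empty strings, cap at 3 expansions, and always return the original
// query first. Any malformed reply degrades to just the original query.
function parseExpansions(original: string, raw: string): string[] {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/\s*```$/, "");
  try {
    const parsed = JSON.parse(cleaned);
    if (!Array.isArray(parsed)) return [original];
    const expanded = parsed.filter(
      (q): q is string => typeof q === "string" && q.trim().length > 0,
    );
    return expanded.length ? [original, ...expanded.slice(0, 3)] : [original];
  } catch {
    return [original]; // malformed JSON → fall back to the original query
  }
}

const out = parseExpansions(
  "SQLite FTS5 ranking",
  '```json\n["BM25 relevance ranking database", "FTS5 match order by rank"]\n```',
);
// → ["SQLite FTS5 ranking", "BM25 relevance ranking database", "FTS5 match order by rank"]
```

Because every failure mode collapses to `[original]`, the recall tool never gets worse than unexpanded search when the expansion model misbehaves or times out.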