A conversational assistant for explainable, transparent, and traceable software architecture decisions.
Need and Motivation · Capabilities · Architecture Components · Setup · Prompts and Traceability
LLM-based conversational recommenders can improve interaction quality, but they often become difficult to audit when a single LLM is responsible for multiple roles (dialogue management, interpretation, and sometimes the recommendation itself). This prototype targets that gap by isolating the decision mechanism in a deterministic, inspectable component, while the LLM is limited to interpreting user inputs, asking clarifying questions, and producing explanations grounded on the deterministic output.
Caution
This is a research prototype.
| Capability | Status |
|---|---|
| Multi-turn interaction orchestration (state machine) | ✅ |
| Deterministic recommendation (decision table / scoring / ranking) | ✅ |
| Explicit Decision Model (explicit architecture catalog) | ✅ |
| LLM elicitation: interpret user answers into predefined criteria | ✅ |
| LLM elicitation: ambiguity detection + clarification question generation | ✅ |
| LLM explanation: natural-language justification grounded on decision output | ✅ |
| Prompted workflow (prompt templates versioned in-repo) | ✅ |
| Token/context optimization (avoid re-sending long histories) | ❌ |
| Evaluation metrics for explanation quality and auditability | ❌ |
| Long-term conversation memory with traceable persistence (no re-sending context) | ❌ |
Note
The prototype produces explanations grounded on deterministic outputs, but it currently lacks a standardized evaluation layer to quantify explanation quality (e.g., faithfulness, completeness) and auditability (e.g., trace reconstruction accuracy).
- UI (chat): user entry point (frontend).
- Orchestrator: central coordinator of the interaction flow and component exchanges.
- Elicitation Machine (LLM): receives elicitation requests and returns inferred criteria.
- Explicit Decision Model: receives inferred criteria and returns a decision table.
- Decision Maker (deterministic): receives a recommendation request and the decision table; returns the recommendation together with the decision table.
- Recommendation Explainer (LLM): receives the recommendation and decision table; returns the recommendation with an LLM-generated explanation.
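Because only the Decision Maker ranks, its logic can be as simple as a reproducible weighted scoring over the decision table. A minimal sketch of such a step, with a toy decision table and hypothetical names (the actual interfaces live in the backend):

```python
def recommend(decision_table: dict[str, dict[str, int]],
              criteria: dict[str, int]) -> list[tuple[str, int]]:
    """Deterministic Decision Maker sketch: score each architecture by the
    weighted criteria it satisfies, then rank. No LLM is involved here."""
    scores = {
        arch: sum(weight * row.get(criterion, 0)
                  for criterion, weight in criteria.items())
        for arch, row in decision_table.items()
    }
    # Stable, reproducible ranking: score descending, then name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy decision table: 1 = the architecture satisfies the criterion.
table = {
    "microservices": {"scalability": 1, "teamSize": 1},
    "monolith":      {"scalability": 0, "teamSize": 1},
}
criteria = {"scalability": 2, "teamSize": 1}  # weights inferred by the LLM elicitor
print(recommend(table, criteria))  # microservices outranks monolith
```

Keeping this step free of any LLM call is what makes a recommendation fully re-derivable from its inputs.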
Caution
The LLM must remain non-decisional. Any prompt or integration change that lets the LLM alter ranking undermines auditability.
Note
Components marked with ⊗ are LLM-based and are restricted to elicitation and explanation (not decision making).
From the repository root, create `.env`, set `DEEPSEEK_API_KEY`, and optionally adjust `LOG_LEVEL`:

```bash
cp .env.example .env
# edit .env and replace DEEPSEEK_API_KEY
docker compose up --build
```

Relevant `.env` values:
```
LOG_LEVEL=DEBUG   # INFO / WARNING / ERROR / CRITICAL
HOST=0.0.0.0
PORT=5000
DEEPSEEK_API_KEY=sk-your-key-here
```

Prompt templates are stored under:

```
archssistant-backend/app/services/elicitation_machine/prompt/
```
These prompts are treated as versioned behavioral artifacts (Git history = traceability). The current prompt set enforces strict JSON-only outputs to keep the pipeline deterministic and auditable.
- `interpret_user_answer_prompt.txt` — classifies a user answer for a given parameter (e.g., `scalability`, `teamSize`) and returns:
  - `classification` (or `UNCERTAIN`)
  - `confidence` (`high` | `medium` | `low`)
  - a short `reasoning` string
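Since the prompt set enforces strict JSON-only output, this contract can be validated mechanically before the result enters the pipeline. A minimal sketch under that assumption (the function name and raw payload are hypothetical; field names follow the contract above):

```python
import json

ALLOWED_CONFIDENCE = {"high", "medium", "low"}

def parse_interpretation(raw: str) -> dict:
    """Validate the JSON-only output of the interpretation prompt."""
    data = json.loads(raw)  # fails fast if the LLM emitted non-JSON text
    for field in ("classification", "confidence", "reasoning"):
        if field not in data:
            raise ValueError(f"missing field: {field}")
    if data["confidence"] not in ALLOWED_CONFIDENCE:
        raise ValueError(f"bad confidence: {data['confidence']!r}")
    return data

raw = ('{"classification": "UNCERTAIN", "confidence": "low", '
       '"reasoning": "answer was vague"}')
result = parse_interpretation(raw)
print(result["classification"])  # UNCERTAIN
```

Rejecting malformed output at this boundary keeps downstream components deterministic and every accepted payload traceable.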
- `generate_next_question_prompt.txt` — produces the next conversational turn. It supports:
  - clarification mode when `isClarificationNeeded=true`
  - normal flow when `isClarificationNeeded=false`

  Output is a JSON contract including `parameter_to_infer`, `question_for_user`, and `full_response_text`.
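The same validate-at-the-boundary pattern applies to this contract. An illustrative check for the required keys (the payload below is hypothetical; the key names come from the contract above):

```python
import json

REQUIRED = ("parameter_to_infer", "question_for_user", "full_response_text")

def parse_next_question(raw: str) -> dict:
    """Validate the JSON contract of the question-generation prompt."""
    data = json.loads(raw)
    missing = [k for k in REQUIRED if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

raw = json.dumps({
    "parameter_to_infer": "scalability",
    "question_for_user": "How many concurrent users do you expect?",
    "full_response_text": "Let's talk about scale. "
                          "How many concurrent users do you expect?",
})
print(parse_next_question(raw)["parameter_to_infer"])  # scalability
```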
- `generate_final_descriptions_prompt.txt` — generates final architecture descriptions and justifications grounded on:
  - `{project_description}`
  - `{conversation_history}`
  - `{recommendations_names}`

  Output is a JSON object whose keys must match architecture names exactly.
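The exact-key-match constraint is what ties each explanation back to a deterministically recommended architecture. A hypothetical sketch of enforcing it (function name and payload are illustrative):

```python
import json

def validate_descriptions(raw: str, recommended: list[str]) -> dict[str, str]:
    """Require the explanation JSON keys to match the recommended
    architecture names exactly -- no extras, no omissions."""
    data = json.loads(raw)
    if set(data) != set(recommended):
        raise ValueError(
            f"keys {sorted(data)} != recommendations {sorted(recommended)}")
    return data

raw = ('{"microservices": "Scales horizontally under load.", '
       '"monolith": "Simple to deploy and operate."}')
descriptions = validate_descriptions(raw, ["microservices", "monolith"])
print(len(descriptions))  # 2
```

This prevents the explainer from silently introducing (or dropping) an architecture, which would break the trace between recommendation and justification.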
