Vector.Search is a .NET 10 solution for indexing source code into a vector store and exposing HTTP + SignalR endpoints that power semantic code search and related features. It scans a repository, chunks supported files, generates embeddings via Ollama, and persists those embeddings using the Semantic Kernel PgVector connector on PostgreSQL. Clients can connect over SignalR to receive real-time progress as files are chunked and embedded.
- Features
- Solution Structure
- Getting Started
- HTTP & SignalR APIs
- Integration Tests
- Development Guide
- Contributing
- Troubleshooting
- Code-aware chunking pipeline
  - Pluggable IChunk implementation (e.g., CodeChunking) that walks a repository and splits code files (C# and potentially others) into semantically aligned chunks.
  - Configurable minimum chunk size and set of file extensions.
  - Stable hashing and GUID generation per chunk for idempotent indexing and deduplication.
- Embedding service
  - Uses Ollama for both embedding and chat models, configured via appsettings.*.json and/or environment variables.
  - Processes chunks in parallel with configurable ParallelOptions.
  - Streams structured progress notifications (ChunkProcessed, EmbeddingCompleted, EmbeddingError) over SignalR.
- Vector storage
  - Integrates with Microsoft.SemanticKernel.Connectors.PgVector and Microsoft.Extensions.VectorData.Abstractions.
  - Uses a VectorStoreCollection<Guid, CodeChunkRecord>-style abstraction for upsert and search operations.
  - Targets PostgreSQL with the pgvector extension enabled for efficient similarity search.
- Diagnostics & observability
  - Minimal APIs for configuration and connectivity: /debug/config, /debug/connection, /health/ollama.
  - Structured logging via Serilog.
  - Optional OpenTelemetry-based tracing/metrics export.
  - Background cleanup of temporary chunk files (configurable).
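To illustrate the "stable hashing and GUID generation" feature, here is a minimal sketch of how a deterministic chunk ID could be derived. The class name ChunkIdentity, its method names, and the exact hash input are assumptions made for illustration; the actual scheme used by the chunking pipeline may differ.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; names and hashing scheme are illustrative assumptions.
public static class ChunkIdentity
{
    // SHA-256 over the UTF-8 bytes of the chunk text, as an uppercase hex string.
    public static string ContentHash(string content) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(content)));

    // Derives a deterministic GUID from the first 16 bytes of a SHA-256 digest.
    // With only content, identical text always maps to the same ID (deduplication).
    // With a file path and chunk index, identical text in different places gets
    // different IDs.
    public static Guid StableId(string content, string? filePath = null, int chunkIndex = 0)
    {
        string input = filePath is null
            ? content
            : $"{filePath}\n{chunkIndex}\n{content}";
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(input));
        return new Guid(digest.AsSpan(0, 16));
    }
}
```

Because the ID depends only on its inputs, re-indexing an unchanged chunk would produce the same key, so an upsert overwrites the existing record instead of creating a duplicate.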
The solution is organized roughly as follows (names may vary slightly depending on the repo):
- src/Vector.Search
  - ASP.NET Core minimal API host.
  - HTTP endpoints, SignalR hubs, and background processing orchestration.
  - Wiring for Ollama, vector store, and chunking services.
- src/Vector.Store / src/Vector.Files (or similar)
  - Chunking pipeline implementation (e.g., CodeChunking).
  - Models that represent chunks (CodeChunkRecord / CodeChunk).
  - Vector store integration (CodeVectorStore), which wraps VectorStoreCollection.
- tests/*
  - Integration and (optionally) unit tests targeting the public HTTP/SignalR surface.
Check the actual folder names in your repository for the exact layout.
- .NET SDK 10
- PostgreSQL with the pgvector extension installed and enabled.
- A running Ollama instance with:
  - At least one embedding model (e.g., nomic-embed-text).
  - At least one chat model (e.g., llama3.1).
- (Optional) A logging/observability backend such as Seq or an OpenTelemetry-compatible target.
Configuration is provided via appsettings.json / appsettings.Development.json and environment variables. Important settings include:
- Repository & chunking
  - REPO_ROOT
    Absolute path to the repository to index.
  - FILE_EXTENSIONS
    Comma-separated list of file extensions to include (e.g., .cs,.csproj).
  - DeleteTemporaryFiles (in vector store/chunking settings)
    Whether to delete temporary chunk files after processing.
- Vector store
  - COLLECTION_NAME
    Name of the vector store collection.
  - Connection string / settings for PostgreSQL with pgvector.
  - Microsoft.SemanticKernel.Connectors.PgVector and Microsoft.Extensions.VectorData.* versions must be compatible (managed centrally in Directory.Packages.props).
- Ollama / models
  - OllamaSettings:Url
    Base URL for the Ollama instance (e.g., http://ollama:11434).
  - EMBEDDING_MODEL
    Embedding model name (e.g., nomic-embed-text).
  - CHAT_MODEL
    Chat model name (e.g., llama3.1).
  - OllamaSettings:TimeoutFromMinutes
    Timeout used for Ollama operations.
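As an illustration of how a comma-separated FILE_EXTENSIONS value might be interpreted, the sketch below normalizes the list and filters file paths against it. ExtensionFilter and its methods are hypothetical names, and the normalization rules (trimming, adding a leading dot, case-insensitive matching) are assumptions, not necessarily what the host does.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper showing one way to interpret FILE_EXTENSIONS.
public static class ExtensionFilter
{
    // Splits input such as ".cs, csproj" into a case-insensitive set,
    // trimming whitespace, skipping empty entries, and adding a leading dot.
    public static HashSet<string> Parse(string? raw) =>
        (raw ?? string.Empty)
            .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Select(e => e.StartsWith('.') ? e : "." + e)
            .ToHashSet(StringComparer.OrdinalIgnoreCase);

    // Returns true when the file's extension is in the configured set.
    public static bool IsIncluded(string path, ISet<string> extensions) =>
        extensions.Contains(Path.GetExtension(path));
}
```

A caller could build the set once with ExtensionFilter.Parse(Environment.GetEnvironmentVariable("FILE_EXTENSIONS")) and reuse it while walking REPO_ROOT.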
From the repository root:
dotnet run --project src/Vector.Search
Then you can:
- Browse to https://localhost:<port>/ to confirm it’s running.
- Call GET /debug/config to inspect runtime configuration.
- Call GET /health/ollama to verify Ollama connectivity.
- Use POST /api/embed with a JSON EmbedRequest and a valid SignalR connection ID to start indexing a repo.
- GET /
  Basic text response: confirms that Vector.Search is running.
- GET /api/antiforgery/token
  Returns antiforgery tokens and sets an XSRF-TOKEN cookie for SPA/browser clients.
- GET /debug/config
  Returns high-level runtime configuration, including Ollama URL, selected models, and timeout values.
- GET /debug/connection
  Uses IHttpClientFactory to call Ollama /api/tags and returns connectivity details and raw response content.
- GET /health/ollama
  Uses OllamaApiClient to list models and returns a simple { status = "healthy", models = [...] } payload if Ollama is reachable.
- POST /api/embed
  Starts the embedding pipeline for the configured repository.
  - Request body: an EmbedRequest that includes at least a SignalR connection ID.
  - Response: 202 Accepted with an operation ID.
  - Background processing:
    - Scans REPO_ROOT for files with FILE_EXTENSIONS.
    - Chunks them into temporary files.
    - Embeds each chunk via Ollama.
    - Upserts to the vector store.
    - Streams progress events over SignalR.
- POST /api/code
  Demo endpoint that echoes the uploaded file’s name and length, used to verify file upload handling.
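A client call to POST /api/embed could look roughly like the sketch below. The EmbedRequest shape shown here (a single ConnectionId property) is an assumption, since only the presence of a SignalR connection ID is documented; EmbedClient is a hypothetical helper, and the response is returned as raw text because the exact response schema is not specified here. If antiforgery validation is enforced for this endpoint, a token from GET /api/antiforgery/token would also need to be attached.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

// Assumed request shape; the real EmbedRequest may carry more fields.
public sealed record EmbedRequest(string ConnectionId);

// Hypothetical client helper for the embed endpoint.
public static class EmbedClient
{
    // Builds (but does not send) the POST /api/embed request with a JSON body.
    public static HttpRequestMessage BuildRequest(Uri baseAddress, string connectionId) =>
        new(HttpMethod.Post, new Uri(baseAddress, "api/embed"))
        {
            Content = JsonContent.Create(new EmbedRequest(connectionId))
        };

    // Sends the request and returns the raw response body, which is expected
    // to contain the operation ID. Throws if the status is not 202 Accepted.
    public static async Task<string> StartAsync(HttpClient http, string connectionId)
    {
        Uri baseAddress = http.BaseAddress
            ?? throw new InvalidOperationException("Set HttpClient.BaseAddress first.");
        using HttpRequestMessage request = BuildRequest(baseAddress, connectionId);
        using HttpResponseMessage response = await http.SendAsync(request);
        if (response.StatusCode != HttpStatusCode.Accepted)
        {
            throw new HttpRequestException(
                $"Expected 202 Accepted, got {(int)response.StatusCode}.");
        }
        return await response.Content.ReadAsStringAsync();
    }
}
```

The connection ID passed in should come from an already started SignalR connection, so that progress events for the returned operation ID have somewhere to go.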
Clients connect to a hub (e.g., EmbeddingHub) and listen for:
- ChunkProcessed
  - Payload includes OperationId, FilePath, and Indexed count.
- EmbeddingCompleted
  - Payload includes OperationId and total Indexed count.
- EmbeddingError
  - Payload includes OperationId and an error message.
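On the client side, the three events could be modeled and tracked along the lines of the sketch below. The record shapes are inferred from the payload descriptions above; the property types (string operation IDs, int counts), the Message property name, and the EmbeddingProgressTracker class are assumptions for illustration.

```csharp
using System;
using System.Collections.Concurrent;

// Payload shapes inferred from the event descriptions; the real hub
// contract may use different property names or types.
public sealed record ChunkProcessed(string OperationId, string FilePath, int Indexed);
public sealed record EmbeddingCompleted(string OperationId, int Indexed);
public sealed record EmbeddingError(string OperationId, string Message);

// Hypothetical tracker that keeps the latest status text per operation.
public sealed class EmbeddingProgressTracker
{
    private readonly ConcurrentDictionary<string, string> _status = new();

    public void OnChunkProcessed(ChunkProcessed e) =>
        _status[e.OperationId] = $"indexed {e.Indexed} (last file: {e.FilePath})";

    public void OnCompleted(EmbeddingCompleted e) =>
        _status[e.OperationId] = $"completed, total indexed: {e.Indexed}";

    public void OnError(EmbeddingError e) =>
        _status[e.OperationId] = $"failed: {e.Message}";

    // Returns the last recorded status, or "unknown" for unseen operations.
    public string StatusOf(string operationId) =>
        _status.TryGetValue(operationId, out string? status) ? status : "unknown";
}
```

With the Microsoft.AspNetCore.SignalR.Client package, these handlers could be registered on a HubConnection before calling StartAsync, for example connection.On<ChunkProcessed>("ChunkProcessed", tracker.OnChunkProcessed), and likewise for the other two events.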
Integration tests validate the system end-to-end using the real HTTP and SignalR surfaces:
- Host startup
  - Spins up the ASP.NET Core host using the same Program as production.
  - Uses test-specific configuration (test repo path, file extensions, Ollama and DB endpoints).
- Embedding pipeline
  - Issues a POST /api/embed with a test EmbedRequest.
  - Asserts the API returns 202 Accepted and an operation ID.
  - Connects a SignalR client to capture:
    - ChunkProcessed events for each processed chunk.
    - EmbeddingCompleted when the process finishes successfully.
    - EmbeddingError if an error occurs.
  - (Optionally) Verifies records in the backing vector store (e.g., count, properties).
- Diagnostics
  - GET /debug/config: verifies that configured values are exposed correctly.
  - GET /debug/connection: ensures Ollama is reachable and returns expected metadata.
  - GET /health/ollama: ensures ‘healthy’ status given a working Ollama instance.
- Failure paths
  - Runs with misconfigured or unavailable dependencies to ensure:
    - Errors are logged.
    - EmbeddingError events are emitted.
    - HTTP responses remain meaningful (e.g., proper error codes where applicable).
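Because the pipeline runs in the background, a test needs some way to wait for a terminal event without hanging forever. One possible helper is sketched below; EmbeddingOutcome is a hypothetical name and is not taken from the test project.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical test helper: completes when either terminal event arrives.
public sealed class EmbeddingOutcome
{
    private readonly TaskCompletionSource<int> _tcs =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    // Call from the EmbeddingCompleted handler with the total indexed count.
    public void Completed(int indexed) => _tcs.TrySetResult(indexed);

    // Call from the EmbeddingError handler with the reported message.
    public void Failed(string message) =>
        _tcs.TrySetException(new InvalidOperationException(message));

    // Returns the indexed count, rethrows the reported error, or throws
    // TimeoutException if no terminal event arrives within the timeout.
    public Task<int> WaitAsync(TimeSpan timeout) => _tcs.Task.WaitAsync(timeout);
}
```

A test would wire Completed and Failed to the SignalR handlers, issue POST /api/embed, and then await WaitAsync with a generous timeout before asserting on the result.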
This solution uses central package management via Directory.Packages.props. Keep versions aligned, especially for:
- Microsoft.SemanticKernel.Connectors.PgVector
- Microsoft.Extensions.VectorData.Abstractions (and related Microsoft.Extensions.AI.* packages)
Misaligned versions can cause runtime TypeLoadException errors.
- Restore dependencies and build, for example with dotnet restore followed by dotnet build.
- Run the API: dotnet run --project src/Vector.Search
- Tail logs using your configured Serilog sinks (console, Seq, etc.).
- Use your preferred HTTP client (curl, Postman, browser) and a SignalR client to exercise the endpoints.
Contributions are welcome.
- Fork the repository and create a feature branch.
- Develop:
  - Make changes in src/* projects.
  - Add or update tests in tests/*.
- Test: run the test suite (for example, dotnet test) and confirm it passes.
- Submit a Pull Request:
  - Describe the change clearly.
  - Include any new configuration options or breaking changes.
  - Reference any related issues.
Please keep changes focused and small where possible and ensure the build and tests pass before submitting.
- TypeLoadException from PgVector / Semantic Kernel
  Ensure Microsoft.SemanticKernel.Connectors.PgVector and Microsoft.Extensions.VectorData.* share a compatible version line configured in Directory.Packages.props. Mismatches can lead to methods missing implementations at runtime.
- “Hash collision detected” warnings
  These usually mean identical chunk content was seen more than once, not a true SHA-256 collision. If you require uniqueness per physical chunk (even with identical content), include file path or a chunk identifier in the hash input or adjust the collision check logic.
- Slow embedding performance
  Tune:
  - ParallelOptions.MaxDegreeOfParallelism in the embedding/processing loops.
  - Ollama concurrency and resource limits.
- Temporary directory not deleted
  Check:
  - DeleteTemporaryFiles configuration.
  - Whether any process is still holding open file handles (especially on Windows).
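For the slow-embedding item above, the effect of ParallelOptions.MaxDegreeOfParallelism can be seen in a sketch like the following. ParallelEmbedding and the embedChunk delegate are hypothetical stand-ins for the real processing loop and the Ollama call plus upsert; raising the limit is only likely to help if the Ollama instance can actually serve that many requests concurrently.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the embedding loop; not the project's actual code.
public static class ParallelEmbedding
{
    // Runs embedChunk over all chunks with at most maxParallelism in flight
    // and returns how many chunks were processed.
    public static async Task<int> RunAsync(
        IEnumerable<string> chunks,
        Func<string, CancellationToken, ValueTask> embedChunk,
        int maxParallelism,
        CancellationToken cancellationToken = default)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxParallelism,
            CancellationToken = cancellationToken
        };

        int processed = 0;
        await Parallel.ForEachAsync(chunks, options, async (chunk, ct) =>
        {
            await embedChunk(chunk, ct);
            // Interlocked keeps the counter correct across concurrent workers.
            Interlocked.Increment(ref processed);
        });
        return processed;
    }
}
```

Measuring total time for a fixed set of chunks at a few different limits is usually a more reliable guide than picking a value based on core count alone, since the bottleneck is often the model server.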
If you encounter issues not covered here, consider opening an issue with logs, configuration snippets (minus secrets), and repro steps.