…e bundle support

- Add EnableLocalBotAPI, TelegramBotApiId, TelegramBotApiHash, LocalBotApiPort to Config/Env
- Auto-set BaseUrl and IsLocalAPI when EnableLocalBotAPI is true
- Launch telegram-bot-api.exe as a ChildProcessManager-managed process in GeneralBootstrap
- Add conditional Content entry for telegram-bot-api.exe in .csproj
- Build telegram-bot-api from source via cmake+vcpkg in GitHub Actions push workflow

Co-authored-by: ModerRAS <28183976+ModerRAS@users.noreply.github.com>
- Add IOCRService interface and OCREngine enum
- Add LLMOCRService implementation using GeneralLLMService.AnalyzeImageAsync
- Add OCRConfState enum and state machine
- Add EditOCRConfRedisHelper for Redis-backed state storage
- Add EditOCRConfService implementing the OCR configuration state-machine logic
- Add EditOCRConfController and EditOCRConfView for bot private-chat configuration
- Modify AutoOCRController to support dynamic OCR engine switching
- Modify PaddleOCRService to implement the IOCRService interface

Configuration is entered by sending "OCR设置" to the bot in a private chat.
📝 Walkthrough

Adds optional local Telegram Bot API build and startup to CI/startup; introduces an IOCRService abstraction plus an LLM-based OCR implementation; implements a Redis-backed, stateful per-chat OCR configuration flow (service, controller, view) and runtime OCR engine selection in the AutoOCRController.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    actor Admin
    participant Telegram
    participant Controller as EditOCRConfController
    participant Service as EditOCRConfService
    participant Redis as EditOCRConfRedisHelper
    participant DB as DataDbContext
    participant View as EditOCRConfView
    Admin->>Telegram: Send OCR config command
    Telegram->>Controller: Deliver update
    Controller->>Controller: Validate admin & extract command
    Controller->>Service: ExecuteAsync(command, chatId)
    Service->>Redis: Initialize / GetStateAsync()
    Redis-->>Service: Return current state
    alt MainMenu -> Engine selection
        Service->>Redis: SetStateAsync(SelectingEngine)
        Service->>DB: Upsert OCR:Engine (if applicable)
    else SelectingLLMChannel
        Service->>Redis: SetStateAsync(SelectingLLMChannel)
        Service->>DB: Query ChannelsWithModel
    else SelectingLLMModel
        Service->>Redis: GetChannelIdAsync()
        Service->>DB: Upsert LLM model/config items
        Service->>Redis: DeleteKeysAsync()
    end
    Service-->>Controller: (status, message)
    Controller->>View: WithChatId().WithReplyTo().WithMessage()
    Controller->>View: Render()
    View->>Telegram: Send response
    Telegram->>Admin: Show result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
🔍 PR Check Report

- 📋 Check overview
- 🧪 Test results
- 📊 Code quality
- 📁 Test artifacts
- 🔗 Related links

This report is generated automatically by GitHub Actions.
Actionable comments posted: 12
🧹 Nitpick comments (4)
TelegramSearchBot/AppBootstrap/GeneralBootstrap.cs (1)
97-131: Windows-only implementation - consider documenting or guarding.

This implementation is Windows-specific:

- `telegram-bot-api.exe` is a Windows binary
- `childProcessManager` uses Windows job objects (see `AppBootstrap.cs:19-143`)

Linux users with `EnableLocalBotAPI = true` will see the warning at line 129 but might be confused. Consider adding platform documentation or a more explicit message.

🔧 Suggested improvement for cross-platform clarity

```diff
 if (Env.EnableLocalBotAPI) {
+#if !WINDOWS
+    Log.Warning("EnableLocalBotAPI is currently only supported on Windows. Feature disabled.");
+#else
     string botApiExePath = Path.Combine(AppContext.BaseDirectory, "telegram-bot-api.exe");
     // ... existing code ...
+#endif
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `TelegramSearchBot/AppBootstrap/GeneralBootstrap.cs` around lines 97-131: The current local telegram-bot-api startup block (guarded by Env.EnableLocalBotAPI) assumes a Windows executable and Windows job-object management (see botApiExePath, childProcessManager and WaitForLocalBotApiReady); update the logic to first check the OS (e.g., RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) and only attempt to locate and start "telegram-bot-api.exe" and register it with childProcessManager on Windows, otherwise log a clear platform-specific message that EnableLocalBotAPI is not supported on the current OS and suggest the recommended Linux alternative or documentation link; ensure the log includes which platform was detected and keep the existing Windows startup flow (ArgumentList, AddProcess, WaitForLocalBotApiReady) unchanged when on Windows.

.github/workflows/push.yml (1)
36-53: Build step lacks explicit error handling for vcpkg/cmake failures.

If vcpkg install or cmake fails, the step will fail the entire workflow without clear indication of which part failed. Consider adding explicit error checks or using `$ErrorActionPreference = 'Stop'` at the start.

Also, the recursive clone of `tdlib/telegram-bot-api` can be slow and bandwidth-intensive. The current caching strategy helps, but initial builds or cache misses will be slow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In .github/workflows/push.yml around lines 36-53: The "Build telegram-bot-api from source" step lacks explicit failure handling and does a full recursive clone; make the step fail-fast and emit clear errors by setting PowerShell to stop on errors (e.g., set $ErrorActionPreference = 'Stop' at the top of the run block) and wrap the key commands (git clone, C:\vcpkg\vcpkg.exe install, cmake configure/build) in try/catch that logs which command failed (include the command name like git clone, vcpkg install, cmake) and rethrows/exits with a non-zero code; also replace the recursive git clone with a shallow non-recursive clone (e.g., --depth 1 and no --recursive) or explicitly fetch only needed submodules to reduce bandwidth on cache misses.

TelegramSearchBot/TelegramSearchBot.csproj (1)
99-103: Cross-platform consideration: Windows-only binary packaging.

The project targets both `win-x64` and `linux-x64` (line 5), but only `telegram-bot-api.exe` is packaged. Linux deployments would need `telegram-bot-api` (no `.exe` extension) built for Linux.

The `Condition="Exists('telegram-bot-api.exe')"` gracefully handles the missing file case, so Linux builds won't fail. However, users enabling `EnableLocalBotAPI` on Linux won't have the binary available.

Consider documenting this Windows-only limitation or adding Linux binary support in a future iteration.
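One way this could be expressed in the csproj is sketched below; the Linux binary name `telegram-bot-api` and the `RuntimeIdentifier` conditions are assumptions, since no Linux build step exists yet.

```xml
<ItemGroup>
  <!-- Windows binary, as today -->
  <Content Include="telegram-bot-api.exe"
           Condition="Exists('telegram-bot-api.exe') And '$(RuntimeIdentifier)' == 'win-x64'">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </Content>
  <!-- Hypothetical Linux binary produced by a future build step -->
  <Content Include="telegram-bot-api"
           Condition="Exists('telegram-bot-api') And '$(RuntimeIdentifier)' == 'linux-x64'">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </Content>
</ItemGroup>
```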
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `TelegramSearchBot/TelegramSearchBot.csproj` around lines 99-103: The project currently packages only the Windows binary "telegram-bot-api.exe" (ItemGroup Content Include="telegram-bot-api.exe") while the runtime identifiers include both "win-x64" and "linux-x64"; update the packaging to account for Linux by either adding a conditional Content entry for the Linux binary name "telegram-bot-api" (and/or a platform-specific MSBuild Condition based on RuntimeIdentifier) or document that EnableLocalBotAPI is Windows-only so Linux users know they must provide a Linux build; locate the ItemGroup that references "telegram-bot-api.exe" and adjust it to include the Linux artifact or add clear documentation referencing EnableLocalBotAPI and the runtime identifiers.

TelegramSearchBot/Service/Manage/EditOCRConfService.cs (1)
215-234: Persist LLM channel/model atomically to avoid partial configuration.

`SetLLMConfigAsync` performs two separate writes, each with its own `SaveChangesAsync`. If the second write fails, config can become partially updated.

💡 Suggested fix

```diff
 private async Task SetLLMConfigAsync(int channelId, string modelName) {
-    await SetConfigAsync(OCRLLMChannelIdKey, channelId.ToString());
-    await SetConfigAsync(OCRLLMModelNameKey, modelName);
+    await UpsertConfigEntityAsync(OCRLLMChannelIdKey, channelId.ToString());
+    await UpsertConfigEntityAsync(OCRLLMModelNameKey, modelName);
+    await DataContext.SaveChangesAsync();
 }

-private async Task SetConfigAsync(string key, string value) {
+private async Task UpsertConfigEntityAsync(string key, string value) {
     var config = await DataContext.AppConfigurationItems
         .FirstOrDefaultAsync(x => x.Key == key);
     if (config == null) {
         await DataContext.AppConfigurationItems.AddAsync(new AppConfigurationItem { Key = key, Value = value });
     } else {
         config.Value = value;
     }
-
-    await DataContext.SaveChangesAsync();
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@TelegramSearchBot/Service/Manage/EditOCRConfService.cs` around lines 215 - 234, SetLLMConfigAsync currently calls SetConfigAsync twice which invokes DataContext.SaveChangesAsync for each key, risking partial updates; change SetLLMConfigAsync to set or upsert both OCRLLMChannelIdKey and OCRLLMModelNameKey on the DataContext.AppConfigurationItems in-memory first and then call a single SaveChangesAsync (or wrap both SetConfigAsync operations in a database transaction using DataContext.Database.BeginTransactionAsync and commit once) so both values are persisted atomically; update or add AppConfigurationItem entries for OCRLLMChannelIdKey and OCRLLMModelNameKey then perform one DataContext.SaveChangesAsync (or commit the transaction) to avoid partial configuration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/push.yml:
- Around line 54-57: The copy step "Copy telegram-bot-api binary to project"
runs unconditionally but will fail if neither the cache hit nor the build
produced telegram-bot-api-bin\telegram-bot-api.exe; update that job step to only
run when the binary is available by adding an if condition checking the cache
hit or build success (for example: if: steps.cache.outputs.cache-hit == 'true'
|| steps.build.outcome == 'success'), or alternatively check the file existence
before copying (using a conditional script or an if: expression) so the step is
skipped when the binary doesn't exist.
In `@TelegramSearchBot.Common/Env.cs`:
- Around line 86-89: Add documentation entries in the Docs/ folder for the newly
persisted configuration fields added to the Config/Env class: EnableLocalBotAPI,
TelegramBotApiId, TelegramBotApiHash, and LocalBotApiPort. Update the
user-facing configuration reference (e.g., the config schema docs and any
example config files) to describe each field, its type (bool, string, string,
int), default values (false, null/empty, null/empty, 8081), expected usage, and
any security notes for TelegramBotApiId/TelegramBotApiHash; also add a short
migration note so users know these are now persistent. Ensure the docs reference
the same symbols (EnableLocalBotAPI, TelegramBotApiId, TelegramBotApiHash,
LocalBotApiPort) so they match the code.
In `@TelegramSearchBot/AppBootstrap/GeneralBootstrap.cs`:
- Around line 108-126: The ProcessStartInfo currently sets
RedirectStandardOutput and RedirectStandardError to true but never consumes
those streams, which can cause the child process to hang; fix by either
disabling redirection (set RedirectStandardOutput = false and
RedirectStandardError = false on the ProcessStartInfo used to start the bot API)
or, if you need logs, consume the streams asynchronously after Process.Start by
attaching asynchronous readers to botApiProcess.StandardOutput and
botApiProcess.StandardError (or use BeginOutputReadLine/BeginErrorReadLine or
Task-based reads) and forward them to Log before calling
childProcessManager.AddProcess and awaiting WaitForLocalBotApiReady so buffers
never fill.
In `@TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs`:
- Around line 74-79: The controller currently awaits LLM OCR inline
(GetOCREngineAsync -> selecting _ocrServices where Engine == OCREngine.LLM and
calling LLMOCRService.ExecuteAsync), causing long-running work inside the
IOnUpdate path; change the flow so the controller does not run LLM OCR
synchronously: detect when GetOCREngineAsync returns OCREngine.LLM and instead
enqueue or dispatch the image to the existing subprocess/scheduler service used
for PaddleOCR (or the AppBootstrap-launched OCR subprocess) rather than calling
LLMOCRService.ExecuteAsync directly; keep synchronous/fast handling for update
handlers by selecting the local OCR service (e.g., PaddleOCR) for immediate
ExecuteAsync and push LLM requests to the background scheduler/service (use the
project’s scheduler enqueue method or subprocess launcher used elsewhere) so
IOnUpdate never performs image conversion/temp-file I/O or LLM network calls
inline.
In `@TelegramSearchBot/Controller/Manage/EditOCRConfController.cs`:
- Around line 18-20: EditOCRConfController currently holds a mutable
EditOCRConfView instance that gets reused by ControllerExecutor, causing
cross-request state corruption; remove the view field from EditOCRConfController
and either (A) change EditOCRConfView to be stateless by replacing instance
fields/_chatId/_replyToMessageId/_messageText with a Render(chatId,
replyToMessageId, messageText) method, or (B) instantiate a new EditOCRConfView
inside the controller execution path each request before calling its render/send
methods; update all places in EditOCRConfController that read or mutate
EditOCRConfView (e.g., where you set chatId/replyToMessageId/messageText) to use
the stateless Render signature or the fresh instance, ensuring the controller
remains stateless as required by ControllerExecutor.
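Option (A) above can be sketched as follows; the class name, signature, and string return value are illustrative (the real view would send through the Telegram client), the point being that a view with no per-request fields is safe to share across requests.

```csharp
using System;

// A view with no mutable per-request fields: all per-request data
// arrives through Render's parameters, so a shared instance cannot
// leak chat state between requests.
public class StatelessEditOCRConfView {
    public string Render(long chatId, int replyToMessageId, string messageText) =>
        $"chat={chatId} replyTo={replyToMessageId} text={messageText}";
}

public static class StatelessViewDemo {
    public static void Main() {
        var view = new StatelessEditOCRConfView(); // safe to register as a singleton
        Console.WriteLine(view.Render(12345, 42, "OCR设置"));
    }
}
```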
In `@TelegramSearchBot/Helper/EditOCRConfRedisHelper.cs`:
- Around line 30-31: The Redis keys for the transient edit flow are never
expiring; update SetStateAsync (and the other write methods referenced around
lines 38-39 and 50-51) to set a short TTL and refresh it on every write:
introduce a single TTL constant (e.g., TimeSpan editFlowTtl =
TimeSpan.FromMinutes(10)) in the class and use GetDatabase().StringSetAsync(key,
value, editFlowTtl) (or call KeyExpireAsync immediately after StringSetAsync)
for StateKey and the other keys written by SetSourceAsync / SetDocIdAsync so
each write resets the expiry and stale keys are automatically removed.
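A minimal sketch of this TTL pattern, assuming the helper wraps a StackExchange.Redis `IDatabase` (the key naming and the 10-minute value are examples):

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis; // NuGet: StackExchange.Redis

// Sketch only: one shared TTL for the transient edit flow, refreshed on
// every write so abandoned conversations expire by themselves.
public class EditOCRConfRedisHelperSketch {
    static readonly TimeSpan EditFlowTtl = TimeSpan.FromMinutes(10);

    readonly IDatabase _db;
    readonly long _chatId;

    public EditOCRConfRedisHelperSketch(IDatabase db, long chatId) {
        _db = db;
        _chatId = chatId;
    }

    string StateKey => $"EditOCRConf:{_chatId}:State";

    public Task SetStateAsync(string state) =>
        // Passing an expiry to StringSetAsync both writes the value and
        // resets the TTL, so every write extends the conversation window.
        _db.StringSetAsync(StateKey, state, EditFlowTtl);
}
```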
- Around line 42-51: Change the channel ID methods and storage to use long
instead of int: update GetChannelIdAsync to parse the stored string with
long.TryParse and return a long? (nullable long) and update SetChannelIdAsync to
accept a long channelId and store channelId.ToString(); ensure you reference
GetChannelIdAsync, SetChannelIdAsync and ChannelKey (and align with the existing
_chatId constructor parameter) so the Redis read/write uses 64-bit values and
preserves negative -100... Telegram IDs.
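The width issue is easy to demonstrate: supergroup IDs carrying Telegram's -100 prefix overflow Int32, so `int.TryParse` silently returns false where `long.TryParse` succeeds. The concrete ID below is illustrative.

```csharp
using System;

public static class ChannelIdWidth {
    // Parse a Redis-stored channel ID as a 64-bit value; int would
    // silently reject Telegram's -100-prefixed supergroup IDs.
    public static long? ParseChannelId(string stored) =>
        long.TryParse(stored, out var id) ? id : (long?)null;

    public static void Main() {
        const string stored = "-1001234567890";
        Console.WriteLine(int.TryParse(stored, out _));  // False: overflows Int32
        Console.WriteLine(ParseChannelId(stored));       // -1001234567890
    }
}
```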
In `@TelegramSearchBot/Service/AI/OCR/LLMOCRService.cs`:
- Around line 37-38: The call to _generalLLMService.AnalyzeImageAsync(tempPath,
0) discards the originating chat context; update IOCRService.ExecuteAsync to
accept a chatId parameter (thread e.Message.Chat.Id from the caller), propagate
that chatId into LLMOCRService (or wherever ExecuteAsync is implemented), and
replace the hardcoded 0 with the passed chatId when calling
_generalLLMService.AnalyzeImageAsync(tempPath, chatId) so chat-scoped
routing/configuration in GeneralLLMService receives the correct ChatId.
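The widened interface can be sketched as below; the interface shape is illustrative, and the commented-out line shows where the real `AnalyzeImageAsync` call would receive the caller's chat ID instead of the hardcoded 0.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Sketch: thread the originating chat ID through the OCR abstraction
// so chat-scoped LLM routing sees the right chat.
public interface IOCRServiceSketch {
    Task<string> ExecuteAsync(Stream image, long chatId);
}

public class LLMOCRServiceSketch : IOCRServiceSketch {
    public Task<string> ExecuteAsync(Stream image, long chatId) {
        // was: _generalLLMService.AnalyzeImageAsync(tempPath, 0);
        // now the caller's e.Message.Chat.Id reaches the LLM layer:
        // return _generalLLMService.AnalyzeImageAsync(tempPath, chatId);
        return Task.FromResult($"ocr-for-chat:{chatId}"); // stub result for illustration
    }
}

public static class ChatIdDemo {
    public static async Task Main() {
        IOCRServiceSketch svc = new LLMOCRServiceSketch();
        Console.WriteLine(await svc.ExecuteAsync(Stream.Null, 42)); // ocr-for-chat:42
    }
}
```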
- Around line 30-35: The current use of Path.GetTempFileName() + ".jpg" creates
an unused .tmp file each request; change the tempPath generation in
LLMOCRService (tempPath variable) to create a unique JPG path without
pre-creating a file (e.g., Path.Combine(Path.GetTempPath(),
Guid.NewGuid().ToString() + ".jpg") or Path.GetRandomFileName() + ".jpg"), write
the image to that path in your existing FileStream block, and ensure you delete
the tempPath in a finally/cleanup block after processing to avoid leaking temp
files.
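A self-contained sketch of the suggested pattern: build the path without pre-creating a file, and delete it in a finally block.

```csharp
using System;
using System.IO;

public static class TempJpgPath {
    // Unique .jpg path without pre-creating a file; Path.GetTempFileName()
    // would create (and leak) an empty .tmp alongside the real image.
    public static string NewPath() =>
        Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".jpg");

    public static void Main() {
        string tempPath = NewPath();
        try {
            File.WriteAllBytes(tempPath, new byte[] { 0xFF, 0xD8 }); // stand-in for the image bytes
            Console.WriteLine(File.Exists(tempPath)); // True
        } finally {
            if (File.Exists(tempPath)) File.Delete(tempPath); // always clean up
        }
        Console.WriteLine(File.Exists(tempPath)); // False
    }
}
```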
In `@TelegramSearchBot/Service/Manage/EditOCRConfService.cs`:
- Around line 60-63: The handler currently treats both "退出" and "返回" the same
and clears state via redis.DeleteKeysAsync(), which conflicts with the UI prompt
that says 输入`返回`返回主菜单; update the logic so only the "退出" command triggers
redis.DeleteKeysAsync() and returns the exit message, while "返回" should return
(true, "返回主菜单") or equivalent without clearing state; adjust any related
prompt/response text (the view prompt that mentions 输入`返回`返回主菜单) to remain
consistent with this new behavior.
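The distinction can be captured in a small pure handler; the reply strings below are illustrative, not the project's actual texts.

```csharp
using System;

public static class ExitVsBack {
    // "退出" (exit) ends the flow and clears stored state; "返回" (back)
    // only returns to the main menu and keeps the conversation alive.
    public static (bool ClearState, string Reply) Handle(string command) => command switch {
        "退出" => (true, "已退出OCR设置"),
        "返回" => (false, "返回主菜单"),
        _      => (false, "未知命令"),
    };

    public static void Main() {
        Console.WriteLine(Handle("退出").ClearState); // True: call DeleteKeysAsync() here
        Console.WriteLine(Handle("返回").ClearState); // False: state survives
    }
}
```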
- Around line 77-91: HandleMainMenuAsync currently ignores the incoming command
and always re-renders the menu; update it to parse the command and set the next
interaction state on the EditOCRConfRedisHelper so the flow can transition
(e.g., set redis.State to SelectingEngine when command == "1", to ViewingConfig
when command == "2", and handle "3" as return/exit). Keep returning the menu
text when command is empty or invalid, but when a valid option is chosen return
the appropriate state change (and a brief confirmation message) so downstream
handlers (SelectingEngine/ViewingConfig) are reachable; reference the
HandleMainMenuAsync method, the redis parameter, and the
SelectingEngine/ViewingConfig state names when making the change.
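The command-to-state mapping can be sketched as a pure function; the state names mirror the review, while the option numbering and reply texts are assumptions.

```csharp
using System;

public enum OCRConfMenuState { MainMenu, SelectingEngine, ViewingConfig }

public static class MainMenuSketch {
    // Map the user's menu choice to the next state instead of
    // unconditionally re-rendering the menu.
    public static (OCRConfMenuState Next, string Reply) HandleMainMenu(string command) => command switch {
        "1" => (OCRConfMenuState.SelectingEngine, "请选择OCR引擎"),
        "2" => (OCRConfMenuState.ViewingConfig, "当前OCR配置"),
        "3" => (OCRConfMenuState.MainMenu, "已退出OCR设置"),
        _   => (OCRConfMenuState.MainMenu, "请输入有效选项(1-3)"),
    };

    public static void Main() {
        Console.WriteLine(HandleMainMenu("1").Next); // SelectingEngine
        Console.WriteLine(HandleMainMenu("x").Next); // MainMenu
    }
}
```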
- Around line 70-75: The code currently returns (false, string.Empty) when
currentState is not found in _stateHandlers, which leaves the conversation
dead-ended; instead reset the conversation state to MainMenu and return a
friendly message. In the else branch where
_stateHandlers.TryGetValue(currentState, out var handler) is false, write the
new state ("MainMenu") back to Redis (or call the existing state setter used
elsewhere), ensure any in-memory state is updated, and return (true, "<brief
prompt for main menu>") or similar so the user is guided back to MainMenu;
reference _stateHandlers, currentState, handler and MainMenu when making the
change.
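The fallback can be sketched as below; the state and handler names mirror the review, and the reset message is illustrative (the real code would also persist MainMenu back to Redis).

```csharp
using System;
using System.Collections.Generic;

public enum ConfState { MainMenu, SelectingEngine }

public static class StateFallbackSketch {
    static readonly Dictionary<ConfState, Func<string, (bool, string)>> _stateHandlers =
        new() { [ConfState.MainMenu] = cmd => (true, "主菜单") };

    // Unknown state: reset to MainMenu and guide the user back instead
    // of returning (false, "") and dead-ending the conversation.
    public static (bool Ok, string Reply) Dispatch(ConfState currentState, string cmd) {
        if (_stateHandlers.TryGetValue(currentState, out var handler))
            return handler(cmd);
        // hypothetical: also write ConfState.MainMenu back to Redis here
        return (true, "状态已重置,请重新发送\"OCR设置\"");
    }

    public static void Main() =>
        Console.WriteLine(Dispatch(ConfState.SelectingEngine, "x").Ok); // True: user gets a prompt, not silence
}
```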
---
Nitpick comments:
In @.github/workflows/push.yml:
- Around line 36-53: The "Build telegram-bot-api from source" step lacks
explicit failure handling and does a full recursive clone; make the step
fail-fast and emit clear errors by setting PowerShell to stop on errors (e.g.,
set $ErrorActionPreference = 'Stop' at the top of the run block) and wrap the
key commands (git clone, C:\vcpkg\vcpkg.exe install, cmake configure/build) in
try/catch that logs which command failed (include the command name like git
clone, vcpkg install, cmake) and rethrows/exit with a non‑zero code; also
replace the recursive git clone with a shallow non-recursive clone (e.g.,
--depth 1 and no --recursive) or explicitly fetch only needed submodules to
reduce bandwidth on cache misses.
In `@TelegramSearchBot/AppBootstrap/GeneralBootstrap.cs`:
- Around line 97-131: The current local telegram-bot-api startup block (guarded
by Env.EnableLocalBotAPI) assumes a Windows executable and Windows job-object
management (see botApiExePath, childProcessManager and WaitForLocalBotApiReady);
update the logic to first check the OS (e.g.,
RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) and only attempt to locate
and start "telegram-bot-api.exe" and register it with childProcessManager on
Windows, otherwise log a clear platform-specific message that EnableLocalBotAPI
is not supported on the current OS and suggest the recommended Linux alternative
or documentation link; ensure the log includes which platform was detected and
keep the existing Windows startup flow (ArgumentList, AddProcess,
WaitForLocalBotApiReady) unchanged when on Windows.
In `@TelegramSearchBot/Service/Manage/EditOCRConfService.cs`:
- Around line 215-234: SetLLMConfigAsync currently calls SetConfigAsync twice
which invokes DataContext.SaveChangesAsync for each key, risking partial
updates; change SetLLMConfigAsync to set or upsert both OCRLLMChannelIdKey and
OCRLLMModelNameKey on the DataContext.AppConfigurationItems in-memory first and
then call a single SaveChangesAsync (or wrap both SetConfigAsync operations in a
database transaction using DataContext.Database.BeginTransactionAsync and commit
once) so both values are persisted atomically; update or add
AppConfigurationItem entries for OCRLLMChannelIdKey and OCRLLMModelNameKey then
perform one DataContext.SaveChangesAsync (or commit the transaction) to avoid
partial configuration.
In `@TelegramSearchBot/TelegramSearchBot.csproj`:
- Around line 99-103: The project currently packages only the Windows binary
"telegram-bot-api.exe" (ItemGroup Content Include="telegram-bot-api.exe") while
the runtime identifiers include both "win-x64" and "linux-x64"; update the
packaging to account for Linux by either adding a conditional Content entry for
the Linux binary name "telegram-bot-api" (and/or a platform-specific MSBuild
Condition based on RuntimeIdentifier) or document that EnableLocalBotAPI is
Windows-only so Linux users know they must provide a Linux build; locate the
ItemGroup that references "telegram-bot-api.exe" and adjust it to include the
Linux artifact or add clear documentation referencing EnableLocalBotAPI and the
runtime identifiers.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8202ab74-6587-4b38-8fed-2af0e43f4739
📒 Files selected for processing (13)
- .github/workflows/push.yml
- TelegramSearchBot.Common/Env.cs
- TelegramSearchBot/AppBootstrap/GeneralBootstrap.cs
- TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs
- TelegramSearchBot/Controller/Manage/EditOCRConfController.cs
- TelegramSearchBot/Helper/EditOCRConfRedisHelper.cs
- TelegramSearchBot/Interface/AI/OCR/IOCRService.cs
- TelegramSearchBot/Model/AI/OCRConfState.cs
- TelegramSearchBot/Service/AI/OCR/LLMOCRService.cs
- TelegramSearchBot/Service/AI/OCR/PaddleOCRService.cs
- TelegramSearchBot/Service/Manage/EditOCRConfService.cs
- TelegramSearchBot/TelegramSearchBot.csproj
- TelegramSearchBot/View/EditOCRConfView.cs
```yaml
- name: Copy telegram-bot-api binary to project
  shell: pwsh
  run: |
    Copy-Item "telegram-bot-api-bin\telegram-bot-api.exe" "TelegramSearchBot\telegram-bot-api.exe"
```
Copy step may fail if both cache miss and build fail.
The copy step runs unconditionally, but `telegram-bot-api-bin\telegram-bot-api.exe` only exists if the cache was hit OR the build succeeded. If the build step fails (e.g., network issue cloning, CMake error), this step will fail the workflow.
Consider adding an if condition or combining with the build step.
🔧 Proposed fix

```diff
 - name: Copy telegram-bot-api binary to project
+  if: success()
   shell: pwsh
   run: |
+    if (Test-Path "telegram-bot-api-bin\telegram-bot-api.exe") {
     Copy-Item "telegram-bot-api-bin\telegram-bot-api.exe" "TelegramSearchBot\telegram-bot-api.exe"
+    } else {
+    Write-Warning "telegram-bot-api.exe not found, skipping copy"
+    }
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```yaml
- name: Copy telegram-bot-api binary to project
  if: success()
  shell: pwsh
  run: |
    if (Test-Path "telegram-bot-api-bin\telegram-bot-api.exe") {
      Copy-Item "telegram-bot-api-bin\telegram-bot-api.exe" "TelegramSearchBot\telegram-bot-api.exe"
    } else {
      Write-Warning "telegram-bot-api.exe not found, skipping copy"
    }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/push.yml around lines 54 - 57, The copy step "Copy
telegram-bot-api binary to project" runs unconditionally but will fail if
neither the cache hit nor the build produced
telegram-bot-api-bin\telegram-bot-api.exe; update that job step to only run when
the binary is available by adding an if condition checking the cache hit or
build success (for example: if: steps.cache.outputs.cache-hit == 'true' ||
steps.build.outcome == 'success'), or alternatively check the file existence
before copying (using a conditional script or an if: expression) so the step is
skipped when the binary doesn't exist.
```csharp
public bool EnableLocalBotAPI { get; set; } = false;
public string TelegramBotApiId { get; set; }
public string TelegramBotApiHash { get; set; }
public int LocalBotApiPort { get; set; } = 8081;
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Check if documentation exists and mentions these new config options
fd -t f -e md . Docs/ 2>/dev/null | head -5
rg -l "EnableLocalBotAPI|TelegramBotApiId|TelegramBotApiHash|LocalBotApiPort" Docs/ 2>/dev/null || echo "No documentation found for new config fields"
```

Repository: ModerRAS/TelegramSearchBot

Length of output: 268
Update documentation in Docs/ for new configuration fields.
New persistent configuration fields (EnableLocalBotAPI, TelegramBotApiId, TelegramBotApiHash, LocalBotApiPort) have been added to the Config class but are not documented. Per coding guidelines, user documentation must be updated when modifying persistent configuration.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot.Common/Env.cs` around lines 86 - 89, Add documentation
entries in the Docs/ folder for the newly persisted configuration fields added
to the Config/Env class: EnableLocalBotAPI, TelegramBotApiId,
TelegramBotApiHash, and LocalBotApiPort. Update the user-facing configuration
reference (e.g., the config schema docs and any example config files) to
describe each field, its type (bool, string, string, int), default values
(false, null/empty, null/empty, 8081), expected usage, and any security notes
for TelegramBotApiId/TelegramBotApiHash; also add a short migration note so
users know these are now persistent. Ensure the docs reference the same symbols
(EnableLocalBotAPI, TelegramBotApiId, TelegramBotApiHash, LocalBotApiPort) so
they match the code.
```csharp
var startInfo = new ProcessStartInfo {
    FileName = botApiExePath,
    UseShellExecute = false,
    RedirectStandardOutput = true,
    RedirectStandardError = true
};
startInfo.ArgumentList.Add("--local");
startInfo.ArgumentList.Add($"--api-id={Env.TelegramBotApiId}");
startInfo.ArgumentList.Add($"--api-hash={Env.TelegramBotApiHash}");
startInfo.ArgumentList.Add($"--dir={botApiDataDir}");
startInfo.ArgumentList.Add($"--http-port={Env.LocalBotApiPort}");
var botApiProcess = Process.Start(startInfo);
if (botApiProcess == null) {
    Log.Warning("telegram-bot-api 进程启动失败");
} else {
    childProcessManager.AddProcess(botApiProcess);
    Log.Information("telegram-bot-api 已启动,等待端口 {Port} 就绪...", Env.LocalBotApiPort);
    await WaitForLocalBotApiReady(Env.LocalBotApiPort);
}
```
Redirected stdout/stderr are never consumed - potential process hang.
When RedirectStandardOutput and RedirectStandardError are set to true, the output streams must be read. If the internal buffer fills up (typically 4KB), the child process will block on writes, potentially causing a hang.
Either disable redirection (if you don't need the output) or consume the streams asynchronously.
🔧 Option 1: Disable redirection (simplest)

```diff
 var startInfo = new ProcessStartInfo {
     FileName = botApiExePath,
     UseShellExecute = false,
-    RedirectStandardOutput = true,
-    RedirectStandardError = true
+    RedirectStandardOutput = false,
+    RedirectStandardError = false
 };
```

🔧 Option 2: Consume streams asynchronously
```diff
 var botApiProcess = Process.Start(startInfo);
 if (botApiProcess == null) {
     Log.Warning("telegram-bot-api 进程启动失败");
 } else {
+    // Consume stdout/stderr to prevent buffer blocking
+    botApiProcess.OutputDataReceived += (s, e) => { if (e.Data != null) Log.Debug("[bot-api stdout] {Line}", e.Data); };
+    botApiProcess.ErrorDataReceived += (s, e) => { if (e.Data != null) Log.Debug("[bot-api stderr] {Line}", e.Data); };
+    botApiProcess.BeginOutputReadLine();
+    botApiProcess.BeginErrorReadLine();
     childProcessManager.AddProcess(botApiProcess);
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Option 1 applied:

```csharp
var startInfo = new ProcessStartInfo {
    FileName = botApiExePath,
    UseShellExecute = false,
    RedirectStandardOutput = false,
    RedirectStandardError = false
};
startInfo.ArgumentList.Add("--local");
startInfo.ArgumentList.Add($"--api-id={Env.TelegramBotApiId}");
startInfo.ArgumentList.Add($"--api-hash={Env.TelegramBotApiHash}");
startInfo.ArgumentList.Add($"--dir={botApiDataDir}");
startInfo.ArgumentList.Add($"--http-port={Env.LocalBotApiPort}");
var botApiProcess = Process.Start(startInfo);
if (botApiProcess == null) {
    Log.Warning("telegram-bot-api 进程启动失败");
} else {
    childProcessManager.AddProcess(botApiProcess);
    Log.Information("telegram-bot-api 已启动,等待端口 {Port} 就绪...", Env.LocalBotApiPort);
    await WaitForLocalBotApiReady(Env.LocalBotApiPort);
}
```

Option 2 applied:

```csharp
var startInfo = new ProcessStartInfo {
    FileName = botApiExePath,
    UseShellExecute = false,
    RedirectStandardOutput = true,
    RedirectStandardError = true
};
startInfo.ArgumentList.Add("--local");
startInfo.ArgumentList.Add($"--api-id={Env.TelegramBotApiId}");
startInfo.ArgumentList.Add($"--api-hash={Env.TelegramBotApiHash}");
startInfo.ArgumentList.Add($"--dir={botApiDataDir}");
startInfo.ArgumentList.Add($"--http-port={Env.LocalBotApiPort}");
var botApiProcess = Process.Start(startInfo);
if (botApiProcess == null) {
    Log.Warning("telegram-bot-api 进程启动失败");
} else {
    // Consume stdout/stderr to prevent buffer blocking
    botApiProcess.OutputDataReceived += (s, e) => { if (e.Data != null) Log.Debug("[bot-api stdout] {Line}", e.Data); };
    botApiProcess.ErrorDataReceived += (s, e) => { if (e.Data != null) Log.Debug("[bot-api stderr] {Line}", e.Data); };
    botApiProcess.BeginOutputReadLine();
    botApiProcess.BeginErrorReadLine();
    childProcessManager.AddProcess(botApiProcess);
    Log.Information("telegram-bot-api 已启动,等待端口 {Port} 就绪...", Env.LocalBotApiPort);
    await WaitForLocalBotApiReady(Env.LocalBotApiPort);
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/AppBootstrap/GeneralBootstrap.cs` around lines 108 - 126,
The ProcessStartInfo currently sets RedirectStandardOutput and
RedirectStandardError to true but never consumes those streams, which can cause
the child process to hang; fix by either disabling redirection (set
RedirectStandardOutput = false and RedirectStandardError = false on the
ProcessStartInfo used to start the bot API) or, if you need logs, consume the
streams asynchronously after Process.Start by attaching asynchronous readers to
botApiProcess.StandardOutput and botApiProcess.StandardError (or use
BeginOutputReadLine/BeginErrorReadLine or Task-based reads) and forward them to
Log before calling childProcessManager.AddProcess and awaiting
WaitForLocalBotApiReady so buffers never fill.
```csharp
var engine = await GetOCREngineAsync();
var ocrService = _ocrServices.FirstOrDefault(s => s.Engine == engine)
    ?? _ocrServices.First(s => s.Engine == OCREngine.PaddleOCR);

logger.LogInformation($"使用OCR引擎: {engine}");
OcrStr = await ocrService.ExecuteAsync(new MemoryStream(PhotoStream));
```
Keep LLM OCR off the update-handler path.
With OCREngine.LLM, this controller now awaits LLMOCRService.ExecuteAsync(...), which does image conversion, temp-file I/O, and the LLM call inline. That makes ExecuteAsync a long-running handler and can stall the controller pipeline for other updates. Dispatch the LLM OCR through the same subprocess/scheduler boundary as PaddleOCR instead of doing it inside IOnUpdate. As per coding guidelines, "OCR, ASR, and QR code processing must be delegated to independent subprocess services launched via AppBootstrap" and "Long-running operations like OCR and downloads must be placed in Service/Scheduler or separate processes, not in IOnUpdate handlers".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs` around lines 74 -
79, The controller currently awaits LLM OCR inline (GetOCREngineAsync ->
selecting _ocrServices where Engine == OCREngine.LLM and calling
LLMOCRService.ExecuteAsync), causing long-running work inside the IOnUpdate
path; change the flow so the controller does not run LLM OCR synchronously:
detect when GetOCREngineAsync returns OCREngine.LLM and instead enqueue or
dispatch the image to the existing subprocess/scheduler service used for
PaddleOCR (or the AppBootstrap-launched OCR subprocess) rather than calling
LLMOCRService.ExecuteAsync directly; keep synchronous/fast handling for update
handlers by selecting the local OCR service (e.g., PaddleOCR) for immediate
ExecuteAsync and push LLM requests to the background scheduler/service (use the
project’s scheduler enqueue method or subprocess launcher used elsewhere) so
IOnUpdate never performs image conversion/temp-file I/O or LLM network calls
inline.
```csharp
protected readonly AdminService AdminService;
protected readonly EditOCRConfService EditOCRConfService;
protected readonly EditOCRConfView EditOCRConfView;
```
Don't keep a mutable view instance on the controller.
TelegramSearchBot/Executor/ControllerExecutor.cs:19-37 reuses controller instances, but this controller stores EditOCRConfView as a field and mutates its _chatId/_replyToMessageId/_messageText per request. Two overlapping admin updates can overwrite each other and send the response to the wrong chat/message. Make the view stateless (Render(chatId, replyTo, message)) or create a fresh instance per execution. As per coding guidelines, "Controllers must be stateless; delegate all persistent operations to Service layer classes".
Also applies to: 22-30, 49-55
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/Controller/Manage/EditOCRConfController.cs` around lines 18
- 20, EditOCRConfController currently holds a mutable EditOCRConfView instance
that gets reused by ControllerExecutor, causing cross-request state corruption;
remove the view field from EditOCRConfController and either (A) change
EditOCRConfView to be stateless by replacing instance
fields/_chatId/_replyToMessageId/_messageText with a Render(chatId,
replyToMessageId, messageText) method, or (B) instantiate a new EditOCRConfView
inside the controller execution path each request before calling its render/send
methods; update all places in EditOCRConfController that read or mutate
EditOCRConfView (e.g., where you set chatId/replyToMessageId/messageText) to use
the stateless Render signature or the fresh instance, ensuring the controller
remains stateless as required by ControllerExecutor.
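A minimal sketch of the stateless-view option (A) could look like this. The `Render` signature is the one suggested above; the `ISendMessageService` dependency and its `SendMessageAsync` call are illustrative assumptions, not the actual `EditOCRConfView` API.

```csharp
// Hypothetical stateless view: every per-request value arrives as a
// parameter, so a shared controller instance can never leak one
// request's chat/message ids into another's response.
public class EditOCRConfView {
    private readonly ISendMessageService _sender; // assumed send abstraction

    public EditOCRConfView(ISendMessageService sender) => _sender = sender;

    public Task Render(long chatId, int replyToMessageId, string messageText)
        => _sender.SendMessageAsync(chatId, messageText, replyToMessageId);
}
```

With no mutable fields, the singleton registration reused by `ControllerExecutor` becomes safe under concurrent updates.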
```csharp
var tempPath = Path.GetTempFileName() + ".jpg";
try {
    using (var fs = new FileStream(tempPath, FileMode.Create, FileAccess.Write)) {
        tg_img_data.SaveTo(fs);
    }
```
🧩 Analysis chain
🌐 Web query:
In .NET, does Path.GetTempFileName() create a file on disk, and what is the recommended way to generate a temporary filename with a custom .jpg extension without leaving the original temp file behind?
💡 Result:
Yes, `Path.GetTempFileName()` creates a uniquely named, zero-byte temporary file on disk with a .tmp extension in the user's temporary folder (`Path.GetTempPath()`). The recommended way to generate a temporary filename with a custom .jpg extension, without creating the original temp file, is to combine `Path.GetTempPath()` with `Path.GetRandomFileName()` (which generates a random filename string without creating a file) and change the extension to .jpg: `string tempJpgPath = Path.Combine(Path.GetTempPath(), Path.ChangeExtension(Path.GetRandomFileName(), ".jpg"));` This produces a cryptographically strong random name (e.g., "a1b2c3d4.e5f6.jpg") that is statistically unique, avoiding collisions and unnecessary file creation. When ready to use it, create the file yourself (e.g., via `File.Create`) and delete it after use, as temp files are not auto-deleted. Alternative using a GUID (slightly less random but unique): `string tempJpgPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString() + ".jpg");`
Citations:
- 1: https://learn.microsoft.com/en-us/dotnet/api/system.io.path.gettempfilename?view=net-7.0
- 2: https://learn.microsoft.com/en-us/dotnet/api/system.io.path.gettempfilename?view=net-10.0
- 3: https://evotec.xyz/difference-between-gettempfilename-and-getrandomfilename-that-got-my-ass-kicked/
- 4: https://learn.microsoft.com/en-us/dotnet/api/system.io.path.getrandomfilename?view=net-8.0
- 5: https://learn.microsoft.com/en-us/dotnet/api/system.io.path.gettempfilename?view=net-8.0
- 6: https://stackoverflow.com/questions/581570/how-can-i-create-a-temp-file-with-a-specific-extension-with-net
- 7: https://www.alwaysdeveloping.net/dailydrop/2022/05/19-temp-file-name/
🏁 Script executed:

```shell
cat -n TelegramSearchBot/Service/AI/OCR/LLMOCRService.cs | head -50
```

Repository: ModerRAS/TelegramSearchBot
Length of output: 2305
Don't leak a temp file on every OCR request.
Path.GetTempFileName() creates a file on disk immediately. Appending .jpg to that path writes the image to a separate file while leaving the original .tmp file behind, steadily growing the temp directory under load.
Fix
```diff
-var tempPath = Path.GetTempFileName() + ".jpg";
+var tempPath = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid():N}.jpg");
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```csharp
var tempPath = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid():N}.jpg");
try {
    using (var fs = new FileStream(tempPath, FileMode.Create, FileAccess.Write)) {
        tg_img_data.SaveTo(fs);
    }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/Service/AI/OCR/LLMOCRService.cs` around lines 30 - 35, The
current use of Path.GetTempFileName() + ".jpg" creates an unused .tmp file each
request; change the tempPath generation in LLMOCRService (tempPath variable) to
create a unique JPG path without pre-creating a file (e.g.,
Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString() + ".jpg") or
Path.GetRandomFileName() + ".jpg"), write the image to that path in your
existing FileStream block, and ensure you delete the tempPath in a
finally/cleanup block after processing to avoid leaking temp files.
```csharp
_logger.LogInformation("正在使用LLM进行OCR识别...");
var result = await _generalLLMService.AnalyzeImageAsync(tempPath, 0);
```
Pass the real chat id into the LLM call.
TelegramSearchBot/Interface/AI/LLM/IGeneralLLMService.cs:23-24 defines the second AnalyzeImageAsync argument as ChatId. Hardcoding 0 here drops the originating chat context, so any chat-scoped LLM routing/configuration inside GeneralLLMService is bypassed. Please thread e.Message.Chat.Id through IOCRService.ExecuteAsync(...) and use it here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/Service/AI/OCR/LLMOCRService.cs` around lines 37 - 38, The
call to _generalLLMService.AnalyzeImageAsync(tempPath, 0) discards the
originating chat context; update IOCRService.ExecuteAsync to accept a chatId
parameter (thread e.Message.Chat.Id from the caller), propagate that chatId into
LLMOCRService (or wherever ExecuteAsync is implemented), and replace the
hardcoded 0 with the passed chatId when calling
_generalLLMService.AnalyzeImageAsync(tempPath, chatId) so chat-scoped
routing/configuration in GeneralLLMService receives the correct ChatId.
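Threading the chat id through could be sketched as follows. The exact `IOCRService` shape is an assumption based on the call sites quoted in this review; only the added `chatId` parameter is the point.

```csharp
// Hypothetical interface change: ExecuteAsync gains a chatId parameter
// so chat-scoped LLM routing in GeneralLLMService is preserved.
public interface IOCRService {
    OCREngine Engine { get; }
    Task<string> ExecuteAsync(Stream image, long chatId);
}

// In LLMOCRService the hardcoded 0 then becomes the caller-supplied id:
// var result = await _generalLLMService.AnalyzeImageAsync(tempPath, chatId);
```

Callers in `AutoOCRController` would pass `e.Message.Chat.Id` through, and `PaddleOCRService` can simply ignore the parameter.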
```csharp
if (command == "退出" || command == "返回") {
    await redis.DeleteKeysAsync();
    return (true, "已退出OCR配置");
}
```
返回 command behavior conflicts with the view prompt
Line 187 tells users “输入返回返回主菜单”, but Line 60 currently treats 返回 as exit and clears state. This creates a misleading UX and can terminate the flow unexpectedly.
💡 Suggested fix

```diff
-if (command == "退出" || command == "返回") {
+if (command == "退出") {
     await redis.DeleteKeysAsync();
     return (true, "已退出OCR配置");
 }
```

```diff
-sb.AppendLine("输入\"返回\"返回主菜单");
+sb.AppendLine("输入任意内容返回主菜单");
```

Also applies to: 187-190
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/Service/Manage/EditOCRConfService.cs` around lines 60 - 63,
The handler currently treats both "退出" and "返回" the same and clears state via
redis.DeleteKeysAsync(), which conflicts with the UI prompt that says
输入`返回`返回主菜单; update the logic so only the "退出" command triggers
redis.DeleteKeysAsync() and returns the exit message, while "返回" should return
(true, "返回主菜单") or equivalent without clearing state; adjust any related
prompt/response text (the view prompt that mentions 输入`返回`返回主菜单) to remain
consistent with this new behavior.
```csharp
if (_stateHandlers.TryGetValue(currentState, out var handler)) {
    return await handler(redis, command);
}

return (false, string.Empty);
}
```
Unknown Redis state should self-recover instead of returning empty response
At Line 74, returning (false, string.Empty) on an unrecognized state gives no user guidance and can dead-end the conversation. Resetting to MainMenu is safer.
💡 Suggested fix

```diff
 if (_stateHandlers.TryGetValue(currentState, out var handler)) {
     return await handler(redis, command);
 }
-return (false, string.Empty);
+await redis.SetStateAsync(OCRConfState.MainMenu.GetDescription());
+return await HandleMainMenuAsync(redis, command);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/Service/Manage/EditOCRConfService.cs` around lines 70 - 75,
The code currently returns (false, string.Empty) when currentState is not found
in _stateHandlers, which leaves the conversation dead-ended; instead reset the
conversation state to MainMenu and return a friendly message. In the else branch
where _stateHandlers.TryGetValue(currentState, out var handler) is false, write
the new state ("MainMenu") back to Redis (or call the existing state setter used
elsewhere), ensure any in-memory state is updated, and return (true, "<brief
prompt for main menu>") or similar so the user is guided back to MainMenu;
reference _stateHandlers, currentState, handler and MainMenu when making the
change.
```csharp
private async Task<(bool, string)> HandleMainMenuAsync(EditOCRConfRedisHelper redis, string command) {
    var sb = new StringBuilder();
    sb.AppendLine("🔧 OCR配置");
    sb.AppendLine();

    var currentEngine = await GetCurrentEngineAsync();
    sb.AppendLine($"当前OCR引擎: {currentEngine}");
    sb.AppendLine();
    sb.AppendLine("请选择操作:");
    sb.AppendLine("1. 切换OCR引擎");
    sb.AppendLine("2. 查看配置详情");
    sb.AppendLine("3. 返回");

    return (true, sb.ToString());
}
```
There was a problem hiding this comment.
Main menu never transitions state, so config flow cannot proceed
At Line 77, the handler ignores command and always re-renders the menu. Inputs like 1/2 never move to SelectingEngine/ViewingConfig, making those flows unreachable from normal interaction.
💡 Suggested fix

```diff
 private async Task<(bool, string)> HandleMainMenuAsync(EditOCRConfRedisHelper redis, string command) {
+    if (command == "1" || command == "切换OCR引擎") {
+        await redis.SetStateAsync(OCRConfState.SelectingEngine.GetDescription());
+        return (true, "请选择OCR引擎:\n1. PaddleOCR\n2. LLM");
+    }
+
+    if (command == "2" || command == "查看配置详情") {
+        await redis.SetStateAsync(OCRConfState.ViewingConfig.GetDescription());
+        return await HandleViewingConfigAsync(redis, command);
+    }
+
     var sb = new StringBuilder();
     sb.AppendLine("🔧 OCR配置");
     sb.AppendLine();
     var currentEngine = await GetCurrentEngineAsync();
     sb.AppendLine($"当前OCR引擎: {currentEngine}");
     sb.AppendLine();
     sb.AppendLine("请选择操作:");
     sb.AppendLine("1. 切换OCR引擎");
     sb.AppendLine("2. 查看配置详情");
     sb.AppendLine("3. 返回");
     return (true, sb.ToString());
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/Service/Manage/EditOCRConfService.cs` around lines 77 - 91,
HandleMainMenuAsync currently ignores the incoming command and always re-renders
the menu; update it to parse the command and set the next interaction state on
the EditOCRConfRedisHelper so the flow can transition (e.g., set redis.State to
SelectingEngine when command == "1", to ViewingConfig when command == "2", and
handle "3" as return/exit). Keep returning the menu text when command is empty
or invalid, but when a valid option is chosen return the appropriate state
change (and a brief confirmation message) so downstream handlers
(SelectingEngine/ViewingConfig) are reachable; reference the HandleMainMenuAsync
method, the redis parameter, and the SelectingEngine/ViewingConfig state names
when making the change.
Fix whitespace formatting issues detected by CI on Windows
Actionable comments posted: 1
♻️ Duplicate comments (1)
TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs (1)
74-79: ⚠️ Potential issue | 🟠 Major

Keep LLM OCR off the `IOnUpdate` path.

When `GetOCREngineAsync()` resolves to `OCREngine.LLM`, this controller now awaits `LLMOCRService.ExecuteAsync(...)`, which does image conversion, temp-file I/O, and the LLM call inline. That makes `ExecuteAsync` a long-running handler and can stall the controller pipeline for other updates. Dispatch the LLM OCR through the same subprocess/scheduler boundary as PaddleOCR instead of doing it inside `IOnUpdate`. As per coding guidelines, "OCR, ASR, and QR code processing must be delegated to independent subprocess services launched via AppBootstrap" and "Long-running operations like OCR and downloads must be placed in Service/Scheduler or separate processes, not in IOnUpdate handlers".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs` around lines 74 - 79, The handler currently calls GetOCREngineAsync() and then invokes ocrService.ExecuteAsync inline, which blocks IOnUpdate when engine == OCREngine.LLM; change the flow so that if GetOCREngineAsync() returns OCREngine.LLM you do not call ExecuteAsync here but instead enqueue the OCR job through the existing subprocess/scheduler boundary (use the app's subprocess/service bootstrap or scheduler API used elsewhere) and return immediately, while keeping the existing inline ExecuteAsync behavior only for non-LLM engines; update references around GetOCREngineAsync, OCREngine.LLM, _ocrServices, and ExecuteAsync to implement this routing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs`:
- Around line 75-78: The code in AutoOCRController is fragile: wrap the config
read (_configService.GetConfigurationValueAsync) in a try/catch and treat
failures as a missing/invalid engine so OCR selection continues; when selecting
the OCR service, use _ocrServices.FirstOrDefault(s => s.Engine == engine) ??
_ocrServices.FirstOrDefault(s => s.Engine == OCREngine.PaddleOCR) and if that
returns null throw a clear InvalidOperationException indicating no OCR
implementations are registered; finally change the log to emit the actual
selected engine (logger.LogInformation($"使用OCR引擎: {ocrService.Engine}")) instead
of the requested value; apply the same changes to the second selection block
around lines 116-121 in AutoOCRController.
---
Duplicate comments:
In `@TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs`:
- Around line 74-79: The handler currently calls GetOCREngineAsync() and then
invokes ocrService.ExecuteAsync inline, which blocks IOnUpdate when engine ==
OCREngine.LLM; change the flow so that if GetOCREngineAsync() returns
OCREngine.LLM you do not call ExecuteAsync here but instead enqueue the OCR job
through the existing subprocess/scheduler boundary (use the app's
subprocess/service bootstrap or scheduler API used elsewhere) and return
immediately, while keeping the existing inline ExecuteAsync behavior only for
non-LLM engines; update references around GetOCREngineAsync, OCREngine.LLM,
_ocrServices, and ExecuteAsync to implement this routing.
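The routing described in the prompts above could be sketched roughly as follows. The `_ocrJobQueue`/`OcrJob` names are hypothetical placeholders for whatever scheduler or subprocess boundary the project already uses for PaddleOCR; the sketch only shows the control flow.

```csharp
// Sketch only: when the configured engine is LLM, enqueue the work and
// return immediately instead of running OCR inside the update handler.
if (engine == OCREngine.LLM) {
    await _ocrJobQueue.EnqueueAsync(new OcrJob {   // hypothetical queue/job types
        ChatId = e.Message.Chat.Id,
        MessageId = e.Message.MessageId,
        Image = PhotoStream
    });
    return; // a background worker performs the LLM call and stores the result
}

// Fast local engines remain inline.
OcrStr = await ocrService.ExecuteAsync(new MemoryStream(PhotoStream));
```

This keeps the `IOnUpdate` handler bounded regardless of LLM latency, at the cost of delivering LLM OCR results asynchronously.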
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 68887bb4-edea-4c7d-9ee0-ea1cc6fbb3d3
📒 Files selected for processing (23)
- TelegramSearchBot.Common/Attributes/McpAttributes.cs
- TelegramSearchBot.Common/Model/Tools/IterationLimitReachedPayload.cs
- TelegramSearchBot.Database/Migrations/20260303031828_AddUserWithGroupUniqueIndex.cs
- TelegramSearchBot.Database/Migrations/20260313124507_AddChannelWithModelIsDeleted.cs
- TelegramSearchBot.LLM.Test/Service/AI/LLM/GeneralLLMServiceTests.cs
- TelegramSearchBot.LLM.Test/Service/AI/LLM/ModelCapabilityServiceTests.cs
- TelegramSearchBot.LLM.Test/Service/AI/LLM/OpenAIProviderHistorySerializationTests.cs
- TelegramSearchBot.LLM/Service/AI/LLM/GeminiService.cs
- TelegramSearchBot.LLM/Service/AI/LLM/McpToolHelper.cs
- TelegramSearchBot.LLM/Service/AI/LLM/OllamaService.cs
- TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs
- TelegramSearchBot.LLM/Service/Mcp/McpClient.cs
- TelegramSearchBot.LLM/Service/Mcp/McpServerManager.cs
- TelegramSearchBot.LLM/Service/Tools/FileToolService.cs
- TelegramSearchBot.Test/Helper/WordCloudHelperTests.cs
- TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs
- TelegramSearchBot/Extension/ServiceCollectionExtension.cs
- TelegramSearchBot/Model/AI/OCRConfState.cs
- TelegramSearchBot/Service/AI/OCR/LLMOCRService.cs
- TelegramSearchBot/Service/BotAPI/SendMessageService.Streaming.cs
- TelegramSearchBot/Service/Manage/EditOCRConfService.cs
- TelegramSearchBot/Service/Storage/MessageService.cs
- TelegramSearchBot/Service/Vector/FaissVectorService.cs
✅ Files skipped from review due to trivial changes (20)
- TelegramSearchBot.LLM.Test/Service/AI/LLM/OpenAIProviderHistorySerializationTests.cs
- TelegramSearchBot.Common/Model/Tools/IterationLimitReachedPayload.cs
- TelegramSearchBot.Common/Attributes/McpAttributes.cs
- TelegramSearchBot/Service/BotAPI/SendMessageService.Streaming.cs
- TelegramSearchBot.LLM.Test/Service/AI/LLM/GeneralLLMServiceTests.cs
- TelegramSearchBot/Extension/ServiceCollectionExtension.cs
- TelegramSearchBot.LLM/Service/AI/LLM/GeminiService.cs
- TelegramSearchBot.Database/Migrations/20260303031828_AddUserWithGroupUniqueIndex.cs
- TelegramSearchBot/Service/Vector/FaissVectorService.cs
- TelegramSearchBot.LLM/Service/Mcp/McpClient.cs
- TelegramSearchBot.LLM/Service/Mcp/McpServerManager.cs
- TelegramSearchBot.LLM.Test/Service/AI/LLM/ModelCapabilityServiceTests.cs
- TelegramSearchBot.Database/Migrations/20260313124507_AddChannelWithModelIsDeleted.cs
- TelegramSearchBot.Test/Helper/WordCloudHelperTests.cs
- TelegramSearchBot.LLM/Service/AI/LLM/OpenAIService.cs
- TelegramSearchBot.LLM/Service/Tools/FileToolService.cs
- TelegramSearchBot.LLM/Service/AI/LLM/McpToolHelper.cs
- TelegramSearchBot.LLM/Service/AI/LLM/OllamaService.cs
- TelegramSearchBot/Model/AI/OCRConfState.cs
- TelegramSearchBot/Service/Storage/MessageService.cs
🚧 Files skipped from review as they are similar to previous changes (2)
- TelegramSearchBot/Service/AI/OCR/LLMOCRService.cs
- TelegramSearchBot/Service/Manage/EditOCRConfService.cs
```csharp
var ocrService = _ocrServices.FirstOrDefault(s => s.Engine == engine)
    ?? _ocrServices.First(s => s.Engine == OCREngine.PaddleOCR);

logger.LogInformation($"使用OCR引擎: {engine}");
```
Harden the fallback path and log the engine actually used.
This only falls back when the stored value is empty or invalid. If _configService.GetConfigurationValueAsync(...) throws, the handler still fails before OCR starts, and Line 76 still throws if PaddleOCR is not registered. In the fallback case, Line 78 also logs the requested engine instead of the selected service. That regresses the PR’s “default PaddleOCR” behavior under transient config or DI issues.
🛠️ Suggested hardening
```diff
-var engine = await GetOCREngineAsync();
-var ocrService = _ocrServices.FirstOrDefault(s => s.Engine == engine)
-    ?? _ocrServices.First(s => s.Engine == OCREngine.PaddleOCR);
-
-logger.LogInformation($"使用OCR引擎: {engine}");
+var engine = await GetOCREngineAsync();
+var ocrService = _ocrServices.FirstOrDefault(s => s.Engine == engine);
+if (ocrService is null) {
+    logger.LogWarning(
+        "Configured OCR engine {ConfiguredEngine} is unavailable. Falling back to PaddleOCR.",
+        engine);
+    ocrService = _ocrServices.FirstOrDefault(s => s.Engine == OCREngine.PaddleOCR)
+        ?? throw new InvalidOperationException("No PaddleOCR service is registered.");
+}
+
+logger.LogInformation("使用OCR引擎: {Engine}", ocrService.Engine);
 OcrStr = await ocrService.ExecuteAsync(new MemoryStream(PhotoStream));
@@
-private async Task<OCREngine> GetOCREngineAsync() {
-    var engineStr = await _configService.GetConfigurationValueAsync(EditOCRConfService.OCREngineKey);
-    if (!string.IsNullOrEmpty(engineStr) && Enum.TryParse<OCREngine>(engineStr, out var engine)) {
-        return engine;
-    }
-    return OCREngine.PaddleOCR;
-}
+private async Task<OCREngine> GetOCREngineAsync() {
+    try {
+        var engineStr = await _configService.GetConfigurationValueAsync(EditOCRConfService.OCREngineKey);
+        if (!string.IsNullOrEmpty(engineStr) &&
+            Enum.TryParse<OCREngine>(engineStr, ignoreCase: true, out var engine)) {
+            return engine;
+        }
+    } catch (Exception ex) {
+        logger.LogWarning(ex, "Failed to load OCR engine configuration. Falling back to PaddleOCR.");
+    }
+
+    return OCREngine.PaddleOCR;
+}
```

Also applies to: 116-121
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@TelegramSearchBot/Controller/AI/OCR/AutoOCRController.cs` around lines 75 -
78, The code in AutoOCRController is fragile: wrap the config read
(_configService.GetConfigurationValueAsync) in a try/catch and treat failures as
a missing/invalid engine so OCR selection continues; when selecting the OCR
service, use _ocrServices.FirstOrDefault(s => s.Engine == engine) ??
_ocrServices.FirstOrDefault(s => s.Engine == OCREngine.PaddleOCR) and if that
returns null throw a clear InvalidOperationException indicating no OCR
implementations are registered; finally change the log to emit the actual
selected engine (logger.LogInformation($"使用OCR引擎: {ocrService.Engine}")) instead
of the requested value; apply the same changes to the second selection block
around lines 116-121 in AutoOCRController.
Change summary
Implements Issue #241 - adds LLM-based OCR as an alternative to PaddleOCR
New features
IOCRService interface and OCREngine enum
LLMOCRService implementation
OCR configuration state machine
AutoOCRController enhancements
Usage
Send the following command in a private chat with the bot to configure OCR:
File changes
New files:
Modified files:
Summary by CodeRabbit
New Features
Chores