
Improve telemetry error classification to reduce 'unknown' error types #1409

Merged
eleanorjboyd merged 2 commits into microsoft:main from StellaHuang95:moreErrorTypes
Mar 27, 2026

@StellaHuang95
Contributor

Problem

The classifyError() function in the telemetry error classifier had limited coverage — only 6 error categories were recognized. Many errors from manager registration (MANAGER_REGISTRATION_FAILED) were falling into the 'unknown' bucket, making it difficult to diagnose failures from telemetry data.

After tracing every code path that leads to MANAGER_REGISTRATION_FAILED telemetry (pipenv, pyenv, poetry, conda, system, and shellStartupVars registration), we identified 6 common error patterns that were unclassified.

Changes

Added 6 new error types to DiscoveryErrorType and corresponding classification logic:

| New Type | What It Catches | Detection Method |
| --- | --- | --- |
| tool_not_found | "Conda not found", "Python extension not found", "Poetry executable not found" | Message pattern |
| command_failed | Failed to run "conda ...", Failed to run poetry ..., Error spawning conda: ... | Message pattern |
| connection_error | PET process dies mid-request, JSON-RPC connection closed/disposed | instanceof rpc.ConnectionError |
| rpc_error | PET returns a JSON-RPC error response (e.g., internal error, method not found) | instanceof rpc.ResponseError |
| process_crash | PET restarting, failed after N restart attempts, failed to create stdio streams | Message pattern |
| already_registered | Duplicate manager registration (race condition) | instanceof BaseError |

Every message pattern is based on exact throw new Error(...) strings found in the codebase — no speculative matching.
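
For readers who have not opened the diff, here is a minimal sketch of how the three message-pattern rows in the table above could be matched. It is not the code from this PR: the helper name `classifyByMessage`, the `MessagePatternType` alias, and every regular expression are assumptions modeled on the example messages quoted in the table.

```typescript
// Hypothetical sketch: covers only the three "Message pattern" rows of the table.
type MessagePatternType = 'tool_not_found' | 'command_failed' | 'process_crash';

// Each regex is an assumption derived from the example messages in the table,
// not a copy of the patterns used in the PR.
const MESSAGE_PATTERNS: ReadonlyArray<[RegExp, MessagePatternType]> = [
    [/failed to run|error spawning/i, 'command_failed'],
    [/restarting|restart attempts|failed to create stdio streams/i, 'process_crash'],
    // Broadest pattern goes last so it cannot shadow the more specific ones.
    [/not found/i, 'tool_not_found'],
];

function classifyByMessage(message: string): MessagePatternType | 'unknown' {
    for (const [pattern, type] of MESSAGE_PATTERNS) {
        if (pattern.test(message)) {
            return type;
        }
    }
    return 'unknown';
}

// Usage with illustrative messages:
console.log(classifyByMessage('Conda not found'));
console.log(classifyByMessage('Error spawning conda: exited early'));
```

Keeping the patterns in an ordered table makes the precedence explicit: a message such as `Failed to run "conda": executable not found` would be reported as command_failed rather than tool_not_found in this sketch.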

Ordering

The check order is designed to avoid misclassification:

  1. instanceof checks first (most reliable, no false positives)
  2. error.code checks for Node.js errno codes
  3. Message patterns from most specific to least specific (e.g., 'not found' is after 'json' to avoid catching JSON validation errors)
  4. Error name fallback for cancellation variants
  5. 'unknown' as final fallback
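
The five steps above can be sketched as a single function. This is an illustration under stated assumptions, not the merged implementation: `ConnectionError`, `ResponseError`, and `BaseError` are local stand-ins for `rpc.ConnectionError`, `rpc.ResponseError`, and the extension's `BaseError`; the names `spawn_enoent`, `permission_denied`, `parse_error`, and `canceled` are hypothetical labels for the pre-existing categories, which this description does not list; and the string checks are guesses based on the table in the Changes section.

```typescript
// Local stand-ins so the sketch runs without dependencies (assumptions, see lead-in).
class ConnectionError extends Error {}
class ResponseError extends Error {}
class BaseError extends Error {}

type SketchErrorType =
    | 'connection_error' | 'rpc_error' | 'already_registered'
    | 'tool_not_found' | 'command_failed' | 'process_crash'
    // Hypothetical names for categories that existed before this PR.
    | 'spawn_enoent' | 'permission_denied' | 'parse_error' | 'canceled'
    | 'unknown';

function classifyErrorSketch(error: unknown): SketchErrorType {
    // 1. instanceof checks first: they do not depend on message wording.
    if (error instanceof ConnectionError) return 'connection_error';
    if (error instanceof ResponseError) return 'rpc_error';
    if (error instanceof BaseError) return 'already_registered';
    if (!(error instanceof Error)) return 'unknown';

    // 2. Node.js errno codes attached to the error object.
    const code: unknown = (error as Error & { code?: unknown }).code;
    if (code === 'ENOENT') return 'spawn_enoent';
    if (code === 'EACCES' || code === 'EPERM') return 'permission_denied';

    // 3. Message patterns, most specific first; 'json' precedes 'not found'.
    const message = error.message;
    if (/json/i.test(message)) return 'parse_error';
    if (/failed to run|error spawning/i.test(message)) return 'command_failed';
    if (/restart|failed to create stdio streams/i.test(message)) return 'process_crash';
    if (/not found/i.test(message)) return 'tool_not_found';

    // 4. Error name fallback for cancellation variants.
    if (/cancel/i.test(error.name)) return 'canceled';

    // 5. Final fallback.
    return 'unknown';
}

// A JSON-RPC "Method not found" response is caught by step 1, not by 'not found'.
console.log(classifyErrorSketch(new ResponseError('Method not found')));
```

The ordering matters most where messages overlap: in this sketch a `ResponseError` whose message contains "not found" is classified by its type in step 1, and a JSON validation message that mentions "not found" is caught by the 'json' check before the broader pattern runs.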

Testing

Added 7 new test cases covering all new error types with real-world error messages from the codebase. All 15 tests pass.

@eleanorjboyd merged commit c6e4fe7 into microsoft:main on Mar 27, 2026
16 checks passed