feat(sdk): Communicate Mode SDK (on_relay) for Python and TypeScript #618

Open
khaliqgant wants to merge 11 commits into main from feature/communicate-mode-sdk

@khaliqgant (Member) commented Mar 21, 2026

Summary

  • Add Communicate Mode SDK (on_relay) enabling agents to receive tasks via relay messaging
  • TypeScript and Python adapter implementations with per-adapter subpath exports
  • Spec-compliant ping/pong, auto-detect module matching, and withRelay alias
  • Connection retry and error handling
  • Multiple rounds of review fixes (spec compliance, adapter improvements, Devin review findings)

Commits

  • Initial on_relay SDK for Python and TypeScript
  • Spec compliance fixes (ping/pong, module matching)
  • Per-adapter subpath exports and withRelay alias
  • Connection retry and error handling
  • Multiple review rounds addressing feedback

Test plan

  • Verify TypeScript adapter connects and receives relay messages
  • Verify Python adapter connects and receives relay messages
  • Test ping/pong keepalive mechanism
  • Test auto-detect module matching
  • Test connection retry on disconnect
  • Verify subpath exports work (@agent-relay/sdk/communicate)

🤖 Generated with Claude Code


khaliqgant and others added 10 commits March 13, 2026 22:30
Implement the Connect SDK spec — a new "Communicate Mode" that lets
any framework agent join Relaycast with a single on_relay() call.

Python SDK:
- Relay core (lazy WebSocket, send/post/reply/inbox/agents/close)
- RelayTransport (HTTP + WS, auto-reconnect, exponential backoff)
- 6 framework adapters: OpenAI Agents, Claude SDK, Google ADK, Agno, Swarms, CrewAI
- Tier 1 (Push) adapters inject messages mid-execution via hooks/callbacks
- Tier 2 (Poll) adapters surface messages at tool-call boundaries
- 96 tests (unit + integration + cross-framework)

TypeScript SDK:
- Relay core, transport, types mirroring Python API
- 2 framework adapters: Pi (session.steer), Claude SDK (PostToolUse hook)
- 16 tests (unit + integration + cross-framework)

Docs:
- communicate.mdx overview + 7 per-framework guides
- Plain markdown mirrors (docs-sync rule)
- Updated introduction.mdx with Communicate Mode section

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Resolves npm ci failure due to missing @sinclair/typebox@0.34.48 in lock file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add subpath exports for ./communicate/adapters/pi and
  ./communicate/adapters/claude-sdk so users can import directly
- Add withRelay as alias for onRelay (spec references both names)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add ping/pong handling to both Python and TypeScript transports
  (server pings were silently ignored, causing connection drops)
- Fix OpenAI Agents module check: "agents" not "openai_agents"
  (pip package imports as `agents`, not `openai_agents`)
- Use startswith() instead of `in` for module matching per spec
- Remove Claude SDK from Python auto-detect (spec says import
  adapter directly since it uses query options, not agent objects)
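The difference between prefix and substring matching in the module check can be sketched as follows. This is an illustration in TypeScript of the string logic only, not the SDK's Python auto-detect code. Only the `agents` prefix comes from the commit message; the other table entries and both function names are hypothetical.

```typescript
// Hypothetical prefix table: only "agents" (OpenAI Agents) is confirmed by the
// commit message; the other entries are illustrative guesses.
const FRAMEWORK_PREFIXES: ReadonlyArray<readonly [string, string]> = [
  ["agents", "openai-agents"],
  ["crewai", "crewai"],
  ["agno", "agno"],
];

/** Prefix match: returns the framework whose prefix starts the module name. */
function detectFramework(moduleName: string): string | undefined {
  for (const [prefix, framework] of FRAMEWORK_PREFIXES) {
    // "agents.agent" matches; "my_tools.agents_helper" does not.
    if (moduleName.startsWith(prefix)) return framework;
  }
  return undefined;
}

/** Substring match, kept only to show the false positive it produces. */
function detectFrameworkBySubstring(moduleName: string): string | undefined {
  for (const [prefix, framework] of FRAMEWORK_PREFIXES) {
    if (moduleName.includes(prefix)) return framework;
  }
  return undefined;
}
```

With the substring variant, an unrelated module such as `my_tools.agents_helper` would be treated as an OpenAI Agents object; the prefix variant rejects it.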

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CRITICAL:
- Make communicate import lazy in __init__.py (was crashing SDK users
  without aiohttp installed via eager import chain)
- Guard aiohttp import with helpful ImportError message
- Standardize Claude SDK adapter signature to (options, relay) matching
  other adapters, with relay optional

HIGH:
- Fix OpenAI Agents import: `from agents import function_tool` (not
  `from openai_agents`) — package imports as `agents`
- Fix CrewAI import: `from crewai.tools import tool` (not
  `from langchain_core.tools`) — wrong dependency
- Make relay param optional in all adapters with auto-creation
- Add ImportError messages for missing framework packages
- Fix instructions_wrapper to accept *args, **kwargs (OpenAI Agents
  and Agno pass (ctx, agent) to instructions callable)

MEDIUM:
- Add readonly to TypeScript Message interface fields
- Add per-framework optional dependency groups to pyproject.toml

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Python SDK:
- Fix all 5 adapter compliance issues (prepend inbox, await async
  callables, Content parts for ADK, dynamic crewai backstory,
  claude_sdk signature with explicit name param)
- Fix test import mismatches (agents vs openai_agents, crewai.tools
  vs langchain_core.tools, claude_sdk positional arg order)
- Add fallback HTTP polling when WebSocket fails
- Fix message routing "both" case (callbacks + buffer)
- Conditional __all__ exports for communicate module
- Add auto-detect on_relay() tests
- Add pytest-cov to dev deps

TypeScript SDK:
- Add transport.test.ts with 19 tests (HTTP, WebSocket, errors)
- Add JSDoc to all public functions in core, types, adapters
- Replace any types with proper type guards in index.ts
- Reduce Pi adapter from 129→101 lines via data-driven tool defs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix claude_sdk adapter argument order: on_relay(options, relay, *, name)
  to match other adapters' (agent, relay) pattern
- Fix return type annotation: callable -> Callable[[], None] in core.py
- URL-encode WebSocket token in transport.py to prevent connection failures
- Add retry for aiohttp.ClientError in HTTP transport (was only retrying 5xx)
- Add Relay.peek() method and use it in OpenAI Agents/Agno instructions
  wrappers to avoid draining inbox and starving the relay_inbox tool
- Remove transport re-export from TS communicate/index.ts (internal detail)
- Improve error message for TS onRelay() auto-detection failures
- Add SIGTERM/SIGINT handlers for autoCleanup in TS core.ts
- Use thread-based fallback in _run_sync for async context compatibility
- Fix @agent-relay/config version: 3.1.23 -> 3.2.1 in SDK package.json
- Update spec: withRelay references -> onRelay
- Add per-framework guide links to docs/markdown/communicate.md (docs sync)
- Update all tests for new signatures
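The peek-versus-inbox distinction mentioned in this commit, together with the "both" routing case (callbacks and buffer), can be sketched as a small buffer class. This is a minimal illustration, not the SDK's Relay implementation: the `InboxBuffer` class, its method names other than `peek` and `inbox`, and the `Message` fields are all assumptions.

```typescript
// Hypothetical message shape; the real Message interface is not shown on this page.
interface Message {
  readonly sender: string;
  readonly text: string;
}

type MessageCallback = (message: Message) => void;

class InboxBuffer {
  private buffer: Message[] = [];
  private callbacks: MessageCallback[] = [];

  /** Registers a callback and returns a function that unregisters it. */
  onMessage(callback: MessageCallback): () => void {
    this.callbacks.push(callback);
    return () => {
      this.callbacks = this.callbacks.filter((cb) => cb !== callback);
    };
  }

  /** "Both" routing: always buffer the message, then notify callbacks. */
  deliver(message: Message): void {
    this.buffer.push(message);
    for (const callback of this.callbacks) callback(message);
  }

  /** Returns a copy of the pending messages without removing them. */
  peek(): Message[] {
    return this.buffer.slice();
  }

  /** Returns the pending messages and empties the buffer. */
  inbox(): Message[] {
    const drained = this.buffer;
    this.buffer = [];
    return drained;
  }
}
```

A caller that only wants to show pending messages can use `peek()` so that a later `inbox()` call still sees them; calling `inbox()` twice in a row returns an empty list the second time.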

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add connect-dedup in Python Relay._ensure_connected to prevent
  concurrent double-registration (mirrors TS connectPromise pattern)
- Guard communicate import in shim agent_relay/__init__.py with
  try/except for users without aiohttp
- TS handleTransportMessage now always buffers messages (spec: "both"
  case — callbacks AND inbox), matching Python behavior
- Google ADK adapter preserves original before_model_callback return
  value for short-circuit support
- TS transport retries on network errors with exponential backoff,
  matching Python transport behavior
- Exclude completion-pipeline and e2e-owner-review vitest tests from
  tsconfig.json (in addition to tsconfig.build.json)
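Retrying on network errors with exponential backoff, as described for the TS transport, might look roughly like the helper below. It is a sketch under assumptions: the function name, option names, and default values are hypothetical and are not taken from the SDK.

```typescript
interface RetryOptions {
  retries?: number;      // maximum number of retries after the first attempt
  baseDelayMs?: number;  // delay before the first retry
  maxDelayMs?: number;   // upper bound on any single delay
  shouldRetry?: (error: unknown) => boolean;
}

/** Runs `operation`, retrying failures with delays of base, 2x base, 4x base, ... */
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const {
    retries = 3,
    baseDelayMs = 100,
    maxDelayMs = 5_000,
    shouldRetry = () => true,
  } = options;

  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up when the retry budget is spent or the error is not retryable.
      if (attempt >= retries || !shouldRetry(error)) throw error;
      const delay = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The `shouldRetry` hook is where a transport could distinguish retryable failures (network errors, 5xx responses) from errors that should surface immediately, such as configuration or authentication failures.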

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Cherry-picked communicate-related changes from 2a9b7ef.
- Python _ensure_connected: wrap in outer try/except to resolve _connect_future on fallback exception
- TypeScript ensureConnected: reset connectPromise on rejection for retry
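The connectPromise pattern referred to here (share one in-flight connection attempt, and reset it on rejection so a later call can retry) can be sketched as follows. The `LazyConnection` class and its constructor argument are hypothetical; only the names `ensureConnected` and `connectPromise` come from the commit messages.

```typescript
class LazyConnection {
  private connectPromise: Promise<void> | undefined;
  private readonly connect: () => Promise<void>;

  constructor(connect: () => Promise<void>) {
    this.connect = connect;
  }

  /** Concurrent callers share one connection attempt; a failed attempt is forgotten. */
  ensureConnected(): Promise<void> {
    if (!this.connectPromise) {
      this.connectPromise = this.connect().catch((error: unknown) => {
        // Reset so the next call starts a fresh attempt instead of
        // returning the same rejected promise forever.
        this.connectPromise = undefined;
        throw error;
      });
    }
    return this.connectPromise;
  }
}
```

Without the reset in the `catch` handler, a single failed connection would poison every later `ensureConnected()` call; without the shared promise, two concurrent callers could each register with the server.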

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Cherry-picked communicate-related changes from 65cd6cf.
- TS onRelay: empty object {} now falls through to Claude SDK adapter
- Python openai_agents/agno: use inbox() instead of peek() to drain messages
- Python core: re-raise RelayConfigError/RelayAuthError before fallback
- Python crewai: always call inbox_sync() regardless of event loop state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@devin-ai-integration bot (Contributor) left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

Resolve conflicts by accepting main's deletions for restructured docs,
preserving feature branch's SDK improvements for communicate adapters,
and accepting main's package-lock.json and config files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

workspace: config.workspace ?? process.env.RELAY_WORKSPACE,
apiKey: config.apiKey ?? process.env.RELAY_API_KEY,
baseUrl: trimTrailingSlashes(config.baseUrl ?? process.env.RELAY_BASE_URL ?? DEFAULT_RELAY_BASE_URL),
baseUrl: (config.baseUrl ?? process.env.RELAY_BASE_URL ?? DEFAULT_RELAY_BASE_URL).replace(/\/+$/, ''),

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data (High)

This regular expression that depends on library input may run slow on strings with many repetitions of '/'.
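One way to avoid the flagged regular expression is to trim trailing slashes with a plain loop, which does a single linear scan from the end of the string. The diff above references a `trimTrailingSlashes` helper whose body is not shown on this page, so the implementation below is an assumption about what such a helper could look like, not the SDK's actual code.

```typescript
/** Removes every trailing "/" from `value` with one backward scan, no regex. */
function trimTrailingSlashes(value: string): string {
  let end = value.length;
  // 47 is the character code of "/".
  while (end > 0 && value.charCodeAt(end - 1) === 47) {
    end--;
  }
  return value.slice(0, end);
}
```

Because the loop only walks backwards over the run of slashes once, its cost stays proportional to the input length regardless of how many slashes the input contains.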
@devin-ai-integration bot (Contributor) left a comment

Devin Review found 1 new potential issue.

View 8 additional findings in Devin Review.

Comment on lines +55 to +58
await new Promise<void>((resolve) => {
socket.once('close', () => resolve());
socket.close();
});

🔴 WebSocket disconnect can hang indefinitely after timeout removal

The disconnect() method previously used Promise.race with a 2-second timeout to prevent hanging if the WebSocket server is unresponsive during the close handshake. The new code awaits the close event with no timeout at all. If the remote server is unresponsive, the ws library's close() sends a close frame and then waits for the TCP connection to fully terminate, which can take minutes (TCP keepalive/timeout). This causes disconnect() — and anything awaiting it like Relay.close() — to hang indefinitely.

Old code with timeout vs new code without

Old code:

await Promise.race([
  new Promise<void>((resolve) => {
    socket.once('close', () => resolve());
    socket.close();
  }),
  new Promise<void>((resolve) => setTimeout(resolve, 2_000)),
]);

New code:

await new Promise<void>((resolve) => {
  socket.once('close', () => resolve());
  socket.close();
});

Callers like Relay.close() (packages/sdk/src/communicate/core.ts:118) directly await this.transport.disconnect(), so the hang propagates to user code in finally blocks and cleanup paths.

Suggested change
- await new Promise<void>((resolve) => {
-   socket.once('close', () => resolve());
-   socket.close();
- });
+ await Promise.race([
+   new Promise<void>((resolve) => {
+     socket.once('close', () => resolve());
+     socket.close();
+   }),
+   new Promise<void>((resolve) => setTimeout(resolve, 2_000)),
+ ]);
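A variant of the suggested change that also clears the timer once the socket closes can be sketched as below. With a bare `Promise.race`, the 2-second timer stays scheduled even after a prompt close; clearing it avoids that. The `ClosableSocket` interface and the `closeWithTimeout` name are hypothetical stand-ins for the parts of the socket API used in the snippets above.

```typescript
// Minimal view of the socket methods used in the review snippet (assumed shape).
interface ClosableSocket {
  once(event: "close", listener: () => void): unknown;
  close(): void;
}

/** Closes the socket, resolving on the 'close' event or after `timeoutMs`. */
function closeWithTimeout(
  socket: ClosableSocket,
  timeoutMs = 2_000,
): Promise<"closed" | "timeout"> {
  return new Promise<"closed" | "timeout">((resolve, reject) => {
    // Fallback in case the peer never completes the close handshake.
    const timer = setTimeout(() => resolve("timeout"), timeoutMs);
    socket.once("close", () => {
      clearTimeout(timer); // do not leave the timer pending after a clean close
      resolve("closed");
    });
    try {
      socket.close();
    } catch (error) {
      clearTimeout(timer);
      reject(error);
    }
  });
}
```

Returning `"closed"` or `"timeout"` lets a caller such as a cleanup path log or force-terminate the connection when the handshake did not complete, rather than silently continuing.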