
Allow simultaneous inputs on AIChat#934

Draft
deathbyknowledge wants to merge 3 commits into main from resumability

Conversation

@deathbyknowledge (Member) commented Feb 18, 2026

Fixes #24: AIChatAgent now handles multiple messages arriving simultaneously by queuing them and processing one at a time.

Problem

When multiple chat messages arrived before the first response completed, the onChatMessage → _reply cycles would race, causing:

  • Interleaved streaming responses
  • Potential message corruption
  • Unpredictable behavior

Solution

Added a request queue that serializes chat message processing:

  • _chatRequestQueue — Array of pending work functions
  • _chatRequestInFlight — Guard to prevent concurrent processing
  • _enqueueChatRequest() — Adds work to queue, kicks off drain if idle
  • _drainChatRequestQueue() — Processes queue items sequentially
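The queue pattern described above can be sketched in isolation. The field and method names mirror the PR description, but the body below is an assumption for illustration, not the actual implementation:

```typescript
// Minimal sketch of the serialization pattern from the PR description.
// Names (_chatRequestQueue, _enqueueChatRequest, _drainChatRequestQueue)
// come from the PR text; the bodies are assumed.
type ChatWork = () => Promise<void>;

class ChatQueue {
  private _chatRequestQueue: ChatWork[] = [];
  private _chatRequestInFlight = false;

  // Adds work to the queue and kicks off a drain if one isn't running.
  _enqueueChatRequest(work: ChatWork): void {
    this._chatRequestQueue.push(work);
    if (!this._chatRequestInFlight) {
      void this._drainChatRequestQueue();
    }
  }

  // Processes queued items one at a time; the while/shift loop naturally
  // picks up items enqueued while earlier work was still running.
  private async _drainChatRequestQueue(): Promise<void> {
    this._chatRequestInFlight = true;
    try {
      while (this._chatRequestQueue.length > 0) {
        const work = this._chatRequestQueue.shift()!;
        try {
          await work();
        } catch (err) {
          // Errors are contained so one failed request can't wedge the queue.
          console.error("chat request failed", err);
        }
      }
    } finally {
      this._chatRequestInFlight = false;
    }
  }
}
```

Because the guard flag is checked synchronously before the first `await`, two back-to-back enqueues start exactly one drain, and the second item waits its turn.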

Changes

  • Queue mechanism for serializing onChatMessage calls
  • Tool result continuations now use the queue instead of waitForStream() + 500ms timeout hack
  • Tool approval continuations similarly refactored
  • Removed dead _streamCompletionPromise / _streamCompletionResolve code (no longer needed)
  • Queue cleared on destroy() and CF_AGENT_CHAT_CLEAR

@changeset-bot (bot) commented Feb 18, 2026

🦋 Changeset detected

Latest commit: acf2e6f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

  Name                 Type
  @cloudflare/ai-chat  Patch


@pkg-pr-new (bot) commented Feb 18, 2026


npm i https://pkg.pr.new/cloudflare/agents@934
npm i https://pkg.pr.new/cloudflare/agents/@cloudflare/ai-chat@934
npm i https://pkg.pr.new/cloudflare/agents/@cloudflare/codemode@934
npm i https://pkg.pr.new/cloudflare/agents/hono-agents@934

commit: acf2e6f

@deathbyknowledge (Member, Author) commented:

/bonk review pretty please

@ask-bonk (bot, Contributor) commented Feb 18, 2026

remote: Permission to cloudflare/agents.git denied to github-actions[bot].
fatal: unable to access 'https://github.com/cloudflare/agents/': The requested URL returned error: 403

github run

@ask-bonk (bot, Contributor) commented Feb 18, 2026

Bonk workflow failed. Check the logs for details.

View workflow run

@deathbyknowledge deathbyknowledge marked this pull request as ready for review February 18, 2026 14:57
@ask-bonk (bot, Contributor) commented Feb 18, 2026

I now have a thorough understanding of the PR. Let me write up my review.


PR Review: Allow simultaneous inputs on AIChat

Summary

This PR replaces the _streamCompletionPromise/_streamCompletionResolve mechanism with a request queue (_chatRequestQueue + _enqueueChatRequest + _drainChatRequestQueue) that serializes all onChatMessage → _reply cycles. This is a solid architectural improvement.

What's Good

  1. The core queue design is clean and correct. The _enqueueChatRequest / _drainChatRequestQueue pattern is a well-understood serialization approach. The while loop with shift() in _drainChatRequestQueue naturally handles items added during processing.

  2. Eliminates the 500ms timeout hack. The old code had a fragile setTimeout(resolve, 500) fallback when _streamCompletionPromise was null (a known race). The queue approach is strictly better because it serializes by construction — no timers needed.

  3. Consistent application. All three call sites (initial chat request, tool result continuation, tool approval continuation) use the same _enqueueChatRequest path, making the control flow uniform and easier to reason about.

  4. Proper cleanup. Queue is cleared on destroy() and CF_AGENT_CHAT_CLEAR, matching the lifecycle of other state.

Issues

1. Fire-and-forget _enqueueChatRequest — uncaught rejections if DO hibernates (Medium)

packages/ai-chat/src/index.ts:433

The chat request handler now calls this._enqueueChatRequest(...) and immediately returns. The _drainChatRequestQueue method is async but its returned promise is never awaited or passed to ctx.waitUntil(). The comment says "the queue drains via ctx.waitUntil" but I don't see that actually happening.

In the old code, the handler returned the result of this._tryCatchChat(...), which meant the Durable Object's WebSocket handler awaited the full reply. Now, the onMessage handler returns undefined while _drainChatRequestQueue runs in the background. If the DO hibernates or the isolate is evicted mid-drain, queued work silently disappears.

The same issue applies to tool result and tool approval continuations — they were already fire-and-forget via .then(), so this isn't a regression there, but the initial chat request path is a regression.

Suggestion: Either:

  • Have _enqueueChatRequest call this.ctx.waitUntil(drainPromise) so the runtime knows work is outstanding, or
  • Return the drain promise from the onMessage handler for the initial chat request case.
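The first suggestion can be sketched with a mocked execution context. Only ctx.waitUntil is the real Durable Object API here; QueuedAgent and its members are hypothetical stand-ins for the PR's internals:

```typescript
// Sketch of the waitUntil suggestion: register the drain promise with the
// runtime so queued work isn't silently dropped on hibernation/eviction.
// ExecutionContextLike mocks the Durable Object context for illustration.
interface ExecutionContextLike {
  waitUntil(promise: Promise<unknown>): void;
}

class QueuedAgent {
  private queue: Array<() => Promise<void>> = [];
  private inFlight = false;

  constructor(private ctx: ExecutionContextLike) {}

  enqueue(work: () => Promise<void>): void {
    this.queue.push(work);
    if (!this.inFlight) {
      // Hand the drain promise to the runtime so it knows work is
      // outstanding, instead of fire-and-forget.
      this.ctx.waitUntil(this.drain());
    }
  }

  private async drain(): Promise<void> {
    this.inFlight = true;
    try {
      while (this.queue.length > 0) {
        await this.queue.shift()!();
      }
    } finally {
      this.inFlight = false;
    }
  }
}
```

Because `inFlight` is set synchronously before the first `await`, only one drain promise is ever registered per burst of enqueues.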

2. _drainChatRequestQueue is called but not awaited (Low)

packages/ai-chat/src/index.ts:739

if (!this._chatRequestInFlight) {
  this._drainChatRequestQueue(); // async, not awaited
}

This is intentional (fire-and-forget kickoff), but combined with issue #1, it means there's no back-pressure. If the caller doesn't hold the promise, the only thing keeping the drain alive is the event loop. Consider storing the drain promise so destroy() can await it for graceful shutdown.

3. Stale connection closure in queued callbacks (Low-Medium)

packages/ai-chat/src/index.ts:433-478

The queued callback captures connection from the enclosing onMessage scope. If multiple messages arrive from different connections, each queued callback closes over its respective connection. This is correct for preserving the sender's identity, but worth noting: if a connection disconnects while its request is still queued, the agentContext.run will execute with a stale connection reference. The abortSignal won't fire because cancellation requires an explicit CF_AGENT_CHAT_REQUEST_CANCEL message.

This isn't a bug per se (the old code had the same behavior with a single in-flight request), but the queue makes it more likely since requests can wait longer. Consider checking connection.readyState before executing a queued callback, or wiring up the abort signal to connection close.

4. Missing changeset (Low)

The changeset bot flagged this — this is a behavioral change to @cloudflare/ai-chat (fixes a race condition) and should have a patch changeset.

Nits

  • The → in the JSDoc for _enqueueChatRequest (onChatMessage → _reply cycle) was changed from -> in the diff. Either is fine, but keep it consistent with the rest of the codebase (the rest uses -> or plain English).

Missing Test Coverage

There are no tests for the actual queueing behavior — i.e., verifying that two concurrent chat requests are processed sequentially rather than interleaved. The existing client-tools-continuation.test.ts exercises the continuation path but doesn't send overlapping requests. A test that sends two CF_AGENT_USE_CHAT_REQUEST messages back-to-back and asserts both responses complete in order would be valuable.
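A standalone simulation shows the shape of the suggested assertion; SerialAgent and its chunk format are invented for illustration, and the real test would drive CF_AGENT_USE_CHAT_REQUEST messages through the project's WebSocket harness instead:

```typescript
// Simulate two overlapping chat requests through a serializing queue and
// record streamed chunks so a test can assert they never interleave.
type Chunk = { id: string; part: number };

class SerialAgent {
  private queue: Array<() => Promise<void>> = [];
  private draining = false;
  readonly log: Chunk[] = [];

  request(id: string): void {
    this.queue.push(async () => {
      for (let part = 0; part < 3; part++) {
        // Simulate async streaming work with small variable delays.
        await new Promise((r) => setTimeout(r, Math.random() * 5));
        this.log.push({ id, part });
      }
    });
    if (!this.draining) void this.drain();
  }

  private async drain(): Promise<void> {
    this.draining = true;
    while (this.queue.length > 0) {
      await this.queue.shift()!();
    }
    this.draining = false;
  }
}
```

The assertion worth making is on the chunk log, not just the completion order: all of request 1's chunks must precede all of request 2's.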

Verdict

The approach is correct and the code is clean. The main concern is issue #1 — the initial chat request is now fire-and-forget where it wasn't before, which could cause silent work loss on hibernation. If that's addressed (even just a ctx.waitUntil), this is good to merge with a patch changeset.

github run

@elithrar (Collaborator) commented:

@ask-bonk (bot, Contributor) commented Feb 18, 2026

I read the PR diff, the changeset file, and key sections of packages/ai-chat/src/index.ts (the queue implementation, destroy method, cancel logic, and onMessage handler), along with the existing test files, to understand the full context. I then wrote up a review confirming the queue approach is correct while flagging that the ctx.waitUntil comment is misleading: no such call exists, making the drain fire-and-forget and potentially lossy on hibernation.

github run

@threepointone (Contributor) commented:

May I please have some tests

@threepointone (Contributor) commented:

should we update the ai-chat example for this? what's the UX like? also, is this behaviour always desirable / should it be a flag?

@deathbyknowledge (Member, Author) commented:

> May I please have some tests

Donezo. Added one to check response order of new messages and one to make sure the queue empties on clear

Tests verify that:
- Multiple simultaneous messages are processed in order without interleaving
- Queue is cleared when chat history is cleared
@mattzcarey (Contributor) commented Feb 27, 2026

Matt's AI review:

Critical concerns:

  1. No abort on clear/close — The queue is cleared but the currently in-flight request keeps running. After a CF_AGENT_CHAT_CLEAR, the active request can still write messages post-clear. Need to abort it or gate on a generation token.

  2. Errors are swallowed — Previously _tryCatchChat was awaited directly. Now errors inside queued handlers are only logged in _drainChatRequestQueue and never propagate to the call site. This changes behavior for anything expecting failures to surface (metrics, retries, WS handler lifecycle).

  3. Stream completion timing — The old explicit wait for stream completion was removed. If _reply resolves before the stream is fully persisted, the next queued message could start too early, re-introducing the interleaving bug the queue was intended to prevent. Worth verifying _reply resolves only after stream end/persistence.

  4. Silent message loss on clear — this._chatRequestQueue.length = 0 drops queued requests with no response or error sent back to the client. Could lead to "lost" user inputs with no feedback.
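The generation-token idea in concern #1 can be sketched minimally; all names here are hypothetical:

```typescript
// A generation counter that invalidates in-flight work on clear: the
// counter is bumped on clear, and any request holding an older token
// must discard its writes.
class GenerationGate {
  private generation = 0;

  // Called on CF_AGENT_CHAT_CLEAR: invalidates all outstanding tokens.
  clear(): void {
    this.generation++;
  }

  // Captured by a request when it starts.
  snapshot(): number {
    return this.generation;
  }

  // Checked before each persisted write; false means a clear happened.
  isCurrent(token: number): boolean {
    return token === this.generation;
  }
}
```

A request would capture `gate.snapshot()` when it starts and check `gate.isCurrent(token)` before each write, so that after a clear bumps the generation, late writes from the pre-clear request are discarded instead of landing in fresh history.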

Test gaps:

  • Ordering test only checks done messages, not that partial chunks aren't interleaved — could false-positive if chunks interleave but finish in order
  • Clear test doesn't assert that req-1 was actually dropped or that no response arrives for it after clear
  • Random delay in TestChatAgent could cause flakes — deterministic delays tied to message content would be more robust

@mattzcarey (Contributor) commented Feb 27, 2026

looks good otherwise. preemptively stamping but some of the above could be good to finalise.

@threepointone (Contributor) commented:

> should we update the ai-chat example for this? what's the UX like? also, is this behaviour always desirable / should it be a flag?

Still looking for answers on this

@deathbyknowledge (Member, Author) commented:

> should we update the ai-chat example for this? what's the UX like? also, is this behaviour always desirable / should it be a flag?

My bad, I never saw this comment. No changes to the examples are needed because the UX hasn't changed at all; the changes are only to the AIChat private internals.

Right now, if you send 2 messages you'll get interleaved stream chunks in the same stream, and the data race produces garbage outputs, so I don't think adding a flag for it is worth it.

That being said, we might want to eventually add a flag to decide when queued messages are processed (after the first tool response or after the final LLM response). Right now only final LLM is available so I think that can be done in a different PR?

@threepointone (Contributor) commented:

hokay. I'll review this later today.

@deathbyknowledge deathbyknowledge marked this pull request as draft March 2, 2026 10:00


Development

Successfully merging this pull request may close these issues.

ai chat agent should handle multiple inputs coming in at the same time

4 participants