
AI chatbot (Gemini)

Streaming chatbot backed by Google Gemini. The Durable Object holds the conversation; the React UI subscribes to state and re-renders incrementally as each SSE chunk triggers a setState. No streaming plumbing in the client.

What you'll learn
  • How `ctx.waitUntil` lets the agent stream work after the HTTP response returns
  • How to turn SSE chunks into setState updates that the UI renders incrementally
  • Adding secrets via `.dev.vars` without editing the generated wrangler config
01 step

Start from the blank scaffold (with UI)

Same starter as the blank template, but with react, react-dom, and matching @types packages preinstalled so your agent.ts can have an app.tsx next to it. The default project is a Counter agent; we'll replace it with this example's agent in the next steps.

~/my-agent-app
my-app/ (--with-ui scaffold)
agent.ts
app.tsx
package.json
tsconfig.json
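To see the shape you're about to replace: the scaffold's default Counter agent follows the same state/setState contract the chat agent in step 03 uses. A sketch (the real file imports `Agent` from the `agents` package; here a minimal stand-in base class is inlined so the snippet runs on its own, and the generated file's details may differ):

```typescript
// Minimal stand-in for the framework's Agent base class: hypothetical,
// just enough to show the state/setState contract the scaffold relies on.
class Agent<Env, State> {
  state!: State;
  initialState!: State;
  setState(next: State): void {
    this.state = next;
  }
}

type CounterState = { count: number };

// Shape of the scaffold's default agent: one state field, one request handler.
export default class Counter extends Agent<unknown, CounterState> {
  override initialState: CounterState = { count: 0 };

  async onRequest(_request: Request): Promise<Response> {
    const count = (this.state?.count ?? this.initialState.count) + 1;
    this.setState({ count });
    return Response.json(this.state);
  }
}
```

The chat agent in step 03 is this same pattern with a richer state type and a background task appended to it.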
02 step

Replace counter/ with chat/, add Gemini key

`.dev.vars` is wrangler-native: put env vars there for `bun run dev` and they show up on `this.env.*` in the agent. `.dev.vars` is gitignored by the blank scaffold's .gitignore. For deploys you use `wrangler secret put` instead.

~/my-agent-app
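Concretely, the dev half of this step is one line in `.dev.vars` (the variable name matches what the agent reads in step 03; the value here is a placeholder):

```shell
# .dev.vars lives next to package.json; the scaffold's .gitignore already excludes it.
echo 'GOOGLE_API_KEY=your-key-here' >> .dev.vars
```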
03 step

agents/chat/agent.ts — streaming into state

Three things make this work: (1) POST returns immediately after kicking off streaming; (2) `ctx.waitUntil` keeps the worker alive long enough for the stream to finish; (3) each SSE chunk calls setState to append to the in-flight assistant message. State sync ships diffs to every connected tab.

agents/chat/agent.ts ts
import { Agent } from "agents";
import type { GeneratedEnv } from "@ayjnt/env";

type Message = { id: string; role: "user" | "assistant"; text: string; at: number };
type State = { messages: Message[]; streaming: boolean; streamingId: string | null };

export default class ChatAgent extends Agent<GeneratedEnv, State> {
  override initialState: State = { messages: [], streaming: false, streamingId: null };

  override async onRequest(request: Request): Promise<Response> {
    if (request.method !== "POST") return Response.json({ instance: this.name, ...this.state });

    const { text } = (await request.json()) as { text: string };
    const userMsg: Message = { id: crypto.randomUUID(), role: "user", text, at: Date.now() };
    const assistantId = crypto.randomUUID();
    const assistantMsg: Message = { id: assistantId, role: "assistant", text: "", at: Date.now() };

    this.setState({
      messages: [...this.state.messages, userMsg, assistantMsg],
      streaming: true, streamingId: assistantId,
    });

    // Fire-and-forget: HTTP returns now, generation continues in the background.
    // ctx.waitUntil keeps the worker alive until the promise resolves.
    this.ctx.waitUntil(this.streamReply(assistantId));

    return Response.json({ ok: true, assistantId });
  }

  private async streamReply(assistantId: string) {
    const history = this.state.messages
      .filter((m) => m.id !== assistantId)
      .map((m) => ({ role: m.role === "user" ? "user" : "model", parts: [{ text: m.text }] }));

    const url = `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=${this.env.GOOGLE_API_KEY}`;
    const res = await fetch(url, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ contents: history }),
    });
    if (!res.ok || !res.body) {
      // Upstream failure: clear the streaming flag so the UI doesn't stay locked.
      this.setState({ ...this.state, streaming: false, streamingId: null });
      return;
    }

    const reader = res.body.getReader();
    const dec = new TextDecoder();
    let buf = "";
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buf += dec.decode(value, { stream: true });
      const lines = buf.split("\n");
      buf = lines.pop() ?? "";
      for (const line of lines) {
        if (!line.startsWith("data: ")) continue;
        try {
          const chunk = JSON.parse(line.slice(6));
          const text = chunk.candidates?.[0]?.content?.parts?.[0]?.text;
          if (text) {
            this.setState({
              ...this.state,
              messages: this.state.messages.map((m) =>
                m.id === assistantId ? { ...m, text: m.text + text } : m),
            });
          }
        } catch { /* malformed data line; skip it */ }
      }
    }
    this.setState({ ...this.state, streaming: false, streamingId: null });
  }
}
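The split/pop buffering above is the subtle part: a network chunk can end mid-line, so only complete lines get parsed and the trailing partial stays in `buf` for the next read. A self-contained sketch of that loop as a pure function (the payload shape mirrors the `candidates[0].content.parts[0].text` path used above; the sample chunks are made up):

```typescript
// Feed one decoded chunk into the line buffer. Returns any text pieces
// extracted from complete `data: ` lines, plus the new buffer contents
// (the trailing partial line, if any).
function feedChunk(buf: string, chunk: string): { texts: string[]; buf: string } {
  const lines = (buf + chunk).split("\n");
  const rest = lines.pop() ?? ""; // partial last line: keep buffering
  const texts: string[] = [];
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    try {
      const payload = JSON.parse(line.slice(6));
      const text = payload.candidates?.[0]?.content?.parts?.[0]?.text;
      if (text) texts.push(text);
    } catch {
      // A complete line that still fails to parse is malformed; skip it.
    }
  }
  return { texts, buf: rest };
}

// One SSE event arriving split across two network chunks:
const event = `data: ${JSON.stringify({
  candidates: [{ content: { parts: [{ text: "Hello" }] } }],
})}\n`;
const first = feedChunk("", event.slice(0, 10));      // mid-line: nothing parsed yet
const second = feedChunk(first.buf, event.slice(10)); // line completes: "Hello" comes out
```

In the agent, each non-empty `texts` entry becomes one setState call appending to the in-flight assistant message.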
04 step

agents/chat/app.tsx — UI that renders incrementally

The UI doesn't care about streaming. It reads state, renders the messages, and disables the input while `state.streaming` is true. The realtime feel comes from state sync firing on every chunk.

agents/chat/app.tsx tsx
import { useState } from "react";
import { useAgent } from "@ayjnt/chat";

export default function Chat() {
  const agent = useAgent();
  const [draft, setDraft] = useState("");
  const messages = agent.state?.messages ?? [];
  const streaming = agent.state?.streaming ?? false;

  const send = async () => {
    if (!draft.trim() || streaming) return;
    const text = draft; setDraft("");
    await fetch(window.location.pathname, { method: "POST", body: JSON.stringify({ text }) });
  };

  return (
    <main>
      <h1>chat — {agent.name}</h1>
      {messages.map((m) => (
        <div key={m.id} className={m.role}>
          <b>{m.role}:</b> {m.text}{m.id === agent.state?.streamingId && <span>▍</span>}
        </div>
      ))}
      <input disabled={streaming} value={draft} onChange={(e) => setDraft(e.target.value)}
        onKeyDown={(e) => e.key === "Enter" && send()} />
    </main>
  );
}
05 step

Run + ask a question

`bun run dev` picks up `.dev.vars` automatically (a wrangler convention). Open the URL, ask something, and watch tokens land in the assistant bubble.

~/my-agent-app
06 step

What it looks like

The assistant bubble fills in word by word with a caret. The input is disabled until streaming completes. Open the same URL in a second tab — the streaming text appears there simultaneously.

/chat/demo — streaming result
  ┌──────────────────────────────────────────────────────┐
  │ chat — demo                       [new conversation] │
  ├──────────────────────────────────────────────────────┤
  │                                                      │
  │  USER                                                │
  │  tell me a haiku about cloudflare workers            │
  │                                                      │
  │                                          ASSISTANT   │
  │                     Isolates in flight,              │
  │                     Milliseconds bloom like          │
  │                     code at every edge ▍             │
  │                                                      │
  ├──────────────────────────────────────────────────────┤
  │ [thinking…]                                   [send] │
  └──────────────────────────────────────────────────────┘
                          ↑
           state.messages.at(-1).text grows on every
           setState() call — one per SSE chunk
07 step

Deploy — wrangler secret put

For production, put the key into wrangler's secret store. Same env variable name; same code path — no branch needed for dev vs deploy.

~/my-agent-app
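For reference, the deploy half uses the same variable name against wrangler's secret store (`wrangler secret put` prompts for the value on stdin, so the key never lands in a file):

```shell
# One-time per environment: store the key as a deploy secret.
wrangler secret put GOOGLE_API_KEY
# Deploy; this.env.GOOGLE_API_KEY now resolves from the secret store.
wrangler deploy
```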