SOFTWARE ENGINEERING · 2025-06-16 · 4 MIN READ

Building Accessible UIs for AI Outputs: Rendering Markdown and Streaming Text Cleanly


Streaming AI responses into a UI often feels like a race against the network. If you aren't careful, you end up with flickering text, broken syntax highlighting, and a jarring experience for screen reader users. I’ve spent the last few months refining how we handle these streams, and it comes down to balancing raw performance with DOM stability.

The Challenge of Incremental Rendering

When you stream text from an LLM, you are essentially appending chunks to a buffer. If you re-render the entire Markdown tree on every chunk, your CPU usage spikes, and the cursor position in an input field or the focus state of a screen reader gets destroyed.

The trick is to decouple the "raw stream" from the "DOM update." I treat the incoming stream as a state machine. I buffer the raw string, and only trigger a Markdown re-parse when the chunk boundary makes sense—usually on a newline or a sentence terminator.
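
As a rough illustration of the boundary check described above, here is a minimal sketch. The helper name splitAtBoundary and the exact boundary rule (a newline, or ., ! or ? followed by whitespace) are assumptions for this example, not the precise logic from my production code.

```typescript
// Hypothetical helper: split a streamed buffer into the part that is safe to
// render now and the tail that should wait for more chunks.
export interface BoundarySplit {
  ready: string;   // text up to and including the last boundary
  pending: string; // trailing text that has not reached a boundary yet
}

export function splitAtBoundary(buffer: string): BoundarySplit {
  // Assumed boundary rule: a newline, or ., ! or ? followed by whitespace.
  const boundary = /\n|[.!?](?=\s)/g;
  let cut = 0;
  let match: RegExpExecArray | null;
  // Walk every boundary and remember where the last one ends.
  while ((match = boundary.exec(buffer)) !== null) {
    cut = match.index + match[0].length;
  }
  return { ready: buffer.slice(0, cut), pending: buffer.slice(cut) };
}
```

Only the ready part would be handed to the Markdown renderer. When the stream ends, whatever is left in pending needs to be rendered as well, or the last partial sentence is lost.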

Implementation: The Buffered Renderer

In my current React-based stack, I use react-markdown paired with remark-gfm for tables and lists. To keep it snappy, I throttle state updates so the Markdown tree re-renders at most once every 50ms, no matter how fast chunks arrive. Here is how I handle the stream ingestion:

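import { useState, useEffect, useRef } from 'react';
import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';

const FLUSH_INTERVAL_MS = 50;

// Custom hook: reads the stream into a buffer and publishes the buffer to
// React state at most once every FLUSH_INTERVAL_MS.
export const useAiStream = (stream: ReadableStream<Uint8Array> | null) => {
  const [content, setContent] = useState('');
  const bufferRef = useRef('');

  useEffect(() => {
    if (!stream) return;

    bufferRef.current = '';
    setContent('');
    let cancelled = false;
    let timer: ReturnType<typeof setTimeout> | null = null;

    const reader = stream.getReader();
    const decoder = new TextDecoder();

    // Copy the buffer into state, unless the effect has been cleaned up
    const flush = () => {
      timer = null;
      if (!cancelled) setContent(bufferRef.current);
    };

    const read = async () => {
      try {
        while (!cancelled) {
          const { done, value } = await reader.read();
          if (done) break;

          // { stream: true } keeps multi-byte characters intact across chunks
          bufferRef.current += decoder.decode(value, { stream: true });

          // Optimization: only update state every 50ms to prevent
          // excessive DOM thrashing during high-speed generation
          if (timer === null) timer = setTimeout(flush, FLUSH_INTERVAL_MS);
        }
        // Stream finished: emit any bytes the decoder held back, then
        // publish the final text immediately
        bufferRef.current += decoder.decode();
        if (timer !== null) clearTimeout(timer);
        flush();
      } catch (err) {
        if (!cancelled) console.error('Stream read failed', err);
      }
    };

    read();

    // Stop reading and drop pending updates on unmount or stream change
    return () => {
      cancelled = true;
      if (timer !== null) clearTimeout(timer);
      reader.cancel().catch(() => {});
    };
  }, [stream]);

  return content;
};

// Wrapper component (the name is illustrative): hooks return data, so the
// JSX lives in a component rather than in the hook itself
export const AiMessage = ({ stream }: { stream: ReadableStream<Uint8Array> | null }) => {
  const content = useAiStream(stream);
  return (
    <div className="prose max-w-none" aria-live="polite">
      <ReactMarkdown remarkPlugins={[remarkGfm]}>
        {content}
      </ReactMarkdown>
    </div>
  );
};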

Accessibility Considerations

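The aria-live="polite" attribute is non-negotiable here. Without a live region, most screen readers do not announce the streamed content at all, because it is inserted into the DOM without moving focus. Setting it to polite (rather than assertive) queues each update behind whatever the screen reader is currently speaking instead of interrupting it. It does not batch updates on its own, so throttling the state updates still matters: every change to the region can trigger a separate announcement.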

One common mistake I see is failing to handle code blocks. When an AI generates a code block, it often leaves the triple backticks open while it streams. Until the closing fence arrives, a Markdown parser treats everything after the opening backticks as code, so the user may copy a half-written snippet and the highlighter re-tokenizes a block that keeps changing. I’ve started implementing a "Loading" state for code fences specifically, ensuring the syntax highlighter only runs once the block is closed.
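
One simple way to decide when to show that "Loading" state is to count fence lines in the buffered text. The sketch below is a simplification with an assumed helper name (hasOpenCodeFence): it only recognizes triple-backtick fences at the start of a line and ignores tilde fences and fences longer than three backticks.

```typescript
// Hypothetical helper: returns true while the buffered Markdown ends inside
// a fenced code block that has not been closed yet.
export function hasOpenCodeFence(markdown: string): boolean {
  let open = false;
  for (const line of markdown.split('\n')) {
    // A fence line starts with ``` after at most three spaces of indentation.
    // Each fence line toggles between "inside a block" and "outside a block".
    if (/^\s{0,3}```/.test(line)) open = !open;
  }
  return open;
}
```

While this returns true, the UI can render the trailing block as plain preformatted text with a loading indicator, and switch to the highlighted version once it returns false.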

Architectural Trade-offs

I’ve experimented with two approaches for rendering:

  1. Client-side Markdown Parsing: This is what I showed above. It’s fast and reduces server load, but it can hit performance limits if the AI response is massive (e.g., a 5,000-word essay).
  2. Server-side Streaming (The "HTML Pipe"): You can parse the Markdown on the server and stream raw HTML. This is faster for the browser but makes it harder to implement features like "Copy to Clipboard" or inline editing later.

For most AI chat interfaces, I stick to client-side parsing because it allows for better control over the UI components (like injecting custom buttons into code blocks).

Debugging Tips

  • The Flickering Cursor: If your text jumps around while streaming, check your CSS. Give the container a min-height so space is reserved before the text arrives, or use contain: content; so that layout and paint work stays scoped to the container as the text grows.
  • Syntax Highlighter Lag: If you are using Prism or Highlight.js, don't run it on the entire document on every update. Use a memoized component that only re-runs highlighting when the code block content changes.
  • Partial Markdown: If the LLM cuts off mid-word, don't worry about the UI looking broken. Users expect a "typing" experience. However, if it cuts off inside a Markdown link or table, the visual structure will collapse. I handle this by checking for dangling special characters before passing the string to the renderer.
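
For the partial Markdown case, here is a minimal sketch of a dangling-character check, limited to inline links. The helper name holdBackDanglingLink is an assumption for this example, and tables or emphasis markers would need their own checks.

```typescript
// Matches an unfinished inline link at the very end of the text, such as
// "[label" or "[label](https://exam".
const UNFINISHED_LINK = /\[[^\]\n]*(?:\]\([^)\n]*)?$/;

// Hypothetical helper: returns the text with a trailing unfinished link cut
// off, so the renderer never sees a half-written link.
export function holdBackDanglingLink(markdown: string): string {
  const match = UNFINISHED_LINK.exec(markdown);
  return match ? markdown.slice(0, match.index) : markdown;
}
```

The full buffer stays untouched. Only the string passed to the renderer is trimmed, so the link appears in one piece once its closing parenthesis arrives.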

Building these interfaces is less about the AI itself and more about how you manage the user's perception of time. Keep the DOM lean, respect the browser's paint cycle, and your users will thank you for the stable experience.


engineering

Aditya Shenvi

AI Engineer & Full-Stack Architect. Passionate about building intelligent systems, elegant UIs, and scaling web infrastructure. Open to exciting engineering opportunities in April 2026 and beyond.
