Enabling event-streaming operations

Server-sent events (SSE) are a core web feature that gives servers a low-overhead way to push real-time events to a client as they become available. SSE can be used to stream chat completions from a large language model, real-time stock prices, or sensor readings to clients. SSE is similar to WebSockets in that it uses a persistent connection, but it differs in that it is unidirectional (only the server sends events) and is simpler to implement in many existing backend HTTP frameworks.

INFO

Speakeasy makes it easy to build this feature into your SDKs without any vendor extensions or heuristics. You enable it simply by modeling SSE streams as text/event-stream responses in pure OpenAPI!

Here's a short example of using an SDK to chat with an LLM and read its response as a stream.


import { SDK } from '@speakeasy/sdk';

const sdk = new SDK();
const response = await sdk.chat.create({
  prompt: "What are the top 3 French cheeses by consumption?",
});
for await (const event of response.chatStream) {
  process.stdout.write(event.data);
}

INFO

This feature is currently supported in TypeScript, Python and Go. Let us know if you'd like to see support for other languages.

Modelling SSE in OpenAPI

To get started, you'll need to model an API endpoint that serves an event stream in your OpenAPI document. The main requirement is that each server-sent event can contain up to four fields: id, event, data, and retry. Below is an example of an operation that streams events containing only a data field that holds string content:


paths:
  /chat:
    post:
      summary: Create a chat completion from a prompt
      operationId: create
      tags: [chat]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
      responses:
        '200':
          description: Chat completion created
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/ChatStream'
components:
  schemas:
    ChatRequest:
      type: object
      required: [prompt]
      properties:
        prompt:
          type: string
    ChatStream:
      description: A server-sent event containing chat completion content
      type: object
      required: [data]
      properties:
        data:
          type: string
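
To make the four fields concrete, here is a minimal sketch of how one raw event block from a text/event-stream response could be split into id, event, data, and retry. The `ServerSentEvent` type and `parseEventBlock` function are hypothetical names chosen for illustration; they are not part of a generated SDK, and a real client also has to handle chunk boundaries and reconnection.

```typescript
// Hypothetical helper for illustration; not part of any generated SDK.
interface ServerSentEvent {
  id?: string;
  event?: string;
  data?: string;
  retry?: number;
}

// Parse one raw event block (the text between two blank lines) into its fields.
function parseEventBlock(block: string): ServerSentEvent {
  const result: ServerSentEvent = {};
  const dataLines: string[] = [];

  for (const line of block.split(/\r\n|\r|\n/)) {
    // Skip empty lines and comment lines, which start with a colon.
    if (line === "" || line.startsWith(":")) continue;

    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    // A single leading space after the colon is not part of the value.
    if (value.startsWith(" ")) value = value.slice(1);

    switch (field) {
      case "id":
        result.id = value;
        break;
      case "event":
        result.event = value;
        break;
      case "data":
        dataLines.push(value);
        break;
      case "retry":
        // retry is only kept when it consists solely of digits.
        if (/^\d+$/.test(value)) result.retry = Number(value);
        break;
      // Any other field name is ignored.
    }
  }

  // Multiple data lines are joined with newlines.
  if (dataLines.length > 0) result.data = dataLines.join("\n");
  return result;
}

const parsedEvent = parseEventBlock(
  "id: 1\nevent: completion\ndata: Hello\ndata: world\nretry: 3000",
);
console.log(parsedEvent);
```

With the ChatStream schema above, only the data field would be populated for each event.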

Data is not limited to strings, however. If you specify that data is an object, SDKs will assume the field contains JSON content. When raw data is received from the server, it is deserialized into an object for application code to consume.


components:
  schemas:
    ChatStream:
      description: A server-sent event containing chat completion content
      type: object
      required: [data]
      properties:
        data:
          type: object
          properties:
            content:
              type: string
            model:
              type: string
              enum: ["foo-gpt-tiny", "foo-gpt-small"]
            created:
              type: integer

As an example for TypeScript, the generated SDK will now allow users to access this object:


for await (const event of response.chatStream) {
  const { content, model, created } = event.data;
  process.stdout.write(content);
}
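
Conceptually, the deserialization step amounts to parsing the raw data string as JSON before handing it to application code. The sketch below illustrates that idea only; `ChatData` and `decodeChatData` are hypothetical names, and the code a generated SDK actually uses may differ (for example, by validating each property).

```typescript
// Shape of the `data` object in the ChatStream schema above.
// No property is marked as required there, so all are optional here.
interface ChatData {
  content?: string;
  model?: "foo-gpt-tiny" | "foo-gpt-small";
  created?: number;
}

// Hypothetical helper: turn the raw `data` string of an event into an object.
function decodeChatData(raw: string): ChatData {
  const value: unknown = JSON.parse(raw);
  // Reject anything that is not a plain JSON object.
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error("expected a JSON object in the data field");
  }
  return value as ChatData;
}

const chatData = decodeChatData(
  '{"content":"Brie","model":"foo-gpt-tiny","created":1700000000}',
);
console.log(chatData.content, chatData.model, chatData.created);
```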

Other streaming APIs send multiple types of events, each of which may include id and event fields. These can be described as a union (oneOf) with the event field acting as a discriminator:


components:
  schemas:
    ChatStream:
      oneOf:
        - $ref: '#/components/schemas/HeartbeatEvent'
        - $ref: '#/components/schemas/ChatEvent'
      discriminator:
        propertyName: event
        mapping:
          ping: '#/components/schemas/HeartbeatEvent'
          completion: '#/components/schemas/ChatEvent'
    HeartbeatEvent:
      description: A server-sent event indicating that the server is still processing the request
      type: object
      required: [event]
      properties:
        event:
          type: string
          const: "ping"
    ChatEvent:
      description: A server-sent event containing chat completion content
      type: object
      required: [id, event, data]
      properties:
        id:
          type: string
        event:
          type: string
          const: completion
        data:
          type: object
          required: [content]
          properties:
            content:
              type: string
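
On the consuming side, a discriminated union like this maps naturally onto a TypeScript union type that can be narrowed by switching on the event field. The sketch below hand-writes types that mirror the schema names above; the types emitted by a generated SDK may be named or shaped differently, and `collectContent` is a hypothetical helper.

```typescript
// Hand-written types mirroring the schemas above (an assumption, not SDK output).
interface HeartbeatEvent {
  event: "ping";
}

interface ChatEvent {
  id: string;
  event: "completion";
  data: { content: string };
}

type ChatStream = HeartbeatEvent | ChatEvent;

// Concatenate completion content and ignore heartbeats.
function collectContent(events: Iterable<ChatStream>): string {
  let text = "";
  for (const ev of events) {
    switch (ev.event) {
      case "ping":
        // Heartbeats carry no data; nothing to do.
        break;
      case "completion":
        // Narrowed to ChatEvent here, so `data.content` is available.
        text += ev.data.content;
        break;
    }
  }
  return text;
}

const sampleEvents: ChatStream[] = [
  { event: "ping" },
  { id: "1", event: "completion", data: { content: "Brie, " } },
  { event: "ping" },
  { id: "2", event: "completion", data: { content: "Camembert" } },
];
console.log(collectContent(sampleEvents)); // prints "Brie, Camembert"
```

Because the event field is a literal type on each member, the compiler checks that `data` is only accessed on completion events.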

Note that across all these examples, the schema for the events only ever specifies one or more of the four recognized fields. Adding other fields will trigger a validation error when generating an SDK with the speakeasy CLI or GitHub Action.
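
If you want to catch this before running generation, a quick check over the property names of an event schema is enough to spot stray fields. The following is a sketch only: `unrecognizedFields` is a hypothetical helper and does not reproduce the validation logic of the speakeasy CLI.

```typescript
// The four field names an event schema may declare.
const RECOGNIZED_FIELDS = new Set(["id", "event", "data", "retry"]);

// Hypothetical pre-flight check: return property names that are not recognized.
function unrecognizedFields(properties: Record<string, unknown>): string[] {
  return Object.keys(properties).filter((key) => !RECOGNIZED_FIELDS.has(key));
}

// `timestamp` is not one of the four fields, so it is reported.
const extraFields = unrecognizedFields({
  data: { type: "string" },
  timestamp: { type: "integer" },
});
console.log(extraFields);
```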