Litecord Voice Server Protocol (LVSP)

LVSP is a protocol for Litecord to communicate with an external component dedicated to voice data. The voice server is responsible for the Discord Voice Websocket and Voice UDP connections.

LVSP runs over a long-lived websocket with TLS. The encoding is JSON.

OP code table

"client" is litecord. "server" is the voice server.

opcode	name	sent by
0	HELLO	server
1	IDENTIFY	client
2	RESUME	client
3	READY	server
4	HEARTBEAT	client
5	HEARTBEAT_ACK	server
6	INFO	client / server

Message structure

Message data is defined by each opcode.

Note: the snowflake type follows the same rules as the Discord Gateway's snowflake type, that is, a string encoding a Discord Snowflake.

field	type	description
op	integer, opcode	operation code
d	map[string, any]	message data
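As an illustration of the structure above, here is a minimal Python sketch that encodes and decodes the `op` / `d` envelope as JSON. The names `OpCode`, `encode_message` and `decode_message` are hypothetical helpers, not part of LVSP or Litecord.

```python
import json
from enum import IntEnum


class OpCode(IntEnum):
    """Opcodes copied from the OP code table."""
    HELLO = 0
    IDENTIFY = 1
    RESUME = 2
    READY = 3
    HEARTBEAT = 4
    HEARTBEAT_ACK = 5
    INFO = 6


def encode_message(op: OpCode, data: dict) -> str:
    """Serialize a message into a JSON string with the `op` and `d` fields."""
    return json.dumps({"op": int(op), "d": data})


def decode_message(raw: str) -> tuple[OpCode, dict]:
    """Parse a JSON string back into (opcode, message data)."""
    payload = json.loads(raw)
    return OpCode(payload["op"]), payload["d"]
```

A decoder built this way raises `ValueError` on an unknown opcode and `json.JSONDecodeError` on malformed JSON. How a real implementation reacts to either case is left open here.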

High level overview

connect, receive HELLO
send IDENTIFY or RESUME
if RESUME, process incoming messages as if they were post-READY
receive READY
start HEARTBEAT'ing
send INFO messages as needed
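The steps above can be read as a small client-side state machine. The sketch below is one possible reading, not a reference implementation: `Phase` and `on_receive` are hypothetical names, and treating a resumed connection as immediately active is an assumption drawn from the RESUME step.

```python
from enum import Enum, auto

# Opcodes from the OP code table.
HELLO, IDENTIFY, RESUME, READY = 0, 1, 2, 3


class Phase(Enum):
    AWAIT_HELLO = auto()  # connected, nothing received yet
    AWAIT_READY = auto()  # IDENTIFY sent, waiting for READY
    ACTIVE = auto()       # heartbeating, INFO messages allowed


def on_receive(phase: Phase, op: int, resuming: bool = False):
    """Return (next_phase, opcode_to_send or None) for a received opcode."""
    if phase is Phase.AWAIT_HELLO and op == HELLO:
        if resuming:
            # Assumption: after RESUME, messages are handled as post-READY.
            return Phase.ACTIVE, RESUME
        return Phase.AWAIT_READY, IDENTIFY
    if phase is Phase.AWAIT_READY and op == READY:
        return Phase.ACTIVE, None
    raise ValueError(f"unexpected opcode {op} in phase {phase.name}")
```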

Error codes

code	meaning
4000	general error. reconnect
4001	authentication failure
4002	decode error, given message failed to decode as json

HELLO message

Sent by the server when a connection is established.

field	type	description
heartbeat_interval	integer	interval between heartbeats, in milliseconds
nonce	string	random 10-character string used in authentication

IDENTIFY message

Sent by the client to identify itself.

field	type	description
token	string	`HMAC(SHA256, key=[secret shared between server and client], message=[nonce from HELLO])`
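A minimal sketch of the token computation, using Python's standard `hmac` and `hashlib` modules. The function names are hypothetical. Hex encoding of the digest and UTF-8 encoding of the nonce are assumptions, since the table does not specify how the HMAC output is turned into a string.

```python
import hashlib
import hmac


def make_identify_token(shared_secret: bytes, nonce: str) -> str:
    """HMAC-SHA256 over the HELLO nonce, keyed with the shared secret.

    Assumption: the digest is sent hex-encoded.
    """
    mac = hmac.new(shared_secret, nonce.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()


def verify_identify_token(shared_secret: bytes, nonce: str, token: str) -> bool:
    """Server-side check, using a constant-time comparison."""
    expected = make_identify_token(shared_secret, nonce)
    return hmac.compare_digest(expected, token)
```

A server following this sketch would close the connection with error code 4001 when `verify_identify_token` returns `False`.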

READY message

The health field is described in more detail in the HEARTBEAT_ACK message.

field	type	description
`health`	Health	server health

HEARTBEAT message

Sent by the client as a keepalive / health monitoring method.

The server MUST reply with a HEARTBEAT_ACK message within a reasonable time period.

There are no other fields in this message.
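A client-side heartbeat loop could look like the sketch below. `heartbeat_loop`, `send` and `max_beats` are hypothetical names: `send` stands for whatever coroutine writes a text frame to the websocket, and `max_beats` exists only so the loop can be stopped in tests. Sending an empty map as `d` is an assumption based on the message having no fields.

```python
import asyncio
import json
from typing import Awaitable, Callable, Optional

OP_HEARTBEAT = 4  # from the OP code table


async def heartbeat_loop(
    send: Callable[[str], Awaitable[None]],
    heartbeat_interval: int,
    max_beats: Optional[int] = None,
) -> None:
    """Send a HEARTBEAT every `heartbeat_interval` milliseconds."""
    sent = 0
    while max_beats is None or sent < max_beats:
        await asyncio.sleep(heartbeat_interval / 1000)
        # Assumption: a message without fields carries an empty `d` map.
        await send(json.dumps({"op": OP_HEARTBEAT, "d": {}}))
        sent += 1
```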

HEARTBEAT_ACK message

Sent by the server in reply to a HEARTBEAT message coming from the client.

The health field is a measure of the server's overall health. It is a float going from 0 to 1, where 0 is the worst health possible, and 1 is the best health possible.

Servers SHOULD use the same algorithm to determine health. It can be based on, among others:

Machine resource usage (RAM, CPU, etc.), although these metrics are general and can be unreliable.
Total users connected.
Total bandwidth used over some period of time.

field	type	description
health	float	server health
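LVSP does not define the health algorithm. Purely as an illustration, the sketch below derives a value in the 0 to 1 range from the number of connected users. Both `compute_health` and the `max_users` capacity parameter are hypothetical.

```python
def compute_health(connected_users: int, max_users: int) -> float:
    """Hypothetical metric: fraction of user capacity still free.

    Returns 1.0 for an empty server and 0.0 for a full or overloaded one.
    """
    if max_users <= 0:
        return 0.0
    free_fraction = 1.0 - connected_users / max_users
    # Clamp so the result always stays inside the [0, 1] range.
    return max(0.0, min(1.0, free_fraction))
```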

INFO message

Sent by either the client or the server to exchange information with each other. The INFO message is extensible: many request / response scenarios are layered on top of it.

This message type MUST be replayable.

field	type	description
type	InfoType	info type
data	Any	info data, varies depending on InfoType

InfoType Enum

value	name	description
0	CHANNEL_REQ	channel assignment request
1	CHANNEL_ASSIGN	channel assignment reply
2	CHANNEL_DESTROY	channel destroy
3	VST_CREATE	voice state create request
4	VST_DONE	voice state created
5	VST_UPDATE	voice state update
6	VST_LEAVE	voice state leave
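The enum and the INFO fields can be combined as in the following sketch. `InfoType` mirrors the table above as written, and `make_info` is a hypothetical helper that wraps the `type` and `data` fields in an INFO message.

```python
import json
from enum import IntEnum
from typing import Any

OP_INFO = 6  # from the OP code table


class InfoType(IntEnum):
    """Values copied from the InfoType table."""
    CHANNEL_REQ = 0
    CHANNEL_ASSIGN = 1
    CHANNEL_DESTROY = 2
    VST_CREATE = 3
    VST_DONE = 4
    VST_UPDATE = 5
    VST_LEAVE = 6


def make_info(info_type: InfoType, data: Any) -> str:
    """Build the JSON text of an INFO message."""
    return json.dumps({"op": OP_INFO, "d": {"type": int(info_type), "data": data}})
```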

CHANNEL_REQ

Request a channel to be created inside the voice server.

The Server MUST reply with a CHANNEL_ASSIGN when resources are allocated for the channel.

field	type	description
channel_id	snowflake	channel id
guild_id	Optional[snowflake]	guild id, not provided if dm / group dm

CHANNEL_ASSIGN

Sent by the Server to signal the successful creation of a voice channel.

field	type	description
channel_id	snowflake	channel id
guild_id	Optional[snowflake]	guild id, not provided if dm / group dm
token	string	authentication token
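The request / reply pair can be sketched as two helpers that build the `data` payloads. The function names are hypothetical. Generating the token with `secrets.token_urlsafe` is an assumption, since the token format is not specified.

```python
import secrets
from typing import Optional


def channel_req_data(channel_id: str, guild_id: Optional[str] = None) -> dict:
    """Data for CHANNEL_REQ. `guild_id` is left out for dm / group dm."""
    data = {"channel_id": channel_id}
    if guild_id is not None:
        data["guild_id"] = guild_id
    return data


def channel_assign_data(req: dict) -> dict:
    """Data for CHANNEL_ASSIGN: the request ids plus a token.

    Assumption: any unguessable random string is acceptable as the token.
    """
    reply = dict(req)
    reply["token"] = secrets.token_urlsafe(16)
    return reply
```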

CHANNEL_DESTROY

Sent by the client to signal the destruction of a voice channel. This happens either when the channel is deleted or when all members in it leave.

Same data as CHANNEL_ASSIGN, but without token.

VST_CREATE

Sent by the client to create a voice state.

field	type	description
user_id	snowflake	user id
channel_id	snowflake	channel id
guild_id	Optional[snowflake]	guild id. not provided if dm / group dm

VST_DONE

Sent by the server to indicate the success of a VST_CREATE.

Has the same fields as VST_CREATE, but with extras:

field	type	description
session_id	string	session id for the voice state

VST_DESTROY

Sent by the client when a user is leaving a channel OR moving between channels in a guild. More on state transitions later on.

field	type	description
session_id	string	session id for the voice state

Common logic scenarios

User joins an uninitialized voice channel

Since the channel is uninitialized, the logic for both channel initialization AND user join is described here.

Client will send a CHANNEL_REQ.
Client MAY send a VST_CREATE right after as well.
The Server MUST process CHANNEL_REQ first, so the Server can keep a lock on channel operations while it is initialized.
Reply with CHANNEL_ASSIGN once initialization is done.
Process VST_CREATE
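One way for a server to honor this ordering is a per-channel lock, as in the sketch below. `ChannelRegistry` and its methods are hypothetical, the `asyncio.sleep` call stands in for real resource allocation, and the sketch assumes handlers are started in the order the messages arrive.

```python
import asyncio


class ChannelRegistry:
    """Hypothetical server-side bookkeeping for voice channels."""

    def __init__(self) -> None:
        self._locks: dict = {}
        self.initialized: set = set()
        self.sent: list = []  # names of replies, in the order they were sent

    def _lock_for(self, channel_id: str) -> asyncio.Lock:
        return self._locks.setdefault(channel_id, asyncio.Lock())

    async def handle_channel_req(self, channel_id: str) -> None:
        # Hold the channel lock for the whole initialization.
        async with self._lock_for(channel_id):
            await asyncio.sleep(0.01)  # stand-in for allocating resources
            self.initialized.add(channel_id)
            self.sent.append("CHANNEL_ASSIGN")

    async def handle_vst_create(self, channel_id: str, user_id: str) -> None:
        # Waits here until any pending initialization releases the lock.
        async with self._lock_for(channel_id):
            if channel_id not in self.initialized:
                raise RuntimeError("VST_CREATE for an uninitialized channel")
            self.sent.append("VST_DONE")
```

Because `asyncio.Lock` wakes waiters in FIFO order, a VST_CREATE that arrives while the channel is still being initialized is processed only after CHANNEL_ASSIGN has been sent.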

Updating a voice channel

Client sends CHANNEL_UPDATE.
Server DOES NOT reply.

Destroying a voice channel

Client sends CHANNEL_DESTROY.
Server MUST disconnect any users in that channel who are currently connected to its voice websocket.

User joining an (initialized) voice channel

Client sends VST_CREATE
Server sends VST_DONE

User leaves a channel

Client sends VST_DESTROY with the old fields

User moves between channels

Client sends VST_DESTROY with the old fields
Client sends VST_CREATE with the new fields
Server sends VST_DONE
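The client side of a move can be sketched as an ordered pair of INFO payloads. `move_messages` is a hypothetical helper. It labels each payload with the message name instead of a numeric InfoType, because the InfoType table does not list a value for VST_DESTROY.

```python
from typing import Optional


def move_messages(
    old_session_id: str,
    user_id: str,
    new_channel_id: str,
    guild_id: Optional[str] = None,
) -> list:
    """Return (message name, data) pairs in the order the client sends them."""
    create_data = {"user_id": user_id, "channel_id": new_channel_id}
    if guild_id is not None:
        # guild_id is not provided for dm / group dm.
        create_data["guild_id"] = guild_id
    return [
        ("VST_DESTROY", {"session_id": old_session_id}),
        ("VST_CREATE", create_data),
    ]
```

The server then answers the VST_CREATE with a VST_DONE carrying the new `session_id`.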
