How To Build A Discord Clone - Part 3

Impress your next system design interviewer by learning how to build your very own, highly scalable Discord clone.

Continuing from last time

In parts 1 and 2, we investigated how we can build our very own Discord clone. So far, we have discussed implementing the following functional requirements:

  • Adding friends.

  • Sending DMs to friends.

  • Message persistence.

  • Client-side implementation (not a functional requirement but critical to the overall design).

  • Receiving messages after being offline.

  • Receiving notifications for messages.

In this final part of the series, we will wrap it all up by discussing the following:

  • Sending attachments.

  • Presence indicators (i.e. active, away, offline).

  • Messaging with servers/rooms.

Users can send attachments

In Discord, not only can you send text, but you can also attach rich media such as GIFs, images, and videos.

We can support this by using a Markdown format for all of our messages. Markdown is an extremely lightweight, standard text format that supports embedding rich content such as images, videos, and GIFs, as well as other text features such as hyperlinks, bold, and italics. Discord uses a restricted version of Markdown for its own messaging for this very reason.

Because it is such a common format, nearly all client-side frameworks and languages (e.g. React) have some library for parsing Markdown and displaying it nicely, as well as a rich text editor for actually writing the Markdown. Of course, you can always build your own, but it’s best not to reinvent the wheel if you can avoid it.

When using Markdown, adding images to text is simple; we just need to use the format:

![alt text](link to your image)

And for adding videos:

![alt text](link to your video)

If this format is not supported by the Markdown processor, we might need to fall back to raw HTML:

<video width="320" height="240" controls>
  <source src="video.mp4" type="video/mp4">
</video>

However, we need a way to get a link to the image/video we want to display to the user, and we want to make sure that only the allowed users can see it.

To start, we must discuss how we will store these files. We could store them locally on our web servers’ file storage, but then we have to deal with security concerns, and our storage capacity grows in proportion to the number of web servers we have. We also have issues like high availability for files if a web server goes down. Web servers are usually considered stateless for horizontal scalability, but by allowing them to store files they become stateful, making them difficult to scale. We could also store the data as a BLOB in a database; however, we do not need the additional features of a database such as transactions, so it’s not worth the hassle of running and trying to scale a database just to serve files.

Instead, we should use a distributed object store such as Amazon S3, which is designed to handle file uploads and reads at a huge scale whilst being highly available, fault-tolerant, and durable. In addition, object stores are typically much cheaper than databases for serving files. To ensure a great user experience when reading files, we can easily introduce a CDN, which will cache our files around the globe so that when a user accesses a file, they retrieve it from the edge cache closest to them.

How will we upload files to our service? It would be incredibly insecure to allow any user to upload whatever files they wanted to our object store. Instead, we can generate a pre-signed URL on the server side with a specific file ID (generated using Snowflake) that we can reference internally, which the user can then use to upload their file. With this approach, the user uploads directly to the object store without our services ever having to touch the raw file, protecting us from malicious file uploads and reducing the amount of bandwidth we use. Note that when we store the file in S3, the object key should include some high-cardinality element (e.g. a short hash prefix) rather than a purely sequential, time-based path like /year/month/day/hour/minute/second, so that we avoid hot partitions.
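Below is a minimal sketch of how the upload endpoint might generate that pre-signed URL using boto3. The bucket name, key layout, and the simplistic stand-in for a Snowflake generator are all assumptions, not the real implementation.

import hashlib
import time

import boto3

s3 = boto3.client("s3")
BUCKET = "discord-clone-attachments"  # assumed bucket name

def generate_file_id() -> int:
    # Stand-in for a real Snowflake generator: millisecond timestamp only.
    return int(time.time() * 1000)

def create_upload_url(uploader_id: int) -> dict:
    # uploader_id (and the intended recipient) would be recorded as metadata
    # by the file service; that step is omitted here.
    file_id = generate_file_id()
    # A short hash prefix spreads keys across partitions to avoid hot spots.
    prefix = hashlib.sha256(str(file_id).encode()).hexdigest()[:4]
    key = f"{prefix}/{file_id}"

    upload_url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=900,  # the upload URL is only valid for 15 minutes
    )
    return {"fileId": file_id, "objectKey": key, "uploadUrl": upload_url}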

For the choice of database, as we only need to do a key-value lookup of our file ID and store some metadata about the file, such as the uploader and recipient, we can use a DynamoDB table with a partition key of the file ID. Since we have no need for range queries, we don’t need a sort key/clustering key. The reason for using DynamoDB here is that it integrates nicely with the rest of our AWS stack, and we have a read-heavy workload compared to writes: since DynamoDB uses B-trees under the hood, it will perform better for reads. It also has native change data capture (DynamoDB Streams) and can emit events for deletions caused by TTL expiry.

Most object stores have an “on upload” trigger that can run some code once a file has been successfully uploaded. An example of this is Amazon S3’s event notifications, which can invoke a Lambda once a file has been uploaded. Within this callback, we will call back to our own file service to store the file ID in our database, along with some metadata.
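Here is a rough sketch of what that Lambda could look like. For brevity it writes the metadata straight into an assumed "files" DynamoDB table rather than calling back to the file service, and the field names are illustrative.

import boto3

dynamodb = boto3.resource("dynamodb")
files_table = dynamodb.Table("files")  # assumed table name

def handle_upload(event, context):
    # S3 event notifications contain one record per uploaded object.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        file_id = key.rsplit("/", 1)[-1]  # our keys end with the Snowflake file ID

        files_table.put_item(
            Item={
                "file_id": file_id,
                "bucket": bucket,
                "object_key": key,
                "size_bytes": record["s3"]["object"]["size"],
            }
        )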

From the client, when the user goes to upload a file, they will call this API to get their upload URL, along with the final image URL to use if the upload succeeds. In the meantime, the client can display the local file as a preview on the frontend. The client then uploads the file to the upload URL, and on success it can add an inline link to the final image URL in the Markdown.
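As a sketch, the client flow could look something like the following, where the /files/upload-url endpoint and the response shape are hypothetical names for the API described above.

import requests

def upload_attachment(api_base: str, auth_token: str, path: str) -> str:
    # 1. Ask the file service for a pre-signed upload URL and the file ID.
    resp = requests.post(
        f"{api_base}/files/upload-url",
        headers={"Authorization": f"Bearer {auth_token}"},
    )
    resp.raise_for_status()
    info = resp.json()

    # 2. Upload the raw file directly to the object store.
    with open(path, "rb") as f:
        requests.put(info["uploadUrl"], data=f).raise_for_status()

    # 3. Embed the final file URL in the message Markdown.
    return f"![attachment]({api_base}/files/{info['fileId']})"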

When a user goes to read the file, they first make a request to our file service with the file ID. We look up the file ID in our database, check whether the user has permission to access it, and if they do, generate a pre-signed read URL and redirect the user to it.
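A minimal sketch of that read path, written as a Flask-style handler, might look like this. The table and bucket details are the same assumptions as above, and the permission check against the authenticated user is left out for brevity.

import boto3
from flask import Flask, abort, redirect

app = Flask(__name__)
s3 = boto3.client("s3")
files_table = boto3.resource("dynamodb").Table("files")  # assumed table name

@app.get("/files/<file_id>")
def read_file(file_id: str):
    item = files_table.get_item(Key={"file_id": file_id}).get("Item")
    if item is None:
        abort(404)
    # Check the uploader/recipient metadata against the authenticated user
    # here (omitted for brevity).

    read_url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": item["bucket"], "Key": item["object_key"]},
        ExpiresIn=60,  # short-lived read URL
    )
    return redirect(read_url)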

We can optimize storage costs for older files by using a lifecycle policy to move them into longer-term storage, which is cheaper. Additionally, at a business level, to avoid infinitely growing storage, we can set a time-to-live on the records in our database so that they expire after a given amount of time, and then have an event trigger delete the file from the object storage once its record has been deleted from the database.
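For example, a lifecycle rule like the following (the bucket name and the 90-day threshold are just illustrative) would transition old attachments to a colder storage class.

import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="discord-clone-attachments",  # assumed bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-old-attachments",
                "Filter": {"Prefix": ""},  # apply to every object in the bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 90, "StorageClass": "GLACIER"}
                ],
            }
        ]
    },
)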

But what about files that get uploaded that the user decides not to use? We can have a periodic batch processing job that occasionally runs, looks at all of our messages from our messages DB, compares them to what we have stored in our file database, and deletes the files that are not used in any messages to save us space.
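The heart of that batch job is just a set difference between the files we have stored and the files that messages actually reference; a tiny sketch, assuming both ID sets have already been loaded:

def find_unused_file_ids(stored: set, referenced: set) -> set:
    # Files that exist in the file DB but are not referenced by any message
    # can be deleted from both the metadata table and the object store.
    return stored - referenced

# Example: file "123" was uploaded but never attached to a message.
print(find_unused_file_ids({"123", "456"}, {"456"}))  # {'123'}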

Users can see the online status of their friends

On Discord, a green status indicator means the user is actively online and looking at their chats right now, yellow means they are away from the Discord app but still online (e.g. they might have walked away from their computer for a few minutes), and grey means the user is not online.

To build this, we could reuse the keep-alive system from before, where we occasionally ping the server and ask for a response to indicate we are still online. All we need to do is check whether our user exists in our messaging lookup table (which had a TTL built into it). However, while this works for the yellow (away) status indicator, it doesn’t help with the green one. The reason is that the user could have walked away from their computer and kept the client open, which would keep pinging the server on the user’s behalf.

Therefore, we need to have a new kind of event that the user will only send to the server upon them performing some action, such as sending a message, typing, clicking, or maybe even moving their mouse (you can determine this from the client side).

Since clients will be continuously sending status updates, we need high write throughput, so an in-memory data store like Redis makes sense. For the online status, when a user performs an action, the client sends a message to the server, where we set a key of the form active:user1 with the value being the timestamp at which the request was submitted to the server, and a TTL equal to our active presence expiry time (say, 5 minutes).

We can then have a separate key for the away status. The client sends a message every 5 minutes to set a key away:user1 with the value being the timestamp at which the request was submitted to the server, and a TTL somewhat longer than the polling frequency. This TTL is effectively how long it takes, after the client shuts down, for the user to show up as offline: if the server has not received a message from the client within this window, we assume the client is now offline.

When it comes to checking the status of a user, we attempt to retrieve the active and away keys for that user. If the active key exists, we return that the user is active. If the active key does not exist (meaning it has expired) but the away key does, we return away. If the away key has also expired, we return that the user is offline.
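A minimal sketch of this presence logic with redis-py, using the key names and TTLs described above (the away TTL of 10 minutes is just an assumed value somewhat longer than the 5-minute keep-alive):

import time
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

ACTIVE_TTL_SECONDS = 5 * 60   # active presence expiry
AWAY_TTL_SECONDS = 10 * 60    # assumed: a little longer than the keep-alive period

def record_action(user_id: int) -> None:
    # Called when the user does something: types, clicks, sends a message.
    r.set(f"active:{user_id}", int(time.time()), ex=ACTIVE_TTL_SECONDS)

def record_keep_alive(user_id: int) -> None:
    # Called every 5 minutes while the client is open.
    r.set(f"away:{user_id}", int(time.time()), ex=AWAY_TTL_SECONDS)

def get_status(user_id: int) -> str:
    if r.get(f"active:{user_id}") is not None:
        return "active"
    if r.get(f"away:{user_id}") is not None:
        return "away"
    return "offline"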

To avoid our clients having to continuously poll for the status of every user they care about, which would put additional strain on the system, we can use WebSockets once again, letting users subscribe to the status of other users in the system. If we wanted to push each user’s current status to other users in real time, we would effectively need some kind of stream processing, where we listen for new events and key deletions and maintain an active state using something like Apache Flink, which we could then send to a Redis pub-sub channel to notify subscribed users. However, this is a lot of overhead.

Instead, when User 1 subscribes to User 2, we can subscribe User 1 to the pub-sub channels active:user2 and away:user2. From there, they will receive all new events directly, and using the timestamps provided, they can keep an accurate local view of which users are online and which are away, effectively mirroring the logic we stated earlier. We can also update our user status check endpoint from before to return the user’s current active and away timestamps, returning null if a particular key does not exist (meaning it has expired).
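A sketch of how the presence writes could be mirrored onto pub-sub channels, again with redis-py; the WebSocket push to the client is replaced with a print for brevity.

import json
import time
import redis

r = redis.Redis(decode_responses=True)

def publish_active(user_id: int) -> None:
    timestamp = int(time.time())
    r.set(f"active:{user_id}", timestamp, ex=5 * 60)
    # Mirror the event onto the channel so subscribers see it immediately.
    r.publish(f"active:{user_id}", json.dumps({"userId": user_id, "timestamp": timestamp}))

def subscribe_to_status(user_id: int) -> None:
    pubsub = r.pubsub()
    pubsub.subscribe(f"active:{user_id}", f"away:{user_id}")
    for message in pubsub.listen():
        if message["type"] == "message":
            # In the real system this would be forwarded to the subscriber
            # over their WebSocket connection.
            print(message["channel"], message["data"])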

This is how our overall final architecture will look:

In terms of read and write volume, when a client connects for the first time, it needs to request the current status of each user it wants to display. After that, on every action and poll period we send events to indicate our current status, which are then propagated through our pub-sub channel to the users who are subscribed to our status. Therefore, it is likely we will have a higher number of reads than writes.

To ensure high availability, we will use Redis Cluster. Our keys will be sharded uniformly across the cluster, ensuring a uniform distribution of load. We do not need any durability, and keeping the data in memory is OK because it is ephemeral: we can easily rebuild it from new user events if we somehow lose it. When it comes to reads and pub-sub messages, eventual consistency is fine for our use case, as a slightly inaccurate status is not a huge hit to the end-user experience, and since we update in real time it will be corrected quickly anyway. Therefore, for each shard, we can have a single master that serves writes and message publishing, and then serve reads and pub-sub subscriptions through read replicas that replicate asynchronously.

If we wanted to guarantee high availability of the user status, we could have a failover setup where each shard has an additional node that the master replicates to synchronously and that we keep as a standby, so that if the master goes down we can fail over to it with all up-to-date information. However, having an active standby is very expensive. We could instead promote one of our read replicas, which will have most of the data (perhaps missing some changes that have not yet been replicated), and continue serving statuses from it. Even this might be overkill, though, as a temporary outage of the user status service might be acceptable while we wait for the data to rebuild, which will happen as users send active events and keep-alive messages; in the worst case, we will have recovered once every user has sent their next away heartbeat.

Servers

If we were only dealing with DMs, what we already have would be enough to continue with the rest of our chat design. However, because we need some way of supporting group messaging in the form of servers, we must introduce a new service: our “Server” service (very confusing naming, I know).

I think it is reasonable to assume the same kind of traffic conditions, in terms of reads to writes, as our previous User service, given that servers will be read far more often than they are created or modified.

Our table will have the following format:

-- Server table
CREATE TABLE servers (
    server_id BIGINT PRIMARY KEY,
    server_name VARCHAR(255),
    created_at DATETIME
);

Our database will also need to store which users belong to which servers, which involves a many-to-many relationship:

-- Server-user table
CREATE TABLE user_id_server_id_map (
    server_id BIGINT,
    user_id BIGINT,
    created_at DATETIME,
    PRIMARY KEY (server_id, user_id)
);

If we need to partition our data so we can scale our writes, as we did with our User service, we can use a similar approach to before, using the server_id as our partition key and the user_id as a sort key. We can then have a GSI on user_id so that we can find all of the servers a given user belongs to. In my opinion, it is OK for this to be eventually consistent, as when the user joins a server, the client can keep track of it until the data has been synced on the server for future loads.

For simplicity, we will use a similar choice of technologies as before: MySQL with single-leader replication and a cache.

So now, what modifications will we need to make so that we can support all of the features we have discussed so far, but instead using servers?

Joining a server

On Discord, when a user wants to join a server, they need an invite link that has been generated by someone already in the server. For this, we will maintain a new invites table in our database, keyed by invite ID. When a user wants to create an invite link for the server, they will generate a new invite ID using Snowflake with some expiration date, which will be stored in the database with a TTL equal to the expiration time.

The user can then send this link to the person they want to invite, and when the joining user clicks the link and sends a request to the server, we can look up the server from the invite ID and then add them to it if they are not already a member.
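A sketch of the invite flow, using sqlite3 as a stand-in for the real Servers database. The schema, the TTL handling via an explicit expires_at column (MySQL has no native TTL), and the simplified ID generator are all assumptions.

import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invites (invite_id INTEGER PRIMARY KEY, server_id INTEGER, expires_at INTEGER)")
db.execute("CREATE TABLE user_id_server_id_map (server_id INTEGER, user_id INTEGER, PRIMARY KEY (server_id, user_id))")

def create_invite(server_id: int, ttl_seconds: int = 24 * 3600) -> int:
    invite_id = int(time.time() * 1000)  # stand-in for a Snowflake ID
    db.execute(
        "INSERT INTO invites VALUES (?, ?, ?)",
        (invite_id, server_id, int(time.time()) + ttl_seconds),
    )
    return invite_id

def join_via_invite(invite_id: int, user_id: int) -> bool:
    row = db.execute(
        "SELECT server_id FROM invites WHERE invite_id = ? AND expires_at > ?",
        (invite_id, int(time.time())),
    ).fetchone()
    if row is None:
        return False  # invite is unknown or has expired
    db.execute(
        "INSERT OR IGNORE INTO user_id_server_id_map VALUES (?, ?)",
        (row[0], user_id),
    )
    return True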

Leaving a server is simpler: the user just makes an API request to the Servers service with the server ID, and we remove them from the user_id_server_id_map table.

Adapting the current system to support server messaging

Right now, we have already built a fully working system for sending messages directly between users. We now need to add a new feature where, instead of sending a message to one other user, we send it to a chat server.

If we look at our previous sections, we see that to represent a conversation between two users, we use the key user1:user2. However, now we want to model a new type of conversation, this one being represented by the server. To differentiate the two, we can make some slight modifications to our system.

For a DM conversation, we will use the key dm:user1:user2 and for a server-based conversation, we will use the key server:serverID. We can now separate the two types of conversations while reusing our existing infrastructure. In addition, if custom functionality is required based on the different conversation types (which it is), we can look at the prefix of the key to determine how we should process the message.

Now when the user sends a message, they will use the following format (or something similar):

{
	"fromUserID": 1234,
	"fromClientID": 5678,
	"conversationID": "dm:user1:user1" | "server:serverid",
	"body": "Hello this is my message",
	"messageID": 910
}

And then when they receive the message, it will be in the following format (or something similar):

{
	"fromUserID": 1234,
	"conversationID": "dm:user1:user1" | "server:serverid",
	"body": "Hello this is my message",
	"messageID": 910,
	"timestamp": 1234
}

While this will work for the majority of our features including inbox messages, notifications, message history, and client-side implementation, we need to make some modifications to the messaging service.

We can keep our routing table logic, but we need to add a check for whether the conversation is a DM or server conversation. We can do this by first looking up the prefix, and if the prefix is dm we can proceed with the old logic. However, if the prefix is server, we will instead look up the users within the server from our Servers service, and then fan out the message to all client IDs for each user in the server.
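As a sketch, the modified routing logic could look like the following. The routing table, the Servers service lookup, and the WebSocket delivery are stubbed out with in-memory placeholders, so that the prefix check and the fan-out are the only real logic here.

ROUTING_TABLE = {"42": ["client-1", "client-2"]}  # user_id -> connected client IDs
SERVER_MEMBERS = {"1001": ["42", "43", "44"]}     # server_id -> member user IDs

def deliver_to_client(client_id: str, message: dict) -> None:
    print(f"push to {client_id}: {message['body']}")  # stand-in for the WebSocket push

def route_message(message: dict) -> None:
    conversation_id = message["conversationID"]

    if conversation_id.startswith("dm:"):
        # Existing logic: deliver to both participants of the DM.
        _, user_a, user_b = conversation_id.split(":")
        recipients = [user_a, user_b]
    elif conversation_id.startswith("server:"):
        # New logic: look up the server's members and fan out to all of them.
        server_id = conversation_id.split(":", 1)[1]
        recipients = SERVER_MEMBERS.get(server_id, [])
    else:
        raise ValueError(f"unknown conversation type: {conversation_id}")

    for user_id in recipients:
        for client_id in ROUTING_TABLE.get(user_id, []):
            deliver_to_client(client_id, message)

route_message({"conversationID": "server:1001", "body": "Hello this is my message"})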

Because of this fanout pattern, it might be a good idea to put a hard limit on the maximum number of users per server at a business level.

Conclusion

So now we have created a system that handles all of the functional requirements of Discord, as well as the non-functional requirements we placed on the system.

If you have any suggestions or feedback, you can send it to my email directly at [email protected].

In addition, if you think this article or newsletter would be useful for someone you know, I would greatly appreciate it if you could share it with them!

All the best with your systems design journey, and stay tuned for some of my upcoming designs (spoiler: I will be covering UUID generation next).