In this tutorial, you will learn the system design process for Instagram. It is designed to help you prepare for your interview.
These are the six steps to take for Instagram System Design.
- Identify the Functional Requirements
- Identify the Non-Functional Requirements
- List the System Components and Services
- Storage and Memory Requirements
- Performance and Scalability
- User Feed/Timeline Generation
1. Identify the Functional Requirements
- Post images and videos
- Get images and videos
- Follow other users
- Like a post
- Comment on a post
- Publish a news feed
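To make these requirements concrete, here is a minimal in-memory sketch of the operations they imply. The class name `InstagramModel` and all method names are hypothetical, chosen only for illustration. A real system would split these across the services listed in step 3 and back them with persistent storage. Feed publishing is left out here because step 6 covers it.

```python
from collections import defaultdict


class InstagramModel:
    """Hypothetical in-memory model of the core functional requirements."""

    def __init__(self):
        self.posts = {}                    # post_id -> (author_id, media_url)
        self.following = defaultdict(set)  # follower_id -> set of followee ids
        self.likes = defaultdict(set)      # post_id -> set of user ids
        self.comments = defaultdict(list)  # post_id -> list of (user_id, text)
        self._next_post_id = 1

    def create_post(self, author_id, media_url):
        # Post an image or video; only the media URL is stored here.
        post_id = self._next_post_id
        self._next_post_id += 1
        self.posts[post_id] = (author_id, media_url)
        return post_id

    def get_post(self, post_id):
        return self.posts[post_id]

    def follow(self, follower_id, followee_id):
        self.following[follower_id].add(followee_id)

    def like(self, user_id, post_id):
        # A set makes liking idempotent: a repeated like is ignored.
        self.likes[post_id].add(user_id)

    def comment(self, user_id, post_id, text):
        self.comments[post_id].append((user_id, text))


app = InstagramModel()
post_id = app.create_post(author_id=1, media_url="https://example.com/cat.jpg")
app.follow(follower_id=2, followee_id=1)
app.like(user_id=2, post_id=post_id)
app.comment(user_id=2, post_id=post_id, text="Nice shot")
```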
2. Identify the Non-Functional Requirements
- The system should be able to scale
- Use an efficient protocol for routing messages
- Adopt a microservice-based design
- Design for improved performance
3. Identify the System Components and Services
- Gateway – the connection between the client device and the backend services
- Session Service – Handles end-to-end communication between users
- Posts Service – Manages posts
- User Service – Used for new user creation
- User Profile Service – Used for managing user profile
- Follow Service – Handles follow relationships between users
- Likes Service – Handles likes on posts
- Comments Service – Handles adding comments to posts
- User Timeline Service – Used for precomputing the user timeline
- Image Handler Service – Handles images. Uploads each image to the blob storage and returns the corresponding URL to the App Server.
- Load Balancer – A load balancer is needed in front of all the services as well as in front of the Application Server.
- Proxy Server – This could be part of the load balancer placed in front of the services
- Additional components include a discovery server and a service registry. These keep track of the existing services.
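As an illustration of the Image Handler Service described above, the sketch below uploads image bytes to a blob store and returns a URL. `InMemoryBlobStore`, `ImageHandler`, the key layout, and the `blobs.example.com` host are all assumptions made for this example. A production system would call a real object storage API instead of a dictionary.

```python
import hashlib


class InMemoryBlobStore:
    """Stand-in for a real object store; keeps bytes in a dict."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self.objects = {}  # key -> raw bytes

    def put(self, key, data):
        self.objects[key] = data
        return f"{self.base_url}/{key}"


class ImageHandler:
    """Hypothetical image handler: stores the image, returns its URL."""

    def __init__(self, store):
        self.store = store

    def upload(self, user_id, image_bytes):
        # Assumption: the key is derived from a content hash, so the same
        # image uploaded twice by one user maps to a single stored object.
        digest = hashlib.sha256(image_bytes).hexdigest()
        key = f"users/{user_id}/{digest}.jpg"
        return self.store.put(key, image_bytes)


store = InMemoryBlobStore("https://blobs.example.com")
handler = ImageHandler(store)
url = handler.upload(42, b"fake image bytes")
```

The App Server would then save the returned URL alongside the post metadata rather than storing the image bytes itself.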
Complete architecture
4. Storage and Memory Requirements
Object Storage – For storing images and other media.
Cache – Used for storing the precomputed user feed that is displayed in the user timeline. This cache could be implemented using Redis and is recommended to be part of the User Feed Service.
Relational Database – For storing all the relational data.
Read-Only DB – This is needed to improve read performance, since read queries can be served by it instead of the main database.
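One way to picture how a read-only database helps is a small router that sends writes to the main database and spreads reads across read-only copies. This is a sketch under assumptions: `DatabaseRouter` and the connection names are hypothetical, and the rotation policy is just one simple choice.

```python
class DatabaseRouter:
    """Hypothetical router: writes go to the primary, reads to replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._next = 0  # index of the replica to use for the next read

    def for_write(self):
        return self.primary

    def for_read(self):
        # Fall back to the primary when no read-only copy is configured.
        if not self.replicas:
            return self.primary
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica


router = DatabaseRouter("primary-db", ["replica-1", "replica-2"])
write_target = router.for_write()
read_targets = [router.for_read() for _ in range(3)]
```

Here `read_targets` rotates through the replicas, so the primary only has to handle writes.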
5. Discuss Performance and Scalability
We would need a CDN between the user and the object store. In this way, frequently accessed content can be served quickly from a location close to the user.
We would need a load balancer in front of the App servers to distribute incoming traffic across them.
A proxy can handle both load balancing and routing.
We would need to place a cache behind the read servers, that is, between the app servers and the database.
Update the cache (Redis) whenever a write occurs in the database, so that reads do not return stale data. A distributed cache system like Redis would be suitable.
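The cache update on write can be sketched as follows. Plain dictionaries stand in for both Redis and the relational database, and `PostStore` is a hypothetical name. The `db_reads` counter exists only to show when a read had to go to the database.

```python
class PostStore:
    """Hypothetical store that refreshes the cache on every write."""

    def __init__(self):
        self.db = {}       # stand-in for the relational database
        self.cache = {}    # stand-in for Redis
        self.db_reads = 0  # counts reads that missed the cache

    def write(self, post_id, caption):
        self.db[post_id] = caption
        # Refresh the cached copy so later reads see the new value.
        self.cache[post_id] = caption

    def read(self, post_id):
        if post_id in self.cache:
            return self.cache[post_id]
        # Cache miss: read from the database and populate the cache.
        self.db_reads += 1
        value = self.db.get(post_id)
        if value is not None:
            self.cache[post_id] = value
        return value


post_store = PostStore()
post_store.write(1, "first caption")
post_store.write(1, "edited caption")
latest = post_store.read(1)
```

Because the write path updates the cache, `latest` holds the edited caption and the read never touches the database.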
Sharding could be used to perform horizontal partitioning of the relational database. In this way, we have a number of database servers, and each server holds one partition of the data, chosen by a shard key such as the user ID.
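A minimal sketch of sharding, assuming the user ID is the shard key and the shard is chosen by a simple modulo. The shard count and the function names are illustrative only, and each dictionary stands in for a separate database server.

```python
NUM_SHARDS = 4  # assumed number of database servers

# One dict per shard, standing in for one database server each.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(user_id, num_shards=NUM_SHARDS):
    # Map a numeric user ID to a shard index.
    return user_id % num_shards


def save_user(user_id, profile):
    shards[shard_for(user_id)][user_id] = profile


def load_user(user_id):
    # Only the shard that owns this user ID is queried.
    return shards[shard_for(user_id)].get(user_id)


save_user(10, {"name": "ada"})
save_user(7, {"name": "linus"})
owner_shard = shard_for(10)
```

One trade-off of plain modulo is that changing the number of shards remaps most keys. Consistent hashing is a common alternative when shards are added or removed often.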
6. Discuss User Feed/Timeline Generation
We would need a separate service to handle the timeline generation. That is the User Feed Service.
It would have access to both the cache and the metadata DB. This service would run some computation, most likely in the background, and update the cache.
So we need some algorithm to generate the user feed. For instance, you could select the latest posts made by the accounts a user follows. Another option is to display recommended posts based on the user's location.
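The first option, selecting the latest posts from followed accounts, could look like the sketch below. The `Post` fields, `build_feed`, and the `feed_cache` dictionary (standing in for the Redis cache) are assumptions for illustration, not a description of Instagram's actual ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    author_id: int
    created_at: int  # Unix timestamp in seconds


def build_feed(user_id, follows, posts_by_author, limit=10):
    """Return the newest posts from the accounts that user_id follows."""
    candidates = []
    for author_id in follows.get(user_id, set()):
        candidates.extend(posts_by_author.get(author_id, []))
    # Newest first, then keep only the first `limit` posts.
    candidates.sort(key=lambda post: post.created_at, reverse=True)
    return candidates[:limit]


follows = {1: {2, 3}}  # user 1 follows users 2 and 3
posts_by_author = {
    2: [Post(100, 2, 1000), Post(101, 2, 3000)],
    3: [Post(200, 3, 2000)],
    4: [Post(300, 4, 4000)],  # not followed by user 1, so never selected
}

# Precompute the feed and store only post IDs in the cache.
feed_cache = {}
feed_cache[1] = [p.post_id for p in build_feed(1, follows, posts_by_author, limit=2)]
```

A background job could rerun `build_feed` whenever a followed account publishes a post, so that the timeline request itself only has to read from the cache.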