Systems design is the process of defining the elements of a system, such as its modules, architecture, components, interfaces, and data, based on specified requirements. It is the process of identifying, creating, and designing systems that meet a company's or organization's specific objectives and expectations. Systems design is more about systems analysis, architectural patterns, APIs, design patterns, and gluing it all together than it is about coding. Designing your system adequately for the requirements of your application ensures that it can handle the expected load, eliminates unnecessary costs and maintenance effort, and provides a better experience for your end users.
It's impossible to overlook system design when it comes to tech interviews! Almost every IT giant, whether it's Facebook, Amazon, Google, or another, asks a series of interview questions based on System Design concepts like scalability, load balancing, caching, and so on. So without further ado, let us go through the most frequently asked interview questions on System Design.
The CAP (Consistency, Availability, Partition Tolerance) theorem says that a distributed system cannot guarantee C, A, and P simultaneously. It can provide at most 2 of the 3 guarantees. Let us understand this with the help of a distributed database system.
The following image shows which databases guarantee which aspects of the CAP theorem simultaneously. RDBMS systems are typically classified as guaranteeing Consistency and Availability. Redis, MongoDB, and HBase guarantee Consistency and Partition Tolerance. Cassandra and CouchDB guarantee Availability and Partition Tolerance.
Vertical scaling refers to upgrading the resource capacity of a single machine, for example by increasing RAM or adding more powerful processors, or switching to a new machine with more capacity. The capability of the server can be enhanced without any need to change the code.
Horizontal scaling refers to adding more machines to the network so that the processing and memory workload is shared across a distributed set of devices. In simple words, more server instances are added to the existing pool and the traffic load is distributed across them efficiently.
This has been demonstrated in the image below:
Horizontal scaling differs from vertical scaling in the following ways:
Category | Horizontal Scaling | Vertical Scaling |
---|---|---|
Load Balancing | Requires a load balancer to distribute request traffic across multiple machines. | Since there is only a single machine, a load balancer is not required. |
Failure Resilience | More resistant to application failure, because if one server fails, traffic is routed to another server. | More prone to failure, as there is only one machine and its failure brings down the entire application. |
Machine Communication | Since multiple machines are involved, network communication between them is necessary. | Uses inter-process communication within the machine, which is quite fast. |
Data Consistency | Data inconsistencies are possible, because different machines handle different requests and their data can get out of sync. | As there is only one machine, there is no issue of data inconsistency. |
Limitations | Since this scaling requires multiple servers, budget and space can be a concern, but the application can be scaled as much as the business needs. | There is a limit on the resource capacity a single machine can reach. If the load grows beyond this limit, the application might crash and cause downtime. |
Load balancing refers to distributing incoming traffic efficiently across a group of backend servers, known as a server pool. Modern websites are designed to serve millions of requests from clients and return responses quickly and reliably. Serving these requests requires adding more servers. In such a scenario, it is essential to distribute request traffic efficiently across the servers so that none of them faces undue load. A load balancer acts as a traffic cop: it sits in front of the servers and routes requests across the available ones so that no single server is overwhelmed, which could degrade application performance.
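The routing idea can be sketched with a round-robin strategy, which is only one of several ways a load balancer may pick a server. This is a minimal, in-memory illustration; the class name `RoundRobinBalancer` and the server names are assumptions made for the example, not part of any real load balancer's API.

```python
class RoundRobinBalancer:
    """Hands out servers in rotation, skipping any marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        # A real balancer would learn this from health checks.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once per call, starting at the rotation point.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


# Hypothetical server names, for illustration only.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_round = [balancer.pick() for _ in range(3)]
balancer.mark_down("app-2")
after_failure = [balancer.pick() for _ in range(4)]
```

Round-robin assumes the servers are roughly equal in capacity; with uneven servers or uneven request costs, a weighted or least-connections strategy may be a better fit.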
When a server goes down, the load balancer redirects traffic to the remaining available servers. When a new server is added to the configuration, requests are automatically routed to it as well. The benefits of load balancers include keeping requests away from unhealthy servers, preventing resource overloading, and handling SSL termination so that X.509 certificates do not have to be installed on every server.
Performance is an important factor in system design, as it helps make our services fast and reliable. The three key metrics for measuring performance are latency, throughput, and availability.
Sharding is the process of splitting a large logical dataset across multiple databases. It is also referred to as horizontal partitioning of data, since the data is stored on multiple machines. By doing so, a sharded database becomes capable of handling more requests than a single large machine. Consider an example: in the following image, assume that we have around 1TB of data in the database. When we perform sharding, we divide the 1TB of data into smaller chunks of 256GB each, stored in partitions called shards.
Sharding helps to scale databases: it handles increased load by providing higher throughput and storage capacity, and it ensures high availability.
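As a rough sketch of how a shard key decides where a record lives, the example below hashes the key and takes the result modulo the number of shards. The four in-memory dictionaries stand in for four separate databases, and the names `shard_for`, `put`, and `get` are assumptions made for illustration.

```python
import hashlib

NUM_SHARDS = 4  # e.g. 1TB split into four shards of roughly 256GB each
shards = [dict() for _ in range(NUM_SHARDS)]  # stand-ins for separate databases


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def put(key: str, value) -> None:
    shards[shard_for(key)][key] = value


def get(key: str):
    return shards[shard_for(key)].get(key)


for user_id in range(1000):
    put(f"user:{user_id}", {"id": user_id})

shard_sizes = [len(shard) for shard in shards]
```

Plain modulo hashing is simple, but changing the number of shards remaps most keys. Schemes such as consistent hashing or range-based sharding are often discussed as alternatives in interviews.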
Category | SQL | NoSQL |
---|---|---|
Model | Follows relational model. | Follows non-relational model. |
Data | Deals with structured data. | Deals with semi-structured data. |
Flexibility | SQL follows a strict schema. | NoSQL uses a dynamic schema and is very flexible. |
Transactions | Follows ACID (Atomicity, Consistency, Isolation, Durability) properties. | Follows BASE (Basically Available, Soft state, Eventual consistency) properties. |
Database Sharding - Sharding is a technique for dividing a single dataset among many databases, allowing it to be stored across multiple machines. Larger datasets can be divided into smaller parts and stored in numerous data nodes, boosting the system's total storage capacity. Similarly, by dividing the data over numerous machines, a sharded database can accommodate more requests than a single system. Sharding is a form of horizontal scaling, or scale-out, in which more nodes are added to distribute the load. Horizontal scaling provides near-limitless scalability for handling large amounts of data and high-volume workloads.
Database Partitioning - Partitioning is the process of dividing stored database objects (tables, indexes, and views) into distinct portions. Large database objects are partitioned to improve manageability, performance, and availability. In specific cases, partitioning can improve performance when accessing partitioned tables. The partition key can act as a leading column in indexes, reducing index size and increasing the likelihood that the most frequently used indexes fit in memory. When a query draws a large portion of its result set from one partition, scanning that partition is much faster than accessing rows scattered throughout the entire table by index. Adding and dropping partitions allows for bulk loading and deletion of data, which improves performance. Rarely used data can be moved to cheaper storage devices.
The following table lists the differences between sharding and partitioning:
Partitioning | Sharding |
---|---|
Partitioning is the division of a logical database into separate, independent portions. Database partitioning is commonly used for load balancing, manageability, performance, and availability. | Sharding is a type of partitioning and is also referred to as horizontal partitioning. Sharding can also be defined as replicating the schema and then dividing the data based on a shard key. |
The advantages of partitioning include all that of sharding since sharding is a type of partitioning. Besides this, partitioning includes the benefits of vertical partitioning as well which involves dividing the schema of the database. | The advantages of sharding include the following: 1. Increased Read/Write Throughput – Distributing the dataset across several shards increases both read and write operation capacity, as long as read and write operations are limited to a single shard. 2. Increased Storage Capacity – Boosting the number of shards allows for near-infinite scalability by increasing overall total storage capacity. 3. High Availability - Every piece of data is copied since each shard is a replica set. Moreover, because the data is dispersed, even if an entire shard goes down, the database as a whole remains partially functional, with separate shards hosting different parts of the schema. |
A system is said to be scalable if its performance increases in proportion to the resources added. Generally, a performance increase in terms of scalability means serving more units of work, but it can also mean handling larger units of work as datasets grow. If there is a performance problem in the application, the system will be slow even for a single user. If there is a scalability problem, the system may be fast for a single user but slow under heavy user load.
Consistency in the CAP theorem means that every read request receives the most recently written data. When multiple copies of the data exist, they must be synchronized so that clients consistently get fresh data. The available consistency patterns are weak consistency, eventual consistency, and strong consistency.
Caching refers to the process of storing copies of files in a temporary storage location called a cache, which allows data to be accessed more quickly and thereby reduces site latency. A cache can store only a limited amount of data, so it is important to choose the cache update strategy best suited to the business requirements. The common caching strategies are cache-aside, write-through, write-behind, and refresh-ahead.
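A minimal sketch of one such strategy, commonly called cache-aside: the application checks the cache first and, on a miss, loads the value from the database and stores it in the cache with an expiry time. The class `CacheAside`, the `load_from_db` function, and the in-memory `db` dictionary are assumptions made for illustration; a real system would typically use a dedicated cache service.

```python
import time


class CacheAside:
    """Look in the cache first; on a miss, load from the source and cache it."""

    def __init__(self, load, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load            # function that reads the source of truth
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]          # cache hit
        value = self._load(key)      # cache miss or expired entry
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        # Call after writing to the database so stale data is not served.
        self._entries.pop(key, None)


# Hypothetical in-memory "database" that records every read.
db = {"user:1": "Ada"}
db_reads = []


def load_from_db(key):
    db_reads.append(key)
    return db.get(key)


cache = CacheAside(load_from_db, ttl_seconds=30.0)
first = cache.get("user:1")   # miss: reads the database
second = cache.get("user:1")  # hit: served from the cache
```

The trade-off in this approach is that a cached entry can be stale until it expires or is invalidated, which is why the write path calls `invalidate`.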
A content delivery network, or CDN for short, is a globally distributed network of proxy servers that serves content from locations close to the end users. On websites, static files such as HTML, CSS, and JS files, images, and videos are usually served from a CDN.
Using CDN in delivering content helps to improve performance:
There are two types of CDNs: push CDNs and pull CDNs.
In a distributed environment where multiple servers contribute to the availability of the application, there can be situations where only one server should take the lead, for example when updating third-party APIs, because updates from several servers could cause problems. This server is called the primary server, and the process of choosing it is called leader election. The servers in the distributed environment have to detect when the leader has failed and appoint another one as leader. Leader election, implemented with a consensus algorithm, is best suited to applications that need high availability and strong consistency.
The primary objective of system design interviews is to evaluate how well a developer can plan, prioritize, and evaluate various options to choose the best possible solution for a given problem.
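The detect-and-reappoint idea can be sketched in a single process as follows. This is a deliberately simplified, bully-style rule (the live node with the highest ID becomes leader) and not a real consensus algorithm: there is no networking, no message passing, and no handling of partitions. The `Cluster` class and its methods are hypothetical names for illustration.

```python
class Cluster:
    """Tracks live nodes and keeps exactly one of them as leader."""

    def __init__(self, node_ids):
        self.alive = set(node_ids)
        self.leader = None
        self.elect()

    def elect(self):
        # Bully-style rule: the live node with the highest ID wins.
        self.leader = max(self.alive) if self.alive else None
        return self.leader

    def fail(self, node_id):
        # In a real system, failure is detected via heartbeats or timeouts.
        self.alive.discard(node_id)
        if node_id == self.leader:
            self.elect()


cluster = Cluster([1, 2, 3])
initial_leader = cluster.leader
cluster.fail(3)                 # the leader fails, so a new one is elected
leader_after_failure = cluster.leader
cluster.fail(1)                 # a follower fails, so the leader is unchanged
leader_after_follower_failure = cluster.leader
```

Production systems usually rely on an established consensus protocol or a coordination service for this, because agreeing on a leader over an unreliable network is much harder than this sketch suggests.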
Following are some of the issues found in distributed systems:
What are some of the required features?
What are some of the common problems that can be encountered?
Possible tips for consideration:
TinyURL or bit.ly takes a long URL and generates a new, unique short URL. These systems can also take the shortened URL and return the original full URL.
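One common way to approach this, sketched below under stated assumptions, is to give every long URL a numeric ID and encode that ID in base 62 to get a short code. This is not how TinyURL or bit.ly is known to work internally; the `Shortener` class and the in-memory storage are hypothetical and stand in for a database.

```python
import string

# 62 symbols: 0-9, a-z, A-Z
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode(number: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode(code: str) -> int:
    """Decode a base-62 string back to the integer it represents."""
    number = 0
    for ch in code:
        number = number * BASE + ALPHABET.index(ch)
    return number


class Shortener:
    """In-memory stand-in for the URL store."""

    def __init__(self):
        self._urls = []   # position in the list is the numeric ID
        self._ids = {}    # long URL -> numeric ID

    def shorten(self, long_url: str) -> str:
        if long_url not in self._ids:
            self._ids[long_url] = len(self._urls)
            self._urls.append(long_url)
        return encode(self._ids[long_url])

    def resolve(self, code: str) -> str:
        return self._urls[decode(code)]


shortener = Shortener()
code = shortener.shorten("https://example.com/a/very/long/path?with=query")
original = shortener.resolve(code)
```

Sequential IDs make codes guessable, so an interview discussion might also cover random or hashed codes and how to handle collisions.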
What are some of the Required Features?
What are some of the Common Problems encountered?
Possible tips for consideration:
These sites are meant for posting questions and answering them, and for showing a newsfeed that highlights popular questions based on tags and related topics.
What are some of the Required Features?
What are some of the Common Problems encountered?
Possible tips for consideration:
Facebook's newsfeed allows users to see what is happening in their friends circle, on the pages they have liked, and in the groups they follow.
What are some of the Required Features?
What are some of the Common Problems encountered?
Possible tips for consideration:
What are some of the Required Features?
What are some of the Common Problems encountered?
Possible tips for consideration:
Recommendation systems help users find what they want efficiently by offering choices and alternatives based on their history or interests.
What are some of the Required Features?
What are some of the common problems encountered?
Possible tips for consideration:
API rate limiters limit the number of API calls that a service receives in a given time period to avoid request overload. This question can start with coding the algorithm on a single machine and then extend to a distributed network.
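For the single-machine starting point, a token bucket is one commonly used algorithm; the sketch below is an illustration under that assumption, not the only valid answer. The `TokenBucket` class, its parameters, and the fake clock are hypothetical names chosen for the example.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a steady rate of `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Add tokens for the time that has passed, capped at capacity.
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# A fake clock makes the behaviour deterministic for demonstration.
fake_now = [0.0]
bucket = TokenBucket(capacity=3, refill_per_second=1.0,
                     clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(4)]   # the fourth call is rejected
fake_now[0] = 2.0                            # two seconds pass, two tokens refill
later = [bucket.allow() for _ in range(3)]
```

In a distributed setting the bucket state would need to live in shared storage, and the updates would need to be atomic, which is usually where the interview discussion goes next.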
What are some of the Required Features?
What are some of the common problems encountered?
Possible tips for consideration:
What are some of the Required Features?
What are some of the common problems encountered?
Possible tips for consideration:
This service completes partially typed search queries by displaying up to n suggestions for the query that the user intends to search.
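A small sketch of the core lookup, assuming the service keeps a popularity count per past query: find all stored queries that start with the typed prefix and return the n most popular. The `Typeahead` class and the sample data are made up for illustration; a trie is another data structure often proposed for the same job.

```python
import bisect


class Typeahead:
    """Suggests the most popular stored queries that start with a prefix."""

    def __init__(self, query_counts):
        self._counts = dict(query_counts)
        self._sorted = sorted(self._counts)  # sorted for prefix range lookup

    def suggest(self, prefix, n=3):
        # All queries sharing a prefix are contiguous in sorted order.
        start = bisect.bisect_left(self._sorted, prefix)
        matches = []
        for query in self._sorted[start:]:
            if not query.startswith(prefix):
                break
            matches.append(query)
        # Most popular first; ties broken alphabetically.
        matches.sort(key=lambda q: (-self._counts[q], q))
        return matches[:n]


# Hypothetical query log with search counts.
typeahead = Typeahead({
    "system design": 90,
    "systemd": 70,
    "sharding": 50,
    "system of equations": 40,
    "sys": 10,
})
top_two = typeahead.suggest("sys", n=2)
```

At scale, the ranking is typically precomputed per prefix and cached, since suggestions must come back within a few milliseconds of each keystroke.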
What are some of the Required Features?
What are some of the common problems encountered?
Possible tips for consideration:
Netflix is a video streaming service.
What are some of the Required Features?
What are some of the common problems encountered?
Possible tips for consideration:
A tic-tac-toe game involves two players, where one player chooses O and the other chooses X for marking the cells. The player who fills a row, column, or diagonal with their chosen character wins.
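The win rule above can be sketched as a check over the eight possible lines of a 3x3 board. The board representation (a flat list of nine cells holding "X", "O", or None) and the function name `winner` are assumptions made for this example.

```python
# Indices of a 3x3 board stored row by row:
# 0 1 2
# 3 4 5
# 6 7 8
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return "X" or "O" if that player has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_draw(board):
    """A draw is a full board with no winner."""
    return winner(board) is None and all(cell is not None for cell in board)


x_wins_diagonal = ["X", "O", None,
                   "O", "X", None,
                   None, None, "X"]
result = winner(x_wins_diagonal)
```

For larger boards, the same idea generalizes by keeping per-row, per-column, and per-diagonal counters so that each move can be checked in constant time.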
What are some of the Required Features?
What are some of the common problems encountered?
Possible tips for consideration:
Generally, in a traffic control system, the lights transition from RED to GREEN, from GREEN to ORANGE, and then back to RED.
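This cycle maps naturally onto a small state machine. The sketch below encodes the three transitions in a lookup table; the enum `Light` and the helper `run` are illustrative names, and timing is deliberately left out.

```python
from enum import Enum


class Light(Enum):
    RED = "RED"
    GREEN = "GREEN"
    ORANGE = "ORANGE"


# Each state has exactly one successor, forming a cycle.
TRANSITIONS = {
    Light.RED: Light.GREEN,
    Light.GREEN: Light.ORANGE,
    Light.ORANGE: Light.RED,
}


def next_light(current: Light) -> Light:
    return TRANSITIONS[current]


def run(start: Light, steps: int):
    """Return the sequence of states visited, including the start state."""
    sequence = [start]
    for _ in range(steps):
        sequence.append(next_light(sequence[-1]))
    return sequence


cycle = [light.name for light in run(Light.RED, 3)]
```

Keeping the transitions in a table makes it easy to extend the design, for instance with a flashing state for faults or per-state durations.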
What are some of the Required Features?
What are some of the common problems encountered?
Possible tips for consideration:
A web crawler is a service used by search engines such as Google and DuckDuckGo to index website content across the Internet so that it can be made available in search results.
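The traversal at the heart of a crawler can be sketched as a breadth-first search that never visits the same URL twice. To keep the example runnable without network access, link extraction is passed in as a function and the "web" is a small hard-coded dictionary; both `FAKE_WEB` and the `crawl` signature are assumptions for illustration.

```python
from collections import deque


def crawl(seed, fetch_links, max_pages=100):
    """Visit pages breadth-first from `seed`, returning them in visit order."""
    seen = {seed}            # URLs already queued, to avoid revisiting
    queue = deque([seed])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)  # a real crawler would fetch and index here
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical link graph standing in for real pages.
FAKE_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["a.example/", "c.example/"],
    "c.example/": [],
}

pages = crawl("a.example/", lambda url: FAKE_WEB.get(url, []))
```

A production crawler also has to deal with politeness rules, duplicate content, and distributing the queue across machines, which this sketch leaves out.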
What are some of the Required Features?
What are some of the common problems encountered?
Possible tips for consideration:
ATMs allow customers to deposit and withdraw money. They are also useful for checking the account balance.
What are some of the required features?
What are some of the common problems encountered?
Possible tips for consideration:
These platforms let users request rides; a driver picks the user up at their location and drops them at the destination they selected.
What are some of the required features?
What are some of the common problems encountered?
Possible tips for consideration:
In this article, we have covered the most frequently asked interview questions on System Design. The key to clearing a System Design interview is having a clear understanding of the approach you are taking while designing a particular system. For instance, if you choose to store the data in a NoSQL database, you should be clear about the reason that made you choose a NoSQL database over a SQL database, and about the differences between the two. In other words, every proposition of yours must be backed by some logical reasoning. This will give you an edge in your interviews.
Question 1: In a System Design interview question, which of the following options would be the correct sequence to follow?
Statements:
Statement I: Specifying the key features to be included
Statement II: Discussing each feature one by one
Statement III: Clarifying any doubts with regards to the question asked
Statement IV: Clarifying if any other feature needs to be incorporated
Options:
Option A: I, II, III, IV
Option B: III, I, IV, II
Option C: IV, II, I, III
Option D: III, IV, II, I
Correct Answer:
Option B: III, I, IV, II
Question 2: Which strategy can help you configure the cache to refresh the cache entry automatically before its expiration?
Options:
Option A. Refresh-ahead
Option B. Cache aside
Option C. Refresh-through
Option D. Refresh-back
Correct Answer:
Option A. Refresh-ahead
Question 3: Which of the following options can be a design issue in a distributed system?
Options:
Option A: Scalability
Option B: Fault-tolerance
Option C: Clustering
Option D: All of the above
Correct Answer:
Option D: All of the above
Question 4: Which of the following is not a cache update strategy available in caching?
Options:
Option A: Cache-aside
Option B: Write-through
Option C: Write-behind
Option D: Refresh-ahead
Option E: None of the above
Correct Answer:
Option E: None of the above
Question 5: Which of the following options is correct about horizontal scaling and vertical scaling?
Options:
Option A: Horizontal scaling requires load balancing whereas vertical scaling does not require a load balancer.
Option B: Horizontal scaling is more resistant to application failure as compared to vertical scaling.
Option C: Horizontal scaling may lead to data inconsistencies whereas this is not the case with vertical scaling.
Option D: All of the above.
Correct Answer:
Option D: All of the above.
Question 6: Which of the following options is not a consistency pattern available in system design?
Options:
Option A: Weak Consistency
Option B: Eventual Consistency
Option C: Strong Consistency
Option D: Permanent Consistency
Correct Answer:
Option D: Permanent Consistency
Question 7: Which of the following options is an important factor in determining the performance of a system?
Options:
Option A: Latency
Option B: Throughput
Option C: Availability
Option D: All of the above
Correct Answer:
Option D: All of the above
Question 8: Which of the following options is correct about load balancing?
Options:
Option A: Load balancing is responsible for preventing requests to go to unhealthy servers.
Option B: Load balancing helps to prevent resource overloading.
Option C: Load balancing aids in SSL termination and removes the need to install X.509 certificates on every server.
Option D: All of the above.
Correct Answer:
Option D: All of the above.
Question 9: Which of the following is not true about the strong consistency pattern?
Options:
Option A. After a data write, the subsequent reads will see the latest data.
Option B. The data is replicated asynchronously.
Option C. It is suitable in systems requiring transactions of data.
Option D. All of the above
Correct answer:
Option B. The data is replicated asynchronously.
Question 10: Which of the following statements is/are true about sharding?
Options:
Option A. Sharding refers to horizontal partitioning of data as it will be stored on multiple machines.
Option B. A sharded database becomes capable of handling more requests than a single large machine.
Option C. Sharding helps to scale databases by helping to handle the increased load by providing increased throughput, storage capacity and ensuring high availability.
Option D. All of the above
Correct Answer:
Option D. All of the above.