
# System Design 101

---

# There's no free lunch

---

# Trade-offs

---

## Trade-offs

- Performance vs Scalability
- Latency vs Throughput
- Availability vs Consistency

---

# Performance
# vs
# Scalability

---

## How do I know if I have a "performance" problem?

---

### If your system is slow for a single user

---

## How do I know if I have a "scalability" problem?

---

### If your system is fast for a single user but slow under heavy load

---

# Latency
# vs
# Throughput

---

## Latency

The time to perform some action or to produce some result.

---

## Throughput

The number of such actions or results per unit of time.

---

Aim for **maximal throughput** with **acceptable latency**

---

# Availability
# vs
# Consistency

---

## CAP theorem

---

![](https://github.com/donnemartin/system-design-primer/blob/master/images/bgLMI2u.png?raw=true)

---

In a distributed computer system, you can only support two of the following guarantees:

- Consistency
- Availability
- Partition Tolerance

---

### Consistency

Every read receives the most recent write or an error

---

### Availability

Every request receives a response, without guarantee that it contains the most recent version of the information

---

### Partition Tolerance

The system continues to operate despite arbitrary partitioning due to network failures

---

![](https://i.imgur.com/GfTiaRS.png)

---

# But!

---

### Networks aren't reliable

So you'll need to support partition tolerance. You'll need to make a software trade-off between **consistency** and **availability**.

---

### CP - Consistency and Partition tolerance

Waiting for a response from the partitioned node might result in a timeout error.

---

### AP - Availability and Partition tolerance

Responses return the most readily available version of the data available on any node, which might not be the latest. Writes might take some time to propagate when the partition is resolved.

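
---

### CP vs AP - a toy sketch

The difference between the two choices can be illustrated with a minimal, hypothetical model. The names `Replica`, `Unavailable` and `demo` are made up for this slide and do not come from any real library. The sketch also simplifies heavily: a real CP system may refuse the write itself when it cannot reach enough nodes.

```python
class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it holds the latest write."""


class Replica:
    """Hypothetical single-value replica used only for illustration."""

    def __init__(self, mode, value=None):
        self.mode = mode          # "CP" or "AP"
        self.value = value        # last value this replica has seen
        self.partitioned = False  # True while cut off from the other nodes

    def read(self):
        # CP: prefer an error over a possibly stale answer.
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot reach the other nodes")
        # AP: always answer with whatever is stored locally.
        return self.value


def demo(mode):
    primary = Replica(mode, "v1")
    follower = Replica(mode, "v1")

    follower.partitioned = True      # a network partition isolates the follower
    primary.value = "v2"             # the write reaches the primary only

    try:
        during = follower.read()     # read served while partitioned
    except Unavailable:
        during = "error"

    follower.partitioned = False     # the partition is resolved
    follower.value = primary.value   # the pending write propagates
    return during, follower.read()


print(demo("CP"))  # ('error', 'v2')
print(demo("AP"))  # ('v1', 'v2')
```

During the partition the CP follower returns an error, while the AP follower returns the stale `v1`. Both see `v2` once the partition is resolved.
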
---

# Consistency patterns

---

With multiple copies of the same data, we are faced with options on how to synchronize them so clients have a consistent view of the data.

---

## Weak Consistency

After a write, reads may or may not see it. A best-effort approach is taken.

Ex. VoIP, etc.

---

## Eventual Consistency

After a write, reads will eventually see it (typically within milliseconds). Data is replicated asynchronously.

Ex. DNS, email, etc.

---

## Strong Consistency

After a write, reads will see it. Data is replicated synchronously.

---

# Availability patterns

---

## Availability patterns

- Failover
- Replication

---

## Failover

---

### Failover

- Active/Passive
- Active/Active

---

![](https://i.imgur.com/GCFam8f.png)

---

#### Active/Passive

- Only the active server handles traffic
- Heartbeats are exchanged between the active and the passive server
- When the active server goes down, the passive server takes over

---

#### Active/Active

- Both servers are running
- The load is spread between them

---

### Disadvantage(s): failover

- Failover adds more hardware and additional complexity.
- There is a potential for loss of data if the active system fails before any newly written data can be replicated to the passive.

---

## Replication

- Master-Slave
- Master-Master

---

### Master-slave replication

---

![](https://i.imgur.com/i1lNHAn.png)

---

### Master-master replication

---

![](https://i.imgur.com/Epu66qh.png)

---

### Thank you!