# ASA Architectural Review

> **Review subject:** [Sentry](https://github.com/getsentry/sentry)
> **Reviewers:**
> 1. Alexander Kurmazov
> 2. Maxim Ksenofontov
> 3. Vladislav Lamzenkov
>
> **Table of contents**
> [1. Overview](#1-Overview)
> [2. Executive Summary](#2-Executive-summary)
> [3. Business Context](#3-Business-context)
> [4. QAS](#4-QAS)
> [5. Views](#5-Views)
> [6. Rationale/ATAM](#6-RationaleATAM)
> [7. Recommendations](#7-Recommendations)

## 1. Overview

*Sentry* is an open-source tool used primarily for monitoring and tracing exceptions in real time. It is written in Python but exposes an API usable from any language, and it provides SDKs for most popular languages.

## 2. Executive summary

Our analysis of the *Sentry* architecture found that many existing decisions fit the requirements well. However, the project team should seriously consider moving to a *microservices* architecture.

## 3. Business context

The main goal of *Sentry* is to help IT companies quickly inform their teams about bugs and instability in their systems. *Sentry* is an open-source product that can also be used through its paid *SaaS* infrastructure.

### Business requirements

1. The product is able to process many requests in parallel.
2. The product retains event data for at least 90 days.
3. Customers have access to their events at any time.
4. The product conforms to the best security standards in the industry.
5. The system has a mechanism for disaster recovery.
6. Customers are subject to paid usage constraints (quotas) when using the cloud instance of *Sentry*.
7. Clients are alerted about errors through a variety of notification tools, including email and SMS.
8. The product supports the most popular programming languages on the market.
9. The product is open source.
10. The application supports multiple languages.

## 4. QAS

### Availability

1. When run locally, the system should be available 99.999% of the time.
2. The cloud instance of the product should be available 99.9% of the time in production.

### Modifiability

1. A new event (a new performance metric or an alert) initiated by the client should be added to the system without a reboot.

### Performance

1. When a client accesses logs in the system, they should be shown in the user interface within 5 seconds.
2. When a client creates a new project on the cloud instance through the web interface, the project is created and deployed within 60 seconds.
3. The system should handle a single request within 70 ms.
4. When an error occurs in the client's application, the system should notify the customer through the notification systems within 5 minutes.

### Testability

1. Code should have test coverage of at least 75% at the end of every release cycle.

### Reliability

1. When the service saves customers' logs, the data should be replicated at least twice in different parts of the system.
2. In case of a software failure, the system is automatically rebooted within 300 seconds.

### Scalability

1. There should be a load balancer between the user and the *Sentry* application that distributes requests evenly among instances.
2. If the load on any node exceeds 80%, one more instance is started automatically within 3 minutes.

### Auditability

1. The system should log any changes made by users along with their auth credentials.

### Security

1. When the system processes customers' events, it automatically removes sensitive information before saving them to the database.
2. Clients' requests are processed over TLS/SSL connections.
3. If the system suspects malicious activity at runtime, the activity is automatically reported to the client within 5 seconds.

### Durability

1. Customers' data is stored for at least 90 days.

### Deployability

1. The product should be deployable on a local machine.
2. The product should be deployable in the cloud.

## 5. Views

The architecture presented in the official documentation is as follows.

***High-level view***

![](https://i.imgur.com/6pmnqCf.png)

> **Components**
>
> 1. **Your Application**: the code that is being watched by Sentry.
> 2. **SDK**: the dev kit, a library in this case, that is integrated into the code and is responsible for communication with the Sentry server.
> 3. **Load Balancer**: **nginx**, which also acts as a reverse proxy.
> 4. **Relay**: a standalone service that can be used as a middle layer between **Your Application** and **Sentry**. It improves data security and can be deployed within a corporation.
> 5. **Snuba**: *Sentry's* query service and storage for event data.
> 6. **Sentry (web)**: an HTTP server that accepts data directly from **Your Application** and passes it for processing to **Sentry (worker)** via a **message broker** (e.g., Kafka).
> 7. **Sentry (worker)**: a worker that processes the data coming from the application. It reads jobs scheduled for processing from the **message broker**.
> 8. **Kafka**, **ClickHouse**, **Redis**, etc.: the backing services (message broker, databases, cache) that Sentry modules use.

**Snuba Architecture**

> Snuba can be queried either directly by HTTP clients (read-only) or via Kafka (read and write operations).

![](https://i.imgur.com/NdAz4lp.png)

A more detailed view of the architecture within Sentry can be found [here](https://getsentry.github.io/snuba/architecture/overview.html#snuba-within-a-sentry-deployment).

***Event Pipeline view***

![](https://i.imgur.com/zpmFYlc.png)

> A more detailed view of **Relay** + **Snuba**

***Traceback lifecycle***

![](https://i.imgur.com/B8DXhyz.png)

* **Your Application** sends data (in an HTTP request) to the **Sentry Server** via the **Sentry SDK**.
* The **Sentry Server**, a Django application, registers new processing jobs in the **Queue**.
* The **Worker**, a Celery application, periodically grabs spawned jobs from the **Queue**, processes them, and writes them to the **Storage**.

## 6. ATAM

We prioritize and refine the most important quality attribute goals by building the following **utility tree**.

![](https://i.imgur.com/lv13dHs.png)

![](https://i.imgur.com/7vtsqlC.png)

Here are the **risks**, **tradeoffs** and **sensitivity points** introduced by some of the architectural decisions.

### Availability

| Architectural decision | Sensitivity point |
| -------- | -------- |
| Analysis workers and storage workers use the same I/O channel | Under a heavy analysis workload, the storage functionality might lag because of the limited bandwidth. |

### Security

| Architectural decision | Risk |
| -------- | -------- |
| Monolithic application | Unauthorized access to one module means access to the whole app. |
| Open source | Malicious users are able to find and inspect potential vulnerabilities and exploits directly in the source code. |

### Isolation

| Architectural decision | Risk |
| -------- | -------- |
| Monolithic application | A critical error in one module may result in a complete system failure. |

### Scalability

| Architectural decision | Sensitivity point |
| -------- | -------- |
| Monolithic application | Individual parts of the application do not scale independently. As a result, more system resources are used to support possibly excessive functionality. |

### Consistency / Availability

| Architectural decision | Tradeoff |
| -------- | -------- |
| Distributed data storage | Distributed storage provides greater **Availability**. However, it usually supports only **eventual** consistency, as opposed to **strong** consistency. |

### Deployability

| Architectural decision | Sensitivity point |
| -------- | -------- |
| No container orchestration in the source code repository | A local deployment from the source code cannot orchestrate containers automatically, which becomes a problem under heavier workloads. |

## 7. Recommendations

1. Take advantage of the microservices architecture: reduce the coupling between services and make the system **easier to scale**. It also **enhances the security** of the system: if one component is compromised, the attacker will not be able to access all components of the application.

---

2. Use different communication channels for different types of workers to **avoid the risk of bottlenecking the network**.

---
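
The idea behind the second recommendation can be illustrated with a minimal sketch. It is not Sentry's code: the in-process `queue.Queue` objects stand in for separate broker queues, and the channel names (`analysis`, `storage`), the `submit` and `worker` helpers, and the handlers are all hypothetical. It only shows that a burst of analysis jobs on one channel does not sit in front of a storage job on another.

```python
import queue
import threading

# One dedicated channel per worker type (names are illustrative assumptions).
CHANNELS = {"analysis": queue.Queue(), "storage": queue.Queue()}


def submit(kind, payload):
    """Route a job to the channel dedicated to its worker type."""
    CHANNELS[kind].put(payload)


def worker(kind, handler, results):
    """Drain a single channel; a None payload is the shutdown signal."""
    channel = CHANNELS[kind]
    while True:
        payload = channel.get()
        if payload is None:
            break
        results.append((kind, handler(payload)))


def run_demo():
    results = []
    threads = [
        # Hypothetical handlers: upper-casing stands in for analysis,
        # measuring the payload length stands in for storage.
        threading.Thread(target=worker, args=("analysis", str.upper, results)),
        threading.Thread(target=worker, args=("storage", len, results)),
    ]
    for t in threads:
        t.start()
    # The analysis burst queues up on its own channel only.
    for i in range(3):
        submit("analysis", f"event-{i}")
    submit("storage", "event-0")
    for kind in CHANNELS:
        submit(kind, None)  # stop each worker
    for t in threads:
        t.join()
    return results
```

In a real deployment the same separation would be configured at the message broker level rather than in application code; how that maps onto Sentry's actual worker setup is outside the scope of this sketch.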