**About arc42**

arc42, the Template for documentation of software and system architecture.

By Dr. Gernot Starke, Dr. Peter Hruschka and contributors.

Template Revision: 7.0 EN (based on asciidoc), January 2017

© We acknowledge that this document uses material from the arc 42 architecture template, <http://www.arc42.de>. Created by Dr. Peter Hruschka & Dr. Gernot Starke.

\newpage

Introduction and Goals
======================

In a time of almost infinite sources of information, it is good to be able to read only what you really want to read. News App is a project for filtering the news so that only the articles relevant to the user remain. This is achieved with a custom news feed that consists of RSS feeds added by the user and can be narrowed down with several filter configurations.

A goal for all stakeholders is to have an application that is actually useful and fulfills all requirements listed in the following sections of this document.

The developers' main goal is to learn and practice techniques and to apply technologies and concepts related to Cloud and Distributed Computing, in order to gain experience and achieve a very good grade in this lecture.

[MS]

Requirements Overview
---------------------

The main purpose of News App is to add RSS feeds and filter them later on. Accordingly, the main features are:

- Extracting articles from RSS feeds
- Analyzing articles and extracting keywords
- Filtering articles by several criteria
- Registering and logging in to the app to select and create filters and to add RSS feeds

News App improves the experience of browsing news on the internet. If you want to follow a current topic in the news, you can filter for articles concerning this topic only. And if the news is dominated by a single topic (e.g. the COVID-19 pandemic), you can hide these articles by excluding articles that contain specific keywords.
[MS]

\newpage

Architecture Constraints
========================

The constraints for this project are defined in the Canvas Course of this lecture: [Project 21S](https://aalen.instructure.com/courses/3212/pages/project-21s?module_item_id=75334)

> Constraints (requirements):
> - Web App (native app not expected, if you want to make one check with me first)
> - Web UI: React, Angular, Vue.js, Svelte, Ionic (with React, Angular, or Vue)
> - UX: works well on mobile browser (smartphone) and desktop browser
> - Microservice architecture style with each microservice (incl. DB) in its own Docker container
> - Pub/sub real-time data (use real or fake web-based source for real-time data)
> - Backend options: NodeJS/Express or LoopBack, Python/Falcon or Flask or Django REST Framework, Spring Boot, ktor, Go (frameworks, Micro, Microservices with Go); if other ask my permission first
> - Use Swagger, apiDoc, or equivalent to test and document APIs
> - NoSQL: domain focus and scenarios, consider how NoSQL data storage could make sense for at least some part (NoSQL recommended, mix with SQL (hybrid) preferred)
> - Cloud deployment and docker-based microservices on backend: (ensure service is free)
> - bwCloud provides free fair-use hosting for student classroom use
> - Azure (should be free but limited with student account either here or here) Service Fabric or Kubernetes Services
> - If using another service than these, check with me
> - Require code comments and interfaces names in English. Do not use special German characters in your code (umlaut, double-s).
> - Package names for your own code should start with de.hsaalen.
>
> Scenarios/Epics (clarify regularly during class with me as Product Owner)
>
> Repo: Bitbucket repository from me required (use BitBucket Git/Trello/Issues actively):
> - Sign up for BitBucket free and email me the BB email address you used and BitBucket username so I can add you to your team's project repository.
> - Use Git regularly
> - Keep the Bitbucket integrated Trello board updated (used for status reports)

[taken from [Project 21S](https://aalen.instructure.com/courses/3212/pages/project-21s?module_item_id=75334)]

\newpage

System Scope and Context
========================

For the News App service to work, a number of external resources are used. The context diagram in the next section shows how they are accessed and by which parts of the application.

Technical Context
-----------------

![](https://i.imgur.com/bVkzROv.png)

News App uses two different backend services with different connections to the outside. The *User & Content Service* only communicates internally with the MongoDB and with the Frontend over a REST API, whereas the *News Fetch Service* is supplied with feeds by the user. The articles inside the feeds as well as their metadata are then downloaded and analyzed by *News Fetch*.

The *News App Frontend* is a client-side rendered SPA (Single Page Application). For it to work properly, external stylesheets are loaded. To display the articles provided by the backend, images are loaded from the websites that host the articles.

All external resources are loaded via HTTP or HTTPS over the internet.

| Frontend communication partner | transferred data |
| -------- | -------- |
| CSS Provider | stylesheets for Slick Carousel and Bootstrap |
| Webserver (news sources) | images to display in the article preview |

| News Fetch Backend communication partner | transferred data |
| -------- | -------- |
| RSS feed providing websites | RSS feeds |
| Webserver (news sources) | articles with their metadata for analysis |

[JA]

\newpage

Solution Strategy
=================

As per the predefined constraints, News App follows the microservice architecture pattern. To make use of one of the pattern's strengths, granular scalability, the News App Service is split into four parts.

* *News Fetch Service* for handling feeds
* *User & Content Service* for creating filters and retrieving articles
* A shared *MongoDB* to store all permanently required objects
* A *Frontend Service* to allow for interaction with the News Fetch and User & Content Services

*(More about the specific technology decisions can be found in the design decisions section)*

The basic idea is that the News Fetch Service fills the database with articles and the User & Content backend allows the user to filter them. While some refactoring would be required, this separation theoretically allows new User & Content Services to be spawned while the load on the News Fetch Service remains more constant.

Docker is used to deploy new microservices. It allows easy deployment on any Docker-compatible machine, and the build is easily configurable via env files. System administrators don't have to install all dependencies by hand and configure an on-premise server, but can instead use a tool they are likely familiar with, as Docker is widely used.

To get new articles into the database, the decision was made to rely on RSS feeds due to their widespread use and continuous supply of live data.
Feeds can be added by the user to account for the sheer number of news providers on the web.

The communication between the frontend and the backend is performed via a REST API. This allows custom frontends to connect to the backend and delivers decent performance, as the transmitted JSON documents are handled fairly efficiently by the MongoDB, the Python backends and the React frontend.

Security was not a focus in this project, but one easy step to increase it would be to use an [nginx reverse proxy](https://www.freecodecamp.org/news/docker-nginx-letsencrypt-easy-secure-reverse-proxy-40165ba3aee2/). Additionally, changes to the authentication part of the application would be necessary.

As a major goal of this project is learning new technologies and how to apply them, best practices aren't necessarily applied throughout the architecture and code. To avoid bugs, pull requests are usually reviewed by multiple members of the team.

[JA]

\newpage

Building Block View
===================

Whitebox Overall System (Level 0)
-----------------------

The application consists of three independent main components that are directly or indirectly connected to the MongoDB.

![](https://i.imgur.com/8R2dIk1.png)

## Level 1

### White Box Frontend

The diagram for the News App's frontend is a variant of a functional decomposition diagram and shows important React components. The order from top to bottom roughly reflects their nesting.

![](https://i.imgur.com/XUUtyds.png)

The actual components that make up News App are inside the `containers` folder. Most of the containers use generated code to communicate with the backend. The generated code is inside the `codegen` directory. The `Config` file is used to configure the backend URLs. The code inside `codegen` is not described in more detail here. The `contextLib` shown on the left is used for authentication purposes.

### Contained Blackboxes

| Name | Description |
| --------------- | --------------- |
| `Index` | The document root is rendered at the index. Provides application-wide capabilities. |
| `App` | Holds the navbar and renders the different application routes defined in `Routes` below the navbar. |
| `Routes` | Maps relative URLs to specific React components defined inside the `containers` folder. |
| `BaseComponent` | Used to apply a coherent style to certain components inside the `containers` directory. |
| `containers` | Holds the various components to be rendered below the navbar. |
| `codegen` | Contains code generated using the [OpenAPI Generator](https://openapi-generator.tech/). This code is used by the components to call the backends. |
| `Config` | Specifies the base paths of the backends. |
| `contextLib` | Enables the usage of the AppContext for authentication purposes. |

[JA]

### White Box News-Fetch

![](https://i.imgur.com/UzSs8WF.png)

### Contained Blackboxes

| Name | Description |
| ------------------ | ------------ |
| `News-Fetch-Client` | Extracts article URLs from an RSS feed and calls the News-Fetch-Library |
| `News-Fetch-Library` | Third-party Python library. Given an article URL, it analyzes the article and extracts information about it. Also uses Natural Language Processing (NLP) to estimate keywords |
| `Database Client` | Persistence layer that contains database-specific operations |
| `Database` | RSS feeds are read from and written to it; the fetched articles are stored in it as well |
| `REST Controller` | Implements the REST API endpoints that are needed to communicate with the Frontend |
| `Frontend` | See Frontend for further information |

Swagger is used to define an interface for the communication between the services. To keep this overview manageable, the auto-generated Swagger code that defines the model schema and routes of the REST API is not described further.
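
The first responsibility of the `News-Fetch-Client` listed above, extracting article URLs from an RSS feed, can be sketched as follows. This is only an illustration that uses Python's standard library on an inline sample feed; the project's class diagram lists a third-party `FeedParser` for this step. The function name `extract_article_urls` and the sample feed are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Inline sample feed so the sketch runs without network access (made-up data).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First</title><link>https://example.com/a</link></item>
    <item><title>Second</title><link>https://example.com/b</link></item>
    <item><title>No link</title></item>
  </channel>
</rss>"""


def extract_article_urls(rss_xml: str) -> List[str]:
    """Return the <link> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    urls = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link:  # skip items that carry no link element
            urls.append(link.strip())
    return urls


if __name__ == "__main__":
    for url in extract_article_urls(SAMPLE_RSS):
        print(url)
```

Each extracted URL would then be handed to the article analysis step described in the table.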
[MS]

### White Box User&Content

![](https://i.imgur.com/wjGMXdT.png)

### Contained Blackboxes

| Name | Description |
| ------------------ | ------------ |
| `Database Client` | Persistence layer that contains database-specific operations |
| `Database` | Read and write user data and filters, read articles |
| `REST Controllers` | Implement the REST API endpoints that are needed to communicate with the Frontend |
| `Frontend` | See Frontend for further information |

Level 2
-------

### Frontend React Components

This is a white box view of the `containers` folder. The connected components are nested: lower components are children of the components above them.

![](https://i.imgur.com/NkFxdJp.png)

All components use the generated APIs to make backend calls.

### Contained Blackboxes (only top level components)

| Component | Functionality |
| --------- | ------------- |
| `AddAFeed` | Allows the user to add feeds. Gets feeds from and posts feeds to the backend. |
| `NotFound` | Fallback component for invalid URLs. |
| `Results` | Shows the result of an applied filter. |
| `Dashboard` | Displays statistics about the application and a selection of articles. |
| `CreateAFilter` | Lets the user create filters through a selection of multiple inputs. |
| `EditFilter` | Lets the user update their own filters or copy and adapt filters from other users. Uses the same inputs as `CreateAFilter`. |
| `Signup` | A few input fields to register a new user. Uses the `contextLib` for authentication purposes. |
| `Login` | A few input fields to log in an existing user. Uses the `contextLib` for authentication purposes. |
| `ListGlobalFilters` | Lists filters that were shared by their creators. Allows using or adapting these filters. |
| `ListUserFilters` | Lists filters that were created by the current user. Allows deleting, modifying or using these filters. |

[JA]

### News-Fetch Class Diagram

![](https://i.imgur.com/XmOG9Ow.png)

| Class | Responsibilities |
| -------- | -------- |
| `NewsFetchClient` | Fetching and analyzing articles from RSS feeds |
| `FeedParser` | Third-party library that can extract article URLs from RSS feeds |
| `News-Fetch` | Third-party library that, given a URL, can analyze the article and extract keywords |
| `DbClient` | Responsible for database access |
| `FeedsController` | Implements the endpoints for the REST API regarding the RSS feeds |
| `FeedsModel` | Defines the schema for the RSS feeds. Extends BaseModel |
| `BaseModel` | Swagger-generated BaseModel that provides functions for parsing JSON documents |
| `JsonEncoder` | Helper class for the BaseModel that encodes JSON documents |

[MS]

### User&Content Class Diagram

![](https://i.imgur.com/AGeRkSP.png)

| Class | Responsibilities |
| -------- | -------- |
| `ArticlesController` | Implements the endpoints for the REST API regarding the articles |
| `FiltersController` | Implements the endpoints for the REST API regarding the filters |
| `DashboardController` | Implements the endpoints for the REST API regarding the Dashboard of the application |
| `UsersController` | Implements the endpoints for the REST API regarding the users interacting with filters |
| `UserManagementController` | Implements the endpoints for the REST API regarding the login/logout and registration functionality |
| `AuthorizationController` | Implements a method that checks the authorization status of the user |
| `DbClient` | Responsible for database access |
| `ArticleModel` | Defines the schema for the articles. Extends BaseModel |
| `FilterModel` | Defines the schema for the filters. Extends BaseModel |
| `UserModel` | Defines the schema for the User. Extends BaseModel |
| `BaseModel` | Swagger-generated BaseModel that provides functions for parsing JSON documents |
| `JsonEncoder` | Helper class for the BaseModel that encodes JSON documents |

[MS]

\newpage

Runtime View
============

## Add a feed

![](https://i.imgur.com/wbTkznh.png)

1. The user pastes an RSS feed's URL.
2. The URL is saved to the MongoDB; feedback is returned to the backend service and displayed by the frontend.
3. The saved RSS feeds are continuously monitored by the News Fetch Backend Service. Any new article in one of the feeds is analyzed and the results are stored in the database.

## Create a filter

![](https://i.imgur.com/6kAME1l.png)

1. The user specifies the criteria for the new filter, which are used in combination to retrieve the articles. During user input, the number of articles matching the current state of the filter is continuously updated by the User & Content Backend Service and displayed.
2. When the user saves the filter, a filter object is passed to the backend service, where the filter criteria are used to create and execute a MongoDB insert command.
3. Feedback on the success of the operation is passed from the database via the User & Content Backend to the frontend.
4. The filter is applied (see section "Apply a filter").

## Apply a filter

![](https://i.imgur.com/9zn0gIo.png)

1. The selected filter object is passed to the User & Content Backend Service.
2. Each property of the filter is translated into one part of a MongoDB find command, which is then executed on the database.
3. The User & Content Backend passes the returned articles to the Frontend Service.
4. Each article's info is displayed as a card in a slideshow.

[VH]

\newpage

Deployment View
===============

![](https://i.imgur.com/wha1krs.png)

| Node/Artifact | Description |
| -------- | -------- |
| Frontend Docker-Container | Contains the Frontend component of the application. The Docker image is based on the node:alpine Docker image and customized with several environment variables |
| News-Fetch Docker-Container | Contains the News-Fetch component of the application. The Docker image is based on the python:3.6 Docker image (not on the python:3.6-alpine image, because several issues can occur during the installation of some Python libraries using pip) |
| Content&User Docker-Container | Contains the Content&User component of the application. The Docker image is based on the python:3.6 Docker image |
| MongoDB Docker-Container | Uses the basic MongoDB Docker image. In the docker-compose file it is linked to both backends so that they can access the database |
| Docker-Compose | The containers are orchestrated with docker-compose, which allows us to deploy the multi-container application easily |
| Virtual Machine (VM) | BW Cloud provides every user with a VM from which the application can be started. The VM can be accessed via SSH |
| BW Cloud Server | The server is provided by [BW Cloud](https://www.bw-cloud.org/). |

[MS]

\newpage

Cross-cutting Concepts
======================

## Domain Models

News-App is a data-centric application whose main technical functionality is reading from and writing to the database. Therefore everything is based around an Entity Relationship Diagram:

![](https://i.imgur.com/XY9JCP4.png)

| Name | Description |
| ------- | ----------- |
| Article | Stores all data needed to display the specific article and to filter by its attributes |
| Feed | Stores the URL of an RSS feed that serves as the source of articles |
| Filter | Stores criteria for filtering articles and whether the filter is visible to users other than the creator (public) |
| User | Stores the username and the password of the user. Also stores the IDs of the created filters |

Because the entities only consist of primitive data, we can save them as JSON documents in multiple collections and convert them to Python dictionaries when necessary during runtime.
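
Because the entities are plain JSON documents, converting between JSON and Python dictionaries needs no mapping layer. The sketch below round-trips a filter entity and translates it into a MongoDB query document, in the spirit of the "Apply a filter" step in the Runtime View. All field names (`include_keywords`, `exclude_keywords`, `keywords`) and the helper `filter_to_query` are hypothetical and not taken from the project's schema; `$in` and `$nin` are standard MongoDB query operators.

```python
import json

# Hypothetical filter entity as it might be stored in a collection.
FILTER_JSON = """
{
  "name": "elections-without-covid",
  "include_keywords": ["election"],
  "exclude_keywords": ["covid19"],
  "public": true
}
"""


def filter_to_query(filter_doc: dict) -> dict:
    """Translate each filter property into one part of a MongoDB query document."""
    keyword_conditions = {}
    if filter_doc.get("include_keywords"):
        # match articles that have at least one of the wanted keywords
        keyword_conditions["$in"] = filter_doc["include_keywords"]
    if filter_doc.get("exclude_keywords"):
        # drop articles that have any of the unwanted keywords
        keyword_conditions["$nin"] = filter_doc["exclude_keywords"]
    return {"keywords": keyword_conditions} if keyword_conditions else {}


if __name__ == "__main__":
    filter_doc = json.loads(FILTER_JSON)  # JSON document -> Python dict
    print(filter_to_query(filter_doc))
    print(json.dumps(filter_doc))         # Python dict -> JSON document
```

With a driver such as PyMongo, the resulting dictionary could be passed to a collection's `find` method as the query filter.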
[MS]

## Persistency

Using a NoSQL database is one of the requirements for this project. News-App uses a MongoDB for storing the data as JSON documents. This document-based database allows us to convert Python dictionaries seamlessly into JSON documents and vice versa. During development the database was set up locally. In production it is deployed on a server. MongoDB allows the database to be partitioned across server clusters. This makes it possible to scale horizontally as needed to adapt to a growing user count and a growing number of articles, filters and RSS feeds to be stored.

[MS]

## User Interface

The user interface is written in React as a single-page application to realize fast and dynamic updates of the content. Most parts of the user interface are built using Bootstrap components to make use of its responsive design principles.

[VH]

## Communications and Integration

The microservices of News App communicate via a REST API. The following diagram depicts the architecture and communication of the microservices:

![](https://i.imgur.com/peeVfyy.png)

| Node / artifact | Description |
| ---------------------- | ---------------------- |
| Frontend | A microservice implemented with React that lets the user interact with the backends through their web browser. |
| News Fetch Service | A microservice implemented with Flask that extracts articles from RSS feeds and analyzes and saves the articles |
| User & Content Service | A microservice implemented with Flask that handles creating, editing and deleting filters. It is also responsible for the user management and provides the frontend with all data from the database (except the RSS feeds, which come from the other backend) |
| Database | A MongoDB is used as the database. It contains all articles, filter descriptions, RSS feeds and user information |
| Browser | A recent browser to access the React Frontend. Chrome, Firefox, Safari and Edge should work |

The components interact with each other using a REST API implemented and documented with Swagger. The two backends access the database by being linked to it via docker-compose.

[MS]

\newpage

Design Decisions
================

## Using Python with Flask

### Problem

The core functionality of fetching news from RSS feeds and analyzing them afterwards is a relatively big component that could be time-consuming and complicated to implement.

### Decision

Research and exploration showed that several good approaches to analyzing news articles were already implemented in Python and available as open source. Based on this, we decided to use Python with Flask and the third-party Python library News-Fetch in order not to reinvent the wheel. You can find the source code of News-Fetch [here](https://github.com/santhoshse7en/news-fetch) and the official page with more information about it [here](https://santhoshse7en.github.io/news-fetch/).

### Considered Alternatives

- Writing our own news article analyzer in another language of our choice

[MS]

## Using React

React was chosen as the frontend library because its capabilities match the project's needs and because it is well documented. Several React-compatible npm packages are used throughout the project. One of these packages is react-bootstrap, which allows the creation of responsive websites. Bootstrap is one of the most popular CSS frameworks and as such well documented. Because of React's JSX support, the UI and logic can be kept in one file. This lowers the barrier for beginners and gave React an edge over Angular in the technology decision. React also supports TypeScript, which makes it possible to detect a whole class of bugs at compile time and thus improves the code quality.

[JA]

## Using MongoDB

When choosing the database model, [MongoDB](https://www.mongodb.com/) was chosen because of its high flexibility and good documentation.
MongoDB stores documents in the JSON/BSON format, i.e. as ordered sets of key/value pairs, where a value can also be an array or a nested document. Due to the large community of MongoDB users, numerous libraries have been created that make it possible to efficiently integrate and use a MongoDB database from other programming languages. In addition, it seemed sensible to select a database model that could be used flexibly by the developed backends. For example, the third-party Python library News-Fetch used in the News Fetch backend stores the data of the received articles in Python's built-in dictionary data structure, which can be converted into a JSON data structure efficiently and easily.

[AF]

## Using REST & Swagger

**Re**presentational **S**tate **T**ransfer is a very common architectural style for web APIs and web services. REST offers a stateless architecture for data transmission, which means that every REST-compliant web service interacts statelessly with textual resource representations. These operations are defined as interactions using HTTP methods such as GET, POST, PUT, etc. A large number of resources can be combined and requested in different formats for different purposes.

One of the main features of REST is the use of hypermedia. With hypermedia, client and server can be loosely coupled, which grants both clients and servers considerable freedom in resource manipulation, thereby enabling faster iteration and server development. In addition, with efficient caching, a multi-layer architecture and high scalability, REST is an efficient and high-performance solution for modern API microservices.

[Swagger](https://swagger.io/) is an open source framework based on OpenAPI that is intended to simplify the documentation of REST APIs and renders them directly as interactive API documentation with the help of the Swagger UI.
The framework has 3 tools that simplify the documentation and development of REST APIs and can be used in combination:

1. Swagger Editor, a browser-based editor for designing APIs using the OpenAPI specification
2. Swagger UI, a visualization of the OpenAPI specification in an interactive user interface
3. Swagger Codegen, a generator of server stubs and client SDKs from the OpenAPI specification with support for numerous programming languages

An OpenAPI file makes it possible to describe an entire API, including:

* Available endpoints and operations on each endpoint
* Operation parameters: input and output for each operation
* Authentication methods

The OpenAPI file can later be used for visualization in Swagger UI or for code generation in Swagger Codegen.

[AF]

## Using Docker

[Docker](https://www.docker.com/) is a containerization technology that enables the creation and operation of Linux containers. With Docker it is possible to treat containers like extremely lightweight, modular virtual machines. These containers allow a high degree of flexibility: they can be created, deployed, copied and moved between environments, which in turn supports the optimization of apps for the cloud. This allows a collection of small, independent and loosely coupled services to be created, the so-called microservices. With docker-compose, the independent microservices can be connected with each other so that they perform their tasks together.

[AF]

\newpage

Risks and Technical Debts
=========================

As a major part of this project consisted of the exploration of technologies, the current codebase contains several inconsistencies that impede maintainability, security, performance and user experience. As the time spent on this project was limited, these inconsistencies often couldn't be resolved. Remaining issues include:

* There are bugs in the libraries used that are currently handled with workarounds. Ideally, these would be patched upstream, as the libraries are open source. (see `Results.tsx` line 366)
* The frontend uses more dependencies than necessary. Using fewer external dependencies not only reduces security risks but also makes updating the dependencies easier.
* While a good UI design doesn't need a lot of tooltips and descriptions, additional feedback and tips would be helpful for complicated user interactions such as creating a filter.
* The use of constants in the communication between database clients, REST controllers and REST clients would minimize the risk of errors caused by misspellings. (see `dashboard_controller.py` lines 31ff.)

Due to the lack of systematic testing (e.g. unit tests), it cannot be ruled out that there are further bugs or issues.

[JA] [VH]
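
The suggestion above to share constants between database clients and REST controllers could look like the following sketch. The class names `Collections` and `ArticleFields`, the field name `source_domain` and the helper `dashboard_query` are hypothetical and only illustrate the idea; they are not taken from the codebase.

```python
class Collections:
    """Collection names, defined once and imported by every database client."""
    ARTICLES = "articles"
    FEEDS = "feeds"


class ArticleFields:
    """Field names of an article document (assumed names for illustration)."""
    KEYWORDS = "keywords"
    SOURCE = "source_domain"


def dashboard_query():
    """Return the collection name and an aggregation pipeline counting articles per source.

    A misspelled constant fails immediately with an AttributeError, whereas a
    misspelled string literal would silently produce an empty result.
    """
    pipeline = [
        {"$group": {"_id": "$" + ArticleFields.SOURCE, "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
    ]
    return Collections.ARTICLES, pipeline


if __name__ == "__main__":
    collection_name, pipeline = dashboard_query()
    print(collection_name, pipeline)
```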