# Cloud App – Backend Design
## Overview
The purpose of the backend system for the Cloud App is to present a REST API for invoking long-running background processes that, in turn, perform jobs on various Oracle Cloud applications.
By creating a REST API, we can offer the frontend application an easy way to invoke a job and check its status, similar to how Oracle allows us to run jobs and check their status.
In order to provide a good User Experience (UX) as well as a good Developer Experience (DX), the backend system needs to be built with the right set of tools.
At a high level the system needs to support the following:
- Asynchronous job execution
- Authentication and authorization
- Archiving of job artifacts
## Design Considerations
We need asynchronous job execution to provide good UX: if job execution were synchronous, the user's browser would hang while waiting for the job to complete on the server. To avoid this, our REST API should receive the user's request to start a job, create the task to be run, and then respond immediately with a URL endpoint to query the job's status.
The web application can be designed to query the endpoint repeatedly while displaying an 'in-progress' UI element (or a spinner), and then display the result in the UI.
So the REST API only receives requests, creates tasks, and reports task status. It is not responsible for running the job itself. This is where the Message Queue comes in.
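A minimal sketch of such an endpoint is shown below. FastAPI, the in-memory task store, and the `enqueue_task` helper are assumptions for illustration only; the actual framework and task store are not finalized.

```python
# Illustrative sketch only - the framework (FastAPI), the in-memory task store,
# and the enqueue_task() helper are assumptions, not finalized choices.
import uuid

from fastapi import FastAPI

app = FastAPI()
tasks = {}  # stand-in for a real task store (a database in production)


@app.post("/tasks", status_code=202)
async def create_task(payload: dict):
    task_id = str(uuid.uuid4())
    tasks[task_id] = {"status": "queued", "payload": payload}
    # enqueue_task(task_id, payload)  # would publish the task on the Message Queue
    return {"task_id": task_id, "status_url": f"/tasks/{task_id}"}


@app.get("/tasks/{task_id}")
async def get_task_status(task_id: str):
    return {"task_id": task_id, "status": tasks[task_id]["status"]}
```

The client gets the `status_url` back immediately and can poll it (as described above) while the job runs in the background.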
For asynchronous job execution, the REST API would need to use a Message Queue. A Message Queue is a design pattern in which a producer pushes tasks onto a "first-in, first-out" queue and a consumer retrieves tasks from the queue. It's a way to send messages between two independently running processes so that they can work together.
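As a minimal, self-contained illustration of the pattern itself (not the production broker), the standard-library `queue.Queue` shows the FIFO hand-off between a producer and a consumer:

```python
# Minimal producer/consumer illustration; in the Cloud App the two sides
# are separate processes and the queue is an external Message Queue broker.
import queue
import threading

q = queue.Queue()  # FIFO: tasks come out in the order they were pushed


def producer():
    for name in ["task-1", "task-2", "task-3"]:
        q.put(name)  # push a task onto the queue
    q.put(None)      # sentinel: no more tasks


def consumer():
    while True:
        task = q.get()  # blocks until a task is available
        if task is None:
            break
        print(f"processing {task}")


threading.Thread(target=producer).start()
consumer()
```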
On the server, the REST API serves as the producer of a task and a worker script, running as a separate process, is the consumer of the task.
When the REST API receives a POST request to create a task, it will initialize a Task object using the data in the request and _emit_ the task on the Message Queue. The worker process automatically receives the queued task, begins executing it, and keeps listening for the next task. Here we encounter another instance where we need asynchronous behavior: once the worker receives a task from the queue and begins running it, it shouldn't stop listening for more tasks waiting in the queue - it should simply start the job and immediately listen for the next one.
This async behavior is achieved by binding the worker's message handler function to an 'on message' _event_. As soon as a message event occurs, the handler function runs. This handler function is non-blocking, and as soon as another 'on message' event fires, another call is made to the handler function _while the previous one is still running_. Now we have two calls of the handler running at the same time - however, internally, the event loop is simply switching contexts between the two functions. As soon as one function encounters an _await_, the event loop switches context to the other function until that one encounters an _await_ in turn. So only one thing is happening at a time internally, but because our functions are non-blocking, we get concurrent, asynchronous behavior that can service requests and jobs on the queue as quickly as we receive them.
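A sketch of that worker loop and its non-blocking handler, using asyncio; the `asyncio.Queue` here stands in for the real Message Queue client, and `run_job` is a placeholder for the actual Oracle Cloud job:

```python
# Sketch only: asyncio.Queue stands in for the real Message Queue client,
# and run_job() is a placeholder for the actual long-running job.
import asyncio

message_queue = asyncio.Queue()


async def run_job(task):
    # Every await below is a point where the event loop can switch to
    # another handler call that is still in flight.
    await asyncio.sleep(5)  # placeholder for the real work
    print(f"finished {task}")


async def handle_message(task):
    # Non-blocking handler bound to the 'on message' event.
    print(f"received {task}")
    await run_job(task)


async def worker():
    while True:
        task = await message_queue.get()           # wait for the next message
        asyncio.create_task(handle_message(task))  # start the job...
        # ...and loop back immediately to listen for the next message
```

`asyncio.create_task` starts the handler without awaiting it, so the worker loop is back on `message_queue.get()` right away - the behavior described above.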
This is the recommended design pattern for implementing long-running background processes initiated via HTTP.
## Alternative Implementation Without a Message Queue
Without a message queue, the alternative is to use a database as the task repository. The producer inserts tasks as rows with a 'waiting' status in the database. The worker constantly polls the database for waiting tasks; once it finds any, it processes each one, updating the status of the corresponding row as each task completes. This method is not as efficient as using a message queue because the worker queries the database even when there are no tasks and then sleeps for a specified interval before trying again. While this works for some use cases, a message queue removes the need to poll the database: the worker simply waits for items to take off the message queue, so it wastes no resources or time and starts work automatically as soon as a message appears.
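For comparison, a sketch of the polling worker; the `db` helper, its `fetch_waiting_tasks()` and `update_status()` methods, and the poll interval are hypothetical:

```python
# Sketch of the polling alternative; the db helper and its methods are
# hypothetical, and the poll interval is an illustrative value.
import time

POLL_INTERVAL_SECONDS = 10


def run_job(task):
    """Placeholder for the actual long-running work."""


def polling_worker(db):
    while True:
        for task in db.fetch_waiting_tasks():    # rows with status = 'waiting'
            db.update_status(task.id, "running")
            run_job(task)
            db.update_status(task.id, "done")
        # Even when there is no work, the loop still queries and then sleeps.
        time.sleep(POLL_INTERVAL_SECONDS)
```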
## System Diagram

The green boxes together represent one independent process, and the blue box is another independent process. There is one-way communication from the REST API to the Worker through the Message Queue.
This design is influenced by microservices.
## Requirements/Conventions for the User (to be finalized)
- _All files to be used as part of a job must be uploaded as a zipped archive - even if there is a single .txt or .csv file._
The user will upload files as part of their requests. These may be metadata files, data load files, etc. There will likely be more than one file to be used as part of a job, and the files could be large. In order to reduce the bandwidth needed for file transport, as well as to allow easy grouping of files, the user _must_ upload a single zip file.
- _The file names for metadata import should match the import job name._
This reduces the amount of information needed for running the metadata import job, and it is very explicit which file is used by which job.
- ... add more requirements as needed
## Folder Structure (to be finalized)
The POST request handler should write uploaded zip files to an _/uploads/<appName>/<taskName>_ folder and then extract the zip file. The contents of the zip file are extracted into a new folder called _inbound_. The app name and task name would be required on all tasks, so they can be used to create these directories.
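A sketch of that upload handling with the standard-library `zipfile` module; the paths follow the convention above, while the function name and `UPLOADS_ROOT` constant are illustrative:

```python
# Sketch only: the function name and UPLOADS_ROOT are illustrative; the
# directory convention matches /uploads/<appName>/<taskName>/inbound.
import zipfile
from pathlib import Path

UPLOADS_ROOT = Path("/uploads")


def store_upload(app_name: str, task_name: str, zip_bytes: bytes) -> Path:
    task_dir = UPLOADS_ROOT / app_name / task_name
    inbound_dir = task_dir / "inbound"
    inbound_dir.mkdir(parents=True, exist_ok=True)

    zip_path = task_dir / "upload.zip"
    zip_path.write_bytes(zip_bytes)             # save the uploaded archive

    with zipfile.ZipFile(zip_path) as archive:  # extract into inbound/
        archive.extractall(inbound_dir)

    zip_path.unlink()                           # delete the .zip once extracted
    return inbound_dir
```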
When the worker starts a job, it creates two subdirectories under the _/uploads/<appName>/<taskName>_ folder: _logs_ and _outbound_. The outbound directory is only created if there are files that have to be served back to the user so they can download them.
The files required for the job should be under _inbound_.
If the job completes successfully, the log file contents are saved in a column in the database corresponding to that task's row. This allows the users to query the logs for their tasks and have it served up to them via the REST API.
The inbound folder is cleared upon job completion (failure or success). The outbound folder is cleared after 24 hours so that the user has enough time to download the files generated from their jobs.
The .zip file is also deleted once its contents are extracted to inbound.
Basically, for security, we don't keep any files we don't need on the server, since they may contain sensitive information. The task log is saved in the database. Note that this task log is different from our REST API's log and the worker's log.
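A sketch of those cleanup rules, assuming the same folder layout; the helper names and retention constant are placeholders:

```python
# Sketch only: helper names and the retention constant are placeholders;
# the folder layout matches /uploads/<appName>/<taskName>.
import shutil
import time
from pathlib import Path

UPLOADS_ROOT = Path("/uploads")
OUTBOUND_RETENTION_SECONDS = 24 * 60 * 60  # 24 hours


def cleanup_after_job(task_dir: Path) -> None:
    # Runs on job completion, whether it succeeded or failed.
    inbound = task_dir / "inbound"
    if inbound.exists():
        shutil.rmtree(inbound)  # input files may hold sensitive data


def purge_stale_outbound() -> None:
    # Periodic sweep: outbound folders older than the retention window are removed.
    now = time.time()
    for outbound in UPLOADS_ROOT.glob("*/*/outbound"):
        if now - outbound.stat().st_mtime > OUTBOUND_RETENTION_SECONDS:
            shutil.rmtree(outbound)
```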
