![](https://i.imgur.com/Yy3WqUe.png)

# Staple AI Platform Deployment Documentation

This document is a guide to installing the Staple architecture on premises. It covers the complete Staple platform, including the user interface and API interface, with all features and additions.

The general deployment is achieved using Helm, which deploys the Docker containers holding the Staple code directly to Kubernetes. Kubernetes is used to ensure a reliable, scalable deployment, while the use of Helm makes for a quicker and cleaner deployment process.

This document is **specifically** tailored for AWS deployment, although deployments to other cloud environments and private environments should be similar. Details specific to AWS are highlighted as such in this installation guide.

The deployment process consists of these major steps:

1. **Database provisioning and migration**
2. **Storage provisioning**
3. **Cluster creation**
4. **Helm installation**
5. **DNS provisioning**

Deployment should take a single developer 1-2 working days.

## General System Requirements

There are three types of system requirements:

1. Databases: MySQL, MongoDB, Postgres
2. Document File Storage
3. Kubernetes Cluster

Each is discussed in detail in the following sections.

The minimum hardware required for the cluster to handle all Staple services is:

* Total of 28 CPU cores
* Total of 75 GB memory

However, where the Staple system is expected to process very large numbers of documents (hundreds of thousands to millions of documents monthly), these requirements will increase. Please contact the Staple team for confirmation of exact requirements in such cases.

## Code Transfer

Deployment scripts will be provided to you by Staple as a ZIP archive, and transferred via SFTP as required.

# Step 1: Databases

The first step to installing the Staple system is setting up the databases. Staple uses multiple databases to store user, model and document data.
The databases should be provisioned according to your usual methods. Database types are as follows:

* MySQL
    * Version: `8.0.11`
    * Database name: `staple_dashboard`
* MongoDB
    * Version: `4.2.11`
    * Database name: `staple_scanning`
* PostgreSQL
    * Version: `12.3`
    * Database name: `staple_creation`

For each database, a user must be provided to the Staple deployment with full CREATE/READ/UPDATE/DELETE permissions for that database.

Once the databases are provisioned, the schema migration is handled automatically by the migration files we will provide you. These files take care of setting up the database structure. They require the following information to run:

* Host
* Username
* Password
* Database name
* Application version (provided by Staple)

These values will also be needed later for the Helm deployment.

**If you cannot provision these databases, please let Staple know.** We can provide containerised databases as part of the deployment if required. For performance reasons, though, this is not preferred.

# Step 2: Document Storage

The original files Staple processes are held in a general file system. The storage requirements for the system depend on the number of documents expected to be processed and on how long the documents should be kept.

It is generally advised that once a document has been exported from the Staple system to an ERP, accounting system, or other long-term storage system, it is deleted after 3 months. However, the time for which documents must be saved may be defined by the client. Please inform Staple of the period for which documents must be stored.

### Storage on AWS

For AWS, images and other large files are stored in S3.
Staple requires that two AWS components be manually provisioned:

#### S3 Bucket:

The name of the S3 bucket can be anything, but the contents must follow this structure:

```
BUCKETNAME/ > crawlers/ > documents/ > originals/ > scanning_logs/ > docscan/ > galaxy/ > nebula/ > templates/
```

The `BUCKETNAME` and the AWS `region` it is deployed in will also be needed later for the Helm deployment.

#### AWS IAM User or Role:

The Staple deployment needs access to the S3 bucket provisioned above, so the installation will require an `AWS_IAM_USER_KEY` and an `AWS_IAM_USER_SECRET`. It is recommended that a new AWS user or role is created for this purpose, with permissions restricted to the S3 bucket.

### Alternative Storage Systems

**If you wish to use an alternative file storage system, please inform Staple of your requirements as soon as possible.** We are happy to work with you to deliver Staple with a different storage system. The system must have an API with the following functionalities:

1. Create new folder
2. Upload document
3. Retrieve document
4. List items in folder

Batch upload would be beneficial, but is not necessary. Once we have these API endpoints, Staple will quickly integrate them into your custom build. Staple will require a folder structure similar to that given in the prior section.

# Step 3: Deployment of Kubernetes Cluster

### Kubernetes Requirements

Staple runs on Kubernetes. This means that it is responsive to load and resilient to faults. For deployments to AWS, a Kubernetes cluster can be provisioned through AWS Elastic Kubernetes Service.

The minimum cluster requirements are:

* Total of 28 CPU cores
* Total of 75 GB memory
* At least one node type available with 5 CPU cores and 10 GB memory

For best results, we recommend enabling a Cluster Autoscaler to automatically provision new nodes on demand. [Here](https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html) are the instructions for an AWS EKS cluster.
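
The minimum cluster requirements listed above can be checked before installation. The following is a minimal sketch in Python, assuming the CPU and memory figures for each node have already been collected by hand (for example from `kubectl get nodes` output) and that the per-node requirement means "at least" 5 cores and 10 GB. The `Node` type and the `check_cluster` function are hypothetical names used for illustration; they are not part of the Staple deployment scripts.

```python
from dataclasses import dataclass
from typing import List

# Minimums taken from the requirements listed in this section.
MIN_TOTAL_CPU = 28         # CPU cores across the whole cluster
MIN_TOTAL_MEMORY_GB = 75   # GB of memory across the whole cluster
MIN_NODE_CPU = 5           # at least one node must offer this many cores...
MIN_NODE_MEMORY_GB = 10    # ...together with this much memory (assumed ">=")


@dataclass
class Node:
    """Hypothetical description of a single cluster node."""
    name: str
    cpu_cores: float
    memory_gb: float


def check_cluster(nodes: List[Node]) -> List[str]:
    """Return the unmet requirements; an empty list means all are met."""
    problems = []
    total_cpu = sum(n.cpu_cores for n in nodes)
    total_memory = sum(n.memory_gb for n in nodes)
    if total_cpu < MIN_TOTAL_CPU:
        problems.append(f"total CPU {total_cpu} is below {MIN_TOTAL_CPU} cores")
    if total_memory < MIN_TOTAL_MEMORY_GB:
        problems.append(f"total memory {total_memory} GB is below {MIN_TOTAL_MEMORY_GB} GB")
    has_large_node = any(
        n.cpu_cores >= MIN_NODE_CPU and n.memory_gb >= MIN_NODE_MEMORY_GB
        for n in nodes
    )
    if not has_large_node:
        problems.append(
            f"no node offers {MIN_NODE_CPU} CPU cores and {MIN_NODE_MEMORY_GB} GB memory"
        )
    return problems


if __name__ == "__main__":
    # Example figures only; substitute the values of your own nodes.
    example = [Node(f"node-{i}", cpu_cores=8, memory_gb=32) for i in range(4)]
    for line in check_cluster(example) or ["all minimum requirements met"]:
        print(line)
```

Note that the capacity a node reports is usually somewhat higher than what Kubernetes can actually allocate to workloads, so it is safer to leave some headroom above these minimums.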
#### Cluster dependencies

In order to successfully deploy Staple to the cluster, these client utilities are required:

* `kubectl`: version 1.16 or higher
    * Installation instructions: [here](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
* `Helm`: major version 3
    * Installation instructions: [here](https://helm.sh/docs/intro/install/)

To use these utilities, you will need to connect `kubectl` to the cluster. The exact steps required differ between Kubernetes providers. Staple can provide clear documentation for connecting to an AWS EKS cluster if required, and is happy to assist with any other requirements.

#### Namespace

All Staple objects will be created in the same namespace. The name of the namespace is arbitrary, but must be provided to `.namespace.name` at installation or it will default to `staple-RELEASENAME`.

The Helm installation will create the namespace itself by default, but this can be done manually for security reasons if necessary. If the namespace is created manually, inform Helm by setting the `.namespace.exists` flag to `true`.

# Step 4: Deployment

Staple is deployed on Kubernetes via a Helm Chart. Helm is a package manager for Kubernetes that makes the installation process significantly smoother and easier. All you need to do is provide user-specific configuration values, and Helm will manage the rest of the installation.

Staple will provide a connection to our container registry and a folder `Staple_Deployment_v1.5/` containing the packaged Chart and a template `Config.yaml` file. Proceed with deployment by:

1. Updating the template `Config.yaml` with your configuration details. Staple will describe these in greater detail at the time of deployment.
2. In your command line, navigating to the provided `Staple_Deployment_v1.5/` folder.
3. Installing the chart with `helm install -f Config.yaml staple staple-1.5.0.tgz`.

It may take some time for the Chart to completely install.
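
The namespace rules and the install command described above can be summarised in a short Python sketch. It assumes that `.namespace.name` and `.namespace.exists` correspond to a nested `namespace` block in `Config.yaml`, which is the usual Helm convention but should be confirmed against the template Staple provides. The function names are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional


def resolve_namespace(release_name: str, namespace_name: Optional[str] = None) -> str:
    """Mirror the documented default: `staple-RELEASENAME` when no name is given."""
    return namespace_name if namespace_name else f"staple-{release_name}"


def namespace_values(namespace_name: Optional[str] = None,
                     exists: bool = False) -> Dict[str, dict]:
    """Build the assumed `namespace` block of Config.yaml as a plain dict.

    `exists=True` corresponds to a namespace that was created manually.
    """
    block: dict = {"exists": exists}
    if namespace_name:
        block["name"] = namespace_name
    return {"namespace": block}


def helm_install_command(release_name: str, chart: str,
                         config_file: str = "Config.yaml") -> List[str]:
    """Assemble the `helm install` invocation from step 3 as an argument list."""
    return ["helm", "install", "-f", config_file, release_name, chart]


if __name__ == "__main__":
    # Release name and chart file are the ones used in the command in step 3.
    print(resolve_namespace("staple"))
    print(namespace_values("staple-prod", exists=True))  # "staple-prod" is an example name
    print(" ".join(helm_install_command("staple", "staple-1.5.0.tgz")))
```

Under this reading, installing with the release name `staple` and no explicit namespace would place all objects in a namespace called `staple-staple`, so it may be worth setting `.namespace.name` explicitly.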
For a better understanding of the Helm installation process, see [the official website](https://helm.sh/).

#### Networking

To guarantee privacy and security, Staple implements `CORS` and `TLS`. The user interface and the API interface each require their own URL to be provisioned. These must be provided at installation. Instructions on DNS resolution are given later in this document.

# Step 5: DNS Resolution

The Staple installation can only be accessed through the URLs provided at install time. DNS resolution must be manually implemented for these to resolve appropriately. To find the target host, ensure you are connected to the correct cluster and run:

```
kubectl get ingress -n <NAMESPACE>
```

Replace `<NAMESPACE>` with the name of the Staple namespace, which was provided to `.namespace.name` at install time. If none was provided, it will default to `staple-RELEASENAME`.

Once DNS resolution is in place, Staple services will soon be available at the given URLs.

## Questions

Staple provides complete *hand-holding* support for every deployment, as you require. We are also happy to deploy the system on your behalf, or to provide a DevOps professional to assist you, should you wish.

For any questions or issues, please contact:

* Joshua Kettlewell: josh@staple.io
* Hebe Hilhorst: hebe@staple.io

Staple AI is committed to making the installation process as seamless as possible. All feedback and questions will be actioned promptly.