# KAM - AWS API GATEWAY
### Our v2 Architecture
Our goal was a scalable, performant architecture centered on API Gateway. We've moved almost everything to a serverless configuration, except for our webapp API, which will keep things running smoothly until we finish the migration.

### Postman Testing
Refer to the bottom of the document for the [Postman](https://www.postman.com/) collection; instructions for importing it are [here](https://kb.datamotion.com/?ht_kb=postman-instructions-for-exporting-and-importing).

The expected POST body format for any question in KAM's API is the following:
```javascript=
{
"provider": {"model" : "single"},
"query": {query}
}
```
Let's take the COVID question as an example:
* `{{gatewayUrl}}`/covid
```javascript=
{
"provider": {"model" : "single"},
"query": {
"inputs":
[ { "domain": "apple.com",
"country": "angola",
"industry": "airlines/aviation"
}]
}
}
```
The output of this call is:
```javascript=
[{
"input": {
"country": "united states",
"domain": "apple.com",
"industry": "consumer electronics",
"size_range": "10001+"
},
"prediction": 0.9
}]
```
The same input and output formats are valid for each model.
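
The same call can also be made from code. The following is a minimal sketch, assuming Node.js 18+ (for the built-in `fetch`) and the bearer-token auth used in the Postman collection. `buildKamRequest` and `askKam` are hypothetical helper names, and the gateway URL and token are placeholders you would supply yourself.

```javascript
// Hypothetical helper: wraps a query in the envelope KAM's API expects.
function buildKamRequest(model, query) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ provider: { model }, query })
  };
}

// Hypothetical helper: sends the request to a question endpoint such as "/covid".
// gatewayUrl and token are placeholders; fetch is built into Node.js 18+.
async function askKam(gatewayUrl, path, token, model, query) {
  const request = buildKamRequest(model, query);
  request.headers.Authorization = `Bearer ${token}`;
  const response = await fetch(`${gatewayUrl}${path}`, request);
  if (!response.ok) {
    throw new Error(`Gateway responded with status ${response.status}`);
  }
  return response.json();
}

// Build (but do not send) the COVID example request and inspect its body.
const covidQuery = {
  inputs: [{ domain: "apple.com", country: "angola", industry: "airlines/aviation" }]
};
const request = buildKamRequest("single", covidQuery);
console.log(request.body);
```

Keeping the body construction separate from the network call makes the envelope easy to check without hitting the gateway.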
### Gateway
We implemented a REST API Gateway on AWS, built entirely with [SAM](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html). You can check it out in our [repository](https://github.com/KindlyAnswerMe/kam-api-gateway).
#### Lambdas
Each question has its own lambda function. Each lambda contains the mapping functions and required headers for each individual model, ensuring proper data structures. Here's an example:

For the Covid endpoints, we used the most complex format as the standard:
```javascript
{ "inputs":
[ { "domain": "apple.com",
"country": "angola",
"industry": "airlines/aviation" } ] }
```
Covid Simple is a model that expects a string containing comma-separated domains:
```javascript
"apple.com,twitter.com"
```
This is one of our most representative use cases. Handling it in a full-fledged Node.js environment is straightforward if you have experience with it.

Here's the code for the COVID question:
```javascript
const axios = require("axios");
const models = require("./covid/models");
const mappingFunctions = require("./covid/mappingFunctions");

exports.handler = async function (event) {
  try {
    // API Gateway delivers the request body as a JSON string.
    const request = JSON.parse(event.body);
    const selectedModel = request.provider ? request.provider.model : null;
    // Fall back to "ranking" (for now the default model) when the model is unknown.
    const requestOptions = models[selectedModel] || models["ranking"];
    // Reshape the standard query only if the selected model has a mapping function.
    const mapQuery = mappingFunctions[selectedModel];
    const mappedBody = mapQuery ? mapQuery(request.query) : request.query;

    const res = await axios({ ...requestOptions, data: mappedBody });
    console.log("API response:", res);
    return {
      statusCode: 200,
      body: JSON.stringify(res.data)
    };
  } catch (err) {
    // Covers malformed request bodies as well as failed upstream calls.
    console.log(err);
    return {
      statusCode: 400,
      body: JSON.stringify({ message: err.message })
    };
  }
};
```
And this would be its mapping functions file:
```javascript=
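const mappingFunctions = {
  // Converts the standard query into the comma-separated domain string
  // that the "simple" model expects.
  simple: (body) => {
    try {
      return body.inputs.map((input) => input.domain).join(",");
    } catch (e) {
      // Malformed queries are swallowed; the function then returns undefined.
    }
  },
};
module.exports = mappingFunctions;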
```
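
To illustrate what the `simple` mapping does, here is a self-contained sketch that repeats the mapping logic and applies it to a query in the standard format. The two-entry input is made up for illustration.

```javascript
// Same logic as the `simple` mapping above, repeated so the sketch runs on its own.
const simple = (body) => body.inputs.map((input) => input.domain).join(",");

// Illustrative query in the standard (most complex) format.
const query = {
  inputs: [
    { domain: "apple.com", country: "angola", industry: "airlines/aviation" },
    { domain: "twitter.com", country: "angola", industry: "internet" }
  ]
};

const mapped = simple(query);
console.log(mapped); // apple.com,twitter.com
```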
Each question also has a module containing its request information:
```javascript=
var options = {
simple: {
method: "POST",
url: "[REDACTED]",
headers: {
Authorization: "[REDACTED]",
"Content-Type": "text/plain",
}
},
ranking: {
method: "POST",
url: "[REDACTED]",
headers: {
Authorization: "[REDACTED]",
"Content-Type": "application/json"
}
},
single: {
method: "POST",
url: "[REDACTED]",
headers: {
Authorization: "[REDACTED]",
"Content-Type": "application/json"
}
}
};
module.exports = options;
```
This leaves us with the following folder structure:

There's a [global dependency layer](https://aws.amazon.com/blogs/compute/working-with-aws-lambda-and-lambda-layers-in-aws-sam/) that for now consists only of axios. This allows us to reuse code and deploy lightweight, more performant functions.

The dependencies aren't deployed in the lambda itself; they live in a shared layer. This can be done at any level of the hierarchy, not just globally.
#### Build and Deploy
To build and deploy the app, run the following commands:
```shell=
$ npm run build
$ npm run deploy
```
These commands create a ZIP file of our code and dependencies and upload it to Amazon S3. They also create an output YAML file that contains the configuration for the deployment.
##### Concepts
In AWS Lambda, the way to think about development is the following:
You're exporting a handler that will be executed when the function is called. It receives 2 parameters:
* **event**: This is the main means of passing data. When a lambda is called from the API Gateway, it contains all of the request's information: headers, body, etc.
* **context**: We haven't found a use for context so far. It describes the invocation and the execution environment (for example, the request ID and the remaining execution time); environment variables are read from `process.env` instead. Research pending.
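
The two parameters can be sketched as follows. This is a minimal illustration, not our production handler: the fake event and context objects are made up so the handler can be invoked locally, and the header casing is an assumption, since API Gateway passes header names through as the client sent them.

```javascript
// Minimal handler sketch: reads the request from `event` and
// invocation metadata from `context`.
const handler = async function (event, context) {
  // The request body arrives as a JSON string (or null when absent).
  const body = event.body ? JSON.parse(event.body) : {};
  return {
    statusCode: 200,
    body: JSON.stringify({
      receivedQuery: body.query,
      contentType: event.headers ? event.headers["Content-Type"] : undefined,
      requestId: context.awsRequestId,
      remainingMs: context.getRemainingTimeInMillis()
    })
  };
};

// Export for Lambda when running as a CommonJS module.
if (typeof module !== "undefined") module.exports = { handler };

// Local invocation with made-up event and context objects.
const fakeEvent = {
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ provider: { model: "single" }, query: "apple.com" })
};
const fakeContext = {
  awsRequestId: "local-test",
  getRemainingTimeInMillis: () => 3000
};
handler(fakeEvent, fakeContext).then((response) => console.log(response.body));
```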
### Authorization
#### API Gateway Lambda authorization workflow

1. The client calls a method on an API Gateway API, passing a bearer token or request parameters.
2. API Gateway checks whether a Lambda authorizer is configured for the method. If it is, API Gateway calls the Lambda function.
3. The Lambda function authenticates the caller by means such as the following:
   * Calling out to an OAuth provider to get an OAuth access token.
   * Calling out to a SAML provider to get a SAML assertion.
   * Generating an IAM policy based on the request parameter values.
   * Retrieving credentials from a database.
4. If the call succeeds, the Lambda function grants access by returning an output object containing at least an IAM policy and a principal identifier.
5. API Gateway evaluates the policy.
   * If access is denied, API Gateway returns a suitable HTTP status code, such as 403 ACCESS_DENIED.
   * If access is allowed, API Gateway executes the method. If caching is enabled in the authorizer settings, API Gateway also caches the policy so that the Lambda authorizer function doesn't need to be invoked again.

We are taking advantage of JWT authentication. If you're interested in how this works compared to plain API keys, [here's an interesting blog post from our OAuth provider](https://auth0.com/blog/using-json-web-tokens-as-api-keys/).
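
The workflow above can be sketched as a token-based Lambda authorizer. This is a minimal illustration under stated assumptions, not our actual authorizer: `verifyToken` is a hypothetical stand-in that only checks for a non-empty bearer token (real JWT verification of signature, expiry, and audience is omitted), and the principal IDs and method ARN are made up.

```javascript
// Builds the output object API Gateway expects from a Lambda authorizer:
// a principal identifier plus an IAM policy for the requested method.
function buildPolicy(principalId, effect, methodArn) {
  return {
    principalId,
    policyDocument: {
      Version: "2012-10-17",
      Statement: [
        { Action: "execute-api:Invoke", Effect: effect, Resource: methodArn }
      ]
    }
  };
}

// Hypothetical stand-in for real JWT verification.
// It only checks that a non-empty bearer token is present.
function verifyToken(authorizationHeader) {
  const match = /^Bearer (.+)$/.exec(authorizationHeader || "");
  return match ? { sub: "example-user" } : null;
}

// For a token authorizer, the event carries the token and the method's ARN.
const authorizer = async function (event) {
  const claims = verifyToken(event.authorizationToken);
  return claims
    ? buildPolicy(claims.sub, "Allow", event.methodArn)
    : buildPolicy("anonymous", "Deny", event.methodArn);
};

// Export for Lambda when running as a CommonJS module.
if (typeof module !== "undefined") module.exports = { handler: authorizer };

// Local invocation with a made-up method ARN.
const exampleArn = "arn:aws:execute-api:us-west-2:123456789012:example/Staging/POST/covid";
authorizer({ authorizationToken: "Bearer some-token", methodArn: exampleArn })
  .then((result) => console.log(result.policyDocument.Statement[0].Effect));
```

Returning an explicit `Deny` policy keeps the decision inside the policy that API Gateway evaluates in step 5.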
### Next Steps
- [x] This POC was done entirely via the AWS console. We need to replicate it in an Infrastructure as Code solution; [SAM](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html) is the most commonly suggested option.
- [x] Tidy up the lambda infrastructure. Look into using layers to abstract the transformation logic from the request itself. Ideally we'd have a shared "request" layer that takes the HTTP options and request body and performs the request; everything else should be built on top of it.
- [ ] Finish the Auth0 JWT integration, so we can reuse our authorization and authentication scheme and have per-user control of incoming traffic.
- [ ] Look into WebSockets. For large volumes of data or complex ML models, async calls will be the way to go.
### Postman Collection (UPDATED)
```javascript
{
"info": {
"_postman_id": "26851645-e5dd-444d-bb34-0de5f09bc85f",
"name": "KAM - AWS API Gateway ",
"description": "Kindly Answer Me - AWS API Gateway POC Collection",
"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json"
},
"item": [
{
"name": "Which companies are most impacted by COVID-19?",
"request": {
"auth": {
"type": "bearer",
"bearer": [
{
"key": "token",
"value": "token",
"type": "string"
}
]
},
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n\t\t\"provider\": {\"model\" : \"ranking\"},\n\t\t \"query\": {\n\t\t \t\"inputs\":\n\t\t\t\t[ { \"domain\": \"apple.com\",\n\t\t \t\t\"country\": \"angola\",\n \t\t\t\t\"industry\": \"airlines/aviation\"\n \t\t\t}] \n\t\n\t}\n}"
},
"url": {
"raw": "{{gatewayUrl}}/covid",
"host": [
"{{gatewayUrl}}"
],
"path": [
"covid"
]
}
},
"response": []
},
{
"name": "What is the MITRE ATT&CK mapping for this CVE?",
"request": {
"auth": {
"type": "bearer",
"bearer": [
{
"key": "token",
"value": "token",
"type": "string"
}
]
},
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n\t\t\"provider\": {\"model\" : \"intelapi\"},\n\t\t \"query\": \"CVE-2020-0614,CVE-2020-0614\"\n}"
},
"url": {
"raw": "{{gatewayUrl}}/cve",
"host": [
"{{gatewayUrl}}"
],
"path": [
"cve"
]
}
},
"response": []
}
],
"event": [
{
"listen": "prerequest",
"script": {
"id": "9cac2445-af3a-404c-8863-f867f5dc04c2",
"type": "text/javascript",
"exec": [
""
]
}
},
{
"listen": "test",
"script": {
"id": "dc393b2f-b30f-4617-a997-15299084f96f",
"type": "text/javascript",
"exec": [
""
]
}
}
],
"variable": [
{
"id": "27c99362-bcf1-4d31-8b99-423e8675aacd",
"key": "gatewayUrl",
"value": "https://g32x6sw95j.execute-api.us-west-2.amazonaws.com/Staging",
"type": "string",
"description": ""
}
]
}
```