# AWS-ECR-EKS MODEL DEPLOYMENT

## Description

Demo pipeline for model deployment, using ECR as the container registry and EKS clusters to serve the application.

## Use Case

The goal of this repository is to provide a quick way to deploy a PyTorch model for Fashion MNIST classification to Kubernetes in an AWS EKS cluster.

## Set up

### Creating an EKS cluster with runway

Before starting the pipeline for the first time, if you don't have an existing EKS cluster, you need to set one up on AWS correctly. You could do it manually, but we recommend the [terraform-aws-eks](https://github.azc.ext.hp.com/runway/terraform-aws-eks) runway project. Our tests were made using the "[eks-with-new-vpc](https://github.azc.ext.hp.com/runway/terraform-aws-eks/tree/master/examples/eks-with-new-vpc)" example.

With the cluster created, you need to register the credentials in kubectl so that a role ARN is allowed to alter the cluster. To log in, use the command:

```
aws eks update-kubeconfig --name EKS_CLUSTER_NAME --region YOUR_AWS_REGION
```

Now edit the configmap to allow the role to make modifications:

```
kubectl edit -n kube-system configmap/aws-auth
```

Add the credentials like so:

```txt
apiVersion: v1
data:
  mapRoles: |
    ...
    - groups:
      - system:masters
      rolearn: arn:aws:iam::YOUR_IAM_ROLE/role
      username: default
    ...
```

Now your IAM role can create the services with codeway.

### Changes to run the codeway template

It is essential to change some values in [codeway.yaml](./codeway.yaml) to suit your needs: roleARN, awsAccountNumber, externalId, awsRegion, and the EKS cluster name. Those values are located under "registryParameters", the second "parameters" indentation, and the first bash command.

```yaml
...
registryParameters:
  - roleARN: 'YOUR_ROLE_ARN'
    awsAccountNumber: 'YOUR_AWS_ACCOUNT_NUMBER'
    externalId: 'YOUR_EXTERNALID'
    awsRegion: 'YOUR_AWS_REGION'
...
parameters:
  roleArn: 'YOUR_ROLE_ARN'
  externalId: 'YOUR_EXTERNALID'
...
- bash: aws eks update-kubeconfig --name EKS_CLUSTER_NAME --region YOUR_AWS_REGION
...
```

### Inference

To send a request to the server, it is necessary to pass an encoded Fashion MNIST image as a parameter in an HTTP request. To encode the images, use the code in [encode_img.py](./encode_img.py). There are three example images in the [exemple_inference_imgs.txt](./exemple_inference_imgs.txt) file.

The server address is the service's external IP, which takes a few minutes to be assigned. To check it, run:

```
kubectl get svc
```

It will show:

```
NAME              TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
example-service   LoadBalancer   10.100.157.50   EXTERNALIP    5000:30729/TCP   3s
```

To compose the link, do the following:

```
EXTERNAL-IP + ":5000?data=" + encoded image
```

Example: "afe1bd276f20141ba96388a78259f638-1063200090.us-east-1.elb.amazonaws.com:5000?data=AAAAAAAAAA..."

### Maintained by:

* Heitor de Castro Felix - heitor.felix@hp.com
* Filipe Figueredo Monteiro - filipe.monteiro@hp.com
* Paulo de Oliveira Guedes - paulo.guedes@hp.com
* Davi Monteiro Paiva - davi.paiva@hp.com
* Felipe de Melo Battisti - felipe.battisti@hp.com

## Roadmap

We want to integrate the pipeline with models from the Databricks Model Registry.

## Limitations

Currently, the pipeline has no CI/CD components: the first iteration runs automatically, but subsequent updates are not triggered automatically.
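## Appendix: example inference client

The request composition described in the Inference section can be sketched as a small Python client. This is a minimal sketch, not the repository's own tooling: the actual encoding is defined by encode_img.py, and plain base64 of the image bytes plus an `http://` scheme are assumptions here.

```python
import base64
import urllib.parse
import urllib.request


def encode_image(path):
    # Assumption: encode_img.py produces a base64 string of the image bytes;
    # check that script for the exact encoding it uses.
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


def build_url(external_ip, encoded):
    # Composes the link format described above:
    # EXTERNAL-IP + ":5000?data=" + encoded image
    # The encoded payload is percent-escaped so it is safe in a query string.
    return "http://" + external_ip + ":5000?data=" + urllib.parse.quote(encoded)


def classify(external_ip, image_path):
    # Sends the HTTP request and returns the raw response body as text.
    url = build_url(external_ip, encode_image(image_path))
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")
```

Usage would look like `classify("afe1bd276f20141ba96388a78259f638-1063200090.us-east-1.elb.amazonaws.com", "shirt.png")`, where the first argument is the EXTERNAL-IP reported by `kubectl get svc`.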