## Exercise 1
### 1.1
>**a** Is a single Docker image for this application enough? Justify your answer.
It is stated that clients may run on different hosts. Since we have a single server and possibly many clients, the image for the server must be separate from the image for the clients: if a single image bundled both, every host running a client container would also run a server, so we would end up with at least as many servers as hosts, even with only one container per host.
>**b** Given your answer to the previous question, how many Docker images do you need to create?
We need to create one image for the server and one for the client.
---
### 1.2
We create two Dockerfiles, one for the client and one for the server:
**For the client: `client.Dockerfile`**
```dockerfile=
FROM python:3.7-slim
WORKDIR /app
COPY utils.py ./
COPY client.py ./
ENTRYPOINT ["python", "client.py"]
```
**For the server: `server.Dockerfile`**
```dockerfile=
FROM python:3.7-slim
WORKDIR /app
COPY utils.py ./
COPY server.py ./
ENTRYPOINT ["python", "server.py"]
CMD ["8080"]
```
### Commands to run:
```bash=
docker build -t client -f client.Dockerfile .
docker build -t server -f server.Dockerfile .
```
Since we put both Dockerfiles in the same folder, we give them distinct names and use the `-f` option to point `docker build` at each file, and the `-t` option to name the resulting images.
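Since `CMD ["8080"]` in the server image only supplies a default argument to the `ENTRYPOINT`, the port can be overridden when starting a container, e.g.:
```bash=
# Run the server on the default port 8080 (the CMD argument)...
docker run server

# ...or override the default by passing an explicit port argument.
docker run server 9090
```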
---
### 1.3
> If you built more than one image:
> Do they have some common layers?
Since the server and client code is likely to change more often than the utilities file, we chose to copy `utils.py` in a separate, earlier layer.
Yes, the two images have some common layers, since the following lines of their dockerfiles are identical:
```dockerfile=
FROM python:3.7-slim
WORKDIR /app
COPY utils.py ./
```
>If so, is it something that has an impact on the build time and how?
Building the second image takes less time than it would if it were built first.
The layers common to both images are not rebuilt: they were cached while building the first image, so building the second image is faster.
Similarly, if we change `server.py` or `client.py` without touching the utilities, only the layers from the first changed instruction onward are rebuilt, so rebuilding after such a change is faster than the initial build.
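To illustrate (assuming only `client.py` is edited between the two builds):
```bash=
# First build: every layer is built and cached.
docker build -t client -f client.Dockerfile .

# ... edit client.py only ...

# Second build: the FROM, WORKDIR and COPY utils.py layers are reused
# from the cache; only COPY client.py and later layers are rebuilt.
docker build -t client -f client.Dockerfile .
```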
---
### 1.4 & 1.5
>We want to make sure that:
>- The server and the clients can communicate with each other.
>- Other Docker containers cannot communicate neither with the server nor with the clients.
>How can you satisfy both requirements in Docker? Explain your solution and justify it.
We need to create a network and attach the server instances and the client instances to it.
In the terminal, we create a `client-server-comm` (comm for communication) network:
```bash=
docker network create client-server-comm
```
When we run the `docker run` command, we specify that the instance should be attached to the `client-server-comm` network that we created above:
```bash=
docker run --name server --network client-server-comm -it server
```
The server listens on port 8080 by default (the `CMD` default baked into the `server` image).
Then we retrieve the server container's IP address on the network (a `docker inspect` one-liner for this is shown below) and pass it as the `[server-ip-address]` argument when launching the client instances:
```bash=
docker run -it --name client1 --network client-server-comm client [server-ip-address] 8080
docker run -it --name client2 --network client-server-comm client [server-ip-address] 8080
```
We launch all three instances in interactive mode in order to see the messages printed during the chat.
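One way to obtain `[server-ip-address]` is to query the server container with `docker inspect`:
```bash=
# Print the server container's IP address on each network it is attached to.
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' server
```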
---
### 1.6 & 1.7
>1.6 Why can’t you launch the containers with the same settings as in Exercise 1.5?
We cannot use the same configuration as before: if the client and server containers are not on the same Docker network, a client cannot reach the server using the server container's IP address, which is local to the server's own network. One solution is to publish the server's port 8080 on a port of the host.
We create a network for the server:
```bash=
docker network create server-network
```
When we launch the server container, we publish the container's port 8080 on port 3000 of the host (chosen arbitrarily among the unused ports):
```bash=
docker run -it --name server --network server-network -p 3000:8080 server
```
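We can double-check the mapping with `docker port`:
```bash=
# Show the host address that container port 8080 is published on.
docker port server 8080
```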
We then use the IP address of the host machine (in our case, `138.195.242.106`) to connect the clients to the server, using the command lines below:
```bash=
docker run -it --name client1 client 138.195.242.106 3000
docker run -it --name client2 client 138.195.242.106 3000
```
Note that the clients only open outgoing connections, so they need neither a `--network` option nor any published ports of their own; the default bridge network is enough to reach the published port on the host's address.
---
## 2 Multi-service application: Docker Compose
### 2.1
>Explain the structure of the content of the folder tripmeal_sujet.
We have two files at the root level of the project folder:
- a docker-compose file `docker-compose.yml` in which we will define the services, volumes, networks etc. of our application, so that they can run together in an isolated environment.
- `tripmeal.env` in which we define the environment variables used for the project.
The two folders `web` and `database` correspond to the web server and the database services.
In `database`:
- `Dockerfile` contains the instructions for building the image of the database.
- `init-db.sql` contains instructions for initializing the database (tables to create, field types, etc.).
In `web`:
- The `static` folder, as its name indicates, contains static files served to the users. In our case, it only contains the minified code of the Bootstrap CSS framework and the favicon of the website.
- The `templates` folder contains HTML template files, which the server uses to dynamically generate the HTML pages sent to each user, depending on the data.
- It also contains `requirements.txt`, which lists the Python dependencies of the web service, `app.py` for launching the web server, `dbconnect` for the connection to the database, a license file and a Dockerfile.
---
### 2.2
>How many services does the application have? Which technologies (i.e., programming languages, databases) are used in each service?
The application has 2 services: the database and the server.
- The database service uses MySQL, a relational database management system.
- The web service uses Python as its programming language (along with HTML/CSS/JavaScript for the frontend), with Flask as the backend framework. There is no separate frontend service, since the HTML pages are generated dynamically from templates by Flask.
---
### 2.3
>Write the Dockerfile in the directory web. You need a Python 3.7 environment to run the application.
```dockerfile=
FROM python:3.7-slim
WORKDIR /web
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY . ./
ENTRYPOINT ["python", "app.py"]
```
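As a quick standalone sanity check (outside of Compose, with an arbitrary tag of our choosing), the image can be built from the project root:
```bash=
docker build -t tripmeal-web ./web
```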
---
### 2.4
>What does SERVER_PORT refer to? A port opened at one of the host network interfaces or a port opened at one of the container’s network interfaces?
- `SERVER_PORT` refers to a port opened at one of the container's network interfaces: it is the port on which the application listens inside the container.
- `DATABASE_NAME` gives the name of the MySQL database used by the application (`tripmealdb`).
---
### 2.5
>What do you need to define in file docker-compose.yml to enable the communication between the different services?
>
We need to define a network and attach both services to it so that they can communicate with each other; services on the same network can then reach each other by service name (which is why `DATABASE_HOST` is set to `db`).
---
### 2.6
>Which base image is used to build a container for the database? Where is this base image stored? Can you find the documentation of this image in the Internet? Write down the link to the documentation page in the answer.
The base image of the database service is the official `mysql` image, stored in the Docker Hub registry.
Here is a link to the documentation of the image:
`https://hub.docker.com/_/mysql`
---
### 2.7
>By looking at the documentation of the database image, what do you need to define in docker-compose.yml to make sure that the data is not deleted once the application is taken down?
For the data to survive the application being taken down, we need to declare a volume for the database service and mount it at the path where MySQL stores its data (`/var/lib/mysql`). Since the volume lives on the host machine, it is not deleted when the service is shut down.
---
### 2.8
>Exercise 2.8 Write the file docker-compose.yml. Build, deploy and test your application.
Since we have a dotenv file `tripmeal.env` and we do not want to put sensitive information directly into `docker-compose.yml`, we can pass the environment variables on the command line when we run docker-compose, and gitignore the environment file:
```bash=
docker-compose --env-file ./tripmeal.env up
```
Alternatively, as we have done in the tutorials, we can also list the variables explicitly under the `environment:` key of each service in `docker-compose.yml`:
```yaml=
version: "3"
services:
web:
build: web
image: aubynken/tripmeal-web:1.0
networks:
- tripmeal-net
ports:
- 3030:5000
environment:
- TRIPMEAL_KEY=my-secret-key
- SERVER_PORT=5000
- DATABASE_NAME=tripmealdb
- DATABASE_USER=root
- MYSQL_ROOT_PASSWORD=my-secret-pw
- DATABASE_HOST=db
- DATABASE_PORT=3306
db:
build: database
image: aubynken/tripmeal-db:1.0
networks:
- tripmeal-net
volumes:
- tripmeal-data:/var/lib/mysql
environment:
- DATABASE_NAME=tripmealdb
- TRIPMEAL_KEY=my-secret-key
- DATABASE_USER=root
- MYSQL_ROOT_PASSWORD=my-secret-pw
- DATABASE_HOST=db
- DATABASE_PORT=3306
networks:
tripmeal-net:
volumes:
tripmeal-data:
```
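We can then build, deploy and test the application; given the `ports` mapping above, the web interface is reachable on host port 3030:
```bash=
# Build the images if needed and start both services in the foreground.
docker-compose up --build

# When we are done testing, take the application down.
docker-compose down
```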
---
### 2.9
>Use Docker Compose to push all the images that you built in this section to your DockerHub registry.
>In the answer to this exercise write:
> - the command that you execute to push the images.
> - the link to the uploaded images.
Before pushing our images, we gave them proper names and version tags (e.g. `image: aubynken/tripmeal-db:1.0` instead of `image: tripmeal-db`).
- We build the images of the two services:
```bash=
docker-compose build
```
- And we push the images of our application to the docker registry:
```bash=
docker-compose push
```
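Note that pushing to the `aubynken/...` repositories requires an authenticated Docker Hub session; if we are not logged in yet:
```bash=
docker login
```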
The links to the two images on the Docker registry are:
- https://hub.docker.com/repository/docker/aubynken/tripmeal-web
- https://hub.docker.com/repository/docker/aubynken/tripmeal-db
---
## 3 Local Kubernetes cluster
### 3.1
>Specify which services are stateless and which ones are stateful. Justify your answer.
The database service is **stateful** because it creates and modifies persistent data (recipes, user information, etc.).
The web service, however, is not bound to persistent data: everything it uses is stored in the database service. When a request requiring database information reaches the web service, any of the web pods can contact the database backend and answer it, independently of which data is requested. The web service is therefore **stateless**.
### 3.2
>For each service of the application TripMeal, specify the Kubernetes objects that you need to create and their types. Justify your answers.
For each service of the application, we need a workload resource and a Service resource.
#### WebService
- The workload resource will be a **Deployment**.
Here we simply want to deploy replicas of the same pod. A Deployment monitors the actual state of the web service and updates it to match the desired state.
- The service resource will be a Service of type **LoadBalancer**.
Using a LoadBalancer, we can map the internal IP addresses of the web pods to a publicly accessible address, while balancing the load across all available pods.
#### Database service
- The workload resource will be a **StatefulSet**.
As mentioned previously, the database service is stateful. In order to manage persistent data, the data in the pods must be coherent. A StatefulSet is necessary here because it provides guarantees about the ordering and uniqueness of our Pods.
- The service resource will be a Service of type **ClusterIP**.
Clients interact with the web service and do not need to (and should not) contact the database service directly. A ClusterIP is well suited here, since it makes the database service reachable from the web service without exposing it outside the cluster.
### 3.3
>For each Kubernetes object:
>1. Configure. Write a Yaml configuration file. Remember that you need to pass the application the environment variables.
For simplicity, we use a single configuration file to define our four Kubernetes objects.
```yaml=
# frontend
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: tripmeal
      service: web
  template:
    metadata:
      labels:
        app: tripmeal
        service: web
    spec:
      containers:
        - image: aubynken/tripmeal-web:1.0
          name: web
          ports:
            - containerPort: 5000 # arbitrary
              protocol: TCP
          env:
            - name: DATABASE_NAME
              value: "tripmealdb"
            - name: DATABASE_USER
              value: "root"
            - name: MYSQL_ROOT_PASSWORD
              value: "my-secret-pw"
            - name: DATABASE_HOST
              value: "db"
            - name: DATABASE_PORT
              value: "3306"
            - name: TRIPMEAL_KEY
              value: "my-secret-key"
            - name: SERVER_PORT
              value: "5000"
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  ports:
    - port: 8080
      targetPort: 5000
      protocol: TCP
  selector:
    app: tripmeal
    service: web
---
# database
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  selector:
    matchLabels:
      app: tripmeal
      service: db
  serviceName: db
  template:
    metadata:
      labels:
        app: tripmeal
        service: db
    spec:
      containers:
        - image: aubynken/tripmeal-db:1.0
          name: db
          ports:
            - containerPort: 3306
          volumeMounts:
            - mountPath: /var/lib/mysql
              name: tripmeal-data
          env:
            - name: DATABASE_NAME
              value: "tripmealdb"
            - name: DATABASE_USER
              value: "root"
            - name: MYSQL_ROOT_PASSWORD
              value: "my-secret-pw"
            - name: DATABASE_HOST
              value: "db"
            - name: DATABASE_PORT
              value: "3306"
            - name: TRIPMEAL_KEY
              value: "my-secret-key"
            - name: SERVER_PORT
              value: "5000"
  volumeClaimTemplates:
    - metadata:
        name: tripmeal-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 200Mi # arbitrary, for illustration
---
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  type: ClusterIP
  ports:
    - port: 3306
      protocol: TCP
  selector:
    app: tripmeal
    service: db
```
>2. Create. Create the object in Kubernetes with kubectl.
After saving the configuration file from the previous question as `tripmeal.yaml` and making sure that the current working directory contains this file, we run the following command:
```bash=
kubectl create -f tripmeal.yaml
```
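Optionally (with a reasonably recent kubectl), the manifest can be validated client-side before creating anything, and the pods watched as they come up:
```bash=
# Validate the manifest without creating any object.
kubectl apply --dry-run=client -f tripmeal.yaml

# Watch the pods until they reach the Running state.
kubectl get pods -w
```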
>3. Analyze. Execute the command kubectl get all and explain in detail which objects appear in the output as the result of step 2.
Here is the output that we get after running `kubectl get all` in the terminal:
```bash=
NAME READY STATUS RESTARTS AGE
pod/db-0 1/1 Running 0 5s
pod/web-84f7cbb6f9-swhv7 1/1 Running 0 5s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/db ClusterIP 10.102.164.127 <none> 3306/TCP 5s
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 19d
service/web LoadBalancer 10.111.108.24 localhost 8080:30794/TCP 5s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 1/1 1 1 5s
NAME DESIRED CURRENT READY AGE
replicaset.apps/web-84f7cbb6f9 1 1 1 5s
NAME READY AGE
statefulset.apps/db 1/1 5s
```
Although all objects were created with a single shell command, we can still infer which objects each step would create. Had we split the configuration into four files and run the commands separately:
- After deploying the **Deployment** object of the frontend, we would have `deployment.apps/web`, which runs on top of `replicaset.apps/web-84f7cbb6f9`. The ReplicaSet in turn controls `pod/web-84f7cbb6f9-swhv7`. These are the three objects created.
- After deploying the **LoadBalancer** object for the front end, we would have `service/web`.
- After deploying the **StatefulSet** for the database, we would have `statefulset.apps/db` and the pod that it controls, `pod/db-0`.
- After deploying the **ClusterIP** for the database, we would have `service/db`.
---
## 4 Kubernetes cluster on Microsoft Azure
### Exercise 4.1 In your view, why isn’t it a good idea to have production servers directly connected to the Internet?
**Answer**:
Connecting production servers directly to the Internet would pose a serious risk to the infrastructure, since it exposes the servers to arbitrary external hosts. When the servers are not directly reachable from the Internet, attacking the infrastructure becomes much harder.
---
### Exercise 4.2 What is a resource group in Azure and how is it useful?
**Answer**:
According to the documentation, a resource group is a container that holds related resources for an Azure solution.
Using resource groups, we can deploy, manage, and monitor all the resources for a group of services at once, rather than handling resources individually.
---
### Exercise 4.3 In the previous command, what does sku refer to? Feel free to look up on the Azure website to find the answer.
**Answer**:
SKU stands for **stock keeping unit**; in retail it designates a distinct purchasable item. In the context of Microsoft Azure, the chosen SKU determines the resources of our virtual machines (disk size, RAM size, number of cores, etc.).
---
### Exercise 4.4 Write the exact Docker command that you use to tag the images.
**Answer**:
The command that we executed is:
```bash=
docker tag aubynken/tripmeal-web:1.0 tripmeallab.azurecr.io/tripmeal-web:latest
```
And:
```bash=
docker tag aubynken/tripmeal-db:1.0 tripmeallab.azurecr.io/tripmeal-db:latest
```
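After tagging, the images still need to be pushed to the registry; assuming the registry `tripmeallab` from the commands above:
```bash=
# Authenticate the local Docker CLI against the Azure Container Registry.
az acr login --name tripmeallab

# Push both tagged images.
docker push tripmeallab.azurecr.io/tripmeal-web:latest
docker push tripmeallab.azurecr.io/tripmeal-db:latest
```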
---
### Exercise 4.5 Can you tell the meaning of the options in the previous command?
**Answer**:
- `--resource-group` indicates the resource group the cluster belongs to.
- `--name` is simply the name of the cluster.
- `--node-count` is the number of nodes we want in the node pool.
- `--generate-ssh-keys` generates SSH public and private key files if they are missing.
- `--attach-acr` grants the cluster the "acrpull" role on the specified Azure Container Registry, which lets it pull images from it. (See the sketch below for how these options fit together.)
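For reference, the command under discussion has roughly the following shape; the resource group and cluster names below are illustrative placeholders, not the exact ones we used:
```bash=
# Placeholder names: tripmeal-rg (resource group), tripmeal-cluster
# (cluster); tripmeallab is the registry from the previous exercises.
az aks create \
  --resource-group tripmeal-rg \
  --name tripmeal-cluster \
  --node-count 2 \
  --generate-ssh-keys \
  --attach-acr tripmeallab
```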
---
### Exercise 4.6 Look at the names of the images of each service in that file. How must these names change? Modify the file accordingly
**Answer**:
`image: aubynken/tripmeal-web:1.0` should be changed to `xxx.azurecr.io/tripmeal-web:latest`, where `xxx.azurecr.io` is the ACR login server that Azure assigned to us (in our case, `tripmeallab.azurecr.io`, as used in Exercise 4.4). The same goes for the database image.