# Docker
Docker is a platform for building, shipping, and running applications in containers. Containers are lightweight, portable environments that package all the dependencies and libraries an application needs to run. Docker allows developers to easily deploy and manage applications across different environments, from development to production, without worrying about compatibility issues or differences in the underlying infrastructure. This consistency is what makes it a popular tool for modern software development and deployment.
*Is Docker a virtual machine?*
It looks like a virtual machine, but the functionality is not the same.
Unlike a Docker container, a virtual machine includes a complete operating system; it works independently and acts like a separate computer.
Docker instead shares the host machine's kernel and resources to run its isolated environments.
### Terminology
* **Image:** a template for creating Docker containers.
* **Container:** an isolated runtime environment that runs an application from an image.
* **Docker daemon:** a background process that manages containers and images.
* **Docker client:** a command-line interface used to interact with the Docker daemon.
* **Docker Hub:** a cloud-based registry where Docker users can store, share, and download images.

### Commands
* `docker version` : View installed version
* `docker run hello-world` : Run a container to validate that Docker is installed correctly.
* `docker pull busybox` : Pulls the busybox image from the Docker registry and saves it to the system.
* `docker images` : See all the images that are in our system.
* `docker ps` : Shows all containers running on the system.
* `docker ps -a` : Shows all containers that have been run on the system.
* `docker run -it busybox sh` : Running `run` with the `-it` flags gives you an interactive terminal, so you can run commands inside the container.
* `docker rm container_id` : Delete a container by ID.
* `docker rm $(docker ps -a -q -f status=exited)` : Delete all containers with exited status.
* `docker container prune` : Recent versions of Docker ship with `prune`, which removes all stopped containers in one command.
* `docker rmi image_id` : Delete an image.
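Putting a few of these together, a typical first session might look like this (a sketch; prompts and output will differ slightly on your machine):
```
$ docker pull busybox
$ docker run -it busybox sh
/ # echo "hello from busybox"
hello from busybox
/ # exit
$ docker container prune
```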
## Run Airflow Locally With Docker
On Windows, to install Docker Compose, open the official Docker website and download and install Docker Desktop, which will also install Docker Compose for you. After that, reboot the system and start using Docker Compose on Windows.
Before doing anything, check your Docker version > `docker --version`
```
Docker version 20.10.23, build 7155243
```
Check your Docker Compose version > `docker-compose version`
```
Docker Compose version v2.15.1
```
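Note that Docker Desktop ships Compose V2 as a plugin for the `docker` CLI, so depending on your installation both invocation styles may work:
```
docker-compose version   # standalone/legacy-style binary
docker compose version   # Compose V2 as a docker CLI plugin
```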
### There are two main ways to do it
1. Using the `docker-compose.yaml` file
2. Running it through the terminal
---
## `docker-compose.yaml` file ([web](https://towardsdatascience.com/run-airflow-docker-1b83a57616fb))
* *Step 1*
- [ ] Fetch docker-compose.yaml
The first thing we’ll need is the `docker-compose.yaml` file. Open a WSL terminal (Ubuntu) and create a new directory in your home directory (let’s call it `airflow-local`):
```
$ mkdir airflow-local
$ cd airflow-local
```
- [ ] Change directory to `airflow-local`
- [ ] [Fetch](https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html#fetching-docker-compose-yaml) the `docker-compose.yaml` file (note that we will be using Airflow v2.5.3) by running
```
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.5.3/docker-compose.yaml'
```
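Here `-L` follows redirects, `-f` makes curl fail on HTTP errors instead of saving an error page, and `-O` saves the file under its remote name. You can confirm the download:
```
$ ls
docker-compose.yaml
```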

* *Step 2*
- [ ] Create three additional directories in the `airflow-local` directory (see the command after this list):
- `dags`
- `logs`
- `plugins`
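For example, from inside `airflow-local` (this is the same layout the official Airflow guide uses; the containers mount these folders):
```
$ mkdir -p ./dags ./logs ./plugins
```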
* *Step 3*
- [ ] Setting up the Airflow user
Now we have to export environment variables to ensure that the folders on your host machine and the folders within the containers share the same permissions. We will simply add these variables to a file called `.env`:
`echo -e "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env`
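You can check what was written; the UID below is only an example, yours depends on your user:
```
$ cat .env
AIRFLOW_UID=1000
AIRFLOW_GID=0
```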

For other operating systems, you may get a warning that `AIRFLOW_UID` is not set, but you can safely ignore it. You can also manually create a `.env` file in the same folder as `docker-compose.yaml` with this content to get rid of the warning:
```
AIRFLOW_UID=50000
```
* *Step 4*
- [ ] Initialise the Airflow Database
Now we are ready to initialise the Airflow database by first starting the `airflow-init` container:
`docker-compose up airflow-init`
This service will essentially run `airflow db init` and create the admin user for the Airflow database. By default, the account created has the login `airflow` and the password `airflow`.
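If initialisation succeeds, the logs should end roughly like this (exact messages vary by Airflow version):
```
airflow-init_1       | User "airflow" created with role "Admin"
airflow-init_1       | 2.5.3
start_airflow-init_1 exited with code 0
```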
#### If you have error with [WSL](https://docs.docker.com/desktop/windows/wsl/):

In Windows Terminal, make sure your distribution is running WSL version 2:
```
wsl.exe --set-version Ubuntu 2
```
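You can first check which WSL version each installed distribution is using:
```
wsl.exe -l -v
```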

* *Step 5*
- [ ] Start Airflow services
The final thing we need to do to get Airflow up and running is to start the Airflow services we’ve seen in Step 1.
`$ docker-compose up`

Note that the above command may take a while since multiple services need to be started. Once done, you can verify that these containers are up and running using the following command in a new command-line tab:
`docker ps`
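If everything started correctly, the output should include one container per long-running service in `docker-compose.yaml`. A sketch of what to expect (IDs and names will differ):
```
$ docker ps
# Expect containers for postgres, redis, airflow-webserver,
# airflow-scheduler, airflow-worker and airflow-triggerer,
# each with STATUS "Up ... (healthy)"
```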

Docker Desktop shows the same list of running containers.

* *Step 6*
- [ ] Access Airflow UI
In order to access Airflow User Interface simply head to your preferred browser and open `localhost:8080`.

Type in your credentials (as already noted, by default these will both be set to `airflow`) and hit ‘Sign in’. You should now have access to the Airflow dashboard, where you can see some of the example DAGs shipped with Airflow.

* *Step 7*
- [ ] Enter the Airflow Worker container
You can even enter the worker container so that you can run `airflow` commands there. You can find the `<container-id>` for the Airflow worker service by running `docker ps`:
```
$ docker exec -it <container-id> bash
```
For example:
```
$ docker exec -it d2697f8e7aeb bash
default@d2697f8e7aeb:/opt/airflow$ airflow version
2.3.0
```
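Alternatively, since the services have fixed names in the compose file, you can exec by service name instead of container ID (assuming the default `airflow-worker` service name from the fetched file):
```
$ docker-compose exec airflow-worker bash
```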
* *Step 8*
- [ ] Cleaning up the mess
Once you are done with your experimentation, you can clean up the mess we’ve just created by simply running
```
$ docker-compose down --volumes --rmi all
```
This command will stop and delete all the containers, remove the volumes holding database data, and remove the downloaded images.
If you run `docker ps` once again, you can verify that no container is up and running:
```
$ docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
```
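If you only want to stop the services but keep the database volume and downloaded images for a faster restart later, run `down` without the extra flags:
```
$ docker-compose down
```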
---
## Docker through the terminal (Ubuntu)
* Check if you have containers running > `docker ps`

* If you do, stop them > `docker container stop <container id>`
- [ ] 1. Sign Up [Docker](https://hub.docker.com/)
Download the Airflow image (you can search for more images in Docker Desktop with Ctrl+K). In this case we will use this one:
```
$ docker pull puckel/docker-airflow
Using default tag: latest
latest: Pulling from puckel/docker-airflow
bc51dd8edc1b: Pull complete
dc4aa7361f66: Pull complete
5f346cb9ea74: Pull complete
a4f1efa8e0e8: Pull complete
7e4812fc693b: Pull complete
f46373e205f2: Pull complete
3c982a1645fa: Pull complete
c39994a04957: Pull complete
8eece23a38e7: Pull complete
Digest: sha256:e3012994e4e730dccf56878094ff5524bffbe347e5870832dd6f7636eb0292a4
Status: Downloaded newer image for puckel/docker-airflow:latest
docker.io/puckel/docker-airflow:latest
```

- [ ] 2. Check that your image was downloaded correctly:
`$ docker images`
```
REPOSITORY              TAG      IMAGE ID       CREATED        SIZE
redis                   latest   33e3db53b328   40 hours ago   117MB
postgres                13       ab3945c8cf71   42 hours ago   374MB
apache/airflow          2.5.3    954e772ab9a0   13 days ago    1.35GB
puckel/docker-airflow   latest   ce92b0f4d1d5   3 years ago    797MB
```
- [ ] 3. Change directory to your project and run:
```
$ docker run -d -p 8080:8080 -v ${PWD}:/usr/local/airflow/dags puckel/docker-airflow webserver
```
* `-d`: run the container detached (in the background), so you keep control of the terminal after starting the project.
* `-p`: map ports between host and container; the first 8080 is the port on your PC and the second 8080 is the port inside the Airflow image.
* `-v`: mount the project path on your PC into the container (here, into the container's `dags` folder).
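As a variation (not part of the original steps), Docker's `--name` flag gives the container a predictable name, so you don't have to look up the generated one (like `naughty_margulis` below):
```
# "airflow-local" below is just an example container name
$ docker run -d -p 8080:8080 --name airflow-local \
    -v ${PWD}:/usr/local/airflow/dags puckel/docker-airflow webserver
```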
- [ ] 4. Enter the running container (`<container name>` == `naughty_margulis`; find yours with `docker ps`)
```
$ docker exec -ti <container name> bash
$ docker exec -ti naughty_margulis bash
```
- [ ] 5. Now you can open Airflow in the browser: [localhost:8080](http://localhost:8080/)
- [ ] 6. Once the container is running, we will need to get inside it to install the environment requirements:
```
$ docker exec -ti <container name> bash
```

- [ ] 7. Look for the `requirements.txt` file inside the `dags` folder and install it (the folder was mounted to `/usr/local/airflow/dags` in step 3):
```
$ cd /usr/local/airflow/dags
$ pip3 install -r requirements.txt
```

## Some tools for MLOps
* [Website of MLOps programs](https://mlops.toys/)
* [Tools for MLOps](https://github.com/darkanita/MLOps-Dummies-DVC)
