## Basic Commands
### Create / run a container for the first time
#### docker run
```bash!
docker --debug run --interactive --tty --publish <port-on-host>:<port-within-container> --volume "/path/outside/container/folder1:/path/inside/container/folder1" --volume "/path/outside/container/folder2:/path/inside/container/folder2" --name <CONTAINER_NAME> <valid-image-name>:<supported-tag>
```
Note :
- `<port-within-container>` indicates the port the internal application listens on inside the container
- `<port-on-host>` indicates the external TCP port on the host machine that Docker maps to the container port
- for `<valid-image-name>` and `<supported-tag>` , search for the specific image on [Docker Hub](https://hub.docker.com)
#### docker compose
- this allows you to start multiple containers at once
- it collects all settings / options in a YAML file, instead of appending them to the command
- a later compose file can override options in a preceding one. For example, in the command below, options in `overwrite-compose.yml` override the same options in `base-compose.yml`
```bash!
docker compose --file ./base-compose.yml --file ./overwrite-compose.yml up --detach
```
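As a minimal sketch of the override behaviour (the `web` service, image tags, and file contents here are hypothetical): later files are merged on top of earlier ones, so a scalar option like `image` in the override file replaces the value from the base file.

```yaml!
# base-compose.yml (hypothetical)
services:
  web:
    image: nginx:1.25
    restart: unless-stopped
```

```yaml!
# overwrite-compose.yml (hypothetical) -- merged on top of base-compose.yml ;
# the scalar option `image` replaces the value from the base file
services:
  web:
    image: nginx:1.27
```

Running `docker compose --file ./base-compose.yml --file ./overwrite-compose.yml config` prints the merged result, which is a handy way to verify what actually gets applied.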
### query docker status
#### query images / containers
```bash!
# list all existing images
docker images --all
# list existing images with filtered keyword
docker images --filter=reference='*whatever*'
# list all existing containers
docker ps --all
# check detail attributes of specific container
docker inspect <specific-container-id>
# check disk usage from images, containers, cache ...etc
docker system df
```
- the commands above list all existing images / containers with important attributes
    - such as the user-defined name, container ID, uptime, and the ports each container publishes
- several different containers can refer to the same image, each running with its own configuration settings
- the detailed attributes from `docker inspect` include the container's mount points (volumes)
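To illustrate the point about several containers sharing one image, a quick sketch (container names, ports, and the image tag are made up):

```bash!
# two containers created from the same image,
# each with its own name and published host port
docker run --detach --name web-a --publish 8081:80 nginx:alpine
docker run --detach --name web-b --publish 8082:80 nginx:alpine
# both show up here, referring to the same image
docker ps --filter "ancestor=nginx:alpine"
```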
#### query network
```bash!
# check summary of all existing networks
docker network ls
# check details of a specific network, such as which containers are attached to it
docker network inspect <your-network-name>
```
### remove existing container by its ID or user-defined name
```bash!
docker rm <CONTAINER_ID_OR_NAME>
```
- the container must be stopped first ; add `--force` to remove a running one
### run / stop existing container
```shell!
docker start <CONTAINER_ID_OR_NAME>
docker stop <CONTAINER_ID_OR_NAME>
```
## Run shell command inside container
### run a single shell command inside container
```shell!
docker exec --interactive --tty <CONTAINER_ID_OR_NAME> <ANY_SHELL_COMMAND_TO_RUN>
```
### Create interactive shell environment within the container
- lets you run multiple commands interactively inside the container.
```bash!
docker exec --interactive --tty --user <UID_OR_USERNAME_ADDED_INSIDE_CONTAINER> <CONTAINER_ID_OR_NAME> <shell-inside-container>
```
- `<shell-inside-container>` could be the basic `/bin/sh` or `bash` (typically at `/bin/bash`) ; the shell binary must exist inside the container image
- `--user` is optional ; if omitted, the command runs as the image's default user, often `root`
## Troubleshooting
### Logging at container level
```bash!
# read log message for specific container
docker logs <CONTAINER_ID_OR_NAME>
# monitor real-time events from all docker containers
docker events
```
### Internal User Identity
Check the UID / GID of an internal user if your container supports the command `id` ; this is helpful for setting access control on the mount-point paths of volumes
```bash!
docker exec <CONTAINER_ID_OR_NAME> id whatever-user-inside
```
check out the example : [user `mysql` in mariaDB server container](https://stackoverflow.com/a/67775426/9853105)
## Create Your Own Docker Image
### Dockerfile
Imagine you want to create a Docker image for a Python web app. Here’s a simple `Dockerfile`:
```dockerfile!
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3 python3-pip
COPY app.py /
RUN pip install flask
CMD ["python3", "/app.py"]
```
- `FROM ubuntu:22.04`, this starts with an official Ubuntu image.
- `RUN apt-get update && apt-get install ...`, installs Python and pip.
- `COPY app.py /`, copies your application code into the image.
- `RUN pip install flask`, installs the Flask library.
- `CMD ["python3", "/app.py"]`, sets the default command to run your app.
### Start Building Custom Image
```bash!
docker build --tag=your-img-name --file=/path/to/Dockerfile <PATH>
```
- the build context `<PATH>` should contain all the files / folders required for this `docker build` ; in other words, your `Dockerfile` must not reference any path outside the context, such as `../` or `../../` , otherwise docker reports an error and terminates immediately
- once your custom image is successfully built, check the latest image with `docker image ls` or `docker images` ; the previous old image becomes dangling (tagged `<none>`) and you can manually remove it with `docker image rm`
#### working directory for the build
The `PATH` argument in the `docker build` command specifies the build context, which is the set of files available to the Docker build process. This is not the same as the `PATH` environment variable inside the container.
##### Example: Using the `PATH` Argument in `docker build`
Suppose you have the following project structure:
```
my-app/
├── Dockerfile
├── app.py
└── requirements.txt
```
To build a Docker image using the files in the `my-app` directory, you would run:
```sh
docker build -t my-app-image ./my-app
```
Here, `./my-app` is the `PATH` argument specifying the build context. Docker will look for the `Dockerfile` and all referenced files (like `app.py` and `requirements.txt`) in this directory. If you run the command from inside the `my-app` directory, you can use `.` as the context:
```sh
docker build -t my-app-image .
```
This makes all files in the current directory available to the build process. If the `Dockerfile` or any files referenced in it are not present in the specified context, the build will fail. See [Building images](https://docs.docker.com/get-started/docker-concepts/building-images/build-tag-and-publish-an-image/) and [Build context](https://docs.docker.com/build/concepts/context/).
**Key Point:**
The `PATH` argument is always the last argument in the `docker build` command and points to the directory (or tarball, or URL) that serves as the build context.
For more details, see the official documentation on [build context](https://docs.docker.com/build/concepts/context/).
### Share parameters in Dockerfile
- use `ARG` in a single build stage
- for multi-stage builds
    - declare the parameter with a default value before the very first base image (`FROM`)
    - then in each stage, declare the parameter again without any value (it inherits the global default).
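A minimal sketch of the multi-stage pattern (the image references, stage names, and the parameter `PY_VERSION` are hypothetical):

```dockerfile!
# declared before the very first FROM, with a default value ;
# this global ARG is usable in FROM lines
ARG PY_VERSION=3.12

FROM python:${PY_VERSION}-slim AS builder
# redeclared without a value -> inherits the global default
ARG PY_VERSION
RUN echo "builder stage uses Python ${PY_VERSION}"

FROM python:${PY_VERSION}-slim AS runtime
ARG PY_VERSION
RUN echo "runtime stage uses Python ${PY_VERSION}"
```

The default can still be overridden at build time with `docker build --build-arg PY_VERSION=3.11 ...`.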
## Orchestrate containers with docker compose
### External variables
#### Variables to Container
xx
#### Variable Interpolation
https://docs.docker.com/compose/how-tos/environment-variables/variable-interpolation/
### determine image source
xx
### share parameters in compose file
xx
### determine volumes
volumes map files / folders between the host machine and the container, so the container can access files located on the host machine.
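A sketch in compose syntax (the service, image, and path names are hypothetical):

```yaml!
services:
  db:
    image: mariadb:11
    volumes:
      # bind mount : <path-on-host>:<path-inside-container>
      - ./db-data:/var/lib/mysql
      # named volume, managed by docker itself
      - db-backup:/backup

volumes:
  db-backup:
```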
### define your own health check mechanism
xxx
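As a starting point, a hypothetical compose-level health check ; it assumes the image ships a `wget` binary, so adapt the `test` command to whatever tool actually exists inside your container:

```yaml!
services:
  web:
    image: nginx:alpine
    healthcheck:
      # exit code 0 = healthy, non-zero = unhealthy
      test: ["CMD-SHELL", "wget -q --spider http://localhost:80/ || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
```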
---
## Concept and Good Practice
### Layers
> the official Dockerfile documentation describes the instruction `RUN` as follows :
>>The RUN instruction will execute any commands to create a new layer on top of the current image. The added layer is used in the next step in the Dockerfile. RUN has two forms
>
> My question is : what is a `layer` in this context ? why and when does a docker image have several layers ? please explain this with an example to those who are new to docker and have zero knowledge about containerization technology .
A layer in Docker is a set of filesystem changes (like adding, modifying, or deleting files) that results from a single instruction in a Dockerfile, such as RUN, COPY, or ADD. Each time you use one of these instructions, Docker creates a new layer on top of the previous one. Layers are stacked to form the final image.
#### Why does a Docker image have several layers?
Docker images are made up of multiple layers to:
- Reuse and share common parts: If two images share the same base (for example, both start from ubuntu:22.04), Docker only needs to store and download those shared layers once, saving space and bandwidth.
- Speed up builds: If you change only one part of your Dockerfile, Docker can reuse the layers that haven't changed, making builds much faster.
- Make images smaller and more efficient: By only adding what's necessary in each layer, images can be kept lightweight.
Consider the Dockerfile below :
```dockerfile!
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3 python3-pip
COPY app.py /
RUN pip install flask
CMD ["python3", "/app.py"]
```
- `FROM ubuntu:22.04`, Layer 1: Ubuntu is the chosen base image.
- `RUN apt-get update && apt-get install ...`, Layer 2: Adds Python and pip to Ubuntu.
- `COPY app.py /`, Layer 3: Adds your `app.py` file.
- `RUN pip install flask`, Layer 4: Adds Flask to the image.
- `CMD ["python3", "/app.py"]`, This is metadata, not a new layer.
Each of these steps (except `CMD`) creates a new layer. If you later change only `app.py`, Docker will reuse the first two layers and rebuild only the `COPY` layer and every layer after it, making the process much faster.
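To see the layers for yourself, `docker history` lists every instruction of an image together with the size each layer adds (the image tags below are just examples):

```bash!
# one row per Dockerfile instruction ; the SIZE column shows
# how much each layer contributes to the final image
docker history ubuntu:22.04
# during a rebuild, steps served from the layer cache are reported as CACHED
docker build --tag my-app .
```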
TODO , arrange messages with dockerdoc AI below
### Base Image for Each Build Stage
> does the command `FROM` come from a valid image name and tag on Docker Hub ?
### Image Size
#### Image Size Estimation
check out [this decent study note](https://gist.github.com/MichaelSimons/fb588539dcefd9b5fdf45ba04c302db6) in GitHub Gist
#### Factors that affect Final Image Size
##### Question
> if I don't change OS-level or python-level dependencies to installed in the `RUN` section, does docker also reuse previous built / downloaded files if exists , without rebuilding / reinstalling all stuff ?
##### Question
> Assume I don't install any huge 3rd-party dependency , does number of layers directly affect final image size ? such as the more layers the bigger the image would be.
### docker network
This enables containers to communicate with each other, the Docker host, and external networks with varying levels of isolation and scope.
- any 2 containers can communicate with each other by `container name`, if they join at least one common docker network
    - docker's internal DNS resolver translates the container name to the container's internal IP address (e.g. `172.19.xx.xxx`)
- otherwise, if they don't share a network, they must communicate by IP address or by a domain name resolvable at the host machine level.
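A quick sketch of name-based communication on a user-defined network (the network and container names are made up):

```bash!
# create a user-defined bridge network
docker network create my-net
# attach two containers to the same network
docker run --detach --name api --network my-net nginx:alpine
docker run --detach --name client --network my-net alpine sleep 3600
# inside `client`, the name `api` resolves to the peer's internal IP
docker exec client ping -c 1 api
```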
### Disk Usage Management
#### monitor usage
TODO
- check dangling images / containers
#### reclaim disk space
- remove dangling images
- clean the builder cache with `docker builder prune` ; use `--all` (remove all cached items) or `--filter` (partial deletion, TODO)
- can container logs be cleaned ? TODO
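A few cleanup commands that go with the list above (each prints how much space was reclaimed ; add `--force` to skip the confirmation prompt):

```bash!
# list dangling images (untagged <none> entries)
docker images --filter "dangling=true"
# remove dangling images only
docker image prune
# more aggressive : remove all unused images, stopped containers,
# unused networks, and the build cache
docker system prune --all
```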
### Question 5
- xx xx xx
---
## Reference
- [Difference Between Docker Images and Containers -- AWS doc](https://aws.amazon.com/compare/the-difference-between-docker-images-and-containers/)
- [docker doc -- compose quickstart](https://docs.docker.com/compose/gettingstarted/)
- [docker doc -- Publishing and exposing ports](https://docs.docker.com/get-started/docker-concepts/running-containers/publishing-ports/)
- [How Container Networking Works: a Docker Bridge Network From Scratch](https://labs.iximiuz.com/tutorials/container-networking-from-scratch)
- [Docker Subreddit -- Is it a good idea to move the docker "volumes" directory to another path](https://www.reddit.com/r/docker/comments/1d8op63)
- [Docker Study Note in Miro (traditional Chinese)](https://miro.com/app/board/uXjVPxDUwN8=/)