---
tags: BigData-MS-2019, BigData-BS-2019, BigData-MS-2020, BigData-BS-2020
title: Lab Block 1.2. Docker.
---

# Lab Block 1: Docker

This tutorial guides you through creating your first Docker container. It draws heavily on the official [Docker Getting Started Guide](https://docs.docker.com/get-started/).

:::info
This tutorial comes with a troubleshooting section that lists the most common problems.
:::

<!-- [TOC] -->

<!--
## Install Docker

Install Docker on Linux: https://phoenixnap.com/kb/how-to-install-docker-on-ubuntu-18-04 or https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-18-04

You can install MacOS:

Windows:

Vagrant uses VirtualBox as a standard hypervisor. At least one hypervisor provider is required to run a VM.

Optional: Download Box images
- [i386](https://cloud-images.ubuntu.com/vagrant/trusty/current/trusty-server-cloudimg-i386-vagrant-disk1.box)
- [amd64](https://cloud-images.ubuntu.com/vagrant/trusty/current/trusty-server-cloudimg-amd64-vagrant-disk1.box)
- Download from [edisk](https://edisk.university.innopolis.ru/edisk/Public/Fall2018/)

## Setting up Vagrant project with a local box image

Alternatively, you can pre-download a box image and use it to init your VM.

```bash
vagrant init my-box /path/to/my-box.box
```
-->

## How is Docker different from VMs?

A virtual machine emulates a complete, isolated OS. It requires the same resources from the host as a normal OS would: it loads its own kernel into memory, loads all the necessary kernel modules and libraries, and only then allocates resources for a user application. If you run 100 identical VMs, they occupy roughly 100 times the resources.

A Docker container, on the other hand, runs natively on Linux and shares the kernel of the host machine with other containers. It runs as a discrete process, taking no more memory than any other executable, which makes it lightweight.

The difference can be intuitively demonstrated by the following images

![](https://docs.docker.com/images/VM%402x.png =300x) ![](https://docs.docker.com/images/Container%402x.png =300x)

*Source: [Docker: Get Started](https://docs.docker.com/get-started/)*
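If you already have Docker installed (installation is covered just below), you can observe this kernel sharing directly. A minimal sketch, assuming the small `alpine` image can be pulled from Docker Hub:

```bash
# Kernel release reported by the host
uname -r

# Kernel release reported from inside a container: it is the same,
# because the container reuses the host kernel instead of booting its own
docker run --rm alpine uname -r
```

A VM, by contrast, would report the kernel version of its own guest OS.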
Follow the [official guidelines](https://docs.docker.com/engine/install/) to install Docker. Mac and Windows users should install Docker Desktop. Linux users should follow the instructions for servers.

:::warning
Linux users should also complete the [post-installation instructions](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user) to manage Docker as a non-root user.
:::

## Check Docker Info

We assume you have already installed Docker on your OS. Open your favorite terminal emulator and try running

```
docker --version && docker info
```

<!--
1. Run docker --version and ensure that you have a supported version of Docker:

```bash
docker --version

Docker version 17.12.0-ce, build c97c6d6
```

2. Run docker info (or docker version without --) to view even more details about your Docker installation:

```bash
docker info

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 17.12.0-ce
Storage Driver: overlay2
...
```
-->

## Test Docker installation

:::danger
Avoid installing Docker as a snap package. People have reported multiple issues with this type of installation.
*16/09/2019*
:::

Test that your installation works by running the simple Docker image, `hello-world`:

```
docker run hello-world
```

List local images and find `hello-world`, which was downloaded to your machine:

```
docker image ls
```

List the `hello-world` container (spawned by the image), which exits after displaying its message. If it were still running, you would not need the `--all` option:

```
docker container ls --all

CONTAINER ID   IMAGE         COMMAND    CREATED          STATUS
54f4984ed6a8   hello-world   "/hello"   20 seconds ago   Exited (0)
```

A container is launched by running an image. An image is an executable package that includes everything needed to run an application -- code, runtime, libraries, environment variables, and configuration files. A container is a runtime instance of an image -- what the image becomes in memory when executed (that is, an image with state, or a user process). You can see a list of your running containers with `docker ps`, just as you would list processes with `ps` in Linux.

## Build Your Image and Instantiate Your Container

A `Dockerfile` defines what goes on in the environment inside your container. Access to resources like networking interfaces and disk drives is virtualized inside this environment, which is isolated from the rest of your system, so you need to map ports to the outside world and be specific about what files you want to "copy in" to that environment. However, after doing that, you can expect that the build of your app defined in this `Dockerfile` behaves exactly the same wherever it runs.

### Before Building an Image

Create a `Dockerfile` (a plain text file, no extension) with the following content

```dockerfile
# Use an official Python runtime as a parent image
FROM python:2.7-slim

# Set the working directory to /app
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install --trusted-host pypi.python.org -r requirements.txt

# Make port 80 available to the world outside this container
EXPOSE 80

# Define environment variable
ENV NAME World

# Run app.py when the container launches
CMD ["python", "app.py"]
```

Try to understand what this file defines. This `Dockerfile` refers to a couple of files we haven't created yet, namely `app.py` and `requirements.txt`. Create them and put them in the same folder as the `Dockerfile`.

Create `requirements.txt` with the following content

```
Flask
Redis
```

and `app.py`

```python
from flask import Flask
from redis import Redis, RedisError
import os
import socket

# Connect to Redis
redis = Redis(host="redis", db=0, socket_connect_timeout=2, socket_timeout=2)

app = Flask(__name__)

@app.route("/")
def hello():
    try:
        visits = redis.incr("counter")
    except RedisError:
        visits = "<i>cannot connect to Redis, counter disabled</i>"

    html = "<h3>Hello {name}!</h3>" \
           "<b>Hostname:</b> {hostname}<br/>" \
           "<b>Visits:</b> {visits}"
    return html.format(name=os.getenv("NAME", "world"),
                       hostname=socket.gethostname(),
                       visits=visits)

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=80)
```

### Build the App

We are ready to build the app. Make sure you are still at the top level of your new directory. Here's what `ls` should show:

```bash
$ ls
Dockerfile    app.py    requirements.txt
```

Now run the build command. This creates a Docker image, which we're going to name using the `--tag` option. Use `-t` if you prefer the shorter form.

```bash
docker build --tag=friendlyhello .
```

The image is placed into Docker's local image registry:

```bash
$ docker image ls

REPOSITORY      TAG      IMAGE ID
friendlyhello   latest   326387cea398
```

Note how the tag defaulted to `latest`. The full syntax for the tag option would be something like `--tag=friendlyhello:v0.0.1`.
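If you are curious what actually went into the image, you can inspect it before running it. A quick sketch; the exact layer IDs and sizes will differ on your machine:

```bash
# Show the layers of the image and the Dockerfile instruction that created each of them
docker history friendlyhello

# Print selected metadata, e.g. the exposed port and the default command defined in the Dockerfile
docker image inspect friendlyhello --format 'Exposed: {{json .Config.ExposedPorts}}  Cmd: {{json .Config.Cmd}}'
```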
### Run the app

Run the app, mapping your machine's port `4000` to the container's published port `80` using `-p`:

```
docker run -p 4000:80 friendlyhello
```

If you are unsure what a port is, we have some additional homework for you :)

You should see a message that Python is serving your app at `http://0.0.0.0:80`. But that message is coming from inside the container, which doesn't know you mapped port `80` of that container to `4000`, making the correct URL `http://localhost:4000`. Go to that URL in a web browser to see the content served up on a web page.

You can stop the web server by hitting `CTRL+C` in your terminal.

Now let's run the app in the background, in detached mode:

```
docker run -d -p 4000:80 friendlyhello
```

You will get a container ID in return. You can check the available containers with `docker container ls`. Use the ID to stop the container:

```
docker container stop ID
```

## Repository for Docker Images

Try uploading your image to a Docker registry such as Docker Hub. The comments below explain what each command does.

```bash
docker login                                  # Log in this CLI session using your Docker credentials
docker tag <image> username/repository:tag    # Tag <image> for upload to registry
docker push username/repository:tag           # Upload tagged image to registry
docker run username/repository:tag            # Run image from a registry
```

## Docker for Services

In a distributed application, different pieces of the app are called "services". For example, if you imagine a video sharing site, it probably includes a service for storing application data in a database, a service for video transcoding in the background after a user uploads something, a service for the front-end, and so on.

Services are really just "containers in production". A service only runs one image, but it codifies the way that image runs -- what ports it should use, how many replicas of the container should run so the service has the capacity it needs, and so on. Scaling a service changes the number of container instances running that piece of software, assigning more computing resources to the service in the process. And containers help to create a controlled environment for software execution.

First, verify that you have `docker-compose` installed

```
docker-compose --version
```

If not, read how to install [Docker Compose](https://docs.docker.com/compose/install/) on your system.

### Define a Service

With Docker it is easy to define, run, and scale services -- just write a `docker-compose.yml` file

```yaml
version: "3"
services:
  web:
    # replace username/repo:tag with your name and image details
    image: username/repo:tag
    deploy:
      replicas: 5
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
      restart_policy:
        condition: on-failure
    ports:
      - "4000:80"
    networks:
      - webnet
networks:
  webnet:
```

This `docker-compose.yml` file tells Docker to do the following:

* Pull the image we uploaded before from the registry
* Run 5 instances of that image as a service called `web`, limiting each one to use, at most, 10% of a single core of CPU time (this could also be e.g. "1.5" to mean one and a half cores for each), and 50MB of RAM
* Immediately restart containers if one fails
* Map port 4000 on the host to web's port 80
* Instruct web's containers to share port 80 via a load-balanced network called `webnet` (internally, the containers themselves publish to web's port 80 at an ephemeral port)
* Define the `webnet` network with the default settings (which is a load-balanced overlay network)
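Before deploying, it is worth checking the file for mistakes, since YAML is indentation-sensitive. A small sketch, assuming `docker-compose` is installed as verified above:

```bash
# Validate docker-compose.yml and print the fully resolved configuration;
# indentation errors and unknown keys are reported here rather than at deploy time
docker-compose -f docker-compose.yml config
```

If the file is valid, the resolved configuration is printed; otherwise you get an error pointing at the offending line.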
### Deploying a Service

Initialize Docker swarm:

```
docker swarm init
```

Now let's run it. You need to give your app a name. Here, it is set to `getstartedlab`:

```
docker stack deploy -c docker-compose.yml getstartedlab
```

You can list the running services with `docker service ls`.

A single container running in a service is called a task. Tasks are given unique IDs that numerically increment, up to the number of replicas you defined in `docker-compose.yml`. List the tasks for your service:

```
docker service ps getstartedlab_web
```

Alternatively, you can list your running containers with `docker container ls -q`.

You can run `curl -4 http://localhost:4000` several times in a row, or go to that URL in your browser and hit refresh a few times.

### Scaling Your App

You can scale the app by changing the `replicas` value in `docker-compose.yml`, saving the change, and re-running the `docker stack deploy` command:

```
docker stack deploy -c docker-compose.yml getstartedlab
```

Docker performs an in-place update; there is no need to tear the stack down first or kill any containers. Now, re-run `docker container ls -q` to see the deployed instances reconfigured. If you scaled up the replicas, more tasks, and hence more containers, are started.

### Shutting Down Your Service

Take the app down with `docker stack rm`:

```
docker stack rm getstartedlab
```

Take down the swarm.

```
docker swarm leave --force
```

## More About Swarms

Make sure you have `docker-machine` installed

```
docker-machine --version
```

If not, follow the instructions for your OS [here](https://github.com/docker/machine/releases/).

### Set Up Your Swarm

A swarm is made up of multiple nodes, which can be either physical or virtual machines. The basic concept is simple enough: run `docker swarm init` to enable swarm mode and make your current machine a swarm manager, then run `docker swarm join` on other machines to have them join the swarm as workers.

We use VMs to quickly create a two-machine cluster and turn it into a swarm. Create a couple of VMs using `docker-machine`, using the VirtualBox driver:

```
docker-machine create --driver virtualbox myvm1
docker-machine create --driver virtualbox myvm2
```

Alternatively, you can use real servers or other VMs that have Docker installed. You can read more about `docker-machine` in the [official documentation](https://docs.docker.com/machine/reference/create/).

List the VMs and get their IP addresses

```
docker-machine ls
```

Now you can initialize the swarm and add nodes. The first machine acts as the manager, which executes management commands and authenticates workers to join the swarm, and the second is a worker.

You can send commands to your VMs using `docker-machine ssh`. Instruct `myvm1` to become a swarm manager with `docker swarm init` and look for output like this:

```
$ docker-machine ssh myvm1 "docker swarm init --advertise-addr <myvm1 ip>"

Swarm initialized: current node <node ID> is now a manager.

To add a worker to this swarm, run the following command:

  docker swarm join \
  --token <token> \
  <myvm ip>:<port>

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
```

As you can see, the response to `docker swarm init` contains a pre-configured `docker swarm join` command for you to run on any nodes you want to add.
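If you lose that output, there is no need to re-initialize the swarm: the manager can print the join command again. A small sketch:

```bash
# Ask the manager to print a ready-to-use join command (including the token) for new workers
docker-machine ssh myvm1 "docker swarm join-token worker"
```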
Copy this command, and send it to `myvm2` via `docker-machine ssh` to have `myvm2` join your new swarm as a worker:

```
$ docker-machine ssh myvm2 "docker swarm join \
  --token <token> \
  <ip>:2377"

This node joined a swarm as a worker.
```

Run `docker node ls` on the manager to view the nodes in this swarm.

### Deploy Your App to the Swarm Cluster

So far, you've been wrapping Docker commands in `docker-machine ssh` to talk to the VMs. Another option is to run `docker-machine env <machine>` to get and run a command that configures your current shell to talk to the Docker daemon on the VM. This method works better for the next step because it allows you to use your local `docker-compose.yml` file to deploy the app "remotely" without having to copy it anywhere.

Type `docker-machine env myvm1`, then copy-paste and run the command provided as the last line of the output to configure your shell to talk to `myvm1`, the swarm manager.

```
$ docker-machine env myvm1

export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.100:2376"
export DOCKER_CERT_PATH="/Users/sam/.docker/machine/machines/myvm1"
export DOCKER_MACHINE_NAME="myvm1"
# Run this command to configure your shell:
# eval $(docker-machine env myvm1)
```

Run the given command to configure your shell to talk to `myvm1`.

```bash
eval $(docker-machine env myvm1)
```

On Windows the output looks different:

```
PS C:\Users\sam\sandbox\get-started> docker-machine env myvm1

$Env:DOCKER_TLS_VERIFY = "1"
$Env:DOCKER_HOST = "tcp://192.168.203.207:2376"
$Env:DOCKER_CERT_PATH = "C:\Users\sam\.docker\machine\machines\myvm1"
$Env:DOCKER_MACHINE_NAME = "myvm1"
$Env:COMPOSE_CONVERT_WINDOWS_PATHS = "true"
# Run this command to configure your shell:
# & "C:\Program Files\Docker\Docker\Resources\bin\docker-machine.exe" env myvm1 | Invoke-Expression
```

Run the given command to configure your shell to talk to `myvm1`.

```
& "C:\Program Files\Docker\Docker\Resources\bin\docker-machine.exe" env myvm1 | Invoke-Expression
```

Run `docker-machine ls` to verify that `myvm1` is now the active machine, as indicated by the asterisk next to it.

### Deploy the App on the Swarm Manager

Now that you have configured your environment to access `myvm1`, your swarm manager, you can easily deploy your application on the cluster

```
docker stack deploy -c docker-compose.yml getstartedlab
```

Now you can access your app from the IP address of either `myvm1` or `myvm2`. The reason both IP addresses work is that nodes in a swarm participate in an ingress routing mesh. This ensures that a service deployed at a certain port within your swarm always has that port reserved to itself, no matter what node is actually running the container. Here's a diagram of how a routing mesh for a service called `my-web` published at port `8080` on a three-node swarm would look:

![](https://docs.docker.com/engine/swarm/images/ingress-routing-mesh.png)
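You can verify the routing mesh from your own terminal. A quick check, assuming the stack deployed above is still running and `web` is still published on port `4000`:

```bash
# Both node IPs answer on the published port, regardless of where the containers actually run
curl -4 http://$(docker-machine ip myvm1):4000
curl -4 http://$(docker-machine ip myvm2):4000
```

You should also notice the `Hostname` in the reply changing between requests, as the swarm load-balances across the replicas.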
You can tear down the stack with `docker stack rm`. For example:

```bash
docker stack rm getstartedlab
```

## More About Stacks

A stack is a group of interrelated services that share dependencies, and can be orchestrated and scaled together. A single stack is capable of defining and coordinating the functionality of an entire application (though very complex applications may want to use multiple stacks).

### Adding a New Service

It's easy to add services to our `docker-compose.yml` file. First, let's add a free visualizer service that lets us look at how our swarm is scheduling containers.

```yaml
version: "3"
services:
  web:
    # replace username/repo:tag with your name and image details
    image: username/repo:tag
    deploy:
      replicas: 5
      restart_policy:
        condition: on-failure
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
    ports:
      - "80:80"
    networks:
      - webnet
  visualizer:
    image: dockersamples/visualizer:stable
    ports:
      - "8080:8080"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
    deploy:
      placement:
        constraints: [node.role == manager]
    networks:
      - webnet
networks:
  webnet:
```

The only new part here is the service named `visualizer`, a peer of `web`. It introduces two new keys: a `volumes` key, giving the visualizer access to the host's socket file for Docker, and a `placement` key, ensuring that this service only ever runs on a swarm manager -- never a worker. That's because this container, built from an open source project created by Docker, displays the Docker services running on a swarm in a diagram. We talk more about placement constraints and volumes in a moment.

Make sure your shell is configured to talk to `myvm1`. Run `docker-machine ls` to list machines and make sure you are connected to `myvm1`, as indicated by an asterisk next to it. If needed, re-run `docker-machine env myvm1`.

Now, re-run the `docker stack deploy` command on the manager, and whatever services need updating are updated

```
$ docker stack deploy -c docker-compose.yml getstartedlab

Updating service getstartedlab_web (id: angi1bf5e4to03qu9f93trnxm)
Creating service getstartedlab_visualizer (id: l9mnwkeq2jiononb5ihz9u7a4)
```

Now you can take a look at the visualizer in your browser at port `8080`. Does the display match what you would expect?
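You can cross-check the visualizer against the CLI. A small sketch; the `NODE` column for the visualizer task should name `myvm1`, because of the `node.role == manager` placement constraint:

```bash
# List the tasks of the visualizer service; the NODE column shows where each one was scheduled
docker service ps getstartedlab_visualizer

# List all services in the stack, for comparison with the diagram in the browser
docker stack services getstartedlab
```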
### Adding One More Service

Let's add a Redis database for storing app data. Modify `docker-compose.yml`

```yaml
version: "3"
services:
  web:
    # replace username/repo:tag with your name and image details
    image: username/repo:tag
    deploy:
      replicas: 5
      restart_policy:
        condition: on-failure
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
    ports:
      - "80:80"
    networks:
      - webnet
  visualizer:
    image: dockersamples/visualizer:stable
    ports:
      - "8080:8080"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
    deploy:
      placement:
        constraints: [node.role == manager]
    networks:
      - webnet
  redis:
    image: redis
    ports:
      - "6379:6379"
    volumes:
      - "/home/docker/data:/data"
    deploy:
      placement:
        constraints: [node.role == manager]
    command: redis-server --appendonly yes
    networks:
      - webnet
networks:
  webnet:
```

Redis has an official image in the Docker library and has been granted the short image name of just `redis`, so no `username/repo` notation here. The Redis port, `6379`, has been pre-configured by Redis to be exposed from the container to the host, and here in our Compose file we expose it from the host to the world, so you can actually enter the IP of any of your nodes into Redis Desktop Manager and manage this Redis instance, if you so choose.

Most importantly, there are a couple of things in the `redis` specification that make data persist between deployments of this stack:

* `redis` always runs on the manager, so it's always using the same filesystem.
* `redis` accesses a directory in the host's file system that is linked to the directory `/data` inside the container, which is where Redis stores data.

Together, this creates a "source of truth" in your host's physical filesystem for the Redis data. Without this, Redis would store its data in `/data` inside the container's filesystem, which would get wiped out if that container were ever redeployed.

This source of truth has two components:

* The placement constraint you put on the Redis service, ensuring that it always uses the same host.
* The volume you created that lets the container access `./data` (on the host) as `/data` (inside the Redis container).

While containers come and go, the files stored in `./data` on the specified host persist, enabling continuity.

You are ready to deploy your new Redis-using stack. Create a `./data` directory on the manager

```bash
docker-machine ssh myvm1 "mkdir ./data"
```

Run `docker stack deploy` one more time

```bash
docker stack deploy -c docker-compose.yml getstartedlab
```

And visit the web page that you serve on the manager in your browser (note that the `web` service is now published on port `80`).
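To confirm that the counter works and that the data really lives on the manager's filesystem, you can try something like the following sketch (the exact files written by Redis may vary):

```bash
# Each request should now increase the Visits counter (web is published on port 80 in this stack)
curl -4 http://$(docker-machine ip myvm1)/
curl -4 http://$(docker-machine ip myvm1)/

# With --appendonly yes, Redis writes its append-only file under /data,
# which is bind-mounted to ./data in the docker user's home on the manager
docker-machine ssh myvm1 "ls ./data"
```

If you remove and redeploy the stack, the counter should continue from its previous value, since the data survives in `./data` on the manager.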
## Afterword

You have completed two tutorials on virtualization. In the next classes, you are going to stick with `vagrant` to create virtual machines. If you decide to use Docker instead for the next several lab sessions, you have that freedom, but you will be on your own (no technical support).

## Links

- [Docker Tutorial](https://docs.docker.com/get-started/)
- [Install on Linux](https://phoenixnap.com/kb/how-to-install-docker-on-ubuntu-18-04)

# Docker Tutorial: Troubleshooting

<!--
## Cannot install Docker Desktop on Windows

Docker Desktop is supported only on Windows Pro or Windows Enterprise. For other versions of Windows use [Docker Toolbox](https://docs.docker.com/toolbox/toolbox_install_windows/)
-->

## I have installed Docker Toolbox, but the starting script fails

If your starting script fails with a message saying that it cannot find `docker-machine.exe` or `vboxmanage.exe`, open the script itself, i.e. `C:\Program Files\Docker Toolbox\start.sh`, and enter the respective paths explicitly.

## Docker Toolbox cannot download the `boot2docker.iso` image

The Docker Toolbox installation comes with `boot2docker.iso`. Locate it at `C:\Program Files\Docker Toolbox\boot2docker.iso` and copy it to `C:/Users/User/.docker/cache`.

## `docker-machine` does not see VirtualBox

You will see an error similar to this:

```
Running pre-create checks...
Error with pre-create check: "VBoxManage not found. Make sure VirtualBox is installed and VBoxManage is in the path"
```

[Solution](https://github.com/docker/machine/issues/4590)

## Command `eval $(sudo docker-machine env myvm1)` does not do what I want

**Solution:** [configure Docker to work without `sudo`](https://docs.docker.com/install/linux/linux-postinstall/)

## `docker-machine` does not work without `sudo`

Some problems with Docker arise from the snap version of the Docker application. If your Docker is installed as a snap package, remove it and install the latest version from the website.

## After shutting down workers, the swarm cannot be used: it says that the certificate has expired

[**Solution provided by Michail**](https://docs.google.com/document/d/1-bnE1KSa05TjE2k1wLrls_wbuHJ6UhWMHczIA48psWY/edit)

## `docker-machine` complains about lack of virtualization support

If you run Docker in a VM, make sure *nested virtualization* is enabled for your VM. If you work on your host machine, make sure virtualization is enabled in the BIOS. Some versions of Windows do not support virtualization. Error example:

```bash
$ docker-machine create --driver virtualbox myvm1

Running pre-create checks...
Error with pre-create check: "This computer doesn't have VT-X/AMD-v enabled. Enabling it in the BIOS is mandatory"
```

## `docker-machine` does not work on Mac

Sometimes after starting `docker-machine` on a Mac you will get the message `Killed`. This likely means that the execution of `docker-machine` was blocked by your privacy settings. Go to Settings, Privacy and Security, General tab, and allow the execution of `docker-machine`.

## Containers stuck in `Ready` state

Verify the integrity of the image that you have pushed to the repository.

# Self-check Questions

Useful resource: [Docker Documentation](https://docs.docker.com/engine/reference/commandline/cli/)

1. Is this statement correct: You can't create multiple containers from the same image?
2. Is this statement correct: All containers running on a single machine share the same operating system kernel, so they start instantly and make more efficient use of RAM.
3. Is this statement correct: Containers include the application and all of its dependencies, but share the kernel with other containers. They run as an isolated process in userspace on the host operating system. They're also not tied to any specific infrastructure -- Docker containers run on any computer, on any infrastructure, and in any cloud.
4. Fill in a blank: ________ is a cloud-hosted service from Docker that provides registry capabilities for public and private content
5. Fill in a blank: ________ is a tool for defining and running multi-container Docker applications.
6. Fill in a blank: ________ is native clustering for Docker. It turns a pool of Docker hosts into a single, virtual Docker host.
7. Fill in a blank: ________ is a text document that contains all the commands a user could call on the command line to assemble an image
8. Explain Docker command: `docker exec -it container_id bash`
9. Explain Docker command: `docker build -t my_user/repo_name:1.0`
10. Explain Docker command: `docker commit -m "My first update" container_ID user_name/repository_name`
11. Explain Docker command: `docker push user_name/repository_name`
12. Explain Docker command: `docker ps`
13. Explain Docker command: `docker images`
14. Explain Docker command: `docker ps -a`
15. Which command runs a Docker container?
16. Which command stops a Docker container?
17. Which command deletes a Docker container?
18. Which command deletes a Docker image?

# Acceptance criteria:

0. Understanding of what you did
1. Image pushed to Docker Hub
2. Stack deployed to **2 VMs** in the swarm
3. Visualizer container running; its web page opens from the host machine
4. Hello web page opens from the host (from both VM IPs)
5. Counter from Redis works