Docker is an open platform for developing, shipping, and running applications. Read more about Docker here.
Basic Docker Terminology 🐳
Here are some basic Docker terms:
A Docker image is the blueprint for a Docker container. To ship an app, you first build an image of it. An image is a convenient way to package an application together with a preconfigured server environment, which makes development much more streamlined.
A Docker Container is a running instance of a Docker Image. Simply put, the Docker Image is pulled from a registry and it is executed as a Container.
When building an image, Docker creates layers so that later builds and deployments are efficient. Each layer is a diff (delta) against the layer that was built before it.
Let us try to understand it with the help of an example.
For this article we use this Docker Sample Application along with its Dockerfile.
FROM node:12-alpine
RUN apk add --no-cache python2 g++ make
WORKDIR /app
COPY . .
RUN yarn install --production
CMD ["node", "src/index.js"]
EXPOSE 3000
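Each instruction in this Dockerfile is a build step, and the ones that change the filesystem (such as RUN and COPY) add a layer to the image. A step whose inputs have not changed is reused from the cache in later builds.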
Running the build for the 1st time:
docker build -t getting-started .
On the first build, every layer is built from scratch, so the entire build takes a relatively long time.
The build output shows the base image being downloaded from the internet and each command being run on top of it to create the image. The whole build takes 175 seconds.
Now, let us try to rebuild it:
docker build -t getting-started .
The build time now goes down to 4s 🤯🤯
This is what layering and caching in Docker do. The rebuild reuses the layers cached by the previous build, and because neither the Dockerfile nor the files it copies changed, every layer was taken from the cache.
Now, let us make a change in the Dockerfile and see how the cache behaves.
We simply change the WORKDIR instruction in the Dockerfile.
FROM node:12-alpine
RUN apk add --no-cache python2 g++ make
WORKDIR /app_temp
COPY . .
RUN yarn install --production
CMD ["node", "src/index.js"]
EXPOSE 3000
Now, building it gives a different result:
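Steps [1/5] and [2/5] come from the cache, whereas [3/5], [4/5], and [5/5] are built again: once one step changes, every step after it is rebuilt as well. This is still better than building everything from scratch.
Cached layers can also be reused by other images that start with the same steps.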
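To make the "everything after the change is rebuilt" behaviour concrete, here is a small Python toy model. It is only a sketch and not Docker's real algorithm: it assumes a step's cache key depends on the previous step's key plus the instruction text, and it ignores the file checksums Docker also considers for COPY. The function names cache_keys and plan_build are made up for this illustration.

```python
import hashlib

def cache_keys(instructions):
    """Return one cache key per build step; each key depends on the previous key."""
    keys = []
    parent = ""
    for instruction in instructions:
        key = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(key)
        parent = key
    return keys

def plan_build(instructions, cache):
    """Label each step CACHED or BUILD against the keys left by earlier builds."""
    plan = []
    for instruction, key in zip(instructions, cache_keys(instructions)):
        plan.append(("CACHED" if key in cache else "BUILD", instruction))
    return plan

original = [
    "FROM node:12-alpine",
    "RUN apk add --no-cache python2 g++ make",
    "WORKDIR /app",
    "COPY . .",
    "RUN yarn install --production",
]
changed = original.copy()
changed[2] = "WORKDIR /app_temp"  # the only edit we made

cache = set(cache_keys(original))  # what the first build left behind
for status, instruction in plan_build(changed, cache):
    print(f"{status:6} {instruction}")
```

In this model the first two steps are reported as cached and the last three as rebuilt, even though the text of steps 4 and 5 did not change, because their keys are chained to the changed step 3.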
Note that both adding and removing files will result in a new layer.
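Here is a rough Python sketch of why removing a file in a later layer does not make the image smaller. It is a toy model, not Docker's storage format: each layer is a dictionary from path to size, None stands in for a deletion marker, and the paths and sizes are hypothetical.

```python
# Toy model: each layer maps a path to its size in bytes,
# or to None when the layer records a deletion. All values are hypothetical.
layers = [
    {"/usr/local/bin/node": 40_000_000},
    {"/tmp/build-cache.tar": 25_000_000},  # added by one instruction
    {"/tmp/build-cache.tar": None},        # removed by a later instruction
]

def visible_files(layers):
    """Apply the layers in order and return what a container would see."""
    fs = {}
    for layer in layers:
        for path, size in layer.items():
            if size is None:
                fs.pop(path, None)
            else:
                fs[path] = size
    return fs

def stored_bytes(layers):
    """Sum what every layer stores, including files hidden by later layers."""
    return sum(size for layer in layers for size in layer.values() if size is not None)

print(sorted(visible_files(layers)))
print(stored_bytes(layers))
```

The deleted file is no longer visible, but the layer that added it is still part of the image, so its bytes still count towards the image size.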
Using Multi-Stage Builds
One of the most challenging things about building images is keeping the image size down. Each instruction in the Dockerfile adds a layer to the image, and you need to remember to clean up any artifacts that you do not need before moving on to the next layer. This is where multi-stage builds help.
# syntax=docker/dockerfile:1
FROM node:12-alpine AS initial_builder
RUN apk add --no-cache python2 g++ make
WORKDIR /app
COPY . .
RUN yarn install --production

# Final build stage. It starts from the Node image again because plain
# alpine has no node binary, so the CMD below would fail at runtime.
FROM node:12-alpine
WORKDIR /app
COPY --from=initial_builder /app /app
CMD ["node", "src/index.js"]
EXPOSE 3000
In the final build stage, only the built artifacts are copied over from the previous stage.
docker build -t multi-stage .
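The effect of COPY --from can be sketched in Python as follows. This is only an illustration: the file paths and sizes are hypothetical, the helper copy_from is made up, and the sketch assumes the final stage starts from a base that already contains the Node.js runtime.

```python
# Hypothetical file sizes in bytes; real numbers depend on the application.
node_base = {"/usr/local/bin/node": 40_000_000}

builder_stage = {
    **node_base,
    "/usr/bin/python2": 30_000_000,   # build tool, only needed while building
    "/usr/bin/g++": 90_000_000,       # build tool, only needed while building
    "/app/src/index.js": 2_000,
    "/app/node_modules/express/index.js": 5_000_000,
}

def copy_from(stage, prefix):
    """Mimic COPY --from=<stage> <prefix> <prefix>: keep only paths under prefix."""
    root = prefix.rstrip("/") + "/"
    return {path: size for path, size in stage.items() if path.startswith(root)}

# The final stage is its own base plus whatever was copied out of the builder.
final_stage = {**node_base, **copy_from(builder_stage, "/app")}

print(sum(builder_stage.values()))
print(sum(final_stage.values()))
```

The build tools never reach the final stage, which is where the size saving comes from.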
Now, let us compare the size between the 1st image and the final image.
docker image ls
The size drastically reduces here. 😎😎
Passing --no-cache to docker build makes Docker build every layer from scratch, even if cached layers are available.
Understanding R/W Layer
An image has many layers. When a container starts, a single read-write layer is added on top of all the image's layers.
All the changes a container makes are made to the editable R/W layer and not to the underlying image layers. Therefore, a number of containers can use the same image with each having its own R/W layer.
Docker uses a Copy-on-Write (CoW) mechanism in its storage drivers. This mechanism lets different containers share the same image. When a container modifies a file that comes from the image, that file is first copied up into the container's read-write layer and the change is made to the copy, so the image layers stay untouched.
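A minimal Python sketch of the copy-on-write idea is shown below. It is a toy model under simple assumptions, not how a real storage driver works: layers are plain dictionaries, and the Container class and its methods are invented for this example.

```python
class Container:
    """Toy container: shared read-only image layers plus one private R/W layer."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # shared between containers, never modified
        self.rw_layer = {}                # belongs to this container only

    def read(self, path):
        # Look in the R/W layer first, then in the image layers from top to bottom.
        if path in self.rw_layer:
            return self.rw_layer[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, text):
        # Copy-on-write: copy the file up into the R/W layer, then change the copy.
        self.rw_layer[path] = self.read(path) + text


image = [{"/etc/motd": "hello\n"}, {"/app/config": "mode=prod\n"}]
first = Container(image)
second = Container(image)
first.append("/app/config", "debug=true\n")

print(repr(first.read("/app/config")))
print(repr(second.read("/app/config")))
```

Only the first container sees its change. The second container, and the image itself, still hold the original file.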
Advantages of using Docker Layers
Good storage management
Sharing across multiple containers
Docker layers and the build cache are important concepts for anyone adopting good practices in building Docker infrastructure. Small tweaks here and there can make builds and deployments noticeably more efficient.
I have tried to explain these concepts in simple, easy-to-understand language so that readers feel encouraged to use them in their own Docker practice.
Hope you enjoyed the article, have a great day !!✌🏻✌🏻
This is a part of a series of articles to help understand Docker better. Find the other articles as follows: