
Taking back the disk space eaten up by Docker

Author: DBAplus Community

If you're a heavy user of Docker/Kubernetes, you'll probably run into the problem of "no space left on device".


Of course, if you have a lot of hard disk space and don't mind using a lot of space for unnecessary Docker resources, then you can ignore this article XD~

Before we discuss how to effectively avoid Docker taking up a lot of disk space, let's talk about what resources Docker consumes (resulting in space usage):

  • Image: the image file for a container, usually generated from a Dockerfile with the docker build command
  • Container: a process with an isolated environment (on Linux, namespaces are used to isolate it), usually started with the docker create or docker run commands
  • Volume: a Docker volume lets you mount data (files or entire folders) from the host environment into a container, so the data lives outside the container itself
  • Network: lets multiple Docker containers communicate with each other over the network. How a Docker network behaves depends largely on which network driver the developer uses; common drivers are bridge, host, overlay, macvlan, none, and external plugins

Now that we've looked at the common resources for Docker, let's take a look at the statuses of each of these resources:

  • used: a resource that is currently in use by one or more containers
  • unused: a resource that no container is using
  • dangling: images that are no longer valid and will never be used

It's easy to tell whether a resource is used or unused: a resource is used if at least one container is currently using it; otherwise it is unused.

Dangling, on the other hand, is a special state that only applies to Docker images, because images have the concept of versions. In general, when building a Docker image, we tag it with the -t flag (image name plus version tag). If we don't specify a version, Docker defaults to the latest tag.

However, if we repeatedly build Docker images with the same tag, the image from the first build enters the dangling state once the second build finishes (docker image ls will show it with <none> in place of its tag). These dangling images will never be used again, and they take up disk space if not cleared regularly.
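To check which images are currently dangling, docker image ls --filter dangling=true lists them directly. As a rough sketch of the same idea, the helper below (dangling_ids is a hypothetical name, not part of Docker) reads the default docker image ls table from standard input and prints the IDs of rows whose REPOSITORY and TAG columns are both <none>. It assumes the default column layout.

```shell
#!/bin/bash
# Hypothetical helper: print the IMAGE ID of every row whose
# REPOSITORY and TAG columns are both "<none>".
# Assumes the default `docker image ls` table layout on stdin.
dangling_ids() {
  awk 'NR > 1 && $1 == "<none>" && $2 == "<none>" { print $3 }'
}

# Usage (requires a running Docker daemon):
#   docker image ls | dangling_ids
```

Checking both columns avoids matching images that merely lack a tag, such as ones pulled by digest.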

Now that we've discussed Docker resources and related states, let's take a look at how to clean up those unused/dangling resources!

Docker provides a family of prune commands that help us clean up different types of resources:

# Remove all unused images, not just dangling ones
$ docker image prune -a
# Remove dangling images
$ docker image prune
# Remove networks not used by any container
$ docker network prune
# Remove unused volumes
$ docker volume prune
# Remove all stopped containers
$ docker container prune
# Remove stopped containers, unused networks, and dangling images
# (add -a to include all unused images, --volumes to include volumes)
$ docker system prune

In fact, Docker's cleanup commands already solve the problem in some scenarios. For example, on a virtual machine running a production service or on a CI/CD runner, we only need to run these commands from the system's scheduled tasks.
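A scheduled cleanup along those lines could look roughly like the sketch below. The function name prune_older_than, the DOCKER override variable, and the script path in the crontab comment are all assumptions made for illustration; only docker system prune with its --force and --filter "until=..." options comes from Docker itself.

```shell
#!/bin/bash
# Sketch of a scheduled cleanup. DOCKER can be overridden (for example
# DOCKER=echo) to preview the command instead of running it.
prune_older_than() {
  local hours="$1"
  # Accept only a non-negative whole number of hours.
  case "$hours" in
    '' | *[!0-9]*)
      echo "usage: prune_older_than <hours>" >&2
      return 2
      ;;
  esac
  # --force skips the confirmation prompt, which a scheduled job cannot answer.
  "${DOCKER:-docker}" system prune --force --filter "until=${hours}h"
}

# Example crontab entry (the script path is an assumption):
#   0 3 * * 0  /usr/local/bin/docker-prune.sh
```

Limiting the prune to resources older than a given age keeps recently built images and their cache around, which matters for the rebuild-time concern discussed later in the article.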

However, in terms of the developer's personal environment, there are many times when we don't really want to delete those unused image(s), so let's consider the following scenario:

Tom uses docker-compose to deploy 20 containers at a time, and these containers use more than 20 Docker images. If Tom cleans up with docker compose rm followed by docker image prune -a, he may find that the images have already been removed the next time he wants to start the service.

To avoid this problem (or out of laziness), we usually skip the cleanup step when testing deployments locally, and over time these dangling images and unused volumes fill up the local machine's disk...
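One way to notice the problem before hitting "no space left on device" is to check how full the filesystem holding Docker's data is. The sketch below uses only df and awk; the function names and the 80% default threshold are assumptions, and /var/lib/docker is Docker's default data directory on Linux (yours may differ).

```shell
#!/bin/bash
# Print the usage percentage (without "%") of the filesystem holding a path.
# Uses the POSIX `df -P` format, where column 5 is the capacity.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn and return 1 when usage reaches the threshold (default 80).
warn_if_full() {
  local path="$1" threshold="${2:-80}" used
  used=$(disk_usage_percent "$path")
  if [ -z "$used" ]; then
    echo "cannot read disk usage for $path" >&2
    return 2
  fi
  if [ "$used" -ge "$threshold" ]; then
    echo "warning: $path is ${used}% full" >&2
    return 1
  fi
}

# Usage, assuming Docker's default Linux data directory:
#   warn_if_full /var/lib/docker 80
```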

So it's actually better to use a makefile or shell script to build the Docker image, and to remove the dangling image or the images with a specific tag once the build is complete:

docker image prune
# or
docker image ls | grep "<YOUR_TAG_NAME>" | awk '{print $3}' | xargs docker image rm           

In the code above, the second command deletes every image that matches the grep pattern in one go. This is the method I use most often (after all, it's easy to delete innocent images by mistake with docker image prune...).
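Note that grep matches the keyword anywhere on the line, so a repository name or image ID that happens to contain it is selected too. A slightly stricter variant, sketched below, compares the TAG column exactly. The name ids_with_tag is hypothetical, and the helper assumes the default docker image ls table on standard input.

```shell
#!/bin/bash
# Hypothetical helper: print the IMAGE ID of rows whose TAG column
# equals the given tag exactly. Reads `docker image ls` output on stdin.
ids_with_tag() {
  awk -v tag="$1" 'NR > 1 && $2 == tag { print $3 }'
}

# Usage (requires a running Docker daemon). The -r option is a GNU xargs
# extension that skips the command when there is nothing to delete:
#   docker image ls | ids_with_tag "<YOUR_TAG_NAME>" | xargs -r docker image rm
```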

So far, we've seen how to handle unused images and dangling images in different usage scenarios. However, the above solutions aren't too user-friendly (and not too convenient) for beginners. To save time, we can write these steps into a shell script and put it in the /bin directory of the system, so that the script can be executed directly from the terminal:

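#!/bin/bash
# docker-clean: remove images or containers whose listing matches a keyword.

force=false

# Parse options; the leading ":" puts getopts in silent error mode.
while getopts ":f" opt; do
  case ${opt} in
    f )
      force=true
      ;;
    \? )
      # Escaped so it matches a literal "?" rather than any single character.
      echo "Invalid option: -$OPTARG" 1>&2
      exit 1
      ;;
    : )
      echo "Option -$OPTARG requires an argument." 1>&2
      exit 1
      ;;
  esac
done

# Drop the parsed options so $1 and $2 are the positional arguments.
shift $((OPTIND - 1))

if [ $# -ne 2 ]; then
    echo "Usage: docker-clean [-f] <resource> <keyword>"
    exit 1
fi

resource=$1
keyword=$2
force_args=""

if [ "$force" = true ]; then
  force_args="-f"
else
  # Ask for confirmation unless -f was given.
  read -r -p "Are you sure you want to delete all $resource with $keyword? [y/N] " confirmation
  if [ "$confirmation" != "y" ] && [ "$confirmation" != "Y" ]; then
    echo "Operation cancelled."
    exit 0
  fi
fi

# $force_args is left unquoted on purpose: when empty it expands to nothing.
case $resource in
    "image")
        # Column 3 of `docker images` is the image ID.
        docker images | grep -- "$keyword" | awk '{print $3}' | xargs docker rmi $force_args
        ;;
    "container")
        # Column 1 of `docker ps -a` is the container ID.
        docker ps -a | grep -- "$keyword" | awk '{print $1}' | xargs docker rm $force_args
        ;;
    *)
        echo "Invalid resource type. Must be either 'image' or 'container'."
        exit 1
        ;;
esac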

By the way:

The entire shell script was generated with ChatGPT, which is a great tool for simple use cases like this (but remember to check the reliability of what ChatGPT gives you, to avoid ruining important data and having to run away overnight...).

This script removes images and containers whose listing matches a given keyword. Install and use it as follows:

# Install
$ git clone https://github.com/ianchen0119/docker-clean.git
$ mv docker-clean/docker-clean /bin/docker-clean
$ chmod 755 /bin/docker-clean

# Usage
$ docker-clean
Usage: docker-clean [-f] <resource> <keyword>

Summary

Containerization tools such as Docker are very developer-friendly, but if they are not used correctly they can easily affect the machine they run on. This article briefly introduces a few approaches for reference. If you have a good management method of your own, you are welcome to leave a comment and share it!

Finally, here is why I personally dislike batch-removing unused and dangling images:

  • When rebuilding an image, Docker reuses the matching cached layers to avoid redoing work. If unused images are removed regularly after each build, these caches are emptied as well, which greatly slows down image builds.
  • In some microservice test scenarios, if each image takes an extra 3 minutes to redo the earlier steps, starting the entire service may take more than 30 minutes...

To maximize disk space utilization while using docker, in addition to regularly cleaning up static docker resources that you don't need, there are other things we can do:

  • Use a slim Docker image as the base.
  • Reuse base images as much as possible: if the application has no special dependencies, pick a base image that is already in use, either for building or as the application's runtime environment.
  • Avoid too many docker commits: like the point above, this helps keep the image clean. It is not mandatory, though, because operators may update certain images frequently, and a Dockerfile that is written too concisely can be hard to read and make poor use of the cache.
  • After installation is complete, use rm or apt-get remove to delete the installation packages and other leftovers generated while building the image.
  • Use Distroless images: many people in the community advise against using images such as Alpine as a base and suggest that developers choose Distroless images wherever possible, mainly for the sake of container security.

Author丨Wake up and think about money

Source丨 juejin.cn/post/7259032711019855930
