Docker build inside sysbox container results in "lchown ... no such file or directory" errors

DPatrickBoyd opened this issue Jan 13, 2021 · 34 comments

Hi there,

I am attempting to solve our CI/CD woes using sysbox and I was really excited to have it working, until it didn't.

Using a dotnet restore with dotnetcore image inside a docker build is failing with a very generic message:

Error processing tar file(exit status 1): lchown /tmp/clr-debug-pipe-202-24216845-in: no such file or directory

see here and here for more information

I am able to solve this by adding COMPlus_EnableDiagnostics=0 as an ENV in the Dockerfile or by passing it from docker-compose and using ARG in Dockerfile. However, I really don't want to have to alter a ton of Dockerfiles for a bunch of microservices, and I don't want to have to disable debugging, which is what that flag does.

How to reproduce:
Create a Dockerfile using the mcr.microsoft.com/dotnet/core/sdk:3.1-buster image, then pull a dotnet repo that does a dotnet restore in the Dockerfile.

Things I have tried:

running on normal Docker/non-dind = works as intended

running on dind using privileged flag and mounting /var/lib/docker as a volume and running nested = works

running with sysbox as runtime and:

  • added cap_add - ALL to first docker-compose = fails
  • added cap_add - ALL to inner docker-compose = fails
  • I was able to strace the docker daemon both when using standard docker and when using dind with sysbox; here are a few snippets

    standard :
    -mknodat(AT_FDCWD, "/tmp/clr-debug-pipe-78-63381236-in", S_IFIFO|0700) = 0
    -fchownat(AT_FDCWD, "/tmp/clr-debug-pipe-78-63381236-in", 0, 0, AT_SYMLINK_NOFOLLOW) = 0
    -fchmodat(AT_FDCWD, "/tmp/clr-debug-pipe-78-63381236-in", 0700) = 0
    -utimensat(AT_FDCWD, "/tmp/clr-debug-pipe-78-63381236-in", [{tv_sec=1610522665, tv_nsec=0} /* 2021-01-13T07:24:25+0000 */, {tv_sec=1610522665, tv_nsec=0} /* 2021-01-13T07:24:25+0000 */], 0) = 0

    sysbox :

    -newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/ca60dd45565e9b2b10754f95f7058ff401485ccca62253f28a522f105018c9b2-init/merged/tmp/clr-debug-pipe-225-63286646-in", 0xc00192e6b8, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
    -newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/ca60dd45565e9b2b10754f95f7058ff401485ccca62253f28a522f105018c9b2/merged/tmp/clr-debug-pipe-225-63286646-in", {st_mode=S_IFIFO|0700, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
    -lgetxattr("/var/lib/docker/overlay2/ca60dd45565e9b2b10754f95f7058ff401485ccca62253f28a522f105018c9b2/merged/tmp/clr-debug-pipe-225-63286646-in", "security.capability", 0xc00192a700, 128) = -1 ENODATA (No data available)
    -newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/ca60dd45565e9b2b10754f95f7058ff401485ccca62253f28a522f105018c9b2-init/merged/tmp/clr-debug-pipe-225-63286646-out", 0xc00192e858, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)

    As you can see, the sysbox trace shows none of the syscalls like mknodat, fchownat, or fchmodat, which is why I was hoping adding the cap_add would solve this. Both containers are running as root.
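For reference, the four calls in the standard trace correspond to creating a FIFO and then setting its owner, mode, and timestamps. A minimal shell sketch of that sequence follows. It is only an illustration under assumptions: the function name extract_fifo is hypothetical, GNU coreutils is assumed, and the owner is set to the current user instead of 0:0 so that it runs without root. The last command shows that changing ownership of a path that was never created fails, which is the shape of the reported lchown error.

```shell
# Hypothetical helper: mimic the steps a tar extraction performs for a FIFO entry.
extract_fifo() {
    local path="$1"
    mkfifo -m 0700 "$path" || return 1                 # mknodat(..., S_IFIFO|0700)
    chown -h "$(id -u):$(id -g)" "$path" || return 1   # fchownat(..., AT_SYMLINK_NOFOLLOW)
    chmod 0700 "$path" || return 1                     # fchmodat(..., 0700)
    touch -c -d @1610522665 "$path"                    # utimensat(...), timestamp taken from the trace
}

workdir=$(mktemp -d)
extract_fifo "$workdir/clr-debug-pipe-in" && echo "fifo created"

# If the mknod step is skipped there is nothing to chown, so this fails
# with "No such file or directory".
chown -h "$(id -u):$(id -g)" "$workdir/never-created" || echo "chown failed"
```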

    Any help on this would be much appreciated!

    Hi @DPatrickBoyd , thanks for giving Sysbox a shot and for filing the issue. Thanks for the initial debugging on it too!

    Error processing tar file(exit status 1): lchown /tmp/clr-debug-pipe-202-24216845-in: no such file or directory

    In the past I've seen this error when using an older container image with a new version of Docker. In fact, a couple of weeks ago someone in the Sysbox slack channel reported this same issue and solved it by bumping the Docker version inside the container.

    Inside the container, can you run the following?

    lsb_release -a
    uname -a
    docker version

    Thanks!

    Go version: go1.13.15
    Git commit: 2291f61
    Built: Mon Dec 28 16:17:32 2020
    OS/Arch: linux/amd64
    Context: default
    Experimental: true

    Server: Docker Engine - Community
    Engine:
    Version: 20.10.2
    API version: 1.41 (minimum version 1.12)
    Go version: go1.13.15
    Git commit: 8891c58
    Built: Mon Dec 28 16:15:09 2020
    OS/Arch: linux/amd64
    Experimental: false
    containerd:
    Version: 1.4.3
    GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
    runc:
    Version: 1.0.0-rc92
    GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
    docker-init:
    Version: 0.19.0
    GitCommit: de40ad0

    hi @ctalledo ! thanks for getting back to me so fast. which container? the "outer" or "inner" container?

    I meant the outer container (the one launched with Docker + Sysbox). I should have been more explicit since we can get confused quickly :)

    If I understand correctly, the failure is occurring when doing a Docker build inside the outer container correct?

    Ah, interesting: on inspecting the dotnetcore base image, it turns out it was created using Docker version 19. Not sure how that is possible.
    It reports as "DockerVersion": "19.03.13+azure",

    edit: checking out the other images that are in cache, they seem to be using docker 20.10, so it's just the base image apparently
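One way to check which Docker version built an image is to read the DockerVersion field from the docker image inspect output. The sketch below assumes the pretty-printed JSON layout quoted above; the helper name extract_docker_version is hypothetical.

```shell
# Hypothetical helper: print the DockerVersion value found in
# `docker image inspect` JSON read from stdin.
extract_docker_version() {
    sed -n 's/.*"DockerVersion": *"\([^"]*\)".*/\1/p'
}

# Usage (needs a Docker daemon and a locally available image):
#   docker image inspect mcr.microsoft.com/dotnet/core/sdk:3.1-buster | extract_docker_version
```

Running it over each cached image makes it easy to spot the ones built with an older Docker release.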

    Yes, correct: the failure occurs when doing a Docker build inside the outer container. Both my host VM and the sysbox container are running Docker 20.10.

    I am actually using a docker-compose file for this, and it's a little convoluted and involves company code, so I will try to simplify it and get back to you soon. Thanks!

    Sure; whatever you can share that allows me to repro would be great. Thanks!

    Hi @DPatrickBoyd ,

    I tried reproducing with a sysbox container using nestybox/ubuntu-bionic-systemd-docker, with the inner docker at version 19.03 or version 20.10, and was not able to reproduce. That is, inside that sysbox container I could easily build a Dockerfile that looks like this:

    FROM mcr.microsoft.com/dotnet/core/sdk:3.1-buster
    RUN apt-get update && apt-get install -y nano
    

    I also tried reproducing with a sysbox container using nestybox/ubuntu-focal-systemd-docker, with the inner docker at version 19.03, and no problem there either.

    I suspect the Dockerfile I used is too simple, so if you could provide more info on the Dockerfile that is causing the failure it would be useful.

    ok I was able to replicate with general files
    here is a random dotnet application I found https://github.com/dotnet-architecture/eShopOnWeb.git

    sysbox was brought up with:

    sudo docker run --runtime=sysbox-runc --rm -it --hostname my_cont ubuntu:latest bash

    I then installed Docker manually and started the docker daemon with system docker start

    I then used
    sudo docker exec -it $containerid bash
    to get inside of it

  • just run inside of sysbox:
    git clone https://github.com/dotnet-architecture/eShopOnWeb.git
    cd eShopOnWeb
    docker-compose build
    I followed the steps but was not able to repro. Here is what I did:

  • Launched the sysbox container:
    docker run --runtime=sysbox-runc --rm -it --hostname my_cont ubuntu:latest bash

    The remainder of the commands occur inside the container.

  • Verified the container has Ubuntu Focal in it:
    # cat /etc/os-release
    NAME="Ubuntu"
    VERSION="20.04.1 LTS (Focal Fossa)"

  • Installed and started Docker:
    # apt-get update && apt install docker.io
    # dockerd > /var/log/dockerd.log 2>&1 &

    NOTE: this installed docker v19.03.8.

  • Installed Docker compose:
    # apt-get install curl
    # curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
    # chmod +x /usr/local/bin/docker-compose
    # ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose

    NOTE: this installed docker-compose 1.27.4

  • Cloned the eShopOnWeb repo and built it:
    # git clone https://github.com/dotnet-architecture/eShopOnWeb.git
    # cd eShopOnWeb/
    # docker-compose build
    

    This worked without a problem.

    I suspect the Docker versions I used are different than what you used.

    Can you confirm the versions of Docker and Docker-compose you had inside the container?

    My versions are all 20+ for docker. Doubt docker-compose is a deal breaker. I will downgrade to docker v19 and report back. Can you try with version 20 and see if it fails for you?

    Sounds good, let's do that. Thanks!

    ok, so reporting back. I was able to get it to work by downgrading, but I had to manually download and install docker-ce, docker-cli and containerd.io .deb files in the dockerfile because using docker.io was breaking init.d.

    I was able to successfully build using sysbox, so thank you!

    My only concern at the moment is how changing runtimes from runc to sysbox could potentially affect the downstream (i.e., production) environment. How different is sysbox and what sort of things could it affect?

    Does it alter the resulting image or information in any way that is unique or proprietary?

    I was able to get it to work by downgrading

    Got it. I still want to get to the bottom of why the downgrade is needed, so will take a closer look.

    but I had to manually download and install docker-ce, docker-cli and containerd.io .deb files in the dockerfile because using docker.io was breaking init.d.

    I see; that's strange. In my case the outer container had systemd in it, and installing docker.io worked perfectly.

    My only concern at the moment is how changing runtimes from runc to sysbox could potentially affect the downstream (i.e., production) environment. How different is sysbox and what sort of things could it affect?

    Sysbox can live side-by-side with the OCI runc, so it's not an "either" choice.

    As you are just getting familiarized with Sysbox, my suggestion is that you use Sysbox for containers that run workloads that otherwise require privileged containers with the OCI runc. Things like Docker-in-Docker, systemd-in-docker, or even k8s-in-Docker. This way you avoid the security risks posed by privileged containers. It's great for CI/CD, container-based dev environments, sandboxing, etc.

    Having said this, we strive to make Sysbox a superset of the OCI runc, meaning that Sysbox should be capable of running any workloads that run in containers with the OCI runc, but do so more securely. This is the case already for most workloads, though there are still a few issues.

    Does it alter the resulting image or information in any way that is unique or proprietary?

    No. Sysbox places no requirements on the container image. Rather, it works by enhancing the container abstraction, such that processes running inside the container see an environment that resembles that of a VM or physical machine (though it's really a container).

    Hope that helps!

    yes it does help thank you :)

    One thing that did happen was that at some point my containers got restarted, and the docker daemon inside of the sysbox container couldn't start up again: it still had a containerd process running somehow, and the .pid file was still there for docker. Not sure if there is something I can do, or if there is a better way to handle ungracefully shut down containers? I can make a new issue for this or we can take it to Slack.

    One thing that did happen was that at some point my containers got restarted, and the docker daemon inside of the sysbox container couldn't start up again, it still had a containerd process running somehow and the .pid file was still there for docker

    A good way to deal with that is to add a process manager to the container (e.g., systemd, supervisord, etc), such that when the container starts it can automatically start the processes / services you want it to. It also has the advantage of handling process reaping / reparenting for your container.

    Systemd is a bit heavy but Docker integrates well with it, so it will be able to restart the Docker service and automatically remove the .pid file.

    You can find examples of Dockerfiles that add systemd or supervisord to the container here:

    https://github.com/nestybox/dockerfiles
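Where adding a full process manager is not practical, a smaller workaround is to clear a stale pid file in the container's entrypoint before starting dockerd. This is only a sketch under assumptions: remove_stale_pidfile is a hypothetical helper, /var/run/docker.pid is Docker's default pid file location, and a leftover containerd process would still need separate handling.

```shell
# Hypothetical helper: remove a pid file whose recorded process is gone.
# Assumes it runs as root (or as the owner of the recorded process);
# otherwise `kill -0` fails even for a live process.
remove_stale_pidfile() {
    local pidfile="$1" pid=""
    [ -f "$pidfile" ] || return 0
    pid=$(cat "$pidfile")
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running, keeping $pidfile" >&2
        return 1
    fi
    rm -f "$pidfile"
}

# Usage in an entrypoint, before starting the daemon:
#   remove_stale_pidfile /var/run/docker.pid && dockerd > /var/log/dockerd.log 2>&1 &
```

The helper refuses to delete the file while the recorded process is alive, so it should not interfere with a daemon that is already running.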

    on further inspection, I am using the init flag for starting containers, which uses docker-init (tini) for PID 1, and I believe that is interfering with systemd starting up as it wants to be PID 1

    Got it, thanks.

    changed the title from 'dotnet restores not working inside of docker-in-docker using sysbox as runtime' to 'Docker build inside sysbox container results in "lchown ... no such file or directory" errors' on Feb 24, 2021

    We spotted an lchown error in our gitlab-ci infrastructure too, which happens when pulling an oracle image within a docker dind container:

    [ERROR] DOCKER> Unable to pull '[...]oracle-xe-11-2-0-2:RELEASE-1.0.1' from registry '[...]' : failed to register layer: Error processing tar file(exit status 1): lchown /dev/initctl: no such file or directory  [failed to register layer: Error processing tar file(exit status 1): lchown /dev/initctl: no such file or directory ]
    

    This happens while pulling an image built from this Dockerfile:
    https://github.com/oracle/docker-images/tree/main/OracleDatabase/SingleInstance/dockerfiles/11.2.0.2
    http://download.xskernel.org/soft/linux-rpm/oracle-xe-11.2.0-1.0.x86_64.rpm.zip

    It only happens within a docker dind container running on a host docker-daemon using sysbox-runc. When running docker dind on runc (with --privileged) it works.

    This does NOT happen on the host docker-daemon using sysbox-runc so it is not a general sysbox problem.

    Hi @nudgegoonies , thanks for the latest report.

    Question: what's the version / tag of the docker:dind container image on which this happens? Does it happen with the latest docker:dind?

    I ask because in the past we've seen problems with the docker:18.04-dind image, but these don't repro with the 19.04 image.

    Thanks!

    You have to build a docker image with the Dockerfile.xe and the .zip file linked in my comment above and store it in a registry. Then start a dind with a volume:

    docker volume create --name docker-dind
    docker pull docker:20.10.2-dind
    docker run --name docker-dind -v docker-dind:/var/lib/docker -d docker:20.10.2-dind
    

    Then exec into the docker-dind container and pull the self-built oracle image.

    Hi @nudgegoonies, had to dig a bit to get to the bottom of this one, but I think I've found the reason for the problem.

    First, I reproduced the problem by launching a sysbox container (with the nestybox/ubuntu-focal-systemd-docker image), and inside of it launching the docker CLI and docker daemon containers as follows:

    $ docker network create some-network
    $ docker volume create --name docker-dind
    $ docker pull docker:20.10.2-dind
    # Inner Docker dind container:
    $ docker run --privileged --name dind -d -v docker-dind:/var/lib/docker --network some-network --network-alias docker -e DOCKER_TLS_CERTDIR=/certs -v dind-certs-ca:/certs/ca -v dind-certs-client:/certs/client docker:20.10.2-dind
    # Inner Docker CLI container:
    $ docker run -it --rm --network some-network -e DOCKER_TLS_CERTDIR=/certs -v dind-certs-client:/certs/client:ro docker:latest sh
    

    Then, from the inner Docker CLI container, I pulled the oracle database container image you mentioned above.

    The pull failed with:

    failed to register layer: Error processing tar file(exit status 1): lchown /dev/initctl: no such file or directory  
    

    When I straced the docker pull operation, I found that the failure occurs in the fchownat() syscall below:

    2407193 newfstatat(AT_FDCWD, "/dev/initctl", 0xc000a6eac8, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
    2407193 fchownat(AT_FDCWD, "/dev/initctl", 0, 0, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
    2407193 write(2, "lchown /dev/initctl: no such fil"..., 46) = 46
    

    Basically, it looks like this image requires /dev/initctl; as a result, Docker is looking for /dev/initctl during the image extraction but this device does not exist within the ephemeral docker container (spawned inside the dind container) where the extraction is taking place.

    I then repeated the experiment by running the same commands above, but this time at host level (i.e., not inside the sysbox container). Interestingly, this time things worked. When I straced the docker daemon, I found the following:

    2496154 newfstatat(AT_FDCWD, "/dev/initctl",  <unfinished ...>
    2496154 <... newfstatat resumed>0xc000a3cc68, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
    2496154 mknodat(AT_FDCWD, "/dev/initctl", S_IFIFO|0600 <unfinished ...>
    2496154 <... mknodat resumed>)          = 0
    2496154 fchownat(AT_FDCWD, "/dev/initctl", 0, 0, AT_SYMLINK_NOFOLLOW <unfinished ...>
    2496154 <... fchownat resumed>)         = 0
    

    Notice the difference: docker called mknod on /dev/initctl during the image extraction. As a result, the subsequent fchownat() worked fine.

    So why did Docker not call mknod when the dind image ran inside the sysbox container, but did call it when the dind image ran on the host?

    Looking at the Docker code, it appears the answer is here:

    case mode&os.ModeDevice != 0:
        if sys.RunningInUserNS() {
            // cannot create a device if running in user namespace
            return nil
        }
        if err := unix.Mknod(dstPath, stat.Mode, int(stat.Rdev)); err != nil {
            return err
        }

    Since Sysbox containers always use the Linux user-namespace (for strong isolation), the Docker daemon running inside the inner dind container is refusing to use mknod to create the /dev/initctl device required by the Oracle image. As a result, the subsequent fchownat() fails.

    This explains the failure. It's really caused by Docker's assumption that mknod is not allowed within a user-ns. This is generally true, but it does not take into account that container runtimes like Sysbox (or LXD, for example) can deal properly with such operations by intercepting the mknod syscall, examining whether it's allowed, and if so handling it on behalf of the container. Thus, it would be better if Docker called mknod() first and, only if that failed, checked whether it's running in a user-ns.
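The ordering suggested here (attempt mknod first, check for a user namespace only on failure) can be sketched in shell. This is an illustration under assumptions, not Docker's actual code: both function names are hypothetical, the user-namespace test is modeled on the common /proc/self/uid_map heuristic (the initial namespace maps the full ID range onto itself), and a FIFO is used as the node type so the sketch can be tried without root.

```shell
# Report whether a uid_map file describes a user namespace. The initial
# namespace contains the identity mapping "0 0 4294967295".
in_userns() {
    local map="${1:-/proc/self/uid_map}" inside="" outside="" count=""
    [ -r "$map" ] || return 1
    read -r inside outside count < "$map" || [ -n "$inside" ] || return 1
    if [ "$inside" = 0 ] && [ "$outside" = 0 ] && [ "$count" = 4294967295 ]; then
        return 1
    fi
    return 0
}

# Hypothetical helper: attempt mknod first and consult the user-namespace
# check only when the call fails. The optional second argument overrides the
# uid_map path, which makes the logic easy to try outside a container.
create_fifo_node() {
    local path="$1" map="${2:-/proc/self/uid_map}"
    if mknod "$path" p 2>/dev/null; then
        return 0
    fi
    if in_userns "$map"; then
        echo "skipping $path: mknod failed inside a user namespace" >&2
        return 0
    fi
    echo "mknod failed for $path" >&2
    return 1
}
```

With this ordering, a runtime that handles mknod on behalf of the container gets a chance to service the call, and the node is skipped only when the call really fails inside a user namespace.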

    As far as a solution, I don't have a good one right now. The only work-around I found was to use the docker:19.03.2-dind image instead of the docker:20.10.2-dind image (which suggests the Docker source code check for userns I copied above must have been recently added).

    I'll think about whether there is some other solution to make this work with docker:20.10.2-dind.

    Hi @nudgegoonies ,

    One question comes to my mind as this behavior comes from using userns. Would shiftfs help in this situation? There are already "unofficial" dkms solutions available for running the shiftfs kernel module on Debian.

    No, it won't, unfortunately. Sysbox always creates containers with the Linux user-namespace (for strong isolation), regardless of whether shiftfs is present or not. Thus, the inner Docker will refuse to mknod() and the docker pull of the oracle database container image will fail.

    The presence of shiftfs in the kernel is complementary to user-ns: if present in the kernel, it means Docker can continue to create the container's filesystem with host root:root ownership, yet the container will have access to it even though the container's root user is not the host's root user (by virtue of Sysbox using the user-namespace). Without shiftfs, Docker needs to create the container's filesystem with ownership that matches the container's root user (i.e., Docker must be configured in userns-remap mode).
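To see the ID mapping described here from inside a container, one can read /proc/self/uid_map. The sketch below prints the ID in the parent user namespace that the container's UID 0 maps to; host_uid_of_root is a hypothetical helper name, and the optional argument exists only so the parsing can be tried on a sample file.

```shell
# Hypothetical helper: print the ID in the parent user namespace that
# UID 0 of the current namespace maps to, based on a uid_map file.
# Each line has the form: <start inside> <start outside> <length>.
host_uid_of_root() {
    local map="${1:-/proc/self/uid_map}" inside="" outside="" count=""
    while read -r inside outside count; do
        if [ "$inside" = 0 ]; then
            echo "$outside"
            return 0
        fi
    done < "$map"
    return 1
}

# Usage inside a container:
#   host_uid_of_root
```

A result other than 0 indicates that the container's root user is not the root user of the parent namespace.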

    Hope that helps.

    Pulling image gcr.io/cloud-foundation-cicd/cft/developer-tools inside the Sysbox container fails