squid crashes on startup in docker

Bug #1978272 reported by Wen-Ding Zeng
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| Ubuntu Docker Images | Fix Released | Medium | Athos Ribeiro | |

Bug Description

Squid crashes immediately on startup in docker.

```sh
docker run --name squid ubuntu/squid:5.2-22.04_beta
```

The error messages are:

```
2022/06/10 06:20:01| WARNING: BCP 177 violation. Detected non-functional IPv6 loopback.
2022/06/10 06:20:01| FATAL: xcalloc: Unable to allocate 1073741816 blocks of 432 bytes!

2022/06/10 06:20:01| Squid Cache (Version 5.2): Terminated abnormally.
CPU Usage: 0.019 seconds = 0.011 user + 0.009 sys
Maximum Resident Size: 69728 KB
Page faults with physical i/o: 0
/usr/local/bin/entrypoint.sh: line 24: 11 Aborted (core dumped) /usr/sbin/squid -Nz
2022/06/10 06:20:01| WARNING: BCP 177 violation. Detected non-functional IPv6 loopback.
2022/06/10 06:20:01| FATAL: xcalloc: Unable to allocate 1073741816 blocks of 432 bytes!

2022/06/10 06:20:01| Squid Cache (Version 5.2): Terminated abnormally.
CPU Usage: 0.018 seconds = 0.012 user + 0.006 sys
Maximum Resident Size: 69376 KB
Page faults with physical i/o: 0
/usr/local/bin/entrypoint.sh: line 25: 12 Aborted (core dumped) /usr/sbin/squid "$@"
```

Squid was trying to allocate 1073741816 × 432 bytes, which is about 432 GiB of memory. That is clearly abnormal.
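
As a quick sanity check of that figure, the arithmetic can be reproduced in the shell. This is only a sketch using the two numbers from the FATAL line; the variable names are illustrative.

```sh
# Numbers taken from the FATAL line in the log above.
blocks=1073741816
size=432

# Total bytes requested (fits in 64-bit shell arithmetic).
bytes=$((blocks * size))
echo "$bytes bytes"

# Convert to GiB with awk to get a fractional result.
gib=$(awk -v b="$bytes" 'BEGIN { printf "%.1f", b / (1024 * 1024 * 1024) }')
echo "$gib GiB"
```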

When squid is run directly on the host, no error occurs.

Environment:

```sh
$ docker --version
Docker version 20.10.14, build a224086
$ cat /etc/fedora-release
Fedora release 35 (Thirty Five)
$ uname -r
5.16.18-200.fc35.x86_64
```

The detailed log (produced by passing `-X` to print debug messages) can be found [here](https://pastebin.com/gKE4qUc0).

Tags: squid
Wen-Ding Zeng (wdzeng)
description: updated
Wen-Ding Zeng (wdzeng)
tags: added: squid
Athos Ribeiro (athos-ribeiro) wrote:

Hello,

This seems to be an instance of this issue described in the upstream mailing lists:
https://<email address hidden>/msg19232.html

In a debugging session (https://<email address hidden>/msg19252.html) we see that squid calls

fd_table = (fde *) xcalloc(Squid_MaxFD, sizeof(fde));

And as per https://<email address hidden>/msg19266.html,

Squid_MaxFD is set based on ulimit. In the squid config file (/etc/squid/squid.conf), we have:

# TAG: max_filedescriptors
# Set the maximum number of filedescriptors, either below the
# operating system default or up to the hard limit.
#
# Remove from squid.conf to inherit the current ulimit soft
# limit setting.
#
# Note: Changing this requires a restart of Squid. Also
# not all I/O types supports large values (eg on Windows).
#Default:
# Use operating system soft limit set by ulimit.

On a fedora host, when I read /proc/1/limits, I see

$ cat /proc/1/limits
Max open files 1073741816 1073741816 files

Which matches the numbers in your error.

While in Ubuntu, I get
Max open files 1048576 1048576 files

Interestingly enough, on a Fedora host, when I check the same value within a __podman__ container, it is set to "1048576" as well. While I do not have docker installed on the Fedora host I have access to, I believe that if you check that value within your docker container, you will get 1073741816 as well.
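
A minimal sketch for reading those values, assuming a Linux `/proc` filesystem. The helper name `nofile_limits` is made up for illustration; it just extracts the soft and hard "Max open files" columns from a limits file.

```sh
# Print the soft and hard "Max open files" limits from a limits file.
# Defaults to the calling process; pass /proc/1/limits to inspect PID 1.
# (nofile_limits is a hypothetical helper name, not part of any tool.)
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }' "${1:-/proc/self/limits}"
}

# Example: nofile_limits /proc/1/limits
```

To look at the value inside the container rather than on the host, something like `docker run --rm --entrypoint cat ubuntu/squid:5.2-22.04_beta /proc/1/limits` should work, assuming `cat` is available in the image.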

Do note that these values are hardcoded. For instance:
https://github.com/containers/podman/blob/v4.1.0/libpod/define/config.go#L92

Now, why are the values set to that specific number? At least in podman, this was done to match Docker defaults, as seen in
https://github.com/containers/podman/pull/1355, and
https://github.com/containers/buildah/commit/a2b018430df1e8e89b23cb5bfa49e4d3517e1c2d

However, later in this timeline, Docker changed how these limits are set:
https://github.com/moby/moby/commit/80039b4699e36ceb0eb81109cd1686aaa805c5ec

But Fedora did set the daemon to use an even lower value:
https://bugzilla.redhat.com/show_bug.cgi?id=1715254#c1
https://src.fedoraproject.org/rpms/moby-engine/blob/f35/f/docker.sysconfig#_7
https://src.fedoraproject.org/rpms/moby-engine/blob/rawhide/f/docker.sysconfig#_7

Are you using the (Fedora) distribution docker (moby-engine) package or are you using a different provider for your docker setup?

Moving on here, the following workaround alternatives exist:

1) Run the squid containers setting the maximum number of file descriptors to a lower value with "--ulimit nofile=1048576:1048576"

2) Since you are in Fedora, you could use podman (or moby-engine, if that is not the case already).

3) Configure your docker installation to use a lower number of file descriptors.

4) Edit the squid configuration file to use a specific value for max_filedescriptors, e.g., 1048576.
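
Workarounds 1 and 4 could look roughly like the sketch below. The `docker run` line only combines the original command with the `--ulimit` option quoted above; `pin_max_fds` is a hypothetical helper that appends the directive to a squid.conf copy when it is not already set.

```sh
# Workaround 1: lower the limit for this container only.
#   docker run --name squid --ulimit nofile=1048576:1048576 ubuntu/squid:5.2-22.04_beta

# Workaround 4: pin max_filedescriptors in a squid.conf.
# (pin_max_fds is a hypothetical helper; commented-out lines such as
# "# TAG: max_filedescriptors" are not counted as an existing setting.)
pin_max_fds() {
    conf=$1
    limit=${2:-1048576}
    if grep -Eq '^[[:space:]]*max_filedescriptors[[:space:]]' "$conf"; then
        echo "max_filedescriptors already set in $conf"
    else
        printf 'max_filedescriptors %s\n' "$limit" >> "$conf"
    fi
}

# Example: pin_max_fds ./squid.conf 1048576
```

The edited file would then have to be made available to the container, for instance by mounting it over /etc/squid/squid.conf.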

Finally, when it comes to providing a fix for the issue, it seems that this should be discussed in squid upstream. If we were going to set a default value for max_filedescriptors in the squid ...


Athos Ribeiro (athos-ribeiro) wrote :

BTW, just for completeness here: Fedora's squid package caps the maximum number of open file descriptors for squid in the systemd service file. If that cap is not present, squid will most likely crash on startup when max_filedescriptors is not set in the configuration file.

https://src.fedoraproject.org/rpms/squid/blob/rawhide/f/squid.service#_8

Wen-Ding Zeng (wdzeng) wrote :

Hi,

Thank you very much for this information.

I have checked the environment you mentioned, and, indeed, the issue is due to the number of max open files. Therefore, this should not be considered a bug concerning ubuntu docker images.

Thank you very much for your help!

Athos Ribeiro (athos-ribeiro) wrote :

I did dive a bit further into the issue and decided to check the history of that value in Fedora's systemd configuration.

It turns out that it was a request to __increase__ the limit of open files [1].

Checking the documentation of LimitNOFILE [2], I learned that systemd is compiled with default values that are applied to all units [3].

DefaultLimitNOFILE= defaults to 1024:524288.

Squid uses the soft limit for this value, as described in the configuration file:

# cat /etc/squid/squid.conf | grep max_filedescrip -A11
# TAG: max_filedescriptors
# Set the maximum number of filedescriptors, either below the
# operating system default or up to the hard limit.
#
# Remove from squid.conf to inherit the current ulimit soft
# limit setting.
#
# Note: Changing this requires a restart of Squid. Also
# not all I/O types supports large values (eg on Windows).
#Default:
# Use operating system soft limit set by ulimit.

Neither Ubuntu nor Fedora change these limits:

$ systemctl show -P DefaultLimitNOFILE
524288
$ systemctl show -P DefaultLimitNOFILESoft
1024

And this is what [1] was about: bumping the limit from 1024 to 16384 to support (possibly much) larger cache directories (note that Ubuntu's squid does not carry this patch).

This also means that my earlier assessment saying that

> If that [systemd unit file patch] is not present, squid will most likely crash on startup if max_filedescriptors is not set in the configuration file.

was incorrect. Thanks to the default systemd limits, squid will not crash when max_filedescriptors is unset.

Finally, re-assessing this issue with this information, it seems that, if we want to mimic the behavior of a default Ubuntu squid process (as run from systemd), we do want to set the max_filedescriptors configuration to 1024 in our configuration file.
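
For comparison only, the same 1024 soft limit could also be approximated at the process level instead of in squid.conf. The sketch below is an assumption, not what the image or the proposed fix does: `cap_soft_nofile` is a hypothetical helper that lowers the soft open-files limit of the current shell (and anything it starts afterwards) and leaves the hard limit untouched.

```sh
# Lower the soft open-files limit to a cap if it is currently higher.
# The hard limit is left alone. (cap_soft_nofile is a hypothetical helper.)
cap_soft_nofile() {
    cap=${1:-1024}
    current=$(ulimit -Sn)
    if [ "$current" = "unlimited" ] || [ "$current" -gt "$cap" ]; then
        ulimit -Sn "$cap"
    fi
}

cap_soft_nofile 1024
# A wrapper script could then start squid, e.g.: exec /usr/sbin/squid "$@"
```

Setting max_filedescriptors in the configuration file keeps the choice visible to anyone reading squid.conf, which is one reason to prefer it over a wrapper like this.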

Moreover, it would be nice to document the need for setting max_filedescriptors for users injecting their configuration files into the image.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=772481
[2] https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties
[3] https://www.freedesktop.org/software/systemd/man/systemd-system.conf.html#DefaultLimitCPU=

Changed in ubuntu-docker-images:
status: New → Triaged
importance: Undecided → Medium
Athos Ribeiro (athos-ribeiro) wrote :

Hi wdzeng,

I filed https://code.launchpad.net/~athos-ribeiro/ubuntu-docker-images/+git/squid/+merge/424521 to address this issue.

Could you please build the image from that branch and confirm it fixes the issue for you?

Changed in ubuntu-docker-images:
status: Triaged → Fix Committed
assignee: nobody → Athos Ribeiro (athos-ribeiro)
Athos Ribeiro (athos-ribeiro) wrote :

Thanks for testing the MP, wdzeng.

This has been re-built and re-tagged in DockerHub and in ECR.

Changed in ubuntu-docker-images:
status: Fix Committed → Fix Released