Bug #6635

Long startup delay caused by random generator on v3.5

Added by Max Dobler over 2 years ago. Updated about 2 years ago.

Status: New
Priority: High
Assignee: -
Category: -
Target version: -
Start date: 01/05/2017
Due date:
% Done: 0%
Estimated time:
Affected versions:
Security IDs:

Description

  • Installed fresh Alpine 3.5
  • Activated openssh
  • On reboot: boot freezes on openssh startup for a minute or more
  • Afterwards ssh works normally
  • Deactivated openssh
  • Activated dropbear: no freezes on boot, ssh comes up fast and works fine

Haven't seen this in 3.4.x

Max


Related issues

Related to Alpine Linux - Bug #6329: wpa_supplicant delay after libressl adoption (New, 10/11/2016)

History

#1 Updated by Jakub Jirutka over 2 years ago

  • Subject changed from "Alpine 3.5 openssh: freezes" to "main/openssh: freezes on Alpine 3.5"
  • Category set to Aports
  • Priority changed from Normal to High

#2 Updated by Jones Wilson over 2 years ago

I'm also experiencing this issue after upgrading Alpine domUs to 3.5.0.

Using strace I can see that sshd is freezing on the getrandom() syscall, and it resumes after the message "random: nonblocking pool is initialized" appears in dmesg.

I'm able to reduce the freeze time from 2.3 minutes to 70 seconds by installing haveged.
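
The state described in this comment can be checked from userspace. The following is a minimal sketch, not something posted in the thread: the helper names `entropy_available` and `pool_initialized` are made up for illustration, and it assumes Python 3.6+ on a Linux kernel that provides getrandom(2). It reads the kernel's entropy estimate and asks getrandom() in non-blocking mode whether a blocking call would currently hang.

```python
import os

ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def entropy_available(path=ENTROPY_AVAIL):
    """Return the kernel's entropy estimate in bits, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pool_initialized():
    """True if getrandom(2) would return at once, False if it would block,
    None if this cannot be determined on the current platform."""
    if not hasattr(os, "getrandom"):
        return None  # not Linux, or Python older than 3.6
    try:
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        return False  # pool not initialized yet; a blocking call would hang
    except OSError:
        return None  # e.g. the syscall is not available on this kernel


if __name__ == "__main__":
    print("entropy_avail:", entropy_available())
    print("pool initialized:", pool_initialized())
```

Run early in boot (for example from a local start script), a result of `False` would match the sshd hang seen in strace.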

#3 Updated by Jakub Jirutka over 2 years ago

  • Subject changed from "main/openssh: freezes on Alpine 3.5" to "Long startup delay caused by random generator on v3.5"
  • Category deleted (Aports)

I’ve upgraded one virtual machine to v3.5 and I see the same behaviour.

The important information is that this is not specific to the ssh daemon. In my case it hangs on nrpe; when I disable that, it hangs on openntpd; and when I disable that too, it hangs on sshd.

As Jones already discovered, the problem is in the initialization of the random generator. The delay we experience is probably caused by a lack of entropy.

I’m going to investigate further.

#4 Updated by Jakub Jirutka over 2 years ago

I’ve added virtio-rng to QEMU/KVM and the long delay is gone. However, I still don’t know what caused this change.
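
To confirm from inside a guest that a hardware RNG such as virtio-rng is actually visible to the kernel, the hw_random sysfs attributes can be inspected. This is a hedged sketch rather than anything from the thread: `hwrng_status` and `read_attr` are illustrative names, and the device name in the usage note is only an example of what a virtio device might be called.

```python
import os

HWRNG_DIR = "/sys/class/misc/hw_random"  # sysfs directory of the hw_random core


def read_attr(name, base=HWRNG_DIR):
    """Return the stripped contents of one sysfs attribute, or None."""
    try:
        with open(os.path.join(base, name)) as f:
            return f.read().strip()
    except OSError:
        return None


def hwrng_status(base=HWRNG_DIR):
    """Report which hardware RNGs the kernel knows about and which one is active."""
    available = read_attr("rng_available", base)
    current = read_attr("rng_current", base)
    return {
        "available": available.split() if available else [],
        # the kernel prints "none" when no device is selected
        "current": current if current and current != "none" else None,
    }


if __name__ == "__main__":
    print(hwrng_status())
```

On a guest with virtio-rng attached, `current` would be expected to name the virtio device (something like `virtio_rng.0`); an empty `available` list suggests no hardware RNG is exposed to the guest.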

#5 Updated by Olivier Goudron over 2 years ago

I had the same problem with openssh on a minimal fresh install on Intel NUC 5CP hardware.
Just for information: if I hit some keys on the keyboard, the openssh service starts quickly.
This seems to confirm that lack of entropy is the cause.

#6 Updated by Natanael Copa over 2 years ago

Then I think I can see this too with wpa_supplicant. It sometimes hangs until I press keys on the keyboard. I never thought it could be due to lack of entropy.

I wonder if there is some kernel module that should be autoloaded for rng?

#7 Updated by anta over 2 years ago

I got delays between wpa_supplicant and networking too.

Reported 3 months ago: https://bugs.alpinelinux.org/issues/6329

I started noticing it after libressl adoption, but could be a coincidence.

#8 Updated by Timo Teräs over 2 years ago

  • Related to Bug #6329: wpa_supplicant delay after libressl adoption added

#9 Updated by Timo Teräs over 2 years ago

It is possible that libressl uses a different default entropy source than openssl (random vs. urandom) or similar. Need to look at the code.

#10 Updated by anta over 2 years ago

On boot the random number generator is initialized after networking and wpa_supplicant.
Could that be the issue?

I get ` Initializing random number generator ... ` after those things.
What if it was done before?

#11 Updated by Timo Teräs over 2 years ago

Seems libressl has migrated to use its own arc4random for all random number generation. The entropy injection comes from the getrandom(2) syscall, which blocks until the kernel's entropy pool has been initialized. This is probably causing the slowdown.
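
The blocking behaviour described here can be measured directly. The sketch below is an illustration only, not libressl's code: `time_blocking_getrandom` is a hypothetical helper, and it assumes Python 3.6+ on Linux. It makes one getrandom(2) call with flags=0, the mode that waits for the pool to be initialized, and reports how long the call took.

```python
import os
import time


def time_blocking_getrandom(nbytes=16):
    """Return (seconds_blocked, data) for one blocking getrandom(2) call."""
    start = time.monotonic()
    data = os.getrandom(nbytes)  # flags=0: waits until the pool is initialized
    return time.monotonic() - start, data


if __name__ == "__main__":
    elapsed, data = time_blocking_getrandom()
    print(f"getrandom returned {len(data)} bytes after {elapsed:.3f} s")
```

On an affected system, running this right after boot should stall for roughly as long as the services reported in this issue; once the pool is initialized it returns almost immediately.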

#13 Updated by Natanael Copa about 2 years ago

Does it help to add jitterentropy_rng to /etc/modules?
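
When trying this suggestion, it helps to verify that the module is both listed for loading at boot and actually loaded. A minimal sketch under stated assumptions: the helper names are hypothetical, `/etc/modules` is assumed to hold one module name per line with optional `#` comments, and a module built into the kernel will not show up in `/proc/modules` at all.

```python
def module_listed(name, modules_file="/etc/modules"):
    """True if `name` is an entry in the modules file, ignoring comments."""
    try:
        with open(modules_file) as f:
            for line in f:
                fields = line.split("#", 1)[0].split()
                if fields and fields[0] == name:
                    return True
    except OSError:
        pass
    return False


def module_loaded(name, proc_modules="/proc/modules"):
    """True if `name` appears as a loaded module in /proc/modules."""
    try:
        with open(proc_modules) as f:
            return any(line.split()[:1] == [name] for line in f)
    except OSError:
        return False


if __name__ == "__main__":
    mod = "jitterentropy_rng"
    print("listed in /etc/modules:", module_listed(mod))
    print("currently loaded:", module_loaded(mod))
```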

#14 Updated by anta about 2 years ago

I stopped experiencing this issue after upgrading from 3.5.1 to Edge.
I'd bet the kernel folks changed something on their side.

#15 Updated by Ralph Siemsen about 2 years ago

Natanael Copa wrote:

Does it help to add jitterentropy_rng to /etc/modules?

No difference for me, boot still hangs until I plug in a keyboard and type a few characters.

Also I tried upgrading to 3.5.2 (kernel 4.4.52-0-grsec) but the problem persists.

#16 Updated by anta about 2 years ago

Try updating to edge (kernel 4.9.13-0-grsec).
That solved it for me.

#17 Updated by Ralph Siemsen about 2 years ago

Al P wrote:

Try updating to edge (kernel 4.9.13-0-grsec)
That solved for me

I saw your previous reply #14, but this is not really an option I want to explore. I'm running the stable version on purpose: for my little gateway I want stable, no surprises, simple 24x7 operation. Alpine has been really great apart from this little hiccup.

I see no indication that upstream libressl plans to address the issue. So if the fix is in the newer kernel, could it be backported, or alternatively, could stable Alpine move to the newer kernel? I realize I can do that manually myself; I am asking more on behalf of all the other Alpine users and newbies. Many of them will not realize what is happening, and even fewer will find this bug report to add their comments.

#18 Updated by Matt Hoyle about 2 years ago

Jones Wilson wrote:

Using strace I can see that sshd is freezing on the getrandom() syscall and it resumes after the message random: nonblocking pool is initialized appears in dmesg.

I'm able to reduce the freeze time from 2.3 minutes to 70 seconds by installing haveged.

Same thing on 3.5.2 with the 4.4.52 kernel. The block occurs either in ssh-keygen on first run or in the sshd config test.

Thanks for the haveged tip, went from 120 -> 90 seconds at least.

#19 Updated by Ralph Siemsen about 2 years ago

Matt Hoyle wrote:

Thanks for the haveged tip, went from 120 -> 90 seconds at least.

I have installed a tool similar to haveged on my machine: http://chronox.de/jent/jitterentropy-2.0.1.tar.xz
It solves the problem with SSH starting up; I no longer need to plug in a keyboard.
