Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#891 closed Bug / Defect (fixed-external)

Restart, "Address already in use" for management port

Reported by: teco Owned by:
Priority: major Milestone:
Component: Management Version: OpenVPN 2.4.0 (Community Ed)
Severity: Not set (select this one, unless you're an OpenVPN developer) Keywords: restart address_in_use
Cc:

Description

When OpenVPN is restarted with a management interface configured, startup can fail because the management port is still in use.

root@host:/home/test-openvpn# kill $(netstat -anp | grep openvpn | awk '/2201/{print $6}' | awk -F\/ '{print $1}'); openvpn --config openvpn-server.conf
Mon May 15 06:08:39 2017 OpenVPN 2.4.1 i686-pc-linux-gnu [SSL (mbed TLS)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Apr 10 2017
Mon May 15 06:08:39 2017 library versions: mbed TLS 2.4.2, LZO 2.08
Mon May 15 06:08:39 2017 MANAGEMENT: Socket bind failed on local address [AF_INET]127.0.0.1:666: Address already in use
Mon May 15 06:08:39 2017 Exiting due to fatal error
root@host:/home/test-openvpn# 

Tested with OpenVPN 2.4.1 (openvpn-nl edition).

Changing the config slightly makes the problem go away, so it seems to be timing-related. For example, using a pid file and kill $(cat pid_file) does not result in this failure. Or at least sometimes it does not...

But the problem does occur with, for example, the Debian service openvpn restart. Restarting remotely then results in a lost connection.

Maybe the management listening port is not closed at stop, and cleanup of the socket by the OS takes a bit longer than startup of the new OpenVPN instance.

Change History (7)

comment:1 Changed 3 years ago by Steffan Karger

This looks to me like a race condition in the way you kill and restart the processes.

kill sends a signal to the OpenVPN process and returns as soon as the signal has been sent. It does not wait until the openvpn process has stopped. If your shell starts the new openvpn process faster than the old openvpn process can clean up and shut down (which includes closing the mgmt interface), the new process will not (yet) be able to open the socket.

To fix that, make sure your script waits for the old process to exit before it starts the new one.

(And if our init script does something similar, we need to fix that.)
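A minimal sketch of such a wait, not taken from any OpenVPN or Debian script. The helper name wait_for_exit, the pid file path, and the timeout are assumptions made for illustration. It polls with kill -0 because the shell's wait builtin only works for children of the current shell.

```shell
#!/bin/sh
# wait_for_exit PID [TIMEOUT_SECONDS]
# Hypothetical helper: returns 0 once the process is gone,
# or 1 if it is still running after TIMEOUT_SECONDS (default 10).
wait_for_exit() {
    pid=$1
    timeout=${2:-10}
    waited=0
    # kill -0 delivers no signal; it only tests whether the PID exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible usage (paths are placeholders, left commented out):
# pid=$(cat /run/openvpn/server.pid)
# kill "$pid" && wait_for_exit "$pid" 10 && openvpn --config openvpn-server.conf
```

The one-second polling interval is coarse. A shorter sleep would reduce restart latency where the local sleep accepts fractional seconds.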

Last edited 3 years ago by Steffan Karger

comment:2 Changed 3 years ago by teco

With SIGHUP, I get a built-in pause:

Mon May 15 22:25:45 2017 us=588914 Restart pause, 5 second(s)

The Debian init script runs stop and start without any delay:

restart)
  shift
  $0 stop ${@}
  $0 start ${@}
  ;;

Maybe add a delay here? Probably subsecond is enough.

comment:3 Changed 3 years ago by teco

So this would be a change in https://anonscm.debian.org/cgit/collab-maint/openvpn.git/tree/debian/openvpn.init.d .
The Debian package is a bit out of date; it should be updated to 2.4.1, at least in testing.


comment:4 Changed 3 years ago by teco

The init.d stop function has been implemented with start-stop-daemon since #716794. The stop function waits until the daemon has stopped.
The OpenVPN-NL edition missed this patch, but it has the old 'sleep 1' method.
With systemd, however, this sleep command is not used. I'll check why the systemd stop that runs as part of restart does not wait for the daemon to finish.
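A minimal sketch of a stop function that waits for the daemon, using start-stop-daemon's --retry schedule. This is not the actual Debian or OpenVPN-NL init script: the DAEMON and PIDFILE paths, the SSD override variable, and the TERM/10/KILL/5 schedule are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical paths; adjust to the local installation.
DAEMON=${DAEMON:-/usr/sbin/openvpn}
PIDFILE=${PIDFILE:-/run/openvpn/server.pid}
# Hypothetical override so the function can be dry-run with SSD=echo.
SSD=${SSD:-start-stop-daemon}

stop_vpn() {
    # --retry TERM/10/KILL/5: send TERM and wait up to 10 s for exit,
    #   then send KILL and wait up to 5 s more.
    # --oknodo: exit 0 even if no matching process was found.
    "$SSD" --stop --quiet --oknodo \
        --pidfile "$PIDFILE" --exec "$DAEMON" \
        --retry TERM/10/KILL/5
}
```

With --retry, the command only returns once the process has exited or the schedule is exhausted, so a start issued afterwards should no longer race against the old daemon closing its management socket.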

comment:5 Changed 3 years ago by teco

Tests of openvpn-nl with the updated init script show that it partly solves the problem: stop now finishes only once the openvpn-nl daemon has stopped (a function of start-stop-daemon). As a second improvement, the redirect to systemctl is disabled with _SYSTEMCTL_SKIP_REDIRECT=1, inserted before /lib/lsb/init-functions.

As this is a debian and openvpn-nl issue, I close this ticket.

comment:6 Changed 3 years ago by Steffan Karger

Resolution: fixed-external
Status: new → closed

This is indeed a Debian packaging issue (also in openvpn-nl), so I'll close this ticket as you requested.
