Opened 7 years ago
Closed 3 years ago
#1010 closed Bug / Defect (fixed)
p2p, tls-client/tls-server, connect-retry not playing nicely
Reported by: | Gert Döring | Owned by: | Gert Döring |
---|---|---|---|
Priority: | minor | Milestone: | release 2.5.3 |
Component: | Generic / unclassified | Version: | OpenVPN 2.4.4 (Community Ed) |
Severity: | Not set (select this one, unless your'e a OpenVPN developer) | Keywords: | |
Cc: | Selva Nair, tct |
Description
we've sabotaged p2p with tls-client/tls-server in interesting ways in 2.4
--connect-retry
has an exponentially growing delay nowadys, leading to up-to-300s "dead time" on the tls-server(!) side -- so when the network gets disrupted for a longer time, and ping-restart is in use, it can happen that the tls-server is just "not listening" to incoming client packets when the client tries, and when the server is ready to listen, the client is in connect-retry sleep...
workaround: --connect-retry 1 1
on the --tls-server
side
Better fix: default to "no increase in delay" (as a default) on --tls-server
Change History (12)
comment:1 Changed 7 years ago by
comment:2 follow-up: 7 Changed 7 years ago by
So the offending case is (topology == TOP_P2P && !options.ce.remote).
Shall we make the backoff conditional on (options.ce.remote != NULL) as that will be false for the listening side (server or p2p) ?
comment:3 Changed 7 years ago by
Hi,
As requested by Gert here an excerpt from our main OpenVPN logs showing the timing problems :-)
Jan 4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Jan 4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: Preserving previous TUN/TAP instance: ovpn-bbr-wagsh Jan 4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: Could not determine IPv4/IPv6 protocol. Using AF_INET Jan 4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: Using bind-dev vrf_external Jan 4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: UDPv4 link local (bound): [AF_INET]a.b.c.d:54010 Jan 4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: UDPv4 link remote: [AF_UNSPEC] Jan 4 06:38:12 cr03 ovpn-cr03_bbr-wagsh[499]: [UNDEF] Inactivity timeout (--ping-restart), restarting Jan 4 06:38:12 cr03 ovpn-cr03_bbr-wagsh[499]: SIGUSR1[soft,ping-restart] received, process restarting Jan 4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Jan 4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Jan 4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: Preserving previous TUN/TAP instance: ovpn-cr03 Jan 4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: TCP/UDP: Preserving recently used remote address: [AF_INET]a.b.c.d:54010 Jan 4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: UDP link local: (not bound) Jan 4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: UDP link remote: [AF_INET]a.b.c.d:54010 Jan 4 06:40:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: [UNDEF] Inactivity timeout (--ping-restart), restarting Jan 4 06:40:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: SIGUSR1[soft,ping-restart] received, process restarting Jan 4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Jan 4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: Preserving previous TUN/TAP instance: ovpn-bbr-wagsh Jan 4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: Could not determine IPv4/IPv6 protocol. Using AF_INET Jan 4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: Using bind-dev vrf_external Jan 4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: UDPv4 link local (bound): [AF_INET]a.b.c.d:54010 Jan 4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: UDPv4 link remote: [AF_UNSPEC] Jan 4 06:43:42 cr03 ovpn-cr03_bbr-wagsh[499]: [UNDEF] Inactivity timeout (--ping-restart), restarting Jan 4 06:43:42 cr03 ovpn-cr03_bbr-wagsh[499]: SIGUSR1[soft,ping-restart] received, process restarting Jan 4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Jan 4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Jan 4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: Preserving previous TUN/TAP instance: ovpn-cr03 Jan 4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: TCP/UDP: Preserving recently used remote address: [AF_INET]a.b.c.d:54010 Jan 4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: UDP link local: (not bound) Jan 4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: UDP link remote: [AF_INET]185.46.137.162:54010 Jan 4 06:46:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: [UNDEF] Inactivity timeout (--ping-restart), restarting Jan 4 06:46:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: SIGUSR1[soft,ping-restart] received, process restarting
Server configuration:
local a.b.c.d port 54010 tls-server bind-dev vrf_external proto udp dev-type tap dev ovpn-bbr-wagsh ca /etc/ssl/certs/ffho-cacert.pem cert /etc/ssl/certs/cr03....cert.pem key /etc/ssl/private/cr03....key.pem dh /etc/openvpn/dh1024.pem script-security 2 up /etc/openvpn/ifup down /etc/openvpn/ifdown keepalive 10 30 comp-lzo persist-key persist-tun status /var/log/openvpn/openvpn-status-cr03_bbr-wagsh.log verb 1
Client configuration:
remote a.b.c.d 54010 tls-client nobind bind-dev vrf_external proto udp dev-type tap dev ovpn-cr03 ca /etc/ssl/certs/ffho-cacert.pem cert /etc/ssl/certs/bbr-wagsh....cert.pem key /etc/ssl/private/bbr-wagsh....key.pem dh /etc/openvpn/dh1024.pem script-security 2 up /etc/openvpn/ifup down /etc/openvpn/ifdown keepalive 10 30 comp-lzo persist-key persist-tun status /var/log/openvpn/openvpn-status-cr03_bbr-wagsh.log verb 1
The bad thing is, that it's hard to trigger this problem. It seems that it takes a while to reach this state. :-)
Best
Max
comment:5 Changed 4 years ago by
This problem is still present in case of udp
, secret
, no remote
[= passive listening] and ping-restart
(or keepalive
). See #1384.
comment:6 Changed 4 years ago by
Cc: | tct added |
---|
comment:7 Changed 4 years ago by
Hi,
dunno where this fell between the cracks of time and space...
Replying to Selva Nair:
So the offending case is (topology == TOP_P2P && !options.ce.remote).
Shall we make the backoff conditional on (options.ce.remote != NULL) as that will be false for the listening side (server or p2p) ?
... but this sounds like a reasonable approach.
Selva, since you already investigated the code ("3 years ago *cough*") would you be willing to send a patch? I'll see that I can set up a testbed to reproduce, and then fix in 2.4, 2.5, master...
comment:8 Changed 4 years ago by
Milestone: | → release 2.4.11 |
---|
comment:9 Changed 4 years ago by
Just realized that checking for remote will not fix p2p setups with --remote defined on both sides. For shared secret with --remote on both sides we have no real way for picking a side to disable the backoff logic. Shall we disable --backoff on both sides for shared secret p2p?
For p2p with TLS we can check options->tls_server.
comment:10 Changed 3 years ago by
Milestone: | release 2.4.11 → release 2.5.3 |
---|---|
Owner: | set to Gert Döring |
Status: | new → accepted |
commit 063d55afeea723fc6df0af29a19df257a8ab6920 (master)
commit d8dee82f1129ac6d3e4bcdc867726f5d64798dc7 (release/2.5)
commit 7029cece844d9324aff687981b8b6c33b099db2d (release/2.4)
Author: Selva Nair
Date: Wed Jun 2 15:47:39 2021 -0400
Apply the connect-retry backoff to only one side of a connection
we might not release another 2.4.x release - but if we do, it will be in 2.4.12.
Setting the milestone to "release 2.5.3" to document "it is in 2.5.3, to be released today" :-)
comment:11 Changed 3 years ago by
As a side note (an idea I only had after merging and pushing all this) - a possibly different approach could have been to do this in the option post-processing. That is, "if p2p, and --connect-retry
wasn't touched, change the default to 5 5
.
But that would have been even more magic code... so maybe this one is good enough - and if someone comes along and complains that p2p mode filled his disk due to "no backoff anymore", we can teach them the benefits of tls-client/tls-server
:-)
comment:12 Changed 3 years ago by
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
So, I think this can be closed? Code has been merged, corner-case fixed...
mmmh, looking at the commit
commit 5d429efd9720109b9c9f1265f5d351a75a401942
Author: Selva Nair <selva.nair@…>
Date: Tue Jul 5 11:32:50 2016 -0400
message, I see
so Arne noticed a variant of this already :-) - unfortunately, the combination of "
udp
,tls-server
, noremote
[= passive listening] andping-restart
" also triggers this.