Opened 6 years ago

Closed 2 years ago

#1010 closed Bug / Defect (fixed)

p2p, tls-client/tls-server, connect-retry not playing nicely

Reported by: Gert Döring Owned by: Gert Döring
Priority: minor Milestone: release 2.5.3
Component: Generic / unclassified Version: OpenVPN 2.4.4 (Community Ed)
Severity: Not set (select this one, unless your'e a OpenVPN developer) Keywords:
Cc: Selva Nair, tct

Description

we've sabotaged p2p with tls-client/tls-server in interesting ways in 2.4

--connect-retry has an exponentially growing delay nowadys, leading to up-to-300s "dead time" on the tls-server(!) side -- so when the network gets disrupted for a longer time, and ping-restart is in use, it can happen that the tls-server is just "not listening" to incoming client packets when the client tries, and when the server is ready to listen, the client is in connect-retry sleep...

workaround: --connect-retry 1 1 on the --tls-server side

Better fix: default to "no increase in delay" (as a default) on --tls-server

Change History (12)

comment:1 Changed 6 years ago by Gert Döring

mmmh, looking at the commit

commit 5d429efd9720109b9c9f1265f5d351a75a401942
Author: Selva Nair <selva.nair@…>
Date: Tue Jul 5 11:32:50 2016 -0400

Exponentially back off on repeated connect retries

message, I see

  • Apply backoff only in the udp and tcp-client modes. Backing off on tcp-server could be exploited by a client in p2p-mode to maliciously slow it down (thanks to Arne Schwabe for pointing this out.

so Arne noticed a variant of this already :-) - unfortunately, the combination of "udp, tls-server, no remote [= passive listening] and ping-restart" also triggers this.

comment:2 Changed 6 years ago by Selva Nair

So the offending case is (topology == TOP_P2P && !options.ce.remote).

Shall we make the backoff conditional on (options.ce.remote != NULL) as that will be false for the listening side (server or p2p) ?

comment:3 Changed 6 years ago by Barbarossa

Hi,

As requested by Gert here an excerpt from our main OpenVPN logs showing the timing problems :-)

Jan  4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jan  4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: Preserving previous TUN/TAP instance: ovpn-bbr-wagsh
Jan  4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: Could not determine IPv4/IPv6 protocol. Using AF_INET
Jan  4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: Using bind-dev vrf_external
Jan  4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: UDPv4 link local (bound): [AF_INET]a.b.c.d:54010
Jan  4 06:37:42 cr03 ovpn-cr03_bbr-wagsh[499]: UDPv4 link remote: [AF_UNSPEC]
Jan  4 06:38:12 cr03 ovpn-cr03_bbr-wagsh[499]: [UNDEF] Inactivity timeout (--ping-restart), restarting
Jan  4 06:38:12 cr03 ovpn-cr03_bbr-wagsh[499]: SIGUSR1[soft,ping-restart] received, process restarting
Jan  4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: WARNING: No server certificate verification method has been enabled.  See http://openvpn.net/howto.html#mitm for more info.
Jan  4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jan  4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: Preserving previous TUN/TAP instance: ovpn-cr03
Jan  4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: TCP/UDP: Preserving recently used remote address: [AF_INET]a.b.c.d:54010
Jan  4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: UDP link local: (not bound)
Jan  4 06:40:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: UDP link remote: [AF_INET]a.b.c.d:54010
Jan  4 06:40:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: [UNDEF] Inactivity timeout (--ping-restart), restarting
Jan  4 06:40:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: SIGUSR1[soft,ping-restart] received, process restarting
Jan  4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jan  4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: Preserving previous TUN/TAP instance: ovpn-bbr-wagsh
Jan  4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: Could not determine IPv4/IPv6 protocol. Using AF_INET
Jan  4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: Using bind-dev vrf_external
Jan  4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: UDPv4 link local (bound): [AF_INET]a.b.c.d:54010
Jan  4 06:43:12 cr03 ovpn-cr03_bbr-wagsh[499]: UDPv4 link remote: [AF_UNSPEC]
Jan  4 06:43:42 cr03 ovpn-cr03_bbr-wagsh[499]: [UNDEF] Inactivity timeout (--ping-restart), restarting
Jan  4 06:43:42 cr03 ovpn-cr03_bbr-wagsh[499]: SIGUSR1[soft,ping-restart] received, process restarting
Jan  4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: WARNING: No server certificate verification method has been enabled.  See http://openvpn.net/howto.html#mitm for more info.
Jan  4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Jan  4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: Preserving previous TUN/TAP instance: ovpn-cr03
Jan  4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: TCP/UDP: Preserving recently used remote address: [AF_INET]a.b.c.d:54010
Jan  4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: UDP link local: (not bound)
Jan  4 06:45:42 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: UDP link remote: [AF_INET]185.46.137.162:54010
Jan  4 06:46:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: [UNDEF] Inactivity timeout (--ping-restart), restarting
Jan  4 06:46:12 bbr-wagsh ovpn-cr03_bbr-wagsh[15381]: SIGUSR1[soft,ping-restart] received, process restarting

Server configuration:

local   a.b.c.d
port    54010

tls-server
bind-dev	vrf_external

proto   udp

dev-type tap
dev ovpn-bbr-wagsh

ca      /etc/ssl/certs/ffho-cacert.pem
cert    /etc/ssl/certs/cr03....cert.pem
key     /etc/ssl/private/cr03....key.pem
dh      /etc/openvpn/dh1024.pem

script-security 2
up      /etc/openvpn/ifup
down    /etc/openvpn/ifdown

keepalive 10 30

comp-lzo

persist-key
persist-tun

status /var/log/openvpn/openvpn-status-cr03_bbr-wagsh.log

verb 1

Client configuration:

remote	a.b.c.d 54010

tls-client
nobind
bind-dev	vrf_external

proto   udp

dev-type tap
dev ovpn-cr03

ca      /etc/ssl/certs/ffho-cacert.pem
cert    /etc/ssl/certs/bbr-wagsh....cert.pem
key     /etc/ssl/private/bbr-wagsh....key.pem
dh      /etc/openvpn/dh1024.pem

script-security 2
up      /etc/openvpn/ifup
down    /etc/openvpn/ifdown

keepalive 10 30

comp-lzo

persist-key
persist-tun

status /var/log/openvpn/openvpn-status-cr03_bbr-wagsh.log

verb 1

The bad thing is, that it's hard to trigger this problem. It seems that it takes a while to reach this state. :-)

Best
Max

comment:4 Changed 5 years ago by marthasimons

No diff for you, spammer!

Version 1, edited 5 years ago by marthasimons (previous) (next) (diff)

comment:5 Changed 3 years ago by nils.toedtmann

This problem is still present in case of udp, secret, no remote [= passive listening] and ping-restart (or keepalive). See #1384.

comment:6 Changed 3 years ago by tct

Cc: tct added

comment:7 in reply to:  2 Changed 3 years ago by Gert Döring

Hi,

dunno where this fell between the cracks of time and space...

Replying to Selva Nair:

So the offending case is (topology == TOP_P2P && !options.ce.remote).

Shall we make the backoff conditional on (options.ce.remote != NULL) as that will be false for the listening side (server or p2p) ?

... but this sounds like a reasonable approach.

Selva, since you already investigated the code ("3 years ago *cough*") would you be willing to send a patch? I'll see that I can set up a testbed to reproduce, and then fix in 2.4, 2.5, master...

comment:8 Changed 3 years ago by Gert Döring

Milestone: release 2.4.11

comment:9 Changed 3 years ago by Selva Nair

Just realized that checking for remote will not fix p2p setups with --remote defined on both sides. For shared secret with --remote on both sides we have no real way for picking a side to disable the backoff logic. Shall we disable --backoff on both sides for shared secret p2p?

For p2p with TLS we can check options->tls_server.

comment:10 Changed 3 years ago by Gert Döring

Milestone: release 2.4.11release 2.5.3
Owner: set to Gert Döring
Status: newaccepted

commit 063d55afeea723fc6df0af29a19df257a8ab6920 (master)
commit d8dee82f1129ac6d3e4bcdc867726f5d64798dc7 (release/2.5)
commit 7029cece844d9324aff687981b8b6c33b099db2d (release/2.4)
Author: Selva Nair
Date: Wed Jun 2 15:47:39 2021 -0400

Apply the connect-retry backoff to only one side of a connection

we might not release another 2.4.x release - but if we do, it will be in 2.4.12.

Setting the milestone to "release 2.5.3" to document "it is in 2.5.3, to be released today" :-)

Last edited 3 years ago by Gert Döring (previous) (diff)

comment:11 Changed 3 years ago by Gert Döring

As a side note (an idea I only had after merging and pushing all this) - a possibly different approach could have been to do this in the option post-processing. That is, "if p2p, and --connect-retry wasn't touched, change the default to 5 5 .

But that would have been even more magic code... so maybe this one is good enough - and if someone comes along and complains that p2p mode filled his disk due to "no backoff anymore", we can teach them the benefits of tls-client/tls-server :-)

comment:12 Changed 2 years ago by Gert Döring

Resolution: fixed
Status: acceptedclosed

So, I think this can be closed? Code has been merged, corner-case fixed...

Note: See TracTickets for help on using tickets.