Opened 2 years ago

Last modified 9 months ago

#1061 new Bug / Defect

Client cannot reconnect because of pushed routes

Reported by: ar4chn0
Owned by:
Priority: major
Milestone:
Component: Networking
Version:
Severity: Not set (select this one, unless you're an OpenVPN developer)
Keywords:
Cc:

Description

Hey, I have checked the currently reported bugs and I am pretty sure this one wasn't among them. If I have missed it, I apologize; please close this ticket and point me to the appropriate issue.

The issue is described in the following steps which can also be used to recreate it:

  1. Connect to VPN server
  2. Unplug the Ethernet cable from the PC
  3. Wait until you get the Inactivity timeout error
  4. Plug the cable back in and wait for the client to reconnect (it cannot)
  5. Check routing table:
    0.0.0.0/1 via 10.7.7.1 dev tun0 
    default via 192.168.2.1 dev enp1s0 proto dhcp metric 20100 
    10.7.7.0/24 dev tun0 proto kernel scope link src 10.7.7.2 
    128.0.0.0/1 via 10.7.7.1 dev tun0 
    169.254.0.0/16 dev enp1s0 scope link metric 1000 
    192.168.2.0/24 dev enp1s0 proto kernel scope link src 192.168.2.51 metric 100 
    
  6. Remove these routes:
    0.0.0.0/1 via 10.7.7.1 dev tun0 
    10.7.7.0/24 dev tun0 proto kernel scope link src 10.7.7.2 
    128.0.0.0/1 via 10.7.7.1 dev tun0 
    
  7. Successfully reconnect

I guess you can already see where the issue is. The routes aren't removed, so the client tries to initiate the new connection via the tunnel, which no longer exists. The same happens with both TCP and UDP.
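To make the failure mode concrete, here is a minimal sketch (not OpenVPN code) that applies longest-prefix matching to the routing table from step 5. The `select_device` helper is hypothetical and ignores metrics; it only shows that, with no host route to the server present, the server address falls under 128.0.0.0/1 and is therefore sent into tun0.

```python
import ipaddress

# Routing table from step 5, reduced to (prefix, device) pairs.
ROUTES = [
    ("0.0.0.0/1", "tun0"),
    ("0.0.0.0/0", "enp1s0"),        # "default via 192.168.2.1"
    ("10.7.7.0/24", "tun0"),
    ("128.0.0.0/1", "tun0"),
    ("169.254.0.0/16", "enp1s0"),
    ("192.168.2.0/24", "enp1s0"),
]

def select_device(routes, destination):
    """Return the device of the most specific route containing destination.

    Hypothetical helper: metrics and route types are ignored.
    """
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, dev in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return best[1] if best else None

server = "185.245.86.157"

# With the stale tunnel routes still installed, the server is reached via tun0.
print(select_device(ROUTES, server))

# After step 6 (tun0 routes removed), the default route on enp1s0 is used.
without_tunnel = [r for r in ROUTES if r[1] != "tun0"]
print(select_device(without_tunnel, server))
```

The first call selects tun0 and the second selects enp1s0, which matches the observation that reconnecting only works once the three tunnel routes are gone.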

Log:

Thu May  3 14:21:25 2018 Initialization Sequence Completed
Thu May  3 14:26:58 2018 [185.245.86.157] Inactivity timeout (--ping-restart), restarting
Thu May  3 14:26:58 2018 SIGUSR1[soft,ping-restart] received, process restarting
Thu May  3 14:26:58 2018 Restart pause, 5 second(s)
Thu May  3 14:27:03 2018 TCP/UDP: Preserving recently used remote address: [AF_INET]185.245.86.157:443
Thu May  3 14:27:03 2018 Socket Buffers: R=[87380->425984] S=[16384->425984]
Thu May  3 14:27:03 2018 Attempting to establish TCP connection with [AF_INET]185.245.86.157:443 [nonblock]
Thu May  3 14:29:03 2018 TCP: connect to [AF_INET]185.245.86.157:443 failed: Connection timed out
Thu May  3 14:29:03 2018 SIGUSR1[connection failed(soft),init_instance] received, process restarting
Thu May  3 14:29:03 2018 Restart pause, 5 second(s)

Relevant client configs:

client
dev tun
proto tcp
resolv-retry infinite
remote-random
nobind
persist-key
persist-tun
reneg-sec 0
remote-cert-tls server
pull
fast-io

Client version: OpenVPN 2.4.4

Relevant server configs:

push "redirect-gateway def1"
topology subnet

Server version: OpenVPN 2.4.5


If you need further information from me, please let me know.

Change History (5)

comment:1 Changed 2 years ago by tincantech

CC -- investigating this I discovered something else but I would like to see how this is resolved first.

@ar4chn0 -- I could not replicate this problem myself; reconnecting worked for me, but I am still looking into it. What OS is your client? I presume Windows?

Edit: No, obviously the client is Linux: dev enp1s0

Last edited 2 years ago by tincantech

comment:2 Changed 2 years ago by Gert Döring

The problem is that the OS seems to remove the route to the VPN server that we have installed (to be able to reach the server without going through the tunnel). Not much we can do inside OpenVPN here (except "monitor routing information changes", which is very different on each platform and possibly not worth the hassle).

If you remove persist-tun from your config, this should be sufficient to fully tear down the tunnel on reconnect so routes get fixed again.

comment:3 Changed 2 years ago by ar4chn0

This was tested on Debian Stretch.

Removing persist-tun does fix the issue indeed. Thanks guys, I guess this issue can be closed.

comment:4 Changed 2 years ago by Antonio

I actually had a similar problem too.

In my case it happened when I was losing the uplink connection and the uplink interface was getting reconfigured (thus losing the route to the VPN server).

Maybe one way to fix this in a "generic manner" is to always check that there is a route to the VPN server passing through the main GW "if" redirect-gateway is specified?
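A rough sketch of what such a check could look like, written as a standalone script rather than as OpenVPN code. It assumes the routing table is available as `ip route`-style text lines and that the original gateway and uplink device are known (192.168.2.1 and enp1s0 in this ticket). The helper names `value_after` and `missing_host_route` are made up for illustration.

```python
import ipaddress

def value_after(fields, key):
    """Return the token following key (e.g. the address after 'via'), or None."""
    return fields[fields.index(key) + 1] if key in fields[:-1] else None

def missing_host_route(route_lines, server, gateway, dev):
    """Return the `ip route add` argv that would restore the host route to
    server via gateway/dev, or None if such a route is already present."""
    host = ipaddress.ip_address(server)
    for line in route_lines:
        fields = line.split()
        if not fields:
            continue
        try:
            net = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            continue  # "default" and other non-prefix keywords
        if (net.prefixlen == net.max_prefixlen and host in net
                and value_after(fields, "via") == gateway
                and value_after(fields, "dev") == dev):
            return None
    return ["ip", "route", "add", f"{host}/{host.max_prefixlen}",
            "via", gateway, "dev", dev]

# Excerpt of the table from the ticket: no host route to the server is left.
table = [
    "0.0.0.0/1 via 10.7.7.1 dev tun0",
    "default via 192.168.2.1 dev enp1s0 proto dhcp metric 20100",
    "128.0.0.0/1 via 10.7.7.1 dev tun0",
]
print(missing_host_route(table, "185.245.86.157", "192.168.2.1", "enp1s0"))
```

Running such a check before each reconnect attempt would catch the case where the uplink was reconfigured, at the cost of parsing platform-specific routing output.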

comment:5 Changed 9 months ago by Gert Döring

There are a few possible ways to tackle it:

  • listen to a netlink/route socket and be informed if our /32 or /128 host route goes away (due to interface flap, DHCPD or NM resetting all routes), and/or the default gateway changes, and re-install what is needed
  • do tricks with "ip rule" and fwmark to get "user traffic" into the VPN tunnel without having to change the "system routes" (= we do not care if the LAN interface changes, host routes go away, ...)
  • try binding to the interface (SO_BINDTODEVICE?) and make openvpn packets independent from the routing table. This has caveats with rp_filter eating reply packets (because if the route points to tun0, rp_filter=1 will drop such packets coming from eth0)
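
The "ip rule" and fwmark option can be sketched as follows. This is only an illustration of the policy-routing idea, similar in spirit to what some other VPN tools do on Linux, and not something OpenVPN 2.4 does by itself. It assumes the daemon's own packets carry a firewall mark (for example via the Linux `SO_MARK` socket option). The mark and table numbers and the `policy_routing_commands` helper are hypothetical.

```python
def policy_routing_commands(tun_dev, mark, table):
    """Build iproute2 commands that steer unmarked traffic into the tunnel.

    Hypothetical helper: it only returns the commands, it does not run them.
    """
    return [
        # A separate table whose only entry sends everything to the tunnel.
        ["ip", "route", "add", "default", "dev", tun_dev, "table", str(table)],
        # Packets WITHOUT the mark (user traffic) consult that table, while
        # the VPN daemon's marked packets skip it and use the main table.
        ["ip", "rule", "add", "not", "fwmark", str(mark), "table", str(table)],
        # Let the main table still win for routes more specific than the
        # default route (e.g. the local LAN).
        ["ip", "rule", "add", "table", "main", "suppress_prefixlength", "0"],
    ]

# Example values; 100 and 200 are arbitrary choices for this sketch.
for cmd in policy_routing_commands("tun0", mark=100, table=200):
    print(" ".join(cmd))
```

With this layout the main routing table is never modified, so an interface flap or DHCP renewal on the LAN side cannot remove anything the VPN depends on.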