Opened 6 years ago

Closed 4 years ago

Last modified 4 years ago

#1086 closed Bug / Defect (fixed-external)

Routing is broken: ip addr commands do not take effect

Reported by: kuklinistvan
Owned by:
Priority: critical
Milestone: release 2.4.5
Component: Generic / unclassified
Version: OpenVPN 2.4.5 (Community Ed)
Severity: Not set (select this one, unless you're an OpenVPN developer)
Keywords:
Cc: eworm, David Sommerseth, Antonio Quartulli

Description

Manually executing the ip addr commands found in the log after the connection has been established sometimes fixes the problems caused by the missing routing entries. Judging by the "missing Nexthop" errors, these commands do run, but for some reason they have no effect.

This bug is well described here, and affects multiple people:
https://forums.openvpn.net/viewtopic.php?t=25771
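
For anyone triaging a similar report, a quick check is whether the tun device still carries its address and routes once the connection is up. The following is a minimal shell sketch, assuming the interface is named tun0 (the ticket does not state the interface name). The helper names are hypothetical; both helpers read ip output on stdin, so they can also be tried on saved output.

```shell
#!/bin/sh
# Hypothetical helpers, not part of OpenVPN or iproute2.

# Succeeds if stdin (output of `ip -o -4 addr show dev IFACE`) lists an IPv4 address.
has_ipv4_addr() {
    grep -q ' inet '
}

# Prints the number of non-empty lines on stdin (output of `ip route show dev IFACE`).
count_routes() {
    awk 'NF { n++ } END { print n + 0 }'
}

# Possible use (tun0 is an assumption):
#   ip -o -4 addr show dev tun0 | has_ipv4_addr || echo "tun0 has no IPv4 address"
#   echo "routes via tun0: $(ip route show dev tun0 | count_routes)"
```

If the address or the routes are gone shortly after openvpn logged the ip commands, something other than openvpn has probably touched the interface.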

Change History (14)

comment:1 Changed 6 years ago by Gert Döring

Cc: eworm added

"multiple people" is a slight exaggeration, it seems to affect two - while it works nicely for everyone else on linux, including our automated test environment which runs on a number of different linux distributions and tests all these cases.

Are you using openvpn from your distro, or unmodified from source? Arch Linux tends to add "enhancements", which we cannot debug upstream.

comment:2 Changed 6 years ago by kuklinistvan

It appears here as well.
https://github.com/kylemanna/docker-openvpn/issues/370
And sorry for my English, I meant "more than one independently" by multiple.

I've built the package from AUR:
https://aur.archlinux.org/packages/openvpn-pkcs11/

As you see it contains no patches or "enhancements"; but to be sure I've just compiled the client from upstream source (https://swupdate.openvpn.org/community/releases/openvpn-2.4.6.tar.gz) and the same error happens. (It is the 2.4.6 source, not 2.4.5 as in the AUR, but the situation is the same.)

I've compiled it with ./configure, make, sudo make install

[kuklinistvan@pisti-ws openvpn-2.4.6]$ openvpn --version
OpenVPN 2.4.6 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Aug 19 2018

Even if it is a fault on my side it is strange that it just silently fails.

comment:3 Changed 6 years ago by Gert Döring

Cc: David Sommerseth Antonio Quartulli added

Thanks for the link over to the CoreOS issue. This is very mysterious indeed - these are all synchronous calls, read: when they return, the device has to be ready.

Of course we can add a sleep() call after tun device creation, but this would be a funny workaround.

In other words, I consider this an OS bug in "bleeding edge Linux" (Arch and this docker stuff). Maybe David knows more about this, and maybe it's already fixed in an even more recent version... @dazo: any ideas?

comment:4 Changed 6 years ago by tct

cc

comment:5 Changed 6 years ago by eworm

Just for the record... Arch Linux tends to *NOT* add "enhancements" but uses plain upstream source where possible. Neither openvpn nor iproute (which contains the ip command) has any enhancements.

I have not seen this myself...

comment:6 Changed 6 years ago by Gert Döring

@eworm: Apologies if I misunderstood. I thought you did apply patches that have been on the list but not merged yet, like the systemd/capability enhancement stuff?

(But anyway, this is outside openvpn, so either it is something funny in ip [which I find unlikely] or in udev/systemd/kernel land...)

comment:7 in reply to: 6 Changed 6 years ago by eworm

Replying to Gert Döring:

> @eworm: Apologies if I misunderstood. I thought you did apply patches that have been on the list but not merged yet, like the systemd/capability enhancement stuff?

I built test packages for myself, but these kinds of changes are not something to push to the official repositories. Even the netlink changes from master have to wait. ;)

The last series of patches I applied was for openssl, simply because we had a recent openssl 1.1.0 in our repositories before openvpn supported it.

> (But anyway, this is outside openvpn, so either it is something funny in ip [which I find unlikely] or in udev/systemd/kernel land...)

Sounds like kernel is involved here. But wondering why I did not see it myself...

The first report from forum does not give any detail what distribution or kernel is used, no?

comment:8 in reply to:  7 Changed 6 years ago by Gert Döring

Replying to eworm:

> Sounds like kernel is involved here. But wondering why I did not see it myself...

This is a good question indeed.

> The first report from forum does not give any detail what distribution or kernel is used, no?

No, what we have so far is not really much detail to work with.

One report "it happens on Arch", one report "it happened on CoreOS after I upgraded" (in the other forum post, which contained the workaround with "add a sleep in the 'ip' wrapper script"), one report without specifics.
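
The "sleep in the ip wrapper" workaround mentioned above could look roughly like the following. This is a sketch only: the forum post's actual script is not reproduced in this ticket, and the function name, the one-second delay in the example, and placing the delay before the command are all assumptions.

```shell
#!/bin/sh
# Hypothetical wrapper: wait DELAY seconds, then run the given command unchanged.
# The delay is meant to let whatever else reacts to the new tun device finish first.
delayed_run() {
    delay=$1
    shift
    sleep "$delay"
    "$@"
}

# Example use as a stand-in for ip (path, delay, and addresses are assumptions):
#   delayed_run 1 /usr/bin/ip addr add dev tun0 local 10.8.0.6 peer 10.8.0.5
```

The OpenVPN 2.4 manual describes an --iproute option for substituting the ip command, so a script built around delayed_run could be passed there; this ticket does not say how the original workaround was wired in. Either way it only hides the race and does not remove its cause.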

@kuklinistvan: which kernel version are you running ("uname -a")?

comment:9 Changed 6 years ago by kuklinistvan

This is interesting. I was unable to reproduce the bug in a fresh Arch virtual machine; however, the instance on my PC is also up to date and was installed this month.

I've just done another upgrade and tried using openvpn again; it still doesn't work and now my kernel version is

kuklinistvan@pisti-ws ~ % uname -a
Linux pisti-ws 4.14.65-1-lts #1 SMP Sat Aug 18 12:17:05 CEST 2018 x86_64 GNU/Linux

The virtual machine was also just updated and tested, and strangely it works.

arch@arch-test ~ % uname -a
Linux arch-test 4.14.65-1-lts #1 SMP Sat Aug 18 12:17:05 CEST 2018 x86_64 GNU/Linux


comment:10 Changed 6 years ago by kuklinistvan

Update:
I've recently had some trouble with systemd-networkd and I've replaced it with NetworkManager. It seems that the issue has resolved itself on my machine; routing works now.

comment:11 Changed 6 years ago by Gert Döring

So can I summarize this as "under some circumstances, systemd-networkd and openvpn do not work well together"?

@eworm: do you use systemd-networkd?

(I'm tempted to close, since this is really out of our sphere of influence - but it is good to have it documented here in case someone else runs into this.)

comment:12 Changed 6 years ago by eworm

I do, but with a proper configuration. ;)

Some sources propose configuring systemd-networkd so that it matches all interfaces (and starts a DHCP client or whatever). Its first step is then to flush the interface, which could explain why the reporter does not see the expected addresses and routes.
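
To illustrate the difference, here is a sketch of a narrower systemd-networkd match, written as a shell function that prints the file contents. The file name, the en* pattern, and the function name are assumptions and are not taken from the reporter's system; a catch-all configuration would instead have Name=* in its [Match] section.

```shell
#!/bin/sh
# Hypothetical helper: print a .network file that only matches wired interfaces
# named en*, so a tun device created by openvpn is not picked up by it.
print_wired_only_network() {
    cat <<'EOF'
[Match]
Name=en*

[Network]
DHCP=yes
EOF
}

# Possible use (path is an assumption; needs root):
#   print_wired_only_network > /etc/systemd/network/20-wired.network
```

With a match this narrow, systemd-networkd has no .network file that applies to tun0 and should leave its addresses and routes alone.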

comment:13 Changed 4 years ago by Gert Döring

Resolution: fixed-external
Status: new → closed

Yes, indeed.

It could be a race condition, with openvpn configuring the interface just fine, then systemd-networkd noticing the new interface, and "flushing it away".

There is nothing we can do here, except document that this is not how to set up your system.
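
One way to check for this race is to record ip monitor output while openvpn starts and look for removals on the tun device. The following is a sketch, assuming the interface is tun0 and that removals show up as lines containing "Deleted", as iproute2 prints them; the helper name and the log path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count "Deleted" events that mention interface $1
# in `ip monitor` output read from stdin.
count_deleted_events() {
    awk -v dev="$1" '
        /Deleted / {
            # Look for the interface name as a whole field on the line.
            for (i = 1; i <= NF; i++) {
                if ($i == dev) { n++; break }
            }
        }
        END { print n + 0 }
    '
}

# Possible use (log path is an assumption):
#   ip monitor all > /tmp/ip-monitor.log &    # start this before openvpn
#   count_deleted_events tun0 < /tmp/ip-monitor.log
```

A non-zero count right after openvpn has configured the interface would be consistent with another component flushing it.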

comment:14 in reply to:  2 Changed 4 years ago by tct

Replying to kuklinistvan:

> As you see it contains no patches or "enhancements"; but to be sure I've just compiled the client from upstream source (https://swupdate.openvpn.org/community/releases/openvpn-2.4.6.tar.gz) and the same error happens. (It is the 2.4.6 source, not 2.4.5 as in the AUR, but the situation is the same.)
>
> I've compiled it with ./configure, make, sudo make install

Just FTR, if you build OpenVPN for Arch then you almost certainly need to do this: ./configure --enable-systemd
