Opened 8 years ago

Closed 4 years ago

#662 closed Patch submission (wontfix)

NetBSD: topology subnet tun setup causes quagga routing to ignore interface

Reported by: pruy Owned by: Gert Döring
Priority: major Milestone:
Component: Generic / unclassified Version: OpenVPN 2.3.6 (Community Ed)
Severity: Not set (select this one, unless you're an OpenVPN developer) Keywords:
Cc: plaisthos

Description

Configuring OpenVPN with subnet topology and dev tun
initializes the tun device as a point-to-point device
and issues an ifconfig that sets a specific endpoint address
(consistent with p2p).

Such a setup, however, causes routing software (quagga ripd in my case)
to detect a p2p interface and therefore not to forward any other address of the
configured subnet to the interface in use.

On NetBSD, the tun device setup can use IFF_BROADCAST in place of IFF_POINTOPOINT
to create a broadcast-capable tun device that allows configuring a true subnet.

Applying the attached patch changes the NetBSD-specific part of tun.c
to use IFF_BROADCAST and the matching ifconfig command line,
so that other programs (like ripd) detect the interface semantics correctly.

Attachments (3)

patch_openvpn_tun.c (1.1 KB) - added by pruy 8 years ago.
Patch to tun.c
patch_tun.c_master (1.1 KB) - added by pruy 8 years ago.
Patch of tun.c against master as of 2016-02-26
patch_openvpn_configure (4.6 KB) - added by pruy 8 years ago.
patch to configure files needed for NetBSD


Change History (18)

Changed 8 years ago by pruy

Attachment: patch_openvpn_tun.c added

Patch to tun.c

comment:1 Changed 8 years ago by Gert Döring

Owner: set to Gert Döring
Status: new → accepted

Please send patches against git master only - 2.3.6 is fairly old, and quite a bit has changed in the master tree in the meantime.

I'm fairly willing to change the NetBSD tun init code to use "real" multiaccess setting, but the reason it is the way it is right now is that IPv6 didn't work properly with IFF_BROADCAST when I implemented this - and if I have to choose, IPv6 will win over quagga.

So, the patch needs to be against git master, and be tested on the server and client side, with IPv4 and IPv6.

thanks in advance

comment:2 Changed 8 years ago by pruy

New patch against master.

I have not yet been able to test this,
as autoconf/automake cause some errors.
(But likely this is my fault. I'm working on this.)
However, I will not have sufficient time to perform tests early next week.
Nevertheless, with the updated patch you might be able to test on your side
and identify potential problems. Especially as I do not have a well-configured
IPv6 environment, I might miss some IPv6 aspects.

As of now I'm using the patch with 2.3.6, as otherwise openvpn is not working at all here
(using subnet).

Changed 8 years ago by pruy

Attachment: patch_tun.c_master added

Patch of tun.c against master as of 2016-02-26

comment:3 Changed 8 years ago by pruy

To get the sources from master compiled I had to apply some patches to the configure parts.
This is due to the fact that PKTINFO is being enabled but the relevant structure does not support
the "ipi_spec_dst" member.

For this patch I used the patch that is applied with pkgsrc install of openvpn,
adjusted to master.

I'm not sure this is the best way of handling it, and I likely will not be able to verify
that this patch behaves well on other platforms.

But, at least I put it in for reference.

Changed 8 years ago by pruy

Attachment: patch_openvpn_configure added

patch to configure files needed for NetBSD

comment:4 in reply to:  3 Changed 8 years ago by Gert Döring

Replying to pruy:

To get the sources from master compiled I had to apply some patches to the configure parts.
This is due to the fact that PKTINFO is being enabled but the relevant structure does not support
the "ipi_spec_dst" member.

This sounds oh so slightly annoying - especially since I do run a NetBSD buildslave, and master compiles without any issues there - but that is a somewhat older NetBSD version (5.1)...

It would be *sooo* nice if those folks that do pkgsrc patches would actually send a HEADS UP back to the original maintainers "hey, some APIs have changed, patches are needed". *grumble*.

comment:5 Changed 8 years ago by pruy

Couldn't agree more....

comment:6 Changed 8 years ago by pruy

BTW,
I had a first test with my android phone running OpenVPN-2.4-icsopenvpn.
This had worked using 2.3.6 server without any problems (having the subnet patches applied)

Now phone and server start handshake and both die from timeout during TLS handshake.
The server is sending retries that obviously are not being received by the client.
Unfortunately, I have no development and debugging environment for Android phones,
so I can't tell whether the packets from the server are lost during delivery or after having been seen by the phone.
At least they do not reach OpenVPN on the phone.

Any idea on this? Anything related that changed from 2.3.6 to master that could affect the TLS handshake? I'd assume the changes I'm trying to test are certainly not yet involved at that phase....

Will go on testing NetBSD 6.1 server and client interaction soon.....

comment:7 in reply to:  6 Changed 8 years ago by Gert Döring

Cc: plaisthos added

Hi,

Replying to pruy:

Any idea on this? Anything related that changed from 2.3.6 to master that could affect the TLS handshake? I'd assume the changes I'm trying to test are certainly not yet involved at that phase....

Quite a bit has changed :-) - the most likely culprit is the increase of the control packet MTU to speed up handshaking, and the introduction of TLS 1.2. Both *could* be the issue here - or it could be something totally different.

To force TLS 1.1 set "--tls-version-max 1.1" in the config on either side.

(Copying in @plaisthos, he might have more ideas how to pinpoint this)

comment:8 Changed 8 years ago by pruy

Yep,
reducing link-mtu made the link work again.
But looks like this setting has to be "global"
and is not a property of an individual association?

comment:9 Changed 8 years ago by pruy

False positive!

MTU seems not to be the primary issue, as a reduced MTU only occasionally yields a working setup.
Looks like the key influence is something different.
(Successful attempts use TLS 1.2, so this is ruled out as well.)

comment:10 Changed 8 years ago by pruy

Probably this is not related to OpenVPN at all.
Now the problem also exists with 2.3.6, where it had worked for several days before.

comment:11 Changed 8 years ago by pruy

Setting up a tunnel with my android mobile as a client works as expected.

Using an OpenVPN 2.3.6 client and the patched code on the server side also works.
I had to fiddle on the client to make routes work within the nets linked via the tunnel.
But this is all from code that I did not touch.

[Maybe I built up a completely wrong picture of what subnet mode should be doing?
Could you give a short sketch of what is supposed to happen on both sides?
(Just to get my 30 years of doing IPv4 lifted to modern times (;-))]

Will now go for testing using patched code also on client side.


comment:12 in reply to:  11 Changed 8 years ago by Gert Döring

Hi,

Replying to pruy:

Setting up a tunnel with my android mobile as a client works as expected.

Using an OpenVPN 2.3.6 client (NetBSD 6.1) and the patched code on the server side (NetBSD 6.1) also works.
I had to fiddle on the client to make routes work within the nets linked via the tunnel.
But this is all from code that I did not touch.

[Maybe I built up a completely wrong picture of what subnet mode should be doing?
Could you give a short sketch of what is supposed to happen on both sides?
(Just to get my 30 years of doing IPv4 lifted to modern times (;-))]

I'm starting to suspect that you're not using point-to-multipoint OpenVPN, but point-to-point (and "server" is just "tls-server", not a true --server in the OpenVPN sense).

Explaining all the details in the ticket might be a bit tricky, and there are way too many options, but I'll give it a try.

First variant: two machines (exactly two) talking to each other. This is called "point-to-point" mode in OpenVPN lingo - if you do static-keyed UDP, it indeed is, because both sides will just throw UDP packets around, with no handshake whatsoever. If you do TCP, or TLS, you still end up having one "server" and one "client" as far as the network setup and TLS handshake goes - but regarding OpenVPN protocol, both sides are "peers".

In p2p mode, there is no routing inside OpenVPN - what goes into the tun is just encrypted and sent to the other end. What comes in via network is decrypted and sent via tun (or tap) towards the kernel. So in that mode, you can control routing completely outside OpenVPN - and, I think, you should not ever use "--topology subnet" because it does not make sense (and is likely to break stuff).

The other mode OpenVPN can operate in is the "multi-client" or "point-to-multipoint" mode. Here you have one server side inside the OpenVPN protocol, and dedicated clients (server has --server, client has --client in its config). One of the major differences is that in this mode, the server will send (PUSH) config data towards the client, IP addressing for the tun interface among it. Further, the openvpn server process will maintain an internal routing table to send packets towards the "right" client - so the flow of packets is

kernel --> server tun --> server process consults IP-to-client table (openvpn routing table, controlled by "iroute" and "ifconfig-pool" stuff) --> send packet towards the corresponding client

on the client side, still no brains "what comes in goes out on the other end".

If two clients want to talk to each other, client A sends to server, server consults iroute table, sends to client B (or to server side kernel tun).
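
The lookup step in that flow can be pictured with a small sketch. This is only an illustration of a longest-prefix match over an IP-to-client table, not OpenVPN's actual code; the class name, client names and addresses are all made up for the example.

```python
import ipaddress


class ClientTable:
    """Hypothetical stand-in for the server's internal IP-to-client table."""

    def __init__(self):
        self._routes = []  # list of (network, client_name)

    def add(self, cidr, client):
        self._routes.append((ipaddress.ip_network(cidr), client))

    def lookup(self, dst):
        """Return the client owning dst (most specific entry wins), or None."""
        addr = ipaddress.ip_address(dst)
        matches = [(net, client) for net, client in self._routes if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0].prefixlen)[1]


table = ClientTable()
table.add("10.8.0.2/32", "client-a")      # address handed out from the pool
table.add("10.8.0.3/32", "client-b")
table.add("192.168.50.0/24", "client-b")  # network behind client B (iroute-style entry)

print(table.lookup("10.8.0.2"))      # client-a
print(table.lookup("192.168.50.7"))  # client-b
print(table.lookup("172.16.1.1"))    # None: no entry, so no client to send to
```

The last lookup is the interesting case: a destination the server-side kernel may happily route into the tun still has no owner in this table.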

Now, what does --topology do? Clients are assigned addresses from the server, and clients also encompass *windows*, which does not have a proper tun interface that gets told "our ip / their ip", but always needs a network to operate on ("looks like an ethernet interface to the kernel").

Server IP addressing for clients always starts with a larger subnet which is then chopped up and handed to clients - and on the server side, it's routed "in full" (like, a /24) into the server tun.

So, there was "topology net30", which took a /30 for each client out of the server side subnet and gave it to the client - works, but a bit wasteful, and people got confused because the /30 is really just "one ip for the client", the other IP in it never pinged (because the server didn't configure it on the server side).
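
To see why net30 is "a bit wasteful", here is a small sketch using Python's ipaddress module. The 10.8.0.0/24 pool is an assumed example, and the sketch only shows the arithmetic of cutting a /24 into /30 blocks; it does not claim which block OpenVPN hands to which client.

```python
import ipaddress

# Assumed example pool; any /24 behaves the same way.
pool = ipaddress.ip_network("10.8.0.0/24")

# Cut the pool into /30 blocks, one block per client in net30 style.
blocks = list(pool.subnets(new_prefix=30))
print(len(blocks))  # 64 blocks out of 256 addresses

# Each /30 has four addresses, of which only two are usable host addresses.
block = blocks[1]
print(block)                             # 10.8.0.4/30
print([str(h) for h in block.hosts()])   # ['10.8.0.5', '10.8.0.6']
```

So a /24 serves at most 64 blocks this way, and within each block only one of the two usable addresses is really "the client".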

Then, there was "topology p2p", which just gave each client an individual IPv4 address, and told them the address of the server to configure as the remote end of the tun interface (remember: p2p tun interfaces on unix do not have a network on them, but a "my ip, their ip" setting). Works great on all unixes, totally fails on Windows.

And then, there came "topology subnet", which just pretends to put the server-side tun networks "flat" onto all client tun interfaces - that is, every client gets an IP out of the (example) /24, and does "ifconfig $ip 255.255.255.0" or "ifconfig $ip" + "route /24 onto tun interface". On the client side, the net effect is "every IP address out of the /24 is sent via the tun towards the server" (remember: the client does not look at the packets - so if the client side routing is set up to route the /24 into the tun, however it is done on each platform(!), packet will reach the server) - and as far as tun config goes, the server side process does the same: point the /24 into the tun, and then handle individual addresses per-client.

"topology subnet" on a tun interface is basically a hack - you configure two addresses out of the /24 for "us" and "them", and route the /24 onto the tun interface, or if the OS does not do routing-to-interface, to the "them" address. On some OSes, the tun if can be configured with a /24 netmask, and the kernel will set up the proper route (but that is vastly different, as you can see in tun.c).

This hack works perfectly fine for software that does not look at interfaces, but just sends packets toward destinations - IP routing will stuff it into the tun, server will un-stuff it, VPN works. But if you bring in routing software, it will not know that there is a /24 involved, but see the "point-to-point" nature of the tun and follow its own assumptions on what to do with it (not that you couldn't do routing on a p2p interface...).

It gets a bit more complicated :-) - theoretically, it should not matter if one side is using the p2p+/24 route hack and the other side has a /24 ifconfig'ed onto the interface, and the interface properly run in IFF_SUBNET mode. Unfortunately, it does, if you enable IPv6 - at least on FreeBSD. p2p mode will never do neighbour discovery, just send packets in the same (IPv6) /64 subnet towards the other end - and that gets things done perfectly fine. FreeBSD's IFF_SUBNET mode will turn on neighbour discovery, so when trying to reach a host on the tun subnet (like, the server) FreeBSD will send out IPv6 neighbour discovery packets on fe80:: addresses, which other platforms do not reply to on tun interfaces, and which we do not support in the OpenVPN server "iroute" code (since in p2mp server mode the server doesn't hand the packets to the kernel side but sorts out locally connected client itself, it would need to learn ND).

I assume you're thoroughly confused now - welcome to the club :-) - I inherited this heap of historic workarounds a few years ago, and making it work cross-platform is quite a bit of a challenge.

But the conclusion from all the above is:

  • if you want quagga to work, you need to use OpenVPN point-to-point mode, because otherwise the "openvpn server" will not learn the routes quagga is negotiating (quagga will tell the kernel on the server side, which will dutifully send the packets into the tun interface, but the internal routing table of the server - iroute - will not know, and send the packets back)
  • quagga *should* be able to work in --topology p2p mode or --topology net30, because that's really "normal tun setup" with one IP left and one IP right
  • topology subnet is likely to upset quagga, as the interface is in p2p mode, but when talking to it (on a client) there are more machines inside the (example) /24 than just two

If you need quagga to work in real server mode (p2mp), you need to use --dev tap, because then OpenVPN will only concern itself with mac addresses, and never look at IP headers - so quagga can route *through* the openvpn server iroute table (which is really a mac table as in a switch, not a routing table, then).
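
The "mac table as in a switch" idea can be sketched as a generic learning table. This is the textbook learning-switch behaviour under made-up names and MAC addresses, not a description of OpenVPN's internals.

```python
class MacTable:
    """Hypothetical learning table: source MAC -> client that sent it."""

    def __init__(self):
        self._by_mac = {}

    def learn(self, src_mac, client):
        # Remember which client a frame with this source MAC came from.
        self._by_mac[src_mac.lower()] = client

    def targets(self, dst_mac, clients, sender):
        known = self._by_mac.get(dst_mac.lower())
        if known is not None:
            return [known]
        # Unknown or broadcast destination: flood to everyone except the sender.
        return [c for c in clients if c != sender]


clients = ["client-a", "client-b", "client-c"]
table = MacTable()
table.learn("02:00:00:00:00:0A", "client-a")

print(table.targets("02:00:00:00:00:0a", clients, "client-b"))  # ['client-a']
print(table.targets("ff:ff:ff:ff:ff:ff", clients, "client-b"))  # ['client-a', 'client-c']
```

Nothing in this table depends on IP addresses, which is why routes negotiated by a routing daemon do not need to be known to it.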

If you're still with me :-) - maybe explain what you're trying to achieve, and whether I totally misunderstood you, and gave funny advice.

comment:13 Changed 8 years ago by Eric Crist

/me provides a gold star to cron02 - good work buddy!


comment:14 Changed 8 years ago by pruy

Thanks cron02 for the explanation.
It is still consistent with my assumptions.
And yes, I'm using true server mode and proper ip pool for the overall network.

I had dealt with some cross platform topics (when the UNIX world was not mainly a set of variants of Linuxes and posix was introduced to mainframes) but kept my fingers away from windows.

Looks like NetBSD >= 6 is a beast that cares about the interface configuration.
(A p2p interface will not receive packets that do not match the endpoint address.
The kernel is smart enough to "know" the other addresses are not reachable via a p2p interface. I have not yet checked the sources, but this seems to happen within the interface code as the routing level tries to send packets to the interface.)
So doing the setup properly is essential.

This is why I came to use IFF_BROADCAST. For a p2mp setup it gives the local kernel the correct notion of the interface.
However, I noticed that the client side uses identical addresses for local and remote, which also tells the kernel that there is no useful remote side.
So I had to change the ifconfig manually, using the proper /32 addresses, to make the connection work again.
I personally feel a bit scared about the client side getting a route to the /24 by default.
It would likely complicate a multi-path network. (Networks A, B, C. A is linked to B and C using addresses from the /24; B links to C using some other net. Then routing will force traffic from C to B's /24 client address via A and not allow using the C-to-B link directly.)

With IPv6 the basic problem seems to be that no link-local address is being assigned (no IFF_SUBNET mode, at least with NetBSD 6; maybe NetBSD 7 has adopted it). I had rudimentary ping connectivity if I added link-local addresses manually. But I have not yet had a closer look at the IPv6 setup.

So yes, for sure this all will extend the zoo of exceptions and workarounds a far bit....

And finally.. I'm still not confused. It's all within expectations. Some subtleties that I would have missed....

About what I intend to do:

I intend to link a set of networks where one will act as the central hub, but where some will have shortcut links besides the central one. Each of those networks will support several networks. This is why a working routing protocol is *very* much appreciated.
Currently I'm using vtund with NetBSD and it is working OK (pure p2p links). However, now some Linux-gatewayed networks are to be integrated, and vtund is not working well there. Some simple clients (e.g. mobiles, notebooks) are also waiting to be integrated.

Due to the nature of some networks, bridging is not a valid option. Doing layer 3 links is the way to go.

comment:15 Changed 4 years ago by Gert Döring

Resolution: wontfix
Status: accepted → closed

So. Coming back to this after a long while, and after quite a bit has been rewritten in the interface handling for all platforms for 2.4 and for the upcoming 2.5

What is the state of affairs here? Is NetBSD still problematic with latest git master?

At least the "it should build without errors" is definitely taken care of. "Topology subnet" tunnels still look a bit wonky (POINTOPOINT instead of BROADCAST).

tun1: flags=0x8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1500
        inet 10.194.3.35/24 -> 10.194.3.1 flags 0x0
        inet6 fe80::250:56ff:fe9c:fbd6%tun1/64 ->  flags 0x0 scopeid 0x4
        inet6 fd00:abcd:194:3::1021/64 ->  flags 0x0

... OTOH, as said before, with an openvpn point-to-multipoint server on the other end, Quagga will not work properly anyway - quagga will exchange routes, but the openvpn server process will not learn them (internal routing table) - so this is somewhat of a "even if we fix the tunnel interface config, it will still not be good enough to run quagga over it" case.
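
The flags word in the ifconfig output above (0x8051) can be decoded by hand. The sketch below uses the bit values from BSD's <net/if.h> for the flags that matter here; the `looks_point_to_point` helper is a hypothetical stand-in for how interface-aware software might classify the interface, not code taken from quagga.

```python
# Interface flag bits as defined in BSD <net/if.h>; only the relevant ones are listed.
IFF_FLAGS = {
    0x0001: "UP",
    0x0002: "BROADCAST",
    0x0010: "POINTOPOINT",
    0x0040: "RUNNING",
    0x8000: "MULTICAST",
}


def decode_flags(value):
    """Return the names of the known flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


def looks_point_to_point(value):
    """Hypothetical classification: POINTOPOINT set and BROADCAST clear."""
    return bool(value & 0x0010) and not (value & 0x0002)


print(decode_flags(0x8051))          # ['UP', 'POINTOPOINT', 'RUNNING', 'MULTICAST']
print(looks_point_to_point(0x8051))  # True
```

With IFF_BROADCAST set instead of IFF_POINTOPOINT, the same /24 address would be presented as a multi-access interface.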

--dev tap will work, as it's a transparent ethernet switch, then.

Yes, "routing is the way to go", but --dev tap does not mean "you have to go bridging". It just means "the openvpn point-to-multipoint engine on the server, and the on-wire framing, deal with ethernet packets instead of IP packets" - so, all the more advanced stuff (multicast, setting up routing outside OpenVPN) will work right away.

All this said, I think I'm closing this ticket now. We will revisit the whole "learning of routes" thing for 2.6, so maybe a quagga-compatible tun p2mp server will fall out of this, but it's not something actively worked on.
