Opened 3 years ago

Closed 14 months ago

#562 closed Bug / Defect (fixed)

FreeBSD 9.3-RELEASE-p16 - OpenVPN 2.3.7 :: Only the first Dialin-IP-Traffic forwarded

Reported by: mg16373 Owned by: cron2
Priority: major Milestone: release 2.3.14
Component: Networking Version: OpenVPN 2.3.7 (Community Ed)
Severity: Not set (select this one, unless you're an OpenVPN developer) Keywords:
Cc: mandree

Description

Since version 2.3.7 was installed via the FreeBSD port build (see below), the TUN interface setup differs from 2.3.6, and only the first IP address in the pool is forwarded to the internet. All other IP addresses are unreachable from external networks. After I copied the old 2.3.6 binary back (without any changes to the configuration), forwarding seems OK again. What's the issue?

NOTICE: I can't select version 2.3.7 on field "Version" in this form.

Network: xx.xx.32.240/28 (Non-RFC1918)
Clients: Ubuntu 14, iPhone 5-Client

[2.3.6-Output (TUN-I/F)]
tun1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet6 fe80::xxxxxxf%tun1 prefixlen 64 scopeid 0x12
        inet xx.xx.x2.241 --> xx.xx.x2.241 netmask 0xfffffff0
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 1145

[netstat -rnfinet]
xx.xx.x2.240/28 xx.xx.x2.241 UGS 0 5048 tun1
xx.xx.x2.241 link#18 UH 0 0 tun1

---

[2.3.7-Output (TUN-I/F)]
tun1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet6 fe80::xxxxxxf%tun1 prefixlen 64 scopeid 0x12
        inet xx.xx.x2.241 --> xx.xx.x2.242 netmask 0xfffffff0
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 1146

[netstat -rnfinet]
xx.xx.x2.240/28 xx.xx.x2.241 UGS 0 5048 tun1
xx.xx.x2.241 link#18 UH 0 0 lo0

---

[Port-Build]
./configure --enable-pkcs11 --enable-password-save --enable-x509-alt-username --with-crypto-library=openssl --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --infodir=/usr/local/info/ --build=amd64-portbld-freebsd9.3

[Configuration]
daemon
dev tun1
proto udp
port 500
bind xx.xx.1.38
local xx.xx.1.38
topology subnet
float
tun-mtu 1500
mssfix
mute-replay-warnings
management localhost 7500

# certs
..

# TLS
tls-auth /usr/local/etc/openvpn500/server.key 0
verify-x509-name xxxxxxxxxxxxx name
cipher AES-256-CBC
tls-version-min 1.0

comp-lzo yes
keepalive 30 600

status /var/log/openvpn500-status.log 1
log-append /var/log/openvpn500.log
user root
group daemon

persist-key
persist-tun
duplicate-cn

tls-server
server xx.xx.x2.240 255.255.255.240

push "redirect-gateway"
push "dhcp-option DNS xx.xx.x2.196"
push "dhcp-option DNS xx.1xx.xx.196"
push "dhcp-option DNS xx.xx.86.xx"

plugin /usr/local/lib/radiusplugin.so

client-to-client
tmp-dir /etc/openvpn
client-config-dir /usr/local/etc/openvpn500/ccd

username-as-common-name
verb 3
script-security 2

Change History (24)

comment:1 Changed 3 years ago by cron2

What do you mean by "only the first IP address in this pool was forwarded to the internet" and "unreachable"? Please explain a bit more precisely, like, showing traceroute output from the internet to the first and second client (the routing tables I initially asked for are there, and are exactly as they are supposed to be - the /28 points to tun1 in both cases, and the local server address points to "lo0" in the 2.3.7 case which makes it actually a working IP for the server itself)

We run git master, which has the same change as 2.3.6->2.3.7 regarding tun and --topology subnet on our corporate VPN servers, on FreeBSD 9.3 and 10.1, and "it works perfectly well"...

(See also trac#481 for a description of the change)

comment:2 Changed 3 years ago by cron2

  • Cc mandree added
  • Owner set to cron2
  • Status changed from new to accepted

corpvpn2$ ifconfig tun0
tun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet6 fe80::221:5aff:fed4:c1ae%tun0 prefixlen 64 scopeid 0xc
        inet x.x.x.193 --> x.x.x.194 netmask 0xffffffe0
        inet6 2001:608:y:xxx::1 prefixlen 64
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 55747

and tracerouting from external to a client "pretty far down" in the subnet:

traceroute to x.x.x.212, 64 hops max, 40 byte packets
 1 195.30.0.21 0.453 ms 0.365 ms 0.322 ms
 2 195.30.3.126 0.431 ms 0.333 ms 0.336 ms
 3 185.54.120.9 0.706 ms 0.586 ms 0.574 ms
 4 y.y.y.238 0.958 ms 0.605 ms 0.600 ms <- last cisco before openvpn server
 5 x.x.x.253 0.631 ms 0.635 ms 0.643 ms <- the openvpn server itself
 6 x.x.x.212 25.878 ms 24.649 ms 29.952 ms <- the client

comment:3 follow-up: Changed 3 years ago by cron2

It might actually be something completely different, if there are other differences between your "first" and "second" client - like, TLS version negotiation, which got turned on again in 2.3.7. If the "second" client is connected first, will it work then? Anything interesting in the server or client logs for the non-working client?

comment:4 Changed 3 years ago by mg16373

I have checked another VPN server based on FreeBSD 10.1-RELEASE patch level 12 with OpenVPN 2.3.7 built from the ports tree. All tunnel interfaces seem to be initialized incorrectly. After a downgrade to a 2.3.6 binary compiled on FreeBSD 10.1, all tunnels work fine.

By "first IP" I mean the first available IP address for a VPN client in the assigned network. In version 2.3.6 the configuration statement "server 192.168.10.0 255.255.255.0" creates a TUN interface that "ifconfig" shows as "192.168.10.1 -> 192.168.10.1". In my RADIUS setup client "A" gets IP 192.168.10.2, client "B" gets IP 192.168.10.3 [...]

When client "A" is connected, all its IP traffic works (inside the network and to the world via NAT). Client "B" and all others are unable to send packets to the internet. In my reported setup (see above) none of the IP addresses are RFC1918 addresses.

Only IP "1" and "2" (= first, second IP) are reachable from the other side. All other IP addresses return "Destination unreachable" in a traceroute (MTR; pings), from outside and inside.

The difference after the update to 2.3.7 is in the routing entry in the FreeBSD FIB. With 2.3.6 the route is assigned via the right TUN interface. With 2.3.7 the "server IP route" (GW IP) is assigned via the loopback interface (lo0).

This problem exists with RFC1918 pools (with NAT) and with routed IP addresses (without NAT). A local ping to the third IP address of the defined pool returns "Destination unreachable", and no ICMP packets are visible on the TUN interfaces.

That's very strange. I run several OpenVPN-based servers around the "world" with the same configuration (but different IP address pools for the clients) and that works fine, but not with version 2.3.7 on FreeBSD 9.3-RELEASE and 10.1-RELEASE. In some cases the VPN IP pools are redistributed via OSPF, but that's not the problem. With 2.3.7 a traceroute to the third IP address results in a routing loop between the ISP and the VPN server. On another server no outgoing traffic is visible.

I think it's a problem of the new version on FreeBSD.

Best regards,
Markus

comment:5 in reply to: ↑ 3 Changed 3 years ago by mg16373

Replying to cron2:

It might actually be something completely different, if there are other differences between your "first" and "second" client - like, TLS version negotiation, which got turned on again in 2.3.7. If the "second" client is connected first, will it work then? Anything interesting in the server or client logs for the non-working client?

No differences! All clients that receive the first IP address of the pool work fine. When I change this IP to the next available address (third, fourth, fifth) in FreeRADIUS, the client is unable to communicate. No changes on FreeBSD, no changes to the routing policy (OSPF, BGP, static entries).

I have never used the distributed OpenVPN packages 2.3.6 and 2.3.7 (<- available after "pkg update"; "pkg upgrade") because they do not include all the options I need (see above). Under "/usr/ports/security/openvpn" I use "make menu; make", and this build works fine with FreeRADIUS.

Please take a look at my outputs above. The routing table does not look correct to me. The output above contains a small typo. The correct output for 2.3.7 is ...

[netstat -rnfinet]
xx.xx.x2.240/28 xx.xx.x2.241 UGS 0 5048 tun1
xx.xx.x2.242 link#18 UH 0 0 lo0 <-- I believe the error is here; it must be .241 and not .242

The IP xx.xx.x2.242 is the first available client IP, not the GW address of the server.
With 2.3.6 on FreeBSD the routing entry is for .241 (via the TUN interface); that looks right to me, and all client traffic is forwarded. With 2.3.7 the routing entry is installed via "lo0" (loopback).

comment:6 Changed 3 years ago by cron2

As I said, see trac #481 for a detailed description on why the change was done - the 2.3.6 behaviour is actually not correct. People expect the IP address the server assigns to itself to be pingable and usable for traffic, and on FreeBSD, that means it has to be "on lo0" in the netstat output - which is what happens internally if you do "ifconfig tun0" with two different IPv4 addresses. 2.3.6 (and before) did "ifconfig tun0" with two times the *same* address, which made routing of the subnet work, but the server IP becomes non-pingable then (just to point out the obvious: on 10.1, your LAN ip address will also show up as "lo0").

That out of the way, I still do not understand why it is working on my FreeBSD servers and not on yours.

The next thing that confuses me is the updated "netstat -rnfinet" output - I'd expect it to look like that if you actually tell the server "use .242!" (because then it will assign that to itself, use .241 as an arbitrary address out of the subnet for "remote", and point the route there), but not if you configure a pool with "--server xx.xx.x2.240". In that case, the first address in the subnet (.241) gets on "lo0", the second (.242) is used as the "remote tunnel ip", and the /28 is routed towards the tunnel. Going back to my example above:

x.x.x.192/27 x.x.x.193 US 0 7615 tun0
x.x.x.193 link#12 UHS 0 0 lo0
x.x.x.194 link#12 UH 0 845 tun0

(based on "server x.x.x.192 255.255.255.224" in the openvpn config)

So. Could you please a) verify what is the real output of "netstat -rnfinet", and b) copy-paste the "ifconfig" and "route" lines from the openvpn log?

It *should* look like this:

Jun 15 18:08:55 corp openvpn[55704]: ifconfig_local = 'x.x.x.193'
Jun 15 18:08:55 corp openvpn[55704]: ifconfig_remote_netmask = '255.255.255.224'
Jun 15 18:08:29 corp openvpn[55746]: server_network = x.x.x.192
Jun 15 18:08:29 corp openvpn[55746]: server_netmask = 255.255.255.224
Jun 15 18:08:29 corp openvpn[55746]: ifconfig_pool_start = x.x.x.194
Jun 15 18:08:29 corp openvpn[55746]: ifconfig_pool_end = x.x.x.221
Jun 15 18:08:29 corp openvpn[55746]: ifconfig_pool_netmask = 255.255.255.224
...
Jun 15 18:08:29 corp openvpn[55747]: /sbin/ifconfig tun0 x.x.x.193 x.x.x.194 mtu 1500 netmask 255.255.255.224 up
Jun 15 18:08:29 corp openvpn[55747]: /sbin/route add -net x.x.x.192 x.x.x.193 255.255.255.224

Your netstat output and server config do not really "match" (it should use .241 for local with a --server line of .240), but without a more detailed log file view it's hard to see what is happening. I can set up a test system with 2.3.7 if needed, but the code is identical in git master and 2.3.7 (just verified again), so I'm fairly sure 2.3.7 would work as well for me...
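
The layout described in this comment can be sketched in a few lines of Python. This is only an illustration, not OpenVPN's code: the function name `expected_layout` is hypothetical, the pool bounds (network + 2 up to broadcast - 2) are inferred from the `ifconfig_pool_start` / `ifconfig_pool_end` values in the logs in this ticket, and 192.0.2.192/27 is a documentation range standing in for the masked x.x.x.192/27.

```python
import ipaddress


def expected_layout(network: str, netmask: str, dev: str = "tun0") -> dict:
    """Addresses that 'server <network> <netmask>' with 'topology subnet'
    appears to yield on FreeBSD with 2.3.7 (assumption: inferred from the
    logs in this ticket, for subnets of /29 or larger)."""
    net = ipaddress.IPv4Network(f"{network}/{netmask}")
    base = net.network_address
    local = base + 1    # the server's own address, shows up on lo0
    remote = base + 2   # "remote tunnel ip" on the point-to-point tun
    return {
        "local": str(local),
        "remote": str(remote),
        "pool_start": str(base + 2),
        "pool_end": str(net.broadcast_address - 2),
        # (destination, gateway, netif) rows as expected in "netstat -rn"
        "routes": [
            (str(net), str(local), dev),
            (str(local), "link", "lo0"),
            (str(remote), "link", dev),
        ],
    }


layout = expected_layout("192.0.2.192", "255.255.255.224")
for dest, gateway, netif in layout["routes"]:
    print(f"{dest:<18} {gateway:<14} {netif}")
```

Comparing the three printed rows with the real "netstat -rnfinet" output is a quick way to spot a subnet route that ended up on the wrong interface.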

comment:7 follow-up: Changed 3 years ago by mg16373

# netstat -rn
xx.xx.32.240/28 xx.xx.32.241 UGS 0 0 lo0
xx.xx.32.242 link#18 UH 0 0 tun1

# route -n get xx.xx.32.240
   route to: xx.xx.32.240
destination: xx.xx.32.240
       mask: 255.255.255.240
    gateway: xx.xx.32.241
        fib: 0
  interface: lo0
      flags: <UP,GATEWAY,DONE,STATIC>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0     16384         1         0

# ifconfig tun1
tun1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet6 fe80::21c:23ff:fee2:d94f%tun1 prefixlen 64 scopeid 0x12
        inet xx.xx.32.241 --> xx.xx.32.242 netmask 0xfffffff0
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 40814

# grep -e ifconfig -e route /var/log/openvpn500.log

Tue Jun 16 08:51:30 2015 us=951024 ifconfig_local = 'xx.xx.32.241'
Tue Jun 16 08:51:30 2015 us=951061 ifconfig_remote_netmask = '255.255.255.240'
Tue Jun 16 08:51:30 2015 us=951097 ifconfig_noexec = DISABLED
Tue Jun 16 08:51:30 2015 us=951134 ifconfig_nowarn = DISABLED
Tue Jun 16 08:51:30 2015 us=951170 ifconfig_ipv6_local = '[UNDEF]'
Tue Jun 16 08:51:30 2015 us=951207 ifconfig_ipv6_netbits = 0
Tue Jun 16 08:51:30 2015 us=951243 ifconfig_ipv6_remote = '[UNDEF]'
Tue Jun 16 08:51:30 2015 us=952888 route_script = '[UNDEF]'
Tue Jun 16 08:51:30 2015 us=952924 route_default_gateway = 'xx.xx.32.242'
Tue Jun 16 08:51:30 2015 us=952960 route_default_metric = 0
Tue Jun 16 08:51:30 2015 us=952995 route_noexec = DISABLED
Tue Jun 16 08:51:30 2015 us=953030 route_delay = 0
Tue Jun 16 08:51:30 2015 us=953066 route_delay_window = 30
Tue Jun 16 08:51:30 2015 us=953102 route_delay_defined = DISABLED
Tue Jun 16 08:51:30 2015 us=953137 route_nopull = DISABLED
Tue Jun 16 08:51:30 2015 us=953173 route_gateway_via_dhcp = DISABLED
Tue Jun 16 08:51:30 2015 us=953209 max_routes = 100
Tue Jun 16 08:51:30 2015 us=958147 push_entry = 'route-gateway xx.xx.32.241'
Tue Jun 16 08:51:30 2015 us=958288 ifconfig_pool_defined = ENABLED
Tue Jun 16 08:51:30 2015 us=958326 ifconfig_pool_start = xx.xx.32.242
Tue Jun 16 08:51:30 2015 us=958363 ifconfig_pool_end = xx.xx.32.253
Tue Jun 16 08:51:30 2015 us=958400 ifconfig_pool_netmask = 255.255.255.240
Tue Jun 16 08:51:30 2015 us=958435 ifconfig_pool_persist_filename = '[UNDEF]'
Tue Jun 16 08:51:30 2015 us=958470 ifconfig_pool_persist_refresh_freq = 600
Tue Jun 16 08:51:30 2015 us=958505 ifconfig_ipv6_pool_defined = DISABLED
Tue Jun 16 08:51:30 2015 us=958541 ifconfig_ipv6_pool_base = ::
Tue Jun 16 08:51:30 2015 us=958583 ifconfig_ipv6_pool_netbits = 0
Tue Jun 16 08:51:30 2015 us=959009 push_ifconfig_defined = DISABLED
Tue Jun 16 08:51:30 2015 us=959046 push_ifconfig_local = 0.0.0.0
Tue Jun 16 08:51:30 2015 us=959083 push_ifconfig_remote_netmask = 0.0.0.0
Tue Jun 16 08:51:30 2015 us=959119 push_ifconfig_ipv6_defined = DISABLED
Tue Jun 16 08:51:30 2015 us=959155 push_ifconfig_ipv6_local = ::/0
Tue Jun 16 08:51:30 2015 us=959191 push_ifconfig_ipv6_remote = ::
Tue Jun 16 08:51:30 2015 us=959402 max_routes_per_client = 256
Tue Jun 16 08:51:31 2015 us=17141 do_ifconfig, tt->ipv6=0, tt->did_ifconfig_ipv6_setup=0
Tue Jun 16 08:51:31 2015 us=17212 /sbin/ifconfig tun1 xx.xx.32.241 xx.xx.32.242 mtu 1500 netmask 255.255.255.240 up
Tue Jun 16 08:51:31 2015 us=21369 /sbin/route add -net xx.xx.32.240 xx.xx.32.241 255.255.255.240

comment:8 Changed 3 years ago by mg16373

After I started this VPN instance with 2.3.7, an iPhone client doesn't work, and a local ping to the client IP fails. With 2.3.6 there is no problem. That's not normal, and the routing looks very strange to me.

#tcpdump -ni tun1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tun1, link-type NULL (BSD loopback), capture size 65535 bytes
09:05:06.966513 IP xx.xx.32.243.49689 > 97.133.35.196.53: 58206+ A? www.mindsoft.de. (33)
09:05:08.616370 IP xx.xx.32.243.61163 > 97.133.35.196.53: 30189+ SOA? local. (23)
09:05:08.636024 IP xx.xx.32.243.51743 > 97.133.35.196.53: 6628+ A? xpl.theadex.com. (33)
09:05:13.422395 IP xx.xx.32.243.49689 > 95.170.86.85.53: 58206+ A? www.mindsoft.de. (33)
09:05:19.444460 IP xx.xx.32.243.49689 > 95.170.86.85.53: 58206+ A? www.mindsoft.de. (33)

# ping xx.xx.32.243
PING xx.xx.32.243 (xx.xx.32.243): 56 data bytes
36 bytes from localhost (127.0.0.1): Time to live exceeded
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 5400 3f10 0 0000 01 01 0000 127.0.0.1 xx.xx.32.243

36 bytes from localhost (127.0.0.1): Time to live exceeded
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 5400 3f27 0 0000 01 01 0000 127.0.0.1 xx.xx.32.243

36 bytes from localhost (127.0.0.1): Time to live exceeded
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 5400 3f2b 0 0000 01 01 0000 127.0.0.1 xx.xx.32.243

36 bytes from localhost (127.0.0.1): Time to live exceeded
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 5400 3f2f 0 0000 01 01 0000 127.0.0.1 xx.xx.32.243

comment:9 Changed 3 years ago by mg16373

From outside, there is a routing loop with 2.3.7; I have never seen this with 2.3.6:

 Host                                                                                    Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. XXXXXXXXXXXXXXXXXXXX                                                                  0.0%     1    0.9   0.9   0.9   0.9   0.0
 2. hos-tr1.yyyyyy ..xxxx xx.de                                                      0.0%     1    0.9   0.9   0.9   0.9   0.0
 3. core21.xxxxxx.de                                                                      0.0%     1    0.9   0.9   0.9   0.9   0.0
 4. core1.xxxxxx.de                                                                         0.0%     1    4.9   4.9   4.9   4.9   0.0
 5. juniper1.ffm.xxxxxx.de                                                                0.0%     1    4.9   4.9   4.9   4.9   0.0
 6. decix2.mtkom-ip.net                                                                   0.0%     1    5.9   5.9   5.9   5.9   0.0
 7. bbcr05-fra4.mtkom-ip.net                                                              0.0%     1    5.9   5.9   5.9   5.9   0.0
 8. gate61wa.yyyyyyyyyyyyyy.de                                                            0.0%     1    7.9   7.9   7.9   7.9   0.0
 9. xx.xxx.1.34                                                                           0.0%     1    8.9   8.9   8.9   8.9   0.0
10. xx.xxx.1.34                                                                           0.0%     1    8.9   8.9   8.9   8.9   0.0
11. xx.xxx.1.34                                                                           0.0%     1    8.9   8.9   8.9   8.9   0.0
12. xx.xxx.1.34                                                                           0.0%     1    8.9   8.9   8.9   8.9   0.0
13. xx.xxx.1.34                                                                           0.0%     1    8.9   8.9   8.9   8.9   0.0
14. xx.xxx.1.34                                                                           0.0%     1    8.8   8.8   8.8   8.8   0.0
15. xx.xxx.1.34                                                                           0.0%     1    8.9   8.9   8.9   8.9   0.0
16. xx.xxx.1.34                                                                           0.0%     1    8.8   8.8   8.8   8.8   0.0
17. xx.xxx.1.34                                                                           0.0%     1    8.9   8.9   8.9   8.9   0.0
18. xx.xxx.1.34                                                                           0.0%     1    9.8   9.8   9.8   9.8   0.0
19. xx.xxx.1.34                                                                           0.0%     1    8.8   8.8   8.8   8.8   0.0
20. xx.xxx.1.34                                                                           0.0%     1    8.8   8.8   8.8   8.8   0.0

comment:10 Changed 3 years ago by mg16373

With 2.3.6 it works! Same output as above but reduced to the last three lines:

..
 8. gate61wa.yyyyyyyyyy.de 0.0% 6 7.9 8.2 7.9 8.9 0.0
 9. xx.xxx.1.34 0.0% 6 7.9 8.4 7.9 8.9 0.0
10. 243.xxxxxxxxxxxxxxxxx

# birdc show route protocol static1
BIRD 1.5.0 ready.
10.0.0.0/8 unreachable [static1 2015-06-15] * (10)
xx.xxx.32.192/26 unreachable [static1 2015-06-15] * (10)
169.254.0.0/16 unreachable [static1 2015-06-15] * (10)
192.168.0.0/16 unreachable [static1 2015-06-15] * (10)
xx.xxx.32.240/28 via xx.xxx.32.241 on tun1 [static1 09:26:30] (10) <-- Helper (*)
192.0.2.0/24 unreachable [static1 2015-06-15] * (10)
172.16.0.0/12 unreachable [static1 2015-06-15] * (10)
xx.xxx.95.114/32 via xx.xxx.1.33 on uplink [static1 2015-06-15] * (10)

(*) = for OSPF redistribution to my upstream provider.

comment:11 in reply to: ↑ 7 ; follow-up: Changed 3 years ago by cron2

Hiya,

Replying to mg16373:

# netstat -rn
xx.xx.32.240/28 xx.xx.32.241 UGS 0 0 lo0
xx.xx.32.242 link#18 UH 0 0 tun1

I'm sure you have noticed that this looks different again from the first two netstat -rn you have posted, which both show the /28 to be routed toward the tun1. This netstat -rn explains the looping effect quite well - of course it loops if the route points to lo0.
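
A rough way to spot this condition automatically is to scan the routing table for prefix routes bound to a loopback interface. The Python sketch below is an illustration only: `routes_on_loopback` is a hypothetical helper, and it assumes the interface name is the last column, as in the output pasted in this ticket (other netstat versions may append further columns). A hit is a hint to look closer, not proof of a fault.

```python
def routes_on_loopback(netstat_output: str) -> list:
    """Return (destination, gateway) pairs for prefix routes ("a.b.c.d/len")
    whose interface column is a loopback device."""
    suspicious = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Need at least Destination, Gateway, Flags, Netif; skip host routes.
        if len(fields) < 4 or "/" not in fields[0]:
            continue
        destination, gateway, netif = fields[0], fields[1], fields[-1]
        if netif.startswith("lo"):
            suspicious.append((destination, gateway))
    return suspicious


# Lines copied from the "netstat -rn" output quoted above.
broken = """\
xx.xx.32.240/28 xx.xx.32.241 UGS 0 0 lo0
xx.xx.32.242 link#18 UH 0 0 tun1
"""
print(routes_on_loopback(broken))
```

The host route for the server's own address on lo0 is deliberately ignored, since that entry is expected with 2.3.7.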

Now the interesting question is "why is that so", and "where is the xx.xx.32.241 link... lo0" entry.

Is it possible that bird is interfering with your routing setup? How exactly is bird set up? If it has a bird-side route for the /28 pointing to the .241, it might delete the openvpn route and reinstall "something" (the "Helper" you have tagged).

Tue Jun 16 08:51:31 2015 us=17212 /sbin/ifconfig tun1 xx.xx.32.241 xx.xx.32.242 mtu 1500 netmask 255.255.255.240 up
Tue Jun 16 08:51:31 2015 us=21369 /sbin/route add -net xx.xx.32.240 xx.xx.32.241 255.255.255.240

This looks the same as here. By "traditional definition", a route pointing to "my own IP" is considered "local on that interface" (and it works that way on my 9.3 systems). I have not tested this on 10.1 yet, but will do ASAP - maybe something in the stack changed and it really needs to point towards the remote IP address.

Anyway: if bird is installing the /28 route, try pointing bird towards the xx.xx.32.242 as gateway address.

comment:12 in reply to: ↑ 11 Changed 3 years ago by mg16373

Replying to cron2:

Is it possible that bird is interfering with your routing setup? How exactly is bird set up? If it has a

It's a "complex" bird setup that handles many OSPF-based VPN networks and two BGPv4 stub links, and I don't want to post it here.

bird-side route for the /28 pointing to the .241, it might delete the openvpn route and reinstall "something" (the "Helper" you have tagged).

Anyway: if bird is installing the /28 route, try pointing bird towards the xx.xx.32.242 as gateway address.

I will try it, but with the new version 2.3.7 an additional IP address is lost for client usage. An example: 192.168.1.0/29 = 0=Network, 1=LocalTUN-IP1, 2=OpenVPN-GW-IP (?), 3-6=Client, 7=Broadcast. Only four IP addresses are available for clients. With RFC1918 that's not a big problem, but non-RFC1918 pools are "precious" :-)

In small IP pools that is a bit of a problem.

comment:13 follow-up: Changed 3 years ago by cron2

Nothing is lost :-)

The $base+2 address is perfectly usable for clients (as you already found out, as that's the only one working for you). It does not matter what the FreeBSD side thinks that it is there for, as long as it is sent into the tun interface for the OpenVPN server to hand out and distribute.

The only effective difference is that the $base+1 address now actually works to ping the server - which is the intended outcome, and worked that way on all other platforms but not on FreeBSD with "--topology subnet".

comment:14 in reply to: ↑ 13 Changed 3 years ago by mg16373

Replying to cron2:

Nothing is lost :-)

The $base+2 address is perfectly usable for clients (as you already found out, as that's the only one working for you). It does not matter what the FreeBSD side thinks that it is there for, as long as it is sent into the tun interface for the OpenVPN server to hand out and distribute.

The only effective difference is that the $base+1 address now actually works to ping the server - which is the intended outcome, and worked that way on all other platforms but not on FreeBSD with "--topology subnet".

I have now tested 2.3.7, but traffic for all clients above IP .242 is not forwarded and they get nothing back. I give up! Version 2.3.7 is buggy on all my FreeBSD servers.

With or without the BIRD routing entry, the clients receive no replies from outside. I can ping the .241 IP, but I don't need that. IP .242 and .243 are unreachable.

Version 2.3.7 Output
[netstat -rnfinet]
xx.xx.32.240/28 xx.xx.32.241 UGS 0 0 lo0
xx.xxx.32.242 link#18 UH 0 0 tun1

Sorry! The ticket can be closed now because I will use 2.3.6 as long as possible. The different setup in 2.3.7 that enables a ping to the server IP is not a feature for me when all other IPs are not reachable. Whatever! :-)

Thanks a lot for your time.
Markus

comment:15 follow-up: Changed 3 years ago by cron2

I'm not closing this ticket as long as someone is unhappy :-) - but I still do not understand why this is breaking for you, and what is causing the /28 route to point towards the lo0. Maybe this is really a 10.1 thing (though you wrote that it is broken on 9.3 for you as well).

(In the long run, you want to use git master anyway to get tls floating... and it has the same code, so we should better understand the breakage and make it work for you).

Will test with 10.1 and report back.

comment:16 in reply to: ↑ 15 ; follow-up: Changed 3 years ago by mg16373

Replying to cron2:

I'm not closing this ticket as long as someone is unhappy :-) - but I still do not understand why this is breaking for you, and what is causing the /28 route to point towards the lo0. Maybe this is really a 10.1 thing (though you wrote that it is broken on 9.3 for you as well).

(In the long run, you want to use git master anyway to get tls floating... and it has the same code, so we should better understand the breakage and make it work for you).

Will test with 10.1 and report back.

Fresh reboot of the machine. Two of 16 VPN servers are started with 2.3.7 (topology subnet). No VPN related static routes injected by BIRD or other software. Here is the result:

{{
xx.xx.32.240/28 xx.xx.32.241 US 0 155 tun1
xx.xx.32.241 link#18 UHS 0 0 lo0
xx.xx.32.242 link#18 UH 0 2 tun1

tun1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet6 fe80::21c:23ff:fee2:d94f%tun1 prefixlen 64 scopeid 0x12
        inet xx.xx.32.241 --> xx.xx.32.242 netmask 0xfffffff0
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 1146

10.xx.2xx.240/28 10.xx.2xx.241 US 0 0 tun4
10.xx.2xx.241 link#19 UHS 0 0 lo0
10.xx.2xx.242 link#19 UH 0 4 tun4

tun4: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 10.xx.2xx.241 --> 10.xx.2xx.242 netmask 0xfffffff0
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 1166

}}

Can you please compare these routing entries with your setup? The client (.243) can communicate with the world. It's a little bit curious!!! :-(

Thanks!

comment:17 in reply to: ↑ 16 ; follow-up: Changed 3 years ago by cron2

Hi,

Replying to mg16373:

Fresh reboot of the machine. Two of 16 VPN servers are started with 2.3.7 (topology subnet). No VPN related static routes injected by BIRD or other software. Here is the result:

{{
xx.xx.32.240/28 xx.xx.32.241 US 0 155 tun1
xx.xx.32.241 link#18 UHS 0 0 lo0
xx.xx.32.242 link#18 UH 0 2 tun1

..

10.xx.2xx.240/28 10.xx.2xx.241 US 0 0 tun4
10.xx.2xx.241 link#19 UHS 0 0 lo0
10.xx.2xx.242 link#19 UH 0 4 tun4

These look perfectly reasonable - "just the way it looks here"...

Can you please compare these routing entries with your setup? The client (.243) can communicate with the world. It's a little bit curious!!! :-(

... and with these entries, I'm not surprised to see that it works :-) - /28 route pointing to tunX is exactly right.

Now, why did this break previously...

If, on one of the non-working instances, you stop openvpn completely - is the /28 route still there? This might explain things (2.3.6 or bird not cleaning up, and the existing route getting confused with the .241 next-hop now pointing to lo0).

comment:18 in reply to: ↑ 17 Changed 3 years ago by mg16373

Replying to cron2:

Hi,

... and with these entries, I'm not surprised to see that it works :-) - /28 route pointing to tunX is exactly right.

I will check the FIB from time to time :-)

If, on one of the non-working instances, you stop openvpn completely - is the /28 route still there? This might explain things (2.3.6 or bird not cleaning up, and the existing route getting confused with the .241 next-hop now pointing to lo0).

I have observed this behavior a number of times with BIRD. Routing entries remain in the FreeBSD FIB (like corpses) although they should have been removed. However, the problem apparently only occurs in connection with OpenVPN and BIRD. A few minutes ago I phoned my ISP and asked for the routing table. Without helper routes (BIRD) all subnets are present via OSPF. That sounds good at the moment.

Please leave this ticket open for a while (1-2 days).

Best regards,
Markus

comment:19 Changed 2 years ago by cron2

  • Milestone changed from release 2.3.7 to release 2.3.8
  • Version changed from 2.3.6 to 2.3.7

comment:20 Changed 2 years ago by cron2

  • Milestone changed from release 2.3.8 to release 2.3.9

Missed 2.3.7... the reason why I've kept this open is because I want to reorganize FreeBSD routing a bit (next-hop could be the "arbitrary remote IP", which makes it slightly easier to understand what is happening - even if it does the same thing in the end).

comment:21 Changed 2 years ago by mg16373

Thanks for your reply and work!

-Markus

comment:22 Changed 2 years ago by cron2

  • Milestone changed from release 2.3.9 to release 2.3.10

The reason why it's still open is still there - "clean up freebsd next-hop'ing" - but 2.3.9 is imminent and needs to get out, with no further delay. Soon...

comment:23 Changed 20 months ago by samuli

  • Milestone changed from release 2.3.10 to release 2.3.12

comment:24 Changed 14 months ago by cron2

  • Milestone changed from release 2.3.12 to release 2.3.14
  • Resolution set to fixed
  • Status changed from accepted to closed

There seem to be a multitude of bugs related to FreeBSD and topology subnet... *scratch head*

There's #481, which is "I cannot ping myself" (fixed quite a while ago).

Then there's #425, which is "of the subnet, only the first neighbour address works, and everything else gets routed to lo0", which seems to hit 11.0-RELEASE and maybe some of the "in the 10-STABLE branch" versions - a fix is on the list, and will show up in 2.3.14.

In the end, the dual fix is to synthesize a "peer address" out of the subnet, so the p2p tun interface is happy (using the same address for both endpoints broke #481), and then use that peer address as gateway address for the subnet itself (#425).
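
The two halves of that fix can be illustrated with a short Python sketch. It is not the actual patch: `synthesize_peer` and `freebsd_commands` are hypothetical names, the rule for picking the peer (first host address of the subnet, or the second if the first is the local address) follows the description in comment:6, and the command strings only mirror the format of the log lines quoted earlier in this ticket. 192.0.2.240/28 is a documentation range used in place of the masked addresses.

```python
import ipaddress


def synthesize_peer(local: str, netmask: str) -> str:
    """Pick a peer address inside the subnet that differs from `local`."""
    iface = ipaddress.IPv4Interface(f"{local}/{netmask}")
    candidate = iface.network.network_address + 1
    if candidate == iface.ip:
        candidate += 1
    return str(candidate)


def freebsd_commands(dev: str, local: str, netmask: str, mtu: int = 1500) -> list:
    """Build the ifconfig/route command lines for a subnet-topology tun."""
    iface = ipaddress.IPv4Interface(f"{local}/{netmask}")
    peer = synthesize_peer(local, netmask)
    network = iface.network.network_address
    return [
        # Two different endpoint addresses keep the p2p interface happy (#481).
        f"/sbin/ifconfig {dev} {local} {peer} mtu {mtu} netmask {netmask} up",
        # The subnet route uses the peer, not the local address, as gateway (#425).
        f"/sbin/route add -net {network} {peer} {netmask}",
    ]


for cmd in freebsd_commands("tun1", "192.0.2.241", "255.255.255.240"):
    print(cmd)
```

The only difference from the 2.3.7 log lines above is the gateway in the route command: the synthesized peer replaces the server's own address.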

Closing *this* one, as no more action is needed here. #425 is tracking the not-yet-committed patch.
