Opened 7 years ago

Closed 7 years ago

Last modified 6 years ago

#807 closed Bug / Defect (fixed)

Error setting tap IP by dhcp

Reported by: Selva Nair Owned by:
Priority: major Milestone:
Component: Generic / unclassified Version: OpenVPN 2.4.0 (Community Ed)
Severity: Not set (select this one, unless your'e a OpenVPN developer) Keywords:
Cc: Samuli Seppänen

Description (last modified by Selva Nair)

If a connection is made using a tap adapter and then after disconnecting it if the same connection is attempted using a second tap adapter, openvpn fails to asssign IP address to the new adapter.

In normal use this error can show up in many different ways with multiple tap interfaces and single or multiple connections depending on the order in which connections are started etc.. But the basic issue appears to be due to the assigned address not being removed from an interface when tun is closed. Also there are no such errors if a fixed interface is always assigned to a connection (using --dev-node option).

Easy to reproduce: have two tap adapters, say, adapter1 and adapter2 and one config file with a single remote that assigns the same VPN IP to the client everytime (10.9.0.6 in logs below)

First have --dev-node adapter1 in the config, make a successful connection. Disconnect, change the dev-node to --dev-node adapter2 and connect again. This time it will fail with something like

Thu Dec 29 23:48:57 2016 us=508321 TAP-WIN32 device [adapter2] opened: \\.\Global\{4403682C-28DA-498B-98A8-CFC764054EB6}.tap
Thu Dec 29 23:48:57 2016 us=509321 TAP-Windows Driver Version 9.21 
Thu Dec 29 23:48:57 2016 us=509321 TAP-Windows MTU=1500
Thu Dec 29 23:48:57 2016 us=516322 Set TAP-Windows TUN subnet mode network/local/netmask = 10.9.0.0/10.9.0.6/255.255.255.0 [SUCCEEDED]
Thu Dec 29 23:48:57 2016 us=516322 Notified TAP-Windows driver to set a DHCP IP/netmask of 10.9.0.6/255.255.255.0 on interface {4403682C-28DA-498B-98A8-CFC764054EB6} [DHCP-serv: 10.9.0.0, lease-time: 31536000]
Thu Dec 29 23:48:57 2016 us=517322 Successful ARP Flush on interface [28] {4403682C-28DA-498B-98A8-CFC764054EB6}
Thu Dec 29 23:48:57 2016 us=534323 do_ifconfig, tt->did_ifconfig_ipv6_setup=1
Thu Dec 29 23:48:57 2016 us=534323 MANAGEMENT: >STATE:1483073337,ASSIGN_IP,,10.9.0.6,,,,,xx:xx::1004
Thu Dec 29 23:48:57 2016 us=534323 TUN: adding address failed using service: The object already exists.   [status=5010 if_index=28]
Thu Dec 29 23:48:57 2016 us=534323 add_route_ipv6(xx:xx::/64 -> xx:xx::1004 metric 0) dev adapter1
Thu Dec 29 23:48:57 2016 us=534323 IPv6 route addition via service succeeded
Thu Dec 29 23:49:02 2016 us=181589 TEST ROUTES: 0/0 succeeded len=-1 ret=0 a=0 u/d=down
Thu Dec 29 23:49:02 2016 us=181589 Route: Waiting for TUN/TAP interface to come up...
Thu Dec 29 23:49:07 2016 us=61868 TEST ROUTES: 0/0 succeeded len=-1 ret=0 a=0 u/d=down
Thu Dec 29 23:49:07 2016 us=61868 Route: Waiting for TUN/TAP interface to come up...
Thu Dec 29 23:49:08 2016 us=285938 TEST ROUTES: 0/0 succeeded len=-1 ret=0 a=0 u/d=down
Thu Dec 29 23:49:08 2016 us=285938 Route: Waiting for TUN/TAP interface to come up...
Thu Dec 29 23:49:09 2016 us=14980 TEST ROUTES: 0/0 succeeded len=-1 ret=0 a=0 u/d=down
Thu Dec 29 23:49:09 2016 us=14980 Route: Waiting for TUN/TAP interface to come up...

The above message repeats several times and finally the connection "completes" but with no ip number set on adapter2. At this point

netsh int ipv4 show addr shows (not all adapters listed)

Configuration for interface "adapter2"
    DHCP enabled:                         Yes
    InterfaceMetric:                      20

Configuration for interface "adapter1"
    DHCP enabled:                         Yes
    IP Address:                           10.9.0.6
    Subnet Prefix:                        10.9.0.0/24 (mask 255.255.255.0)
    InterfaceMetric:                      20


(Note that only netsh shows addresses on adapter1 as above, ipconfig /all does not, possibly because adapter1 is disconnected).

So adapter1 is still holding the IP we are trying to set on adapter2 and could be the reason for this failure. ipconfig /release adapater1 doesn't work as the "media" is disconnected on that adapter.

As seen in the logs, an ipv6 address is also assigned (through the service) which fails too possibly due to the same reason.

Change History (12)

comment:1 Changed 7 years ago by Selva Nair

Description: modified (diff)
Version: 2.2.22.4.0

comment:2 Changed 7 years ago by Gert Döring

I wonder if it would work to just call ipconfig /release adapter1 on tun_close() time?

... and I thought we remove the v6 address already today?

comment:3 Changed 7 years ago by Selva Nair

We do not remove the v6 address -- that patch is still waiting for review. Unlike earlier thought it seems removing the v6 address is more than cosmetics.

As for ipv4 ipconfig /release adapter1 may work but will have to be done from iservice. A better way may be for the dhcp server inside tap driver do this.

comment:4 in reply to:  3 Changed 7 years ago by Gert Döring

Cc: Samuli Seppänen added

Hi,

Replying to selvanair:

We do not remove the v6 address -- that patch is still waiting for review.

Oops. Got lost somewhere, sorry.

Unlike earlier thought it seems removing the v6 address is more than cosmetics.

I wonder if this needs specific timing to "bite". My test cases did connect to different remotes on the same tap interface, as did samuli's - but I was doing this manually, so Windows had a few seconds to orderly shut down the interface, and "forget everything relevant".

As for ipv4 ipconfig /release adapter1 may work but will have to be done from iservice. A better way may be for the dhcp server inside tap driver do this.

True.

The DHCP server might be a way to work around this, for example by setting the lease time to 30 seconds or so...

But I have a different suspicion. If this is not happening with 2.3, with the NDIS6 tap driver, it might be related to commit 533495298bd93ddaaf4418dc666922779ce5ef9b (though our testing was supposed to figure out whether it could bring this sort of problem), which changed the sequence of do_ifconfig() and open_tun() on windows.

Hypothesis:

  • first connection is started
  • we set up the DHCP server for "address 1"
  • everything works normally
  • we close the tap device, openvpn ends
  • second connection is started, we reopen the tap device
  • the tap driver still knows about "address 1"
  • windows is FAST and grabs the DHCP address right away
  • openvpn sets "address 2", but windows doesn't care anymore, because it is satisfied with "address 1" (for the lease time)

to verify this, I think one could build an openvpn binary that just pokes 10.1.2.3 into the tap driver for DHCP on exit - and if that address is what shows up on the openvpn run that fails, we need to figure out how to un-load the tap driver's DHCP settings (maybe setting to 0.0.0.0 is good enough, maybe it needs to be explicitly turned off).

Thinking loud, so this might all be wrong.

comment:5 Changed 7 years ago by Selva Nair

First, I see this only with more than one adapter so this may be unrelated to the similar looking report in the users list. I am yet to see something like this with only one adapter. So if there is a timing issue I haven't seen it.

For me its the issue of a disconnected adapter (adapter1) still holding an address which tap driver wants to assign to another one (adapter2). On further testing see that that this happens often when there are multiple adapters and connections started in random order unless one always use a particular adapter with a particular connection (which is what I normally do).

As for one adapter, Windows is unlikely to just grab the previous address as usually its sin is "spamming" the dhcp server too often. Based on some sniffing I see windows always sending out dhcp discover and getting a response when connection comes up -- it does request the existing address first but the tap driver denies it and then it accepts the new one. So assigning a new address to an adapter should work without any timing issues.

So the reports on multiple connections + 1 adapter not working is still a mystery to me..

Last edited 7 years ago by Selva Nair (previous) (diff)

comment:6 Changed 7 years ago by Gert Döring

see also #665 for reports about 2.3.x being affected by this (so it's not d12fk's commit)

comment:7 Changed 7 years ago by Selva Nair

Doing dhcp_release() in close_tun() does resolve this. And release appears to work without needing admin rights, so this could be a fix for the problem I reported -- not necessarily a fix for Track 665 and similar reports with one adapter and multiple remotes.

We already have the code required for dhcp release -- normally called in close_tun() if --dhcp-release is in the config. I think it should be always called. The man page talks about some caveats of using this option but nothing mentioned sounds like "caveats".

There is also an undocumented --dhcp-pre-release option to do dhcp release in open_tun which appears unnecessary as dhcp renew will lead to a release unless the address hasn't changed.

Last edited 7 years ago by Selva Nair (previous) (diff)

comment:8 Changed 7 years ago by Selva Nair

Replying to cron2:

Replying to selvanair:

We do not remove the v6 address -- that patch is still waiting for review.

Oops. Got lost somewhere, sorry.

My bad. v6 address is indeed removed, the pending patch is for somewhat unrelated "v6 directly connected net" route. The error I saw with v6 address setting was probably caused by repeated trials. Only v4 address setting appears to be affected.

A patch to remove the dhcp address in close_tun() sent to the list.

comment:9 Changed 7 years ago by Samuli Seppänen

Quoting the OpenVPN 2.4.0 man-page:

    --dhcp-renew
           Ask Windows to renew the TAP adapter lease on startup.  This option is
           normally unnecessary, as Windows automatically triggers  a DHCP
           renegotiation on the TAP adapter when it comes up, however if you set
           the TAP-Win32 adapter Media Status property to "Always Connected", you
           may need this flag.

    --dhcp-release
           Ask Windows to release the TAP adapter lease on shutdown.  This
           option has the same caveats as --dhcp-renew above.

I too fail to see the so-called caveats. The text has stayed the same since 2005:

$ git log 6fbf66
Author: james <james@e7ae566f-a301-0410-adde-c780ea21d3b5>
Date:   Mon Sep 26 05:28:27 2005 +0000

    This is the start of the BETA21 branch.
    It includes the --topology feature, and
    TAP-Win32 driver changes to allow
    non-admin access.
     
    git-svn-id: http://svn.openvpn.net/projects/openvpn/branches/BETA21/openvpn@580 e7ae566f-a301-0410-adde-c780ea21d3b5

$ git show 6fbf66:openvpn.8

comment:10 Changed 7 years ago by Gert Döring

commit db5b9b45508ea8f66ea80565279af3edd9300499 (master)
commit 139cd1b14c57501bee36c1ec1dc5a037617ec29e (release/2.4)
Author: Selva Nair
Date: Tue Jan 3 15:38:03 2017 -0500

Always release dhcp address in close_tun() on Windows.

thanks :-)

comment:11 Changed 7 years ago by Selva Nair

Resolution: fixed
Status: newclosed

comment:12 Changed 6 years ago by engy

I had a problem with disabled DHCP media sense. I created a pull request to warn about that https://github.com/OpenVPN/openvpn/pull/97

Can be checked with "netsh interface ipv4 show global"
and fixed with "netsh interface ipv4 set global dhcpmediasense=enabled"

Note: See TracTickets for help on using tickets.