Opened 7 years ago

Last modified 2 years ago

#160 accepted Bug / Defect

openvpn sometimes doesn't provide 'common_name' env. var during client-disconnect execution.

Reported by: Daniel Owned by: JJK
Priority: major Milestone: release 2.4
Component: plug-ins / plug-in API Version: OpenVPN 2.2.0 (Community Ed)
Severity: Not set (select this one, unless you're an OpenVPN developer) Keywords:
Cc: Gert Döring

Description

Hi!

I have a client-disconnect script which gets executed when users disconnect. Although I push "explicit-exit-notify" to clients, they sometimes disconnect abruptly (because of flaky internet connections). For a long time the client-disconnect script never failed me and cleaned up after each user, but to do this it needs the 'common_name' environment variable, which gives me the username (I'm using username/password auth, not key files).
Unfortunately, every now and then a user loses her/his internet connection, and I start getting these in the server logs:

read UDPv4 [ECONNREFUSED]: Connection refused (code=111)

Then, after ten or so of these messages, the client-disconnect script logs:

client-disconnect: undefined username!

The code that produces this message is simple: it just checks whether the 'common_name' environment variable exists.
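
The original script is written in Perl and is not shown in this ticket, so the following Python sketch only illustrates the kind of check described. The names `username_from_env` and `main`, and the choice to treat an empty value or the literal "UNDEF" as missing, are assumptions made for illustration.

```python
import os
import sys


def username_from_env(env):
    """Return the username from 'common_name', or None if it is unusable.

    Assumption: an empty value and the literal "UNDEF" (the placeholder
    seen in the status file) are treated like a missing variable.
    """
    name = env.get("common_name", "").strip()
    if not name or name == "UNDEF":
        return None
    return name


def main(env=os.environ):
    """Return a process exit status: 0 if a username was found, else 1."""
    user = username_from_env(env)
    if user is None:
        # Same message as the script log entry quoted in the report.
        print("client-disconnect: undefined username!", file=sys.stderr)
        return 1
    # Per-user cleanup would go here.
    return 0
```

A real script would end with `sys.exit(main())`, so that a missing name shows up as a non-zero exit status as well as in the log.
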
I've gathered additional information from the moment this happened. It was produced by the same client-disconnect script that couldn't find the 'common_name' env. variable:

  • ENVIRONMENT VARIABLES:
    time_ascii=Sat Sep 10 15:35:02 2011
    daemon_start_time=1315566047
    ifconfig_local=10.x.x.x
    trusted_ip=7x.x.x.x
    remote_port_1=1194
    daemon_pid=1129
    daemon_log_redirect=0
    untrusted_port=1024
    verb=3
    time_duration=416
    bytes_sent=17018
    daemon=1
    local_1=9x.x.x.x
    trusted_port=1024
    ifconfig_broadcast=10.x.x.x
    dev=tap0
    ifconfig_pool_remote_ip=10.x.x.49
    untrusted_ip=7x.x.x.x
    bytes_received=10460
    tun_mtu=1500
    ifconfig_netmask=255.255.240.0
    ifconfig_pool_netmask=255.255.240.0
    time_unix=1315661702
    proto_1=udp
    link_mtu=1574
    local_port_1=1194
    config=/etc/openvpn/openvpn-fw.conf
    script_type=client-disconnect
    script_context=init
    

You can see that 'common_name' is missing from it. All other information is present and correct.

  • ARP TABLE:
    Address       HWtype HWaddress          Flags Mask Iface
    [...]
    10.x.x.49 ether  00:ff:x:x:x:x  C          tap0
    [...]
    

The "offending" user's information is available in the arp table.

  • OPENVPN STATUS FILE:
    OpenVPN CLIENT LIST
    Updated,Sat Sep 10 15:41:58 2011
    Common Name,Real Address,Bytes Received,Bytes Sent,Connected Since
    [...]
    UNDEF,7x.x.x.x:1024,10460,17018,Sat Sep 10 15:35:02 2011
    [...]
    ROUTING TABLE
    Virtual Address,Common Name,Real Address,Last Ref
    [...]
    00:ff:x:x:x:x,UNDEF,7x.x.x.x:1024,Sat Sep 10 15:41:21 2011
    [...]
    GLOBAL STATS
    Max bcast/mcast queue length,49
    END
    

As you can see, the username is UNDEF, which is probably why the client-disconnect script doesn't get the env. var. The other information (IP, MAC) is present and correct.
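
To spot such entries without reading the file by hand, the client list can be scanned for rows whose common name is UNDEF. This is a minimal sketch that assumes the "status-version 1" layout shown above; the function name `undef_clients` is hypothetical.

```python
import csv


def undef_clients(status_text):
    """Return (real_address, connected_since) for client rows named UNDEF.

    Assumes the layout shown in the report: a client list whose column
    header starts with "Common Name" and which ends at "ROUTING TABLE".
    """
    found = []
    in_clients = False
    for row in csv.reader(status_text.splitlines()):
        if not row:
            continue
        if row[0] == "Common Name":
            in_clients = True   # header row of the client list
            continue
        if row[0] == "ROUTING TABLE":
            break               # end of the client list
        if in_clients and len(row) >= 5 and row[0] == "UNDEF":
            found.append((row[1], row[4]))
    return found
```

Called with the contents of the status file, it returns one tuple per affected client, which can be matched against `trusted_ip` and `trusted_port` from the script environment.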

OpenVPN server version is 2.2.1, and has been configured and compiled like this:

./configure --enable-password-save --enable-iproute2 --disable-selinux --prefix=/usr/local && make

The server config:

daemon          openvpn-fw

mode            server
tls-server
dh              /etc/openvpn/dh1024.pem
ca              /etc/ssl/certs/ca.crt
cert            /etc/ssl/certs/cert.crt
key             /etc/ssl/private/key.key

user            openvpn
group           openvpn

local           9x.x.x.x
port            1194
proto           udp

dev-type        tap
dev             tap0
ifconfig        10.x.x.1 255.255.240.0

ifconfig-pool   10.x.x.10 10.x.x.255 255.255.240.0
push            "route-gateway 10.x.x.1"

persist-key
persist-tun

replay-persist  /var/run/openvpn/openvpn-replay

comp-lzo        adaptive
push            "comp-lzo adaptive"

max-clients     150
keepalive       3 30
push            "explicit-exit-notify 3"

status          /var/run/openvpn/openvpn-fw_status 1
status-version  1

syslog          openvpn
verb            3

management      /var/run/openvpn/openvpn-fw_management unix /etc/openvpn/management_passwd
management-client-user  root
management-client-group root

client-cert-not-required
username-as-common-name

script-security 2

client-connect          /usr/local/libexec/openvpn/client-connect.pl
client-disconnect       /usr/local/libexec/openvpn/client-disconnect.pl
tmp-dir                 /dev/shm

auth-user-pass-verify   /usr/local/libexec/openvpn/auth-user-pass-verify.pl     via-file

Change History (16)

comment:1 Changed 7 years ago by JJK

Component: Generic / unclassified → Configuration
Owner: set to JJK
Status: new → accepted

this has been reported before on the openvpn-users mailing list, but nobody was able to reproduce the bug. A patch was presented which fixes the issue, but it was not applied because of the reproducibility issues. The patch is:

--- multi.c 2009-10-24 21:17:29.000000000 -0200
+++ multi.c.patched 2010-03-02 14:57:12.000000000 -0300
@@ -447,6 +447,9 @@
 multi_client_disconnect_setenv (struct multi_context *m,
                                 struct multi_instance *mi)
 {
+  /* setenv incoming cert common name for script */
+  setenv_str (mi->context.c2.es, "common_name", tls_common_name (mi->context.c2.tls_multi, true));
+
   /* setenv client real IP address */
   setenv_trusted (mi->context.c2.es, get_link_socket_info (&mi->context));

(this was for 2.1)

comment:2 Changed 7 years ago by Daniel

Thank you, I'm starting to test the patch now.
Just for the record, a copy-paste'able version of it:

--- multi.c.orig	2011-06-24 08:13:39.000000000 +0200
+++ multi.c	2011-09-13 08:43:56.732903222 +0200
@@ -447,6 +447,9 @@
 multi_client_disconnect_setenv (struct multi_context *m,
				struct multi_instance *mi)
 {
+  /* setenv incoming cert common name for script */
+  setenv_str (mi->context.c2.es, "common_name", tls_common_name (mi->context.c2.tls_multi, true));
+
   /* setenv client real IP address */
   setenv_trusted (mi->context.c2.es, get_link_socket_info (&mi->context));

comment:3 Changed 7 years ago by Daniel

The patch works and is stable, thank you!

comment:4 Changed 7 years ago by Daniel

Unfortunately, I've had another disconnection where the common_name variable was not present. What can I do to help solve this problem?

comment:5 Changed 6 years ago by David Sommerseth

Component: Configuration → plug-ins / plug-in API

The best way to help sort this out is to find a reliable way to reproduce it at will.

Otherwise, you'll need to have a debugger attached to the openvpn process, with a trigger to dump a backtrace when multi_client_disconnect_setenv() is called and tls_common_name (mi->context.c2.tls_multi, true) returns nothing (or NULL).

It would also be interesting to see whether a plugin written in C gets an empty common_name in this case too.

comment:6 Changed 4 years ago by ldperron

I also have this issue, with version 2.3.2-2.el6.

I use username-as-common-name, client-connect and client-disconnect options.

The common_name environment variable is indeed the username, as intended, for the client-connect script, but this is not always true for the client-disconnect script: sometimes the common_name environment variable becomes the common name of the client's certificate instead of the actual username. Note: this happens only on connections that lasted for more than an hour or two.

The patch above, which sets common_name in multi_client_disconnect_setenv, fixed the problem for me.

comment:7 Changed 4 years ago by Samuli Seppänen

Cc: Gert Döring added
Milestone: release 2.4

We have at least three reports of the same issue, and in all cases the provided patch fixes the issue. If the patch is likely to be harmless, perhaps we should just apply it to Git "master" and let it go to 2.4. Thoughts?

comment:8 Changed 4 years ago by JJK

as far as I know (and can see) the patch is harmless, and I don't see how it could negatively affect an existing setup. Let's apply it to Git "master".

comment:9 Changed 3 years ago by Gert Döring

I'm not convinced. From my reading of the ticket, OpenVPN seems to clear the common_name internally before calling the script (which is why it has "UNDEF" in the status.log), so there are at least two bugs here:

  • why is it UNDEF on "client disappears without saying goodbye" (and: can this be reproduced?)
  • why is it sometimes not passing the username but the common_name on disconnect? The time dependency hints at "during rekey, the variable in question is overwritten" (comment 6)

I feel that we don't fully understand what is happening here.

@samuli: daniell reports that it does not fix the problem for him...

comment:10 Changed 3 years ago by JJK

which message is daniell seeing? same as before? any way to reproduce his setup? the big problem with the patch in this bug report is that it does not address the underlying bug (the reproducibility issue), but it *did* fix the issue for several people: classic case of "bad solution to a bad problem"

comment:11 Changed 3 years ago by Daniel

Guys, I reported this four years ago. Whether the patch was or wasn't working for me back then is irrelevant... I don't even live in the same city, and I certainly don't remember the details or the environment of this setup anymore. I don't know how to reproduce this; that script is long gone, and in fact the whole setup is long gone.
This reminds me of a few KDE bugs I reported when I was still in high school; I got a flashback when, after 7 years, they replied asking whether they were still reproducible :-)

Daniel

comment:12 in reply to:  11 Changed 3 years ago by Gert Döring

Replying to daniell:

Guys, I reported this four years ago.

well... sorry for that. We do our best, but some of the bugs are harder than others - like this one. And, like most open source projects, we lack developer resources...

comment:13 Changed 3 years ago by Daniel

Oh, hey man, don't get me wrong, I wasn't trying to hold anyone accountable. This was simply a "fun fact", and I was just trying to say that for a very long time I have been unable to test or reproduce this in the original environment -- it (along with me) is long gone.

Daniel

comment:14 Changed 3 years ago by JJK

still playing around with this, and actually managed to reproduce it with 2.3.6 !
my test config is a basic tun setup with auth-user-pass-verify + client-disconnect + reneg-sec 40
After several key renegotiations and a termination of the client the disconnect script prints the common_name instead of the username. The "reneg-sec XX" is crucial here.

Now all we have to do is figure out what is causing this to happen.

comment:15 Changed 3 years ago by JJK

follow-up: here's what happens:

  • client connects, auth-user-pass-verify script is called with common_name=cert, username=user
  • client auth succeeds, client-connect script is called. During this, common_name is explicitly set to the username (common_name=user)
  • renegotiation: auth-user-pass-verify script is called again, which sets common_name=cert !
  • client-connect is NOT called again, hence common_name is not altered
  • at disconnect, common_name is still set to certificate name

My original patch now makes more sense, although one could argue that the common_name should be set to <user> straight after calling the auth-user-pass-verify script. However, how does the server keep track of which env vars are used for which client?
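
The sequence above can be replayed with a small toy model. This is not OpenVPN code: the `ClientSession` class, its methods and the `run` helper are hypothetical and only mirror the listed steps. The `reset_name` flag models the idea behind the patch, under the assumption that the name restored at disconnect is the username.

```python
class ClientSession:
    """Toy model of one client's script environment on the server."""

    def __init__(self, cert_cn, username):
        self.cert_cn = cert_cn
        self.username = username
        self.env = {}

    def auth_user_pass_verify(self):
        # Runs on the initial connect and on every renegotiation.
        self.env["common_name"] = self.cert_cn

    def client_connect(self):
        # Runs once; username-as-common-name replaces the certificate name.
        self.env["common_name"] = self.username

    def client_disconnect(self, reset_name=False):
        # reset_name=True models setting common_name again right before
        # the script runs (assumption: the restored value is the username).
        if reset_name:
            self.env["common_name"] = self.username
        return self.env.get("common_name")


def run(renegotiations, reset_name=False):
    """Return the common_name the disconnect script would see."""
    session = ClientSession(cert_cn="cert", username="user")
    session.auth_user_pass_verify()
    session.client_connect()
    for _ in range(renegotiations):
        session.auth_user_pass_verify()  # client-connect is NOT called again
    return session.client_disconnect(reset_name)
```

In this model `run(0)` returns "user", any run with at least one renegotiation returns "cert", and `run(3, reset_name=True)` returns "user" again. That matches the observation that only long-lived connections are affected.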

comment:16 Changed 2 years ago by CLaudi

I am having the same problem and created a new ticket (#691) for it. Sorry for that, I didn't see this one.

Anyway, I will just post my experiences about this problem:
I had a problem with my disconnect script and discovered this bug. I use the $common_name variable in the disconnect script to get the username, with "username-as-common-name" enabled. I expected $common_name to always contain the username submitted to the authentication script, but this is not the case. At some point the variable is overwritten with the common name from the certificate. Sadly I can't tell you exactly when this happens, but my guess is during a TLS renegotiation or a ping timeout. It would be nice if you could fix this, as I expect further problems when duplicate-cn is not enabled: multiple people could then end up with the same common name and be disconnected because of that.
