Opened 3 years ago
Last modified 3 years ago
#1434 closed Bug / Defect
Client-disconnect script env does not have common_name — at Version 11
Reported by: | tct | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | release 2.5.5 |
Component: | Generic / unclassified | Version: | OpenVPN git master branch (Community Ed) |
Severity: | Not set (select this one, unless your'e a OpenVPN developer) | Keywords: | |
Cc: |
Description (last modified by )
Under certain circumstances, --client-disconnect
has the wrong value for common_name
in the env
.
In this case --username-as-common-name
is in use and username
is not the same as common_name
.
Possibly Related to: #167
What actually takes place is as follows:
- Client connects normally - UDP -
username
is assigned tocommon_name
- Client renegotiates normally.
- Client disconnects normally.
This leads the server supplying common_name
not username
to --client-disconnect
, every time.
I ran a single standard server and a single standard client.
I used a simple script to monotonically dis/enable routing to the server, over a period of 10mins.
The client uses --reneg-sec 270
, so the client has always done a renegotiate when the routing becomes disabled. Then the server does --ping-timeout
for the client.
Then the server runs --client-disconnect
with the wrong value for common_name
, every time.
The server ran for ~20 hours and generated this error 130 times.
Change History (11)
comment:3 Changed 3 years ago by
This is a shorter log from client-connect to disconnect.
In this case only a float is seen not a renegotiation. Followed by a timeout.
I have removed some excess noise:
2021-10-18 20:33:55 us=744310 core_01_b/10.20.1.2:36433 MULTI_sva: pool returned IPv4=10.171.22.201, IPv6=(Not enabled) 2021-10-18 20:33:55 us=744480 core_01_b/10.20.1.2:36433 OPTIONS IMPORT: reading client specific options from: /etc/openvpn/server/v31a/ccd/core_01_b <EXOK> * EasyTLS-client-connect => vars loaded => dyn opts loaded => CN: core_01_b => conntrac: tallied => conntrac: registered => tls key serial: 638a99cd17661fcf1b8e41165f259d486dae443a9e3bf7879069f4ffcc35e7a5 => fixed_md_file loaded => metadata -> x509 serial match => Key is not locked to hwaddr => connection allowed => temp-files deleted 2021-10-18 20:33:55 us=766992 core_01_b/10.20.1.2:36433 OPTIONS IMPORT: reading client specific options from: /tmp/openvpn_cc_785fbac066fdc39090b1703aedf641e.tmp 2021-10-18 20:33:55 us=767238 core_01_b/10.20.1.2:36433 OPTIONS IMPORT: timers and/or timeouts modified 2021-10-18 20:33:55 us=767272 core_01_b/10.20.1.2:36433 MULTI: Learn: 10.171.22.201 -> core_01_b/10.20.1.2:36433 2021-10-18 20:33:55 us=767295 core_01_b/10.20.1.2:36433 MULTI: primary virtual IP for core_01_b/10.20.1.2:36433: 10.171.22.201 2021-10-18 20:33:55 us=767322 core_01_b/10.20.1.2:36433 Data Channel: using negotiated cipher 'CHACHA20-POLY1305' 2021-10-18 20:33:55 us=767356 core_01_b/10.20.1.2:36433 Data Channel MTU parms [ L:1535 D:1450 EF:35 EB:406 ET:0 EL:3 AF:14/122 ] 2021-10-18 20:33:55 us=767436 core_01_b/10.20.1.2:36433 Outgoing Data Channel: Cipher 'CHACHA20-POLY1305' initialized with 256 bit key 2021-10-18 20:33:55 us=767462 core_01_b/10.20.1.2:36433 Incoming Data Channel: Cipher 'CHACHA20-POLY1305' initialized with 256 bit key 2021-10-18 20:33:55 us=767535 core_01_b/10.20.1.2:36433 SENT CONTROL [core_01_b]: 'PUSH_REPLY,topology subnet,route-gateway 10.171.22.1,route 10.171.22.1,route 10.66.67.101,comp-lzo no,explicit-exit-notify 1,route 10.66.67.101,ping 10,ping-restart 150,ifconfig 10.171.22.201 255.255.255.0,peer-id 0,cipher CHACHA20-POLY1305,key-derivation tls-ekm' (status=1) 2021-10-18 20:34:54 us=268173 Float requested for peer 0 to 10.20.2.2:36433 2021-10-18 20:34:54 us=268315 peer 0 (core_01_b) floated from 10.20.1.2:36433 to [AF_INET6]::ffff:10.20.2.2:36433 2021-10-18 20:37:11 us=19054 Control Channel: using tls-crypt-v2 key 2021-10-18 20:37:11 us=19146 Outgoing Control Channel Encryption: Cipher 'AES-256-CTR' initialized with 256 bit key 2021-10-18 20:37:11 us=19175 Outgoing Control Channel Encryption: Using 256 bit message hash 'SHA256' for HMAC authentication 2021-10-18 20:37:11 us=19193 Incoming Control Channel Encryption: Cipher 'AES-256-CTR' initialized with 256 bit key 2021-10-18 20:37:11 us=19214 Incoming Control Channel Encryption: Using 256 bit message hash 'SHA256' for HMAC authentication 2021-10-18 20:37:48 us=434569 core_01_b/10.20.2.2:36433 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity) 2021-10-18 20:37:48 us=434652 core_01_b/10.20.2.2:36433 TLS Error: TLS handshake failed 2021-10-18 20:37:48 us=434687 core_01_b/10.20.2.2:36433 TLS: move_session: dest=TM_LAME_DUCK src=TM_ACTIVE reinit_src=1 2021-10-18 20:39:03 us=457918 core_01_b/10.20.2.2:36433 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity) 2021-10-18 20:39:03 us=458029 core_01_b/10.20.2.2:36433 TLS Error: TLS handshake failed 2021-10-18 20:39:16 us=526467 10.20.2.2:41196 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity) 2021-10-18 20:39:16 us=526562 10.20.2.2:41196 TLS Error: TLS handshake failed 2021-10-18 20:39:16 us=526818 10.20.2.2:41196 SIGUSR1[soft,tls-error] received, client-instance restarting 2021-10-18 20:39:58 us=222685 core_01_b/10.20.2.2:36433 [UNDEF] Inactivity timeout (--ping-restart), restarting 2021-10-18 20:39:58 us=222783 core_01_b/10.20.2.2:36433 SIGUSR1[soft,ping-restart] received, client-instance restarting <EXOK> * EasyTLS-client-disconnect => vars loaded => CN: => conntrac: acquire_lock => conntrac: tallied => conntrac pattern: 638a99cd17661fcf1b8e41165f259d486dae443a9e3bf7879069f4ffcc35e7a5=817D664B73EF39A1A788E84E685DC5EB====10.171.22.201 => conntrac: record not found: 638a99cd17661fcf1b8e41165f259d486dae443a9e3bf7879069f4ffcc35e7a5=817D664B73EF39A1A788E84E685DC5EB====10.171.22.201 => conntrac: release_lock => conn_trac_disconnect FAIL => conntrac: acquire_lock => conntrac: tallied => conntrac pattern: 638a99cd17661fcf1b8e41165f259d486dae443a9e3bf7879069f4ffcc35e7a5=817D664B73EF39A1A788E84E685DC5EB==core_01_b==10.171.22.201 => conntrac: unregistered => conntrac: RESET => conntrac: release_lock => conn_trac_disconnect recovered => disconnect completed
comment:4 follow-up: 5 Changed 3 years ago by
JJK's patch in #160 doesn't look right --- its trying to get the common name from the tls session and set it in env each time the script is called. But the tls session may be closed/reset when the disconnect script is executed. In that case there is no common_name in the session as its already wiped by then. And, setting an env var as null is same as unsetting it. You could get "UNDEF" in there if tls_common_name() is called with "false" but that doesn't help.
I think one approach to ensure that the "current" value of common-name is in the env would be to just set it in env right away whenever it changes --- possibly this could be done in set_common_name() itself. Then it is there when scripts are called.
Note that this does not mean the common_name will equal the username at all points in time. cert_verify() sets it to the CN in the certificate and is replaced by username (if --username-as-common_name is in use) only when verify_user_pass() succeeds.
comment:5 Changed 3 years ago by
JJK's patch does not work and masks the problem further, I include the details only for reference.
Also, thanks for the verify_auth_user_pass()
tip! :)
comment:6 Changed 3 years ago by
I ran a similar test for JJK's patch and generated 44 errors in 16hours.
The test made it clear that JJK's patch will not set common_name
at all after the client renegotiates and then times out. If the client disconnects using --explicit-exit-notify
then JJK's patch works, as expected.
The reason to record these findings it because I have thoroughly tested the results and these are the steps required to reproduce the issues.
Summary:
- Without JJK's patch, all that is required to trigger the original problem is for the client to renegotiate and then disconnect. The disconnect can be a time out or explicit, the problem always occurs, regardless.
- With JJK's patch, the client must renegotiate and then time out to trigger the second problem.
For absolute clarity:
- The client must renegotiate in all circumstances.
- I do not believe floating has any effect on the original issue here.
However, float probably has some effect, if JJK's patch is applied.
comment:7 Changed 3 years ago by
If anybody simply wants a solution then use Easy-TLS.
easytls-client-disconnect.sh
has the following code, which catches and corrects for both errors (and others) above.
conntrac_record
uses the current common_name
, where as conntrac_alt_record
uses username
explicitly.
# Disconnect common_name conn_trac_disconnect "${conntrac_record}" "${EASYTLS_CONN_TRAC}" || { case $? in 2) # Not fatal because errors are expected #160 update_status "conn_trac_disconnect FAIL" conntrac_fail=1 log_env=1 ;; 1) # Fatal because these are usage errors [ $FATAL_CONN_TRAC ] && { ENABLE_KILL_PPID=1 die "CONNTRAC_DISCONNECT_FILE_ERROR" 99 } update_status "conn_trac_disconnect ERROR" conntrac_error=1 log_env=1 ;; *) # Absolutely fatal ENABLE_KILL_PPID=1 die "CONNTRAC_DISCONNECT_UNKNOWN" 98 ;; esac } # If the first failed for number two then try again .. if [ $conntrac_fail ] then # Disconnect username conn_trac_disconnect "${conntrac_alt_rec}" "${EASYTLS_CONN_TRAC}" || { case $? in 2) # fatal later - because errors could happen #160 update_status "conn_trac_disconnect A-FAIL" conntrac_alt_fail=1 log_env=1 ;; 1) # Fatal because these are usage errors [ $FATAL_CONN_TRAC ] && { ENABLE_KILL_PPID=1 die "CONNTRAC_DISCONNECT_ALT_FILE_ERROR" 99 } update_status "conn_trac_disconnect A-ERROR" conntrac_alt_error=1 log_env=1 ;; *) # Absolutely fatal ENABLE_KILL_PPID=1 die "CONNTRAC_DISCONNECT_UNKNOWN" 98 ;; esac } fi
comment:8 follow-up: 10 Changed 3 years ago by
Milestone: | → release 2.6 |
---|---|
Version: | → OpenVPN git master branch (Community Ed) |
Thanks for making this so easy to reproduce.
Can we close the #160 bug then (because that ticket is way too long now :-) )?
... and let's see if we can find the right spot in our code before 2.6.0 to fix this.
comment:9 Changed 3 years ago by
Description: | modified (diff) |
---|
comment:10 Changed 3 years ago by
I am hoisted by my own petard!
I would prefer that people use Easy-TLS to solve the problem ;-)
Phase 3: Profit!
comment:11 Changed 3 years ago by
Description: | modified (diff) |
---|
If JJK's patch is applied then the behaviour becomes slightly different but the problem is not resolved.
In this case, the client does the following:
username
is assigned tocommon_name
Float normally.Not requirednormallyby time-out only.This leads to the server not supplying any value to
common_name
for--client-disconnect
.