Opened 6 months ago

Closed 6 months ago

Last modified 6 months ago

#1272 closed Bug / Defect (fixed)

One client kills other client session via false client floating

Reported by: kia0 Owned by: stipa
Priority: major Milestone: release 2.4.9
Component: Generic / unclassified Version: OpenVPN 2.4.8 (Community Ed)
Severity: Not set (select this one, unless your'e a OpenVPN developer) Keywords: client float peer_id reuse
Cc: tincantech

Description

One client can effectively stop VPN traffic of another client by 'client float' mechanism in case of reuse peer_id.

Steps to reproduce:

OpenVPN 2.4.8 process acting as server using tun interface and UDP. Client authentication is by SSL key/certificate, different for all clients. Option 'duplicate-cn' is not set. At least one client ("aggressor") uses single key/cert on two or more devices simultaneously, so the sessions preempts each other. Another client ("victim") connects at time between aggressor's first and second device and get reuse peer_id from the stale aggressor's session (you need a luck to have peer_id reused). When a packet from first aggressor's session came to the server, it floats the victim's session to first aggressor's and stop respond to victim. Victim can't get traffic from VPN and then drops the session by keep-alive timeout

As aggressor's sessions preempts each other continuously, it can randomly freezes/kills other client sessions

Here is scenario:

1) Aggressor connects from the first device, ip:port is a.b.c.d:xxx, authenticates using key1/cert1, get peer_id=123, have sessions A1

2) Aggressor connects from the second device ip:port=e.f.g.h:yyy, authenticates using same key1/cert1, get peer_id=234, get session A2 replacing A1. Server logged "new connection by client 'aggressor' will cause previous active sessions by this client to be dropped". Peer_id 123 is freed. The first device don't informed about this and still have A1 valid

3) Victim connects from ip:port=i.j.k.l:zzz, authenticates using key2/cert2 and occasionally get peer_id=123, get session V1

4) Aggressor's first device sends network traffic to VPN using session A1 which is still valid on the device. Data packet with peer_id=123 come to the server and is not dropped. The server think this packet is from V1 because it has this peer_id. Server erroneously floats V1 to A1 logging "peer 123 floated from i.j.k.l:zzz to a.b.c.d:xxx"

5) As session keys in A1 and V1 are different, server drops incoming data traffic from both and logging errors "local/remote TLS keys are out of sync" or somewhat similar.

6) Victim does not see keep-alive packets come back and drops V1 after configured timeout. First aggressor's device do the same for A1

7) First aggressor's device reconnects, makes a session A3, which preempts A2. The scenario repeats

8) Victim try to reconnect and can have a luck if peer_id got is different from 234

I think good result can be not to float peer by just peer_id or to just drop stale A1 packets on server at step 4

Attachments (2)

openvpn-ticket1272.log (30.7 KB) - added by kia0 6 months ago.
Example log
openvpn-server.conf (1.3 KB) - added by kia0 6 months ago.
Server config

Download all attachments as: .zip

Change History (12)

Changed 6 months ago by kia0

Attachment: openvpn-ticket1272.log added

Example log

comment:1 Changed 6 months ago by kia0

Example log from server, some data was masked for privacy.

"Aggressor" has CN=akuznetcov, he has a Windows notebook and an Android phone both connected to home WiFi?, public ip is 185.97.aa.bb. "Victim" is 'magma' with Windows notebook connected via mobile network, public ip is 188.170.cc.dd

Changed 6 months ago by kia0

Attachment: openvpn-server.conf added

Server config

comment:2 Changed 6 months ago by kia0

My server is under Oracle Linux 6 (clone of RHEL/CentOS 6), openvpn-2.4.8 from EPEL repository. 100+ clients

comment:3 Changed 6 months ago by Gert Döring

Mmmh, peer-id alone is not sufficient to kick out someone. You also need to have a valid session key for the current owner of this peer-id - so this surprises me a bit.

In other words, we do not "float on peer-id alone" already today.

comment:4 Changed 6 months ago by tincantech

Cc: tincantech added

comment:5 Changed 6 months ago by kia0

I have a network dump for this example and can share it privately if need

comment:6 Changed 6 months ago by stipa

Hi kia0,

I checked the log and it seems the problem is in those lines:

160	Wed Apr  8 16:16:36 2020 us=652681 Float requested for peer 115 to 185.97.aa.bb:13057
161	Wed Apr  8 16:16:36 2020 us=652710 Key [AF_INET]188.170.cc.dd:47533 [0] not initialized (yet), dropping packet.
162	Wed Apr  8 16:16:36 2020 us=652726 peer 115 (magma) floated from 188.170.cc.dd:47533 to [AF_INET]185.97.aa.bb:13057

What is happening here is that by some (*) reason key state got messed up

                    if (!ks->crypto_options.key_ctx_bi.initialized)
                    {
                        msg(D_MULTI_DROPPED,
                            "Key %s [%d] not initialized (yet), dropping packet.",
                            print_link_socket_actual(from, &gc), key_id);
                        goto error_lite;
                    }

so openvpn marked incoming packet as dropped by setting length to 0:

done:
    buf->len = 0;
    *opt = NULL;
    gc_free(&gc);
    return ret;

error:
    ++multi->n_soft_errors;
error_lite:
    tls_clear_error();
    goto done;

Despite packet length being set to 0, openvpn continues to process it and attempts to decrypt:

        /* authenticate and decrypt the incoming packet */
        decrypt_status = openvpn_decrypt(&c->c2.buf, c->c2.buffers->decrypt_buf,
                                         co, &c->c2.frame, ad_start);

Decrypt code returns "true" if packet length is 0:

bool
openvpn_decrypt(struct buffer *buf, struct buffer work,
                struct crypto_options *opt, const struct frame *frame,
                const uint8_t *ad_start)
{
    bool ret = false;

    if (buf->len > 0 && opt)
    {
        /* skip */
    }
    else
    {
        ret = true;
    }
    return ret;
}

which makes process_incoming_link_part1() return true, which leads to float being committed:

            if (process_incoming_link_part1(c, lsi, floated))
            {
                if (floated)
                {
                    multi_process_float(m, m->pending);
                }

                process_incoming_link_part2(c, lsi, orig_buf);
            }

If the key state (*) wouldn't be messed up, openvpn_decrypt() will obviously fail to decrypt a packet from attacker using victim's crypto state on the server.

The log doesn't exactly match scenario you've described. Here is what I see
(user A is akuznetcov, user B is magma):

  • user A from address 185.97.aa.bb:13057 establishes session A1, gets peer-id 115
  • rekeying for A1 happens hour later
  • user B establishes session B1, gets peer 33
  • user A establishes session A2, gets peer-id 138, kicks A1
  • user A closes client, terminating A2 (SIGTERM[soft,remote-exit] received, client-instance exiting)
  • user B from address 188.170.cc.dd:47533 establishes session B2, likely gets peer-id 115, and during handshake..
  • data packet arrives from A1 (address 185.97.aa.bb:13057)
  • data packet contains peer-id 115 and openvpn assumes that peer floated from 188.170.cc.dd:47533 to 185.97.aa.bb:13057

It is unclear for me what causes "key not initialized", probably a "rogue" packet has arrived in a very (un)fortunate moment between peer-id being already allocated and push request (which triggers data channel cipher initialization) hasn't yet been yet received.

Anyway, I believe the fix is simple:

diff --git a/src/openvpn/multi.c b/src/openvpn/multi.c
index da0aeeba..58892a87 100644
--- a/src/openvpn/multi.c
+++ b/src/openvpn/multi.c
@@ -2555,7 +2555,7 @@ multi_process_incoming_link(struct multi_context *m, struct multi_instance *inst
             orig_buf = c->c2.buf.data;
             if (process_incoming_link_part1(c, lsi, floated))
             {
-                if (floated)
+                if (floated && c->c2.buf.len > 0)
                 {
                     multi_process_float(m, m->pending);
                 }

This prevents float if packet marked as "dropped".

Would it be possible for you to apply this patch and check if it has fixed the problem?

Last edited 6 months ago by stipa (previous) (diff)

comment:7 Changed 6 months ago by stipa

Owner: set to stipa
Status: newaccepted

comment:9 Changed 6 months ago by stipa

Milestone: release 2.4.8release 2.4.9
Resolution: fixed
Status: acceptedclosed

comment:10 Changed 6 months ago by Gert Döring

commit 37bc691e7d26ea4eb61a8a434ebd7a9ae76225ab (master)
commit f7b318f811bb43c0d3aa7f337ec6242ed2c33881 (release/2.4)
Author: Lev Stipakov
Date: Wed Apr 15 10:30:17 2020 +0300

Fix illegal client float (CVE-2020-11810)

(just for reference)

Note: See TracTickets for help on using tickets.