Opened 7 years ago

Closed 7 years ago

#263 closed Bug / Defect (fixed)

OpenVPN 2.3.0 disconnect issue when using tcp-server

Reported by: pivot
Owned by:
Priority: major
Milestone:
Component: Networking
Version: OpenVPN 2.3.0 (Community Ed)
Severity: Not set (select this one, unless you're an OpenVPN developer)
Keywords: TCP, Crypto, Disconnect, Network
Cc:

Description

I am a long time user of OpenVPN and I recently upgraded to 2.3.0 on my server. For technical reasons I am bound to use OpenVPN over TCP which has worked great so far.

After the upgrade I started noticing that clients were randomly disconnected from time to time. The client log suggests there was a crypto error (see error-log.txt).

I have made the following observations:

  • When the server passes data at a faster pace than the network can handle, the client reports a crypto error and disconnects.
  • This only applies when the server is running 2.3.0. The client version does not matter.
  • This bug is present both in the precompiled package from the Arch repository and when compiling from source.
  • The bug is present regardless of whether OpenVPN is compiled with OpenSSL or PolarSSL support.
  • It does not matter which --cipher is used. (Tested with the default Blowfish and AES-256-CBC)
  • I have verified this behaviour with three different servers on different hardware, running Ubuntu 12.10, Ubuntu 11.10 and the latest release of Arch Linux.
  • Different (newly generated) certificate authorities were used in each setup with key lengths ranging from 1024 to 8192 bits.
  • Clients were running 2.3.0 on Windows 7, Ubuntu 12.10 and iOS. Also verified this behaviour on 2.2.2 on both Windows and Linux.
  • When switching to UDP I was unable to reproduce this bug.

I have attached a compressed folder of the client and server configurations, together with a set of certificates. I have also included a small PHP script that will UDP flood a specific IP address.

This bug can be reproduced as follows:

  • Extract bug.tar.gz
  • Use server.ovpn on the server and client.ovpn on the client. Use two separate computers; I was unable to reproduce the bug when running both the server and client on the same machine.
  • Run the included PHP script from the server side, causing the UDP flood to go over the VPN tunnel to the client machine.
  • The client usually disconnects within a couple of seconds.
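
For readers without the attachment, the traffic generator in the steps above can be approximated in C. This is only a rough sketch and not the PHP script from bug.tar.gz: the helper name `demo_send_udp_burst`, the 1400-byte payload cap, and the bounded datagram count are all assumptions made for illustration. It sends a fixed number of UDP datagrams to one IPv4 address and reports how many were handed to the kernel.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of OpenVPN or of the attached script):
 * send `count` UDP datagrams of `payload_len` bytes to ip:port.
 * Returns the number of datagrams sent, or -1 if setup fails. */
static long demo_send_udp_burst(const char *ip, uint16_t port,
                                long count, size_t payload_len)
{
    char payload[1400];            /* assumed cap, below a typical MTU */
    struct sockaddr_in dst;
    long sent = 0;
    int fd;

    if (payload_len > sizeof(payload))
        payload_len = sizeof(payload);
    memset(payload, 'x', sizeof(payload));

    /* Build the destination address; reject anything that is not IPv4. */
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Send as fast as the socket accepts data; stop on the first error. */
    for (long i = 0; i < count; i++) {
        if (sendto(fd, payload, payload_len, 0,
                   (const struct sockaddr *)&dst, sizeof(dst)) < 0)
            break;
        sent++;
    }

    close(fd);
    return sent;
}
```

Run from the server side against the client's address inside the VPN, a large enough `count` should push data through the tunnel faster than the TCP link can carry it, which is the condition described above.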

Attachments (2)

bug.tar.gz (9.7 KB) - added by pivot 7 years ago.
The configuration used to reproduce this bug
error-log.txt (3.3 KB) - added by pivot 7 years ago.
The error reported by a Windows client running 2.3.0.


Change History (7)

Changed 7 years ago by pivot

Attachment: bug.tar.gz added

The configuration used to reproduce this bug

Changed 7 years ago by pivot

Attachment: error-log.txt added

The error reported by a Windows client running 2.3.0.

comment:1 Changed 7 years ago by pivot

I've played around with this bug a little more and would like to add that it's also possible to disconnect other clients if the server has --client-to-client specified. In my case, since I run a layer 2 TAP network, it was also possible to disconnect all clients by flooding the broadcast address.

Some emails on the mailing list suggested that I should increase the broadcast buffer size. This made little or no difference. Disabling encryption (--auth none, --cipher none) made no difference either.

Last edited 7 years ago by pivot

comment:2 Changed 7 years ago by Gert Döring

The question that I find most interesting is: which version works, which version is broken?

If I read this right, it's in 2.2 as well as 2.3 - and since the changes from 2.1 to 2.2 are actually quite small, I assume that it sneaked in with one of the (many) 2.1.x subreleases...

comment:3 Changed 7 years ago by pivot

I did not find this bug in any 2.2 (or earlier) version. What I meant was that the client version doesn't matter as long as the server is running 2.3.0.

comment:4 Changed 7 years ago by pivot

Investigating this bug using git bisect shows that commit 4029971240b6274b9b30e76ff74c7f689d7d9750 is causing things to break.

comment:5 Changed 7 years ago by Gert Döring

Resolution: fixed
Status: new → closed

Found the bug: mbuf_len() in mbuf.h was declared as "bool mbuf_len()" but should be "unsigned int".
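
To illustrate why this return type matters, here is a minimal C sketch. It does not reproduce the OpenVPN source: the struct, the function names prefixed with `demo_`, and the limit check in the caller are assumptions for illustration only. The point is that converting a length to `bool` collapses every non-zero value to 1, so any caller comparing the result against a limit gets the wrong answer.

```c
#include <stdbool.h>

/* Hypothetical stand-in for OpenVPN's mbuf set; only the field needed
 * for this illustration is included. */
struct demo_mbuf_set {
    unsigned int len;   /* number of queued buffers */
};

/* Buggy variant: the bool return type turns any non-zero length into 1. */
static inline bool demo_mbuf_len_buggy(const struct demo_mbuf_set *ms)
{
    return ms->len;
}

/* Fixed variant: the queue length is returned unchanged. */
static inline unsigned int demo_mbuf_len_fixed(const struct demo_mbuf_set *ms)
{
    return ms->len;
}

/* Hypothetical caller: has the queue reached `limit` entries?
 * With the buggy length this is false for every limit above 1. */
static inline bool demo_queue_full_buggy(const struct demo_mbuf_set *ms,
                                         unsigned int limit)
{
    return demo_mbuf_len_buggy(ms) >= limit;
}

/* Same check using the fixed length function. */
static inline bool demo_queue_full_fixed(const struct demo_mbuf_set *ms,
                                         unsigned int limit)
{
    return demo_mbuf_len_fixed(ms) >= limit;
}
```

With 42 buffers queued and an assumed limit of 16, the buggy check reports that the queue is not full while the fixed check reports that it is.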

Fix verified by the original reporter, committed to the master and release/2.3 branches.

commit 0eb398501fab9c016b9b6008682c43873c4a6188 (master)
commit 80b4b1e740de60a7f94132ac4bebcd9474fbe182 (release/2.3)
