Opened 6 years ago

Closed 12 months ago

#528 closed Bug / Defect (fixed)

openvpn server on openbsd 5.7 segfaults when running for several days

Reported by: alexander.haensch
Owned by:
Priority: major
Milestone: release 2.4.1
Component: Crypto
Version: OpenVPN 2.3.6 (Community Ed)
Severity: Not set (select this one, unless you're an OpenVPN developer)
Keywords: openbsd
Cc: Steffan Karger

Description

I have a problem on OpenBSD 5.7, which introduces some changes to memcpy(): OpenVPN crashes regularly.

Backtraces have been posted to the mailing list, for example here:

http://marc.info/?l=openbsd-bugs&m=142540517429811&w=2

Change History (12)

comment:1 Changed 6 years ago by Gert Döring

Could you try with git master, please? If it's all "getaddrinfo() handling" (the link you posted is), the code in master is sufficiently different that this problem might already be fixed.

Of course we need to fix it in the 2.3 branch as well.

comment:2 Changed 5 years ago by alexander.haensch

So far I have had no chance to test the current master.

Backtraces are collected here:

http://marc.info/?l=openbsd-bugs&m=142538404619451&w=2

and here:

http://marc.info/?l=openbsd-bugs&m=142537361416085&w=2

The crash happens during the renegotiation of the connection, caused by an expiring key. Two logfiles are attached here:

http://marc.info/?l=openbsd-bugs&m=142479189811894&w=2

comment:3 Changed 5 years ago by Samuli Seppänen

Keywords: openbsd added
Milestone: release 2.4

comment:4 Changed 5 years ago by David Sommerseth

Is there any documentation on what OpenBSD has changed in memcpy()?

As far as I can see after a quick look at the backtraces, the first one is related to a memcpy() operation in x509_get_subject() [ssl_verify_openssl.c:291]. I also found the same trace in another mail in the same thread, so this has happened twice.

For the second one, it is related to a pem_password_callback() [ssl.c:340], but I don't quite see an obvious memcpy() operation - maybe strncpynt() is wrapping memcpy() through a macro on OpenBSD?

It's odd that two very different code paths suddenly bail out on a memcpy() operation in OpenVPN, and both paths are hit fairly often. It is not easy to see why this happens, so we need to find out whether memcpy() has changed, what the changes are, and how we can ensure the crash doesn't happen.

comment:5 Changed 5 years ago by alexander.haensch

I think it is described here: http://www.tedunangst.com/flak/post/OpenBSD-57-highlights
OpenBSD crashes the program if overlapping memory regions are involved.
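
For illustration only (this is not OpenVPN code): in standard C, memcpy() has undefined behavior when source and destination overlap, and memmove() is the defined alternative. The sketch below uses a hypothetical helper, drop_prefix(), to show a copy where the two regions necessarily overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from OpenVPN: remove the first `skip`
 * bytes of a buffer in place and return the new length.
 * Source (buf + skip) and destination (buf) overlap whenever
 * skip < len - skip, so memmove() is used; memcpy() would be undefined
 * behavior here. */
static size_t drop_prefix(char *buf, size_t len, size_t skip)
{
    assert(buf != NULL);
    if (skip >= len)
        return 0;                        /* nothing left after the prefix */
    memmove(buf, buf + skip, len - skip);
    return len - skip;
}
```

Calling drop_prefix() on the six bytes "abcdef" with skip set to 2 leaves "cdef" at the start of the buffer and returns 4.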

comment:6 Changed 5 years ago by alexander.haensch

I can add that this is still happening with OpenVPN 2.3.8 / LibreSSL 2.2.2. Additionally, I managed to compile OpenVPN against OpenSSL 1.0.1p, with the same result.
An OpenSSL crash dump will follow soon.

comment:7 Changed 5 years ago by alexander.haensch

Found the culprit.
OpenVPN 2.3+ without the compat-names option does not work correctly with spaces in the subject field. The routine slowly eats through RAM.
In the OpenBSD case, the system correctly crashes the program because OpenVPN overwrites memory it should not touch.

Adding compat-names to my config fixed the situation.

comment:8 Changed 5 years ago by Gert Döring

Cc: Steffan Karger added

Thanks for the pointer towards !compat-names - syzzer, do you feel like having a quick look?

comment:9 Changed 5 years ago by alexander.haensch

I was thinking that maybe https://www.qualys.com/2015/10/15/cve-2015-5333-cve-2015-5334/libressl-cve-2015-5333-cve-2015-5334.txt would help, but I think the behavior is the same, or it even crashes faster, with the fixes in LibreSSL.

comment:10 Changed 5 years ago by Steffan Karger

There are a number of things here.

(1) The memcpy() segfault that alexander experiences. It points clearly to the memcpy() call in x509_get_subject(), which will not get executed if you enable compat-names. However, that call uses freshly allocated memory (using gc_malloc(), so the memory *was* successfully allocated) as its destination, and memory returned by openssl/libressl's BIO module as its source. The addresses in the stack traces provided also seem to be too far apart to be overlapping (for valid x509 subjects). Perhaps libressl is returning invalid pointers? I have not seen such errors in OpenSSL builds, and *lots* of people are running openvpn without compat-names.
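
As a sketch of the "too far apart to be overlapping" reasoning in (1): two n-byte regions share a byte only if their start addresses are closer than n. The helper below is hypothetical and not part of OpenVPN; it takes the addresses as integers so that values read from a stack trace can be checked directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the byte ranges [a, a+n) and
 * [b, b+n) share at least one byte, 0 otherwise. */
static int ranges_overlap(uintptr_t a, uintptr_t b, size_t n)
{
    /* Precondition: neither range wraps around the address space. */
    assert(a <= UINTPTR_MAX - n && b <= UINTPTR_MAX - n);

    uintptr_t lo = a < b ? a : b;
    uintptr_t hi = a < b ? b : a;

    /* Equal-length ranges overlap exactly when the distance between
     * their start addresses is smaller than the length. */
    return n != 0 && (hi - lo) < n;
}
```

With a copy length of 16, start addresses 0x1000 and 0x1008 overlap, while 0x1000 and 0x1010 do not.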

(2) The trace from http://marc.info/?l=openbsd-bugs&m=142539655624422&w=2. This one is truly mysterious. This is the stack trace:

Program received signal SIGSEGV, Segmentation fault.
0x0000064fce25d090 in print_sockaddr () from /usr/local/sbin/openvpn
(gdb) bt
#0  0x0000064fce25d090 in print_sockaddr () from /usr/local/sbin/openvpn
#1  0x0000064fce25d611 in print_sockaddr () from /usr/local/sbin/openvpn
#2  0x0000064fce21dc5f in management_show_net_callback () from /usr/local/sbin/openvpn
#3  0x0000064fce21eb0d in management_show_net_callback () from /usr/local/sbin/openvpn
#4  0x0000064fce236dd4 in mroute_addr_hash_function () from /usr/local/sbin/openvpn
#5  0x0000064fce2093d1 in ?? () from /usr/local/sbin/openvpn
#6  0x0000000000000000 in ?? ()

But mroute_addr_hash_function() does not call management_show_net_callback(), nor does management_show_net_callback() call print_sockaddr():

void
management_show_net_callback (void *arg, const int msglevel)
{
#ifdef WIN32
  show_routes (msglevel);
  show_adapters (msglevel);
  msg (msglevel, "END");
#else
  msg (msglevel, "ERROR: Sorry, this command is currently only implemented on Windows");
#endif
}

So there's not much to go on here.

(3) The incorrect handling of sockaddr_in/sockaddr_in6 in socket.c : http://marc.info/?l=openbsd-bugs&m=142540313828879&w=2. This one reached us recently through a different path and was fixed by Gert in the release/2.3 branch (will be part of the next release): https://github.com/OpenVPN/openvpn/commit/cdbadd00.
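
To illustrate the general class of problem in (3), and explicitly not the code from the referenced commit: sockaddr_in and sockaddr_in6 have different sizes, so a copy has to use the length that matches the address family. The helper names below are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: size of the concrete structure behind a generic
 * sockaddr pointer, or 0 for families this sketch does not handle. */
static socklen_t sockaddr_len(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return sizeof(struct sockaddr_in);
    case AF_INET6:
        return sizeof(struct sockaddr_in6);
    default:
        return 0;
    }
}

/* Hypothetical helper: copy an address into storage large enough for
 * any family, using the family-specific length rather than a fixed one.
 * Returns 0 on success, -1 for an unsupported family. */
static int copy_sockaddr(struct sockaddr_storage *dst,
                         const struct sockaddr *src)
{
    assert(dst != NULL && src != NULL);

    socklen_t len = sockaddr_len(src);
    if (len == 0)
        return -1;

    memset(dst, 0, sizeof(*dst));
    memcpy(dst, src, len);   /* distinct objects, so no overlap */
    return 0;
}
```

Copying sizeof(struct sockaddr_in6) bytes out of a sockaddr_in would read past the end of the source, and copying a sockaddr_in6 into a sockaddr_in-sized destination would write past the end of the destination; selecting the length by family avoids both.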

comment:11 Changed 4 years ago by Gert Döring

Milestone: release 2.4 → release 2.4.1

I'm bumping this to 2.4.1 to point out that it won't make 2.4 (and I admit I had no time to investigate).

Alexander: could this be fixed already by the commit syzzer referenced? That was a bit sloppy with memory, so could have been the one.

comment:12 Changed 12 months ago by Gert Döring

Resolution: fixed
Status: new → closed

I think I'll close this. No feedback in 3 years, and I have not heard anything else from the OpenBSD camp since then (and we do test on OpenBSD 6.7 these days and it does not crash).

So maybe it truly was the sockaddr fix...
