Opened 9 years ago
Closed 4 years ago
#528 closed Bug / Defect (fixed)
openvpn server on openbsd 5.7 segfaults when running for several days
Reported by: | alexander.haensch | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | release 2.4.1 |
Component: | Crypto | Version: | OpenVPN 2.3.6 (Community Ed) |
Severity: | Not set (select this one, unless you're an OpenVPN developer) | Keywords: | openbsd |
Cc: | Steffan Karger |
Description
I have a problem on OpenBSD 5.7, which introduced some changes to memcpy(): openvpn crashes regularly.
Backtraces are posted to the mailing list for example here:
Change History (12)
comment:1 Changed 9 years ago by
comment:2 Changed 9 years ago by
Until now I have had no chance to test the current master.
Backtraces are collected here:
http://marc.info/?l=openbsd-bugs&m=142538404619451&w=2
and here:
http://marc.info/?l=openbsd-bugs&m=142537361416085&w=2
The crash happens during the renegotiation of the connection, caused by an expiring key. Two logfiles are attached here:
comment:3 Changed 9 years ago by
Keywords: | openbsd added |
---|---|
Milestone: | → release 2.4 |
comment:4 Changed 9 years ago by
Is there some documentation on what OpenBSD has changed in memcpy()?
As far as I can see after a quick look at the backtraces, the first one is related to a memcpy() operation in x509_get_subject() [ssl_verify_openssl.c:291]. I also found this in another mail in the same thread, so it has happened twice.
For the second one, it is related to a pem_password_callback() [ssl.c:340], but I don't quite see an obvious memcpy() operation - maybe strncpynt() is wrapping memcpy() through a macro on OpenBSD?
It's odd that two very different code paths suddenly bail out on a memcpy() operation in OpenVPN, and both paths are hit fairly often. It is not easy to understand why this happens, so we need to find out whether memcpy() has changed, what the changes are, and how we can ensure the crash doesn't happen.
comment:5 Changed 9 years ago by
I think it is described here: http://www.tedunangst.com/flak/post/OpenBSD-57-highlights
OpenBSD crashes the program if overlapping memory regions are involved.
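To illustrate the kind of call this refers to: the C standard leaves memcpy() undefined when source and destination overlap, while memmove() is defined for that case. The following is only a minimal sketch, not OpenVPN or OpenBSD code; `regions_overlap` and `copy_checked` are hypothetical helper names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if [a, a+len) and [b, b+len) share at least one byte.
 * Comparing addresses as integers is implementation-defined, but it
 * behaves as expected on common flat-memory platforms. */
static int regions_overlap(const void *a, const void *b, size_t len)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t lo = pa < pb ? pa : pb;
    uintptr_t hi = pa < pb ? pb : pa;

    return len > 0 && (hi - lo) < len;
}

/* Hypothetical helper: copies len bytes, falling back to memmove()
 * when the two regions overlap, since memcpy() is undefined there. */
static void *copy_checked(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    if (regions_overlap(dst, src, len))
        return memmove(dst, src, len);
    return memcpy(dst, src, len);
}
```

With a buffer holding "abcdefgh", copy_checked(buf + 2, buf, 4) takes the memmove() path, because source and destination are only 2 bytes apart for a 4-byte copy.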
comment:6 Changed 9 years ago by
I can add that this is still happening with openvpn 2.3.8 / libressl 2.2.2. Additionally, I managed to compile openvpn against openssl 1.0.1p, with the same result.
An openssl crash dump will follow soon.
comment:7 Changed 9 years ago by
Found the culprit.
The thing is that openvpn 2.3+ without the compat-names option does not work correctly with spaces in the subject field. The routine slowly eats through RAM.
In the OpenBSD case, the kernel is correctly crashing the program because openvpn overwrites the wrong memory.
Adding compat-names to my config fixed the situation.
comment:8 Changed 8 years ago by
Cc: | Steffan Karger added |
---|
thanks for the pointer towards !compat-names - syzzer, do you feel like having a quick look?
comment:9 Changed 8 years ago by
I was thinking that maybe https://www.qualys.com/2015/10/15/cve-2015-5333-cve-2015-5334/libressl-cve-2015-5333-cve-2015-5334.txt would help, but I think the behaviour is the same, or it even crashes faster, with the fixes in libressl.
comment:10 Changed 8 years ago by
There's a number of things here.
(1) The memcpy() segfault that alexander experiences. It points clearly to the memcpy() call in x509_get_subject(), which will not get executed if you enable compat-names. However, that call uses freshly allocated memory (using gc_malloc(), so the memory *was* successfully allocated) as its destination, and memory returned by openssl/libressl's BIO module as its source. The addresses in the stack traces provided also seem to be too far apart to be overlapping (for valid x509 subjects). Perhaps libressl is returning invalid pointers? I have not seen such errors in OpenSSL builds, and *lots* of people are running openvpn without compat-names.
(2) The trace from http://marc.info/?l=openbsd-bugs&m=142539655624422&w=2. This one is truly mysterious. This is the stack trace:
    Program received signal SIGSEGV, Segmentation fault.
    0x0000064fce25d090 in print_sockaddr () from /usr/local/sbin/openvpn
    (gdb) bt
    #0 0x0000064fce25d090 in print_sockaddr () from /usr/local/sbin/openvpn
    #1 0x0000064fce25d611 in print_sockaddr () from /usr/local/sbin/openvpn
    #2 0x0000064fce21dc5f in management_show_net_callback () from /usr/local/sbin/openvpn
    #3 0x0000064fce21eb0d in management_show_net_callback () from /usr/local/sbin/openvpn
    #4 0x0000064fce236dd4 in mroute_addr_hash_function () from /usr/local/sbin/openvpn
    #5 0x0000064fce2093d1 in ?? () from /usr/local/sbin/openvpn
    #6 0x0000000000000000 in ?? ()
But mroute_addr_hash_function() does not call management_show_net_callback(), nor does management_show_net_callback() call print_sockaddr():
    void
    management_show_net_callback (void *arg, const int msglevel)
    {
    #ifdef WIN32
      show_routes (msglevel);
      show_adapters (msglevel);
      msg (msglevel, "END");
    #else
      msg (msglevel, "ERROR: Sorry, this command is currently only implemented on Windows");
    #endif
    }
So there's not much to go on here.
(3) The incorrect handling of sockaddr_in/sockaddr_in6 in socket.c : http://marc.info/?l=openbsd-bugs&m=142540313828879&w=2. This one reached us recently through a different path and was fixed by Gert in the release/2.3 branch (will be part of the next release): https://github.com/OpenVPN/openvpn/commit/cdbadd00.
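For readers unfamiliar with this class of problem: sockaddr_in and sockaddr_in6 have different sizes, so code that copies or inspects an address has to pick the length from the address family. The sketch below is a generic illustration only and does not describe what the referenced commit actually changed; `sockaddr_len_for_family` and `copy_sockaddr` are hypothetical names, not functions from socket.c.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: size of the concrete structure for a family,
 * or 0 if the family is not handled here. */
static socklen_t sockaddr_len_for_family(int family)
{
    switch (family) {
    case AF_INET:
        return (socklen_t)sizeof(struct sockaddr_in);
    case AF_INET6:
        return (socklen_t)sizeof(struct sockaddr_in6);
    default:
        return 0;
    }
}

/* Hypothetical helper: copies an IPv4 or IPv6 address into storage that
 * is large enough for either, using the family-specific length.
 * Returns 0 on success, -1 for an unsupported family. */
static int copy_sockaddr(struct sockaddr_storage *dst,
                         const struct sockaddr *src)
{
    socklen_t len;

    assert(dst != NULL && src != NULL);

    len = sockaddr_len_for_family(src->sa_family);
    if (len == 0)
        return -1;

    memset(dst, 0, sizeof(*dst));
    memcpy(dst, src, len);
    return 0;
}
```

Using sockaddr_storage as the destination avoids writing a sockaddr_in6 into space sized for a sockaddr_in, which is one way such handling can go wrong.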
comment:11 Changed 7 years ago by
Milestone: | release 2.4 → release 2.4.1 |
---|
I'm bumping this to 2.4.1 to point out that it won't make 2.4 (and I admit I had no time to investigate).
Alexander: could this be fixed already by the commit syzzer referenced? That was a bit sloppy with memory, so could have been the one.
comment:12 Changed 4 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
I think I'll close this. No feedback in 3 years, and I have not heard anything else from the OpenBSD camp since then (and we do test on OpenBSD 6.7 these days and it does not crash).
So maybe it truly was the sockaddr fix...
Could you try with git master, please? If it's all "getaddrinfo() handling" (the link you posted is), the code in master is sufficiently different that this problem might already be fixed.
Of course we need to fix it in the 2.3 branch as well.