Opened 13 years ago

Closed 10 years ago

#159 closed Bug / Defect (fixed)

Packet Dropped over routed tunnel (always the same packet)

Reported by: markc Owned by:
Priority: major Milestone:
Component: Networking Version: OpenVPN 2.2.2 (Community Ed)
Severity: Not set (select this one, unless you're an OpenVPN developer) Keywords: lost packet MSSQL tun
Cc: Gert Döring

Description

I have what I think is a pretty strange one here that is preventing some of my users from accessing MSSQL servers. I've done some packet captures in wireshark and tcpdump (and compared them to captures of working clients) and I can see, 100% of the time, which packet is getting 'eaten' before reaching the other end of the tunnel. I will provide what information I believe to be pertinent but please let me know what else you require and I'll be happy to get it for you.

Background/info:
We have a routed openvpn service (proto udp) set up for road warriors and as a whole it has been 100%. We've had few difficulties, and it is SO much better than what we had before (thanks!). For some reason though I'm not able to work with MSSQL servers, depending on which version of OpenVPN GUI is downloaded and installed from the community site.

My openvpn client on my machine is still "OpenVPN 2.2-beta5 i686-pc-mingw32 [SSL] [LZO2] [PKCS11] built on Nov 30 2010" on Win7 x64 Pro and I did NOT experience this issue I'm reporting (my MSSQL access works via openvpn).

However, I set a few new clients up (2 WinXP, 1 Win7, all x86) on 2.2.1 built ~Jul 1 2011 and they couldn't access MSSQL servers; everything else was fine (rdp, http, smb/cifs, etc.). Their MSSQL access still worked over our old PPTP vpn and on the LAN, and the issue over openvpn was resolved when I uninstalled 2.2.1 and installed the same 2.2-beta5 client that I had.

My openvpn server is running on Ubuntu 10.04.3 LTS (2.6.32-33-server #72-Ubuntu SMP Fri Jul 29 21:21:55 UTC 2011 x86_64 GNU/Linux).

...and its version is: OpenVPN 2.1.0 x86_64-pc-linux-gnu [SSL] [LZO2] [EPOLL] [PKCS11] [MH] [PF_INET6] [eurephia] built on Jul 20 2010

As you may know, Microsoft has the PortQueryUI tool to test basic connectivity to certain services. I don't have to run the captures on the actual end-user application starting up and failing, I only have to capture on PortQueryUI trying a basic MSSQL connection. Due to this...the capture is only 7 packets long (yay!). (and, reliably, if PortQueryUI is working, the problem is not present.)

I'm attaching the summary version of the capture from the client. The capture on the client is identical regardless of which version of the openvpn client is installed. It shows a few UDP packets as the SQL instance is located, and then a TCP transaction on 1433 testing connectivity to that instance (since TCP 1433 is the default, PortQueryUI will try connecting there regardless of whether the UDP part succeeds). On the server end, using tcpdump, packet #2 never shows up if the client is running 2.2.1 (and as a consequence packet #3 is not sent in response). So the capture on the openvpn server only sees the remaining 5 packets.

For clients running 2.2-beta5, the capture looks the same on the client and on the server (there are all 7 packets and things suddenly start working again).

I've looked at the openvpn logs on both ends and unfortunately, during the test portion, it goes from silence (around verb 3-5) to a tun-packet vomit session (around verb 6 and 7). I can post them if needed though.

Server conf:
dev tun0
server 172.19.0.0 255.255.255.0
user nobody
group nogroup
persist-key
persist-tun
proto udp
#socket-flags TCP_NODELAY
port 1196
comp-lzo adaptive
push "route 10.8.14.0 255.255.255.0"
push "route 10.8.15.0 255.255.255.0"
push "route 10.8.16.0 255.255.255.0"
push "dhcp-option DOMAIN mydomain.com"
push "dhcp-option DNS 10.8.14.30"
push "dhcp-option DNS 10.8.15.30"
push "redirect-gateway def1"
duplicate-cn
plugin /usr/lib/openvpn/openvpn-auth-pam.so /etc/pam.d/login
#client-to-client
keepalive 15 45
ping-timer-rem
dh /etc/openvpn/dh1024.pem
ca /etc/openvpn/ca.crt
cert /etc/openvpn/servername.crt
key /etc/openvpn/servername.key
status /etc/openvpn/status 30
status-version 1
log /var/log/openvpn.log
verb 3

Client conf:
dev tun0
client
remote servername.mydomain.com
proto udp
port 1196
ca ca.crt
cert cnname.crt
key cnname.key
ns-cert-type server
comp-lzo adaptive
route-method exe
float
nobind
auth-user-pass
#auth-nocache

Please let me know if you have any ideas or questions and thanks in advance!

Attachments (1)

prtqrybrief.txt (4.3 KB) - added by markc 13 years ago.
"short" version of capture data from client


Change History (6)

Changed 13 years ago by markc

Attachment: prtqrybrief.txt added

"short" version of capture data from client

comment:1 Changed 13 years ago by markc

Anyone?

I should probably mention (confess) that yes, I'm aware the attached packet capture was made on a LAN without the VPN, and that anyone paying close attention would notice that I merely changed the source IP to one that the VPN would use. The capture looks the same over the VPN; only the interfaces shown differ (as would be expected).

I chose to start with that because I'd have to take away a user's laptop for a while (again) to get the over-VPN capture. The only things that change are the MAC addresses, but if you need the actual capture over the VPN I can get it.

Thank you,
Mark

comment:2 Changed 11 years ago by Samuli Seppänen

Went through Git logs between v2.2-beta5 and v2.2.1. When CA, packaging, build, documentation etc. commits are removed, we're left with

  • Don't define ENABLE_PUSH_PEER_INFO if SSL is not available (4fe914a0)
  • Fixed bug in port-share that could cause port share process to crash (c7dd80cf)
  • Fix the --client-cert-not-required feature (272aef2f)
  • Implement IPv6 in TUN mode for Windows TAP driver (0265cf3a)

Out of these the last one would seem the logical culprit.

Has this been fixed, and if not, can it be reproduced with the latest OpenVPN release?

comment:3 Changed 11 years ago by Gert Döring

Cc: Gert Döring added

comment:4 Changed 11 years ago by Gert Döring

I seem to remember that we had a Windows TAP driver bug in the 2.2 series (the new code for IPv6 broke IPv4 under "some" circumstances, which was then fixed "some time later").

Maybe this is related.

Markc, could you please re-try with 2.3.1? We're sorry that this has not been acted upon for such a long time - but there's a good chance that this was already fixed.

comment:5 Changed 10 years ago by Gert Döring

Component: Generic / unclassified → Networking
Resolution: → fixed
Status: new → closed
Version: 2.2.2

Re-reading this, I'm fairly sure that it's this commit which went into the 2.3 train over two years ago...

commit 10b99726a30bb7252cb01806f5f276be7873e84e
Author: Gert Doering <gert@…>
Date: Thu Nov 10 20:15:44 2011 +0100

add missing break between "case IPv4" and "case IPv6", leading to the
minimum-size for IPv6 being applied to IPv4 packets, subsequently
leading to drop of small-sized IPv4 packets.


Bug found & fixed by Christian Niessner.
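The fall-through described in that commit message can be sketched in a few lines of C. This is a minimal illustration and not the actual TAP driver source: the function names `accept_buggy` and `accept_fixed` and the constants are hypothetical. The only assumptions are that the minimum IPv4 header is 20 bytes and the fixed IPv6 header is 40 bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; header sizes are the standard minimums. */
#define IPV4_MIN_HDR 20u  /* minimum IPv4 header length in bytes */
#define IPV6_HDR     40u  /* fixed IPv6 header length in bytes */

/* Buggy variant: without a break, the IPv4 case runs into the IPv6
 * case, so the IPv6 minimum is applied to IPv4 packets as well. */
static bool accept_buggy(const uint8_t *pkt, size_t len)
{
    size_t min_len = 0;

    if (len < 1)
        return false;

    switch (pkt[0] >> 4) {      /* IP version is the top nibble */
    case 4:
        min_len = IPV4_MIN_HDR;
        /* falls through - BUG: no break here */
    case 6:
        min_len = IPV6_HDR;
        break;
    default:
        return false;
    }
    return len >= min_len;
}

/* Fixed variant: each case keeps its own minimum size. */
static bool accept_fixed(const uint8_t *pkt, size_t len)
{
    size_t min_len = 0;

    if (len < 1)
        return false;

    switch (pkt[0] >> 4) {
    case 4:
        min_len = IPV4_MIN_HDR;
        break;
    case 6:
        min_len = IPV6_HDR;
        break;
    default:
        return false;
    }
    return len >= min_len;
}
```

With these definitions, a valid IPv4 packet shorter than 40 bytes (for example a 20-byte header, an 8-byte UDP header and a one-byte payload, 29 bytes in total) is rejected by `accept_buggy` and accepted by `accept_fixed`, while larger IPv4 packets pass both. That would be consistent with one small packet disappearing while all other traffic works.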

As there has been no feedback from the original reporter in over nine months, I'm now closing the bug and claiming it's fixed. Reopen if you can still reproduce it with 2.3.2.
