Opened 2 years ago

Last modified 6 months ago

#949 new TODO (General task list)

Forward Error Correction for OpenVPN

Reported by: wangyucn Owned by:
Priority: major Milestone:
Component: Generic / unclassified Version:
Severity: Not set (select this one, unless you're an OpenVPN developer) Keywords: FEC, rs code, forward error correction
Cc:

Description

As discussed in this thread, FEC can improve the connection quality on a lossy link:

https://forums.openvpn.net/viewtopic.php?f=10&t=14395

I have implemented the feature and ran some tests. The algorithm used for FEC is Reed-Solomon.

Test Result

environment

OpenVPN client running on a single-core VPS in Tokyo with 512 MB RAM

OpenVPN server running on a single-core VPS in Los Angeles with 128 MB RAM.

The network round trip between the two machines is about 110~120 ms. A simulated packet loss of 10% is introduced in both directions.

The parameter for FEC is 20:10, meaning 10 redundant packets are sent for every 20 original packets. This costs about 1.5 times the bandwidth of the original data stream. (A packet loss of 10% is actually very high; for lower packet loss you can of course reduce the FEC parameter and use less bandwidth.)
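As a side note, you can estimate how often a whole 20:10 block becomes unrecoverable under independent 10% loss with a short calculation (an illustrative sketch only, not part of the implementation; it assumes losses are independent, which real links may violate):

```python
from math import comb

def block_loss_prob(n, k, p):
    """Probability that an (n, k) erasure-coded block cannot be recovered,
    i.e. more than n - k of its n packets are lost, assuming independent
    per-packet loss probability p."""
    r = n - k  # redundant packets; up to r losses are recoverable
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(r + 1, n + 1))

# 20 data + 10 redundant packets at 10% loss: block failure is rare.
print(f"{block_loss_prob(30, 20, 0.10):.4%}")
```

Even at 10% loss, the chance of losing more than 10 of the 30 packets in a block (and therefore failing to recover it) is well below 0.1%.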

SCP TCP single thread copy test

OpenVPN without FEC

$ scp 0.test.file 10.222.2.1:~                                                                                                                                                              
root@10.222.2.1's password:
0.test.file                                                   0% 3600KB  29.5KB/s 3:25:38 ETA

OpenVPN with FEC

$ scp 0.test.file 10.222.2.1:~
root@10.222.2.1's password:
0.test.file                                    45%  162MB   3.6MB/s   00:55 ETA

ping packet loss test

OpenVPN without FEC

$ ping 10.222.2.1 -O
PING 10.222.2.1 (10.222.2.1) 56(84) bytes of data.
64 bytes from 10.222.2.1: icmp_seq=1 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=2 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=3 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=4 ttl=64 time=118 ms
no answer yet for icmp_seq=5
64 bytes from 10.222.2.1: icmp_seq=6 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=7 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=8 ttl=64 time=118 ms
no answer yet for icmp_seq=9
64 bytes from 10.222.2.1: icmp_seq=10 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=11 ttl=64 time=118 ms
no answer yet for icmp_seq=12
64 bytes from 10.222.2.1: icmp_seq=13 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=14 ttl=64 time=118 ms
no answer yet for icmp_seq=15
64 bytes from 10.222.2.1: icmp_seq=16 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=17 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=18 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=19 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=20 ttl=64 time=118 ms
no answer yet for icmp_seq=21
64 bytes from 10.222.2.1: icmp_seq=22 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=23 ttl=64 time=118 ms
no answer yet for icmp_seq=24
64 bytes from 10.222.2.1: icmp_seq=25 ttl=64 time=119 ms
64 bytes from 10.222.2.1: icmp_seq=26 ttl=64 time=118 ms
no answer yet for icmp_seq=27
64 bytes from 10.222.2.1: icmp_seq=28 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=29 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=30 ttl=64 time=119 ms
64 bytes from 10.222.2.1: icmp_seq=31 ttl=64 time=118 ms
no answer yet for icmp_seq=32
64 bytes from 10.222.2.1: icmp_seq=33 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=34 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=35 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=36 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=37 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=38 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=39 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=40 ttl=64 time=118 ms
64 bytes from 10.222.2.1: icmp_seq=41 ttl=64 time=118 ms
no answer yet for icmp_seq=42
no answer yet for icmp_seq=43
64 bytes from 10.222.2.1: icmp_seq=44 ttl=64 time=118 ms
^C
--- 10.222.2.1 ping statistics ---
44 packets transmitted, 34 received, 22% packet loss, time 43038ms
rtt min/avg/max/mdev = 118.289/118.712/119.959/0.530 ms

OpenVPN with FEC

$ ping 10.222.2.1 -O
PING 10.222.2.1 (10.222.2.1) 56(84) bytes of data.
64 bytes from 10.222.2.1: icmp_seq=1 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=2 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=3 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=4 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=5 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=6 ttl=64 time=129 ms
64 bytes from 10.222.2.1: icmp_seq=7 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=8 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=9 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=10 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=11 ttl=64 time=122 ms
64 bytes from 10.222.2.1: icmp_seq=12 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=13 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=14 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=15 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=16 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=17 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=18 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=19 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=20 ttl=64 time=129 ms
64 bytes from 10.222.2.1: icmp_seq=21 ttl=64 time=120 ms
64 bytes from 10.222.2.1: icmp_seq=22 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=23 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=24 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=25 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=26 ttl=64 time=137 ms
64 bytes from 10.222.2.1: icmp_seq=27 ttl=64 time=120 ms
64 bytes from 10.222.2.1: icmp_seq=28 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=29 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=30 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=31 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=32 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=33 ttl=64 time=120 ms
64 bytes from 10.222.2.1: icmp_seq=34 ttl=64 time=122 ms
64 bytes from 10.222.2.1: icmp_seq=35 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=36 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=37 ttl=64 time=121 ms
64 bytes from 10.222.2.1: icmp_seq=38 ttl=64 time=129 ms
64 bytes from 10.222.2.1: icmp_seq=39 ttl=64 time=120 ms
64 bytes from 10.222.2.1: icmp_seq=40 ttl=64 time=129 ms
64 bytes from 10.222.2.1: icmp_seq=41 ttl=64 time=129 ms
^C
--- 10.222.2.1 ping statistics ---
41 packets transmitted, 41 received, 0% packet loss, time 40046ms
rtt min/avg/max/mdev = 120.908/122.719/137.724/3.535 ms

summary

It looks like FEC does improve the connection quality on a lossy link.

I hope this feature can be integrated into OpenVPN. I would like to know what the developer team thinks about it.

If the feature is acceptable, I can try to prepare a patch.

Change History (12)

comment:1 Changed 2 years ago by Gert Döring

This is interesting, and useful. Whether or not we can integrate it depends a bit on how it is done.

Technically, how are you doing it, without increasing latency very much? This is what worried me most when considering it, that it would cause too much overhead and much extra latency.

comment:2 Changed 2 years ago by wangyucn

As you can see, there is indeed some extra latency, about 10 ms at most.

The FEC encoder tries to fill its buffer with incoming packets before encoding; once the buffer is full, the FEC is computed and the data is sent immediately. There is also a timer of 5 ms (which is configurable): FEC is still performed when the timer expires, even if not enough data has been collected. The 10 ms mentioned above is just 5 ms + 5 ms.

There is also an optimization: if a packet is not lost, no latency is introduced for that packet at all (no latency in theory; in practice there is still a small latency of about 2~3 ms).
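In pseudocode terms, the fill-or-timeout batching described above looks roughly like this (a simplified sketch, not the actual patch; the class and parameter names are made up, and a real implementation would hook the deadline into the event loop instead of polling):

```python
import time

class FecBatcher:
    """Collect up to k packets, then encode; a per-batch deadline forces
    encoding after `timeout` seconds even if the buffer is short."""

    def __init__(self, k=20, timeout=0.005, encode=None):
        self.k, self.timeout = k, timeout
        # Placeholder encoder; a real one would append redundant packets.
        self.encode = encode or (lambda pkts: list(pkts))
        self.buf, self.deadline = [], None

    def push(self, pkt, now=None):
        """Add a packet; returns the encoded batch if the buffer filled."""
        now = time.monotonic() if now is None else now
        if not self.buf:
            self.deadline = now + self.timeout  # timer starts on first packet
        self.buf.append(pkt)
        if len(self.buf) >= self.k:
            return self._flush()
        return None

    def poll(self, now=None):
        """Call periodically; flushes a short batch if the timer expired."""
        now = time.monotonic() if now is None else now
        if self.buf and now >= self.deadline:
            return self._flush()
        return None

    def _flush(self):
        out, self.buf, self.deadline = self.encode(self.buf), [], None
        return out
```

With a 5 ms timeout on each side, the worst case is the 5 ms + 5 ms mentioned above; at high packet rates the buffer fills before the timer fires, so the added latency shrinks with bandwidth.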

This is what worried me most when considering it, that it would cause too much overhead and much extra latency.

The overhead is okay with 5 ms of encode latency. With the FEC parameter of "10 redundant packets for every 20 original packets", I can get a 1.2 MByte/s transfer rate out of 20 Mbps of bandwidth.
You can tune the timeout down at higher bandwidth; the higher the bandwidth, the lower the latency can be.

Last edited 2 years ago by wangyucn (previous) (diff)

comment:3 Changed 2 years ago by Steffan Karger

This is definitely interesting. A question, though: wouldn't it be more useful to implement this as a generic UDP proxy? That way it could be used for any UDP-based protocol (and we would not have to maintain the code ;-) ).

comment:4 in reply to:  3 Changed 2 years ago by wangyucn

Replying to syzzer:

This is definitely interesting. A question, though: wouldn't it be more useful to implement this as a generic UDP proxy? That way it could be used for any UDP-based protocol (and we would not have to maintain the code ;-) ).

It has already been made as a UDP proxy. A UDP proxy only works for UDP, but OpenVPN with FEC works for all traffic. You can, however, use the UDP proxy with OpenVPN like this:

OpenVPN client-->UDP-proxy client-------------->UDP-proxy server-->OpenVPN server.

That is not convenient enough. Furthermore, with the redirect-gateway option, OpenVPN may hijack the UDP proxy's own traffic; you have to add a route exception to solve this, which is a bit tricky for a normal user to understand.

Another reason is that OpenVPN itself can be regarded as an "all-traffic proxy", so integrating FEC into OpenVPN is more generic.

and we would not have to maintain the code ;-)

The FEC feature makes OpenVPN more competitive while introducing some extra maintenance cost. Whether it can be integrated or not is up to you, dear core team ;-)

==update==
Added the hijack problem.

Last edited 2 years ago by wangyucn (previous) (diff)

comment:5 Changed 2 years ago by tincantech

CC

comment:6 Changed 17 months ago by Antonio

Does this feature use an already existing library, or does it implement all the FEC logic in OpenVPN?

When you say "The parameter for FEC is 20:10, that means sending 10 redundant packets for every 20 original packets", what technique have you used to forge the 10 redundant packets? I think this is also a good case where network coding might come in handy.

However, I personally think this is a clear case of a feature that could be implemented as a standalone client/server so that it could also be re-used by other software (like Steffan said).
Especially because it comes with its own set of knobs and OpenVPN has already too many. It would be nice if parameters could be chosen automatically, because the user rarely knows what's the actual packet loss on a link and it is rarely steady.

On top of that, what is the actual gain with lower (i.e. realistic) packet loss? TCP (inside the tunnel) will take care of recovering packets, therefore I don't expect scp to collapse like that, no?

comment:7 in reply to:  6 ; Changed 17 months ago by wangyucn

Replying to Antonio:

Does this feature use an already existing library? or does it implement all the FEC logic in OpenVPN?

"Implement all the FEC logic in OpenVPN" is my goal; no external library is needed. It should be easy to maintain.

But it is not done (or even started) yet; when this ticket was created, the test results were produced with OpenVPN + UDP proxy.

When you say "The parameter for FEC is 20:10, that means sending 10 redundant packets for every 20 original packets", what technique have you used to forge the 10 redundant packets?

Reed-Solomon.

I think this is also a good case where network coding might come in handy.

Erasure code.
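For anyone unfamiliar with erasure codes: the simplest one is a single XOR parity packet, which can rebuild any one lost packet in a group; Reed-Solomon generalizes this to multiple redundant packets using GF(256) arithmetic. A minimal sketch of the single-parity case (illustration only, not the patch code; it assumes equal-length packets):

```python
def xor_parity(packets):
    """Build one redundant packet: the byte-wise XOR of k equal-length
    packets. This is the r = 1 special case of an erasure code."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, byte in enumerate(pkt):
            parity[i] ^= byte
    return bytes(parity)

def recover(received, parity):
    """Rebuild the single missing packet (the None entry) by XOR-ing the
    parity with every packet that did arrive."""
    missing = bytearray(parity)
    for pkt in received:
        if pkt is not None:
            for i, byte in enumerate(pkt):
                missing[i] ^= byte
    return bytes(missing)
```

Reed-Solomon plays the same trick with r > 1 redundant packets, so any r losses out of the k + r sent can be recovered.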

However, I personally think this is a clear case of a feature that could be implemented as a standalone client/server so that it could also be re-used by other software (like Steffan said).

Sounds like this is just an excuse to avoid the work involved...

Built-in FEC performs much better (no extra system calls) and is much easier to use.

Talking about reuse: OpenVPN with built-in FEC can also be reused by other software, and by even more software...

Especially because it comes with its own set of knobs and OpenVPN has already too many.

Sounds like this is the real reason.

It would be nice if parameters could be chosen automatically, because the user rarely knows what's the actual packet loss on a link and it is rarely steady.

I implemented a mechanism to change parameters without losing the connection, so you can implement your own policy to adjust them dynamically.

On top of that, what is the actual gain with lower (i.e. realistic) packet loss? TCP (inside the tunnel) will take care of recovering packets, therefore I don't expect scp to collapse like that, no?

Typical TCP implementations (for example Linux CUBIC) are designed on the assumption that there is no packet loss unless congestion happens. That is not true in a complicated network environment, and it leads to very poor performance. Also, TCP's retransmission badly hurts latency for games.

If you doubt the results, you can find my UDP proxy on GitHub (search for "UDP FEC") and reproduce the test results easily.

By the way, there is also a demo (but stable) VPN with built-in FEC on my GitHub; it has been running for half a year with no problems, but it is not as powerful as OpenVPN and supports only Linux (so I still hope to integrate FEC into OpenVPN).

comment:8 in reply to:  7 Changed 17 months ago by tincantech

Replying to wangyucn:

If you doubt the results, you can find my UDP proxy on GitHub (search for "UDP FEC") and reproduce the test results easily.

Presumably .. https://github.com/wangyu-/UDPspeeder

comment:9 in reply to:  7 Changed 17 months ago by Antonio

Replying to wangyucn:

On top of that, what is the actual gain with lower (i.e. realistic) packet loss? TCP (inside the tunnel) will take care of recovering packets, therefore I don't expect scp to collapse like that, no?

Typical TCP implementations (for example Linux CUBIC) are designed on the assumption that there is no packet loss unless congestion happens. That is not true in a complicated network environment, and it leads to very poor performance. Also, TCP's retransmission badly hurts latency for games.

If you doubt the results, you can find my UDP proxy on GitHub (search for "UDP FEC") and reproduce the test results easily.

Sure; however, I was hoping you could share your results for lower/realistic packet loss (if available).

It would also be nice to have something that shows the overhead compared with the gain (for various parameter values), i.e. do we need 50% redundancy even at 2% packet loss in order to have a working connection?

Maybe sharing how the adaptive mechanism works would also help us understand it better.

Other than that (as Gert also said at the beginning), saying "yes" or "no" before seeing the code is very difficult. But I guess nobody will object to merging a meaningful feature with a clear, uncomplicated implementation that boosts performance.

comment:10 in reply to:  6 Changed 17 months ago by Gert Döring

Replying to Antonio:

On top of that, what is the actual gain with lower (i.e. realistic) packet loss? TCP (inside the tunnel) will take care of recovering packets, therefore I don't expect scp to collapse like that, no?

5% loss is sufficient that (standard) TCP totally stops doing useful things - every lost packet is interpreted as "oh, we sent data too fast, slow down a bit!".

It will (usually) eventually recover, but performance will be in the "totally unusable" range. Even SSH might be very unpleasant to use.
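As a back-of-the-envelope check, the well-known Mathis et al. approximation of steady-state TCP throughput, rate ≈ MSS / (RTT · sqrt(p)), predicts exactly this kind of collapse (a sketch assuming a 1460-byte MSS and the ~118 ms RTT from the ping runs above):

```python
from math import sqrt

def mathis_throughput(mss_bytes, rtt_s, loss):
    """Mathis et al. approximation of steady-state TCP throughput:
    rate ≈ MSS / (RTT * sqrt(p)), in bytes per second."""
    return mss_bytes / (rtt_s * sqrt(loss))

# 1460-byte MSS, 118 ms RTT, various loss rates:
for p in (0.0001, 0.01, 0.05, 0.10):
    print(f"{p:.2%} loss -> ~{mathis_throughput(1460, 0.118, p) / 1024:.0f} KiB/s")
```

At 10% loss this predicts roughly 38 KiB/s, in the same ballpark as the 29.5 KB/s scp result without FEC reported in the ticket description.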

So, FEC is totally cool :-) - FEC over two parallel and independent IP paths might be even more cool, but complicated.

comment:11 Changed 11 months ago by alexr

This is actually a really cool idea with some specific use-cases. Many SD-WAN providers charge an arm and a leg for forward error correction, often requiring two different ISPs to send application traffic over both rather than doing error correction within a single ISP. Something like this is particularly useful for things like remote SQL management or SQL replication across great geographic distances, or real-time ATM traffic for example.

comment:12 Changed 6 months ago by redemerald

I would love this feature for when I am at hotels and conferences with poor connections. I've used UDPspeeder in the past (awesome work, wangyucn), but I want to use this from my iPhone and portable router, which already have OpenVPN clients set up. And on those devices it isn't easy (or possible) to run a separate process.
