[[TOC(inline, depth=1)]] = Introduction = OpenVPN's current codebase has a number of limitations. Some of the limitations require changes to the underlying architecture to be fixed. Roadmap issues and relationship of OpenVPN 2.x series and 3.0 have been discussed earlier in a few IRC meetings: * [http://thread.gmane.org/gmane.network.openvpn.devel/3447 1st Apr 2010] * 21:36 - 21:47: James' plans for 3.0 * 21:59 - 22:09: Multithreading and the event system * [http://thread.gmane.org/gmane.network.openvpn.devel/3525 15th Apr 2010] * 14:28 - 14:31: Roadmap using Trac * [http://thread.gmane.org/gmane.network.openvpn.devel/3673 29th Apr 2010] * 22:23 - 22:53: Planning of the roadmap meeting * 6th May 2010 * The roadmap meeting. James' views about OpenVPN 3.0 are available [wiki:RoadMap@1 here] == Monolithic architecture == The fact that OpenVPN is currently implemented as a monolithic C application makes it difficult to control OpenVPN's internal operation from higher level languages. Having the core functionality of OpenVPN implemented as a C library would allow it to be wrappable by higher-level language objects more easily. For example, suppose you wanted to build a full-mesh OpenVPN cloud, where OpenVPN runs on hundreds of machines, and where each machine has multiple processors. It would be much more straightforward to implement the cloud in a high-level language such as python, and wrap the OpenVPN library as a python extension. Another related problem is that OpenVPN's components are not easily exchangable. This means adding certain types of functionality (e.g. IPv6) or replacing current SSL and compression functionality is much more difficult than it needs to be. As OpenVPN is essentially a special-case of a user-space network stack, it could be modularized so that the central core is implemented as a user-space network stack, and the other components such as VPN and routing would be modules in this stack. The above changes make it more straightforward to implement other protocols in the stack, such as IPv6. The network stack changes would also make it much more straightforward to implement alternative topologies for OpenVPN, such as full-mesh. The SSL and compression functionality should be modularized so that OpenVPN can be used with different SSL libraries or different compression algorithms. == Threading == Currently, OpenVPN is scaled on SMP machines by adding processes rather than threads. While it might be interesting to look into scaling OpenVPN across threads, there may be kernel-level bottlenecks that impede this, e.g. note the problems facebook had when trying to scale memcached, specifically the problems of having multiple threads contend for a single UDP socket: * http://www.facebook.com/note.php?note_id=39391378919&ref=mf Lack of multithreading is closely tied to the current event system implementation. == Event system == The current non-asynchrous-clean status of the event system makes maintenance of certain OpenVPN components quite tenuous, such as mtcp.c. While the current event model is partially asynchronous, it is not sufficiently clean to allow certain features to be implemented such as concurrent multithreading or the ability to listen on multiple interfaces simultaneously. The limitations of current event system are also closely tied to OpenVPN's lack of multithreading. To get these features the current i/o event system needs to be revamped into a true asynchronous model. It might be worthwhile to look into using libevent as the underlying i/o event system for OpenVPN (libevent is used by memcached). = OpenVPN 3.0 = == Organic vs. planned development == Community-driven development model excels in developing software incrementally in small steps. Most community developers are driven by healthy self-interest and concentrate on ''features'' that are of interest to them. Fixing parts of the architecture may or may not have value for them. An example of an architectural change that developers would probably be interested in is making OpenVPN multithreaded. That said, there are certainly some community developers who may be interested in less concrete work such as rewriting parts of the codebase to allow easier development in the future. Also, if the architecture is made more modular, people will be able to contribute to the project more easily in the future. == Start from scratch vs. incremental approach == Starting from scratch has the benefit that we can focus on fixing the current architectural problems. Also, we would not really need to start from scratch, as many parts of the old codebase can be utilized in the new codebase with minor modifications. However, as the codebase as whole would be new, it would almost certainly have unknown problems. There is also the problem that the new codebase will be competing for users and developers against the old codebase. Non-developers are unlikely to use the new codebase until it provides something the old codebase does not. This means the new codebase will be relatively untested for a long time even after it's somewhat functional. Attracting developers to work on the new codebase may also be difficult unless somebody (e.g. at the company) bootstraps and leads the development effort. Incremental approach solves the problem with competing codebases. However, as software architecture is difficult to change afterwards, incremental approach works only on a limited subset of the code. Some of the problematic code may be too integrated to be fixable without significant changes. == Generic network stack vs. focus on VPN functionality =