wiki:RoadMap

Version 10 (modified by Samuli Seppänen, 14 years ago) (diff)

--

Introduction

OpenVPN's current codebase has a number of limitations. Some of the limitations require changes to the underlying architecture to be fixed. Roadmap issues and relationship of OpenVPN 2.x series and 3.0 have been discussed earlier in a few IRC meetings:

  • 1st Apr 2010
    • 21:36 - 21:47: James' plans for 3.0
    • 21:59 - 22:09: Multithreading and the event system
  • 15th Apr 2010
    • 14:28 - 14:31: Roadmap and Trac
  • 29th Apr 2010
    • 22:23 - 22:53: Planning of the roadmap meeting
  • 6th May 2010
    • The roadmap meeting. James' views about OpenVPN 3.0 are available here

Current issues and potential fixes

Monolithic architecture

The fact that OpenVPN is currently implemented as a monolithic C application makes it difficult to control OpenVPN's internal operation from higher level languages. Having the core functionality of OpenVPN implemented as a C library would allow it to be wrappable by higher-level language objects more easily. For example, suppose you wanted to build a full-mesh OpenVPN cloud, where OpenVPN runs on hundreds of machines, and where each machine has multiple processors. It would be much more straightforward to implement the cloud in a high-level language such as python, and wrap the OpenVPN library as a python extension.

Another related problem is that OpenVPN's components are not easily exchangable. This means adding certain types of functionality (e.g. IPv6) or replacing current SSL and compression functionality is much more difficult than it needs to be. As OpenVPN is essentially a special-case of a user-space network stack, it could be modularized so that the central core is implemented as a user-space network stack, and the other components such as VPN and routing would be modules in this stack. The above changes make it more straightforward to implement other protocols in the stack, such as IPv6. The network stack changes would also make it much more straightforward to implement alternative topologies for OpenVPN, such as full-mesh. The SSL and compression functionality should be modularized so that OpenVPN can be used with different SSL libraries or different compression algorithms.

Threading

Currently, OpenVPN is scaled on SMP machines by adding processes rather than threads. While it might be interesting to look into scaling OpenVPN across threads, there may be kernel-level bottlenecks that impede this, e.g. note the problems facebook had when trying to scale memcached, specifically the problems of having multiple threads contend for a single UDP socket:

Lack of multithreading is closely tied to the current event system implementation.

Event system

The current non-asynchrous-clean status of the event system makes maintenance of certain OpenVPN components quite tenuous, such as mtcp.c. While the current event model is partially asynchronous, it is not sufficiently clean to allow certain features to be implemented such as concurrent multithreading or the ability to listen on multiple interfaces simultaneously. The limitations of current event system are also closely tied to OpenVPN's lack of multithreading. To get these features the current i/o event system needs to be revamped into a true asynchronous model. It might be worthwhile to look into using libevent as the underlying i/o event system for OpenVPN (libevent is used by memcached).

OpenVPN 3.0: Development issues

Organic vs. planned development

Community-driven development model excels in developing software incrementally in small steps. Most community developers are driven by healthy self-interest and concentrate on features that are of interest to them. Fixing parts of the architecture may or may not have value for them. An example of an architectural change that developers would probably be interested in is making OpenVPN multithreaded. That said, there are certainly some community developers who may be interested in less concrete work such as rewriting parts of the codebase to allow easier development in the future. Also, if the architecture is made more modular, people will be able to contribute to the project more easily in the future.

Start from scratch vs. incremental approach

Starting from scratch has the benefit that we can focus on fixing the current architectural problems. Also, we would not really need to start from scratch, as many parts of the old codebase can be utilized in the new codebase with minor modifications. However, as the codebase as whole would be new, it would almost certainly have unknown problems.

There is also the problem that the new codebase will be competing for users and developers against the old codebase. Non-developers are unlikely to use the new codebase until it provides something the old codebase does not. Attracting developers to work on the new codebase may also be difficult unless somebody (e.g. at the company) bootstraps and leads the development effort. This means the new codebase will be relatively untested for a long time even after it's somewhat functional. To minimize this time period we'd need solid data on how people use OpenVPN (e.g. what features) and focus on developing those. Asking the users directly (e.g. on mailing lists) would help, but the dataset would be relatively small. An automated opt-in system similar to Debian's popularity contest would give a larger dataset, but would have to be coupled with a new software release to get widest possible audience.

Incremental approach solves the problem with competing codebases. However, as software architecture is difficult to change afterwards, incremental approach works only on a limited subset of the code. Some of the problematic code may be too tightly integrated to be fixable without invasive changes. However, there are a few good candidates for incremental modularization:

  • Logger
  • Encryption
  • Authentication

Generic network stack vs. focus on VPN functionality

In the roadmap meeting (6th May 2010) James presented his views of OpenVPN 3.0. In a nutshell, OpenVPN 3.0 would become a generic user-space network stack. This would solve many of the architectural problems with the current codebase. This approach would also potentially allow a much wider user- and developer base as people could build non-VPN functionality on top of the core.

The big question is whether going 100% generic is beneficial or not. If there's no interest in a generic userspace network stack then focusing on VPN functionality and just modularizing and cleaning up the code would be the best option.

Attachments (2)

Download all attachments as: .zip