wiki:PerformanceTesting

Version 3 (modified by Samuli Seppänen)


Introduction

There have been significant changes in the future 2.3 branch since the relatively stable and conservative 2.1.x and 2.2.x releases. Therefore there is a definite need for wide testing in various areas, including performance. This page tries to outline the most relevant tests so that they can be run on a regular basis during development to spot regressions and bottlenecks. These tests are primarily aimed at the future OpenVPN 2.3 branch (current "master"); however, the same setup can be used to test older versions, too. The focus is on testing the most common and/or potentially useful configurations, as testing all combinations is practically impossible.

Goals

The tests have several goals:

  • Get an idea of OpenVPN performance with given parameters
  • Identify performance bottlenecks and fix the underlying issues insofar as possible
  • Identify non-linear resource consumption growth, e.g. find out whether 100 clients consume (significantly) more resources than ten times what 10 clients consume

Prerequisites

Identifying performance bottlenecks requires plotting resource consumption, e.g. memory usage and processor load. This, in turn, may require selectively adding new debugging code, so that excessively verbose, generic logging does not degrade performance and thus bias the results. Maximum performance tests don't have this requirement, as they only need minimal logging.

There are further requirements, depending on the test environments being used; see below for details.

Throughput tests

There are a number of factors affecting throughput:

  • number of clients
  • bandwidth per client
  • encryption algorithms/ciphers
  • compression
  • packet size
  • transport protocol (TCP/UDP)
  • network errors
  • other factors, which?
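Most of these factors map directly to OpenVPN configuration directives. As a rough illustration (the directive names are standard OpenVPN 2.x options, but the values and the hostname are placeholders), a client fragment for one test permutation might look like:

```
# Hypothetical client fragment; vary one factor at a time between runs.
remote vpn-server.example.com 1194
proto udp               # transport protocol (TCP or UDP)
cipher AES-128-CBC      # encryption algorithm/cipher
comp-lzo                # compression on/off
tun-mtu 1500            # packet size via the tunnel MTU
```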

There are a few specific questions that need an answer:

  • Throughput using IPv4 code
  • Throughput using IPv6 code
  • Throughput using OpenSSL
  • Throughput using PolarSSL
  • MTU settings and throughput
  • How does (excessive) packet loss affect performance?
  • Does the number of clients affect performance if the aggregate data transfer rate remains the same?
  • Other questions, which?
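Answering the throughput questions requires a traffic generator; iperf is the tool referred to elsewhere on this page. A sketch of typical measurements, assuming the server's tunnel address is 10.8.0.1:

```
# On the server (inside the tunnel):
iperf -s

# On a connected client:
iperf -c 10.8.0.1 -t 60                # TCP throughput over 60 seconds
iperf -c 10.8.0.1 -u -b 50M -l 1400    # UDP at a fixed rate and datagram size
```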

Obviously the goal is to identify poorly performing parts of the code (if any) and improve them.

Connection tests

Connection tests are aimed at identifying bottlenecks during spikes of (re)connection initiations, e.g. when 100 clients try to connect to the server simultaneously.

Insofar as possible, the effects of various configuration parameters are tested, especially those which generate extra traffic between server and client.

A few specific questions:

  • Max number of (re)connections per second before server chokes
  • What happens _when_ server is overloaded?
    • Do the clients connect properly after the server starts responding again? Or do they lose connectivity until they are restarted?
    • Do all the clients end up in expected state if the server is overloaded during connection initialization?
    • Are all client configuration directives (e.g. routes) pushed properly?
  • Reconnections
    • Are these any different from initial handshakes?
  • Other questions, which?

Test environment

General considerations

For max connection testing each client is configured to launch several OpenVPN instances connecting to the same server, if possible. For throughput testing each client only runs one instance of OpenVPN (per processor core).
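Launching several client instances from one machine could look roughly like the following sketch (assumptions: client.conf exists, the server permits multiple connections from the same certificate via duplicate-cn, and OpenVPN runs with sufficient privileges):

```
# Launch N OpenVPN client instances from a single test machine.
N=50
for i in $(seq 1 "$N"); do
    openvpn --config client.conf --nobind \
            --writepid "/var/run/ovpn-client-$i.pid" --daemon
done
```

The --nobind option avoids local port conflicts between the UDP client instances, and the per-instance PID files make graceful shutdown easier.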

Amazon EC2

Amazon EC2 is used for maximum simultaneous connection testing, where a large number of clients is required. It can also be used for throughput tests, provided that the server and clients are in the same availability zone (to reduce unnecessary costs).

Test environment setup is handled by the Python deployment tool (see below) that utilizes Fabric under the hood.

Local (LAN)

A LAN environment can be used for throughput tests. The server and clients are connected to a high-quality, high-bandwidth switch to avoid introducing external bottlenecks.

Test environment setup is handled by the Python deployment tool that utilizes Fabric under the hood. Initial installation of client OSes is automated using FAI (Debian) or Kickstart (RHEL/Fedora derivatives).

Configuring test server(s)

Test servers are set up manually or using simple Fabric rules and static host lists, as there are only a few of them.

The required steps:

  • Install a precompiled OpenVPN binary and shared libraries
  • Install an OpenVPN configuration file (for every configuration to test)
  • Start OpenVPN
  • Start memory/CPU usage monitoring tools

For throughput tests also do:

  • Start iperf server instance(s)
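The server-side steps above could be scripted roughly as follows (a sketch: the configuration file name is a placeholder, and pidstat from the sysstat package is one possible monitoring tool):

```
# Start OpenVPN with the configuration under test.
openvpn --config server-test.conf --daemon

# Log memory and CPU usage of the OpenVPN process every 5 seconds.
pidstat -r -u -p "$(pgrep -f server-test.conf)" 5 > pidstat.log &

# Throughput tests only: start an iperf server in the background.
iperf -s -D
```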

Configuring test clients

Configuring test clients consists of the following steps:

  • Install precompiled OpenVPN binary and shared libraries
  • Install OpenVPN configuration file(s)
  • Synchronize the clock using ntpdate: this allows properly timed DDoS-style tests
  • Install a test script (modified t_client.rc?) that
    • Verifies that system state is sane
    • Gathers information about system state
    • Launches OpenVPN using a configuration file given as command-line parameter
    • For throughput tests, launches an iperf client pointed at the correct iperf server
    • Waits until the test is over
    • Cleans up system state:
      • Kill OpenVPN gracefully; should it fail, make note of it and use kill -9
      • Flush routes
    • Pushes collected data to a centralized location (e.g. using SSH)
  • Install crontab entry for each test scenario to launch a test script at a given time
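Put together, the test script could look roughly like this sketch (the paths, test duration, and results host are placeholders; error handling and route cleanup details are omitted):

```
#!/bin/sh
# Usage: run-test.sh <openvpn-config>; meant to be launched from cron.
CONF="$1"
ntpdate -u pool.ntp.org                      # synchronize the clock
ip route > routes-before.log                 # gather system state
openvpn --config "$CONF" --writepid /var/run/ovpn-test.pid --daemon
sleep 600                                    # wait until the test is over
PID="$(cat /var/run/ovpn-test.pid)"
# Kill OpenVPN gracefully; fall back to SIGKILL and make note of it.
kill "$PID" || { echo "graceful kill failed" >> test.log; kill -9 "$PID"; }
# Flush routes left behind by the test (details depend on the platform).
scp ./*.log collector.example.com:/srv/results/   # push collected data
```

A matching crontab entry per test scenario might then be, for example, `0 3 * * * /usr/local/bin/run-test.sh /etc/openvpn/scenario1.conf`.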

Tools

Mattock (on IRC) has written a small application that eases deployment of a large number of OpenVPN client computers that can adequately stress a server. At the moment (Sep 2011) the app is used for Amazon EC2 deployments, but it's fairly general-purpose and easily extendable. It's also not fully functional yet, as one of its components (Fabric) does not play well in this kind of scenario with updating and/or small modifications. For details, see the included README file.

To checkout this tool (perftest), do the following:

git clone http://build.openvpn.net/tools/perftest.git perftest

The README file contains everything you need to know. At least initially, patches should be sent to samuli at openvpn and net, or to mattock, who hangs around in #openvpN-devel on the Freenode IRC network (irc.freenode.net).

Test cases

Here we should have a high-level list of test cases. Technical details (e.g. OpenVPN client configuration files) can also be attached. Mattock can then convert these descriptions into Fabric code which makes the magic happen.