wiki:PerformanceTesting


Introduction

There have been significant changes in the future 2.3 branch since the relatively stable and conservative 2.1.x and 2.2.x releases. Therefore there is a definite need for wide testing in various areas, including performance. This page tries to outline the most relevant tests so that they can be run on a regular basis during development to spot regressions and bottlenecks. These tests are primarily aimed at the future OpenVPN 2.3 branch (current "master"); however, the same setup can be used to test older versions, too. The focus is on testing the most common and/or potentially useful configurations, as testing all combinations is practically impossible.

Goals

The tests have several goals:

  • Get an idea of OpenVPN performance with any given parameters
  • Identify performance bottlenecks and fix the underlying issues insofar as possible
  • Identify non-linear resource consumption growth, e.g. find out whether 100 clients consume (significantly) more than ten times the resources of 10 clients
  • Detect and fix abnormal behavior (e.g. partially configured clients) during peak load periods

When these tests are repeated over time, they also allow spotting regressions.

Prerequisites

Identification of performance bottlenecks requires plotting resource consumption, e.g. memory usage and processor load. This, in turn, may require selectively adding new debugging code, so that excessively verbose, generic logging does not degrade performance and thus produce biased results. Maximum performance tests don't have this requirement, as they only need minimal logging. In addition, spotting and debugging configuration errors requires collecting information about OpenVPN's state on the client side.

There are further requirements, depending on the test environments being used; see below for details.

Test cases

Here we should have a high-level list of test cases. Technical details (e.g. OpenVPN client configuration files) can also be attached. Mattock can then convert these descriptions into Fabric code which makes the magic happen.

Throughput tests

There are a number of factors affecting throughput:

  • number of clients
  • bandwidth per client
  • encryption algorithms/ciphers
  • compression
  • packet size
  • transport protocol (TCP/UDP)
  • network errors
  • other factors (which ones?)

There are a few specific questions that need an answer (feel free to add more):

  1. What is the maximum throughput using various parameters?
  2. How does (excessive) packet loss affect performance?
  3. Does the number of clients affect performance if the data transfer rate remains the same? (i.e. is there non-linear resource consumption growth?)

Below is a list of test cases:

Test name | Clients | Answer to
MP-1      | 5       | 1, 3
MP-2      | 20      | 1, 3
MP-3      | 50      | 1, 3
MP-4      | 100     | 1, 3

The protocol needs to be UDP, as TCP (in iperf) does not allow controlling the throughput. Each test will be repeated for each of the following combinations:

UDP | IPv4 | IPv6 | OpenSSL | PolarSSL | OpenVPN 2.2.1 | OpenVPN (master)
X   | X    |      | X       |          | X             |
X   | X    |      | X       |          |               | X
X   |      | X    | X       |          |               | X
X   |      | X    |         | X        |               | X
X   | X    |      |         | X        |               | X

Question: Should a subset of these tests be repeated for TCP?
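For reference, the per-client rate in these tests can be capped by running iperf in UDP mode with an explicit target bandwidth. A minimal sketch, assuming iperf 2; the tunnel-side server address and the rate are placeholders, not values from the test configurations:

iperf -s -u -D
iperf -c 10.8.0.1 -u -b 3M -t 60

The first command starts a UDP-capable iperf server (as a daemon) on the OpenVPN server; the second makes a client push 3 Mbit/s of UDP traffic through the tunnel for 60 seconds. The -b option is what makes UDP usable for controlled-rate testing; in TCP mode iperf always sends as fast as the connection allows.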

Connection tests

Connection tests are aimed at identifying bottlenecks during spikes of (re)connection initiations, e.g. when 100 clients try to connect to the server simultaneously.

Insofar as possible, the effects of various configuration parameters are tested, especially those which generate extra traffic between server and client.

A few specific questions:

  1. What is the maximum number of (re)connections per second before the server chokes (i.e. how expensive is connection establishment)?
  2. What happens _when_ server is overloaded?
    • Do the clients connect properly after the server starts responding again? Or do they lose connectivity until they are restarted?
    • Do all the clients end up in expected state if the server is overloaded during connection initialization?
    • Are all client configuration directives (e.g. routes) pushed properly? (see the sketch after this list)
  3. Reconnections
    • Are these any different from initial handshakes?
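Regarding the pushed-directive question above, client-side state can be checked with ordinary tools once the tunnel is up; a hedged sketch (the network 10.8.0.0/24 and the log path are hypothetical examples):

grep "Initialization Sequence Completed" /var/log/openvpn.log
ip route show | grep 10.8.0.0

The first line confirms that the client finished initializing; the second checks that a pushed route actually ended up in the routing table.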

Proposed test cases are shown below:

Test name | VM x OpenVPN instances | Answer to
C-1       | 50 x 2                 | 1, 2?
C-2       | 50 x 6                 | 1, 2?
C-3       | 50 x 10                | 1, 2?

The least powerful instance types in EC2 (e.g. m1.small) can be slowed to a crawl fairly easily with 50-100 (real) clients. For more than 100 simultaneous connections, several parallel OpenVPN processes are used on the same client instance; although these clients will fail to initialize properly, they should still stress the server in a relatively realistic fashion.
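As a rough illustration, several OpenVPN client processes can be started on one instance simply by giving each its own configuration file; a sketch with hypothetical file names and count:

for i in 1 2 3 4 5 6; do
    openvpn --config /etc/openvpn/client-$i.conf --daemon
done

Each process appears to the server as a separate client, so even clients that never initialize fully still exercise the server's connection handling.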

The number of simultaneous connections is increased until the server chokes. Each test will be repeated for the following configurations:

IPv4 | IPv6 | OpenSSL | PolarSSL | OpenVPN 2.2.1 | OpenVPN (master)
X    |      | X       |          | X             |
X    |      | X       |          |               | X

Question: does IP protocol version or SSL backend affect initial connection establishment significantly?

Test results

Test setup details for m1.small servers

All server instances on EC2 were the same:

  • AMI: OpenVPN Access Server 1.8.3 (ami-31d21358)
  • Operating system: Ubuntu 11.04 "Natty Narwhal" (i386)
  • Kernel: 2.6.38-8-virtual #42-Ubuntu SMP (aki-407d9529)

Client instances on EC2 used a slightly different setup:

  • AMI: Ubuntu 10.04 (ami-81b275e8)
  • Operating system: Ubuntu 10.04 "Lucid Lynx" (i386)
  • Kernel: 2.6.32-317-ec2 #36-Ubuntu SMP (aki-407d9529)

All EC2 instances were created in the same availability zone (us-east-1d). Setting up 100 client instances took ~4 minutes with 34 threads running Fabric (~3 runs per thread).

Test setup details for m1.large servers

All server instances on EC2 were the same:

  • AMI: ami-fd589594
  • Operating system: Ubuntu 11.04 "Natty Narwhal" (amd64)
  • Kernel: 2.6.38-11-virtual #50-Ubuntu SMP (aki-427d952b)

Client instances on EC2 used a slightly different setup:

  • AMI: Ubuntu 10.04 (ami-81b275e8)
  • Operating system: Ubuntu 10.04 "Lucid Lynx" (i386)
  • Kernel: 2.6.32-317-ec2 #36-Ubuntu SMP (aki-407d9529)

All EC2 instances were created in the same availability zone (us-east-1d). Setting up 100 client instances took ~4 minutes with 34 threads running Fabric (~3 runs per thread).

MP-1 results

OpenVPN 2.2.1/m1.small/UDP/IPv4/OpenSSL/5 clients (MP-1)

Test name: 07-mp1-m1.small-openvpn221

Connection establishment

Insignificant CPU load when all clients (5) connected.

Client statistics

Test                       | Subtest | Host    | Clients | Total transfer (MB) | Total bandwidth (MB/s) | Average bandwidth (MB/s)
07-mp1-m1.small-openvpn221 | 10s     | average | 5       | 148.9               | 15.56                  | 3.11
07-mp1-m1.small-openvpn221 | 30s     | average | 5       | 440.4               | 15.38                  | 3.08
07-mp1-m1.small-openvpn221 | 60s     | average | 5       | 888.8               | 15.51                  | 3.1

Server statistics

Test                       | Subtest | CPU usr (%) | CPU sys (%) | CPU total (%) | CPU wait (%)
07-mp1-m1.small-openvpn221 | 10s     | 29.2981     | 41.9214     | 89.374        | 0
07-mp1-m1.small-openvpn221 | 30s     | 32.9692     | 48.3951     | 94.4449       | 0.060871
07-mp1-m1.small-openvpn221 | 60s     | 35.8375     | 48.7774     | 97.0125       | 0

Analysis

OpenVPN 2.2.1 on an Amazon EC2 m1.small instance can handle 5 OpenVPN client instances well.

MP-2 results

OpenVPN 2.2.1/m1.small/UDP/IPv4/OpenSSL/20 clients (MP-2)

Test name: 13-mp2-m1.small-openvpn221

Connection establishment

Test repeated twice due to timing issues:

  • First run: CPU usage 29-33% for 2 seconds when all clients (20) connected at the same time (at 12:22:02-12:22:03).
  • Second run: CPU usage 100% for 2 seconds...

Client statistics

Test                       | Subtest | Host    | Clients | Total transfer (MB) | Total bandwidth (MB/s) | Average bandwidth (MB/s)
13-mp2-m1.small-openvpn221 | 10s     | average | 20      | 177.22              | 16.53                  | 0.8265
13-mp2-m1.small-openvpn221 | 30s     | average | 20      | 474.95              | 14.72                  | 0.736
13-mp2-m1.small-openvpn221 | 60s     | average | 19      | 829.02              | 12.56                  | 0.661053

Server statistics

Test                       | Subtest | Real length (s) | CPU usr (%) | CPU sys (%) | CPU total (%) | CPU wait (%)
13-mp2-m1.small-openvpn221 | 10s     | 21              | 24.6989     | 54.91       | 97.7355       | 0
13-mp2-m1.small-openvpn221 | 30s     | 56              | 26.3991     | 54.5965     | 96.4954       | 0
13-mp2-m1.small-openvpn221 | 60s     | 98              | 26.0081     | 56.4454     | 97.1554       | 0

Analysis

OpenVPN 2.2.1 on an Amazon EC2 m1.small instance seems to be able to handle 20 OpenVPN clients fairly well.

OpenVPN 2.2.1/m1.large/UDP/IPv4/OpenSSL/20 clients (MP-2)

Test names:

  • 16-mp2-m1.large-openvpn221
  • 18-mp2-m1.large-openvpn221
  • 19-mp2-m1.large-openvpn221
  • 20-mp2-m1.large-openvpn221

Connection establishment

During all tests the client initialization phase took 1-2 seconds, judging from the server's CPU load and network traffic levels. CPU utilization was 2-6% during this period.

Client statistics

Test                       | Subtest | Host    | Clients | Total transfer (MB) | Total bandwidth (MB/s) | Average bandwidth (MB/s)
16-mp2-m1.large-openvpn221 | 10s     | average | 19      | 386.42              | 37.06                  | 1.95053
16-mp2-m1.large-openvpn221 | 30s     | average | 20      | 339.85              | 9.27                   | 0.4635
16-mp2-m1.large-openvpn221 | 60s     | average | 20      | 1168.61             | 18.91                  | 0.9455
18-mp2-m1.large-openvpn221 | 10s     | average | 20      | 275.55              | 26.54                  | 1.327
18-mp2-m1.large-openvpn221 | 30s     | average | 20      | 741.01              | 23.8                   | 1.19
18-mp2-m1.large-openvpn221 | 60s     | average | 20      | 1176.57             | 18.78                  | 0.939
19-mp2-m1.large-openvpn221 | 120s    | average | 14      | 2323.11             | 18.96                  | 1.35429
20-mp2-m1.large-openvpn221 | 120s    | average | 17      | 2369.42             | 19.28                  | 1.13412

Server statistics

Test                       | Subtest | Real length (s) | CPU usr (%) | CPU sys (%) | CPU total (%) | CPU wait (%)
16-mp2-m1.large-openvpn221 | 10s     | 22              | 8.34841     | 14.974      | 32.1606       | 0.0163636
16-mp2-m1.large-openvpn221 | 30s     | 72              | 8.82207     | 15.8551     | 30.9363       | 0.00511111
16-mp2-m1.large-openvpn221 | 60s     | 72              | 8.73214     | 15.6736     | 31.3002       | 0.252833
18-mp2-m1.large-openvpn221 | 10s     | 16              | 7.90512     | 15.4132     | 34.4907       | 0
18-mp2-m1.large-openvpn221 | 30s     | 41              | 8.6408      | 15.4649     | 31.9355       | 0.241415
18-mp2-m1.large-openvpn221 | 60s     | 65              | 8.66112     | 15.8789     | 31.0061       | 0.0868
19-mp2-m1.large-openvpn221 | 120s    | 132             | 9.18264     | 15.7622     | 31.2403       | 0.104886
20-mp2-m1.large-openvpn221 | 120s    | 132             | 9.15555     | 15.8739     | 30.8286       | 0.0837879

Analysis

An m1.large instance can easily handle 20 clients. As these tests were timed more precisely (with "at") than the earlier MP-2 tests (which used chained commands with "sleep" in between), the overly long test periods must have been caused by the iperf server not being able to handle client connections quickly enough.
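For reference, scheduling a subtest with "at" instead of a chained "sleep" looks roughly like this (the time, address and rate below are made up):

echo "iperf -c 10.8.0.1 -u -b 1M -t 30" | at 12:22

Because every client gets its own absolute start time, a delay in one subtest no longer pushes back the start of the next one.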

OpenVPN 2.2.1/m1.large/UDP/IPv4/OpenSSL/20 clients (MP-2)

Test names:

  • 24-mp2.m1.large-openvpn221-1daemon-4iperfs

Connection establishment

Connection establishment took ~1 second, with peak CPU load at 5%.

Client statistics

Test                                       | Subtest | Host    | Clients | Total transfer (MB) | Total bandwidth (MB/s) | Average bandwidth (MB/s)
24-mp2.m1.large-openvpn221-1daemon-4iperfs | 120s    | average | 18      | 2181.7              | 17.98                  | 0.998889

Server statistics

Test                                       | Subtest | Real length (s) | CPU usr (%) | CPU sys (%) | CPU total (%) | CPU wait (%)
24-mp2.m1.large-openvpn221-1daemon-4iperfs | 120s    | 130             | 8.70823     | 15.874      | 29.8097       | 0.0595538

MP-3 results

OpenVPN 2.2.1/m1.small/UDP/IPv4/OpenSSL/50 clients (MP-3)

Test name: 14-mp3-m1.small-openvpn221

Connection establishment

The client initialization phase took 10 seconds, judging from the server's CPU load and network traffic levels.

Client statistics

Test                       | Subtest | Host    | Clients | Total transfer (MB) | Total bandwidth (MB/s) | Average bandwidth (MB/s)
14-mp3-m1.small-openvpn221 | 10s     | average | 49      | 210                 | 19.3                   | 0.393878
14-mp3-m1.small-openvpn221 | 30s     | average | 49      | 529.2               | 15.85                  | 0.323469
14-mp3-m1.small-openvpn221 | 60s     | average | 48      | 1980.64             | 31.65                  | 0.659375

Server statistics

Test                       | Subtest | Real length (s) | CPU usr (%) | CPU sys (%) | CPU total (%) | CPU wait (%)
14-mp3-m1.small-openvpn221 | 10s     | 24              | 26.7727     | 52.8184     | 98.4287       | 0
14-mp3-m1.small-openvpn221 | 30+60s  | 304             | 23.5826     | 50.4217     | 86.8597       | 0.0105822

Analysis

Same as for the 100-client version of this test (MP-4), except that the artificial performance increase in the last subtest is not as pronounced.

MP-4 results

OpenVPN 2.2.1/m1.small/UDP/IPv4/OpenSSL/100 clients (MP-4)

Test name: 15-mp4-m1.small-openvpn221

Connection establishment

The client initialization phase took 11 seconds, judging from the server's CPU load and network traffic levels.

Client statistics

Test                       | Subtest | Host    | Clients | Total transfer (MB) | Total bandwidth (MB/s) | Average bandwidth (MB/s)
15-mp4-m1.small-openvpn221 | 10s     | average | 56      | 264.02              | 23.43                  | 0.418393
15-mp4-m1.small-openvpn221 | 30s     | average | 62      | 598                 | 17.17                  | 0.276935
15-mp4-m1.small-openvpn221 | 60s     | average | 59      | 3026.79             | 48.13                  | 0.815763

Server statistics

Test                       | Subtest | Real length (s) | CPU usr (%) | CPU sys (%) | CPU total (%) | CPU wait (%)
15-mp4-m1.small-openvpn221 | 10s     | 24              | 26.7727     | 52.8184     | 98.4287       | 0
15-mp4-m1.small-openvpn221 | 30+60s  | 304             | 23.5826     | 50.4217     | 86.8597       | 0.0105822

Analysis

100 clients connecting at the same time is clearly beyond the capabilities of an m1.small instance in EC2. Also, the data transfer statistics are skewed by the long client connection delays, which create a "tail" of clients that raise the average bandwidth numbers because they get a larger share of the server's resources.

Test environments

General considerations

In environments that are not isolated (e.g. Amazon EC2) the same tests need to be run twice or more to see if external factors affect the results noticeably.

Amazon EC2

Amazon EC2 is a service that can be used to spin up a large number of VMs, e.g. for performance testing purposes. Several different instance types (i.e. VM sizes) are available. The pricing is divided into hourly rent and bandwidth costs; the latter only apply to traffic leaving EC2. So, to save on bandwidth costs, the whole test environment can be built inside EC2, within the same availability zone.

Amazon EC2 is used for maximum simultaneous connection testing, where a large number of clients is required. It is also - for now at least - used for maximum throughput tests.

Test environment setup is handled by the Python deployment tool (see below), which utilizes the Boto libraries and Fabric under the hood.

Local (LAN)

The LAN environment is/can be used for throughput tests. The server and clients are connected to a high-quality, high-bandwidth switch to avoid external bottlenecks.

Test environment setup is handled by the Python deployment tool, which utilizes Fabric under the hood. Initial installation of the client operating systems is automated using FAI (Debian) or Kickstart (RHEL/Fedora derivatives).

Configuring test server(s)

Test servers can be/are set up manually or using simple Fabric rules and static host lists, as there are only a few of them.

The required steps:

  • Install a precompiled OpenVPN binary and shared libraries
  • Install an OpenVPN configuration file (for every configuration to test)
  • Start OpenVPN
  • Start memory/CPU usage monitoring tools

For throughput tests, also do the following (see the sketch below):

  • Start iperf server instance(s)
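Roughly, the server-side part of a throughput test could then be brought up with something like the following (the configuration file name is a placeholder):

openvpn --config server.conf --daemon
iperf -s -u -D

i.e. the OpenVPN server is started as a daemon with the configuration under test, and an iperf server is started in UDP daemon mode to receive the clients' test traffic.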

Configuring test clients

Configuring test clients consists of the following steps (see the sketch after this list):

  • Install a precompiled OpenVPN binary and shared libraries
  • Install OpenVPN configuration file(s)
  • Synchronize the clock using ntpdate, which allows properly timed DDoS-style tests
  • Install a test script (a modified t_client.rc?) that
    • Verifies that the system state is sane
    • Gathers information about the system state
    • Launches OpenVPN using a configuration file given as a command-line parameter
    • For throughput tests, launches an iperf client pointed at the correct iperf server
    • Waits until the test is over
    • Cleans up the system state:
      • Kills OpenVPN gracefully; should that fail, makes a note of it and uses kill -9
      • Flushes routes
    • Pushes the collected data to a centralized location (e.g. using SSH)
  • Install a crontab entry for each test scenario to launch the test script at a given time
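A very condensed, hypothetical sketch of such a wrapper script is shown below; every name, address and path in it is a placeholder, and a real version would be derived from t_client.rc:

#!/bin/sh
# Sync the clock so that scheduled tests start simultaneously on all clients
ntpdate pool.ntp.org

# Record pre-test state for later sanity checks
ip route show > /tmp/routes-before.txt

# Bring up the tunnel using the configuration given on the command line
openvpn --config "$1" --daemon --log /tmp/openvpn-test.log

# For throughput tests: push UDP traffic through the tunnel and save the report
iperf -c 10.8.0.1 -u -b 3M -t 60 > /tmp/iperf-result.txt

# Clean up: terminate OpenVPN gracefully, force-kill anything that survives
killall openvpn
sleep 2
killall -9 openvpn 2>/dev/null

# Push the collected data to a central location
scp /tmp/openvpn-test.log /tmp/iperf-result.txt collector.example.org:/results/

A matching crontab entry (again with made-up time and paths) could look like:

05 12 * * * /usr/local/bin/run-test.sh /etc/openvpn/mp2-client.conf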

Tools

Deploying clients

Mattock (on IRC) has written a small application that eases deployment of a large number of OpenVPN client computers that can adequately stress a server. At the moment (Sep 2011) the app is used for Amazon EC2 deployments, but it's fairly general-purpose and easily extendable. It's also not fully functional yet, as one of its components (Fabric) does not play well in this kind of scenario with updating and/or small modifications. For details, see the included README file.

To checkout this tool (perftest), do the following:

git clone http://build.openvpn.net/tools/perftest.git perftest

The README file contains everything you need to know. At least initially, patches should be sent to samuli at openvpn and net, or to mattock, who hangs around at #openvpn-devel on the Freenode IRC network (irc.freenode.net).

Generating network traffic

Iperf can be used to generate network traffic. It has previously been used to tune OpenVPN performance.

Collecting data

Dstat can be used to gather detailed resource utilization statistics on the server side.
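For example, a dstat invocation that samples CPU, memory and network once per second and writes a CSV file for later plotting might look like this (the output path is arbitrary):

dstat -c -m -n --output /tmp/server-stats.csv 1

The resulting CSV can then be post-processed into the CPU usr/sys/total/wait figures shown in the result tables above.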

Notes

Head/tail effect

In the initial tests with m1.small server instances, all test phases (connection -> 10s -> 30s -> 60s) were launched from the same, serially executed shell script, so any delays propagated to later phases. In the later tests with m1.large server instances each subtest was timed precisely using "at", which eliminated this propagation of delays. The delays nevertheless remained, meaning that a 10-second test could easily become a 20+ second test. The culprit was thus not the serial timing but the inability of the server processes - especially iperf - to establish new connections quickly, which started to show with as few as 20 clients. This was clearly visible in the server-side iperf output: clients connected with delays of several seconds. This had two effects:

  • It prolonged the subtests (e.g. a 10-second test could become a 20+ second test)
  • It artificially increased the average client bandwidth, because the head and tail of clients had less competition (bandwidth-wise)

OpenVPN connection initialization itself, by contrast, seems to be much faster.

This head/tail effect can skew the results badly. These test results are a good example: two identical tests can produce quite different results. There are a few ways to counter the effect:

  • Lengthen the iperf test periods, so that any delays have a less pronounced effect
  • Randomize the times at which clients connect to iperf
  • Launch several iperf instances and split the client connections between them (see the sketch below)

These changes combined should produce more reliable results test after test.
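To illustrate the last point in the list above, several iperf server instances can listen on separate UDP ports, with the clients split between them; a sketch with example port numbers:

iperf -s -u -D -p 5001
iperf -s -u -D -p 5002
iperf -s -u -D -p 5003
iperf -s -u -D -p 5004

Each client is then pointed at one of the ports with the corresponding -p option. This is essentially the "1daemon-4iperfs" arrangement used in test 24-mp2 above.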