6th October 2020
Continuous Performance Testing with Hydra Platform
When we set out to build a high-performance application development framework, we knew it would be important to measure, profile, and benchmark the entire stack to ensure that we provided our customers with outstanding performance.
Our performance testing strategy for the platform was to write targeted JMH [1] benchmarks on critical components (e.g. message codecs, which are executed for every message handled by the platform), and larger end-to-end tests to exercise multiple components representative of a customer system. We use the end-to-end tests to enable system profiling, helping us to understand areas that would benefit from optimisation effort.
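To give a flavour of the targeted benchmarks, a minimal JMH sketch is shown below. The "encoding" here is a placeholder that writes a few fields into a pre-allocated buffer; real Hydra codecs and their APIs are not shown.

import java.nio.ByteBuffer;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Minimal JMH sketch of a targeted codec-style benchmark. The message layout
// and field values are placeholders, not a Hydra codec.
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MessageEncodeBenchmark {
    private final ByteBuffer buffer = ByteBuffer.allocateDirect(64);

    @Benchmark
    public ByteBuffer encode() {
        // Encode a fixed message: sequence number, timestamp and a quantity field.
        buffer.clear();
        buffer.putLong(42L);
        buffer.putLong(System.nanoTime());
        buffer.putInt(100);
        return buffer;
    }
}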
Both of these classes of test are run continuously on machines in the Adaptive performance lab. Test results are recorded in a database, and regressions in throughput or response-time cause the build to fail. In this way, we are applying the lessons learned in Continuous Delivery (CD) pipelines to performance testing. The process also has the added bonus of allowing us to publish a performance report as part of the platform release notes.
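The gate itself can be simple in principle. The sketch below illustrates the kind of check involved; the 10% tolerance and the fetchBaseline/fetchLatest helpers are hypothetical, standing in for queries against the results database.

// Illustrative regression gate: fail the CI build if the latest 99th percentile
// response time is materially worse than the stored baseline.
public final class ResponseTimeGate {
    private static final double TOLERANCE = 1.10; // allow up to 10% degradation (assumption)

    public static void main(String[] args) {
        long baselineP99Nanos = fetchBaseline("echo-service", "p99");
        long latestP99Nanos = fetchLatest("echo-service", "p99");

        if (latestP99Nanos > baselineP99Nanos * TOLERANCE) {
            System.err.printf("p99 regression: %dns vs baseline %dns%n", latestP99Nanos, baselineP99Nanos);
            System.exit(1); // a non-zero exit code fails the build step
        }
    }

    // Placeholders: in practice these would query the performance results database.
    private static long fetchBaseline(String test, String metric) { return 500_000L; }
    private static long fetchLatest(String test, String metric) { return 480_000L; }
}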
Our end-to-end performance testing framework was initially developed to benchmark Aeron Cluster [2], the technology underlying Hydra’s clustering capability, and was further extended to test a crypto-currency exchange via the FIX protocol. We recognise that developing a robust performance testing harness is a substantial undertaking in terms of development, so we aim to provide our customers with an out-of-the-box testing framework for their Hydra Platform applications.
In this post, we look at how our testing experience can be leveraged by users of Hydra Platform to create and deploy continuous performance testing of their own applications.
Benefits of performance testing
Thorough performance testing has multiple benefits:
- Understanding the capacity of the system – discover what the system can handle in terms of traffic before it begins to degrade.
- Graceful degradation – test how the system will respond if traffic begins to exceed capacity.
- Response time measurement – understand what system response times users can expect, and determine monitoring thresholds.
- Regression testing – detect and fix performance regressions before they are released to production.
Designing a test harness
When measuring the performance of a distributed system, there are several approaches that can be taken. In our case, we wanted something that could be deployed in native environments or the cloud, so we couldn’t rely on any special hardware (such as a network tap that could be used to measure response times). We also didn’t want to rely on message tracing for timing metrics, as this would introduce some small overhead into the measurements. This meant that the load-generator would also be responsible for recording measurements, making the engineering of the load-generator as important as the platform components themselves.
A Hydra Platform performance test consists of the system under test (the customer application) and a measurement application, which drives messages through the system and records response times.
The measurement application is optionally generated as part of the build by tagging an endpoint with an annotation:
@GeneratePerfTest()
web-gateway AdminGateway = {
    connectsTo: Engine
    services: {
        AuthService
        EchoService
    }
}
The GeneratePerfTest annotation instructs the Hydra code generator that we wish to measure the performance of the AdminGateway component.
Developers using Hydra Platform need to supply the measurement application with the following:
- A sender component – this must be able to encode a timestamp into a message handled by the gateway, such as a request ID.
- A filter component – this must be able to filter out any gateway responses that do not contain an encoded timestamp.
- A decoder component – this must be able to extract the encoded timestamp from the gateway response.
When the application is executed, it will instruct the sender to send messages to the gateway at a configured rate, for a configured interval. Response times are calculated and reported at the end of the test. The framework handles concerns such as warm-up and back-pressure monitoring, meaning that the developer can focus on just implementing the application-specific parts of the test.
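Conceptually, the pacing works along the lines of the sketch below: each send is scheduled against an absolute timeline so that any inability to keep up is visible rather than silently absorbed. This is an illustration of the idea only, not the framework's actual scheduling code, and MessageSender stands in for the generated sender callback shown later in this post.

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

// Sketch of a paced load loop: send at a fixed rate for a fixed duration,
// scheduling each message against an absolute timeline.
public final class PacedSendLoop {

    /** Stand-in for the generated sender callback. */
    public interface MessageSender {
        void encode(long sendingTimestamp, long sequenceNumber);
    }

    public static void run(MessageSender sender, int messagesPerSecond, long durationSeconds) {
        final long intervalNanos = TimeUnit.SECONDS.toNanos(1) / messagesPerSecond;
        final long totalMessages = messagesPerSecond * durationSeconds;
        final long startNanos = System.nanoTime();

        for (long sequence = 0; sequence < totalMessages; sequence++) {
            final long intendedSendTime = startNanos + sequence * intervalNanos;
            while (System.nanoTime() < intendedSendTime) {
                LockSupport.parkNanos(1_000); // wait for the scheduled send slot
            }
            // Timestamp against the intended slot rather than "now", so delays in the
            // sender itself show up as increased response times instead of being hidden.
            sender.encode(intendedSendTime, sequence);
        }
    }
}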
The components used in the load-generator are designed to be allocation-free, zero-copy, and highly performant. This ensures that measurements taken by the application are not negatively affected by system jitter such as garbage collection. We also utilise Hydra’s thread-affinity library to assign the load-generator threads to specific CPU cores, further reducing jitter in the recorded response times.
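Hydra's thread-affinity library is internal to the platform, but the idea can be illustrated with the open-source OpenHFT Affinity library: the measurement thread is pinned to a dedicated core so that the OS scheduler cannot migrate it mid-test. Treat the snippet below as a sketch of the technique, not the framework's own code.

import net.openhft.affinity.AffinityLock;

// Illustration only: pin the current (load-generator) thread to a reserved core
// using the open-source OpenHFT Affinity library, not Hydra's internal library.
public final class PinnedLoadGenerator {
    public static void main(String[] args) {
        AffinityLock lock = AffinityLock.acquireLock();
        try {
            // ... run the measurement loop here while pinned to a core ...
        } finally {
            lock.release();
        }
    }
}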
Testing the Platform
In our internal tests for Hydra Platform, we are primarily interested in testing the baseline performance of the platform. Since every customer application is going to be different in terms of work performed in the cluster and gateway components, we can only provide guidance on the kind of response times that can be achieved; any application business logic will of course add overhead to overall performance.
To test the platform components, we define a simple Echo service, with the gateway acting as a pass-through channel. The Echo service runs within the cluster and simply echoes back the messages it receives from clients.
To test the Echo service, the client connects to the AdminGateway via WebSocket (or another channel, depending on the gateway type).
The sender implements the encode method, called by the framework when a message needs to be sent:
public class UserClientToAdminGatewaySender extends ClientToAdminGatewayChannelSender {

    // Reused buffer so that encoding a timestamp does not allocate per message.
    private final StringBuilder buffer = new StringBuilder(20);

    public UserClientToAdminGatewaySender(ClientToAdminGatewayChannel channel) {
        super(channel);
    }

    @Override
    public void encode(long sendingTimestamp, long sequenceNumber) {
        EchoServiceProxy echoServiceProxy = getEchoServiceProxy();
        try (MutableEchoRequest echoRequest = echoServiceProxy.acquireEchoRequest()) {
            // Embed the sending timestamp in the request body so it can be
            // recovered from the response and used to calculate response time.
            buffer.setLength(0);
            buffer.append("TIMING:");
            buffer.append(sendingTimestamp);
            echoRequest.body(buffer);
            echoServiceProxy.echo(sequenceNumber, echoRequest);
        }
    }
}
On the receive path, the filter class is used to select responses from the Echo service containing a timing message:
public class PrefixEchoResponseFilter implements Predicate<EchoResponse> {

    @Override
    public boolean test(final EchoResponse echoResponse) {
        return echoResponse.body().startsWith("TIMING:");
    }
}
When a message passes the filter, it is handed to the decoder implementation so that the framework can retrieve the encoded timestamp:
public final class EchoResponseTimestampDecoder implements TimestampDecoder<EchoResponse> {

    @Override
    public long decode(final EchoResponse event) {
        // Parse the timestamp that follows the "TIMING:" prefix, without
        // allocating an intermediate String.
        return Long.parseLong(event.body(), "TIMING:".length(), event.body().length(), 10);
    }
}
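Once the timestamp has been extracted, calculating and recording the response time is straightforward; the framework handles this internally. As a rough illustration, assuming the open-source HdrHistogram library and timestamps taken from System.nanoTime() on the load-generator host, the recording might look like this:

import java.util.concurrent.TimeUnit;

import org.HdrHistogram.Histogram;

// Illustration of recording response times into a histogram; this is not the
// framework's internal code.
public final class ResponseTimeRecorder {
    // Track values up to one minute with three significant digits of precision.
    private final Histogram histogram = new Histogram(TimeUnit.MINUTES.toNanos(1), 3);

    public void onResponse(long sendingTimestampNanos) {
        histogram.recordValue(System.nanoTime() - sendingTimestampNanos);
    }

    public void report() {
        System.out.printf("p50=%dns p99=%dns max=%dns%n",
            histogram.getValueAtPercentile(50.0),
            histogram.getValueAtPercentile(99.0),
            histogram.getMaxValue());
    }
}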
Metrics
The performance test framework can capture the following data during the course of a test:
- Garbage collection pauses [3] – captured for both the measurement application and the application components (example capture flags are shown after this list)
- Time-to-safepoint pauses – these can be indicative of resource starvation
- Response time histogram – this models the response time users of the application will experience
- Dropped message count – indicates message loss
- Co-ordinated omission [5] – indicates that the sending application could not send at the requested rate, for example due to back-pressure
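For the garbage-collection and time-to-safepoint pauses listed above, unified JVM logging (JDK 9+) is one general-purpose way to capture the raw data. The command line below is an example of such flags, not the framework's exact configuration; the application class is a placeholder.

# Example only: log GC and safepoint pauses to a file using JDK 9+ unified logging.
java -Xlog:gc*,safepoint:file=jvm-pauses.log:time,uptime,level,tags com.example.Application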
Summary
We have described the performance testing capabilities available to users of Hydra Platform. With minimal development effort, these can be used to create automated end-to-end performance tests of a Hydra application. This helps de-risk performance issues before launch, and provides a framework for regression testing to identify any future degradations.
The main benefits of using the Hydra Platform performance testing features are:
Accelerated Delivery
Using a ready-made solution for performance testing means that your development and QA effort can be focussed on writing test scenarios, rather than building a framework.
Reduced Risk
By enabling continuous performance testing early in the project lifecycle, users can reduce the risk of discovering performance problems when the project is close to completion.
Writing a correct performance test harness is difficult; our tried-and-tested solution has been designed by experts to calculate and record accurate measurements.
Flexibility and Control
Performance tests remain the property of the user, and can be modified as business functionality changes. Performance tests can be kept up-to-date with newly implemented features, in line with functional tests.
If you are interested in knowing more about Hydra Platform, click here.
Wojciech Lukowicz
Performance Engineer,
Adaptive Financial Consulting