JLBH Examples 4 – Benchmarking QuickFix vs ChronicleFix
In this post:
- Using JLBH to test QuickFIX
- Observing how QuickFix latencies degrade through the percentiles
- Comparing QuickFIX with Chronicle FIX
As mentioned in the Introduction to JLBH, the primary reason JLBH was created was to measure the Chronicle-FIX engine.
We use all the features of JLBH, particularly the throughput lever and the accounting for coordinated omission, to get some realistic timings for QuickFIX.
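For reference, both of those levers appear in the JLBHOptions builder used later in this benchmark; the snippet below is lifted from the full listing further down (the throughput() call is the lever, and accountForCoordinatedOmmission() controls whether the delay caused by back-pressure is added back into each sample):

JLBHOptions jlbhOptions = new JLBHOptions()
        .warmUpIterations(20_000)
        .iterations(10_000)
        .throughput(2_000)                      // target iterations per second - the throughput lever
        .runs(3)
        .accountForCoordinatedOmmission(false)  // when true, samples are measured from the intended start time, so queuing delay is included
        .jlbhTask(new QFJLBHTask());
new JLBH(jlbhOptions).start();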
Later in this article we will look at some results for Chronicle-FIX, but first let's look at benchmarking QuickFIX, the open-source implementation of a FIX engine.
This is the scenario we will be benchmarking:
- The client creates a NewOrderSingle which is then passed to the server.
- The server parses the NewOrderSingle.
- The server creates an ExecutionReport which is sent back to the client.
- The client receives the ExecutionReport.
End-to-end time is measured from when the client starts creating the NewOrderSingle to when the client receives the ExecutionReport.
Note: We need to carry the start time that the benchmark was called with right through the program. To do this we use a trick and store the start time in the ClOrdID tag.
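Concretely, the trick looks like this; both snippets are taken from the benchmark listing below:

// Client side: JLBH hands run() the start time in nanoseconds, which we
// smuggle across the wire in the ClOrdID tag of the NewOrderSingle.
@Override
public void run(long startTimeNs) {
    newOrderSingle.set(new ClOrdID(Long.toString(startTimeNs)));
    try {
        Session.sendToTarget(newOrderSingle, client.sessionId);
    } catch (SessionNotFound sessionNotFound) {
        sessionNotFound.printStackTrace();
    }
}

// When the ExecutionReport (which echoes the ClOrdID) comes back, the client
// recovers the start time and records the end-to-end latency with JLBH.
@Override
public void fromApp(Message message, SessionID arg1) throws FieldNotFound,
        IncorrectDataFormat, IncorrectTagValue, UnsupportedMessageType {
    long startTime = Long.parseLong(((ExecutionReport) message).getClOrdID().getValue());
    lth.sample(System.nanoTime() - startTime);
}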
If you want to get the benchmark running on your servers you should clone this GitHub repo; all jars and configuration files are set up there.
For the sake of this post, here's the code for the benchmark.
package org.latency.quickfix;

import net.openhft.chronicle.core.Jvm;
import net.openhft.chronicle.core.jlbh.JLBHOptions;
import net.openhft.chronicle.core.jlbh.JLBHTask;
import net.openhft.chronicle.core.jlbh.JLBH;
import quickfix.*;
import quickfix.field.*;
import quickfix.fix42.ExecutionReport;
import quickfix.fix42.NewOrderSingle;

import java.util.Date;
import java.util.concurrent.Executors;

/**
 * Created by daniel on 19/02/2016.
 * Latency task to test sending a message in QuickFix
 */
public class QFJLBHTask implements JLBHTask {

    private QFClient client;
    private JLBH lth;
    private static NewOrderSingle newOrderSingle;
    private static ExecutionReport executionReport;

    public static void main(String[] args) {
        executionReport = new ExecutionReport();
        executionReport.set(new AvgPx(110.11));
        executionReport.set(new CumQty(7));
        executionReport.set(new ClientID("TEST"));
        executionReport.set(new ExecID("tkacct.151124.e.EFX.122.6"));
        executionReport.set(new OrderID("tkacct.151124.e.EFX.122.6"));
        executionReport.set(new Side('1'));
        executionReport.set(new Symbol("EFX"));
        executionReport.set(new ExecType('2'));
        executionReport.set(new ExecTransType('0'));
        executionReport.set(new OrdStatus('0'));
        executionReport.set(new LeavesQty(0));

        newOrderSingle = new NewOrderSingle();
        newOrderSingle.set(new OrdType('2'));
        newOrderSingle.set(new Side('1'));
        newOrderSingle.set(new Symbol("LCOM1"));
        newOrderSingle.set(new HandlInst('3'));
        newOrderSingle.set(new TransactTime(new Date()));
        newOrderSingle.set(new OrderQty(1));
        newOrderSingle.set(new Price(200.0));
        newOrderSingle.set(new TimeInForce('0'));
        newOrderSingle.set(new MaturityMonthYear("201106"));
        newOrderSingle.set(new SecurityType("FUT"));
        newOrderSingle.set(new IDSource("5"));
        newOrderSingle.set(new SecurityID("LCOM1"));
        newOrderSingle.set(new Account("ABCTEST1"));

        JLBHOptions jlbhOptions = new JLBHOptions()
                .warmUpIterations(20_000)
                .iterations(10_000)
                .throughput(2_000)
                .runs(3)
                .accountForCoordinatedOmmission(false)
                .jlbhTask(new QFJLBHTask());
        new JLBH(jlbhOptions).start();
    }

    @Override
    public void init(JLBH lth) {
        this.lth = lth;
        Executors.newSingleThreadExecutor().submit(() -> {
            QFServer server = new QFServer();
            server.start();
        });
        Jvm.pause(3000);
        client = new QFClient();
        client.start();
    }

    @Override
    public void complete() {
        System.exit(0);
    }

    @Override
    public void run(long startTimeNs) {
        newOrderSingle.set(new ClOrdID(Long.toString(startTimeNs)));
        try {
            Session.sendToTarget(newOrderSingle, client.sessionId);
        } catch (SessionNotFound sessionNotFound) {
            sessionNotFound.printStackTrace();
        }
    }

    private class QFServer implements Application {
        void start() {
            SocketAcceptor socketAcceptor;
            try {
                SessionSettings executorSettings = new SessionSettings(
                        "src/main/resources/acceptorSettings.txt");
                FileStoreFactory fileStoreFactory = new FileStoreFactory(executorSettings);
                MessageFactory messageFactory = new DefaultMessageFactory();
                FileLogFactory fileLogFactory = new FileLogFactory(executorSettings);
                socketAcceptor = new SocketAcceptor(this, fileStoreFactory,
                        executorSettings, fileLogFactory, messageFactory);
                socketAcceptor.start();
            } catch (ConfigError e) {
                e.printStackTrace();
            }
        }

        @Override
        public void onCreate(SessionID sessionId) {
        }

        @Override
        public void onLogon(SessionID sessionId) {
        }

        @Override
        public void onLogout(SessionID sessionId) {
        }

        @Override
        public void toAdmin(Message message, SessionID sessionId) {
        }

        @Override
        public void fromAdmin(Message message, SessionID sessionId)
                throws FieldNotFound, IncorrectDataFormat, IncorrectTagValue, RejectLogon {
        }

        @Override
        public void toApp(Message message, SessionID sessionId) throws DoNotSend {
        }

        @Override
        public void fromApp(Message message, SessionID sessionId)
                throws FieldNotFound, IncorrectDataFormat, IncorrectTagValue, UnsupportedMessageType {
            try {
                executionReport.set(((NewOrderSingle) message).getClOrdID());
                Session.sendToTarget(executionReport, sessionId);
            } catch (SessionNotFound invalidMessage) {
                invalidMessage.printStackTrace();
            }
        }
    }

    private class QFClient implements Application {
        private SessionID sessionId = null;

        void start() {
            SocketInitiator socketInitiator;
            try {
                SessionSettings sessionSettings = new SessionSettings(
                        "src/main/resources/initiatorSettings.txt");
                FileStoreFactory fileStoreFactory = new FileStoreFactory(sessionSettings);
                FileLogFactory logFactory = new FileLogFactory(sessionSettings);
                MessageFactory messageFactory = new DefaultMessageFactory();
                socketInitiator = new SocketInitiator(this, fileStoreFactory,
                        sessionSettings, logFactory, messageFactory);
                socketInitiator.start();
                sessionId = socketInitiator.getSessions().get(0);
                Session.lookupSession(sessionId).logon();
                while (!Session.lookupSession(sessionId).isLoggedOn()) {
                    Thread.sleep(100);
                }
            } catch (Throwable exp) {
                exp.printStackTrace();
            }
        }

        @Override
        public void fromAdmin(Message arg0, SessionID arg1)
                throws FieldNotFound, IncorrectDataFormat, IncorrectTagValue, RejectLogon {
        }

        @Override
        public void fromApp(Message message, SessionID arg1)
                throws FieldNotFound, IncorrectDataFormat, IncorrectTagValue, UnsupportedMessageType {
            long startTime = Long.parseLong(((ExecutionReport) message).getClOrdID().getValue());
            lth.sample(System.nanoTime() - startTime);
        }

        @Override
        public void onCreate(SessionID arg0) {
        }

        @Override
        public void onLogon(SessionID arg0) {
            System.out.println("Successfully logged on for sessionId : " + arg0);
        }

        @Override
        public void onLogout(SessionID arg0) {
            System.out.println("Successfully logged out for sessionId : " + arg0);
        }

        @Override
        public void toAdmin(Message message, SessionID sessionId) {
            boolean result;
            try {
                result = MsgType.LOGON.equals(message.getHeader().getField(new MsgType()).getValue());
            } catch (FieldNotFound e) {
                result = false;
            }
            if (result) {
                ResetSeqNumFlag resetSeqNumFlag = new ResetSeqNumFlag();
                resetSeqNumFlag.setValue(true);
                ((quickfix.fix42.Logon) message).set(resetSeqNumFlag);
            }
        }

        @Override
        public void toApp(Message arg0, SessionID arg1) throws DoNotSend {
        }
    }
}
These are the results I see running on my server, an Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz. All latencies in the tables that follow are in microseconds.
At a throughput of 2,000/s
Percentile   run1        run2        run3        % Variation
50:          270.34      270.34      233.47      9.52
90:          352.26      335.87      1867.78     75.25
99:          6684.67     4849.66     69206.02    89.84
99.9:        13369.34    12845.06    163577.86   88.67
99.99:       81788.93    20447.23    163577.86   82.35
worst:       111149.06   98566.14    163577.86   30.54
At a throughput of 10,000/s
Percentile   run1        run2        run3        % Variation
50:          184.32      176.13      176.13      0.00
90:          573.44      270.34      249.86      5.18
99:          19398.66    2686.98     5111.81     37.56
99.9:        28835.84    7733.25     7995.39     2.21
99.99:       30932.99    9699.33     9175.04     3.67
worst:       30932.99    9699.33     9175.04     3.67
The mean here is ~200us, but as you move up through the percentiles the latencies really begin to degrade. A lot of this is down to the amount of garbage being created! You can see this by running the benchmark with the JVM flag -verbose:gc. In fact, when you push the throughput up to 50,000/s even your 90th percentile (1 in every 10 iterations) is completely blown and you end up with multi-millisecond delays.
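Before looking at those 50,000/s numbers: if you would rather confirm the garbage claim from inside the process than read the GC log, a minimal sketch (not part of the original benchmark) is to snapshot the GC MXBeans before and after a run:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not from the original benchmark: prints the collection
// count and accumulated GC time for each collector; call it before and after
// a JLBH run to see how much collection activity the run triggered.
public class GcSnapshot {
    public static void dump(String label) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s  %s: collections=%d, time=%dms%n",
                    label, gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}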
At a throughput of 50,000/s
Percentile   run1        run2        run3        % Variation   var(log)
50:          176.13      176.13      176.13      0.00          11.82
90:          12845.06    29884.42    3604.48     82.94         21.01
99:          34603.01    94371.84    17301.50    74.81         25.26
99.9:        42991.62    98566.14    25690.11    65.41         25.84
99.99:       45088.77    98566.14    27787.26    62.94         25.93
worst:       45088.77    98566.14    27787.26    62.94         25.93
The issue here is not just the mean time (assuming ~200us is too slow for you) but, even more worryingly, the way the numbers degrade as throughput is raised and as you study the higher percentiles. Let's compare this with Chronicle-FIX. The test is run for exactly the same scenario and on the same machine.
The results look like this:
At a throughput of 2,000/s
Percentile   run1        run2        run3        % Variation
50:          16.90       16.90       16.90       0.00
90:          18.94       18.94       18.94       0.00
99:          26.11       30.21       23.04       17.18
99.9:        35.84       39.94       33.79       10.81
99.99:       540.67      671.74      401.41      65.41
worst:       638.98      1081.34     606.21      61.59
At a throughput of 10,000/s
Percentile   run1        run2        run3        % Variation
50:          16.90       16.90       16.13       3.08
90:          18.94       18.94       18.94       0.00
99:          26.11       22.02       20.99       3.15
99.9:        88.06       33.79       83.97       49.75
99.99:       999.42      167.94      802.82      71.59
worst:       1146.88     249.86      966.66      65.67
At a throughput of 50,000/s
Percentile   run1        run2        run3        % Variation
50:          15.62       15.10       15.62       2.21
90:          17.92       16.90       16.90       0.00
99:          22.02       30.21       29.18       2.29
99.9:        120.83      352.26      33.79       86.27
99.99:       335.87      802.82      96.26       83.03
worst:       450.56      901.12      151.55      76.73
The Chronicle-FIX mean is around 16us, which is 12 times faster than QuickFIX. But there is more to it than that, because nearly all of that time is in the TCP round trip. When you measure the TCP time (see the last post, JLBH Examples 3 – The Affects of Throughput on Latency) it turns out that most of that time is TCP: ~10us. Therefore, if you deduct the TCP times you get:
- QuickFIX: 200 – 10 = 190us
- Chronicle-FIX: 16 – 10 = 6us
- Chronicle-FIX is over 30 times faster than QuickFIX
And, as has been demonstrated, if you care about your upper percentiles it gets much worse than that. For completeness it should be noted that the server this was benchmarked on was reasonably noisy. It had latency spikes of around 400us, which accounts for the larger numbers showing up in the higher percentiles. Also, this test uses loopback TCP, which puts a huge stress on the Linux kernel. In fact strange things happen when you push the throughput up really high (you can try this with the simple TCP test), so this is not the best way to test Chronicle-FIX; it is only used here as a comparison against QuickFIX.
Using Chronicle-FIX, if you measure the parsing of a FIX message into its data model, including logging, on a tuned server, you actually see this profile, tested at throughputs from 10,000/s to 200,000/s:
Percentile   run1     run2     run3     run4     run5
50:          1.01     1.01     1.01     1.01     1.06
90:          1.12     1.12     1.12     1.12     1.12
99:          1.38     1.31     1.44     1.31     2.11
99.9:        2.88     2.88     2.88     2.88     4.03
99.99:       3.26     3.14     3.39     3.14     6.02
worst:       5.25     6.27     22.02    20.99    18.94
Reference: JLBH Examples 4 – Benchmarking QuickFix vs ChronicleFix from our JCG partner Daniel Shaya at the Rational Java blog.