
JLBH Examples 3 – The Effects of Throughput on Latency

In this post:

  • A discussion about the effects of throughput on latency
  • How to use JLBH to measure TCP loopback
  • Adding probes to test both halves of the TCP round trip
  • Watching the effect of increasing throughput on latency
  • Understanding that you have to drop throughput to achieve good latencies at high percentiles.

In the last post we saw the effects of accounting for coordinated omission, i.e. measuring the effect that a delay to one iteration has on subsequent iterations.
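
JLBH handles this by timing each iteration from the moment it was due to start rather than from the moment it actually started, so a delayed iteration also penalises everything queued behind it. Here is a minimal sketch of the idea (the class, loop and variable names are illustrative, not part of JLBH):

import java.util.concurrent.locks.LockSupport;

// Sketch only: why measuring from the intended start time matters.
// At 20,000/s an iteration is due every 50us; if one iteration overruns,
// the delay it imposes on the iterations queued behind it is counted too.
public class CoordinatedOmissionSketch {
    public static void main(String[] args) {
        final long intervalNs = 50_000;                    // 1 message every 50us = 20,000/s
        long intendedStartNs = System.nanoTime();

        for (int i = 0; i < 10; i++) {
            intendedStartNs += intervalNs;                 // when this iteration should start
            while (System.nanoTime() < intendedStartNs) {  // pace: wait if we are early
                LockSupport.parkNanos(1_000);
            }
            long actualStartNs = System.nanoTime();

            LockSupport.parkNanos(1_000_000);              // stand-in for the operation being timed

            long endNs = System.nanoTime();
            long serviceTimeNs = endNs - actualStartNs;    // ignores co-ordinated omission
            long correctedNs = endNs - intendedStartNs;    // measured from the intended start
            System.out.printf("service %,d ns, corrected %,d ns%n", serviceTimeNs, correctedNs);
        }
    }
}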

Intuitively we understand that throughput will affect latency. It seems natural that as we raise throughput we will also raise latency.

Going into a very crowded shop will affect how fast you can select and purchase your goods. On the other hand, consider a rarely frequented shop. There the shopkeeper may be away from the till having a tea break, and your purchase will be delayed while you wait for him to put down his cup of tea, make his way to the counter and serve you.

This is exactly what you find when running benchmarks and varying the throughput.

In general, as you raise throughput, latency increases; but once throughput drops below a certain threshold, latency can start to increase again.

The code below times a round trip TCP call over loopback.

We add two probes:

  • client2server – the amount of time taken to complete the first half of the round trip
  • server2client – the amount of time taken to complete the second half of the trip

These probes do not take co-ordinated omission into account – it is only the end-to-end time that is corrected for co-ordinated omission.

Here’s the code for the benchmark:

package org.latency.tcp;
 
import net.openhft.affinity.Affinity;
import net.openhft.chronicle.core.Jvm;
import net.openhft.chronicle.core.jlbh.JLBHOptions;
import net.openhft.chronicle.core.jlbh.JLBHTask;
import net.openhft.chronicle.core.jlbh.JLBH;
import net.openhft.chronicle.core.util.NanoSampler;
 
import java.io.EOFException;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
 
public class TcpBenchmark implements JLBHTask {
    private final static int port = 8007;
    private static final boolean BLOCKING = false;
    private final int SERVER_CPU = Integer.getInteger("server.cpu", 0);
    private JLBH jlbh;
    private ByteBuffer bb;
    private SocketChannel socket;
    private NanoSampler client2serverProbe;
    private NanoSampler server2clientProbe;
 
    public static void main(String[] args) {
        JLBHOptions jlbhOptions = new JLBHOptions()
                .warmUpIterations(50000)
                .iterations(50000)
                .throughput(20000)
                .runs(5)
                .jlbhTask(new TcpBenchmark());
        new JLBH(jlbhOptions).start();
    }
 
    @Override
    public void init(JLBH jlbh) {
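        // Register the two latency probes, start the echo server thread, then connect the client socket over loopback.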
        this.jlbh = jlbh;
        client2serverProbe = jlbh.addProbe("client2server");
        server2clientProbe = jlbh.addProbe("server2clientProbe");
        try {
            runServer(port);
            Jvm.pause(200);
 
            socket = SocketChannel.open(new InetSocketAddress(port));
            socket.socket().setTcpNoDelay(true);
            socket.configureBlocking(BLOCKING);
 
        } catch (IOException e) {
            e.printStackTrace();
        }
 
        bb = ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
    }
 
    private void runServer(int port) throws IOException {
 
        new Thread(() -> {
            if (SERVER_CPU > 0) {
                System.out.println("server cpu: " + SERVER_CPU);
                Affinity.setAffinity(SERVER_CPU);
            }
            ServerSocketChannel ssc = null;
            SocketChannel socket = null;
            try {
                ssc = ServerSocketChannel.open();
                ssc.bind(new InetSocketAddress(port));
                System.out.println("listening on " + ssc);
 
                socket = ssc.accept();
                socket.socket().setTcpNoDelay(true);
                socket.configureBlocking(BLOCKING);
 
                System.out.println("Connected " + socket);
 
                ByteBuffer bb = ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
                while (true) {
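                    // Read the client's send timestamp, sample client2server, then echo back the server's own send time.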
                    readAll(socket, bb);
 
                    bb.flip();
                    long time = System.nanoTime();
                    client2serverProbe.sampleNanos(time - bb.getLong());
                    bb.clear();
                    bb.putLong(time);
                    bb.flip();
 
                    writeAll(socket, bb);
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                System.out.println("... disconnected " + socket);
                try {
                    if (ssc != null)
                        ssc.close();
                } catch (IOException ignored) {
                }
                try {
                    if (socket != null)
                        socket.close();
                } catch (IOException ignored) {
                }
            }
        }, "server").start();
 
    }
 
    private static void readAll(SocketChannel socket, ByteBuffer bb) throws IOException {
        bb.clear();
        do {
            if (socket.read(bb) < 0)
                throw new EOFException();
        } while (bb.remaining() > 0);
    }
 
    @Override
    public void run(long startTimeNs) {
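        // Write the client send time, wait for the echo, then sample server2client and the end-to-end time.
        // startTimeNs comes from JLBH; sampling against it lets the end-to-end figure account for co-ordinated omission.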
        bb.position(0);
        bb.putLong(System.nanoTime());
        bb.position(0);
        writeAll(socket, bb);
 
        bb.position(0);
        try {
            readAll(socket, bb);
            server2clientProbe.sampleNanos(System.nanoTime() - bb.getLong(0));
        } catch (IOException e) {
            e.printStackTrace();
        }
 
        jlbh.sample(System.nanoTime() - startTimeNs);
    }
 
    private static void writeAll(SocketChannel socket, ByteBuffer bb) {
        try {
            while (bb.remaining() > 0 && socket.write(bb) >= 0) ;
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
 
    @Override
    public void complete() {
        System.exit(0);
    }
}

And here are the results when run at a throughput of 20,000 iterations / second:

Warm up complete (50000 iterations took 2.296s)
-------------------------------- BENCHMARK RESULTS (RUN 1) ---------
Run time: 2.5s
Correcting for co-ordinated:true
Target throughput:20000/s = 1 message every 50us
End to End: (50,000)                            50/90 99/99.9 99.99 - worst was 34 / 2,950  19,400 / 20,450  20,450 - 20,450
client2server (50,000)                          50/90 99/99.9 99.99 - worst was 16 / 26  38 / 72  287 - 336
server2clientProbe (50,000)                     50/90 99/99.9 99.99 - worst was 16 / 27  40 / 76  319 - 901
OS Jitter (26,960)                              50/90 99/99.9 99.99 - worst was 9.0 / 16  44 / 1,340  10,220 - 11,800
--------------------------------------------------------------------
-------------------------------- BENCHMARK RESULTS (RUN 2) ---------
Run time: 2.5s
Correcting for co-ordinated:true
Target throughput:20000/s = 1 message every 50us
End to End: (50,000)                            50/90 99/99.9 99.99 - worst was 42 / 868  4,590 / 5,110  5,370 - 5,370
client2server (50,000)                          50/90 99/99.9 99.99 - worst was 20 / 27  38 / 92  573 - 2,560
server2clientProbe (50,000)                     50/90 99/99.9 99.99 - worst was 19 / 27  38 / 72  868 - 1,740
OS Jitter (13,314)                              50/90 99/99.9 99.99 - worst was 9.0 / 16  32 / 96  303 - 672
--------------------------------------------------------------------
-------------------------------- BENCHMARK RESULTS (RUN 3) ---------
Run time: 2.5s
Correcting for co-ordinated:true
Target throughput:20000/s = 1 message every 50us
End to End: (50,000)                            50/90 99/99.9 99.99 - worst was 34 / 152  999 / 2,160  2,290 - 2,290
client2server (50,000)                          50/90 99/99.9 99.99 - worst was 17 / 26  36 / 54  201 - 901
server2clientProbe (50,000)                     50/90 99/99.9 99.99 - worst was 16 / 25  36 / 50  225 - 1,740
OS Jitter (14,306)                              50/90 99/99.9 99.99 - worst was 9.0 / 15  23 / 44  160 - 184
--------------------------------------------------------------------
-------------------------------- SUMMARY (end to end)---------------
Percentile   run1         run2         run3      % Variation   var(log)
50:            33.79        41.98        33.79        13.91
90:          2949.12       868.35       151.55        75.92
99:         19398.66      4587.52       999.42        70.53
99.9:       20447.23      5111.81      2162.69        47.62
99.99:      20447.23      5373.95      2293.76        47.24
worst:      20447.23      5373.95      2293.76        47.24
--------------------------------------------------------------------
-------------------------------- SUMMARY (client2server)------------
Percentile   run1         run2         run3      % Variation  
50:            16.13        19.97        16.90        10.81      
90:            26.11        27.14        26.11         2.55      
99:            37.89        37.89        35.84         3.67      
99.9:          71.68        92.16        54.27        31.76      
99.99:        286.72       573.44       200.70        55.32      
worst:        335.87      2555.90       901.12        55.04      
--------------------------------------------------------------------
-------------------------------- SUMMARY (server2clientProbe)-------
Percentile   run1         run2         run3      % Variation  
50:            16.13        18.94        16.13        10.43      
90:            27.14        27.14        25.09         5.16      
99:            39.94        37.89        35.84         3.67      
99.9:          75.78        71.68        50.18        22.22      
99.99:        319.49       868.35       225.28        65.55      
worst:        901.12      1736.70      1736.70         0.00      
--------------------------------------------------------------------

What should happen is that:

client2server + server2client ~= endToEnd

And this is more or less what happens at the 50th percentile.

Take the second run for the purposes of this demonstration:

19.97 + 18.94 ~= 41.98

If that were all you were measuring, you might conclude there would be no problem running 20k messages/s through my machine.

However, my laptop clearly can't handle this volume. Look at the second run again, this time at the 90th percentile:

27.14 + 27.14 !~= 868.35

And as you move up the percentiles it just gets worse and worse…

If, however, I change the throughput to 5k messages/second, I see this at the 90th percentile:

32.23 + 29.38 ~= 62.46
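
The only change needed for that run is the throughput setting in the JLBHOptions builder shown above; something along these lines:

JLBHOptions jlbhOptions = new JLBHOptions()
        .warmUpIterations(50000)
        .iterations(50000)
        .throughput(5000)          // 1 message every 200us instead of every 50us
        .runs(5)
        .jlbhTask(new TcpBenchmark());
new JLBH(jlbhOptions).start();

The target rate could equally be read from a system property so the same benchmark can be swept across several throughputs without recompiling.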

So we see that if you want to achieve low latencies at the high percentiles, you have to drop your throughput to the appropriate level.

This is why it is so important to be able to vary the throughput with JLBH.

Daniel Shaya

Daniel has been programming in Java since it was in beta. Working predominantly in the finance industry he has created real time trading and margin risk applications. He is currently a director at OpenHFT where we are building next generation Java low latency products.