Ultra Low Latency Trading Systems: 05/16/15

In the 80's I remember building NAND circuits to represent code. It was pretty cool seeing how code could be implemented at a circuit level. What I was unsure of when I started SubMicroTrading was the impact on performance by java byte code and whether there were any possible optimisations available.

To cut a long story short, I found only one worthwhile optimisation and that’s how a switch statement is represented in byte code.

Consider the following switch statement :-

Switch( a ) {

case 10 : doAStuff(); break;

case 20 : doBStuff(); break;

case 30 : doCStuff(); break;

case 40 : doDStuff(); break;

...

}

is conceptually the same as

if ( a == 10 ) doAStuff

else if ( a == 20 ) doBStuff

else if ( a == 30 ) doCStuff

else if ( a == 40 ) doDStuff

Think about that, if you are parsing fix and you have 1000 possible fix tags, and an average of 20 fields in a message. Then to process a fix message you could possibly be making an average of 10,000 comparisons . If you want to process 1,000,000 events per second, that would be 10,000,000,000 comparisons per second. (A huge number that would be based on linear top down search, binary search is clearly much better … but point is its still a cost that can be avoided).

Java has two bytecodes for switch statements, the LookUpSwitch statement is in effect a table of key value to jump label … ie you have to search the table to find correct key entry. TableSwitch is in effect a table of jump labels which are indexed directly by the key value (minus the table offset .. Ie lowest key value in switch statement).

For Ultra Low Latency you should consider adding an ant task and check the bytecode for any "lookupswitch" statements. For message processing on most exchanges you can safely force a switch statement to become a tableswitch by adding packer entries so there are no gaps between the key values. In my CODEC generators I stipulate a max pack range eg 100, and any sparse values are handled within the default statement eg via a second switch or if statement if only couple of keys of interest. Like everything test out with real data to see impact. For me, the tableswitch made HUGE difference.

Sample tableswitch with packering case statements :-

Here is the start of the switch statement within the Standard44Decoder generated class :-

final byte msgType = _fixMsg[ _idx ];

switch( msgType ) {

case '8':

if ( _fixMsg[_idx+1 ] != FixField.FIELD_DELIMITER ) { // 2 byte message type

throwDecodeException( "Unsupported fix message type " + _fixMsg[_idx] + _fixMsg[_idx+1] );

}

_idx += 2;

return decodeExecReport();

case 'D':

if ( _fixMsg[_idx+1 ] != FixField.FIELD_DELIMITER ) { // 2 byte message type

throwDecodeException( "Unsupported fix message type " + _fixMsg[_idx] + _fixMsg[_idx+1] );

}

_idx += 2;

return decodeNewOrderSingle();

………….

// packers

case '6': case '7': case ':': case ';': case '<': case '=':

case '>': case '?': case '@': case 'B': case 'C': case 'E':

break;

javap -c ./com/rr/model/generated/fix/codec/Standard44Decoder.class >s44.bc

protected final com.rr.core.model.Message doMessageDecode();

Code:

…….

74: tableswitch { // 48 to 71

48: 549 // '0' ie heartbeat is the first entry

49: 987

50: 841

51: 768

52: 914

53: 695

54: 1060

55: 1060

56: 184 // '8' this was the first switch entry in java code

57: 476

58: 1060 // 1060 is same as default and are the packer entries

59: 1060

60: 1060

61: 1060

62: 1060

63: 1060

64: 1060

65: 622

66: 1060

67: 1060

68: 257

69: 1060

70: 403

71: 330

default: 1060

}

184: aload_0

185: getfield #462 // Field _fixMsg:[B

188: aload_0

189: getfield #466 // Field _idx:I

192: iconst_1

193: iadd

194: baload

195: iconst_1

196: if_icmpeq 242

199: aload_0

200: new #475 // class java/lang/StringBuilder

203: dup

204: ldc_w #477 // String Unsupported fix message type

You could easily grep for lookupswitch … remember over time extra case statements could be added that cause the switch to become lookupSwitch again.

PS> clearly tableswitch isnt suitable for very sparse switch statements, but in my experience its very useful in trading systems.

Click here for my list of Ultra Low Latency Blogs and Future Topics

Ultra Low Latency Trading Systems

Saturday, 16 May 2015

Java Bytecode Latency Impact