In the 80's I
remember building NAND circuits to represent code. It was pretty cool seeing
how code could be implemented at a circuit level. What I was unsure of when I
started SubMicroTrading was the impact on performance by java byte code and
whether there were any possible optimisations available.
To cut a long story
short, I found only one worthwhile optimisation and that’s how a switch statement is represented
in byte code.
Consider the
following switch statement :-
Switch( a ) {
case 10 : doAStuff(); break;
case 20 : doBStuff(); break;
case 30 : doCStuff(); break;
case 40 : doDStuff(); break;
...
}
is conceptually the same as
if ( a == 10 ) doAStuff
else if ( a == 20 ) doBStuff
else if ( a == 30 ) doCStuff
else if ( a == 40 ) doDStuff
Think about that, if
you are parsing fix and you have 1000 possible fix tags, and an average of 20
fields in a message. Then to process a fix message you could possibly be making
an average of 10,000 comparisons . If you want to process 1,000,000 events per
second, that would be 10,000,000,000 comparisons per second. (A huge number
that would be based on linear top down search, binary search is clearly much
better … but point is its still a cost that can be avoided).
Java has two
bytecodes for switch statements, the LookUpSwitch statement is in effect a
table of key value to jump label … ie you have to search the table to find
correct key entry. TableSwitch is in effect a table of jump labels which are
indexed directly by the key value (minus the table offset .. Ie lowest key
value in switch statement).
For Ultra Low
Latency you should consider adding an ant task and check the bytecode for any
"lookupswitch" statements. For message processing on most exchanges
you can safely force a switch statement to become a tableswitch by adding
packer entries so there are no gaps between the key values. In my CODEC
generators I stipulate a max pack range eg 100, and any sparse values are
handled within the default statement eg via a second switch or if statement if
only couple of keys of interest. Like everything test out with real data to see
impact. For me, the tableswitch made HUGE difference.
Sample tableswitch
with packering case statements :-
Here is the start of
the switch statement within the Standard44Decoder generated class :-
final byte msgType = _fixMsg[ _idx ];
switch( msgType ) {
case '8':
if ( _fixMsg[_idx+1 ] !=
FixField.FIELD_DELIMITER ) { // 2 byte message type
throwDecodeException(
"Unsupported fix message type " + _fixMsg[_idx] + _fixMsg[_idx+1] );
}
_idx += 2;
return decodeExecReport();
case 'D':
if ( _fixMsg[_idx+1 ] !=
FixField.FIELD_DELIMITER ) { // 2 byte message type
throwDecodeException(
"Unsupported fix message type " + _fixMsg[_idx] + _fixMsg[_idx+1] );
}
_idx += 2;
return decodeNewOrderSingle();
………….
// packers
case '6': case '7': case ':': case ';':
case '<': case '=':
case '>': case '?': case '@': case
'B': case 'C': case 'E':
break;
javap -c
./com/rr/model/generated/fix/codec/Standard44Decoder.class >s44.bc
protected final com.rr.core.model.Message
doMessageDecode();
Code:
…….
74: tableswitch { // 48 to 71
48: 549 // '0' ie
heartbeat is the first entry
49: 987
50: 841
51: 768
52: 914
53: 695
54: 1060
55: 1060
56: 184 // '8'
this was the first switch entry in java code
57: 476
58: 1060 // 1060 is
same as default and are the packer entries
59: 1060
60: 1060
61: 1060
62: 1060
63: 1060
64: 1060
65: 622
66: 1060
67: 1060
68: 257
69: 1060
70: 403
71: 330
default: 1060
}
184: aload_0
185: getfield #462 // Field _fixMsg:[B
188: aload_0
189: getfield #466 // Field _idx:I
192: iconst_1
193: iadd
194: baload
195: iconst_1
196: if_icmpeq 242
199: aload_0
200: new #475 // class
java/lang/StringBuilder
203: dup
204: ldc_w #477 // String Unsupported fix
message type
You could easily
grep for lookupswitch … remember over time extra case statements could be added
that cause the switch to become lookupSwitch again.
PS> clearly tableswitch isnt suitable for very sparse switch statements, but in my experience its very useful in trading systems.