Saturday, 16 May 2015

Java Bytecode Latency Impact

In the 80's I remember building NAND circuits to represent code. It was pretty cool seeing how code could be implemented at a circuit level. What I was unsure of when I started SubMicroTrading was the impact on performance by java byte code and whether there were any possible optimisations available.

To cut a long story short, I found only one worthwhile optimisation and that’s how a switch statement is represented in byte code.

Consider the following switch statement :-

    Switch( a ) {
    case 10 : doAStuff(); break;
    case 20 : doBStuff(); break;
    case 30 : doCStuff(); break;
    case 40 : doDStuff(); break;
    ...
    }  
   
    is conceptually the same as
   
    if ( a == 10 ) doAStuff
    else if ( a == 20 ) doBStuff
    else if ( a == 30 ) doCStuff
    else if ( a == 40 ) doDStuff

Think about that, if you are parsing fix and you have 1000 possible fix tags, and an average of 20 fields in a message. Then to process a fix message you could possibly be making an average of 10,000 comparisons . If you want to process 1,000,000 events per second, that would be 10,000,000,000 comparisons per second. (A huge number that would be based on linear top down search, binary search is clearly much better … but point is its still a cost that can be avoided).

Java has two bytecodes for switch statements, the LookUpSwitch statement is in effect a table of key value to jump label … ie you have to search the table to find correct key entry. TableSwitch is in effect a table of jump labels which are indexed directly by the key value (minus the table offset .. Ie lowest key value in switch statement).

For Ultra Low Latency you should consider adding an ant task and check the bytecode for any "lookupswitch" statements. For message processing on most exchanges you can safely force a switch statement to become a tableswitch by adding packer entries so there are no gaps between the key values. In my CODEC generators I stipulate a max pack range eg 100, and any sparse values are handled within the default statement eg via a second switch or if statement if only couple of keys of interest. Like everything test out with real data to see impact. For me, the tableswitch made HUGE difference.

Sample tableswitch with packering case statements :-

Here is the start of the switch statement within the Standard44Decoder generated class :-

        final byte msgType = _fixMsg[ _idx ];
        switch( msgType ) {
        case '8':
            if ( _fixMsg[_idx+1 ] != FixField.FIELD_DELIMITER ) { // 2 byte message type
                throwDecodeException( "Unsupported fix message type " + _fixMsg[_idx] + _fixMsg[_idx+1] );
            }
            _idx += 2;
            return decodeExecReport();
        case 'D':
            if ( _fixMsg[_idx+1 ] != FixField.FIELD_DELIMITER ) { // 2 byte message type
                throwDecodeException( "Unsupported fix message type " + _fixMsg[_idx] + _fixMsg[_idx+1] );
            }
            _idx += 2;
            return decodeNewOrderSingle();
        ………….
        // packers
        case '6': case '7': case ':': case ';': case '<': case '=':
        case '>': case '?': case '@': case 'B': case 'C': case 'E':
            break;


javap -c ./com/rr/model/generated/fix/codec/Standard44Decoder.class >s44.bc

  protected final com.rr.core.model.Message doMessageDecode();
    Code:
…….
      74: tableswitch   { // 48 to 71
                    48: 549                                      // '0' ie heartbeat is the first entry
                    49: 987
                    50: 841
                    51: 768
                    52: 914
                    53: 695
                    54: 1060
                    55: 1060
                    56: 184                                        // '8' this was the first switch entry in java code
                    57: 476
                    58: 1060                                     // 1060 is same as default and are the packer entries
                    59: 1060
                    60: 1060
                    61: 1060
                    62: 1060
                    63: 1060
                    64: 1060
                    65: 622
                    66: 1060
                    67: 1060
                    68: 257
                    69: 1060
                    70: 403
                    71: 330
               default: 1060
          }
     184: aload_0
     185: getfield      #462                // Field _fixMsg:[B
     188: aload_0
     189: getfield      #466                // Field _idx:I
     192: iconst_1
     193: iadd
     194: baload
     195: iconst_1
     196: if_icmpeq     242
     199: aload_0
     200: new           #475                // class java/lang/StringBuilder
     203: dup
     204: ldc_w         #477                // String Unsupported fix message type

You could easily grep for lookupswitch … remember over time extra case statements could be added that cause the switch to become lookupSwitch again.


PS> clearly tableswitch isnt suitable for very sparse  switch statements, but in my experience its very useful in trading systems.