Saturday 6 June 2015

Setting Thread Affinity and Priority using JNI in SubMicroTrading


 
There is no real mystique in JNI calls; the idea that JNI is slow is a misconception. If you keep your JNI interfaces simple then, once the code is compiled, it's just another function call (albeit with an extra two parameters).

I recommend wrapping JNI calls in an envelope that allows switching between Linux, Windows and perhaps no custom JNI at all. I developed SubMicroTrading on a little Dell Adamo laptop and could run the exchange sim, market data sim and trading application all on a dual core with 4GB RAM … try doing that in C++ !

In SubMicroTrading all custom JNI calls (excluding custom NIO) are wrapped within a class called NativeHooksImpl (simplified and cut down version below).

public class NativeHooksImpl implements NativeHooks {

    private static boolean _linuxNative   = false;
   
    static {
        if ( Env.isUseLinuxNative() ) {
            System.loadLibrary( "submicrocore" );
            _linuxNative = true;
        }
    }

    private static NativeHooks _instance = new NativeHooksImpl();
    public  static NativeHooks instance() { return _instance; }

    private static native void jniSetPriority( int mask, int priority );
   
    @Override public void setPriority( Thread thread, int mask, int priority ) {
        if ( _linuxNative ) {
            jniSetPriority( mask, priority );
        } else {
            thread.setPriority( priority );
        }
    }
    ………...
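
A minimal sketch of the envelope idea, assuming no native library is available so only the pure Java fallback is exercised. The names Hooks, JavaFallbackHooks and PinnedWorker are made up for this example and are not SubMicroTrading classes; the point is only that the caller sees one interface and the settings are applied as the first step of run().

```java
// Hypothetical stand-in for the NativeHooks interface (only the one method shown in the post).
interface Hooks {
    void setPriority(Thread thread, int mask, int priority);
}

// Fallback used when no native library is loaded: the mask is ignored
// and only the standard Java thread priority is set.
final class JavaFallbackHooks implements Hooks {
    @Override
    public void setPriority(Thread thread, int mask, int priority) {
        thread.setPriority(priority);
    }
}

// Wraps a task so the affinity/priority call is the first thing run() does,
// since the native version acts on the calling thread.
final class PinnedWorker implements Runnable {
    private final Hooks hooks;
    private final int mask;
    private final int priority;
    private final Runnable task;

    PinnedWorker(Hooks hooks, int mask, int priority, Runnable task) {
        this.hooks = hooks;
        this.mask = mask;
        this.priority = priority;
        this.task = task;
    }

    @Override
    public void run() {
        hooks.setPriority(Thread.currentThread(), mask, priority);
        task.run();
    }
}

final class EnvelopeDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
            Thread.currentThread().getName() + " running at priority "
                + Thread.currentThread().getPriority());
        Thread t = new Thread(
            new PinnedWorker(new JavaFallbackHooks(), 0x2, Thread.MAX_PRIORITY, task), "worker");
        t.start();
        t.join();
    }
}
```

Swapping in a JNI-backed implementation of the same interface would leave PinnedWorker and its callers untouched, which is the reason for the envelope.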

To generate the header file (javah was removed in JDK 10; on newer JDKs javac -h does the same job) :-

javah -force -classpath ..\..\bin -o src\SubMicroCore_jni.h com.rr.core.os.NativeHooksImpl

Sample entry from the generated header. The C function you implement must match this declaration exactly.

/*
 * Class:     com_rr_core_os_NativeHooksImpl
 * Method:    jniSetPriority
 * Signature: (II)V
 */
JNIEXPORT void JNICALL Java_com_rr_core_os_NativeHooksImpl_jniSetPriority(JNIEnv *, jclass, jint, jint);
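
To see where the function name and the (II)V signature in the header come from, here is a small sketch that derives both on the Java side. JniNames is a hypothetical helper written for this post, and its mangle method only covers the common JNI escaping cases (dots and underscores), not non-ASCII characters or overloaded natives.

```java
import java.lang.invoke.MethodType;

final class JniNames {
    // Escapes one name for use in a native symbol:
    // '_' becomes "_1", and '.' or '/' becomes '_'.
    // Other JNI escapes (non-ASCII, ';', '[') are not handled in this sketch.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '_') {
                sb.append("_1");
            } else if (c == '.' || c == '/') {
                sb.append('_');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds the symbol the JVM looks up for a non-overloaded native method.
    static String nativeSymbol(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    // Returns the JVM method descriptor, e.g. two ints returning void.
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // prints Java_com_rr_core_os_NativeHooksImpl_jniSetPriority
        System.out.println(nativeSymbol("com.rr.core.os.NativeHooksImpl", "jniSetPriority"));
        // prints (II)V
        System.out.println(descriptor(void.class, int.class, int.class));
    }
}
```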

Implementation of the set priority method. Note that it acts on the CURRENT thread, so invoke it at the start of the thread's run() method. In the cut-down version shown here only the CPU mask is applied; the priority argument is not used. SubMicroTrading keeps all the thread-to-mask and priority mappings in a config file, which is essential: I use a different config for each PC/server.

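#include <stdio.h>
#include <stdlib.h>
#include <hwloc.h>
#include "SubMicroCore_jni.h"

/* Binds the calling thread to the CPUs selected by cpumask (bit i selects index i). */
JNIEXPORT void JNICALL Java_com_rr_core_os_NativeHooksImpl_jniSetPriority( JNIEnv *env, jclass clazz, jint cpumask, jint priority ) {

    hwloc_topology_t topology;
    hwloc_cpuset_t cpuset;
    char *str;

    /* Discover the machine topology */
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    /* Convert the Java int mask into an hwloc bitmap */
    cpuset = hwloc_bitmap_alloc();
    hwloc_bitmap_from_ulong( cpuset, (unsigned int)cpumask );

    /* Printable form of the bitmap for logging; freed below */
    hwloc_bitmap_asprintf(&str, cpuset);

    printf("cpumask [%d] => hwloc [%s]\n", cpumask, str);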

    if (hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD)) {
        printf("Couldn't bind cpuset %s\n", str);
    } else {
        printf("BOUND cpuset %s\n", str);
    }

    free(str);

    /* Free our cpuset copy */
    hwloc_bitmap_free(cpuset);

    /* Destroy topology object. */
    hwloc_topology_destroy(topology);
}
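
On the Java side, the mask is just an int in which bit i selects index i of the hwloc bitmap, and the per-thread settings come from a config file. The sketch below shows one way that lookup could work. The key format (thread.NAME.mask and thread.NAME.priority), the thread name and the classes ThreadSettings and AffinityConfig are all assumptions made for illustration; the real SubMicroTrading config format is not shown in this post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical holder for one thread's settings.
final class ThreadSettings {
    final int mask;
    final int priority;

    ThreadSettings(int mask, int priority) {
        this.mask = mask;
        this.priority = priority;
    }
}

final class AffinityConfig {
    // Builds a mask with one bit set per requested CPU index (0..31).
    static int maskOf(int... cpus) {
        int mask = 0;
        for (int cpu : cpus) {
            if (cpu < 0 || cpu > 31) {
                throw new IllegalArgumentException("cpu index out of range: " + cpu);
            }
            mask |= 1 << cpu;
        }
        return mask;
    }

    // Looks up the (assumed) keys for a named thread; returns null if either is missing.
    static ThreadSettings lookup(Properties props, String threadName) {
        String mask = props.getProperty("thread." + threadName + ".mask");
        String prio = props.getProperty("thread." + threadName + ".priority");
        if (mask == null || prio == null) {
            return null;
        }
        // Long.decode accepts decimal or 0x-prefixed hex, including a mask with bit 31 set.
        return new ThreadSettings(Long.decode(mask.trim()).intValue(),
                                  Integer.parseInt(prio.trim()));
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Inline text stands in for a per-server config file.
        props.load(new StringReader(
            "thread.MarketDataReader.mask=0x4\n" +
            "thread.MarketDataReader.priority=10\n"));

        ThreadSettings s = lookup(props, "MarketDataReader");
        System.out.println("mask=" + s.mask + " priority=" + s.priority);
        System.out.println("mask for cpus 1 and 2 = " + maskOf(1, 2));
    }
}
```

Keeping the numbers in a file rather than in code is what makes it practical to run the same binaries on machines with different core layouts.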

Here's the Linux makefile I wrote :-

# g++: 3.2.3

.SUFFIXES:        .c

TMP_PATH=./target
BIN_PATH=./bin/linux

#DEBUG=            -g
DEBUG=

DLL_NAME=        libsubmicrocore.so

CC=                      gcc
CFLAGS=          -O3 -march=nocona -m64 -I"${JAVA_HOME}/include" -I"${JAVA_HOME}/include/linux" -DLINUX -fPIC -I${HWLOC_HOME}/include -I./sun
LD=                      gcc
LDFLAGS=-L${HWLOC_HOME}/lib -L${JAVA_HOME}/jre/lib/amd64 -L${JAVA_HOME}/jre/lib/amd64/server

LIBS=-m64 -lhwloc -ljava -ljvm -lverify -lnio -lnet -lrt

all:    setup lib

setup:
	mkdir -p ${TMP_PATH}
	mkdir -p ${BIN_PATH}

lib: ${BIN_PATH}/${DLL_NAME}

${BIN_PATH}/${DLL_NAME}: ${TMP_PATH}/SubMicroCore_jni.o
	${LD} ${LDFLAGS} -LD ${LIBS} -shared -o ${TMP_PATH}/${DLL_NAME} ${TMP_PATH}/SubMicroCore_jni.o
	cp -f ${TMP_PATH}/${DLL_NAME}  ${BIN_PATH}/${DLL_NAME}

${TMP_PATH}/SubMicroCore_jni.o: src/SubMicroCore_jni.c src/SubMicroCore_jni.h
	${CC} ${CFLAGS} -o ${TMP_PATH}/SubMicroCore_jni.o -c src/SubMicroCore_jni.c

clean:
	rm -rf ${TMP_PATH}/*
	rm -rf ${BIN_PATH}/*

FYI I had written a Windows version of the library but ditched it, as for ultra low latency you really need the level of control that Linux gives you … especially as it's free !

This is all EASY due to the good work on the hwloc project :-

https://www.open-mpi.org/projects/hwloc/
I am using a pretty old version (I think 1.0.2 … can't check as my Linux servers are offline atm) so the API may have changed again, but I would expect the impact to be minimal.

I will give recommendations for how to use thread affinity and priority in a post on threading models. Please use with care: you can grind a system to a halt through poor usage.

My plan is still to open source components from SubMicroTrading, which will include the complete JNI layer, i.e. the above plus various timer and microsecond sleep functions.


2 comments:

  1. Hello,

    You should use JNA instead of calling JNI in the old-school way.
    By the way there is the Thread-Affinity library from Peter Lawrey https://github.com/OpenHFT/Java-Thread-Affinity
    It uses JNA internally.

    Regards

  2. JNA certainly looks easier, but to be honest the key here is not JNI vs JNA, it's the use of hwloc for platform independence. This requires a number of C functions to wrap the hwloc API.

    https://www.open-mpi.org/projects/hwloc/

    SubMicroTrading has a config file allowing core masks to be set for each named thread. Different config files are used for different target hardware and different application instances.
