Saturday 2 May 2015

Recommendations for Ultra Low Latency (ULL)

OS -> CentOS 5.10

Use Red Hat if you can afford it. Don’t get the Realtime variant: it is slower, though more deterministic, with finer-grained scheduling, so it is not the best fit for ULL.

Avoid O/S upgrades unless you NEED them. The devs are constantly adding new power-saving tweaks which introduce jitter. Going from Red Hat 5 to Red Hat 6 was horrendous: I spent a week trying to undo all the new switches they had turned on, and failed. Given I had no need for V6, I went back to V5.10.

Don’t just turn on "Huge Pages" or other O/S or language settings. Test them in a test bed first; you may well be surprised. The key is the effect of the change on the system as a whole. Every tweak has pluses and minuses, and they differ per system, so test, measure, rinse, repeat.
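As a rough illustration of the "test, measure, rinse, repeat" loop, here is a minimal Java sketch that times a task many times after a warm-up and reports percentiles rather than an average, since jitter shows up in the tail. The class and method names (LatencyProbe, measure, percentile) are made up for this example, and the timed task is a placeholder; swap in whatever code path the tweak is supposed to affect and run it before and after the change.

```java
import java.util.Arrays;

public class LatencyProbe {

    /** Nearest-rank percentile (0 < pct <= 100) computed on a sorted copy of the samples. */
    static long percentile(long[] samples, double pct) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    /** Runs the task `warmup` times untimed (to let the JIT settle), then records nanos per run. */
    static long[] measure(Runnable task, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] nanos = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] sink = new long[1]; // keeps the placeholder work from being optimised away
        // Placeholder task (assumption): replace with the code path under test.
        long[] samples = measure(() -> sink[0] += System.nanoTime() & 1, 100_000, 1_000_000);
        System.out.printf("p50=%dns p99=%dns p99.9=%dns max=%dns%n",
                percentile(samples, 50), percentile(samples, 99),
                percentile(samples, 99.9), percentile(samples, 100));
    }
}
```

Run the same probe on the same box with the setting off and then on, and compare the tail numbers as well as the median.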

Hardware -> Fastest Intel CPU with lowest latency RAM

Running overclocked CPUs with non-ECC memory 24x7 brings a risk of crashing. If, however, the extra 10% to 30% performance is the difference between making money and not making money, then a certain level of risk will be acceptable. I have run overclocked X5680s with ECC RAM, and i7s at 5GHz with overclocked memory, for weeks under load without crashing, so stable overclocking is achievable.

Solarflare NIC with OpenOnload

My first NICs, in 2010, were top-end Mellanox cards. Installation was awful and performance was terrible at high throughput. Support was not great, and the one-sided TCP acceleration was useless for colocation trading. I signed an NDA so won’t say more, but after two months of pain I switched to Chelsio. Chelsio was just as bad to install and performance was even worse for a top 10GbE NIC; very disappointing.

I got my first Solarflare card in 2010. Installation was a breeze and the cards outperformed Chelsio and Mellanox with no tuning. With OpenOnload and the simplest tuning parameter ever (--profile=latency) they blew Mellanox completely away. My advice is to ignore all the perf stats the NIC vendors quote and test for yourself in a controlled environment: two servers with dual NICs connected directly together (no switch or anything else in the way).
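For a do-it-yourself NIC test, a simple ping-pong is usually enough to start with. The sketch below is not the harness I used; it is a minimal, assumed example of the idea: one side echoes fixed-size TCP messages, the other times each round trip. The class name PingPong, the 64-byte message size and the server/client arguments are all invented for illustration. With no arguments it runs both ends over loopback, which only checks that the code works and says nothing about the NIC.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;

public class PingPong {
    static final int MSG_SIZE = 64; // assumed message size; use your real order/tick size

    /** Echo side: accepts one connection and reflects every message until the peer closes. */
    static void serve(ServerSocket server) throws IOException {
        try (Socket s = server.accept()) {
            s.setTcpNoDelay(true); // disable Nagle so small messages go out immediately
            DataInputStream in = new DataInputStream(s.getInputStream());
            OutputStream out = s.getOutputStream();
            byte[] buf = new byte[MSG_SIZE];
            while (true) {
                try {
                    in.readFully(buf);
                } catch (EOFException e) {
                    return; // client finished
                }
                out.write(buf);
            }
        }
    }

    /** Timing side: sends `count` messages and returns the round-trip times in nanos, sorted. */
    static long[] ping(String host, int port, int count) throws IOException {
        try (Socket s = new Socket(host, port)) {
            s.setTcpNoDelay(true);
            DataInputStream in = new DataInputStream(s.getInputStream());
            OutputStream out = s.getOutputStream();
            byte[] buf = new byte[MSG_SIZE];
            long[] rtt = new long[count];
            for (int i = 0; i < count; i++) {
                long start = System.nanoTime();
                out.write(buf);
                in.readFully(buf);
                rtt[i] = System.nanoTime() - start;
            }
            Arrays.sort(rtt);
            return rtt;
        }
    }

    /** Runs both ends in one JVM over loopback; a smoke test only, not a NIC measurement. */
    static long[] loopback(int count) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread t = new Thread(() -> {
                try {
                    serve(server);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            t.setDaemon(true);
            t.start();
            long[] rtt = ping("127.0.0.1", server.getLocalPort(), count);
            t.join();
            return rtt;
        }
    }

    public static void main(String[] args) throws Exception {
        long[] rtt;
        if (args.length == 2 && args[0].equals("server")) {
            try (ServerSocket server = new ServerSocket(Integer.parseInt(args[1]))) {
                serve(server);
            }
            return;
        } else if (args.length == 3 && args[0].equals("client")) {
            rtt = ping(args[1], Integer.parseInt(args[2]), 100_000);
        } else {
            rtt = loopback(100_000);
        }
        System.out.printf("min=%dns median=%dns p99=%dns max=%dns%n",
                rtt[0], rtt[rtt.length / 2], rtt[(int) (rtt.length * 0.99)], rtt[rtt.length - 1]);
    }
}
```

Start the server side on one box and the client on the other, addressing the directly connected interface. The early samples include JIT warm-up, so look at the median and tail rather than the max. To compare the kernel stack against OpenOnload, run the same test plain and then with the JVM launched under the onload wrapper using --profile=latency.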

Language -> Java 1.8

I use Sun … er sorry, Oracle standard Java 1.8 (don’t use Realtime Java). No real perf difference between 1.6 and 1.8 for SMT.

There will be future blogs on JVM args, on application threading models, and on API design and its latency impact.

Tools / Third Party Libs

In the world of ULL I avoid third-party libs due to lack of control over GC, threading model and JIT jitter. That said, "hwloc" has been invaluable for its abstraction layer for core binding.

A future blog will show how to do thread affinity in Java.
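In the meantime, here is a rough sketch of the coarser, process-level alternative: launching a child JVM under the hwloc-bind command-line tool so that the whole process is restricted to a set of cores. This is not per-thread affinity (that needs native code, since standard Java has no affinity API), and the class name BoundLauncher, the core range and the child command are placeholders invented for the example. It assumes hwloc-bind is installed and on the PATH.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BoundLauncher {

    /** Builds the command line: hwloc-bind <location> -- <command...> */
    static List<String> bind(String location, String... command) {
        List<String> cmd = new ArrayList<>();
        cmd.add("hwloc-bind");
        cmd.add(location);   // an hwloc location such as "core:2" or "core:2-5"
        cmd.add("--");       // separates hwloc-bind arguments from the child command
        cmd.addAll(Arrays.asList(command));
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder child command (assumption): replace with your real JVM launch line.
        List<String> cmd = bind("core:2-5", "java", "-version");
        System.out.println("Launching: " + String.join(" ", cmd));
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        System.exit(p.waitFor());
    }
}
```

Binding the whole JVM to a single core would put GC and JIT threads on the same core as the hot thread, so a range of cores is used here; pinning individual threads within that range is the part that needs native help.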


hwloc, hwinfo, i7z