Nitsan Wakart writes on optimizing SPSC (single producer, single consumer) lock-free queues, with lots of code samples, performance data and references. Even the comments are interesting. An excellent read for understanding the state of the art in Java.
A random blurb:
There are a few subtleties explored here:
- I'm using Unsafe for array access, this is nothing new and is a cut and paste job from the AtomicReferenceArray JDK class. This means I've opted out of the array bound checks we get when we use arrays in Java, but in this case it's fine since the ring buffer wrapping logic already assures correctness there. This is necessary in order to gain access to getVolatile/putOrdered.
- I switched Pressana's original field padding method with mine, mostly for consistency but also it's a more stable method of padding fields (read more on memory layout here).
- I doubled the padding to protect against pre-fetch caused false sharing (padding is omitted above, but have a look at the code).
- I replaced the POW final field with a constant (ELEMENT_SHIFT). This proved surprisingly significant in this case. Final fields are not optimized as aggressively as constants, partly due to the exploited backdoor in Java allowing the modification of final fields (here's Cliff Click's rant on the topic). ELEMENT_SHIFT is the shift for the sparse data and the shift for elements in the array (inferred from Unsafe.arrayIndexScale(Object[].class)) combined.
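To make the wrapping and shift arithmetic from the blurb concrete, here is a minimal sketch, not Nitsan's actual code. It swaps Unsafe for `AtomicReferenceArray`, whose `get` is a volatile read and whose `lazySet` is an ordered store, so bounds checks stay on. The class name `SpscQueueSketch`, the constants `SPARSE_SHIFT` and `REF_SHIFT`, their values, and the `byteOffset` helper are all assumptions for illustration; padding is left out.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

/** Illustrative SPSC ring buffer: one thread calls offer, one other thread calls poll. */
final class SpscQueueSketch<E> {
    // Assumed values. SPARSE_SHIFT = 2 uses every 4th slot of the backing array.
    // REF_SHIFT = 2 assumes 4-byte references; real code would derive it from
    // Unsafe.arrayIndexScale(Object[].class) instead of hard-coding it.
    static final int SPARSE_SHIFT = 2;
    static final int REF_SHIFT = 2;
    // Compile-time constant combining both shifts, as described in the blurb.
    static final int ELEMENT_SHIFT = SPARSE_SHIFT + REF_SHIFT;

    private final long mask;
    private final AtomicReferenceArray<E> buffer;
    private long tail; // touched only by the producer thread
    private long head; // touched only by the consumer thread

    SpscQueueSketch(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.mask = capacity - 1;
        this.buffer = new AtomicReferenceArray<>(capacity << SPARSE_SHIFT);
    }

    /** Wraps a running index into the array; the mask keeps it in bounds. */
    private int slot(long index) {
        return ((int) (index & mask)) << SPARSE_SHIFT;
    }

    /** The byte offset an Unsafe-based version would compute (arrayBase is hypothetical). */
    static long byteOffset(long arrayBase, long index, long mask) {
        return arrayBase + ((index & mask) << ELEMENT_SHIFT);
    }

    boolean offer(E e) {
        if (e == null) throw new NullPointerException();
        int s = slot(tail);
        if (buffer.get(s) != null) return false; // volatile read: slot occupied, queue full
        buffer.lazySet(s, e);                    // ordered store publishes the element
        tail++;
        return true;
    }

    E poll() {
        int s = slot(head);
        E e = buffer.get(s);                     // volatile read: null means empty
        if (e == null) return null;
        buffer.lazySet(s, null);                 // ordered store frees the slot
        head++;
        return e;
    }
}
```

Because the index is masked before it is shifted, it can never leave the backing array, which is the property the blurb relies on when it drops the bounds checks.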