Wednesday, May 22, 2013

Netflix delivers Java tools

Netflix continues as a Java technology powerhouse, delivering one open source tool or framework after another. The latest posting from their excellent blog is Brian Moore on Garbage Collection Visualization, a tool for turning gc.log into usable graphs.

The JVM heap teaser shot:

Sunday, May 19, 2013

Java for Fun - Shazam with Roy van Rijn

Roy van Rijn writes Creating Shazam in Java. This is way more fun than most Java posts:

If I record a song and look at it visually it looks like this:


(all the red dots are ‘important points’)

(I would hang that on my wall, wouldn't you?)

He has more fun code on GitHub.

Thursday, May 16, 2013

Escaping TFS

Back in 2011, Derek Hammer wrote TFS is destroying your development capacity, with advice on escaping TFS. How little has changed.

UPDATED: A delightful take from Prasanna Pendse, Top 10 Version Control Features of TFS.

Wednesday, May 15, 2013

Code imitates math

In 2010 Jeff Preshing posted High-Resolution Mandelbrot in Obfuscated Python, Python code to draw the Mandelbrot set (h/t Marcus Holtermann). Can you spot the relationship?

_                                      =   (
                                        255,
                                      lambda
                               V       ,B,c
                             :c   and Y(V*V+B,B,  c
                               -1)if(abs(V)<6)else
               (              2+c-4*abs(V)**-0.4)/i
                 )  ;v,      x=1500,1000;C=range(v*x
                  );import  struct;P=struct.pack;M,\
            j  ='<QIIHHHH',open('M.bmp','wb').write
for X in j('BM'+P(M,v*x*3+26,26,12,v,x,1,24))or C:
            i  ,Y=_;j(P('BBB',*(lambda T:(T*80+T**9
                  *i-950*T  **99,T*70-880*T**18+701*
                 T  **9     ,T*i**(1-T**45*2)))(sum(
               [              Y(0,(A%3/3.+X%v+(X/v+
                               A/3/3.-x/2)/1j)*2.5
                             /x   -2.7,i)**2 for  \
                               A       in C
                                      [:9]])
                                        /9)
                                       )   )
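For comparison, here is a minimal, unobfuscated sketch of the iteration buried in that code. The recurrence is the same one visible in `Y(V*V+B,B,c-1)` above: start at z = 0 and repeat z -> z*z + c, counting how long z stays small. Everything else here is my own choice for illustration and not taken from Preshing's post: the function names, the escape radius of 2 (his code tests against 6), the limit of 50 iterations, the grid size, and ASCII output in place of a BMP file.

```python
def escape_count(c, max_iter=50):
    """Count how many steps of z -> z*z + c (from z = 0) keep |z| <= 2.

    Reaching max_iter means c is treated as inside the Mandelbrot set.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render(width=60, height=24, max_iter=50):
    """Build an ASCII picture of the set over [-2.2, 0.8] x [-1.2, 1.2]."""
    rows = []
    for row in range(height):
        imag = 1.2 - 2.4 * row / (height - 1)
        line = ""
        for col in range(width):
            real = -2.2 + 3.0 * col / (width - 1)
            inside = escape_count(complex(real, imag), max_iter) == max_iter
            line += "#" if inside else " "
        rows.append(line)
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

Run as a script, it prints the familiar cardioid-and-bulb outline in `#` characters. Raising `max_iter` sharpens the boundary at the cost of more work per point.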

Tuesday, May 14, 2013

Praising the opponents with PostgreSQL

I just read an unusual article, PostgreSQL at a glance, a lengthy, favorable review of a competing product by Kim Sung Kyu, Senior Software Engineer at CUBRID. The opening:

PostgreSQL shows excellent functionalities and performance. Considering its high quality, it may seem strange that PostgreSQL is not more popular. However, PostgreSQL continues to make progress. This article will discuss this database.

This is a good review of PostgreSQL, on CUBRID's blog no less. To openly write about alternative products speaks well of your own.

Linux C10M

Great list; the key is separation of concerns: The Secret To 10 Million Concurrent Connections - The Kernel Is The Problem, Not The Solution.

Wednesday, May 08, 2013

More Zing for Java

From LMAX Exchange Getting Up To 50% Improvement in Latency From Azul's Zing JVM:

The results LMAX Exchange are seeing are remarkable: a 10-20% improvement in the mean latency, increasing to around a 50% improvement at the 99th percentile. Moreover:

At the max/99.99th percentile with HotSpot the number would jump all over the place so it is hard to produce a relative comparison, except to say that the Zing values are much more stable. A bad run with HotSpot could easily be an order of magnitude worse.

In terms of throughput:

Zing gives us the ability to lift what we call our "red-line" - the throughput value at which the latency starts to drop off a cliff. This effect often manifests as a second order effect of GC pauses. If we get a stall that is sufficiently long, we will start to drop packets. The process of packet redelivery can create back-pressure throughout the rest of the system sending client latencies through the roof. Having a more efficient collector with very short predictable pauses should allow us to increase our "red-line".

Most eye popping for me:

Whilst these figures are impressive, the variations, caused primarily by stop-the-world pauses in the CMS collector that is part of HotSpot, are becoming a significant problem. LMAX Exchange tried upgrading to the CMS version in JDK 7, but encountered around a 20% increase in the length of GC pauses for the same work load. The reasons for this weren't entirely clear, but Barker suggested it was probably down to a need to re-tune the collector. That Zing's collector (C4) typically requires little or no tuning was a major selling point for LMAX Exchange.

I think that we really needed to do retuning of our GC setting and investigating whether JDK 7 specific options like -XX:+UseCondCardMark and -XX:+UseNUMA should be applied. One of the other big reasons to go with Azul is the reduced need to tune the collector. The general recommendation is that you should re-tune from scratch on each new version of the JDK, which sounds fine in theory, but can be impractical. Collector tuning in Oracle JDK is essentially walking through a large search space for a result that meets your needs. Experience, knowledge and guess-work can crop significant chunks off that search space, but even then an extensive tuning exercise can take weeks. For example, our full end-to-end performance test takes 1 hour (10 minutes build & deploy, 10 minutes warm-up, 40 minutes testing), so I could reasonably run 8 runs a day. If you consider the number of different collectors (CMS, Parallel, Serial,...) and all of their associated options (new and old sizes, survivor spaces, survivor ratios,...) how many runs do I need to do to get effective coverage of that search space: 20, 30, more? With Zing the defaults work significantly better than a finely tuned Oracle JDK. We still have some investigation over whether we can get a bit more out of the Zing VM through tuning (e.g. fewer collector threads as our allocation rate is relatively low). However, tuning Zing is just that, i.e. looking to eke out the very best from the system; compared to the Oracle JDK where tuning from the defaults can be the difference between usable and unusable. The effort involved in tuning does come with an opportunity cost. I would much rather have the developers that would typically be involved with GC tuning (they are probably the ones that have the best working knowledge of software performance) be focusing on improving the performance of other areas of the system.

Concluding:

Part of the reason Zing is so attractive to these companies is that it remains the only collector that eliminates stop-the-world pauses from the young generation as well as the old generation. Whilst young generation pauses are shorter, where an application is particularly performance sensitive they still matter. As a result, Tene told us, "All we have to do is point to newgen pauses in other JVMs and say: 'those too will be gone'."

...Furthermore, the fact that it can handle multi-GB-per-sec allocations without worsening latencies or pauses, makes it very appealing for developers who have been trying hard not to allocate things because "it hurts". With Zing, you can use plenty of memory to go fast, instead of trying to avoid using it so that things won't hurt and jitter.

For production use Zing is priced on an annual subscription/server. Unsurprisingly the vendor is reluctant to reveal pricing information, though it is in line with a supported Oracle or IBM JVM.

The Future of Java

Kindly transcribed in Cliff Click, Charlie Hunt and Doug Lea on 'The Future of the JVM'. Juicy bit:

My View: worst feature was a synchronized StringBuffer. I have blogged about this a couple of times. I would add mutable Date objects.

Preaching to the choir, brother.

Thursday, May 02, 2013

Agile is Money

Fascinating read by Daniel Greening: Agile Capitalization, found thanks to Jeff Sutherland. As Jeff describes:

In many companies, agile software development is misunderstood and misreported, causing taxation increases, higher volatility in Profit and Loss (P&L) statements and manual tracking of programmer hours. One large company’s confused finance department expenses all agile software development and capitalizes waterfall development; projects in this company that go agile see their headcounts cut by 50%. This discourages projects from going agile.

Scrum’s production experiment framework can align well with the principles of financial reporting. In this article, the author explains the basics of capitalization and expensing, and offers a financial framework for capitalizing agile projects that can be understood by both accountants and agile teams.

I'll take double headcount, please. Daniel's introduction:

In many companies, agile software development is misunderstood and misreported, causing taxation increases, higher volatility in Profit and Loss (P&L) statements and manual tracking of programmer hours. I claim Scrum teams create production cost data that are more verifiable, better documented, and more closely aligned with known customer value than most waterfall implementations. Better reporting can mean significant tax savings and greater investor interest. Agile companies should change their financial reporting practices to exploit Scrum’s advantages. It might not be easy, but I did it.

Scrum’s production experiment framework aligns perfectly with the principles of financial reporting.

When I restructured software capitalization according to the principles here during an all-company Scrum transition at a 900-person software company, we delighted auditors, gave more insight to upper management and raised more internal money to hire developers. We gained millions in tax savings, by using Scrum Product Backlog Items to more accurately document and capitalize our software work.

I hope to arm you with perspectives and resources to make the accounting argument for agile capitalization, potentially reducing your company’s tax burden, increasing available funds for engineers, and making your auditors happy.

Agile is real money.

Monday, April 29, 2013

Moving a git repository to the subdirectory of another repository

How do I move a git repository to be the subdirectory of another repository? is a question answered umpteen times, and all answers vary somewhat. For my needs, this suffices:

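#!/bin/bash
# Graft the full history of old_repo into new_repo under the subdirectory new_name.

case $# in
  3 ) old_repo=$1 new_repo=$2 new_name=$3 ;;
  * ) echo "Usage: $0 old_repo new_repo new_name" >&2 ; exit 2 ;;
esac

set -e

# Work in throwaway clones so neither original working tree is touched
trap 'rm -rf "$old_work" "$new_work"' EXIT
old_work=$(mktemp -d)
new_work=$(mktemp -d)
git clone "$old_repo" "$old_work"
git clone "$new_repo" "$new_work"

cd "$old_work"
# Rewrite every commit so that all paths live under new_name/
# If new_name contains commas, edit the sed command accordingly
git filter-branch --index-filter 'git ls-files -s | sed "s,\t\"*,&'"$new_name"'/," | GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info && mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD

cd "$new_work"
# Pull the rewritten history into the target repository, then publish it
# (Git 2.9 and later need --allow-unrelated-histories on this pull)
git remote add "$new_name" "$old_work"
git pull "$new_name" master
git push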

-XshowSettings

A java flag I hadn't noticed before but wish I had: -XshowSettings. It dumps out the system properties and a guess at which VM to use (server v. client), among other things. Look here for more magic -X flags.

Monday, April 15, 2013

More good Java 8, living with null

Edwin Dalorzo posts wonderfully on Java 8 Optional Objects: what the problem is, how other languages handle it, and what Java 8 does about it. It all boils down to the "billion dollar mistake": null.