Thursday, May 30, 2013

Bloaty XMLBeans

Nikita Salnikov-Tarnovski writes Reducing memory consumption by 20x. One change, dropping XMLBeans, reduced the footprint from 1.5G to 214M. It's not quite that simple, and more tuning lies behind the headline 20x improvement, but that is the gist of it. Amazing.
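The post's fix is its own story, but the general move, trading a full in-memory object binding for a streaming parse that keeps only what you need, looks roughly like this hedged StAX sketch. The file name and the "price" element are my placeholders, not anything from the post:

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.FileInputStream;

public class LeanParse {
    public static void main(String[] args) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Stream the document instead of binding it whole; only the
        // elements we care about ever become Java objects.
        XMLStreamReader reader =
            factory.createXMLStreamReader(new FileInputStream("data.xml"));
        double total = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "price".equals(reader.getLocalName())) {
                total += Double.parseDouble(reader.getElementText());
            }
        }
        reader.close();
        System.out.println("total = " + total);
    }
}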

Tuesday, May 28, 2013

For old time's sake, Uncle Bob

I had forgotten how good Uncle Bob can be. Thank you, The Old Reader: Transformation Priority and Sorting. Quoting:

In this post we explore the Transformation Priority Premise in the context of building a sort algorithm.

We also explore comic books as a pedagogical tool.

Want to read the rest? (In the interest of fair use you should follow the link.)
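If you want a taste before clicking, here is my own hedged sketch of an early rung on the ladder, not Uncle Bob's code: under the premise, each failing test is passed by the simplest transformation available, so after tests for the empty list, the singleton, and one out-of-order pair, you have barely more than a swap. The premise is about which transformation you reach for when the next test breaks this:

import java.util.ArrayList;
import java.util.List;

public class Sorter {
    // Enough to pass tests for [], [x], and [b, a]; the interesting
    // question the premise answers is how to generalize from here.
    public static List<Integer> sort(List<Integer> in) {
        List<Integer> out = new ArrayList<>(in);
        if (out.size() == 2 && out.get(0) > out.get(1)) {
            Integer tmp = out.get(0);
            out.set(0, out.get(1));
            out.set(1, tmp);
        }
        return out;
    }
}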

Sunday, May 26, 2013

Is it live, or is it structured deferral?

A delightful ACM paper by Paul E. McKenney, Structured Deferral: Synchronization via Procrastination. The "motivating example":

In this example, Schrödinger would like to construct an in-memory database to keep track of the animals in his zoo. Births would of course result in insertions into this database, while deaths would result in deletions. The database is also queried by those interested in the health and welfare of Schrödinger's animals.

Schrödinger has numerous short-lived animals such as mice, resulting in high update rates. In addition, there is a surprising level of interest in the health of Schrödinger's cat [19], so much so that Schrödinger sometimes wonders whether his mice are responsible for most of these queries. Regardless of their source, the database must handle the large volume of cat-related queries without suffering from excessive levels of contention. Both accesses and updates are typically quite short, involving accessing or mutating an in-memory data structure, and therefore synchronization overhead cannot be ignored.

Schrödinger also understands, however, that it is impossible to determine exactly when a given animal is born or dies. For example, suppose that his cat's passing is to be detected by heartbeat. Seconds or even minutes will be required to determine that the poor cat's heart has in fact stopped. The shorter the measurement interval, the less certain the measurement, so that a pair of veterinarians examining the cat might disagree on exactly when death occurred. For example, one might declare death 30 seconds after the last heartbeat, while another might insist on waiting a full minute, in which case the veterinarians disagree on the state of the cat during the second half of the minute after the last heartbeat.

Fortunately, Heisenberg [8] has taught Schrödinger how to cope with such uncertainty. Furthermore, the delay in detecting the cat's passing permits use of synchronization via procrastination. After all, given that the two veterinarians' pronouncements of death were separated by a full 30 seconds, a few additional milliseconds of software procrastination is perfectly acceptable.

Among the pleasures in this paper: Linux's RCU, hazard pointers, and "read-only cat-heavy performance of Schrödinger's hash table".
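The paper lives in C and kernel land, but the read-side idea ports. Here is a hedged Java analog of the pattern, where the garbage collector plays the role of deferred reclamation: readers chase a volatile reference to a snapshot and never block, while writers publish a fresh copy and procrastinate, leaving old snapshots to linger until the last reader lets go. The names are mine, not the paper's:

import java.util.HashMap;
import java.util.Map;

public class Zoo {
    // Readers see some recent snapshot; they never lock or retry.
    private volatile Map<String, String> animals = new HashMap<>();

    public String query(String name) {           // hot path: cat queries
        return animals.get(name);                 // plain read of a snapshot
    }

    public synchronized void birth(String name, String status) {
        Map<String, String> next = new HashMap<>(animals);  // copy
        next.put(name, status);                             // update
        animals = next;   // publish; the old snapshot lingers until its
    }                     // readers finish and the GC reclaims it

    public synchronized void death(String name) {
        Map<String, String> next = new HashMap<>(animals);
        next.remove(name);
        animals = next;
    }
}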

Enjoy!

Friday, May 24, 2013

If it's broke, don't fix it

Saul Caganoff writes Netflix: Dystopia as a Service on Adrian Cockcroft's keynote describing Netflix's seemingly impossible infrastructure. The opening:

"As engineers we strive for perfection; to obtain perfect code running on perfect hardware, all perfectly operated." This is the opening from Adrian Cockroft, Cloud Architect at Netflix in his recent Keynote Address at Collaboration Summit 2013. Cockroft continues that the Utopia of perfection takes too long, that time to market trumps quality and so we compromise. But rather than lamenting the loss of "static, better, cheaper" perfection, Cockroft describes how the Netflix "Cloud Native" architecture embraces the Dystopia of "broken and inefficient" to deliver "sooner" and "dynamic". Cockroft says that "the new engineering challenge is not to construct perfection but to construct highly agile and highly available services from ephemeral and often broken components."

Were I looking for a new challenge, as interesting as Google is, I'd go to Netflix—unless I got to work on asteroid mining!


Do you read the Netflix Tech Blog? If not, why not?

Thursday, May 23, 2013

Census data with Bash

For fun I played with the 2012 US census data on metropolitan areas. I wanted to answer a simple question: which "big" cities are growing the fastest?

First, a note on the file format. If you are splitting this line on commas, you want 3 fields:

apple,"banana,split",coconut

But it is easy to get 4 fields instead, since the quoted field contains a comma. My quick fix was to change the field separator from comma to colon with this bit of Python in a file named "csv-comma-to-colon.py":

import csv
import fileinput

for row in csv.reader(fileinput.input()):
    print(':'.join(row))

Simple, effective. An alternative I did not explore enough was csvprintf.

Once I have colon field separators, the rest writes itself. Since the CSV file from the census has non-data header and footer lines, I filter them out before sorting:

sed '/^[^1-9]/d' CBSA-EST2012-05.csv \
    | python ./csv-comma-to-colon.py \
    | sort -t: -k4 -nr \
    | head -50 \
    | sort -t: -k6 -nr \
    | head -10 \
    | cut -d: -f2,4,6

Breaking it down:

  1. Skip non-data lines from the CSV file.
  2. Convert comma field separators to colon.
  3. Reorder the data by city size, descending.
  4. Keep only the top 50 biggest cities.
  5. Reorder the remaining data by growth rate.
  6. Keep only the 10 fastest growing cities.
  7. Display only these columns: city name, current population, growth rate.

There may be better ways, but this was enough for some quick fun. I would be remiss not to show the results, nicely formatted for the web:

Metropolitan Area  Population  Growth (%)
Austin-Round Rock, TX 1,834,303 3.0
Raleigh, NC 1,188,564 2.2
Orlando-Kissimmee-Sanford, FL 2,223,674 2.2
Houston-The Woodlands-Sugar Land, TX 6,177,035 2.1
Dallas-Fort Worth-Arlington, TX 6,700,991 2.0
San Antonio-New Braunfels, TX 2,234,003 1.9
Phoenix-Mesa-Scottsdale, AZ 4,329,534 1.8
Denver-Aurora-Lakewood, CO 2,645,209 1.8
Nashville-Davidson--Murfreesboro--Franklin, TN 1,726,693 1.7
Las Vegas-Henderson-Paradise, NV 2,000,759 1.7

Wednesday, May 22, 2013

Netflix delivers Java tools

Netflix continues as a Java technology powerhouse, delivering one open source tool or framework after another. The latest posting from their excellent blog is Brian Moore on Garbage Collection Visualization, a tool for turning gc.log into usable graphs.

The JVM heap teaser shot: (graph in the original post)
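To have anything to graph, the JVM must first be told to write a gc.log. A hedged sketch of era-typical HotSpot flags, with MyApp as my stand-in; check the gcviz post for exactly which log formats it expects:

java -Xloggc:gc.log \
    -XX:+PrintGCDetails \
    -XX:+PrintGCDateStamps \
    -XX:+PrintTenuringDistribution \
    MyApp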

Sunday, May 19, 2013

Java for Fun - Shazam with Roy van Rijn

Roy van Rijn writes Creating Shazam in Java. This is way more fun than most Java posts:

If I record a song and look at it visually it looks like this:

(spectrogram image in the original post; all the red dots are ‘important points’)

(I would hang that on my wall, wouldn't you?)
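The heart of finding those points, in my own hedged sketch rather than Roy's code: slice the recording into chunks, move each chunk into the frequency domain, and keep the loudest frequency in each of a few bands. The band edges below are assumptions loosely inspired by the article, and a real implementation would use an FFT; the naive DFT keeps the sketch short:

import java.util.Arrays;

public class KeyPoints {
    // Naive DFT magnitude for bin k of a chunk.
    static double magnitude(double[] chunk, int k) {
        double re = 0, im = 0;
        for (int n = 0; n < chunk.length; n++) {
            double angle = 2 * Math.PI * k * n / chunk.length;
            re += chunk[n] * Math.cos(angle);
            im -= chunk[n] * Math.sin(angle);
        }
        return Math.hypot(re, im);
    }

    // Assumed band edges (frequency bins), not Roy's exact values.
    static final int[] BANDS = {40, 80, 120, 180, 300};

    // For each band, return the loudest frequency bin in this chunk:
    // these are the "important points" (the red dots).
    static int[] keyPointsForChunk(double[] chunk) {
        int[] points = new int[BANDS.length - 1];
        for (int b = 0; b < BANDS.length - 1; b++) {
            double best = -1;
            for (int k = BANDS[b]; k < BANDS[b + 1]; k++) {
                double m = magnitude(chunk, k);
                if (m > best) { best = m; points[b] = k; }
            }
        }
        return points;
    }

    public static void main(String[] args) {
        double[] chunk = new double[4096];
        for (int n = 0; n < chunk.length; n++)   // fake signal: one pure tone
            chunk[n] = Math.sin(2 * Math.PI * 100 * n / chunk.length);
        System.out.println(Arrays.toString(keyPointsForChunk(chunk)));
    }
}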

He has more fun code on GitHub.

Thursday, May 16, 2013

Escaping TFS

Back in 2011, Derek Hammer wrote TFS is destroying your development capacity, with advice on escaping TFS. How little has changed.

UPDATED: A delightful take from Prasanna Pendse, Top 10 Version Control Features of TFS.

Wednesday, May 15, 2013

Code imitates math

In 2010 Jeff Preshing posted High-Resolution Mandelbrot in Obfuscated Python, Python code to draw the Mandelbrot set (h/t Marcus Holtermann). Can you spot the relationship between the code's layout and its output?

_                                      =   (
                                        255,
                                      lambda
                               V       ,B,c
                             :c   and Y(V*V+B,B,  c
                               -1)if(abs(V)<6)else
               (              2+c-4*abs(V)**-0.4)/i
                 )  ;v,      x=1500,1000;C=range(v*x
                  );import  struct;P=struct.pack;M,\
            j  ='<QIIHHHH',open('M.bmp','wb').write
for X in j('BM'+P(M,v*x*3+26,26,12,v,x,1,24))or C:
            i  ,Y=_;j(P('BBB',*(lambda T:(T*80+T**9
                  *i-950*T  **99,T*70-880*T**18+701*
                 T  **9     ,T*i**(1-T**45*2)))(sum(
               [              Y(0,(A%3/3.+X%v+(X/v+
                               A/3/3.-x/2)/1j)*2.5
                             /x   -2.7,i)**2 for  \
                               A       in C
                                      [:9]])
                                        /9)
                                       )   )

Tuesday, May 14, 2013

Praising the opponents with PostgreSQL

I just read an unusual article, PostgreSQL at a glance, a lengthy, favorable review of a competing product by Kim Sung Kyu, Senior Software Engineer at CUBRID. The opening:

PostgreSQL shows excellent functionalities and performance. Considering its high quality, it may seem strange that PostgreSQL is not more popular. However, PostgreSQL continues to make progress. This article will discuss this database.

This is a good review of PostgreSQL, on CUBRID's blog no less. Writing openly about an alternative product speaks well of your own.

Linux C10M

Great list; the key is separation of concerns: The Secret To 10 Million Concurrent Connections - The Kernel Is The Problem, Not The Solution.

Wednesday, May 08, 2013

More Zing for Java

From the article LMAX Exchange Getting Up To 50% Improvement in Latency From Azul's Zing JVM:

The results LMAX Exchange are seeing are remarkable: a 10-20% improvement in mean latency, increasing to around a 50% improvement at the 99th percentile. Moreover:

At the max/99.99th percentile with HotSpot the number would jump all over the place so it is hard to produce a relative comparison, except to say that the Zing values are much more stable. A bad run with HotSpot could easily be an order of magnitude worse.

In terms of throughput:

Zing gives us the ability to lift what we call our "red-line" - the throughput value at which the latency starts to drop off a cliff. This effect often manifests as a second order effect of GC pauses. If we get a stall that is sufficiently long, we will start to drop packets. The process of packet redelivery can create back-pressure throughout the rest of the system sending client latencies through the roof. Having a more efficient collector with very short predictable pauses should allow us to increase our "red-line".

Most eye-popping for me:

Whilst these figures are impressive, the variations, caused primarily by stop-the-world pauses in the CMS collector that is part of HotSpot, are becoming a significant problem. LMAX Exchange tried upgrading to the CMS version in JDK 7, but encountered around a 20% increase in the length of GC pauses for the same work load. The reasons for this weren't entirely clear, but Barker suggested it was probably down to a need to re-tune the collector. That Zing's collector (C4) typically requires little or no tuning was a major selling point for LMAX Exchange.

I think that we really needed to do retuning of our GC setting and investigating whether JDK 7 specific options like -XX:+UseCondCardMark and -XX:+UseNUMA should be applied. One of the other big reasons to go with Azul is the reduced need to tune the collector. The general recommendation is that you should re-tune from scratch on each new version of the JDK, which sounds fine in theory, but can be impractical. Collector tuning in Oracle JDK is essentially walking through a large search space for a result that meets your needs. Experience, knowledge and guess-work can crop significant chunks off that search space, but even then an extensive tuning exercise can take weeks. For example, our full end-to-end performance test takes 1 hour (10 minutes build & deploy, 10 minutes warm-up, 40 minutes testing), so I could reasonably run 8 runs a day. If you consider the number of different collectors (CMS, Parallel, Serial,...) and all of their associated options (new and old sizes, survivor spaces, survivor ratios,...) how many runs do I need to do to get effective coverage of that search space: 20, 30, more? With Zing the defaults work significantly better than a finely tuned Oracle JDK. We still have some investigation over whether we can get a bit more out of the Zing VM through tuning (e.g. fewer collector threads as our allocation rate is relatively low). However, tuning Zing is just that, i.e. looking to eke out the very best from the system; compared to the Oracle JDK where tuning from the defaults can be the difference between usable and unusable. The effort involving in tuning does come with an opportunity cost. I would much rather have the developers that would typically be involved with GC tuning (they are probably the ones that have the best working knowledge of software performance) be focusing on improving the performance of other areas of the system.
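To make that search space concrete, here is a hedged illustration of the kind of HotSpot command line being tuned. The values and the application name are invented, not LMAX Exchange's settings:

java -XX:+UseConcMarkSweepGC \
    -Xms8g -Xmx8g -Xmn2g \
    -XX:SurvivorRatio=8 \
    -XX:MaxTenuringThreshold=4 \
    -XX:+UseCondCardMark \
    -XX:+UseNUMA \
    MyExchangeApp

Every one of those numbers is a dimension in the search space Barker describes, and each full evaluation costs an hour of end-to-end testing.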

Concluding:

Part of the reason Zing is so attractive to these companies is that it remains the only collector that eliminates stop-the-world pauses from the young generation as well as the old generation. Whilst young generation pauses are shorter, where an application is particularly performance sensitive they still matter. As a result, Tene told us, "All we have to do is point to newgen pauses in other JVMs and say: 'those too will be gone'."

...Furthermore, the fact that it can handle multi-GB-per-sec allocations without worsening latencies or pauses, makes it very appealing for developers who have been trying hard not to allocate things because "it hurts". With Zing, you can use plenty of memory to go fast, instead of trying to avoid using it so that things won't hurt and jitter.

For production use, Zing is priced as an annual subscription per server. Unsurprisingly the vendor is reluctant to reveal pricing information, though it is in line with a supported Oracle or IBM JVM.

The Future of Java

Kindly transcribed in Cliff Click, Charlie Hunt and Doug Lea on 'The Future of the JVM'. Juicy bit:

My View: worst feature was a synchronized StringBuffer. I have blogged about this a couple of times. I would add mutable Date objects.

Preaching to the choir, brother.
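For anyone who has not been bitten, a quick sketch of both complaints, mine rather than anything from the transcript:

import java.util.Date;

public class OldWarts {
    public static void main(String[] args) {
        // StringBuffer takes a lock on every append, even here where no
        // other thread can possibly see it; StringBuilder exists because
        // that cost was paid by everyone, everywhere.
        StringBuffer sb = new StringBuffer();
        sb.append("hello ").append("world");

        // Date is mutable, so handing one out hands out write access.
        Date deadline = new Date();
        Date leaked = deadline;     // caller keeps a reference...
        leaked.setTime(0);          // ...and silently rewrites history.
        System.out.println(sb + " / deadline now: " + deadline);
    }
}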

Thursday, May 02, 2013

Agile is Money

Fascinating read by Daniel Greening: Agile Capitalization, found thanks to Jeff Sutherland. As Jeff describes:

In many companies, agile software development is misunderstood and misreported, causing taxation increases, higher volatility in Profit and Loss (P&L) statements and manual tracking of programmer hours. One large company’s confused finance department expenses all agile software development and capitalizes waterfall development; projects in this company that go agile see their headcounts cut by 50%. This discourages projects from going agile.

Scrum’s production experiment framework can align well with the principles of financial reporting. In this article, the author explains the basics of capitalization and expensing, and offers a financial framework for capitalizing agile projects that can be understood by both accountants and agile teams.

I'll take double headcount, please. Daniel's introduction:

In many companies, agile software development is misunderstood and misreported, causing taxation increases, higher volatility in Profit and Loss (P&L) statements and manual tracking of programmer hours. I claim Scrum teams create production cost data that are more verifiable, better documented, and more closely aligned with known customer value than most waterfall implementations. Better reporting can mean significant tax savings and greater investor interest. Agile companies should change their financial reporting practices to exploit Scrum’s advantages. It might not be easy, but I did it.

Scrum’s production experiment framework aligns perfectly with the principles of financial reporting.

When I restructured software capitalization according to the principles here during an all-company Scrum transition at a 900-person software company, we delighted auditors, gave more insight to upper management and raised more internal money to hire developers. We gained millions in tax savings, by using Scrum Product Backlog Items to more accurately document and capitalize our software work.

I hope to arm you with perspectives and resources to make the accounting argument for agile capitalization, potentially reducing your company’s tax burden, increasing available funds for engineers, and making your auditors happy.

Agile is real money.