I just finished with my first github fork, the maven-protoc-plugin. David Trott's original works great, but does not provide all the options available to the protoc program. I added --descriptor_set_out and --include_imports. Enjoy!
Monday, September 26, 2011
My first github fork - maven-protoc-plugin
Maven, the ideal vs. the reality
Rob Williams describes his maven woes. His solution is nexus, which is great software. But why is this Rube Goldberg setup required? Again, maven.
Friday, September 23, 2011
Web application scaling punch list
Sean Hull posts a punch list of do-nots for web application scalability:
- Object Relational Mappers
- Synchronous, Serial, Coupled or Locking Processes
- One Copy of Your Database
- Having No Metrics
- Lack of Feature Flags
A good introduction for intermediate web developers looking to broaden their perspective: read the whole thing.
Thursday, September 22, 2011
Java code injection
Jakub Holý posts a first-rate introduction to code injection in Java. As Jakub points out, "The coolest thing is that it enables you to modify third party, closed-source classes and actually even JVM classes."
My personal experience with AOP and code injection is mixed. I generally prefer code generation and classpath fiddling, but Jakub is right that one should use the best tool for the job. If a tool is unfamiliar, it's an opportunity.
Non-portable
Oh, yeah... The "_np" suffix means "non-portable". The pthread_getattr_np(3) function is available on glibc and bionic, but not (afaik) on Mac OS. But let's face it; any app that genuinely needs to query stack addresses and sizes is probably going to contain an #ifdef or two anyway...
Friday, September 09, 2011
Generic database records in Java
I haven't posted code in too long. In part I am enjoying work so much these days I don't need this blog as an outlet for my creativity. Still, I feel a touch of guilt.
In a trivial context this came up: How to represent a database record efficiently and generically in Java? By efficiently I mean close to the efficiency of standard Java beans. By generically I mean without the custom writing of standard Java beans. The traditional map representation is certainly generic and simple to write but is not efficient in time or space.
With one condition a nice solution arises: "generically" applies only at compile time; that is, I know at compile time the types and labels of the columns read from the database record. That solution: EnumMap.
Some code:
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.EnumMap;
import java.util.Map;

import sun.misc.SharedSecrets; // JDK-internal, not a public API

// A typed accessor: pulls one value out of a source of type R.
interface Field<R, X extends Exception> {
    <T> T get(final R set) throws X;
}

// A record backed by an EnumMap: one slot per enum constant.
class EnumRecord<R, X extends Exception, E extends Enum<E> & Field<R, X>>
        implements Field<E, RuntimeException> {
    private final Map<E, Object> fields;

    EnumRecord(Class<E> enumType, R set) throws X {
        this(enumType, getKeyUniverse(enumType), set);
    }

    EnumRecord(Class<E> enumType, E[] values, R set) throws X {
        fields = new EnumMap<E, Object>(enumType);
        for (E field : values)
            fields.put(field, field.get(set));
    }

    EnumRecord(Class<E> enumType, Iterable<E> values, R set) throws X {
        fields = new EnumMap<E, Object>(enumType);
        for (E field : values)
            fields.put(field, field.get(set));
    }

    // Unchecked cast: the caller's expected type must match what the field stored.
    @Override
    @SuppressWarnings("unchecked")
    public final <T> T get(E value) {
        return (T) fields.get(value);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;
        if (null == o || getClass() != o.getClass())
            return false;
        EnumRecord<?, ?, ?> detail = (EnumRecord<?, ?, ?>) o;
        return fields.equals(detail.fields);
    }

    @Override
    public int hashCode() {
        return fields.hashCode();
    }

    private static <K extends Enum<K>> K[] getKeyUniverse(Class<K> enumType) {
        // Copied shamelessly from EnumMap
        return SharedSecrets.getJavaLangAccess()
                .getEnumConstantsShared(enumType);
    }
}

class ResultSetRecord<E extends Enum<E> & Field<ResultSet, SQLException>>
        extends EnumRecord<ResultSet, SQLException, E> {
    ResultSetRecord(Class<E> enumType, ResultSet set) throws SQLException {
        super(enumType, set);
    }

    ResultSetRecord(Class<E> enumType, E[] values, ResultSet set)
            throws SQLException {
        super(enumType, values, set);
    }

    ResultSetRecord(Class<E> enumType, Iterable<E> values, ResultSet set)
            throws SQLException {
        super(enumType, values, set);
    }
}

// Each constant is named after its column and knows how to read itself.
enum SampleSQLEnum implements Field<ResultSet, SQLException> {
    nick_name() {
        @Override
        public String get(ResultSet set) throws SQLException {
            return getString(set);
        }
    },
    lucky_number() {
        @Override
        public Integer get(final ResultSet set) throws SQLException {
            return getInteger(set);
        }
    };

    @Override
    public abstract <T> T get(final ResultSet set) throws SQLException;

    // Returns null for SQL NULL and for blank strings.
    protected String getString(ResultSet set) throws SQLException {
        String value = set.getString(name());
        if (value == null)
            return null;
        value = value.trim();
        return value.isEmpty() ? null : value;
    }

    // Returns null for SQL NULL rather than getInt's default of 0.
    protected Integer getInteger(ResultSet set) throws SQLException {
        int value = set.getInt(name());
        return set.wasNull() ? null : value;
    }
}

class SampleRecord extends ResultSetRecord<SampleSQLEnum> {
    SampleRecord(ResultSet set) throws SQLException {
        super(SampleSQLEnum.class, SampleSQLEnum.values(), set);
    }

    String nick_name() {
        return get(SampleSQLEnum.nick_name);
    }

    Integer lucky_number() {
        return get(SampleSQLEnum.lucky_number);
    }
}

public class SampleRecordMain {
    public static void main(final String... args) throws SQLException {
        SampleRecord record = new SampleRecord(readFromSomewhere());
        System.out.println(record.nick_name());
        System.out.println(record.lucky_number());
    }

    // Placeholder: supply a ResultSet positioned on a row.
    private static ResultSet readFromSomewhere() {
        throw new UnsupportedOperationException("supply a real ResultSet");
    }
}
UPDATE: Some have had trouble importing SharedSecrets. I used it as a convenience to avoid reflection. As an alternative you could reflect over the enum to implement getKeyUniverse yourself.
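For anyone taking that alternative, here is a minimal sketch of getKeyUniverse using only the public Class.getEnumConstants() method. The KeyUniverse class and Column enum are hypothetical stand-ins for illustration, not part of the code above. Unlike the shared array from SharedSecrets, getEnumConstants() hands back a fresh copy on every call, so this trades a small allocation for portability.

```java
import java.util.Arrays;

public class KeyUniverse {
    // Hypothetical enum standing in for SampleSQLEnum.
    enum Column { nick_name, lucky_number }

    // Same signature as getKeyUniverse in the post, public API only.
    // Class.getEnumConstants() returns a new array each call; it would
    // return null for a non-enum class, which the type bound rules out.
    static <K extends Enum<K>> K[] getKeyUniverse(Class<K> enumType) {
        return enumType.getEnumConstants();
    }

    public static void main(String[] args) {
        // prints [nick_name, lucky_number]
        System.out.println(Arrays.toString(getKeyUniverse(Column.class)));
    }
}
```

Dropping this body into EnumRecord should remove the need for the sun.misc import altogether.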
Is state wrong?
An interesting take on state by Tony Arcieri summarized here. My favorite passage is on immutability:
In mutable state languages, performance problems can often be mitigated by mutating local (i.e. non-shared) state instead of creating new objects. To give an example from the Ruby language, combining two strings with the + operator, which creates a new string from two old ones, is significantly slower than combining two strings with the concatenating << operator, which modifies the original string. Mutating state rather than creating new objects means there's fewer objects for the garbage collector to clean up and helps keep your program in-cache on inner loops. If you've seen Cliff Click's crash course on modern hardware, you're probably familiar with the idea that latency from cache misses is quickly becoming the dominating factor in today's software performance. Too much object creation blows the cache.
Cliff Click also covered Actors, the underpinning of Erlang's concurrency model, in his Concurrency Revolution from a Hardware Perspective talk at JavaOne. One takeaway from this is that actors should provide a safe system for mutable state, because all mutable state is confined to actors which only communicate using messages. Actors should facilitate a shared-nothing system where concurrent state mutations are impossible because no two actors share state and rely on messages for all synchronization and state exchange.
The Kilim library for Java provides a fast zero-copy messaging system for Java which still enables mutable state. In Kilim, when one actor sends a message, it loses visibility of the object it sends, and it becomes the responsibility of the recipient. If both actors need a copy of the message, the sender can make a copy of an object before it's sent to the recipient. Again, Erlang doesn't provide zero-copy (except for binaries) so Kilim's worst case is actually Erlang's best case.
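To make the "mutable state confined to an actor" idea concrete, here is a rough sketch in plain java.util.concurrent. It is neither Kilim nor Erlang, and the CounterActor class, its STOP message and the sumViaActor helper are all made up for illustration. The running total is ordinary mutable state, but only the actor's own thread ever touches it; everyone else can only send messages.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class CounterActor implements Runnable {
    // Hypothetical poison-pill message telling the actor to stop.
    private static final int STOP = Integer.MIN_VALUE;

    private final BlockingQueue<Integer> mailbox = new LinkedBlockingQueue<>();
    private long total; // mutable, but touched only by the actor's thread

    void send(int amount) {
        mailbox.add(amount);
    }

    @Override
    public void run() {
        try {
            // Process one message at a time until STOP arrives.
            for (int msg = mailbox.take(); msg != STOP; msg = mailbox.take()) {
                total += msg; // no lock needed: single consumer
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Several sender threads each send `perSender` ones to a single actor.
    static long sumViaActor(int senders, int perSender) {
        final CounterActor actor = new CounterActor();
        Thread actorThread = new Thread(actor);
        actorThread.start();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perSender; j++) actor.send(1);
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
            actor.send(STOP);   // queued after every other message
            actorThread.join(); // join makes the final total visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return actor.total;
    }

    public static void main(String[] args) {
        System.out.println(sumViaActor(4, 1000)); // prints 4000
    }
}
```

The messages here are immutable Integers, so nothing is shared by accident. Kilim's zero-copy handoff of mutable messages, as described in the passage, needs the ownership check that this sketch does not attempt.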
Thursday, September 08, 2011
Props for github
You cannot pay for an endorsement like this: Linus Torvalds posts the Linux kernel source on github rather than kernel.org, at least for now.
Sunday, September 04, 2011
Paean to the state machine
Alan Skorkin writes a paean to the state machine. I've only written state machines for two classes of problems. They are indispensable for scanner-parsers, but I did not do the actual writing: a tool took my grammar and wrote the state machine for me. And for small logic problems I've written trivial state machines around the switch/case statement.
I think Skorkin (and van Bergen, whom he references) overlook the main reason few programmers write state machines: they are hard to reason about, which makes them hard to write and hard to test.
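For a sense of what a trivial switch/case state machine of the kind mentioned above looks like, here is a small sketch. The IntegerRecognizer class and its states are invented for this example: it accepts an optionally signed run of ASCII digits and rejects everything else.

```java
public class IntegerRecognizer {
    enum State { START, SIGN, DIGITS, REJECT }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // One transition: current state plus next character gives the next state.
    static State step(State state, char c) {
        switch (state) {
            case START:
                if (c == '+' || c == '-') return State.SIGN;
                return isAsciiDigit(c) ? State.DIGITS : State.REJECT;
            case SIGN:
            case DIGITS:
                return isAsciiDigit(c) ? State.DIGITS : State.REJECT;
            default:
                return State.REJECT; // REJECT is a trap state
        }
    }

    // Accept only if the machine ends in DIGITS after consuming all input.
    static boolean accepts(String input) {
        State state = State.START;
        for (int i = 0; i < input.length(); i++) {
            state = step(state, input.charAt(i));
        }
        return state == State.DIGITS;
    }

    public static void main(String[] args) {
        System.out.println(accepts("-42")); // prints true
        System.out.println(accepts("4-2")); // prints false
        System.out.println(accepts("+"));   // prints false
    }
}
```

Even at four states the testing burden shows: every state needs a case for every class of input, and a missed transition only surfaces when some particular input sequence reaches it.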