Wednesday, December 29, 2004

Using apt

While exploring the new JDK 5 tool apt (annotation processing tool), I figured out how to write new source code that is compiled into my build tree alongside my regular Java sources. Here is a trivial example.

First, I need an annotation processing factory (I ignore imports and such throughout):

public class MyAnnotationProcessorFactory
        implements AnnotationProcessorFactory {
    public Collection&lt;String&gt; supportedOptions() {
        return Collections.emptySet();
    }

    public Collection&lt;String&gt; supportedAnnotationTypes() {
        return Collections.singleton(
                getClass().getPackage().getName() + ".*");
    }

    public AnnotationProcessor getProcessorFor(
            final Set&lt;AnnotationTypeDeclaration&gt; atds,
            final AnnotationProcessorEnvironment env) {
        return new MyAnnotationProcessor(atds, env);
    }
}

Second, I need the annotation processor itself:

public class MyAnnotationProcessor
        implements AnnotationProcessor {
    private final Set&lt;AnnotationTypeDeclaration&gt; atds;
    private final AnnotationProcessorEnvironment env;

    MyAnnotationProcessor(final Set&lt;AnnotationTypeDeclaration&gt; atds,
            final AnnotationProcessorEnvironment env) {
        this.atds = atds;
        this.env = env;
    }

    public void process() {
        for (final AnnotationTypeDeclaration atd : atds) {
            for (final Declaration decl
                    : env.getDeclarationsAnnotatedWith(atd)) {
                final String typeName = decl.getSimpleName() + "Example";
                final String fullTypeName = getPackageName() + "." + typeName;

                try {
                    final PrintWriter writer
                            = env.getFiler().createSourceFile(fullTypeName);

                    writer.println("package " + getPackageName() + ";");
                    writer.println("public class " + typeName + " {");
                    writer.println("}");
                    // Close the writer when done so the tool sees the completed file
                    writer.close();
                } catch (final IOException e) {
                    throw new RuntimeException(fullTypeName, e);
                }
            }
        }
    }

    private String getPackageName() {
        return getClass().getPackage().getName();
    }
}

Last, I tell apt how to fit it all together (I'm using a Maven-style layout):

apt -cp target/classes
    -s target/gen-java -d target/gen-classes
    -target 1.5 -factorypath target/classes
    -factory MyAnnotationProcessorFactory
    src/java/AnnotatedExample

When I run the apt command, given suitable annotations in AnnotatedExample, it pulls them out, instantiates my annotation processor via my factory, and hands the declarations to its process() method. The key is com.sun.mirror.apt.Filer, an interface in $JAVA_HOME/lib/tools.jar. There are no online javadocs that I have found yet. Here is what the JDK 1.5.0_01 sources say about the Filer interface:

This interface supports the creation of new files by an annotation processor. Files created in this way will be known to the annotation processing tool implementing this interface, better enabling the tool to manage them. Four kinds of files are distinguished: source files, class files, other text files, and other binary files. The latter two are collectively referred to as auxiliary files.

There are two distinguished locations (subtrees within the file system) where newly created files are placed: one for new source files, and one for new class files. (These might be specified on a tool's command line, for example, using flags such as -s and -d.) Auxiliary files may be created in either location.

During each run of an annotation processing tool, a file with a given pathname may be created only once. If that file already exists before the first attempt to create it, the old contents will be deleted. Any subsequent attempt to create the same file during a run will fail.

My next step is to glue Velocity into my processor so I can use templates for writing the new Java sources.

Tuesday, December 28, 2004

The future in futures

Brian McCallister has started a great set of posts on futures. This fits in perfectly with my work with Gregor Hohpe on Mercury, a light-weight messaging library for Java. Brian explains futures:

Futures are really nice conceptually, and provide for much more natural and easy to use concurrent design than Java style threading and monitors. It relates to a lot of functional programming concepts, but the key idea is that a future represents an evaluation which has not yet occurred. Maybe it is being lazily evaluated, maybe it is being evaluated concurrently in another thread.

Java has no futures, but I do fake closures using reflection and a sort of distended proxy. The core of Mercury is that when code publishes a message to a channel, it is not immediately delivered. Instead, the channel records an activation record (binding) and begins reordering the records before activating the bindings. This lets Mercury have breadth-first message delivery instead of depth-first, and circumvents the effects of using a call stack in a messaging system.

Gregor's excellent Enemy of the State post describes the thinking in more detail along with some ramifications. I'm just the humble mechanic who coded the library. :-)

UPDATE: I just got an anonymous comment pointing me to java.util.concurrent.Future. Zowie! That's a cool thing to find.
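For the record, a minimal sketch of that java.util.concurrent.Future in action; the class name and the toy computation are mine, not Mercury's:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureExample {
    // Submit a computation and get back a handle to its eventual result.
    public static int computeAsync()
            throws InterruptedException, ExecutionException {
        final ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            final Future<Integer> future
                    = executor.submit(new Callable<Integer>() {
                public Integer call() {
                    return 6 * 7; // evaluated concurrently in another thread
                }
            });
            return future.get(); // blocks until the evaluation completes
        } finally {
            executor.shutdown();
        }
    }

    public static void main(final String[] args) throws Exception {
        System.out.println(computeAsync()); // prints 42
    }
}
```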

Sunday, December 26, 2004

Using JDK 5 varargs for testing

One JDK 5 feature I have started using to clean up my testing is varargs, the ... (or &hellip; for the entity-aware). It is particularly elegant to turn this:

public void testWidgetProcessor() throws Exception {
    final Widget expected1 = createWidget(), expected2 = createWidget();

    processor.swallow(new Widget[]{
        expected1,
        expected2
    });

    assertWidgetsGained(new Widget[] {
        expected1,
        expected2
    });
}

Into this:

public void testWidgetProcessor() throws Exception {
    final Widget expected1 = createWidget(), expected2 = createWidget();

    processor.swallow(expected1, expected2);

    assertWidgetsGained(expected1, expected2);
}

And even better:

public void testWidgetProcessorHandlesTrivialCase() throws Exception {
    assertWidgetsGained(); // nothing swallowed, nothing gained
}
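For completeness, a self-contained sketch of how such a varargs assertion helper might look; Widget, WidgetProcessor, and the helper bodies are hypothetical stand-ins, not the real test fixtures:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VarargsExample {
    static class Widget { }

    // A toy processor that records swallowed widgets (names are hypothetical).
    static class WidgetProcessor {
        final List<Widget> gained = new ArrayList<Widget>();

        void swallow(final Widget... widgets) {
            gained.addAll(Arrays.asList(widgets));
        }
    }

    // Varargs lets callers omit the explicit array, including the empty case.
    static void assertWidgetsGained(final WidgetProcessor processor,
            final Widget... expected) {
        if (!processor.gained.equals(Arrays.asList(expected)))
            throw new AssertionError("expected " + expected.length
                    + " widgets, gained " + processor.gained.size());
    }

    public static void main(final String[] args) {
        final WidgetProcessor processor = new WidgetProcessor();
        assertWidgetsGained(processor); // nothing swallowed, nothing gained

        final Widget w1 = new Widget(), w2 = new Widget();
        processor.swallow(w1, w2);
        assertWidgetsGained(processor, w1, w2);
        System.out.println("ok");
    }
}
```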

XOM Design Principles

A very, very interesting article, XOM Design Principles, on Java library design. What took me particularly off-guard was the preference for classes over interfaces. The argument over this point presents much to digest.

Sunday, December 19, 2004

What are Java annotations?

This must be described elsewhere, but a quick Googling didn't give it to me. A little experimentation reveals to me that JDK 5.0 annotations are dynamic proxies under the hood.

To find this out, I made an annotation named @annotate and examined annotate.class; it revealed itself to be an interface. I then decorated a method with @annotate, got the Method with reflection, pulled the appropriate annotation class off with getAnnotation(annotate.class).getClass() and examined that: a dynamic proxy, $Proxy3(java.lang.reflect.InvocationHandler).

I wonder how I can use this knowledge for some real Java-fu.

UPDATE: An even stronger answer: Proxy.isProxyClass(Class) returns true on the method annotation class, and the proxy invocation handler is a sun.reflect.annotation.AnnotationInvocationHandler. Good thing Sun provides source for the non-public bits.
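The experiment is easy to reproduce. This sketch bundles the whole check into one class; the @annotate name mirrors the post, the rest is mine, and the result reflects Sun's implementation (and its OpenJDK descendants) rather than a specification guarantee:

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class AnnotationProxyCheck {
    @Retention(RetentionPolicy.RUNTIME)
    @interface annotate { }

    @annotate
    static void decorated() { }

    // True when the runtime annotation object is a dynamic proxy.
    public static boolean isAnnotationAProxy() throws NoSuchMethodException {
        final Method method = AnnotationProxyCheck.class
                .getDeclaredMethod("decorated");
        final Annotation annotation = method.getAnnotation(annotate.class);
        return Proxy.isProxyClass(annotation.getClass());
    }

    public static void main(final String[] args) throws Exception {
        System.out.println(isAnnotationAProxy());
    }
}
```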

Saturday, December 18, 2004

A taste of things to come

Now to make it work!

/**
 * Marks a method with a <em>pre-condition</em>.  The required annotation
 * {@link #value()} is a <code>boolean</code> condition.  The optional
 * {@link #message()} is used in {@link DBCException} if the condition
 * fails.
 */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface pre {
    boolean value();

    String message() default "";
}

/**
 * Marks a method with a <em>post-condition</em>.  The required annotation
 * {@link #value()} is a <code>boolean</code> condition.  The optional
 * {@link #message()} is used in {@link DBCException} if the condition
 * fails.
 */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface post {
    boolean value();

    String message() default "";
}

/**
 * Marks a method with an <em>invariant</em>.  The required annotation
 * {@link #value()} is a <code>boolean</code> invariant.  The optional
 * {@link #message()} is used in {@link DBCException} if the invariant
 * fails.
 */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface inv {
    boolean value();

    String message() default "";
}

UPDATE: I'll have to think about this longer. Turns out that annotations only accept constants or expressions which evaluate to constants. A constant pre-condition isn't very interesting.
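One common workaround, sketched below with hypothetical names, is to store the condition as a String for some later evaluator, since annotation members must be compile-time constants; the annotation then merely carries the condition text:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class PreConditionSketch {
    // The condition is now a String to be evaluated at runtime
    // (for example by an expression interpreter), not by javac.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface pre {
        String value();

        String message() default "";
    }

    @pre(value = "n >= 0", message = "n must be non-negative")
    static int square(final int n) {
        return n * n;
    }

    // A checker can at least read the condition text back via reflection.
    public static String conditionFor(final String methodName,
            final Class<?>... params) throws NoSuchMethodException {
        final Method method = PreConditionSketch.class
                .getDeclaredMethod(methodName, params);
        return method.getAnnotation(pre.class).value();
    }

    public static void main(final String[] args) throws Exception {
        System.out.println(conditionFor("square", int.class)); // n >= 0
    }
}
```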

Friday, December 17, 2004

Looking up the location of a class file

IBM's JAR Class Finder reminds me of a snippet of utility code I wrote for PCGen which used a clever trick to find the location on disk of the definition for a given class, either a jar or a directory in the classpath:

new File(clazz.getProtectionDomain().getCodeSource().getLocation().toURI().normalize())

The only drawback: for the JDK itself the protection domain has a null code source. Since Class.getProtectionDomain() relies internally on a native method to create the protection domain, I'm unsure if this is a bug or intended behavior. It certainly isn't documented in the javadocs. I consider this a bug, but others disagree and it has changed between JDK versions.
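The snippet is easy to wrap into a helper that also handles the null code source case; the method and class names here are mine:

```java
import java.io.File;
import java.net.URISyntaxException;
import java.security.CodeSource;

public class ClassLocation {
    // Returns the jar or classpath directory defining clazz, or null for
    // bootstrap classes (whose code source is null, as noted above).
    public static File locationOf(final Class<?> clazz)
            throws URISyntaxException {
        final CodeSource source
                = clazz.getProtectionDomain().getCodeSource();
        if (null == source) return null;
        return new File(source.getLocation().toURI().normalize());
    }

    public static void main(final String[] args) throws Exception {
        System.out.println(locationOf(ClassLocation.class)); // this class's dir or jar
        System.out.println(locationOf(String.class));        // null on the JDK
    }
}
```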

Thursday, December 16, 2004

Clever IDEA

I just noticed very clever behavior by IntelliJ IDEA 4.5.3 on Windows. I ran a junit test that crashed the test JVM from badly behaved JNI code. Unfortunately, this caused the JVM for IDEA itself to eventually exhaust memory and crash. IDEA handled this well, informed me of the problem, and gracefully shutdown.

The cool part: when I launched it again, it ran the console window for the LAX launch wrapper. Normally this does not appear so that you just have the main application window, but the console emits all sorts of interesting information useful to someone developing IDEA. Presumably, had the crash I mentioned been IDEA's fault, I could use that information to help IntelliJ fix the bug.

Further, I closed IDEA normally and relaunched it. This time it came up normally with no extra console window. The console window feature is only triggered by a crash in IDEA. Nifty!

Wednesday, December 15, 2004

Messaging with annotations

I'm working on a talk for SD West 2005 with a former coworker, Gregor Hohpe of ThoughtWorks. We coded a simple Java messaging system for single VM apps. Most messaging systems are based on messages segregated by subject, usually a String field. Our system though is type-based. Messages implement (or extend) Message, and receivers provide methods receive(Message) to receive published messages.

The dispatch loop makes extensive use of reflection to find a "best matching" method. If you have:

public class FooReceiver extends Receiver {
    public void receive(final Message message) { }

    public void receive(final FooMessage message) { }
}

Then if a FooMessage (or a subtype) comes along, the better matching method receives the message, otherwise the more general method does.
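A sketch of such "best matching" reflective dispatch, with toy Message types standing in for Mercury's real ones; the dispatch loop and names are my illustration, not the library's code:

```java
import java.lang.reflect.Method;

public class BestMatchDispatch {
    static class Message { }
    static class FooMessage extends Message { }

    static String lastCall;

    static class FooReceiver {
        public void receive(final Message message) { lastCall = "general"; }
        public void receive(final FooMessage message) { lastCall = "specific"; }
    }

    // Pick the receive method whose parameter type is the most specific
    // type still assignable from the message's runtime class.
    static void dispatch(final Object receiver, final Message message)
            throws Exception {
        Method best = null;
        for (final Method method : receiver.getClass().getMethods()) {
            if (!"receive".equals(method.getName())) continue;
            if (method.getParameterTypes().length != 1) continue;
            final Class<?> param = method.getParameterTypes()[0];
            if (!param.isAssignableFrom(message.getClass())) continue;
            // A narrower parameter type beats a wider one.
            if (best == null
                    || best.getParameterTypes()[0].isAssignableFrom(param))
                best = method;
        }
        best.invoke(receiver, message);
    }

    public static void main(final String[] args) throws Exception {
        dispatch(new FooReceiver(), new FooMessage());
        System.out.println(lastCall); // the FooMessage overload wins
        dispatch(new FooReceiver(), new Message());
        System.out.println(lastCall); // falls back to the general method
    }
}
```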

But I have found a better way: annotations :-) Consider:

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface MessageHandler {
    Class[] value();
}

This bit of annotation magic lets me say this in my receiver class:

public class FooReceiver {
    @MessageHandler({Message.class})
    void receiveAllTypes(final Message message) { }

    @MessageHandler({FooMessage.class})
    void receiveFooTypes(final FooMessage message) { }
}

Now if a FooMessage comes along, both methods see the message: logically, there is no method overloading to consider since they are differently-named methods. But this has several other advantages:

  • No need to extend or implement anything; just annotate the methods
  • Methods can have better names than just receive; it's the annotation, not the name, which counts
  • Interestingly, you can write methods to handle disjoint types:
public class DisjointReceiver {
    @MessageHandler({FooMessage.class, BarMessage.class})
    void handleEitherFooOrBarType(final Message message) { }
}

Nifty — there is no need for FooMessage or BarMessage to be related types. In fact, to carry the disjunction one step further, I could dispense entirely with the Message class! Hmmm...
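A minimal dispatcher for this annotation might look like the following sketch; the publish logic and the recording list are my invention for illustration, and, following the closing thought above, messages here are plain Objects with no Message class at all:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class AnnotationDispatch {
    static class FooMessage { }
    static class BarMessage { }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface MessageHandler {
        Class[] value();
    }

    static final List<String> calls = new ArrayList<String>();

    static class DisjointReceiver {
        @MessageHandler({FooMessage.class, BarMessage.class})
        void handleEitherFooOrBarType(final Object message) {
            calls.add("either:" + message.getClass().getSimpleName());
        }

        @MessageHandler({FooMessage.class})
        void handleFooOnly(final Object message) {
            calls.add("foo:" + message.getClass().getSimpleName());
        }
    }

    // Deliver to every method whose @MessageHandler lists a type
    // assignable from the message; method names no longer matter.
    static void publish(final Object receiver, final Object message)
            throws Exception {
        for (final Method method : receiver.getClass().getDeclaredMethods()) {
            final MessageHandler handler
                    = method.getAnnotation(MessageHandler.class);
            if (null == handler) continue;
            for (final Class<?> type : handler.value())
                if (type.isAssignableFrom(message.getClass())) {
                    method.invoke(receiver, message);
                    break;
                }
        }
    }

    public static void main(final String[] args) throws Exception {
        publish(new DisjointReceiver(), new FooMessage()); // hits both handlers
        publish(new DisjointReceiver(), new BarMessage()); // hits only "either"
        System.out.println(calls);
    }
}
```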

UPDATE: I fixed the bad links. That's what I get for blogging while woozy with flu.

Maven and test cases

I have a problem with maven I am hoping someone knows the best solution to. As is typical with Java projects using maven, I have separate src/java and src/test trees and they compile to build/classes and build/test-classes, respectively. When I build a distribution jar, maven zips up build/classes into the jar (along with LICENSE.txt and any resources). This much I know and appreciate.

However, my project uses JUnit for testing, so I have *Test classes, one for each class under test. And my project provides *TestCase base classes for other test classes to extend. The test case classes add functionality to junit.framework.TestCase. These live under the src/test tree since they are only used for testing, and they are dependent on junit-3.8.1.jar. Of course, they compile to build/test-classes.

But since the dist goal only packages up build/classes, the test case classes do not become part of my distribution jar. And I want to package them for distribution. Oops.

For now, I've moved the test case classes from src/test to src/java so that maven will bundle them in the distribution jar, but I feel awkward doing that.

Is there some way to teach maven to pull src/test/**/*TestCase.java classes from build/test-classes into the distribution jar, but no other test classes?

Delegation, a problem and solution

I'm adding unit testing to existing code and encounter this problem:

public class Foo {
    private static Foo SINGLETON;
    private static Trouble TROUBLE;

    private Foo() { }

    public static Foo newInstance() {
        if (null == SINGLETON) SINGLETON = new Foo();
        return SINGLETON;
    }

    public Trouble getTrouble() {
        if (null == TROUBLE) TROUBLE = new Trouble();
        return TROUBLE;
    }

    public void doSomething() {
        getTrouble().doSomethingElse();
    }
}

Callers are expected to write Foo.newInstance().doSomething(). Fair enough. For unit testing, I want to replace SINGLETON with a stub or mock object that extends Foo in setUp() and put it back to null in tearDown(). Foo is not under test, but classes I am testing call to it and I need to control the interaction.

What about TROUBLE? It is a complex object with its own behavior, so I need to stub or mock it as well. But here's the rub: I don't have control over its source. It could be a SWIG-generated wrapper for JNI, or from a third-party jar I lack the sources to. And the class looks something like this:

public final class Trouble {
    public void doSomethingElse() { }
}

Oops! No extending for me. What to do? It is time to rely on delegation.

First, extract an interface from Trouble, say Troubling, which contains all the public methods in Trouble. We cannot change Trouble to implement Troubling, but I'll overcome that in a moment.

Second, update Foo and all callers of Foo.getTrouble to refer to Troubling and not Trouble. This decouples them from dependency on Trouble.

Third, create a new class which implements Troubling and delegates to Trouble:

public class NoTrouble implements Troubling {
    private final Trouble trouble;

    public NoTrouble(final Trouble trouble) {
        this.trouble = trouble;
    }

    public void doSomethingElse() {
        trouble.doSomethingElse();
    }
}

Lastly, update Foo to use NoTrouble instead of Trouble:

public class Foo {
    private static Foo SINGLETON;
    private static Troubling TROUBLE;

    private Foo() { }

    public static Foo newInstance() {
        if (null == SINGLETON) SINGLETON = new Foo();
        return SINGLETON;
    }

    public Troubling getTrouble() {
        if (null == TROUBLE) TROUBLE = new NoTrouble(new Trouble());
        return TROUBLE;
    }

    public void doSomething() {
        getTrouble().doSomethingElse();
    }
}

We're following the ancient dictum, any problem can be solved by introducing an extra level of indirection. Foo was already delegating doSomething() to Trouble; we just replaced that relationship with an extra level of delegation, Foo to NoTrouble to Trouble. Now I can mock or stub NoTrouble without needing access to Trouble.
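A quick sketch of the payoff: with Troubling extracted, a hand-rolled stub replaces the final Trouble class in tests. All the players are squeezed into one class here for illustration; the real ones live in their own files:

```java
public class DelegationTest {
    interface Troubling {
        void doSomethingElse();
    }

    // Stands in for the final, third-party Trouble class.
    static final class Trouble {
        public void doSomethingElse() { }
    }

    // Production path: delegate to the real Trouble.
    static class NoTrouble implements Troubling {
        private final Trouble trouble;

        NoTrouble(final Trouble trouble) { this.trouble = trouble; }

        public void doSomethingElse() { trouble.doSomethingElse(); }
    }

    // Test path: Trouble is final, but Troubling is not.
    static class StubTroubling implements Troubling {
        int calls;

        public void doSomethingElse() { calls++; }
    }

    // Code under test depends only on the Troubling interface.
    static void doSomething(final Troubling troubling) {
        troubling.doSomethingElse();
    }

    public static void main(final String[] args) {
        final StubTroubling stub = new StubTroubling();
        doSomething(stub); // no real Trouble involved
        doSomething(new NoTrouble(new Trouble())); // production still works
        System.out.println("stub saw " + stub.calls + " call");
    }
}
```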

Go have a peanut butter sandwich.

Saturday, December 11, 2004

The easy way

While updating Java code to permit replacement of one implementation of a system-wide object with another (in this case, a physical device needed by production code but problematic to require for unit tests; I want to use a mock or stub implementation for testing to avoid accessing a real device), I started down the inversion of control path and faced a daunting problem: large portions of the system needed updating to support appropriate constructors, new class members to hold the IoC objects, rewritten callers to cope with the new constructors, ad nauseam. Yikes!

What to do?

After some tinkering, I decided to give up on the "pure" solution and hack something just for the immediate problem at hand. Rather than adjust code all through out the system (and try to adjust the culture around me to accept those changes "just for testing"), I limited the scope of change to just those places which actually need a device object.

First I implemented the singleton pattern for the device class, then refactored all the users to call Device.newInstance() instead of new Device().

Next, I made sure the singleton implementation used lazy initialization:

private static Device SINGLETON;

public static Device newInstance() {
    if (null == SINGLETON)
        SINGLETON = new Device();

    return SINGLETON;
}

There's the crux: now testing code can put a stub or mock implementation into the SINGLETON field with reflection in setUp() and replace it with null in tearDown(). This ensures that tests see only a stub or mock implementation anywhere in the production code which uses a device object. And no code need know any details about who uses a device object or how.

Finally, tearDown sees to it that other tests get a real device implementation if they need one since it sets the singleton back to null (see the code above).

The only other update to the device object is to make the constructor protected so that a stub or mock implementation can extend the class.
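Putting the pieces together, a sketch of the setUp()/tearDown() reflection trick; Device here is a stand-in for the real device class, and the test plumbing is condensed into static methods:

```java
import java.lang.reflect.Field;

public class DeviceSingletonTest {
    static class Device {
        private static Device SINGLETON;

        protected Device() { } // protected so a stub can extend

        public static Device newInstance() {
            if (null == SINGLETON) SINGLETON = new Device();
            return SINGLETON;
        }

        public String name() { return "real device"; }
    }

    static class StubDevice extends Device {
        public String name() { return "stub device"; }
    }

    // setUp(): push a stub into the private field with reflection.
    static void setUp() throws Exception {
        field().set(null, new StubDevice());
    }

    // tearDown(): restore lazy initialization for later tests.
    static void tearDown() throws Exception {
        field().set(null, null);
    }

    private static Field field() throws NoSuchFieldException {
        final Field field = Device.class.getDeclaredField("SINGLETON");
        field.setAccessible(true);
        return field;
    }

    public static void main(final String[] args) throws Exception {
        setUp();
        System.out.println(Device.newInstance().name()); // stub device
        tearDown();
        System.out.println(Device.newInstance().name()); // real device
    }
}
```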

Thursday, December 09, 2004

What is remove in a map?

While coding a utility map class I was struck by this problem: what is the meaning of remove? That is, what does it mean to remove a key when the map has more than one value for that key? How could that happen, you ask. My map isn't the ordinary sort: it's a union map, or a view of a list of maps.

Picture a stack of maps like a column of egg cartons over which you stand. As you look down through a particular cell from above, some cartons have an egg there, some do not. My union map is as if you collapsed the stack of egg cartons. Into each cell goes some kind of combination of the eggs from the corresponding cells in the original stack. The usual rule selects the top-most egg; remove just the top-most egg, and the rule selects the next egg down in the stack of cells; if there are no eggs in any cell, the collapsed cell is also lacking.

So I am faced with the question, what does it mean to remove a key from the union map? Do I remove all eggs in the matching cell from all cartons, or do I just remove the top-most one and the cell has a new value? Either way seems to me not quite right. As a practical matter, I chose for remove to remove all values for the key in all stacked maps so that collectively they would still follow the requirements of remove in the JDK, and I provided a removeOne to remove just the top-most value. Linguistically, I'd prefer for remove to perform this function, and for a removeAll to remove the key-value pairs from all stacked maps, but that would break expectations.

Suggestions?
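For the curious, a stripped-down sketch of the egg-carton semantics. This toy uses String keys and values and only the methods under discussion, not the full java.util.Map contract:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UnionMapSketch {
    // A stack of maps; get() sees the top-most value for a key.
    private final List<Map<String, String>> maps
            = new ArrayList<Map<String, String>>();

    public void push(final Map<String, String> map) { maps.add(0, map); }

    public String get(final String key) {
        for (final Map<String, String> map : maps)
            if (map.containsKey(key)) return map.get(key);
        return null;
    }

    // remove(): drop the key from every stacked map, matching the
    // requirements of remove in java.util.Map.
    public void remove(final String key) {
        for (final Map<String, String> map : maps) map.remove(key);
    }

    // removeOne(): drop only the top-most value, exposing the next one down.
    public void removeOne(final String key) {
        for (final Map<String, String> map : maps)
            if (map.containsKey(key)) { map.remove(key); return; }
    }

    public static void main(final String[] args) {
        final UnionMapSketch union = new UnionMapSketch();
        final Map<String, String> bottom = new HashMap<String, String>();
        bottom.put("egg", "bottom");
        final Map<String, String> top = new HashMap<String, String>();
        top.put("egg", "top");
        union.push(bottom);
        union.push(top);

        System.out.println(union.get("egg")); // top
        union.removeOne("egg");
        System.out.println(union.get("egg")); // bottom
        union.remove("egg");
        System.out.println(union.get("egg")); // null
    }
}
```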

Friday, December 03, 2004

Reinventing Smalltalk

What a cool idea!

But ah-hah! We do, we can use a different implementation of FunctionTable which stores the method body as a String in Derby and uses BeanShell (or Janino!) to invoke it. So a method invocation is completely configurable at runtime! Not deploy time, runtime. If you find a bug, you can fix it without a redeploy.

Yes, Virginia, there is runtime update of objects in memory. What these two have come up with is the redefinition of object methods in a running system, no restart required. There's infrastructure to build to make it happen, but the principle is demonstrated. Java is catching up with Smalltalk bit by bit.

Thursday, December 02, 2004

Have IDEA mark all qualifying variables as final

I finally figured out a trick with IDEA that I've been searching for for more than a year now. I like to mark all qualifying variables as final when I work in Java. I used to strip final as noise, but was convinced by Chapter 2[PDF] of Hardcore Java that final is a fine thing and good coding practice.

But back to IntelliJ IDEA. Marking all qualifying variables as final is a real pain. Why not have the editor do it for me? I certainly can flag unmarked variables with warnings (Settings|Errors|Local Code Analysis|Local variable or parameter can be final).

To have the editor mark all qualifying variables globally as final, run the analysis tool on a file or your whole project with Analyze|Inspect code...|Local Code Analysis|Local variable or parameter can be final, then Run.

When the results return, the left pane displays analysis problems. Expand the Local Code Analysis node, select the Local variable or parameter can be final node underneath, and push the "lightbulb" (Apply a quickfix) button. This fixes the problem for all qualifying variables all in one go. Wonderful!

Monday, November 29, 2004

Finally! IDEA has built-in Subversion support

As of Irida build 3117 IntelliJ IDEA finally has built-in Subversion support. I just created a fresh project, looked at the options, looked at version control, and BAM!, there it was. There is nothing now to keep me from using IntelliJ IDEA for all my Java work.

However, you cannot please everyone. Go figure.

Linux VFS tutorial

What a neato article! You build a virtual filesystem in a file with emphasis on ACLs. Time for me to start playing with this more in my shell scripting; it makes an interesting alternative to traditional shar file installers.

Friday, November 26, 2004

Cool IntelliJ IDEA features for ant

I'm not used to using IDEA to run my ant builds, but I'm going to start using this feature as often as possible. I just discovered that when you run an ant target from inside IDEA, and you have the option enabled to Autoscroll to Source, IDEA jumps around in build.xml from target to target as it runs: I can visually trace ant as it executes!

To boot, when it is working on an individual class (compiling, say), it jumps to that source file as it works.

What a great visual indicator of how the build is progressing!

Wednesday, November 24, 2004

Why Java needs delegation

The problem

A problem especially exacerbated in Java by inversion of control (IoC) is composition or delegation v. inheritance. In C++, this is a non-issue given features such as multiple inheritance, private inheritance and using declarations. However, Java has no such help. Just look at this code:

public interface Commuter {
    public void walkToWork();
    public void driveToWork();
    public void flyToWork();
}

And yet another interface:

public interface Parent {
    public void produceChildren();
    public void feedChildren();
    public void disciplineChildren();
    public void teachChildren();
}

And finally a concrete class:

public class Dad
        implements Commuter, Parent {
    private Commuter commuter;
    private Parent parent;

    public Dad(final Commuter commuter, final Parent parent) {
        this.commuter = commuter;
        this.parent = parent;
    }

    public void walkToWork() {
        commuter.walkToWork();
    }

    public void driveToWork() {
        commuter.driveToWork();
    }

    public void flyToWork() {
        commuter.flyToWork();
    }

    public void produceChildren() {
        parent.produceChildren();

Hey, wait! You didn't finish! you cry. You are right; I got bored with all that redundant typing. This is the problem with delegation in Java — the language support sucks and it's all hand-made. What Java needs is proper delegation support.

In C++ it looks like this:

class Dad
  : public HoustonCommuter, public GoodParent
{
};

That's it! (Given suitable implementations of HoustonCommuter and GoodParent.) Even better, just like in Java you can change the implementation:

template<typename A, typename B>
class CommutingParent
  : public A, public B
{
};

typedef CommutingParent<HoustonCommuter, GoodParent> Dad;

If anything, it's now too amazing as you can construct a Dad from the wrong base classes, although latent typing keeps this problem theoretical. (However, note the C++ FAQ on private inheritance as a counter.)

What's to be done?

The solution

The solution is to change Java, of course. :-) I don't suggest C#'s delegate approach, and introducing new keywords is bad policy. What I suggest is both simpler and more expressive, and also more flexible. Let's turn back to the Commuter/Parent example. Try your mind on this:

class Dad implements Commuter, Parent {
    public Dad(final Commuter commuter, final Parent parent) {
        this = commuter;
        this = parent;
    }
}

What's going on here? I'm teaching the Java compiler to take meaning from an otherwise illegal statement: assignment to this. I wave my compiler-writer wand and declare that when a class implements an interface, assignment to this implements that interface with the object assigned (the RHS) provided that the RHS implements the interface.

Internally, I'd teach the compiler for this example to add a hidden Commuter __commuter instance member to Dad, assign to that member, and add the forwarding methods to Dad automatically. I'd expect reflection to work as expected and make all this clear, just as if I had coded it all by hand. No magic secrets, please.

More details

If Dad directly implements any of the methods in Commuter or Parent, those let the compiler off the hook and it generates no silent methods. That is:

public class Dad implements Commuter, Parent {
    public Dad(final Commuter commuter, final Parent parent) {
        this = commuter;
        this = parent;
    }

    public void teachChildren() {
        teachSinging();
    }

    private void teachSinging() {
        System.out.println("Mir ist so wunderbar!");
    }
}

This implementation of Dad should provide silent methods for everything except void teachChildren(). Note that this implies you can implement portions of an interface as well:

public interface CommutingParent extends Commuter, Parent {
}

public class Dad implements CommutingParent {
    public Dad(final Commuter commuter) {
        this = commuter;
    }
}

Now Dad will not compile until it implements the methods in Parent. However this leads to an interesting question: what about ambiguities?

public class Dad implements CommutingParent {
    public Dad(final Commuter commuter, final CommutingParent commutingParent) {
        this = commuter;
        this = commutingParent;
    }
}

Dad will not compile, with the compiler complaining about an ambiguity: which delegate implements Commuter, commuter or commutingParent? A simple explicit cast is the best fix for this error:

public class Dad implements CommutingParent {
    public Dad(final Commuter commuter, final CommutingParent commutingParent) {
        this = commuter;
        this = (Parent) commutingParent;
    }
}

Here the (Parent) cast extracts just the Parent implementation from commutingParent and ignores the Commuter implementation.

A final problem

Back to the earlier example:

public class Dad implements Commuter, Parent {
    public Dad(final Commuter commuter, final Parent parent) {
        this = commuter;
        this = parent;
    }

    public void teachChildren() {
        teachSinging();
    }

    private void teachSinging() {
        System.out.println("Mir ist so wunderbar!");
    }
}

Notice anything funny? Contrast with this:

public class Dad extends HoustonCommuter, GoodParent {
    public void teachChildren() {
        super.teachChildren();
        teachSinging();
    }

    private void teachSinging() {
        System.out.println("Mir ist so wunderbar!");
    }
}

Ok, we call super.teachChildren() before extending the behavior of void teachChildren() so that we add to instead of replacing the existing behavior. How can I accomplish that now with delegation? Simple: write the exact same code! Since I'm giving meaning to assignment to this, I feel free to extend the meaning of calls through super as well. The compiler takes such a call as a request to inline the code of the silent delegation method. It still does not generate the silent method itself, since we need reflection to find our own code, not the silent method.

The only remaining problem is (ok, so the preceding problem was the penultimate one, not the final one):

public class BeetleDriver implements Driver {
    public void driveToWork() {
        System.out.println("Baby you can drive my car!");
    }
}

And:

public class Dad extends BeetleDriver implements CommutingParent {
    public Dad(final CommutingParent commutingParent) {
        this = commutingParent;
    }
}

Who implements void driveToWork()? It's another ambiguity error and the class will not compile. Back to explicit delegation:

public class Dad extends BeetleDriver implements CommutingParent {
    public Dad(final CommutingParent commutingParent) {
        this = commutingParent;
    }

    public void driveToWork() {
        super.driveToWork();
    }
}

This is just like the solution above for super.teachChildren().

IntelliJ IDEA, Windows, CVS, SSH and SourceForge

I'm having trouble on one of my open source projects, PCGen, getting the fabulous IntelliJ IDEA on Windows XP to integrate with CVS over SSH to SourceForge. When I imported the project into IDEA, it read the connection settings from CVS/Root, which were :ext:binkley@cvs.sourceforge.net:/cvsroot/pcgen. Ok, then. However, IDEA was then unable to make a correct SSH connection and would simply hang. I had to kill an errant ssh.exe process.

I followed the directions on CVSWithSSH except that I really didn't want to add a dependency on PuTTY. PuTTY is great software, but I wanted Cygwin's ssh to do the job if possible.

Then I found SSH and CVS and got an idea. I checked the repository settings for my IDEA project and after some experimenting switched from :ext:binkley@cvs.sourceforge.net:/cvsroot/pcgen to :ssh:binkley@cvs.sourceforge.net:/cvsroot/pcgen. What's the difference?

Using the "ssh" protocol instead of "ext" (external, based on the CVS_RSH environment variable) told IDEA to use its own internal SSH implementation. Amazing! Now everything "just works".

UPDATE: My only complaint now is that IDEA keeps the CVS settings in my user preferences, instead of with the project. Perhaps in the next version. IDEA 5.0 is looking to have quite a nice feature set including serious CSS & HTML support, XML refactoring and the YourKit Java Profiler built right in.

Monday, November 22, 2004

Yet more JNI help

When I posted A little JNI there were several details I glossed over. One important point: the sample code leaks memory. Why? When you call FindClass, the local reference JNI returns needs freeing later on. Rather than deal with such a mundane task directly, I wrote a small helper template class and specialized it for jclass:

template<>
struct local<jclass>
{
  JNIEnv* env;
  jclass clazz;

  inline local(JNIEnv* env, const char* name) throw()
    : env(env), clazz(env->FindClass(name))
  { }

  inline ~local() throw()
  { if (clazz) env->DeleteLocalRef(clazz); }

  inline operator jclass() const throw() { return clazz; }

#warning Implement the safe bool idiom
  operator bool() const { return clazz; }
};

There is still one nit: since there is definitely a sense of "validity" for this class (did FindClass find anything?), I should implement the Safe Bool Idiom instead of blindly providing operator bool(). But, as they say, I leave that as an exercise for the reader.

Sunday, November 14, 2004

A handy JNI trick for package names

One thing I find myself doing a lot is writing "package/Class" strings in calls to C++ JNI. A shortcut: define them as static members of a struct hierarchy mirroring the Java package hierarchy. To illustrate, take java.lang.Exception. In some header goes the declaration:

extern struct java
{
  struct lang
  {
    static const char* Exception;
  } lang;
} java;

And elsewhere in the matching definition:

const char*
java::lang::Exception = "java/lang/Exception";

(Remember that JNI uses slashes instead of dots to separate package names.)

Now you can write code like this:

inline void
throwJavaException(JNIEnv* env, const std::string& msg)
{
  env->ThrowNew(env->FindClass(java.lang.Exception), msg.c_str());
}

I find that C++ program text which looks like Java program text is easier for me to read and conceptualize. I have more interesting problems to occupy my mind than mentally translating strings to class names.

A little JNI

For an integration project at work, I found myself needing to implement portions of a Java class in C++ to access a custom shared library another group wrote. JNI, of course. So I looked around and remembered SWIG for auto-generating things. The problem with SWIG, however, is that it is designed to go from C++ to Java, and I need to go in the other direction.

After some searching, I decided to just go it alone and see what happened. About a half-day later, I had a decent-looking setup. First, the GNUmakefile (for some hypothetical project named Pants). Just as important as tests is an easy build (yes, this is for Cygwin):

MAKEFILE = $(word 1,$(MAKEFILE_LIST))

TARGET = Pants

FLAGS = -g3 # -Os -W -Wall

CC = cc
CFLAGS = $(FLAGS)
CPPFLAGS = -I. -I"$(JAVA_HOME)/include" -I"$(JAVA_HOME)/include/win32"
CXX = c++
CXXFLAGS = $(FLAGS) -fabi-version=0 # -Weffc++
JAR = $(JAVA_HOME)/bin/jar
JARFLAGS =
JAVAC = $(JAVA_HOME)/bin/javac
JAVACFLAGS = -g -Xlint:all
JAVAH = $(JAVA_HOME)/bin/javah
JAVAHFLAGS =
LDFLAGS = -L. -L"$(JAVA_HOME)/bin" -L"$(JAVA_HOME)/lib"
TARGET_ARCH = -mno-cygwin # turn off Cygwin-specific dependencies

COMPILE.java = $(JAVAC) $(JAVACFLAGS)
LINK.cpp = $(CXX) $(CXXFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH)

%.d: %.cpp $(MAKEFILE)
	@$(CXX) -MM $(CPPFLAGS) $< > $@.$$$$; \
	  sed 's,\($*\)\.o[ :]*,\1.o $@ : $(MAKEFILE) ,g' < $@.$$$$ > $@; \
	  rm -f $@.$$$$

%.class: %.java $(MAKEFILE)
	$(COMPILE.java) $<

%.h: %.class
	$(JAVAH) $(JAVAHFLAGS) -jni $(patsubst %.class,%,$^)
	@touch $@ # javah doesn't update the timestamp

SRCS = $(wildcard *.cpp)

all: $(TARGET).jar

-include $(SRCS:.cpp=.d)

# Teach make to generate the header when compiling the source
$(TARGET).o: $(TARGET).h

$(TARGET).dll: $(SRCS:.cpp=.o)
	$(LINK.cpp) -shared -o $@ $^ $(LDLIBS)

$(TARGET).jar: $(TARGET).class $(TARGET).dll
	[ -e $@ ] \
	  && $(JAR) uf $@ $^ \
	  || $(JAR) cf $@ $^

clean:
	$(RM) *~ *.d *.o
	$(RM) $(TARGET).h $(TARGET).class $(TARGET).dll $(TARGET).jar

Ok, then. What is this all for?

You start with a directory listing such as:

  1. GNUmakefile
  2. Pants.cpp
  3. Pants.java

And running make compiles Pants.class, magically creates Pants.h containing the JNI bindings for any native methods in Pants.class, uses Pants.cpp to implement the methods, links Pants.dll, and finally combines Pants.class and Pants.dll into Pants.jar. Easy, peasy. The output is:

Pants.cpp:1:19: Pants.h: No such file or directory
javac -g -Xlint:all Pants.java
javah  -jni Pants
c++ -g3  -fabi-version=0  -I. -I"$JAVA_HOME/include" -I"$JAVA_HOME/include/win32" -mno-cygwin -c -o Pants.o Pants.cpp
c++ -g3  -fabi-version=0  -I. -I"$JAVA_HOME/include" -I"$JAVA_HOME/include/win32" -L. -L"$JAVA_HOME/bin" -L"$JAVA_HOME/lib" -mno-cygwin -shared -o Pants.dll Pants.o
[ -e Pants.jar ]   && jar uf Pants.jar Pants.class Pants.dll   || jar cf Pants.jar Pants.class Pants.dll

(The warning in the first line only happens once; it is a side-effect of auto-generating header dependencies. A fix would be very welcome. And the annoying check for Pants.jar's existence is because jar isn't smart enough to create or update a JAR as needed.)

Take a trivial Pants.java:

public class Pants {
    public native void wear();
}

And a simple-minded implementation:

#include "Pants.h"

#include <iostream>

using namespace std;

/*
 * Class:     Pants
 * Method:    wear
 * Signature: ()V
 */
JNIEXPORT void JNICALL
Java_Pants_wear(JNIEnv* env, jobject self)
{
  cout << "One leg at a time." << endl;
}

Now I find it easier to work with more natural-looking C++ methods than with Java_Pants_wear(JNIEnv*, jobject). Here is my solution. Rather than trying a full-blown peer wrapper (such as JNI++ or Jace), I did the simplest thing that could possibly work. First, a hand-coded matching C++ peer class to the Java class, JPants.h:

// Emacs, this is -*- c++ -*- code.
#ifndef J_PANTS_H_
#define J_PANTS_H_

#include <jni.h>

class JPants
{
  JNIEnv* env;
  jobject self;

public:
  JPants(JNIEnv* env, jobject self);
  void wear();
};

#endif // J_PANTS_H_

Then I moved the implementation code from Pants.cpp to JPants.cpp:

#include "JPants.h"

#include <iostream>

using namespace std;

JPants::JPants(JNIEnv* env, jobject self)
  : env(env), self(self)
{
}

void
JPants::wear()
{
  cout << "One leg at a time." << endl;
}

Lastly, I updated Pants.cpp to be a purely forwarding implementation:

#include "Pants.h"
#include "JPants.h"

/*
 * Class:     Pants
 * Method:    wear
 * Signature: ()V
 */
JNIEXPORT void JNICALL
Java_Pants_wear(JNIEnv* env, jobject self)
{
  JPants(env, self).wear();
}

The only thing left is to have Pants actually do something interesting. If I were to formalize this, I'd write a simple wrapper generator to write the forwarding code and peer class, and provide some helpers such as a std::string factory for jstring. But after a while, I'd be rewriting those other packages I mentioned up front that I was avoiding. It is always so tempting to over-generalize and write meta-code instead of delivering functionality.

UPDATE: There is definitely a gotcha with using Cygwin: the DLL is fine except that Sun's JVM cannot find the symbols in it. The symptom is a java.lang.UnsatisfiedLinkError exception when using the DLL. The reason is esoteric, but the solution is straightforward. Fix GNUmakefile and replace:

$(TARGET).dll: $(SRCS:.cpp=.o)
	$(LINK.cpp) -shared -o $@ $^ $(LDLIBS)

with:

$(TARGET).dll: $(SRCS:.cpp=.o)
	$(LINK.cpp) -shared -Wl,--add-stdcall-alias -o $@ $^ $(LDLIBS)

UPDATE: I failed to include a very important bit of code in Pants.java:

static {
    System.loadLibrary("Pants");
}

Otherwise, the JVM will never find your native methods and you'll see the abysmal java.lang.UnsatisfiedLinkError. Also, I've uploaded a sample ZIP file of the code in this posting along with a simple unit test: http://binkley.fastmail.fm/Pants.zip.

Thursday, November 11, 2004

Custom URLs with Java

One of the cool things about the JDK is that it easily supports custom URLs. Say you had a cool object database that returned result sets as XML documents. You could refer to one with odb://user:pass@server/table_name?field1=blah&field2=borf#row_3 which would fetch the third object as an XML document from the query SELECT * FROM table_name WHERE field1 = 'blah' AND field2 = 'borf'. How would you teach Java to recognize the URL and forward it to your clever implementation code?

First, create a protocol handler:

package protocol.odb;

public class Handler extends URLStreamHandler {
    protected URLConnection openConnection(final URL u)
            throws IOException {
        return new OdbURLConnection(u);
    }
}

(Note the package name: the JDK requires that a protocol handler be a class named Handler in a package whose last component matches the URL scheme. The java.protocol.handler.pkgs system property lists the package prefixes to search; this is how the JDK maps the URL to your handler.)

Second, create the custom connection:

public class OdbURLConnection extends URLConnection {
    /**
     * Constructs a new OdbURLConnection.
     *
     * @param url the input URL
     *
     * @see Handler#openConnection(URL)
     * @see URLConnection#URLConnection(URL)
     */
    public OdbURLConnection(final URL url) {
        super(url);
    }

    public void connect()
            throws IOException {
        if (connected) return; // noop if already connected

        // Connect to the object database here
    }

    // Override the various getters appropriate to the object database
}

For extra credit, you can make the connection bidirectional. You could then read in an XML POST and update the object database accordingly.
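To exercise these classes end to end you must also register the handler. A self-contained sketch (the class name OdbDemo is mine) uses URL.setURLStreamHandlerFactory rather than the package-naming convention, since a single file cannot supply the required package layout; note the factory may be installed at most once per JVM:

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

public class OdbDemo {
    static {
        // Asked for "odb", supply a handler; otherwise return null so the
        // JDK falls back to its built-in handlers (http, file, etc.).
        URL.setURLStreamHandlerFactory(new URLStreamHandlerFactory() {
            public URLStreamHandler createURLStreamHandler(final String protocol) {
                if (!"odb".equals(protocol)) return null;

                return new URLStreamHandler() {
                    protected URLConnection openConnection(final URL u)
                            throws IOException {
                        // Stand-in for the OdbURLConnection above
                        return new URLConnection(u) {
                            public void connect() {
                                connected = true; // would hit the database here
                            }
                        };
                    }
                };
            }
        });
    }

    /** Opens an odb: URL; wraps the checked exception for brevity. */
    public static URLConnection openOdb(final String spec) {
        try {
            return new URL(spec).openConnection();
        } catch (final IOException e) {
            throw new RuntimeException(spec, e);
        }
    }
}
```

With the factory installed, plain new URL("odb://...") calls resolve to your connection class anywhere in the program.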

See the excellent RFC 2396 for information on URI syntax (and by extension their subset, URLs).

UPDATE: An even better resource is Chapter 24 of Java Unleashed, 2nd Edition by Michael Morrison, et al., written by Mike Fletcher and entitled Developing Content and Protocol Handlers. Excellent, and it blends well with A New Era for Java Protocol Handlers.

UPDATE: See Brian McCallister's excellent help.

Sunday, November 07, 2004

Why chaining constructors is good

Cedric writes about why chaining constructors is bad. Actually, what he advocates is concentrating all the construction logic in a single method (e.g., init), and turning constructors into forwarders to this privileged method.

But he does not mention that this is no different from having a single privileged constructor and chaining all other constructors to that. Hence:

class ChooChoo {
    public static final Station DEFAULT_STATION = new Station();
    public static final Conductor DEFAULT_CONDUCTOR = new Conductor();

    private final Station home;
    private final Conductor chief;

    public ChooChoo() {
        this(DEFAULT_STATION, DEFAULT_CONDUCTOR);
    }

    public ChooChoo(final Station home) {
        this(home, DEFAULT_CONDUCTOR);
    }

    public ChooChoo(final Conductor chief) {
        this(DEFAULT_STATION, chief);
    }

    public ChooChoo(final Station home, final Conductor chief) {
        this.home = home;
        this.chief = chief;
    }
}

In my code, the role of init is played by a privileged constructor (sometimes a private one), which I find clearer. But Cedric mentions that init increases clarity for him. We should pair together. :-)

Saturday, October 30, 2004

Premature design

We just hired a contractor to help out with my project until we can hire someone permanently. I'm having trouble keeping him focused on the coding needs of the present. He wants to look forward, design for the future, and revisit existing design decisions that are well beyond the scope of the project. I admire the breadth of his thinking, but I find his lack of faith in YAGNI disturbing.

Any suggestions?

Saturday, October 23, 2004

I hate C++

I hate C++, no, really. I've been a C++ fan since I started programming in 1991 and learned about it from a friend, Brian McDonnell, who worked on CLIPS for NASA and was a big OOP fan. It was the first language I learned (in tandem with "C") by mistaking the ARM for a more complete textbook ("annotated") rather than the compiler-writer's guide that it was. And that led to the dragon book as my third computer science text. A rough way to start, but very fun, and Larry Wall is spot on about the value of hubris.

Enough about me.

What happened to the joy of C++? And wasn't the moo book a gas? Clearly I enjoy the stuff. But now I hate C++. The explanation is easy: tools.

After a solid year of Java at its finest, returning to C++ means no good unit test libraries (cppunit being about it), no mock objects, no IDEA or decent Eclipse support, no ant or maven, no good coverage tools, no good code analysis, no good dependency analysis, on and on and on. Tools make the programmer. And C++ has dog food for tools compared to the wealth of open-source projects for Java. Now that I'm working on a C++ project, my development efforts are at least doubled for the same work, and I continually feel that I am leaving important parts of best practices out of the picture. Nor does it look like that's about to change anytime soon.

And that is why I hate C++. Goodbye, dear friend.

Sunday, October 17, 2004

Frisson

It was a small fright and not a slight pleasure to see the results of scattered conversations with my pairmate Gregor Hohpe make an appearance in such an interesting post. It is satisfying to see prominent ThoughtWorkers like Gregor and Dragos (my first pairmate at ThoughtWorks) getting broader exposure for their interesting ideas. ThoughtWorks is like grad school for good development ideas.

Thursday, October 14, 2004

Installing XP

Having moved on from ThoughtWorks for personal reasons, I find myself in an interesting position. My new employer SimDesk is a mixture of old-style top-down development practice and new-style agile practices struggling to get out. One of my top tasks is helping good triumph over bad in that struggle. What's highest on my task list?

Get CruiseControl running
Without continuous integration builds, it is very hard to track just where the source code base for a project stands. My boss is very excited by this, and now I'm just waiting for an official machine to run on.
Daily stand ups
Daily stand ups keep everyone in the loop, help developers get a larger picture of things and are a great leveler. Not happening yet, but I hope to get some buy in this week.
User story notecards
This is a major point against old-style waterfall. User story notecards are a very visible difference and lead to using development as part of the design process, and to short release cycles. Best news yet: my boss and his boss—who is acting on behalf of the actual customer—were fine with this. I have notecards stuck up on my wall now! And with XP-style estimates.

None of this would have been possible without the environment of ThoughtWorks. The place is like grad-school for best practices; a year there is worth four years at most other places. Now if only they had a Houston office.

UPDATE: I left out mention of a great resource, Extreme Programming Installed by Jeffries et al. And there is a bonus: a group here is already using an internal Wiki, which I immediately latched onto for posting story cards and working out design problems, plus some XP evangelizing. Plus a co-lead and I agreed to start daily morning stand-up meetings next week. I also find that a small satellite group in Austin is using Scrum, but I don't know much about it. All in all, a good environment to build upon.

Monday, October 11, 2004

CruiseControl and version problems

It took me several days to track down the source of my troubles, but it turns out that the latest versions of CruiseControl, Tomcat and JDK 5 do not play together. To get a working CC build results page on Windows, I need these versions:

  • Windows XP SP2
  • CruiseControl 2.1.6
  • Jakarta Tomcat 5.0.27
  • Sun JDK 1.4.2_05

Possibly JDK 1.4.2 works with Tomcat 5.0.28, but I did not try that combination. I spent enough time struggling with it that once I had a working combination, I stuck with it.

Next up: installing on production, which for us is a Linux Mandrake 9 box. No one here is foolish enough to use a Microsoft server for production.

Saturday, October 09, 2004

Templates vs. Generics

Bruce Eckel has yet another brilliant article on Java generics. I've been trying generics out quite a bit and I continue to be disappointed. I would have been much happier had Sun adopted one of the more ambitious generics projects wholesale, such as Rice's Project NextGen. Given the state of things, I would jump whole-heartedly into Nice if only it had a decent editor such as IntelliJ IDEA. At last I understand first-hand why Microsoft had such a grip on the C++ market with VisualStudio, which way back when must have seemed pretty slick to many Windows programmers.

Wednesday, October 06, 2004

Logging into NT with Java

While researching JAAS I scratch-coded this interesting bit:

final String name = "Bob the Builder";
final LoginContext context = new LoginContext(name, null, null, getNTConfiguration(name));

context.login();
context.logout();

Of course, the secret is in getNTConfiguration:

static Configuration getNTConfiguration(final String name) {
    final Map options
            = new HashMap() {
        {
            put("debug", "true");
            put("debugNative", "true");
        }
    };

    final AppConfigurationEntry[] appConfigurationEntry
            = new AppConfigurationEntry[]{
        new AppConfigurationEntry(NT_LOGIN_MODULE_NAME, REQUIRED, options),
    };

    final Map entries
            = new HashMap() {
        {
            put(name, appConfigurationEntry);
        }
    };

    return new Configuration() {
        public AppConfigurationEntry[] getAppConfigurationEntry(final String name) {
            return (AppConfigurationEntry[]) entries.get(name);
        }

        public void refresh() { }
    };
}

And the super-secret is the value of NT_LOGIN_MODULE_NAME: "com.sun.security.auth.module.NTLoginModule".

The output when I run using all the debug options is:

An attempt was made to reference a token that does not exist.
		[NTLoginModule] succeeded importing info: 
			user name = boxley
			user SID = S-1-5-21-123456789-839522115-1060284298-38670
			user domain = MYDOMAIN
			user domain SID = S-1-5-21-123456789-839522115-1060284298
			user primary group = S-1-5-21-123456789-839522115-1060284298-513
			user group = S-1-1-0
			user group = S-1-5-32-544
			user group = S-1-5-32-545
			user group = S-1-5-4
			user group = S-1-5-11
			user group = S-1-5-5-0-77027
			user group = S-1-2-0
			impersonation token = 7120
		[NTLoginModule] completed logout processing
getting access token
  [getToken] OpenThreadToken error [1008]
  [getToken] got user access token
getting user info
  [getUser] Got TokenUser info
  [getUser] userName: boxley, domainName = MYDOMAIN
  [getUser] userSid: S-1-5-21-123456789-839522115-1060284298-38670
  [getUser] domainSid: S-1-5-21-123456789-839522115-1060284298
getting primary group
  [getPrimaryGroup] Got TokenPrimaryGroup info
  [getPrimaryGroup] primaryGroup: S-1-5-21-123456789-839522115-1060284298-513
getting supplementary groups
  [getGroups] Got TokenGroups info
  [getGroups] group 0: S-1-5-21-123456789-839522115-1060284298-513
  [getGroups] group 1: S-1-1-0
  [getGroups] group 2: S-1-5-32-544
  [getGroups] group 3: S-1-5-32-545
  [getGroups] group 4: S-1-5-4
  [getGroups] group 5: S-1-5-11
  [getGroups] group 6: S-1-5-5-0-77027
  [getGroups] group 7: S-1-2-0
getting impersonation token
  [getImpersonationToken] token = 7120

Friday, October 01, 2004

Upside-down inheritance

Here is a classic, persisted object in Java:

public class Foo extends Persisted {
    // fields

    // constructors

    // getters, setters
}

Pretty dull. What's wrong with that?

Now here's a typical query-by-example (QBE) method in some finder class (assuming something suitable like Hibernate or iBatis, and a supporting framework):

public Foo findFooByExample(Foo foo) {
    return (Foo) findByExample(FOO_TABLE, foo);
}

And here's a typical data transfer object (DTO) for passing around the found Foo to some other layer of the program:

public class FooData {
    // same fields as Foo

    // same constructors as Foo

    // same getters, setters as Foo
}

And then there's methods which take a Foo and need testing:

public void doBar(Foo foo) {
    // Does Bar look at anything in Persisted, or just Foo?
    // The test code needs to mock the persisted methods, if so.  Rats.
}

Oh, wait. Hrm. Lots of code duplication, lots of overhead to make changes, extra things to test. This is not looking so good. Why is that?

The inheritance is upside-down!

Once you disabuse yourself of the preconception that domain/database objects (DO) extend a persistence base class, the solution is trivial:

public class Foo {
    // fields

    // constructors

    // getters, setters
}

public class FooPersisted extends Foo implements Persisted {
    // fields, getters, setters for persistence
}

Now the QBE example uses Foo for input, and FooPersisted for output; FooData goes away completely; and the doBar method explicitly requires either a Foo or a FooPersisted, making it clear whether or not it fiddles with persistence.

And a bonus: it is almost always less work to implement the two or three fields and getter/setters which persistence uses rather than the larger number of fields which the DO uses. And as they are the same few fields everywhere, you can automate the process using code generation or annotations.
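A minimal, compilable sketch of the inverted hierarchy (the field names and the exact shape of Persisted are my guesses; the post only names the classes):

```java
// The domain object knows nothing about persistence.
class Foo {
    private final String name;

    Foo(final String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Persistence is layered underneath the domain object, not above it.
interface Persisted {
    Long getId();
}

class FooPersisted extends Foo implements Persisted {
    private final Long id; // one of the "two or three fields" persistence needs

    FooPersisted(final Long id, final String name) {
        super(name);
        this.id = id;
    }

    public Long getId() {
        return id;
    }
}
```

A method taking Foo now provably ignores persistence; one taking FooPersisted declares that it needs it.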

(Aside: Of course, it would be even easier still if Java only supported mixins. Then you would just define FooPersisted as:

public class FooPersisted extends Foo, Persisted {
    // empty -- no further code needed
}

No new keywords; no confusion—super always refers to the first class mentioned in the extends list.

But that is a different post.)

Safer collections

In my post on IndexMap I mentioned in passing SafeHashMap. What is that?

The JDK is very useful but has some warts. One of the worst is this inconsistency: if you try to index into a list with a non-existent index (i.e., beyond the end of the list), the collections throw IndexOutOfBoundsException. But what happens when you try to fetch from a map with a non-existent key? The JDK map implementations silently hand you back null. This leads to the following anti-idiom:

Value getValue(Map map, Key key) {
    return (Value) map.get(key);
}

See the problem? Now all code dealing with map anywhere in the program needs to test for null values and decide how to handle them, or else be happy with NullPointerException:

for (Iterator it = map.values().iterator(); it.hasNext(); ) {
    Value value = (Value) it.next();

    if (null == value)
        handleNullValue();
    else
        doTheRealWorkWhichIsTheWholePoint(value);
}

Try coding that 10 times real fast.

What is the solution? Force the correct idiom in the first code fragment:

Value getValue(Map map, Key key) {
    if (!map.containsKey(key))
        map.put(key, createMissingValue(key));

    return (Value) map.get(key);
}

And to enforce this idiom, extend a concrete class such as HashMap with a safe wrapper, hence SafeHashMap, and forbid missing keys or null inserts:

public Object get(Object key) {
    if (null == key) throw new NullPointerException();
    if (!containsKey(key)) throw new IllegalArgumentException();

    return super.get(key);
}

public Object put(Object key, Object value) {
    if (null == key) throw new NullPointerException();
    if (null == value) throw new NullPointerException();

    return super.put(key, value);
}

Friday, September 24, 2004

IndexMap

In my project there are several idiomatic uses of JDK collections. One is using Map to uniquely index domain objects by some field. The code is simple:

public class IndexMap extends SafeHashMap {
    public static interface Mapper {
        public Object getKeyFor(final Object value);
    }

    private final Mapper mapper;

    public IndexMap(final Class keyClass, final Class valueClass,
            final Mapper mapper) {
        super(keyClass, valueClass);

        this.mapper = mapper;
    }

    public IndexMap(final Class keyClass, final Class valueClass,
            final Mapper mapper, final Collection values) {
        this(keyClass, valueClass, mapper);

        addAll(values);
    }

    public void add(final Object value) {
        put(mapper.getKeyFor(value), value);
    }

    public void addAll(final Collection values) {
        for (final Iterator it = values.iterator(); it.hasNext();)
            add(it.next());
    }
}

(SafeHashMap is another JDK collection extension. It forbids null keys and values, and requires they be of certain classes.)

Idiomatic use looks like:

new IndexMap(KeyType.class, DomainType.class, new IndexMap.Mapper() {
    public Object getKeyFor(final Object value) {
        return ((DomainType) value).getKey();
    }
}, initialValues);

Which indexes a collection of DomainType domain objects by the key property.

UPDATE: Because of editing several files at once, I suffered a brain fart and mixed up IndexMap (the point of this post) with AutoHashMap (another, still interesting, collection).

Tuesday, September 21, 2004

Two ways to model tables in code

Generally I run into two ways to model tables in code.

The first way is to have a single object type representing a full row in a table, matching the SQL query SELECT * FROM TABLE. I'm going to call this the model-the-table method. Every use of the table in code gets all fields regardless of need. This is wasteful but simpler to maintain.

The second way is to have several object types, each representing a single use of a row in the table, matching the SQL query SELECT COLUMN_1, COLUMN_2 FROM TABLE. I'm going to call this the model-the-use method. Each use of the table in code gets only the fields needed. This is more precise but harder to maintain.

I'm undecided which I like better and have used both in projects, even within the same project. Perhaps enlightenment will come my way.

My pairmate, Karthik Chandrasekariah pointed out to me the similarity of this choice to using the Adapter pattern. Only instead of changing the view of an underlying code object with an adapter, you change the view of a database table.
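To make the trade-off concrete, here is a hypothetical CUSTOMER table modeled both ways (the table and field names are mine, not from any real schema):

```java
// Model-the-table: one type mirrors the whole row (SELECT * FROM CUSTOMER).
class CustomerRow {
    Long id;
    String name;
    String email;
    String created;
}

// Model-the-use: a narrow type per query (SELECT NAME, EMAIL FROM CUSTOMER).
class CustomerContact {
    String name;
    String email;
}
```

A mailing routine that takes CustomerContact cannot accidentally depend on id or created; a routine that takes CustomerRow always has everything, at the cost of fetching everything.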

Domain objects and compareTo

I wrote earlier about domain objects and boolean equals(Object) and how to handle primary keys. What about int compareTo(Object)? The same remarks about primary keys still apply, and the code pattern is:

public class Something implements Comparable {
    /* ... */

    public int compareTo(final Object o) {
        final Something that = (Something) o;
        int compareTo = firstPK.compareTo(that.firstPK);
        if (0 != compareTo) return compareTo;
        compareTo = secondPK.compareTo(that.secondPK);
        if (0 != compareTo) return compareTo;
        // ... likewise for other primary keys
        return lastPK.compareTo(that.lastPK);
    }
}

This sort of comparison method groups sorted results by the order of comparison, so that firstPK groups together first, then secondPK, and so on. If, say, you sorted [Apple, Blue], [Orange, Orange], [Apple, Red] this way, they would come out as [Apple, Blue], [Apple, Red], [Orange, Orange].
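The grouping claim is easy to check with a tiny two-key class (Fruit and its field names are mine, following the code pattern above):

```java
import java.util.Arrays;

class Fruit implements Comparable {
    private final String firstPK;
    private final String secondPK;

    Fruit(final String firstPK, final String secondPK) {
        this.firstPK = firstPK;
        this.secondPK = secondPK;
    }

    public int compareTo(final Object o) {
        final Fruit that = (Fruit) o;
        // Compare primary keys in order; the first difference decides.
        final int compareTo = firstPK.compareTo(that.firstPK);
        if (0 != compareTo) return compareTo;
        return secondPK.compareTo(that.secondPK);
    }

    public String toString() {
        return "[" + firstPK + ", " + secondPK + "]";
    }
}
```

Sorting { [Apple, Blue], [Orange, Orange], [Apple, Red] } with Arrays.sort yields [Apple, Blue], [Apple, Red], [Orange, Orange], as claimed.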

Layers and containers

I've been thinking about how J2EE containers work. They provide a world of many layers: application-container, container-Java libraries, Java libraries-byte code, byte code-JVM, JVM-platform libraries, platform libraries-native code, native code-OS, OS-hardware. (And there are, of course, calls from higher layers to lower layers even further down.) That's a lot of layers. I wonder what could be stripped out?

For example, Java handles thread scheduling for Java code, but the OS does the same for native code. How much are Java threads mapped onto native threads? (The answer varies quite a bit between platform and JVM implementation.) The same question arises comparing byte code to native CPU instructions. How much of a JVM could be performed directly by an OS?

There is plenty of research in these areas already (I was going to make a list of interesting links, but Google turned up so much, it hardly seems worth the effort—Google sure changes how research works), so I am looking forward to reading more about these ideas over time. Layers are good for abstraction, but over time the most successful abstractions become concrete and implementors take advantage to improve performance and transparency. Witness pointers: C abstracted hardware addressing as pointers, which C++ then abstracted as virtual methods, which Java then abstracted as methods: success begets success.

I expect the same trend to continue as byte code slowly displaces native code and JVM-like things appear more in hardware and operating systems such as the cool work at Transmeta.

Saturday, September 18, 2004

Actor/Director

Fellow ThoughtWorker Andrew McCormick pointed out Actor/Director to me, a good pattern he coined. The simple example he casually posted was turning this:

public class TemplateMethodClass {
    public void doSomething() {
        setup();
        yourStuff();
        finish();
    }

    protected void yourStuff() {
    }
}

Into this:

public class DirectorClass {
    public void doSomething() {
        setup();
        final Actor actor = getActorForYourStuff();
        actor.act(); // assumed name for Actor's single method
        finish();
      }

    protected Actor getActorForYourStuff() {
        // other possibilities
        // return new SingleActor();
        // return new MultipleActor(new SingleActor(), new SingleActor())
        return new NullActor();
    }
}

Seeing this, I immediately cleaned up a long-standing solution to the database connection problem, becoming:

public class ConnectionDirector {
    private final ConnectionProvider provider;
    private final ConnectionActor actor;

    public ConnectionDirector(final ConnectionProvider provider,
            final ConnectionActor actor) {
        this.provider = provider;
        this.actor = actor;
    }

    public void doSomething() {
        final Connection conn = provider.openConnection();
        try {
            actor.doSomething(conn);

        } finally {
            provider.closeConnection(conn);
        }
    }
}

Isn't that tidy? Both the connection provider and the connection actor are passed in. As Andrew noted, Dependency injection is probably the most important obvious-in-hindsight idea that's come along recently. And it fits perfectly into the Actor/Director pattern.
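To see the testability win, here is a self-contained sketch. The two interfaces are my guesses at their signatures (the post never shows them), with a placeholder Connection class standing in for the real connection type:

```java
// Placeholder for the real connection type (java.sql.Connection, say).
class Connection {
}

// Guessed signatures; the post shows only their use.
interface ConnectionProvider {
    Connection openConnection();

    void closeConnection(Connection conn);
}

interface ConnectionActor {
    void doSomething(Connection conn);
}

class ConnectionDirector {
    private final ConnectionProvider provider;
    private final ConnectionActor actor;

    ConnectionDirector(final ConnectionProvider provider,
            final ConnectionActor actor) {
        this.provider = provider;
        this.actor = actor;
    }

    public void doSomething() {
        final Connection conn = provider.openConnection();
        try {
            actor.doSomething(conn);

        } finally {
            // The director, not the actor, guarantees cleanup.
            provider.closeConnection(conn);
        }
    }
}
```

A test can then inject counting stubs for both collaborators and verify that the connection is opened, used, and closed exactly once, without any real database in sight.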

(Note to pattern mavens: does Actor/Director duplicate an existing pattern? I find the name to be very intuitive.)

UPDATE: Andrew later noted: Actor/Director is sort of a twist on Chain of Responsibility...or at least along the same lines. Both have the intent of separating what gets done from how it gets done.

Sunday, September 12, 2004

More on domain objects

Several excellent comments and communications asked me questions about my post on domain objects. There are several points I want to discuss further.

Why implement boolean equals(Object), and why compare only primary keys?
This is at the heart of equality v. identity, or object equality v. instance equality. The basic complaint is: why not just rely on identity as implemented in boolean Object.equals(Object)? Why not, indeed. Do not forget that even if you override equals, you can still call == yourself, which cannot be overridden (in Java) and which always provides identity, not equality. But for domain behavior you really do want things which behave identically to compare equal. In fact, without this property all sorts of code becomes very tedious. Just try to implement an object cache without overriding equals.
Why mark primary keys as final?
This follows from using equality instead of identity for equals. If a primary key changes, the object is no longer the same. The primary keys are therefore immutable. If you want a new domain object, you must create a new instance with different primary keys.
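As a sketch of these two points together, consider a hypothetical Customer with an immutable primary key (all names here are assumed for illustration):

```java
public class Customer {
    private final Long id;  // primary key: final, so equality is stable
    private String name;    // mutable state, deliberately excluded from equals

    public Customer(final Long id, final String name) {
        this.id = id;
        this.name = name;
    }

    public void setName(final String name) {
        this.name = name;
    }

    public boolean equals(final Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        // Equality is defined solely by the primary key.
        return id.equals(((Customer) o).id);
    }

    public int hashCode() {
        return id.hashCode();
    }
}
```

Two instances with the same id compare equal even when their mutable state differs, which is exactly what an object cache keyed on domain objects needs; == still distinguishes the instances.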

The Program

I want a domain-specific language (DSL) in Java for domain objects. Were I writing in a more modern, flexible language such as Scheme (just an example; there are others), I would simply design a DSL for the problem of domain objects. As it is, Java forces an amalgam of coding conventions and extra-linguistic baggage such as XDoclet (equivalently, JDK 5 annotations with code generation) or AOP to accomplish the same task. I want to consider here the Java-only techniques for this goal.

Tuesday, September 07, 2004

A clever production v. testing trick

Say you have a database connection provider for production, and a mock persistence layer for testing. This setup is common for iBatis, Hibernate or similar solutions. How should you design the consumers of these? I fell upon this clever pattern today while reworking an event replayer which read events from a database and injected them into a messaging system:

/**
 * <code>Foo</code> does something clever. It has two
 * modes: production and testing.  For production, use {@link
 * #Foo(ConnectionProvider)} and pass in a provider.  For testing,
 * use {@link #Foo(BarMapper)} and pass in a mock mapper.
 *
 * The code takes great care in the constructors to ensure you cannot mix uses.
 * All instance variables are marked <code>final</code>.
 *
 * If you are in production use, the constructor tests <var>provider</var> and
 * assigns it, making the mapper <code>null</code>.  Then, if you try to use
 * the mapper instance variable directly instead of via the getter for it,
 * you will throw a <code>NullPointerException</code>.
 *
 * Contrariwise, if you are in testing use, the constructor tests the
 * mapper parameter and assigns it, making the provider <code>null</code>.
 * Then, if you try to use the provider instance variable directly instead of
 * via the getter for it, you will throw a <code>NullPointerException</code>.
 *
 * Lastly, there is no {@link Connection} instance variable.  Instead {@link
 * #barNone(Id)} gets one from the provider on the fly (the getter
 * ensures this only really does something in production mode), and closes it
 * before the method returns.  There is never a leak, the connection is never
 * held for longer than needed, the method may be run more than once
 * statelessly, and the method uses a getter for the mapper, again ensuring
 * the code <cite>does the right thing</cite>.
 */
public class Foo {
    private final ConnectionProvider provider;
    private final BarMapper mapper;

    public Foo(final ConnectionProvider provider) {
        if (null == provider)
            throw new NullPointerException();

        this.provider = provider;
        this.mapper = null;
    }

    protected Foo(final BarMapper mapper) {
        if (null == mapper)
            throw new NullPointerException();

        this.provider = null;
        this.mapper = mapper;
    }

    public void barNone(final Id barId) {
        final Connection conn = getConnection();

        try {
            doSomethingCleverHere(barId, getMapper(conn));

        } finally {
            close(conn);
        }
    }

    private Connection getConnection() {
        return null != provider
            ? provider.getConnection()
            : null;
    }

    private void close(final Connection conn) {
        if (null != provider) provider.close(conn);
    }

    private BarMapper getMapper(final Connection conn) {
        return null != mapper
            ? mapper
            : new BarMapper(conn);
    }
}

See the idea? Several things are going on at once here. Read the class javadoc comment carefully. This trick will serve you well.

A footnote: why is the one constructor public and the other protected? Simple. The constructor taking a connection provider is for production and is marked public. The constructor taking a mapper is for testing and is marked protected. Remember to keep production and test case code in the same package but in separate source trees.

Advice to persistence frameworks

In my previous post I described how to make good domain objects in Java. However, much of the advice is circumscribed by limitations in popular persistence layers below the domain objects. Therefore, I have some advice to persistence layer authors.

Support non-default constructors. Admittedly, this takes some cleverness. Say you have a class like this:

public class Foo {
    private final int count;
    private final int total;
    private int numberOfMonkeysInABarrel;

    public Foo(final int count, final int total) {
        this.count = count;
        this.total = total;
    }

    public int getPercentage() {
        return Math.round((float) (count * 100) / total);
    }

    public void setNumberOfMonkeysInABarrel(final int n) {
        numberOfMonkeysInABarrel = n;
    }

    public int getNumberOfMonkeysInABarrel() {
        return numberOfMonkeysInABarrel;
    }
}

A really clever persistence layer can work out what the inputs are for the constructor by noting the following:

  1. The Sun VM returns reflected fields in order of declaration. The documentation does not specify an order, however, so even though this is empirically true in JDK 1.4 and 5.0 VMs, this is a logically weak link.
  2. Given the fields, it is easy to filter for just the private final ones.
  3. By requiring that the order of constructor inputs match the order of field declarations (a very common convention), it is mechanical to match up constructor inputs to required fields, assuming required fields are declared private final.

And there you go: a persistence layer can create domain-safe instances without requiring an otherwise useless default constructor or getters and setters.
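A hedged sketch of that matching algorithm (helper names are mine, and it inherits the weak link above of relying on reflected field order):

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Sketch: match constructor parameters to private final fields by
// declaration order.  Relies on getDeclaredFields() returning fields in
// declaration order, which the spec does not guarantee.
class ConstructorMatcher {
    public static List<Field> requiredFields(final Class<?> type) {
        final List<Field> required = new ArrayList<Field>();
        for (final Field field : type.getDeclaredFields()) {
            final int mods = field.getModifiers();
            if (Modifier.isPrivate(mods) && Modifier.isFinal(mods))
                required.add(field);
        }
        return required;
    }

    public static Constructor<?> matchingConstructor(final Class<?> type) {
        final List<Field> required = requiredFields(type);
        for (final Constructor<?> ctor : type.getConstructors()) {
            final Class<?>[] params = ctor.getParameterTypes();
            if (params.length != required.size()) continue;
            boolean matches = true;
            for (int i = 0; i < params.length; i++)
                if (!params[i].equals(required.get(i).getType())) {
                    matches = false;
                    break;
                }
            if (matches) return ctor;
        }
        return null;
    }

    // A sample class shaped like Foo above: two required fields, one optional.
    public static class Sample {
        private final int count;
        private final int total;
        private int extra;

        public Sample(final int count, final int total) {
            this.count = count;
            this.total = total;
        }
    }
}
```

Given Sample, requiredFields finds count and total, and matchingConstructor finds the two-int constructor, skipping the non-final extra field entirely.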

Saturday, September 04, 2004

How to make a domain object

There are several basic rules for designing a domain object. These rules keep safety and correctness in mind:

Always implement equals and hashCode
I have seen many memory leaks from inadvertent caches of domain objects which used object identity. When implementing equals and hashCode, only compare primary keys since that is the sense of equality the persistence layer uses.
Mark immutable fields final and provide no setters for them
Always mark primary keys final and only set them in a constructor. This means there is no default constructor.
Include all required fields in constructors
Never leave domain objects in an inconsistent state. The constructors should reflect this fact. Again, that means no default constructor.
Test the inputs of non-null fields
Simply provide if (null == someField) throw new NullPointerException(); in the constructor or setter which takes someField. If the field is not required, make the same check in the getter as it may not have been initialized yet.
Implement Comparable if there is a default sort order
Whenever domain objects have any kind of natural sort order, always implement Comparable. Do not force other code to do the work on behalf of the domain object. And remember to compare as many fields as needed by the sort order. For example, a CustomerName comparator needs to sort on lastName, firstName, middleName and nameSuffix.
Implement toString for debugging
Do not use toString for business code or display. For those, use business-specific methods such as displayAs. Commons-lang provides an excellent ToStringBuilder for debugging.
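As one sketch of the Comparable and toString rules together (the class is illustrative only; middleName and nameSuffix are elided for brevity):

```java
// Illustrative name type with a natural sort order: last name, then first.
class CustomerName implements Comparable<CustomerName> {
    private final String lastName;
    private final String firstName;

    CustomerName(final String lastName, final String firstName) {
        if (null == lastName) throw new NullPointerException("lastName");
        if (null == firstName) throw new NullPointerException("firstName");
        this.lastName = lastName;
        this.firstName = firstName;
    }

    public int compareTo(final CustomerName that) {
        final int byLast = lastName.compareTo(that.lastName);
        if (0 != byLast) return byLast;
        return firstName.compareTo(that.firstName);
    }

    @Override public String toString() {
        // For debugging only; display belongs in a business-specific method.
        return "CustomerName{" + lastName + ", " + firstName + "}";
    }
}
```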

Unfortunately, not all of them can be used with every project. For example, if you persist domain objects directly rather than using an intermediate persistence layer to represent database tables (e.g., Hibernate or iBatis), the framework requires that every instance field have a bean-like getter/setter. (Actually, recent Hibernate supports direct field access, but this is still uncommon with many projects.)

Tuesday, August 24, 2004

JUnit work around for lost exceptions

Here's the scenario: a unit test throws an uncaught exception, but then so does tearDown(). What happens? TestCase.runBare() says:

public void runBare() throws Throwable {
    setUp();
    try {
        runTest();
    } finally {
        tearDown();
    }
}

See the problem? If tearDown() throws, then runBare() loses any exception thrown by runTest() (which runs your unit test). [See the Java Language Specification, §11.3.] Rarely is this what you actually want since the lost exception was the original cause of test error. The solution—override runBare():

public void runBare() throws Throwable {
    Throwable thrown = null;

    setUp();

    try {
        runTest();

    } catch (final Throwable t) {
        thrown = t;

    } finally {
        try {
            tearDown();

            if (null != thrown) throw thrown;

        } catch (Throwable t) {
            if (null != thrown) throw thrown;
            else throw t;
        }
    }
}

The idea is simple. Store the exception from runTest() and rethrow it after running tearDown(). Make sure to throw any exception from tearDown() otherwise.

UPDATE: My brain finally caught up with my fingers and I realize that the sample solution is unnecessarily complex. Better is:

public void runBare()
        throws Throwable {
    Throwable thrown = null;

    setUp();

    try {
        runTest();

    } catch (final Throwable t) {
        thrown = t;

    } finally {
        try {
            tearDown();

        } finally {
            if (null != thrown) throw thrown;
        }
    }
}

Monday, August 23, 2004

Hidden static methods

While debugging some slow code with a programming partner, we ran across a long series of statements like this:

new Foo(args).run();

Line after line like this. What is going on? Apparently the original writer was thinking of something like the command pattern without any sense of do/undo/redo. But what he wrote instead is another anti-pattern. Rather than littering the VM with a series of use-once instances just to invoke a method on them, he could have said:

Foo.run(args);

And, by using a static method, made it clear that this is functional, not object-oriented, code. Not every problem is a nail requiring the same hammer.

Saturday, August 21, 2004

Turning 90 degrees

An interesting problem: you have a depth-first process and need to turn it into a breadth-first one. How to proceed? A simple publish-subscribe message bus works single-threaded so when a message receiver publishes a new message in response, the new message is processed before the rest of the receivers of the first message have a chance to run. This leads to some confusing situations where message order is counter-intuitive.

Since everything is plain Java, I cannot use continuations for the activation records else I would just reorder them. Ah, but I can come close enough for this simple system. Consider these two data structures (in reality, they are immutable classes, but I shortened them to C-style structs for conciseness):

public class Binding {
    public Receiver recipient; // the target object for Method.invoke(Object...)
    public Method onReceive; // the method
}

public class ActivationSet {
    public Set<Binding> bindings; // JDK 5.0, plain Set for JDK 1.3 or JDK 1.4
    public Message message; // the argument for Method.invoke(Object...)
}

Now the publish algorithm is quite simple given an instance variable of Queue<ActivationSet> for the channel:

public void publish(Message message) {
    Set<Binding> bindings = findSubscribers(message);

    if (bindings.isEmpty()) return;

    // Check if we're first to publish
    boolean topContext = activations.isEmpty();
    // Queue our recipients
    activations.offer(new ActivationSet(bindings, message));
    // Let the top context handle recipients in queue-order
    if (!topContext) return;

    while (!activations.isEmpty()) {
        // activate() invokes onReceive for each binding in the set
        activations.peek().activate();
        // Be careful not to remove our own activation set prematurely;
        // otherwise the topContext check above won't work right
        activations.remove();
    }
}

The effect is to queue up the depth-first nature of recursive method calls and handle them in breadth-first order. The code is quite simple once the concept is clear.
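Here is a condensed, runnable sketch of the same trick in later-Java idiom (a queued Runnable stands in for the Binding/ActivationSet pair; all the names are mine, not the original bus's API):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// Single-threaded bus where nested publishes are queued, not recursed into,
// so delivery happens in breadth-first order.
class Bus {
    private final Map<String, List<Consumer<String>>> subscribers
            = new HashMap<String, List<Consumer<String>>>();
    private final Queue<Runnable> activations = new ArrayDeque<Runnable>();

    void subscribe(final String topic, final Consumer<String> handler) {
        List<Consumer<String>> bindings = subscribers.get(topic);
        if (null == bindings) {
            bindings = new ArrayList<Consumer<String>>();
            subscribers.put(topic, bindings);
        }
        bindings.add(handler);
    }

    void publish(final String topic, final String message) {
        final List<Consumer<String>> bindings = subscribers.get(topic);
        if (null == bindings || bindings.isEmpty()) return;

        // Check if we're first to publish
        final boolean topContext = activations.isEmpty();
        // Queue our recipients
        activations.offer(() -> {
            for (final Consumer<String> handler : bindings)
                handler.accept(message);
        });
        // Let the top context handle recipients in queue-order
        if (!topContext) return;

        while (!activations.isEmpty()) {
            activations.peek().run();
            // Remove only after running so nested publishes see a non-empty
            // queue and do not mistake themselves for a new top context
            activations.remove();
        }
    }
}
```

If a handler for "root" publishes "child", every handler still sees "root" before any handler sees "child".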

Wednesday, August 18, 2004

Smarter exceptions

Very commonly I see code which puts data into the message of an Exception as a way to pass information. This was certainly the case with the original Exception class, which provided no way to hold context. Recent JDKs fixed that with Throwable.getCause(). But I didn't realize the general usefulness of this approach until reading the changes so far in JDK 5 Beta 2. After all, exceptions are just classes; they too can have more methods. In fact, this is just what UnknownFormatConversionException.getConversion made me see. Useful information passed on an exception to aid in recovery, logging, etc. is just the ticket.

The switch statement

A nice post by Tom McQueeney on the new varargs feature in JDK5 marred by one point:

if (params.length > 0) lastName = params[0];
if (params.length > 1) firstName = params[1];
if (params.length > 2) middleName = params[2];
if (params.length > 3) nickName = params[3];

if (params.length > 4) {
    throw new IllegalArgumentException("Constructor called with too many arguments");
}

Why the series of if statements? A simple switch statement with fall-through is more clear:

switch (params.length) {
case 4: nickName = params[3];
case 3: middleName = params[2];
case 2: firstName = params[1];
case 1: lastName = params[0];
    break;
default:
    throw new IllegalArgumentException("Constructor called with too many arguments");
}

Tuesday, August 17, 2004

Two testing tales, one bad, one good

Tale one, package mismanagement

Here is an anti-pattern:

java/blarfage/Foo.java:

package blarfage;

public class Foo {
}

java/test/blarfage/FooTest.java:

package test.blarfage;

import junit.framework.TestCase;

public class FooTest extends TestCase {
}

What is wrong here—where is the anti-pattern? Do not put test classes in a different package than that of the class under test. It looks clean and logical, but try it for a while and the pain of it hobbles your testing. Many useful tricks vanish, chief among them package-scope methods for tweaking the class under test. Don't just take my word for it: try it on any moderate-sized project and see for yourself.

Tale two, useful IoC trick

Here is a handy trick for configuring a class under test: provide a separate, protected constructor just for testing.

java/blarfage/Foo.java:

package blarfage;

public class Foo {
    private final Bar favorite;

    public Foo(final Drink beer) {
        this(new Bar(beer));
    }

    protected Foo(final Bar favorite) {
        this.favorite = favorite;
    }
}

test/blarfage/FooTest.java:

package blarfage;

import junit.framework.TestCase;

public class FooTest extends TestCase {
    private TestFoo foo;

    protected void setUp() throws Exception {
        foo = new TestFoo(createMockBar());
    }

    private static final class TestFoo extends Foo {
        private TestFoo(final Bar favorite) {
            super(favorite);
        }
    }
}

The idea is this: Say Bar is complex or expensive—java.sql.Connection is a good example. The public constructor takes the argument for constructing the private, expensive or complex object; the protected constructor takes the actual expensive or complex object so that a test can extend the class under test and use that protected constructor.

This trick is especially handy when you do not have control of code using the class under test (else you might refactor the public constructor to take the expensive or complex argument directly as the protected constructor does).

Wednesday, August 11, 2004

Brittle relations

Much of the time a project needs decoupling between a database schema and the business model using the schema. This setup uses some kind of mapping file, often XML, to glue the two pieces together. However as the schema evolves, mismatches between schema and business model show up at run-time during testing. This is very annoying. If a project does not need the looseness afforded by a separate mapping file, I have a different suggestion. The most common design in this scenario is to have a persistence layer for database work and a domain layer above that representing the business model.

Consider an intermediate third layer—call it the schema layer—between the persistence and domain layers. This schema layer uses the persistence layer to exactly model the database schema rather than the business model. The domain layer uses the schema layer rather than call the persistence layer directly.

Why the extra layer? With the extra schema layer the mapping file vanishes. Use tools to generate the schema layer automatically by interrogating the database during the build. Then, as the schema evolves, incompatible syntactic changes (changes in columns or types, not changes in meaning) propagate into the code of the schema layer and manifest themselves as compile errors in the domain layer. This is a great aid during development. Compile-time errors are much easier to catch automatically than problems during testing, and strange reflection contortions are kept out of the business model.
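To illustrate, here is a hypothetical hand-written stand-in for what a generated schema-layer class might look like (all names invented): if the column is renamed, regeneration renames the accessor, and the domain class below stops compiling.

```java
// Hypothetical generated schema-layer class: one class per table, one typed
// accessor per column, regenerated from the live schema during each build.
class PersonRow {
    private final String familyName; // column FAMILY_NAME (invented)
    private final int age;           // column AGE (invented)

    PersonRow(final String familyName, final int age) {
        this.familyName = familyName;
        this.age = age;
    }

    String getFamilyName() { return familyName; }
    int getAge() { return age; }
}

// The domain layer calls the schema layer, never the persistence layer.
// Rename the column and the regenerated accessor name changes, so this
// class stops compiling: the mismatch surfaces at build time, not run-time.
class Person {
    private final String displayName;

    Person(final PersonRow row) {
        this.displayName = row.getFamilyName();
    }

    String getDisplayName() { return displayName; }
}
```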

Tuesday, August 10, 2004

Supporting old code

One JDK5 feature I particularly appreciate is the simplicity of the new foreach syntax, and the addition of an Iterable<T> interface in support. However, much of the JDK was written before even Iterator was around such as StringTokenizer. What to do? Write a wrapper, of course:

/**
 * <code>EnumerationIterator</code> is an {@link Iterator} wrapper for {@link
 * Enumeration}. {@link #remove()} is unsupported.
 */
public class EnumerationIterator <T> implements Iterator<T> {
    private final Enumeration<T> enumeration;

    /**
     * Constructs a new <code>EnumerationIterator</code> from the given
     * <var>enumeration</var>.
     *
     * @param enumeration the enumeration
     */
    public EnumerationIterator(final Enumeration<T> enumeration) {
        this.enumeration = enumeration;
    }

    /**
     * {@inheritDoc}
     */
    public boolean hasNext() {
        return enumeration.hasMoreElements();
    }

    /**
     * {@inheritDoc}
     */
    public T next() {
        return enumeration.nextElement();
    }

    /**
     * {@inheritDoc}
     */
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

/**
 * <code>EnumerationIterable</code> is an {@link Iterable} wrapper for {@link
 * Enumeration} to support JDK5 <em>foreach</em> syntax.
 */
public class EnumerationIterable <T> implements Iterable<T> {
    // Deliberately raw: pre-generics classes such as StringTokenizer
    // implement plain Enumeration rather than Enumeration<T>.
    private final Enumeration enumeration;

    /**
     * Constructs a new <code>EnumerationIterable</code> for the given
     * <var>enumeration</var>.
     *
     * @param enumeration the enumeration
     */
    public EnumerationIterable(final Enumeration enumeration) {
        this.enumeration = enumeration;
    }

    /**
     * {@inheritDoc}
     */
    @SuppressWarnings("unchecked") // the wrapped enumeration is deliberately raw
    public Iterator<T> iterator() {
        return new EnumerationIterator<T>(enumeration);
    }
}

Now I can write this:

for (final String token : new EnumerationIterable<String>(new StringTokenizer("a:b:c", ":")))
    System.out.println(token);

Monday, August 09, 2004

Nice

Nice is a very attractive new Java-like language for the JVM. Xoltar has nice things to say about Nice, the language. I'm more than a little enchanted: much better type safety than Java, multimethods, closures, parametric types, named parameters, tuples. Wow! And unlike Groovy, it really does look like Java after some weight-lifting. (I'm sorry, but I'm too attached to ;. Call me a "C" bigot, but it just looks wrong otherwise.)

I'm eager to give Nice a spin and perhaps use it on a project. And the comparison of Nice and Groovy really isn't that fair — Nice is still pre-compiled into bytecode like Java, not scriptable like Groovy. There's plenty of need for both.

Trickiness with generics and reflection for methods

This caught me off guard at first:

public class Scratch {
    public static class Qux { }

    public static class SubQuux extends Qux { }

    public static class Foo<M extends Qux> {
        public void spam(final M meat) { }
    }

    public static class Bar extends Foo<SubQuux> {
        public void spam(final SubQuux meat) { }
    }

    public static void main(final String[] args) {
        try {
            for (final Method method : Bar.class.getMethods())
                if ("spam".equals(method.getName()))
                    System.out.println(method);
        } catch (final Exception e) {
            e.printStackTrace();
        }
    }
}

So, just what does main print? The answer is fascinating:

public void Scratch$Bar.spam(Scratch$SubQuux)
public volatile void Scratch$Bar.spam(Scratch$Qux)

Aha! Now I get it. Since the JDK 5 compiler implements generics with type erasure, there needs to be a version of spam which takes the erased type, Qux, so that calls made through the erased supertype Foo still dispatch to Bar's implementation. Crud. That messes up my reflection for messaging. However, the solution is staring right at me: volatile. Since there really is no such method as public void Scratch$Bar.spam(Scratch$Qux), the compiler synthesizes one and marks it with the bridge flag, which reflection reports as volatile (the two flags share the same bit). Nifty. Now I can filter those out with Modifier.isVolatile. No harm, no foul.
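For what it is worth, JDK 5 also added Method.isBridge(), which makes the same test without leaning on the shared flag bit. A runnable sketch of the filtering (reusing the Qux/SubQuux/Foo/Bar shapes from above):

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Sketch: skip compiler-synthesized bridge methods during reflection
// by asking Method.isBridge() instead of checking the volatile bit.
class BridgeFilter {
    public static class Qux { }

    public static class SubQuux extends Qux { }

    public static class Foo<M extends Qux> {
        public void spam(final M meat) { }
    }

    public static class Bar extends Foo<SubQuux> {
        public void spam(final SubQuux meat) { }
    }

    // Returns only the real spam method, excluding the bridge spam(Qux).
    public static List<Method> realSpamMethods() {
        final List<Method> found = new ArrayList<Method>();
        for (final Method method : Bar.class.getMethods())
            if ("spam".equals(method.getName()) && !method.isBridge())
                found.add(method);
        return found;
    }
}
```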