Friday, September 24, 2004

IndexMap

In my project there are several idiomatic uses of JDK collections. One is using Map to uniquely index domain objects by some field. The code is simple:

public class IndexMap extends SafeHashMap {
    public static interface Mapper {
        public Object getKeyFor(final Object value);
    }

    private final Mapper mapper;

    public IndexMap(final Class keyClass, final Class valueClass,
            final Mapper mapper) {
        super(keyClass, valueClass);

        this.mapper = mapper;
    }

    public IndexMap(final Class keyClass, final Class valueClass,
            final Mapper mapper, final Collection values) {
        this(keyClass, valueClass, mapper);

        addAll(values);
    }

    public void add(final Object value) {
        put(mapper.getKeyFor(value), value);
    }

    public void addAll(final Collection values) {
        for (final Iterator it = values.iterator(); it.hasNext();)
            add(it.next());
    }
}

(SafeHashMap is another JDK collection extension. It forbids null keys and values, and requires they be of certain classes.)
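The post does not show SafeHashMap itself. As a rough sketch of what such a class might look like, assuming it only needs to guard put and putAll (the real class may guard more, and its names and signatures may differ):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the real SafeHashMap is not shown in the post.
class SafeHashMap extends HashMap<Object, Object> {
    private final Class<?> keyClass;
    private final Class<?> valueClass;

    SafeHashMap(final Class<?> keyClass, final Class<?> valueClass) {
        if (null == keyClass || null == valueClass)
            throw new NullPointerException();
        this.keyClass = keyClass;
        this.valueClass = valueClass;
    }

    @Override
    public Object put(final Object key, final Object value) {
        if (null == key || null == value)
            throw new NullPointerException();
        // Class.cast throws ClassCastException for an instance of the wrong class.
        return super.put(keyClass.cast(key), valueClass.cast(value));
    }

    @Override
    public void putAll(final Map<? extends Object, ? extends Object> m) {
        // HashMap.putAll is not guaranteed to go through put,
        // so route every entry through the checks explicitly.
        for (final Map.Entry<?, ?> e : m.entrySet())
            put(e.getKey(), e.getValue());
    }
}
```

Other mutators inherited from HashMap (putIfAbsent, compute and friends on newer JDKs) are not guarded in this sketch.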

Idiomatic use looks like:

new IndexMap(KeyType.class, DomainType.class, new IndexMap.Mapper() {
    public Object getKeyFor(final Object value) {
        return ((DomainType) value).getKey();
    }
}, initialValues);

This indexes a collection of DomainType domain objects by their key property.
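With generics and lambdas (both newer than the JDK this post targets), the same idea needs neither class tokens nor casts. A minimal sketch, in which GenericIndexMap is a made-up name and a plain HashMap stands in for SafeHashMap:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.function.Function;

// Hypothetical generic variant of IndexMap; not the code from the post.
class GenericIndexMap<K, V> extends HashMap<K, V> {
    private final Function<? super V, ? extends K> keyOf;

    GenericIndexMap(final Function<? super V, ? extends K> keyOf,
            final Collection<? extends V> values) {
        this.keyOf = keyOf;
        addAll(values);
    }

    public void add(final V value) {
        // put replaces any earlier value with the same key, so the last one wins.
        put(keyOf.apply(value), value);
    }

    public void addAll(final Collection<? extends V> values) {
        for (final V value : values)
            add(value);
    }
}
```

Usage then shrinks to something like new GenericIndexMap<KeyType, DomainType>(DomainType::getKey, initialValues).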

UPDATE: Because of editing several files at once, I suffered a brain fart and mixed IndexMap (the point of this post) with AutoHashMap (another, still interesting collection).

Tuesday, September 21, 2004

Two ways to model tables in code

Generally I run into two ways to model tables in code.

The first way is to have a single object type representing a full row in a table, matching the SQL query SELECT * FROM TABLE. I'm going to call this way the model-the-table method. Every use of the table in code gets all fields regardless of need. This is wasteful but simpler to maintain.

The second way is to have several object types, each representing a single use of a row in the table, matching the SQL query SELECT COLUMN_1, COLUMN_2 FROM TABLE. I'm going to call this way the model-the-use method. Each use of the table in code gets only the fields needed. This is more precise but harder to maintain.
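To make the contrast concrete, here is a small sketch against a hypothetical CUSTOMER table with columns ID, NAME, EMAIL and PHONE. The table, the columns and both class names are invented for illustration:

```java
// Model-the-table: one class mirrors the whole row.
// Matches: SELECT * FROM CUSTOMER
final class CustomerRow {
    private final long id;
    private final String name;
    private final String email;
    private final String phone;

    CustomerRow(final long id, final String name, final String email,
            final String phone) {
        this.id = id;
        this.name = name;
        this.email = email;
        this.phone = phone;
    }

    long getId() { return id; }
    String getName() { return name; }
    String getEmail() { return email; }
    String getPhone() { return phone; }
}

// Model-the-use: one class per use, carrying only what that use needs.
// Matches: SELECT NAME, EMAIL FROM CUSTOMER
final class CustomerContact {
    private final String name;
    private final String email;

    CustomerContact(final String name, final String email) {
        this.name = name;
        this.email = email;
    }

    String getName() { return name; }
    String getEmail() { return email; }
}
```

Adding a column touches one class in the first style, but potentially several query-specific classes in the second.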

I'm undecided which I like better and have used both in projects, even within the same project. Perhaps enlightenment will come my way.

My pairmate, Karthik Chandrasekariah, pointed out to me the similarity of this choice to using the Adapter pattern, except that instead of changing the view of an underlying code object with an adapter, you change the view of a database table.

Domain objects and compareTo

I wrote earlier about domain objects and boolean equals(Object) and how to handle primary keys. What about int compareTo(Object)? The same remarks about primary keys still apply, and the code pattern is:

public class Something implements Comparable {
    /* ... */

    public int compareTo(final Object o) {
        final Something that = (Something) o;
        int compareTo = firstPK.compareTo(that.firstPK);
        if (0 != compareTo) return compareTo;
        compareTo = secondPK.compareTo(that.secondPK);
        if (0 != compareTo) return compareTo;
        // ... likewise for other primary keys
        return lastPK.compareTo(that.lastPK);
    }
}

This sort of comparison method groups sorted results by the order of comparison, so that equal firstPK values group together, then equal secondPK values, and so on. If, say, you sorted [Apple, Blue], [Orange, Orange], [Apple, Red] this way, they would be ordered as [Apple, Blue], [Apple, Red], [Orange, Orange].
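The fruit example can be reproduced with a short sketch. It uses Comparator chaining from Java 8, which postdates this post, and FruitColor is an invented class with two String primary keys:

```java
import java.util.Comparator;

// Hypothetical domain object with two primary keys, compared in order.
final class FruitColor implements Comparable<FruitColor> {
    private static final Comparator<FruitColor> BY_KEYS =
            Comparator.comparing(FruitColor::getFruit)
                      .thenComparing(FruitColor::getColor);

    private final String fruit; // first primary key
    private final String color; // second primary key

    FruitColor(final String fruit, final String color) {
        if (null == fruit || null == color)
            throw new NullPointerException();
        this.fruit = fruit;
        this.color = color;
    }

    String getFruit() { return fruit; }
    String getColor() { return color; }

    @Override
    public int compareTo(final FruitColor that) {
        // Same effect as the hand-written cascade: fruit first, then color.
        return BY_KEYS.compare(this, that);
    }

    @Override
    public String toString() {
        return "[" + fruit + ", " + color + "]";
    }
}
```

Sorting a list holding [Apple, Blue], [Orange, Orange], [Apple, Red] with Collections.sort yields [Apple, Blue], [Apple, Red], [Orange, Orange].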

Layers and containers

I've been thinking about how J2EE containers work. They provide a world of many layers: application-container, container-Java libraries, Java libraries-byte code, byte code-JVM, JVM-platform libraries, platform libraries-native code, native code-OS, OS-hardware. (And there are, of course, calls from higher layers to lower layers even further down.) That's a lot of layers. I wonder what could be stripped out?

For example, Java handles thread scheduling for Java code, but the OS does the same for native code. How much are Java threads mapped onto native threads? (The answer varies quite a bit between platform and JVM implementation.) The same question arises comparing byte code to native CPU instructions. How much of a JVM could be performed directly by an OS?

There is plenty of research in these areas already (I was going to make a list of interesting links, but Google turned up so much that it hardly seems worth the effort; Google sure changes how research works), so I am looking forward to reading more about these ideas over time. Layers are good for abstraction, but over time the most successful abstractions become concrete and implementors take advantage to improve performance and transparency. Witness pointers: C abstracted hardware addressing as pointers, which C++ then abstracted as virtual methods, which Java then abstracted as methods: success begets success.

I expect the same trend to continue as byte code slowly displaces native code and JVM-like things appear more in hardware and operating systems, as in the cool work at Transmeta.

Saturday, September 18, 2004

Actor/Director

Fellow ThoughtWorker Andrew McCormick pointed out Actor/Director to me, a good pattern he coined. The simple example he casually posted was turning this:

public class TemplateMethodClass {
    public void doSomething() {
        setup();
        yourStuff();
        finish();
    }

    protected void yourStuff() {
    }
}

Into this:

public class DirectorClass {
    public void doSomething() {
        setup();
        Actor actor = getActorForYourStuff();
        actor.doSomething(); // let the actor do its part (method name assumed)
        finish();
    }

    protected Actor getActorForYourStuff() {
        // other possibilities
        // return new SingleActor();
        // return new MultipleActor(new SingleActor(), new SingleActor());
        return new NullActor();
    }
}
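The snippet leaves Actor and its implementations to the imagination. One possible shape, assuming Actor is a single-method interface (the method name doSomething, RecordingDirector, ChattyDirector and the log are all invented here so the order of calls can be observed):

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the Actor role: one method, no inputs.
interface Actor {
    void doSomething();
}

// Does nothing; a safe default for directors with no extra work.
final class NullActor implements Actor {
    public void doSomething() { }
}

// Composite: runs several actors in the order given.
final class MultipleActor implements Actor {
    private final Actor[] actors;

    MultipleActor(final Actor... actors) {
        this.actors = actors.clone();
    }

    public void doSomething() {
        for (final Actor actor : actors)
            actor.doSomething();
    }
}

// A director that records each step instead of doing real setup and cleanup.
class RecordingDirector {
    protected final List<String> log = new ArrayList<>();

    public void doSomething() {
        log.add("setup");
        getActorForYourStuff().doSomething();
        log.add("finish");
    }

    protected Actor getActorForYourStuff() {
        return new NullActor();
    }

    public List<String> getLog() {
        return log;
    }
}

// Subclasses choose the actor rather than overriding the work itself.
class ChattyDirector extends RecordingDirector {
    @Override
    protected Actor getActorForYourStuff() {
        return new MultipleActor(() -> log.add("one"), () -> log.add("two"));
    }
}
```

The director still owns the sequence of setup, work and finish, while the work itself lives in an object that can be swapped, composed or nulled out.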

Seeing this, I immediately cleaned up a long-standing solution to the database connection problem, which became:

public class ConnectionDirector {
    private final ConnectionProvider provider;
    private final ConnectionActor actor;

    public ConnectionDirector(final ConnectionProvider provider,
            final ConnectionActor actor) {
        this.provider = provider;
        this.actor = actor;
    }

    public void doSomething() {
        final Connection conn = provider.openConnection();
        try {
            actor.doSomething(conn);

        } finally {
            provider.closeConnection(conn);
        }
    }
}

Isn't that tidy? Both the connection provider and the connection actor are passed in. As Andrew noted, Dependency injection is probably the most important obvious-in-hindsight idea that's come along recently. And it fits perfectly into the Actor/Director pattern.
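ConnectionProvider and ConnectionActor are not shown in the post. Assuming they are small interfaces along the following lines (the signatures, the use of SQLException, the EVENTS table and the PurgeExample class are guesses for illustration), the director can be wired up with a lambda as the actor:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Assumed interface shapes; the originals are not shown in the post.
interface ConnectionProvider {
    Connection openConnection() throws SQLException;
    void closeConnection(Connection conn) throws SQLException;
}

interface ConnectionActor {
    void doSomething(Connection conn) throws SQLException;
}

// Same structure as the class above, with checked exceptions declared.
final class ConnectionDirector {
    private final ConnectionProvider provider;
    private final ConnectionActor actor;

    ConnectionDirector(final ConnectionProvider provider,
            final ConnectionActor actor) {
        this.provider = provider;
        this.actor = actor;
    }

    public void doSomething() throws SQLException {
        final Connection conn = provider.openConnection();
        try {
            actor.doSomething(conn);
        } finally {
            // Runs whether or not the actor throws.
            provider.closeConnection(conn);
        }
    }
}

final class PurgeExample {
    // The actor only says what to do with a connection;
    // opening and closing stay with the director.
    static void purge(final ConnectionProvider provider) throws SQLException {
        new ConnectionDirector(provider, conn -> {
            try (Statement statement = conn.createStatement()) {
                statement.executeUpdate("DELETE FROM EVENTS"); // hypothetical table
            }
        }).doSomething();
    }
}
```

Because both collaborators are injected, a test can pass a provider that hands out a fake connection and an actor that throws, then check that the connection was still closed.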

(Note to pattern mavens: does Actor/Director duplicate an existing pattern? I find the name to be very intuitive.)

UPDATE: Andrew later noted: Actor/Director is sort of a twist on Chain of Responsibility...or at least along the same lines. Both have the intent of separating what gets done from how it gets done.

Sunday, September 12, 2004

More on domain objects

Several excellent comments and communications asked me questions about my post on domain objects. There are several points I want to discuss further.

Why implement boolean equals(Object), and why compare only primary keys?
This is at the heart of equality v identity, or object equality v instance equality. The basic complaint is why not just rely on identity as implemented in boolean Object.equals(Object)? Why not, indeed. Do not forget that even if you override equals, you can still use == yourself, which cannot be overridden (in Java) and which always provides identity, not equality. But for domain behavior you really do want things to compare equal which behave identically. In fact, without this property all sorts of code becomes very tedious. Just try to implement an object cache without overriding equals.
Why mark primary keys as final?
This follows from using equality instead of identity for equals. If a primary key changes, the object is no longer the same. The primary keys are therefore immutable. If you want a new domain object, you must create a new instance with different primary keys.
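A small sketch of both points, using an invented Account class whose primary keys are a branch code and an account number:

```java
// Hypothetical domain object: equality is defined by the primary keys only.
final class Account {
    private final String branch; // primary key, part 1: final, set once
    private final long number;   // primary key, part 2: final, set once
    private long balance;        // ordinary state, free to change

    Account(final String branch, final long number) {
        if (null == branch)
            throw new NullPointerException();
        this.branch = branch;
        this.number = number;
    }

    long getBalance() { return balance; }
    void setBalance(final long balance) { this.balance = balance; }

    @Override
    public boolean equals(final Object o) {
        if (this == o) return true;
        if (!(o instanceof Account)) return false;
        final Account that = (Account) o;
        return number == that.number && branch.equals(that.branch);
    }

    @Override
    public int hashCode() {
        // Built from the same fields as equals, so equal accounts hash alike.
        return 31 * branch.hashCode() + Long.hashCode(number);
    }
}
```

Two separately constructed instances with the same keys are then equal but not identical, which is what lets a HashMap-based cache find an entry using a freshly loaded instance as the key. Because the keys are final, an instance can never drift into a different hash bucket while it sits in the cache.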

The Program

I want a domain-specific language (DSL) in Java for domain objects. Were I writing in a more modern, flexible language such as Scheme (just an example; there are others), I would simply design a DSL for the problem of domain objects. As it is, Java forces an amalgam of coding conventions and extra-linguistic baggage such as XDoclet (equivalently, JDK 5 annotations with code generation) or AOP to accomplish the same task. I want to consider here the Java-only techniques for this goal.

Tuesday, September 07, 2004

A clever production v. testing trick

Say you have a database connection provider for production, and a mock persistence layer for testing. This setup is common for iBatis, Hibernate or similar solutions. How should you design the consumers of these? I fell upon this clever pattern today while reworking an event replayer which read events from a database and injected them into a messaging system:

/**
 * <code>Foo</code> does something clever. It has two
 * modes: production and testing.  For production, use {@link
 * #Foo(ConnectionProvider)} and pass in a provider.  For testing,
 * use {@link #Foo(BarMapper)} and pass in a mock mapper.
 *
 * The code takes great care in the constructors to ensure you cannot mix uses.
 * All instance variables are marked <code>final</code>.
 *
 * If you are in production use, the constructor tests <var>provider</var> and
 * assigns it, making the mapper <code>null</code>.  Then, if you try to use
 * the mapper instance variable directly instead of via the getter for it,
 * you will throw a <code>NullPointerException</code>.
 *
 * Contrariwise, if you are in testing use, the constructor tests the
 * mapper parameter and assigns it, making the provider <code>null</code>.
 * Then, if you try to use the provider instance variable directly instead of
 * via the getter for it, you will throw a <code>NullPointerException</code>.
 *
 * Lastly, there is no {@link Connection} instance variable.  Instead {@link
 * #barNone(Id)} gets one from the provider on the fly (the getter
 * ensures this only really does something if in production mode), and closes it
 * before the method returns.  There is never a leak, the connection is never
 * held for longer than needed, the method may be run more than once
 * statelessly, and the method uses a getter for the mapper, again ensuring
 * the code <cite>does the right thing</cite>.
 */
public class Foo {
    private final ConnectionProvider provider;
    private final BarMapper mapper;

    public Foo(final ConnectionProvider provider) {
        if (null == provider)
            throw new NullPointerException();

        this.provider = provider;
        this.mapper = null;
    }

    protected Foo(final BarMapper mapper) {
        if (null == mapper)
            throw new NullPointerException();

        this.provider = null;
        this.mapper = mapper;
    }

    public void barNone(final Id barId) {
        final Connection conn = getConnection();

        try {
            doSomethingCleverHere(barId, getMapper(conn));

        } finally {
            close(conn);
        }
    }

    private Connection getConnection() {
        return null != provider
            ? provider.getConnection()
            : null;
    }

    private void close(final Connection conn) {
        if (null != provider) provider.close(conn);
    }

    private BarMapper getMapper(final Connection conn) {
        return null != mapper
            ? mapper
            : new BarMapper(conn);
    }
}

See the idea? Several things are going on at once here. Read the class javadoc comment carefully. This trick will serve you well.

A footnote: why is the one constructor public and the other protected? Simple. The constructor taking a connection provider is for production and is marked public. The constructor taking a mapper is for testing and is marked protected. Remember to keep production and test case code in the same package but in separate source trees.

Advice to persistence frameworks

In my previous post I described how to make good domain objects in Java. However, much of the advice is circumscribed by limitations in popular persistence layers below the domain objects. Therefore, I have some advice to persistence layer authors.

Support non-default constructors. Admittedly, this takes some cleverness. Say you have a class like this:

public class Foo {
    private final int count;
    private final int total;
    private int numberOfMonkeysInABarrel;

    public Foo(final int count, final int total) {
        this.count = count;
        this.total = total;
    }

    public int getPercentage() {
        return Math.round((float) (count * 100) / total);
    }

    public void setNumberOfMonkeysInABarrel(final int n) {
        numberOfMonkeysInABarrel = n;
    }

    public int getNumberOfMonkeysInABarrel() {
        return numberOfMonkeysInABarrel;
    }
}

A really clever persistence layer can work out what the inputs are for the constructor by noting the following:

  1. The Sun VM returns reflected fields in order of declaration. The documentation does not specify an order, however, so even though this is empirically true in JDK 1.4 and 5.0 VMs, this is a logically weak link.
  2. Given the fields, it is easy to filter for just the private final ones.
  3. By requiring that the order of inputs for a constructor match the ordering of fields (very common by convention), it is mechanical to match up constructor inputs to required fields, assuming required fields are declared private final.

And there you go: a persistence layer can create domain-safe instances without requiring an otherwise useless default constructor or getters and setters.
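The three steps can be sketched with plain reflection. This is only an illustration under stated assumptions: Tally and ConstructorMatcher are invented names, column names are assumed to equal field names, and the code leans on the same unspecified declaration order the list warns about:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sample domain class: two required fields, one optional.
final class Tally {
    private final int count;
    private final int total;
    private int numberOfMonkeysInABarrel; // not final, so not a constructor input

    Tally(final int count, final int total) {
        this.count = count;
        this.total = total;
    }

    int getPercentage() {
        return Math.round((float) (count * 100) / total);
    }
}

final class ConstructorMatcher {
    // row maps column names (assumed equal to field names) to values.
    static <T> T instantiate(final Class<T> type, final Map<String, ?> row) {
        final List<Class<?>> types = new ArrayList<>();
        final List<Object> inputs = new ArrayList<>();
        // Step 1: walk the fields, relying on declaration order.
        for (final Field field : type.getDeclaredFields()) {
            final int mods = field.getModifiers();
            if (field.isSynthetic() || Modifier.isStatic(mods))
                continue;
            // Step 2: keep only the private final instance fields.
            if (Modifier.isPrivate(mods) && Modifier.isFinal(mods)) {
                types.add(field.getType());
                inputs.add(row.get(field.getName()));
            }
        }
        try {
            // Step 3: find the constructor whose inputs match those fields in order.
            final Constructor<T> ctor =
                    type.getDeclaredConstructor(types.toArray(new Class<?>[0]));
            ctor.setAccessible(true);
            return ctor.newInstance(inputs.toArray());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable constructor for " + type, e);
        }
    }
}
```

Given a row with count 1 and total 4, instantiate(Tally.class, row) builds a Tally through its two-argument constructor, and no default constructor or setter is needed.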

Saturday, September 04, 2004

How to make a domain object

There are several basic rules for designing a domain object. These rules keep safety and correctness in mind:

Always implement equals and hashCode
I have seen many memory leaks from inadvertent caches of domain objects which used object identity. When implementing equals and hashCode, only compare primary keys since that is the sense of equality the persistence layer uses.
Mark immutable fields final and provide no setters for them
Always mark primary keys final and only set them in a constructor. This means there is no default constructor.
Include all required fields in constructors
Never leave domain objects in an inconsistent state. The constructors should reflect this fact. Again that means no default constructor.
Test the inputs of non-null fields
Simply provide if (null == someField) throw new NullPointerException(); in the constructor or setter which takes someField. If the field is not required, make the same check in the getter as it may not have been initialized yet.
Implement Comparable if there is a default sort order
Whenever domain objects have any kind of natural sort order, always implement Comparable. Do not force other code to do the work on behalf of the domain object. And remember to compare as many fields as needed by the sort order. For example, a CustomerName comparator needs to sort on lastName, firstName, middleName and nameSuffix.
Implement toString for debugging
Do not use toString for business code or display. For those, use business-specific methods such as displayAs. Commons-lang provides an excellent ToStringBuilder for debugging.

Unfortunately, not all of these rules can be used with every project. For example, if you persist domain objects directly rather than using an intermediate persistence layer to represent database tables (e.g., Hibernate or iBatis), the framework requires that every instance field have a bean-like getter/setter. (Actually, recent Hibernate supports direct field access, but this is still uncommon with many projects.)
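Pulling the rules together, a domain object might look something like the following sketch. Customer, its fields and the choice of id as a final tie-breaker in compareTo are all invented for illustration:

```java
// Hypothetical domain object following the rules above.
final class Customer implements Comparable<Customer> {
    private final long id;          // primary key: final, constructor only
    private final String lastName;  // required
    private final String firstName; // required
    private String nickname;        // optional: has a setter, may be unset

    Customer(final long id, final String lastName, final String firstName) {
        if (null == lastName || null == firstName)
            throw new NullPointerException();
        this.id = id;
        this.lastName = lastName;
        this.firstName = firstName;
    }

    void setNickname(final String nickname) {
        if (null == nickname)
            throw new NullPointerException();
        this.nickname = nickname;
    }

    String getNickname() {
        // Optional field: fail loudly if it was never set.
        if (null == nickname)
            throw new NullPointerException();
        return nickname;
    }

    @Override
    public boolean equals(final Object o) {
        // Only the primary key takes part in equality.
        return o instanceof Customer && id == ((Customer) o).id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }

    @Override
    public int compareTo(final Customer that) {
        int compareTo = lastName.compareTo(that.lastName);
        if (0 != compareTo) return compareTo;
        compareTo = firstName.compareTo(that.firstName);
        if (0 != compareTo) return compareTo;
        // Tie-breaker keeps the ordering consistent with equals.
        return Long.compare(id, that.id);
    }

    @Override
    public String toString() {
        // For debugging only, never for display.
        return "Customer[id=" + id + ", lastName=" + lastName
                + ", firstName=" + firstName + "]";
    }
}
```

There is no default constructor, so an instance can never exist without its key and required fields.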