Tuesday, September 06, 2005

JSF and AJAX

A delightful page on mixing JSF and AJAX. Particularly helpful is that the authors present three strategies—not just a simple HOWTO recipe—and a comparison among them. And the article is richly researched, a joy to read.

With the discussion of Sun open sourcing the JSF RI (reference implementation), this article deserves a wider audience.

Refactoring Thumbnails

What a wonderful resource! A set of simple descriptions and straightforward UML diagrams for common refactorings. The link has been up on the refactoring web site for about a year now, but somehow I missed it.

Take this example, Hide Implementation With Interface:

Before: client code directly accesses features of an implementation class despite the presence of an interface.
After: restrict the client code's use of features to those provided by the available interface.

Monday, September 05, 2005

Drools

Drools, the Java rules engine, has been getting press lately. O'Reilly has a nice pair of articles, one with examples, the other a higher-level view.

A lot of Drools is based on work first done in CLIPS, the LISPy rules engine of the 80s and early 90s. As it happens, I already knew about CLIPS. Both are built around the Rete algorithm, a clever way to trade memory for speed in if-then rule systems.

I learned about CLIPS in the late 80s from Brian Donnell (now Dantes), a fine fellow. We were classmates in Baker College at Rice University, and in fact he was the very first person ever to explain Object Oriented Programming to me. (As he wrote COOL, the OOP system in CLIPS, this was a topic he enjoyed discussing.) He tried describing it in the context of C with Classes and C++; at the time the only language I knew was C. I can thank him and Matt Cohen (who taught me C) for starting my life in programming.

So I am pleased to see systems like Drools and Jess getting some press. And still, LISP lives on.

Friday, September 02, 2005

Fun with Ant macros

Noticing that many Ant file operations were rather slow on large directory trees, I fell back to the tried and true: <exec/>. But as I coded up cp, mv and rm, the calls to <exec/> became rather tedious and repetitious. Looking around a bit, I found just the thing: <macrodef/>! First the starting code, using rm as an example:

<echo message="rm -rf ${some.dir}"/>
<exec executable="rm">
    <arg value="-rf"/>
    <arg value="${some.dir}"/>
</exec>

Pretty straightforward. Just note that <exec/> is silent, so I add an <echo/>. This corresponds to the shell command:

$ rm -rf $some_dir

The next step is to turn this into a macro:

<macrodef name="rm">
    <attribute name="src"/>
    <sequential>
        <echo message="rm -rf @{src}"/>
        <exec executable="rm">
            <arg value="-rf"/>
            <arg value="@{src}"/>
        </exec>
    </sequential>
</macrodef>

Notice that the '$' became an '@' and the <exec/> is now wrapped in a <sequential/> tag. The '@' is how Ant tells macro attributes apart from ordinary properties, and <sequential/> holds the macro body. With this change, I can now use this Ant script snippet:

<rm src="${some.dir}"/>

Much more readable!

Next I want to make this a bit more reusable. My example was super simple, but other cases might need to use some of the attributes for <exec/>. For my purposes, I added dir, spawn and failonerror. I found real uses of dir and failonerror in our codebase, and I wish to single out spawn in just a minute. That yields:

<macrodef name="rm">
    <attribute name="src"/>
    <attribute name="dir" default="."/>
    <attribute name="spawn" default="false"/>
    <attribute name="failonerror" default="false"/>
    <sequential>
        <echo message="rm -rf @{src}"/>
        <exec executable="rm" dir="@{dir}" spawn="@{spawn}"
                failonerror="@{failonerror}">
            <arg value="-rf"/>
            <arg value="@{src}"/>
        </exec>
    </sequential>
</macrodef>

Combined here are the techniques for macro attribute defaults and for passing down attributes for tasks wrapped in a macro. These serve well to preserve expected defaults and avoid surprises for macro users.

Aside: Why does failonerror default to false? This seems a perverse choice for a build system when fail-fast strategies save so much developer time in large projects.

Lastly, I want to make the macro generic so it works with cp and mv, not just rm. So I did the obvious (to me) thing: I wrote a macro that defines the macro. Thus:

<macrodef name="file-operation">
    <attribute name="operation"/>
    <attribute name="message"/>
    <element name="attributes"/>
    <element name="args"/>
    <sequential>
        <macrodef name="@{operation}">
            <attributes/>
            <attribute name="dir" default="."/>
            <attribute name="spawn" default="false"/>
            <attribute name="failonerror" default="false"/>
            <sequential>
                <echo message="@{message}"/>
                <exec executable="@{operation}" dir="@{dir}"
                      spawn="@{spawn}" failonerror="@{failonerror}">
                    <args/>
                </exec>
            </sequential>
        </macrodef>
    </sequential>
</macrodef>

And my definition of the <rm/> macro becomes:

<file-operation operation="rm" message="rm -rf @{src}">
    <attributes>
        <attribute name="src"/>
    </attributes>
    <args>
        <arg value="-rf"/>
        <arg value="@{src}"/>
    </args>
</file-operation>

Usage stays the same:

<rm src="${some.dir}"/>

And likewise for cp and mv:

<file-operation operation="cp" message="cp -a @{src} @{dst}">
    <attributes>
        <attribute name="src"/>
        <attribute name="dst"/>
    </attributes>
    <args>
        <arg value="-a"/>
        <arg value="@{src}"/>
        <arg value="@{dst}"/>
    </args>
</file-operation>

<file-operation operation="mv" message="mv @{src} @{dst}">
    <attributes>
        <attribute name="src"/>
        <attribute name="dst"/>
    </attributes>
    <args>
        <arg value="@{src}"/>
        <arg value="@{dst}"/>
    </args>
</file-operation>

With corresponding Ant script calls:

<cp src="${some.dir}" dst="${duplicate.dir}"/>
<mv src="${some.dir}" dst="${renamed.dir}"/>

One last itch remains for me. The raison d'ĂȘtre of this whole exercise was to speed up <delete dir="${some.dir}"/>. An optimization I realized early on was not just to call rm, but to run the operation in the background:

<tempfile property="tmp.dir" prefix=".tmp."/>
<mv src="${some.dir}" dst="${tmp.dir}"/>
<rm src="${tmp.dir}" spawn="true"/>

This pattern renames ${some.dir} to a random temporary directory and deletes the temporary directory in the background. The documentation of <exec/> even claims the operation continues after the Ant script exits. Perfect! Now to simplify usage of the pattern:

<macrodef name="rm-background">
    <attribute name="src"/>
    <attribute name="property"/>
    <attribute name="dir" default="."/>
    <attribute name="failonerror" default="false"/>
    <sequential>
        <tempfile property="@{property}" prefix=".@{property}."/>
        <mv src="@{src}" dst="${@{property}}" dir="@{dir}"
            failonerror="@{failonerror}"/>
        <rm src="${@{property}}" spawn="true" dir="@{dir}"
            failonerror="@{failonerror}"/>
    </sequential>
</macrodef>

Gives:

<rm-background src="${some.dir}" property="tmp.some.dir"/>

Notice that the temporary directory begins with a '.'. This is to make it hidden under UNIX/Cygwin so it doesn't clutter ls.

Time for a peanut butter sandwich.

UPDATE: I've saved everything in one place for easy examination.

Tuesday, August 30, 2005

Deleting things with Ant

One thing I have noticed about Ant is that file operations are slow, and recursive operations especially so. Consider this toy Ant script to delete a directory in the background while carrying on in the foreground:

<project name="parallel-delete" default="all">
    <property name="old.dir" value="build"/>
    <available file="${old.dir}" property="old.dir.exists"/>

    <target name="all" depends="rename-old-dir">
        <parallel>
            <apply executable="rm">
                <arg value="-rf"/>
                <dirset dir="." includes="${old.dir}.*"/>
            </apply>

            <sequential>
                <!-- Real work goes here -->
                <sleep seconds="1"/>
                <echo message="Snoozing..."/>
            </sequential>
        </parallel>
    </target>

    <target name="rename-old-dir" if="old.dir.exists">
        <tempfile property="tmp.dir" prefix="${old.dir}."/>
        <!-- Slow as all get out.
        <move todir="${tmp.dir}">
            <fileset dir="${old.dir}"/>
        </move> -->
        <!-- Deprecated, but works better than move.
        <rename src="${old.dir}" dest="${tmp.dir}"/> -->
        <!-- Fastest of all but only works on UNIX/Cygwin. -->
        <exec executable="mv">
            <arg value="${old.dir}"/>
            <arg value="${tmp.dir}"/>
        </exec>
    </target>
</project>

Notice the commented sections in the rename-old-dir target? Each of them has drawbacks. For my testing I set up with:

$ cp -a /usr/include build

This gave me a large directory tree to test with. You can decide from the comments which approach is least worst for you.

The actual technique, deleting in the background, is an interesting one and shaves considerable time off a large project wanting a clean rebuild. The funny business with <apply/> likewise owes to Ant's problems with file operations on a large, recursive tree. I cannot nest a <dirset/> element inside a <delete/> task (the <dirset/> type is, unfortunately, rather less useful than <fileset/>), yet I need to delete directories, not files, and the directory name is a pattern. (I may have leftover temporary directories to delete from previous, aborted runs.)

This solution is workable and fast, but with the drawback of targeting UNIX and Cygwin. Windows-only users get the short end of the development stick again.

My project is sitting on Ant 1.6.2; perhaps 1.6.3 addresses some of these. The syntax for file operations on directories is improving (e.g., <move file="src/dir" tofile="new/dir/to/move/to"/>) so some of my concerns may be already addressed.

More on the <parallel/> trick

My sample Ant script runs the cleanup in the background as foreground work continues; if the foreground work finishes first, <parallel/> still waits for the background to complete. An alternative is to surround the background work with <daemons/>; then when the foreground finishes, the script exits, leaving the background work incomplete. For a cleanup task this isn't a terrible choice: one could have a vacuum process or task to remove detritus from previous incomplete background work, with the benefit of having the Ant script finish faster.

UPDATE: Food for thought: here is a similar build script with make instead of Ant:

build.dir = build

all:
        : Do your work here
        sleep 1
        echo Snoozing...

rebuild: clean all

clean:
        test -d $(build.dir) && mv $(build.dir) $(build.dir).$$$$ || true
        ($(RM) -rf $(build.dir).* &)

Monday, August 29, 2005

Nice Ant 1.6.3 improvement for default values

Apache Ant 1.6.3 added a nice feature to the condition task. There is now an else attribute. The niceness is best illustrated by example:

<property environment="env"/>

<!-- The old, pre-1.6.3 way:-->
<condition property="foo.bar" value="${env.FOO_BAR}">
    <isset property="env.FOO_BAR"/>
</condition>
<condition property="foo.bar" value="Lives somewhere else">
    <not>
        <isset property="env.FOO_BAR"/>
    </not>
</condition>

And with Ant 1.6.3:

<property environment="env"/>

<!-- The new, 1.6.3 way:-->
<condition property="foo.bar" value="${env.FOO_BAR}"
        else="Lives somewhere else">
    <isset property="env.FOO_BAR"/>
</condition>

It could be more concise still, but this is an improvement.

Wednesday, August 24, 2005

Saturday, August 20, 2005

A home page

I enjoy posting code snippets, but sometimes a full project is more helpful. On my home page are links to downloads for several of my more interesting (to me) posts. Cheers!

Friday, August 19, 2005

GNU autotools lessons learned

Along the way to autoconfiscating portions of our Windows to Linux port for my company's desktop software, I discovered the autoreconf gem. It handles most of what I had stuffed into an autogen.sh script.

One nit. I have some custom macros in an m4/ subdirectory which add support for --enable-debug, --enable-profile and --enable-release flags to ./configure. (Why aren't these standard, or at least the macros standard?) autoreconf supports an -I m4 option to pass these to autoconf and autoheader, but not to aclocal.

Drat!

However, thanks to GNU Automake by Example, I found that I can put an ACLOCAL_AMFLAGS = -I m4 line in the top-level Makefile.am to pass -I m4 to aclocal. This is an unfortunate code duplication, but better than simply having the feature broken.

I also discovered autoupdate which brought my configure.ac file up to current standards. Nifty.

Lastly, I saw that autoreconf is actually quite clever. If I have never run ./configure, the --make option does nothing as it does not know how I wish to configure the project (the install directory, for example). However once I have run ./configure, autoreconf reuses the settings from that first run for subsequent runs of ./configure and then dashes off with make afterwards.

Thursday, August 11, 2005

Teaching configure about build flags

Cobbling together several sources from around the Internet, I arrived this morning at a solution to a question posed by one of our developers: how do you make debug vs. release builds with GNU autotools?

A first pass at answering produced the ac_build_types.m4 macro for autoconf. First, some usage. Here is configure.ac:

AC_PREREQ(2.59)
AC_INIT([my_project], [0.0.0], [binkley@alumni.rice.edu])
AC_CONFIG_SRCDIR([config.h.in])
AC_CONFIG_HEADER([config.h])
AM_INIT_AUTOMAKE

AC_BUILD_TYPES

dnl Rest of file...

The only thing different from a standard configure.ac is the addition of AC_BUILD_TYPES. The effect of that shows in ./configure:

$ ./configure --help
# ...
Optional Features:
  --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
  --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
  --enable-debug          build debug version
  --enable-profile        build profiling version
  --enable-release        build release version
# ...

Now there are flags to build debug, profiling and release builds.

Last is the macro itself:

AC_DEFUN([AC_BUILD_TYPES],
[
AC_ARG_ENABLE(debug,
      [  --enable-debug          build debug version],
      CFLAGS="$CFLAGS -DDEBUG -O0 -g3 -Wall"
      CXXFLAGS="$CXXFLAGS -DDEBUG -O0 -g3 -Wall")
AC_ARG_ENABLE(profile,
      [  --enable-profile        build profiling version],
      CFLAGS="$CFLAGS -pg"
      CXXFLAGS="$CXXFLAGS -pg"
      LDFLAGS="$LDFLAGS -pg")
AC_ARG_ENABLE(release,
      [  --enable-release        build release version],
      CFLAGS="$CFLAGS -DNDEBUG -g0 -O3"
      CXXFLAGS="$CXXFLAGS -DNDEBUG -g0 -O3")
])

That's all! Save the definition into something like ac_build_types.m4 and run aclocal -I directory-containing-macros as part of creating your ./configure for other developers.

Wednesday, August 10, 2005

Almost there

As part of exploring a Windows to Linux port, I looked into generating Windows DLLs with automake and libtool. After some experimenting, I have almost everything I want in an example library, Foo:

  1. Builds static and shared libraries for Linux
  2. Builds static and shared libraries for Windows
  3. Windows libraries can depend on Cygwin
  4. Windows libraries can be entirely independent of Cygwin

Here's how I build for that last case:

$ ./autogen.sh
$ CC="cc -mno-cygwin" CXX="cc -mno-cygwin" LTCC="cc" ./configure --prefix=/tmp/foo
$ make install

This installs everything under /tmp/foo (where I can easily clean up between test runs).

Why LTCC="cc"? It turns out that libtool writes a temporary wrapper for main, and just this wrapper needs the full Cygwin environment. Libtool provides LTCC for building the wrapper independently of the CC used for everything else. Without the extra setting everything actually works, but the build produces spurious errors. (If you want an explanation of -mno-cygwin, see the manpage for GCC under Cygwin.)

I posted this under the title Almost there because one more detail remains to work out. Even though the DLL has no Cygwin dependencies, it is still named cygfoo-0-0-0.dll rather than foo.dll as expected. Everything works correctly, but this is annoying. You cannot simply rename the DLL either, as doing so breaks loading for programs linked against it. When I figure this last bit out, I'll post an update to this entry.

Runtime traces for C

While porting some software from Windows to Linux, I needed to see backtraces. If there is no global exception handler and an exception in C++ is not caught, it aborts the program. The reports I got went along the lines of some program output:

$ run_foo
Aborted.

Not very helpful!

So I wrote a small pair of trace macros based on the GNU C Library's backtrace facility. Although that facility is only available on UNIX and Linux platforms, my tracing macros are still helpful under Windows, sans backtraces.

#ifndef TRACE_H_
#define TRACE_H_

#include <stdio.h>
#include <stdlib.h>

#define TRACE_BACKTRACE_FRAMES 10

#ifdef __GNUC__
# define __FUNCTION__ __PRETTY_FUNCTION__
#endif

/* Emacs-style output */
#ifdef EMACS
# define TRACE_PREFIX   fprintf (stderr, "%s:%d:%s", __FILE__, __LINE__, __FUNCTION__)
#else
# define TRACE_PREFIX   fprintf (stderr, "%s(%d):%s", __FILE__, __LINE__, __FUNCTION__)
#endif /* EMACS */

#ifdef linux
# include <execinfo.h>

# define TRACE_DUMP() \
  do { \
      void *array[TRACE_BACKTRACE_FRAMES]; \
      int n = backtrace (array, sizeof (array) / sizeof (void *)); \
      char **symbols = backtrace_symbols (array, n); \
      int i; \
 \
      for (i = 0; i < n; ++i) \
          fprintf (stderr, " -[%d]%s\n", i, symbols[i]); \
 \
      free (symbols); \
  } while (0)
#else
# define TRACE_DUMP()
#endif /* linux */

#define TRACE() \
  TRACE_PREFIX; \
  fprintf (stderr, "\n"); \
  TRACE_DUMP ()

#define TRACE_MSG(MSG) \
  TRACE_PREFIX; \
  fprintf (stderr, ": %s\n", MSG); \
  TRACE_DUMP ()

#endif /* TRACE_H_ */

Use the EMACS define to switch between Emacs-style and regular line tracing.

UPDATE: I meant to provide sample output:

trace.c(9):bob: Some interesting debug message.
 -[0]./trace(bob+0x51) [0x804878d]
 -[1]./trace(main+0x21) [0x8048817]
 -[2]/lib/tls/libc.so.6(__libc_start_main+0xd0) [0x4021de80]
 -[3]./trace [0x80486a1]

One thing jumps out immediately: this is not Java. But one can tease out that main called bob and bob wrote a trace message. Notice this:

trace.c(9):bob: Some interesting debug message.
 -[0]./trace(main+0x51) [0x8048841]
 -[1]/lib/tls/libc.so.6(__libc_start_main+0xd0) [0x4021de80]
 -[2]./trace [0x80486a1]

That is the same output with -O3 passed to GCC. The compiler inlines away the trivial bob function. -O2 did not inline the function.

Friday, July 29, 2005

The complexity of Map in Java

I just posted an updated version of my util library of extensions to Java Collections in the course of which I added a ListMap interface to replace UnionMap and an implementation, ArrayHashListMap.

If one were to write read-only Map implementations, the procedure is straightforward and well documented in the Javadocs for Map. However, writing a modifiable map is much more of an undertaking.

Consider all the places that "modifiability" can be leaked:

entrySet()
This is burdensome: the Set has to provide modifiable operations which affect the original Map, and every method of Set which itself returns modifiable collections or iterators needs to tie back to the original Map, all the way down to Map.Entry.
keySet()
The same remarks for entrySet() apply to keySet() (excepting Map.Entry).
values()
Similarly for values(), although it is a Collection rather than a Set, a distinction of little consequence here.
layers()
A method particular to ListMap; it returns the list of maps comprising the PATH-like layers which are collapsed to present a single Map view.
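For a concrete sense of the first two leaks, here is how the stock HashMap behaves: its entrySet() and keySet() views write straight through to the backing map, which is exactly the contract a modifiable Map implementation must reproduce.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class ViewLeakDemo {
    public static void main(final String[] args) {
        final Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("a", 1);
        map.put("b", 2);

        // Removing a key through the keySet() view removes the mapping.
        map.keySet().remove("a");
        System.out.println(map.containsKey("a")); // false

        // An iterator over entrySet() writes through as well.
        for (final Iterator<Map.Entry<String, Integer>> it
                = map.entrySet().iterator(); it.hasNext();) {
            it.next();
            it.remove();
        }
        System.out.println(map.isEmpty()); // true
    }
}
```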

What did I do in the face of this? I introduced a new interface, Refreshing (would Refreshable have been better?), with a single method, void refresh(). I implemented versions of Map, List, Set and Iterator which take a Refreshing object to signal after modifiable operations complete, and extended AbstractMap to return refreshing versions of the collection classes at each point where modifiability leaks. You can see the result in ListMap and ArrayHashListMap.
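The real code lives in the linked library; here is only a minimal sketch of the idea under my own naming (RefreshingSet is not the library's class name): a Set decorator whose iterator signals the Refreshing owner after a removal completes.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Set;

interface Refreshing {
    void refresh();
}

/** A Set decorator that signals a Refreshing owner after removals. */
class RefreshingSet<E> extends AbstractSet<E> {
    private final Set<E> wrapped;
    private final Refreshing owner;

    RefreshingSet(final Set<E> wrapped, final Refreshing owner) {
        this.wrapped = wrapped;
        this.owner = owner;
    }

    public int size() { return wrapped.size(); }

    public Iterator<E> iterator() {
        final Iterator<E> it = wrapped.iterator();
        return new Iterator<E>() {
            public boolean hasNext() { return it.hasNext(); }
            public E next() { return it.next(); }
            public void remove() {
                it.remove();
                owner.refresh(); // signal after the modification completes
            }
        };
    }
}
```

The same decoration applies to the List, Map and Map.Entry views in turn.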

Thank goodness for mock objects and unit tests!

Tuesday, July 26, 2005

More automake, now with packages

Murray Cumming had a lot of helpful things to say about automake and libraries. Drawing from that, I found the right way to package our inhouse libraries for reuse was to add this to the library's Makefile.am:

pkgconfigdir = $(libdir)/pkgconfig
pkgconfig_DATA = name_of_library_project.pc

And drop a name_of_library_project.pc.in into the top-level library project directory:

prefix=@prefix@
exec_prefix=@exec_prefix@
libdir=@libdir@
includedir=@includedir@

Name: Name of library project
Description: Some description here.
Requires:
Version: @VERSION@
Libs: -L${libdir} -lname_of_library link list of libraries for this project
Cflags:

The rest was magic, and "just worked". To other projects wanting to use the library, add this to their configure.in (or configure.ac; different names, same files):

PKG_CHECK_MODULES(DEPS, name_of_library_project)
AC_SUBST(DEPS_CFLAGS)
AC_SUBST(DEPS_LIBS)

(See my earlier post on autogen.sh.)

Monday, July 25, 2005

Porting the Angband borg to 3.0.6

The very excellent APWBorg for Angband does not compile out of the box for version 3.0.6, but the fix is simple. Make these substitutions in the borg?.c files:

Term->offset_x for p_ptr->wx
Term->offset_y for p_ptr->wy

This matches the ChangeLog description:

2004-05-31 17:26  rr9

        * src/: cave.c, cmd3.c, defines.h, dungeon.c, files.c, generate.c,
        l-player.pkg, spells2.c, types.h, wizard2.c, xtra2.c: Replaced the
        'wx' and 'wy' main term viewport coordinates with window specific
        offsets.

A simple diff of src/cave.c between versions 3.0.5 and 3.0.6 shows this change in the mainline sources.

UPDATE: Some tips on installing APWBorg under Linux in the form of command history:

$ unzip 305Borg.zip
$ cd 305Borg
$ rm MAIN-WIN.C
$ for x in *
> do mv $x $(echo $x | tr '[:upper:]' '[:lower:]')
> done
$ dos2unix *
$ ANGBAND=... # where ever you unpacked angband-3.0.6.tar.gz
$ cp borg*.[ch] $ANGBAND/src
$ cp borg.txt $ANGBAND/lib/user

You still need to fix up the make process to compile and link the borg sources into angband. Copy the sample autogen.sh to $ANGBAND and chmod +rx autogen.sh. Add the nine borg?.c files to the list of sources in $ANGBAND/src/Makefile.am. Add borg.txt to the list of sources in $ANGBAND/lib/user/Makefile.am. Then:

$ cd $ANGBAND
$ ./autogen.sh
$ ./configure --prefix=$ANGBAND
$ make install

Now play angband.

UPDATE: I dropped Dr. White a line and got a reply. Looks like my patch will make it into the next version of his borg. Yah for open source!

Sunday, July 24, 2005

ReverseList for Java

One of my Java pastimes is writing trivial utility classes. Part of my interest is stylistic: I hate seeing blocky, chunky code that could be replaced by elegant, simple code. Illustrating my point, I needed a view presenting an underlying list in reverse order. So I created a simple utility class, ReverseList, which fits the bill:

class ReverseList<E>
        extends AbstractList<E> {
    private final List<E> list;

    public ReverseList(final List<E> list) {
        this.list = list;
    }

    public void add(final int index, final E element) {
        list.add(reverseIndex(index) + 1, element);
    }

    public E remove(final int index) {
        return list.remove(reverseIndex(index));
    }

    public E get(final int index) {
        return list.get(reverseIndex(index));
    }

    public E set(final int index, final E element) {
        return list.set(reverseIndex(index), element);
    }

    public int size() {
        return list.size();
    }

    private int reverseIndex(final int index) {
        return size() - index - 1;
    }
}

The only trick to it is to note that add(int, Object) requires one to bump the reversed index. I only discovered this during unit testing and was puzzled for a while. Then I realized that add is conceptually akin to inserting in that it shifts elements to the right: reversing the list shifts to the left and hence requires that indices be off by one.

UPDATE: An even better explanation for the index in add: The indices are akin to point in Emacs, that is, they note the spot between elements and are just to the left of a given element. If you think of elements as boxes in a row, the point is the gap to the left of a box between it and the box next to it.

Reversing a list is like reflecting it in a mirror: traversing it from beginning to end becomes end to beginning when reversing. Likewise, point is reflected from being just to the left of an element to just to the right of an element. To represent this change in the original, underlying list is to increase the index of point by one.
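To make the off-by-one concrete, here is the class from above in a small demonstration: inserting at index 0 of the reversed view lands, after the +1 bump, at the end of the underlying list.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReverseListDemo {
    // The ReverseList from the post, unchanged.
    static class ReverseList<E> extends AbstractList<E> {
        private final List<E> list;
        ReverseList(final List<E> list) { this.list = list; }
        public void add(final int index, final E element) {
            list.add(reverseIndex(index) + 1, element);
        }
        public E remove(final int index) { return list.remove(reverseIndex(index)); }
        public E get(final int index) { return list.get(reverseIndex(index)); }
        public E set(final int index, final E element) {
            return list.set(reverseIndex(index), element);
        }
        public int size() { return list.size(); }
        private int reverseIndex(final int index) { return size() - index - 1; }
    }

    public static void main(final String[] args) {
        final List<Integer> list = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        final List<Integer> reversed = new ReverseList<Integer>(list);
        System.out.println(reversed); // [3, 2, 1]

        // Insert at the front of the reversed view: the +1 bump lands it
        // at the *end* of the underlying list.
        reversed.add(0, 9);
        System.out.println(list);     // [1, 2, 3, 9]
        System.out.println(reversed); // [9, 3, 2, 1]
    }
}
```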

UPDATE: Andrew McCormick pointed out that "reverse list" also has another name: Stack.

Friday, July 22, 2005

Java's weakness, C++'s strength

For the most part I enjoy coding in Java more than in C++. Especially with tools like the Intellij IDEA refactoring editor, my productivity is higher in Java and my pain lower. However, the one thing from C++ I miss the most is the destructor.

Strangely, many C++ programmers (at least those I witness) seem to overlook this strength. One of my favorite utility classes is so simple:

#include <stdio.h>

class auto_file {
  FILE* filep;

public:
  auto_file (const char *path, const char *mode) {
    if (!(filep = fopen (path, mode)))
      throw some_program_specific_exception (path, mode);
  }

  ~auto_file () { fclose (filep); }

  operator FILE* () { return filep; }
};

Most times I could just use <fstream>, but sometimes I need to pass around file descriptors or FILE pointers to code outside my control. And when you work with sockets (which need the same treatment), there is no standard library facility.

Java's try-finally approach to releasing system resources is ok for many uses, but is prone to forgetfulness and does not work at all when object scope is complex.
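For comparison, a sketch of the try-finally idiom in question (the helper name is my own): the release is manual, easy to forget, and tied to a single lexical scope.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class TryFinallyDemo {
    /** Reads the first byte of a file, releasing the stream by hand. */
    static int firstByte(final String path) throws IOException {
        final InputStream in = new FileInputStream(path);
        try {
            return in.read();
        } finally {
            in.close(); // easy to forget; impossible when the object outlives this scope
        }
    }
}
```

When the resource must outlive the method that opened it, there is no enclosing block to put the finally on, which is exactly where a destructor shines.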

The one thing I miss most in Java is the destructor. It would be swell to introduce a new language facility for them (something different from the finalize() fiasco). With garbage collection, Java could do an even better job than C++.

Sunday, July 17, 2005

Rich text editing for HTML

I've been playing with LiveJournal and noticed they, too, have a rich text textarea widget. But when I tried the simplest thing that could possibly work:

<html>
    <head>
        <title>Rich text?</title>
    </head>
    <body>
        <form>
            <textarea rows="5" cols="20"></textarea>
        </form>
    </body>
</html>

It did not work. If I turn on the richtext feature of LiveJournal and paste in text I copy from a webpage which includes links, LiveJournal pastes the links as well as the text and formatting; with my sample page I just get plain text.

George Hotelling does write about this and points to Epoz project on SourceForge.

Sadly, there are no files to download in spite of the project claiming to be mature.

I just want to figure out how LiveJournal's rich text widget works.

Wednesday, July 13, 2005

A starter autogen.sh

I looked around quite a bit before jimmying together this autogen.sh for autoconfiscating a project at work:

#!/bin/sh
set -x
autoscan
libtoolize --automake --copy
aclocal
autoconf
autoheader
automake --foreign --add-missing --copy

The point of autoscan is to catch any new portability problems as they crop up.

To go with autogen.sh, here is a starter configure.in:

AC_PREREQ(2.59)
AC_INIT([aproject], [0.0.0], [binkley@alumni.rice.edu])
AC_CONFIG_SRCDIR([config.h.in])
AC_CONFIG_HEADER([config.h])
AM_INIT_AUTOMAKE
AC_CONFIG_FILES([Makefile])
AC_OUTPUT

And Makefile.am:

lib_LTLIBRARIES = libsample.la
libsample_la_LDFLAGS = -version-info 0:0:0
libsample_la_SOURCES = sample.cc

It would have saved me considerable effort to begin with starter files such as these. Just plop them into the top level of your project and run ./autogen.sh. Correct for errors in the output and look at configure.scan for suggestions for updating configure.in. And update Makefile.am with a real list of your sources.

Rinse, lather, repeat.

UPDATE: With modern autotools installations, an even cleaner starter script:

#!/bin/sh
autoscan # show autotools lint
autoreconf --install "$@" # add -I dir for local M4 macros

NB — If you add -I m4 (for example) to pick up custom M4 macros in the m4/ directory of your project, remember to update Makefile.am as well and add to the top:

ACLOCAL_AMFLAGS = -I m4

Unfortunate duplication, but necessary with current tools (autoconf 2.59 / automake 1.9.5). See this post for details.

Monday, July 04, 2005

The many ways to skin a cat

I just ran across yet another way to write the Factory pattern in Java:

public class Factory {
    private static interface Lineman {
        Thing create(final Widget widget);
    }

    private static Lineman[] LINEMEN = {
        new Lineman() {
            public Thing create(final Widget widget) {
                return new CoolThing(widget);
            }
        },
    };

    public static enum Things {
        COOL;

        private final Lineman lineman;

        Things() {
            lineman = LINEMEN[ordinal()];
        }

        public Thing create(final Widget widget) {
            return lineman.create(widget);
        }
    }
}

User code looks likes this:

final Thing thing = Factory.Things.COOL.create(widget);

Although it looks overcomplicated, it does have several useful points.

User code is constrained by the enums as to the types of objects the factory creates. No fussing with class names or special strings. And modern editors offer full code completion for the enums.

Secondly, code maintenance for the factory is simple. When you add a new type, add a new enum constant and a corresponding entry in the static array. This sort of work is straightforward to automate with XDoclet or some custom build step.
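That maintenance step might look like this, compacted into one self-contained sketch. Thing, Widget, CoolThing and the hypothetical second type LameThing are stand-ins of my own, and for brevity the lineman is looked up at create time rather than cached in the enum constructor as in the original.

```java
public class FactoryDemo {
    interface Thing { String describe(); }
    static class Widget {}
    static class CoolThing implements Thing {
        CoolThing(final Widget w) {}
        public String describe() { return "cool"; }
    }
    static class LameThing implements Thing {
        LameThing(final Widget w) {}
        public String describe() { return "lame"; }
    }

    private interface Lineman {
        Thing create(Widget widget);
    }

    // Adding a type means one new enum constant and one new array entry.
    private static final Lineman[] LINEMEN = {
        new Lineman() {
            public Thing create(final Widget widget) { return new CoolThing(widget); }
        },
        new Lineman() {
            public Thing create(final Widget widget) { return new LameThing(widget); }
        },
    };

    public enum Things {
        COOL, LAME;

        public Thing create(final Widget widget) {
            // ordinal() pairs each constant with its array entry, in order.
            return LINEMEN[ordinal()].create(widget);
        }
    }

    public static void main(final String[] args) {
        System.out.println(Things.LAME.create(new Widget()).describe()); // lame
    }
}
```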

UPDATE: I should point out that suitable imports simplify the user code further:

final Thing thing = COOL.create(widget);

And reflection can automate further, eliminating the need for a static array of factory methods if the classes follow a naming pattern, though with the caveats that go with reflection (ignoring exceptions):

public Thing create(final Widget widget) {
    return (Thing) Class.forName(getThingClassName(name()))
            .getConstructor(Things.class, Widget.class)
            .newInstance(this, widget);
}

Friday, June 24, 2005

Let generics make your Java constructor cleaner

I often find myself writing Java constructors like this:

public SomeClass(final SomeType parameter) {
    super(parameter);

    if (parameter == null)
        throw new NullPointerException();
}

when what I really want to write is this:

public SomeClass(final SomeType parameter) {
    if (parameter == null) throw new NullPointerException();

    super(parameter);
}

which is, unfortunately, illegal. But there is a simple solution:

public SomeClass(final SomeType parameter) {
    super(asNotNull(parameter));
}

What is asNotNull(?)?

public static <T> T asNotNull(final T parameter) {
    if (parameter == null) throw new NullPointerException();

    return parameter;
}

Because the method is generic, the input type is preserved on output, so wrapping arguments in asNotNull(T) loses no type information and I need not write casts.

Of course, you can generalize this idea further and provide a family of tests or a more general method which takes an interface as a parameter or even an Exception parameter to change the thrown exception.
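A sketch of what such a family might look like. The names asNotNull(T, String), Check and checked are my own invention for illustration, not from any library:

```java
public final class Preconditions {
    private Preconditions() {}

    /** From the post: pass-through null check preserving the type. */
    public static <T> T asNotNull(final T parameter) {
        if (parameter == null) throw new NullPointerException();
        return parameter;
    }

    /** One possible extension: name the offending parameter. */
    public static <T> T asNotNull(final T parameter, final String name) {
        if (parameter == null) throw new NullPointerException(name);
        return parameter;
    }

    /** Another: an arbitrary test supplied through an interface. */
    interface Check<T> {
        boolean ok(T value);
    }

    public static <T> T checked(final T parameter, final Check<T> check) {
        if (!check.ok(parameter)) throw new IllegalArgumentException();
        return parameter;
    }
}
```

Each variant keeps the pass-through shape, so all of them slot into a super(...) call the same way asNotNull does.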

Monday, May 09, 2005

Ruby to Smalltalk

As part of working on Ruby on the Smalltalk VM, I am looking at ways of parsing Ruby in Ruby. My starting point is the fact that parse.y is by definition the most correct parsing of Ruby. So I thought, How could I turn parse.y into Ruby? Here is my idea:

  1. Turn parse.y into parse.c — Already done for me in the stock Ruby distribution
  2. Use GCC to turn parse.c into parse.rtl (where RTL is the intermediate representation of a program used by GCC) or parse.s (where S is some flavor of assembly)
  3. Find or write a small RTL or Assembly interpreter in Ruby.

And there you have it. I have reduced a nasty problem (turning YACC/C source into Ruby) into a less nasty one (interpreting in Ruby a made-to-interpret language).

Now to figure out step #3.