Tuesday, August 30, 2005

Deleting things with Ant

One thing about Ant I have noticed is that file operations are slow and recursive operations are especially so. Consider this toy Ant script to delete a directory in the background while carrying on in the foreground:

<project name="parallel-delete" default="all">
    <property name="old.dir" value="build"/>
    <available file="${old.dir}" property="old.dir.exists"/>

    <target name="all" depends="rename-old-dir">
        <parallel>
            <apply executable="rm">
                <arg value="-rf"/>
                <dirset dir="." includes="${old.dir}.*"/>
            </apply>

            <sequential>
                <!-- Real work goes here -->
                <sleep seconds="1"/>
                <echo message="Snoozing..."/>
            </sequential>
        </parallel>
    </target>

    <target name="rename-old-dir" if="old.dir.exists">
        <tempfile property="tmp.dir" prefix="${old.dir}."/>
        <!-- Slow as all get out.
        <move todir="${tmp.dir}">
            <fileset dir="${old.dir}"/>
        </move> -->
        <!-- Deprecated, but works better than move.
        <rename src="${old.dir}" dest="${tmp.dir}"/> -->
        <!-- Fastest of all but only works on UNIX/Cygwin.
        <exec executable="mv">
            <arg value="${old.dir}"/>
            <arg value="${tmp.dir}"/>
        </exec>
    </target>
</project>

Notice the commented sections in the rename-old-dir target? Each of them has drawbacks. For my testing I setup with:

$ cp -a /usr/include build

This gave me a large directory tree to test with. You can decide from the comments which approach is least worst for you.

The actual technique—deleting in the background—is an interesting one and shaves considerable time off a large project wanting a clean rebuild. There is also the funny business with <apply/> also owing to Ant problems with file operations on a large, recursive tree. I cannot nest a <dirset/> element inside a <delete/> task (the <dirset/> type is, unfortunately, rather less useful than <fileset/>), but I need to delete directories, not files, and the name of the directory is a pattern. (I may have left over temporary directories to delete from previous, aborted runs.)

This solution is workable and fast, but with the drawback of targeting UNIX and Cygwin. Windows-only users get the short end of the development stick again.

My project is sitting on Ant 1.6.2; perhaps 1.6.3 addresses some of these. The syntax for file operations on directories is improving (e.g., <move file="src/dir" tofile="new/dir/to/move/to"/>) so some of my concerns may be already addressed.

More on the <parallel/> trick

My sample Ant script runs the cleanup in the background as foreground work continues. If the foreground work finishes, the script continues and the background completes. An alternative is to surround the background work with <daemons/>; then if the foreground finishes the script exits leaving the background work incomplete. For a cleanup task this isn't a terrible choice and one could have a vacuum process or task for removing detritus from previous incomplete background work with the benefit of having the Ant script finish faster.

UPDATE: Food for thought: here is a similar build script with make instead of Ant:

build.dir = build

all:
        : Do your work here
        sleep 1
        echo Snoozing...

rebuild: clean all

clean:
        test -d $(build.dir) && mv $(build.dir) $(build.dir).$$$$ || true
        ($(RM) -rf $(build.dir).* &)

No comments: