Friday, September 02, 2005

Fun with Ant macros

Noticing that many Ant file operations were rather slow on large directory trees, I fell back to the tried and true: <exec/>. But as I coded up cp, mv and rm, the calls to <exec/> became rather tedious and repetitious. Looking around a bit, I found just the thing: <macrodef/>! First, the starting code, using rm as an example:

<echo message="rm -rf ${some.dir}"/>
<exec executable="rm">
    <arg value="-rf"/>
    <arg value="${some.dir}"/>
</exec>

Pretty straightforward. Just note that <exec/> does not echo the command it runs, so I add an <echo/>. This corresponds to the shell command:

$ rm -rf $some_dir

The next step is to turn this into a macro:

<macrodef name="rm">
    <attribute name="src"/>
    <sequential>
        <echo message="rm -rf @{src}"/>
        <exec executable="rm">
            <arg value="-rf"/>
            <arg value="@{src}"/>
        </exec>
    </sequential>
</macrodef>

Notice that the '$' became a '@' and the <exec/> is now wrapped in a <sequential/> tag. The '@' is how Ant tells macro parameters apart from properties. With this change, I can now use this Ant script snippet:

<rm src="${some.dir}"/>

Much more readable!
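
For context, here is a minimal sketch of a complete build file that puts the macro and a call to it side by side. The project name, the target name and the value of some.dir are made up for illustration; only the macro itself comes from above.

<project name="macro-demo" default="clean" basedir=".">
    <!-- Hypothetical property: point it at the tree you want removed. -->
    <property name="some.dir" location="build/output"/>

    <!-- Top-level macrodefs are processed before any target runs,
         so <rm/> is available in every target of this file. -->
    <macrodef name="rm">
        <attribute name="src"/>
        <sequential>
            <echo message="rm -rf @{src}"/>
            <exec executable="rm">
                <arg value="-rf"/>
                <arg value="@{src}"/>
            </exec>
        </sequential>
    </macrodef>

    <target name="clean">
        <!-- ${some.dir} is expanded first; the macro sees the result as @{src}. -->
        <rm src="${some.dir}"/>
    </target>
</project>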

Next I want to make this a bit more reusable. My example was super simple, but other cases might need to use some of the attributes for <exec/>. For my purposes, I added dir, spawn and failonerror. I found real uses of dir and failonerror in our codebase, and I wish to single out spawn in just a minute. That yields:

<macrodef name="rm">
    <attribute name="src"/>
    <attribute name="dir" default="."/>
    <attribute name="spawn" default="false"/>
    <attribute name="failonerror" default="false"/>
    <sequential>
        <echo message="rm -rf @{src}"/>
        <exec executable="rm" dir="@{dir}" spawn="@{spawn}"
                failonerror="@{failonerror}">
            <arg value="-rf"/>
            <arg value="@{src}"/>
        </exec>
    </sequential>
</macrodef>

This combines two techniques: giving macro attributes defaults, and passing attributes down to the task wrapped in the macro. Together they preserve the expected defaults and avoid surprises for macro users.
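
As a quick sketch of what this buys the caller: the first call below relies entirely on the defaults, while the second overrides two of them. The ${build.dir} property and the "classes" path are placeholders of mine, not something from our build.

<!-- Defaults apply: dir=".", spawn="false", failonerror="false" -->
<rm src="${some.dir}"/>

<!-- Run rm from another working directory and stop the build if it fails -->
<rm src="classes" dir="${build.dir}" failonerror="true"/>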

Aside: Why does failonerror default to false? This seems a perverse choice for a build system when fail-fast strategies save so much developer time in large projects.

Lastly, I want to make the macro generic to work with cp and mv, not just rm. So I did the obvious (to me) thing: I made a macro that defines the macro. Thus:

<macrodef name="file-operation">
    <attribute name="operation"/>
    <attribute name="message"/>
    <element name="attributes"/>
    <element name="args"/>
    <sequential>
        <macrodef name="@{operation}">
            <attributes/>
            <attribute name="dir" default="."/>
            <attribute name="spawn" default="false"/>
            <attribute name="failonerror" default="false"/>
            <sequential>
                <echo message="@{message}"/>
                <exec executable="@{operation}" dir="@{dir}"
                      spawn="@{spawn}" failonerror="@{failonerror}">
                    <args/>
                </exec>
            </sequential>
        </macrodef>
    </sequential>
</macrodef>

And my definition of the <rm/> macro becomes:

<file-operation operation="rm" message="rm -rf @{src}">
    <attributes>
        <attribute name="src"/>
    </attributes>
    <args>
        <arg value="-rf"/>
        <arg value="@{src}"/>
    </args>
</file-operation>
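
To make the two levels of expansion easier to follow, here is my understanding of what that call ends up defining. This is a hand-written sketch of the result, not output captured from Ant: the outer macro fills in @{operation} and @{message} and splices in the <attributes/> and <args/> elements, while the '@' references it does not declare itself (@{src}, @{dir}, @{spawn}, @{failonerror}) pass through untouched for the inner macro to resolve later.

<!-- Sketch of the macrodef produced by the file-operation call above -->
<macrodef name="rm">
    <!-- from the <attributes> element -->
    <attribute name="src"/>
    <!-- fixed attributes contributed by file-operation -->
    <attribute name="dir" default="."/>
    <attribute name="spawn" default="false"/>
    <attribute name="failonerror" default="false"/>
    <sequential>
        <!-- @{message} was replaced; @{src} is left for the call to <rm/> -->
        <echo message="rm -rf @{src}"/>
        <exec executable="rm" dir="@{dir}"
              spawn="@{spawn}" failonerror="@{failonerror}">
            <!-- from the <args> element -->
            <arg value="-rf"/>
            <arg value="@{src}"/>
        </exec>
    </sequential>
</macrodef>

In other words, it is the same macro as the hand-written one from the previous step.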

Usage stays the same:

<rm src="${some.dir}"/>

And likewise for cp and mv:

<file-operation operation="cp" message="cp -a @{src} @{dst}">
    <attributes>
        <attribute name="src"/>
        <attribute name="dst"/>
    </attributes>
    <args>
        <arg value="-a"/>
        <arg value="@{src}"/>
        <arg value="@{dst}"/>
    </args>
</file-operation>

<file-operation operation="mv" message="mv @{src} @{dst}">
    <attributes>
        <attribute name="src"/>
        <attribute name="dst"/>
    </attributes>
    <args>
        <arg value="@{src}"/>
        <arg value="@{dst}"/>
    </args>
</file-operation>

With corresponding Ant script calls:

<cp src="${some.dir}" dst="${duplicate.dir}"/>
<mv src="${some.dir}" dst="${renamed.dir}"/>

One last itch remains for me. The reason I started down this road in the first place was to speed up <delete dir="${some.dir}"/>. An optimization I realized early on was not just to call rm, but to run the operation in the background:

<tempfile property="tmp.dir" prefix=".tmp."/>
<mv src="${some.dir}" dst="${tmp.dir}"/>
<rm src="${tmp.dir}" spawn="true"/>

This pattern renames ${some.dir} to a random temporary directory and deletes the temporary directory in the background. The documentation of <exec/> even claims the operation continues after the Ant script exits. Perfect! Now to simplify usage of the pattern:

<macrodef name="rm-background">
    <attribute name="src"/>
    <attribute name="property"/>
    <attribute name="dir" default="."/>
    <attribute name="failonerror" default="false"/>
    <sequential>
        <tempfile property="@{property}" prefix=".@{property}."/>
        <mv src="@{src}" dst="${@{property}}" dir="@{dir}"
            failonerror="@{failonerror}"/>
        <rm src="${@{property}}" spawn="true" dir="@{dir}"
            failonerror="@{failonerror}"/>
    </sequential>
</macrodef>

Gives:

<rm-background src="${some.dir}" property="tmp.some.dir"/>

Notice that the temporary directory begins with a '.'. This is to make it hidden under UNIX/Cygwin so it doesn't clutter ls.
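
One thing worth spelling out is why the macro takes a property name at all. Ant properties cannot be changed once set, and <tempfile/> does not overwrite a property that already exists, so a second call that reused the same name would get the first temporary name back. A small sketch, with made-up directory and property names:

<!-- Each call gets its own property, so each gets its own temporary name -->
<rm-background src="${build.dir}" property="tmp.build.dir"/>
<rm-background src="${dist.dir}" property="tmp.dist.dir"/>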

Time for a peanut butter sandwich.

UPDATE: I've saved everything in one place for easy examination.
