I need to track changing data in Java. For example, say I am collecting data sets of some real world behavior, and I'd like to know what has changed between individual readings: additions, removals, changes.
I start with a single datum:
public static class Example { public final int index; // primary key public final double value; // satelite data public final long access; // other data public Example(final int index, final double value, final long access) { this.index = index; this.value = value; this.access = access; } @Override public boolean equals(final Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; final Example example = (Example) o; if (index != example.index) return false; return true; } @Override public int hashCode() { return index; } }
Reducing to essentials:
- Primary key values: different
Example
instances with the same index represent the same "thing" - Satellite data: these are the measurements I want to track
- "Other" data: these are part of my data set but are not part of my change tracking
Now equals
and hashCode
take care of matching equivalent data so I can compare values between data sets. But I also need to know if the satellite data has changed:
public static class ExampleComparator implements Comparator<Example> { @Override public int compare(final Example a, final Example b) { final int c = Integer.valueOf(a.index).compareTo(b.index); if (0 != c) return c; return Double.valueOf(a.value).compareTo(b.value); } }
An important point to consider: why not have Example
implement Comparable<Example>
instead of the separate comparator class? Read the Comparator
javadocs closely: you really want equals
and compareTo
to agree (the discussion is good — I have a new source of bugs to comb out of my code base).
But here equals
looks at only primary keys and the comparator looks at both primary keys and satellite data: they do not agree.
Having all the tools I need, time for comparing data sets (please be kind about the weak class name):
public class Differator<T, C extends Comparator<T>> { private final Map<T, T> items = new HashMap<T, T>(); private final C comparator; public static <T, C extends Comparator<T>> Differator<T, C> newDifferator( final C comparator) { return new Differator<T, C>(comparator); } public Differator(final C comparator) { this.comparator = comparator; } public Changes<T> replaceAll(final Set<T> newItems) { final Changes<T> changes = new Changes<T>(); for (final T item : items.keySet()) if (!newItems.contains(item)) { changes.deleted.add(item); items.remove(item); } for (final T newItem : newItems) { if (!items.containsKey(newItem)) changes.inserted.add(newItem); else { final T item = items.get(newItem); if (0 != comparator.compare(item, newItem)) changes.updated.add(new Delta<T>(item, newItem)); } items.put(newItem, newItem); } return changes; } public static class Delta<T> { public final T item; public final T newItem; public Delta(final T item, final T newItem) { this.item = item; this.newItem = newItem; } @Override public boolean equals(final Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; final Delta delta = (Delta) o; if (!item.equals(delta.item)) return false; if (!newItem.equals(delta.newItem)) return false; return true; } @Override public int hashCode() { int result = item.hashCode(); result = 31 * result + newItem.hashCode(); return result; } } public static class Changes<T> { public final Set<T> inserted = new HashSet<T>(); public final Set<Delta<T>> updated = new HashSet<Delta<T>>(); public final Set<T> deleted = new HashSet<T>(); @Override public boolean equals(final Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; final Changes changes = (Changes) o; if (!deleted.equals(changes.deleted)) return false; if (!inserted.equals(changes.inserted)) return false; if (!updated.equals(changes.updated)) return false; return true; } @Override public int hashCode() { int result = inserted.hashCode(); result = 31 * result + updated.hashCode(); result = 31 * result + deleted.hashCode(); return result; } } }
It was not much effort to write Differator
. replaceAll
swaps out a new data set for an old one, returning how they two differed. Credit for immediately spotting that Differator
is not merely thread unsafe, it is very unsafe. But this is not hard to address in a simple way.
Lastly, some tests:
public class DifferatorTest { private Differator<Example, ExampleComparator> differator; private Example item; private Changes<Example> changes; @Before public void setUp() throws Exception { differator = newDifferator(new ExampleComparator()); item = new Example(1, 3.14159, 123); changes = differator.replaceAll(singleton(item)); } @Test public void testFirstInsert() { assertThat(changes.inserted, is(equalTo(singleton(item)))); assertThat(changes.updated.isEmpty(), is(true)); assertThat(changes.deleted.isEmpty(), is(true)); } @Test public void testUpdateUnimportantData() { final Changes<Example> changes = differator .replaceAll(singleton(new Example(1, 3.14159, 456))); assertThat(changes.inserted.isEmpty(), is(true)); assertThat(changes.updated.isEmpty(), is(true)); assertThat(changes.deleted.isEmpty(), is(true)); } @Test public void testUpdateSatelliteData() { final Example newItem = new Example(1, 2.71828, 123); final Changes<Example> changes = differator .replaceAll(singleton(newItem)); assertThat(changes.inserted.isEmpty(), is(true)); assertThat(changes.updated, is(equalTo( singleton(new Differator.Delta<Example>(item, newItem))))); assertThat(changes.deleted.isEmpty(), is(true)); } @Test public void testUpdatePrimaryKey() { final Example newItem = new Example(2, 3.14159, 123); final Changes<Example> changes = differator .replaceAll(singleton(newItem)); assertThat(changes.inserted, is(singleton(newItem))); assertThat(changes.updated.isEmpty(), is(true)); assertThat(changes.deleted, is(singleton(item))); } }
No comments:
Post a Comment