Wednesday, March 17, 2010

Upgrading read lock to write lock in Java

Setup

I ran into a great question on stackoverflow while researching a concurrency question: how to you upgrade a read lock to a write lock?

The answer is easy: you can't. Trying to do so gives you deadlock.

The discussion covers a DAG data structure, but I wanted to try a simpler case.

Simplest case

Say you have an idempotent "resource" (intentionally vague):

public interface Resource {
    void something() throws IOException;
}

The resource can spoil, and you want try another instance automatically. The obvious thing:

public class RetryResourceFacade
        implements Resource {
    private Resource resource;

    public RetryResourceFacade() {
        refreshResource();
    }

    public void something() {
        while (true) {
            try {
                resource.something();
                return;
            } catch (final IOException e) {
                refreshResource();
            }
        }
    }

    private void refreshResource() {
        while (true) {
            try {
                resource = newResource();
                return;
            } catch (final IOException e) {
                // try another one
            }
        }
    }

    public static Resource newResource()
            throws IOException {
        return null; // unlikely in real life
    }
}

Essentially when a resource spoils, you try another one until one works. But this is horrible for concurrent code: how to fix it?

Thread-safe

public class RetryResourceFacade
        implements Resource {
    private final Object lock = new Object();

    private volatile Resource resource;

    public RetryResourceFacade() {
        refreshResource();
    }

    public void something() {
        synchronized(lock) {
            while (true)
                try {
                    resource.something();
                    return;
                } catch (final IOException e) {
                    refreshResource();
                }
            }
        }
    }

    private void refreshResource() {
        while (true) {
            try {
                resource = newResource();
                return;
            } catch (final IOException e) {
                // try another one
            }
        }
    }

    public static Resource newResource()
            throws IOException {
        return null; // unlikely in real life
    }
}

Well, better, but still—do I really want to single-thread calls to something()? I can do yet better still:

More throughput

public class RetryResourceFacade
        implements Resource {
    private final Lock readLock;
    private final Lock writeLock;

    private volatile Resource resource;

    public RetryResourceFacade() {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        readLock = lock.readLock();
        writeLock = lock.writeLock();

        refreshResource();
    }

    public void something() {
        while (true) {
            readLock.lock();
            try {
                resource.something();
                readLock.unlock();
                return;
            } catch (final IOException e) {
                readLock.unlock();
                refreshResource();
            }
        }
    }

    private void refreshResource() {
        writeLock.lock();
        while (true) {
            try {
                resource = newResource();
                writeLock.unlock();
                return;
            } catch (final IOException e) {
                // try another one
            }
        }
    }

    public static Resource newResource()
            throws IOException {
        return null; // unlikely in real life
    }
}

By introducing read/write locks, I can let many callers use something() but single-thread refreshing the resource (I do not want several callers refreshing resource at once).

Best of all

But there is one final problem. What happens when two callers are both in something() and the same resource spoils? They both race to refreshResource(), one of them wins and gets a valid resource, but the other then throws that good one away and gets another new resource. Wasteful and embarrassing. The final solution:

public class RetryResourceFacade
        implements Resource {
    private final Lock readLock;
    private final Lock writeLock;

    private volatile Resource resource;
    private volatile boolean valid;

    public RetryResourceFacade() {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        readLock = lock.readLock();
        writeLock = lock.writeLock();

        refreshResource();
    }

    public void something() {
        while (true) {
            readLock.lock();
            try {
                resource.something();
                readLock.unlock();
                return;
            } catch (final IOException e) {
                valid = false;
                readLock.unlock();
                refreshResource();
            }
        }
    }

    private void refreshResource() {
        writeLock.lock();
        if (valid) {
            // Another caller already fixed resource
            writeLock.unlock();
            return;
        }

        while (true) {
            try {
                resource = newResource();
                valid = true;
                writeLock.unlock();
                return;
            } catch (final IOException e) {
                // try another one
            }
        }
    }

    public static Resource newResource()
            throws IOException {
        return null; // unlikely in real life
    }
}

Only refresh the resource if it is invalid. This is a little bit like handling spurious wakeup in that several threads compete for the resource, but only the first one gets to actually do something with it.

Key points

  • Back to the setup: avoid deadlock. You cannot obtain the write lock while holding the read lock, so release the read lock first, then obtain the write lock.
  • Use a status flag to ensure the write lock section is not executed needlessly.
  • You must mark resource and valid as volatile to ensure "publish" semantics (Jeremy Mason has a great post on this), otherwise anything could happen (really).

UPDATE: Thanks to Darren Accardo for key ideas in this post; his threading is stronger than mine!

No comments: