About me

Michael L Perry

Improving Enterprises

Principal Consultant

@michaellperry

User login

Better solution

The solution left the cache too constraining. While it was correct, it was too tightly locked. All users of the class will line up behind one another, whether they are after the same data or not.

To resolve this problem, let's move the lock. Instead of locking on the cache while fetching the value, let's lock on the key. Two clients accessing the cache with the same key will line up, but clients with different keys will not.

But we can't lock on the key itself. We need to lock on something with identity. The identity of the key is not important. A client can create a new instance of the key having the same value, and that should qualify as the same key. So we have to look elsewhere.

Fortunately, the value of the key is used to find an object in the dictionary. That object does have identity. So we move the lock there.

public class CachedObject<TValue>
{
    private TValue _value;
    private bool _fetched = false;
    public DateTime DateCached { get; private set; }

    public CachedObject(DateTime dateCached)
    {
        DateCached = dateCached;
    }

    public TValue Value
    {
        get { return _value; }
        set { _value = value; _fetched = true; }
    }

    public bool Fetched
    {
        get { return _fetched; }
    }
}

public class Cache<TKey, TValue>
{
    private TimeSpan _expiry;
    private Dictionary<TKey, CachedObject<TValue>> _store =
        new Dictionary<TKey, CachedObject<TValue>>();

    public Cache(TimeSpan expiry)
    {
        _expiry = expiry;
    }

    public TValue Get(TKey key, Func<TKey, TValue> fetch)
    {
        CachedObject<TValue> found;

        lock (_store)
        {
            // If the object is not in the cache, or it is expired,
            // fetch it and add it to the cache.
            DateTime now = DateTime.Now;
            if (_store.TryGetValue(key, out found) &&
                found.DateCached + _expiry > now)
            {
                return found.Value;
            }

            found = new CachedObject<TValue>(now);
            _store.Add(key, found);
        }

        lock (found)
        {
            if (!found.Fetched)
                found.Value = fetch(key);
            return found.Value;
        }
    }
}

The Get method locks first on the cache. This protects the data structure, and ensures that we get just one instance of CachedObject. Once it has that, it locks on the found CachedObject so that clients after the same key line up.

Inside the lock, it checks to see whether the value has already been fetched. If not, this thread is the first in line, so it fetches the value. But if it has already been fetched, then the thread was not the first in line and can just use the fetched object.

The above is not a formal proof, and has many logical holes. Hopefully we'll have time to shore it up, and possibly find bugs in the code along the way. This is the kind of code that definitely needs to be proven. Sure, unit testing can give us some confidence, but no amount of unit testing can verify that this code is completely thread safe.