About me

Michael L Perry

Improving Enterprises

Principal Consultant

@michaellperry

User login

Michael L Perry's blog

Solution

Two weeks ago, I assigned homework. Rewrite the .NET Socket API so that you can prove the following assertions:

  • You cannot use a Socket returned from Accept to accept any additional connections from the connection queue.
  • You cannot call Accept, Bind, Connect, Listen, Receive, or Send if the socket has been closed.
  • You must call Bind before you can call the Listen method.
  • You must call Listen before calling Accept.
  • Connect(string host, int port) can only be called if addressFamily is either InterNetwork or InterNetworkV6.
  • Connect cannot be called if Listen has been called.
  • You have to either call Connect or Accept before Sending and Receiving data.
  • If the socket has been previously disconnected, then you cannot use Connect to restore the connection.

These assertions are prominently listed in the MSDN documentation. Why not roll them into the API itself? Let the compiler prove that we are using sockets correctly. Last week, I gave you a hint. I moved the Send and Receive methods out to a different class so that we can prove that you've called Accept or Connect prior to Send or Receive. Both Accept and Connect became factory methods. You must call the factory before you get the object, and you must get the object before you call its methods. Q.E.D. Let's start from there and prove the rest of the assertions. Starting point

  • You have to either call Connect or Accept before Sending and Receiving data.
public class ConnectedSocket
{
    public int Receive(byte[] buffer);
    public int Send(byte[] buffer);
}

public class Socket : IDisposable
{
    public Socket(SocketInformation socketInformation);
    public Socket(AddressFamily addressFamily, SocketType socketType, ProtocolType protocolType);

    public ConnectedSocket Accept();
    public void Bind(EndPoint localEP);
    public ConnectedSocket Connect(EndPoint remoteEP);
    public ConnectedSocket Connect(string host, int port);
    public void Dispose();
    public void Listen(int backlog);
}

Looking through the list of assertions, we find that this one is already taken care of by the first change:

  • You cannot use a Socket returned from Accept to accept any additional connections from the connection queue.

The ConnectedSocket class doesn't have an Accept method, so you can't call it. Q.E.D.
Other prerequisitesThere are three other prerequisite assertions that we can prove in exactly the same way:

  • You must call Bind before you can call the Listen method.
  • You must call Listen before calling Accept.
  • Connect cannot be called if Listen has been called.
public class Socket : IDisposable
{
    public Socket(SocketInformation socketInformation);
    public Socket(AddressFamily addressFamily, SocketType socketType, ProtocolType protocolType);

    public BoundSocket Bind(EndPoint localEP);
    public ConnectedSocket Connect(EndPoint remoteEP);
    public ConnectedSocket Connect(string host, int port);
    public void Dispose();
}

public class BoundSocket
{
    public ListeningSocket Listen(int backlog);
    public ConnectedSocket Connect(EndPoint remoteEP);
    public ConnectedSocket Connect(string host, int port);
}

public class ListeningSocket
{
    public ConnectedSocket Accept();
}

Bind becomes a factory for the class that gives you Listen. Bind does not take away your ability to call Connect. Listen does.
Factory and productFactories are great prerequisites. Now that it is becoming more obvious that we are using that pattern, let's rename Socket and move IDisposable to the product.

public class SocketFactory
{
    public SocketFactory(SocketInformation socketInformation);
    public SocketFactory(AddressFamily addressFamily, SocketType socketType, ProtocolType protocolType);

    public BoundSocket Bind(EndPoint localEP);
    public ConnectedSocket Connect(EndPoint remoteEP);
    public ConnectedSocket Connect(string host, int port);
}

public class ConnectedSocket : IDisposable
{
    public int Receive(byte[] buffer);
    public int Send(byte[] buffer);
    public void Dispose();
}

The ConnectedSocket is the ultimate product. That's the socket that needs to be closed. And what do you know, this change proves another of our assertions:

  • If the socket has been previously disconnected, then you cannot use Connect to restore the connection.

It is impossible to call Connect on a socket that has been disposed, since they aren't the same object anymore.
Dispose onceNow that the Dispose method is where it belongs, let's focus on the assertion most concerned with its use:

  • You cannot call Accept, Bind, Connect, Listen, Receive, or Send if the socket has been closed.

Accept, Bind, Connect, and Listen have already been taken care of, since they have become factory methods. To handle Send and Receive, we turn every ConnectedSocket factory into a callback:

public class ConnectedSocket
{
    public int Receive(byte[] buffer);
    public int Send(byte[] buffer);
}

public void Connect(string host, int port, Action<ConnectedSocket> action);

We have taken away the capability of disposing a socket. The factory does it on the consumer's behalf. Since there is no way for the consumer to call Dispose, and since the callback returns before Dispose is called, there is no way to call Send or Receive after Dispose. Q.E.D.
Specialized factoryWhich leaves us with one final assertion:

  • Connect(string host, int port) can only be called if addressFamily is either InterNetwork or InterNetworkV6.

Again, we want to move the restricted method into its own special class. But now, we want to enforce that certain values are provided, not that certain methods are called. We'll limit this by creating our own smaller enumeration.

public class IPSocketFactory : SocketFactory
{
    public enum Version { V4, V6 }
    public IPSocketFactory(Version version, SocketType socketType, ProtocolType protocolType)
        : base(
            version == Version.V4 ? AddressFamily.InterNetwork : AddressFamily.InterNetworkV6,
            socketType,
            protocolType) { }

    public void Connect(string host, int port, Action<ConnectedSocket> action);
}

If you want to call the Connect overload with two parameters, you have to use the limited enumeration to get the right factory. I didn't mess with the overloaded constructor, because it complicates things.
One slight wrinkle. In a previous change we created a copy of the Connect method, since it can be called after Bind(). So we make the same change there:

public class IPBoundSocket : BoundSocket
{
    public void Connect(string host, int port, Action<ConnectedSocket> action);
}

public class IPSocketFactory : SocketFactory
{
    public enum Version { V4, V6 }
    public IPSocketFactory(Version version, SocketType socketType, ProtocolType protocolType)
        : base(
            version == Version.V4 ? AddressFamily.InterNetwork : AddressFamily.InterNetworkV6,
            socketType,
            protocolType) { }

    public IPBoundSocket Bind(EndPoint localEP);
    public void Connect(string host, int port, Action<ConnectedSocket> action);
}

The only way to get an IPBoundSocket is to use the IPSocketFactory. Therefore, you must have selected one of the Internet protocols. By the way, this is taking advantage of a little-used feature of C# called covarience. This is the one part of the proof that might not work in other languages.
So there you have it. A provably correct Socket API. There is no way to use it incorrectly. The compiler enforces all of the assertions that we would otherwise have to read the documentation to discover.



Hint

Your homework is extended for another week, but let me give you a hint what I'm looking for. The problem was to Fix the Socket API. Rewrite the .NET Socket interface so that the assertions made in the MSDN documentation can be proven. To give you an example of what I'm looking for, take the following assertion:

  • You have to either call Connect or Accept before Sending and Receiving data.

This is a prerequisite assertion. We must first call Connect or Accept, then we can call Send or Receive. To prove this assertion, I'll use factory methods. Here's the original Socket class:

public class Socket : IDisposable
{
    public Socket(SocketInformation socketInformation);
    public Socket(AddressFamily addressFamily, SocketType socketType, ProtocolType protocolType);

    public Socket Accept();
    public void Bind(EndPoint localEP);
    public void Connect(EndPoint remoteEP);
    public void Connect(string host, int port);
    public void Dispose();
    public void Listen(int backlog);
    public int Receive(byte[] buffer);
    public int Send(byte[] buffer);
}

Move the Send and Receive methods out of this class. We cannot call them yet. Move them into a new class called ConnectedSocket.

public class ConnectedSocket
{
    public int Receive(byte[] buffer);
    public int Send(byte[] buffer);
}

To prove that we have to call Connect or Accept before any of the ConnectedSocket methods, we make Connect and Accept factory methods:

public class Socket : IDisposable
{
    public Socket(SocketInformation socketInformation);
    public Socket(AddressFamily addressFamily, SocketType socketType, ProtocolType protocolType);

    public ConnectedSocket Accept();
    public void Bind(EndPoint localEP);
    public ConnectedSocket Connect(EndPoint remoteEP);
    public ConnectedSocket Connect(string host, int port);
    public void Dispose();
    public void Listen(int backlog);
}

This API proves one of the assertions in the MSDN documentation. Try the others on your own. Answers will be posted next week.



Socket API

Your task is to fix the .NET Socket API. This is as unhelpful as an API gets. There are several rules that a user of a Socket must follow. Create an API that proves that they are following all of the rules. Here is the relevant part of the existing unhelpful API:

public class Socket : IDisposable
{
    public Socket(SocketInformation socketInformation);
    public Socket(AddressFamily addressFamily, SocketType socketType, ProtocolType protocolType);

    public Socket Accept();
    public void Bind(EndPoint localEP);
    public void Connect(EndPoint remoteEP);
    public void Connect(string host, int port);
    public void Dispose();
    public void Listen(int backlog);
    public int Receive(byte[] buffer);
    public int Send(byte[] buffer);
}

Here are the rules as documented in MSDN:

  • You cannot use a Socket returned from Accept to accept any additional connections from the connection queue.
  • You cannot call Accept, Bind, Connect, Listen, Receive, or Send if the socket has been closed.
  • You must call Bind before you can call the Listen method.
  • You must call Listen before calling Accept.
  • Connect(string host, int port) can only be called if addressFamily is either InterNetwork or InterNetworkV6.
  • Connect cannot be called if Listen has been called.
  • You have to either call Connect or Accept before Sending and Receiving data.
  • If the socket has been previously disconnected, then you cannot use Connect to restore the connection.

You may use as the basis of your proof the concepts that were discussed in previous Q.E.D. articles, including:

Please post responses in the comments, email them to mperry@adventuresinsoftware.com, or post links on your own blog. I will share my solution next week.



Better solution

The solution left the cache too constraining. While it was correct, it was too tightly locked. All users of the class will line up behind one another, whether they are after the same data or not.

To resolve this problem, let's move the lock. Instead of locking on the cache while fetching the value, let's lock on the key. Two clients accessing the cache with the same key will line up, but clients with different keys will not.

But we can't lock on the key itself. We need to lock on something with identity. The identity of the key is not important. A client can create a new instance of the key having the same value, and that should qualify as the same key. So we have to look elsewhere.

Fortunately, the value of the key is used to find an object in the dictionary. That object does have identity. So we move the lock there.

public class CachedObject<TValue>
{
    private TValue _value;
    private bool _fetched = false;
    public DateTime DateCached { get; private set; }

    public CachedObject(DateTime dateCached)
    {
        DateCached = dateCached;
    }

    public TValue Value
    {
        get { return _value; }
        set { _value = value; _fetched = true; }
    }

    public bool Fetched
    {
        get { return _fetched; }
    }
}

public class Cache<TKey, TValue>
{
    private TimeSpan _expiry;
    private Dictionary<TKey, CachedObject<TValue>> _store =
        new Dictionary<TKey, CachedObject<TValue>>();

    public Cache(TimeSpan expiry)
    {
        _expiry = expiry;
    }

    public TValue Get(TKey key, Func<TKey, TValue> fetch)
    {
        CachedObject<TValue> found;

        lock (_store)
        {
            // If the object is not in the cache, or it is expired,
            // fetch it and add it to the cache.
            DateTime now = DateTime.Now;
            if (_store.TryGetValue(key, out found) &&
                found.DateCached + _expiry > now)
            {
                return found.Value;
            }

            found = new CachedObject<TValue>(now);
            _store.Add(key, found);
        }

        lock (found)
        {
            if (!found.Fetched)
                found.Value = fetch(key);
            return found.Value;
        }
    }
}

The Get method locks first on the cache. This protects the data structure, and ensures that we get just one instance of CachedObject. Once it has that, it locks on the found CachedObject so that clients after the same key line up.

Inside the lock, it checks to see whether the value has already been fetched. If not, this thread is the first in line, so it fetches the value. But if it has already been fetched, then the thread was not the first in line and can just use the fetched object.

The above is not a formal proof, and has many logical holes. Hopefully we'll have time to shore it up, and possibly find bugs in the code along the way. This is the kind of code that definitely needs to be proven. Sure, unit testing can give us some confidence, but no amount of unit testing can verify that this code is completely thread safe.



Cache

Given this source code for a cache, rewrite the API to prove that a race condition cannot occur:

public class CachedObject<TValue>
{
    public TValue Value { get; private set; }
    public DateTime DateCached { get; private set; }

    public CachedObject(TValue value, DateTime dateCached)
    {
        Value = value;
        DateCached = dateCached;
    }
}

public class Cache<TKey, TValue>
{
    private TimeSpan _expiry;
    private Dictionary<TKey, CachedObject<TValue>> _store =
        new Dictionary<TKey, CachedObject<TValue>>();

    public Cache(TimeSpan expiry)
    {
        _expiry = expiry;
    }

    public TValue Get(TKey key)
    {
        CachedObject<TValue> found;

        // If the object is in the cache, check the date.
        if (_store.TryGetValue(key, out found) &&
            found.DateCached + _expiry < DateTime.Now)
        {
            return found.Value;
        }
        else
        {
            return default(TValue);
        }
    }

    public void Put(TKey key, TValue value)
    {
        _store.Add(key, new CachedObject<TValue>(value, DateTime.Now));
    }
}

The caller has to use it properly for it to work. They have to call Get before Put (prerequisite), and they have to synchronize around the two calls. If they don't, they run the risk of creating a race condition.

A race condition occurs when the outcome of an operation depends upon other operations happening in parallel. If one thread gets finished first, there is one result. If another thread finishes first, there is another. In this case, two threads could try to Get the same value. Finding none, they can both race to fetch the value and put it in the cache.

Rewrite this API to prove that a race condition cannot occur.



Fixing an unhelpful API

Prove the correctness of a service. Here is how it started:

public void DoWorkWithOrderService()
{
    using (ITransaction currentTransaction = new AcidTransaction(_container))
    {
        OrderService.Transaction = currentTransaction;
        OrderService.DoWork();
        currentTransaction.Commit();
    }
}

It is only possible to prove the correctness of this code based on the caller (seen here), and not the order service itself. If the caller forgets to initialize the transaction property, if the caller forgets to create the transaction in a using statement, or if the caller forgets to create a new instance per thread, then bad things can happen.

Unhelpful_API The order service is an unhelpful API. When you approach it, you have to use caution. If you use it in the wrong way, bad things will happen. If you're lucky, it will throw an exception. If you are unlucky, you won't catch errors until you release to production.

An unhelpful API must be approached with fear.

Parameters are prerequisites
The first step in improving this unhelpful API is to ensure that a transaction exists before a service method is called. We can do this by turning the property into a parameter.

Before:

public class Service : IService
{
    public ITransaction Transaction
    {
        set;
        get;
    }
    
    public void DoWork() { }
}

After:

public class ServiceFactory : IServiceFactory
{
    public IService CreateService(ITransaction transaction)
    {
        return new Service(transaction);
    }
}

public class Service : IService
{
    private ITransaction _transaction;
    
    public Service(ITransaction transaction)
    {
        _transaction = transaction;
    }
    
    public void DoWork() { }
}

You cannot get a service from the factory without providing a transaction. Therefore, you cannot make any service calls before providing a transaction. Q.E.D.

Callbacks are prerequisites
We have proven that a transaction is provided before a service method is called, but we haven't yet proven that the transaction gets disposed. We can take one more step toward making this a helpful, provable API using a callback.

public class ServiceFactory : IServiceFactory
{
    private ITransactionFactory _transactionFactory;

    public void CallService(Action<IService> action)
    {
        using (ITransaction transaction =
            _transactionFactory.CreateTransaction())
        {
            action(new Service(transaction));
        }
    }
}

A transaction factory is injected into the service factory (via constructor not shown here), and that is used to create a transaction as needed. That creation happens within a using statement in the service factory, not in the calling code. There is no way for the caller to forget to do this. Thus we have proven that the transaction gets disposed.

In these two steps, we have transformed an unhelpful API into a helpful one. The unhelpful API offered many options, most of them wrong. The helpful API offers only the options that are correct. It guides the caller in its proper use. It cannot be used incorrectly.

An unhelpful API cannot be proven. It must be approached with fear. A helpful API can be proven. It can be approached with confidence.



Service transaction

Below is the source code for a service. Prove that:

  • The service has a transaction when it is called.
  • The transaction is disposed.
  • The code is thread safe.
public Order GetOrder(int orderId)
{
   return InvokeGetOrder(() => OrderServiceProvider.GetOrderWithItems(orderId));
}

public Order InvokeGetOrder(Func<Order> function)
{
    Order order;
    using (IBusinessTransactionContext currentBusinessTransactionContext =
        new AcidBusinessTransactionContext(_container, false))
    {
        OrderServiceProvider.BusinessTransactionContext =
            currentBusinessTransactionContext;

        order = function();
        OrderServiceProvider.BusinessTransactionContext.Commit();
    }

    return order;
}


Correct the induction problem

Let's take another run at proving that a watch movement with 20 parts has 19 connections.

SixGears It is difficult to prove that any movement can be constructed one gear at a time by making just one connection with an existing gear. And here is a counter-example to demonstrate the problem.

A ring of six gears, all the same size, is a movement with one degree of freedom. If you construct this movement one gear at a time, you will connect the final gear to two prior components, not to just one. This movement has 6 gears and 6 connections, not the 5 that we want our proof to show.

Something is obviously wrong here. This movement is not what is intended by the problem, but it is not explicitly prohibited either. We need to change the problem:

Given: A movement with 20 moving parts having one degree of freedom and no redundant connections.

Prove: There are 19 connections among those parts.

Now our problem is precise enough. While it may seem wrong to change a problem in order to solve it, this is something that happens all the time in software. The requirements are never stated precisely. We always have to go back and ask questions. In the process, we refine the definition of the problem until we agree on something that can be implemented unambiguously.

Find a new induction
After changing our problem to make it provable, we still find that our original approach is difficult. In order to prove that any valid movement can be built on part at a time, we have to prove that there is always at least one component connected to only one neighbor. We have to prove the existence of a leaf.

While this is not a terribly difficult proof, there is an easier way. We could choose a less restrictive induction. Rather than assuming that we can build a movement one part at a time, we only need to know that we can build any movement from two smaller movements. This satisfies the requirements of inductive reasoning.

So first, to prove that we can construct any movement from two smaller movements, we just need to select any connection within the movement. Now break this connection. One of two things will happen:

  • You are left with one movement, in which case the connection was redundant. The larger movement was invalid and therefore not covered by the proof.
  • You are left with two movements, each with one degree of freedom. Since we broke only one connection, the others remain intact. Assuming the larger movement had no redundant connections, neither do the smaller ones.

So the process of building a movement out of smaller movements is valid. Now we can write our proof:

  1. Let Mj+k be the movement formed by connecting Mj and Mk.
  2. g(Mj+k) = g(Mj) + g(Mk), because we combined two sets of gears.
  3. c(Mj+k) = c(Mj) + c(Mk) + 1, because we added only one new connection.
  4. c(Mj+k) = g(Mj)-1 + g(Mk)-1 + 1, by applying the assumption of the induction.
  5. c(Mj+k) = g(Mj) + g(Mk) - 1, by association.
  6. c(Mj+k) = g(Mj+k) - 1, by combining 2 and 5.

Q.E.D.


Inductive reasoning

The problem was: Given: A movement with 20 moving parts having one degree of freedom. Prove: There are ___ connections among those parts. The answer, of course, is 19. But the real exercise is in the proof. This problem is solvable using a technique called inductive reasoning. Inductive reasoning is proving a statement in two parts. First, you prove it for a trivial case. Then, you prove that if it's true for a smaller case, it must be true for a larger case. State the problemTo begin, we need to define some terms to help us reason about the problem. We'll define the terms M, g(M), and c(M) like this:

  • Let M be a movement with one degree of freedom.
  • Let g(M) be the number of gears in the movement.
  • Let c(M) be the number of connections in the movement.

Using these terms, we can state the problem as:

  • Proove that c(M) = g(M)-1.

Start with the trivialThe trivial case is a movement having a single gear. This movement has no connections, since there are no other gears to which it can connect. Written using our new terms:

  • Let M0 be a movement having only one gear.
  • g(M0) = 1.
  • c(M0) = 0.
  • c(M0) = g(M0) - 1.

Build upwardNow if I have a movement Mj, what happens when I add another gear to it? I end up with a movement Mj+1. I want to prove that the connections in Mj+1 are one fewer than the gears. Inductive reasoning lets me assume that this is true for Mj.

  1. c(Mj) = g(Mj) - 1.
  2. g(Mj+1) = g(Mj) + 1, because I added a gear.
  3. c(Mj+1) = c(Mj) + 1, because I connected that gear to exactly one previous gear.
  4. c(Mj+1) = g(Mj) - 1 + 1, by combining steps 1 and 3.
  5. c(Mj+1) = g(Mj) + 1 - 1, by commuting step 4.
  6. c(Mj+1) = g(Mj+1) - 1, by combining steps 2 and 5.

Q.E.D. Not so fastI was imprecise in my reasoning in a few places. While this still led to the correct answer in this case, it might have led to something completely wrong in a different problem. Inductive reasoning requires that you prove that the steps you take to get from a smaller problem to a bigger one will cover all possible problems. You have to prove that you can build a movement one part at a time by connecting that gear to exactly one gear of the previous movement. While this is true, it is by no means obvious. If we can't prove that our induction can build any movement, then we can't use it to prove statements about all movements. What about those that can't be assembled one part at a time? What we need is a different induction, one that more obviously covers all movements.



Watch movement

In Q.E.D. Hour, we will be proving things about our code. To get the group started in thinking about proof, I assigned some homework. It has to do with a watch. The mechanism inside of a wristwatch is known as a movement. It has many moving parts, but all of those parts work together in unison. It measures only one thing. It has only one degree of freedom. The secret to why the movement works is not in the parts. It is in the connections between the parts, where a connection is the point at which two moving parts touch. Which brings us to our proof. Given: A movement with 20 moving parts having one degree of freedom. Prove: There are ___ connections among those parts. Fill in the blank. I'm sure you already know the answer. But the challenge is to prove it. This is representative of the types of proofs that we will write in Q.E.D. Hour.