About me

Michael L Perry

Improving Enterprises

Principal Consultant

@michaellperry

Degrees of freedom

A mathematical model of a problem is written with equations and unknowns. You can think of the number of unknowns as the number of dimensions in the problem. If there are two, the problem can be represented on a plane. With three, it floats in space. Usually there are more, which gives a multi-dimensional volume that is harder to visualize.

The equations slice through the volume. One equation with three unknowns makes a two-dimensional sheet that flaps in the volume. Two equations with three unknowns form a thread that curves through the system. The thread is not necessarily a straight line; it can swoop through in any of the three directions like a roller coaster. But if you were on the thread you could only walk forward or backward, so the thread is really one-dimensional.

The number of unknowns minus the number of equations gives us the degrees of freedom. So that three-dimensional roller coaster with two equations has only one degree of freedom. If you were to start at the beginning and roll along the track, you could number each point as a distance from the start. That distance will have little to do with the straight-line proximity to the starting point, but since you can't leave the track it doesn't really matter.

The equations that keep us on the track are constraints. If we have more equations than unknowns, the problem is over-constrained. If we have fewer, the problem is under-constrained and we have some degrees of freedom. If they are equal, then we have a Goldilocks problem: it's just right. We can usually find the one right answer to a Goldilocks problem. But software is not about finding one answer; it's about getting work done. And to get work done we need a little wiggle room. We need degrees of freedom.

Find the degrees of freedom

To identify the degrees of freedom in software, start by defining the unknowns. These are usually pretty simple to spot: they are the things that can change. In a checkbook program, for example, each transaction amount is an unknown, as are the account balance and the color used to display it (black or red).

Next, define the equations. These are the relationships between the unknowns that the software has to enforce. In the checkbook, the balance is the sum of all transaction amounts. And the color is red if the balance is negative or black otherwise.

Subtract to find your degrees of freedom. One amount per transaction (n), one balance, and one color gives n + 2 unknowns. The balance sum and the color rule give us two equations. That leaves n + 2 - 2 = n degrees of freedom, one per transaction.
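
The subtraction can be sketched in a few lines of Python. The function name degrees_of_freedom and the sample count of five transactions are assumptions made for illustration; they do not come from any real checkbook program.

```python
def degrees_of_freedom(transaction_count: int) -> int:
    """Count unknowns minus equations for the checkbook example."""
    # Unknowns: one amount per transaction, plus the balance and the color.
    unknowns = transaction_count + 2
    # Equations: the balance sum and the color rule.
    equations = 2
    return unknowns - equations


# With five transactions (an arbitrary sample), five degrees of freedom remain.
print(degrees_of_freedom(5))
```

However many transactions there are, the balance and the color never add a degree of freedom, because each one arrives with its own equation.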

Store the degrees of freedom

Each degree of freedom represents something that the user can do. A user can enter transactions, but a user cannot directly edit the balance or the color. We need to store the user's direct inputs -- the degrees of freedom -- but not the related values. Those can be calculated when needed.

Calculate the rest

When the program needs a value that the user does not directly control, it should calculate it on the fly. The alternative is to store that value and adjust it whenever the user makes a relevant change. Sprinkle enough of these triggers through the system and it becomes a Rube Goldberg machine of cascading events. This is how software rots.
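
Here is a minimal sketch of that idea, assuming Python and a hypothetical Checkbook class. The class, its method names, and the sample amounts are all invented for illustration. Only the transaction amounts are stored; the balance and the color are computed each time they are read.

```python
from decimal import Decimal


class Checkbook:
    """Stores only the user's direct inputs: the transaction amounts."""

    def __init__(self) -> None:
        self._amounts: list[Decimal] = []

    def add_transaction(self, amount: Decimal) -> None:
        # The only stored state; each amount is one degree of freedom.
        self._amounts.append(amount)

    @property
    def balance(self) -> Decimal:
        # Equation 1: the balance is the sum of all transaction amounts.
        return sum(self._amounts, Decimal("0"))

    @property
    def color(self) -> str:
        # Equation 2: red if the balance is negative, black otherwise.
        return "red" if self.balance < 0 else "black"


book = Checkbook()
book.add_transaction(Decimal("100.00"))
book.add_transaction(Decimal("-125.50"))
print(book.balance, book.color)
```

There is no setter for balance or color, so no code path can leave them out of step with the transactions.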

Sometimes you want to store a calculated value for optimization. You might store an account balance if it needs to be read often, or if the number of transactions grows extremely large. But every time you make this kind of decision, you run the risk of inconsistency. Caches need to be invalidated, and changes accumulated. You might even choose to provide a "Refresh" button to spank the software and force it to reconcile. Be aware of every line of this extra code, as it is often the source of bugs.
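
To make the cost concrete, here is a sketch of a cached balance, again assuming Python and a hypothetical class named CachedCheckbook. It is one possible way to invalidate a cache, not a recommended implementation. Notice the extra field and the extra line in every method that changes the transactions.

```python
from decimal import Decimal
from typing import Optional


class CachedCheckbook:
    """Checkbook that caches its balance and invalidates it on change."""

    def __init__(self) -> None:
        self._amounts: list[Decimal] = []
        # None means the cached value is stale and must be recomputed.
        self._cached_balance: Optional[Decimal] = None

    def add_transaction(self, amount: Decimal) -> None:
        self._amounts.append(amount)
        # Every mutation has to remember this line. Forgetting it in one
        # place is enough to show the user a wrong balance.
        self._cached_balance = None

    @property
    def balance(self) -> Decimal:
        if self._cached_balance is None:
            # Recompute from the stored degrees of freedom.
            self._cached_balance = sum(self._amounts, Decimal("0"))
        return self._cached_balance
```

The stored amounts are still the only source of truth. The cache is a disposable copy that can always be thrown away and rebuilt from them.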

Understanding the degrees of freedom in the software helps to create a maintainable design. You know what is under the user's direct control, and what is affected indirectly. You know which values must be stored, and which must be calculated. And when you add new features in the future, you can add new degrees of freedom without affecting the previous ones.



1.61803398874989...

Is there a golden rule that you follow to determine whether the total degrees of freedom calculated is too large, i.e., too few equations for the given number of unknowns?

Also, is the quality of software design manifested in the degrees of freedom? Or is the degree of freedom amoral?

Sorry, I'm a little slow on this subject, and until your blog introduces the use of sock puppets to explain the use of mathematical equations to predict the quality of software design, text on a page will have to do. :)

The perfect number

1.618... sounds like the right number. :) I once read a book about the golden ratio. It has a fascinating history, but also a staggering amount of misconception. My family looked at me funny when I took the book to the pool to finish it.

But I digress. The true perfect number of degrees of freedom in your solution is ... the degrees of freedom in your problem.

If your solution has fewer degrees of freedom than the problem it's trying to model, then there are scenarios that it can't handle. Your customer will say that your software is not done.

If your solution has more, then it allows invalid states. There are ways that your software behaves that have no analog in the problem domain. Your customer will file each invalid state as a bug.

We usually don't ship software with too few degrees of freedom. We usually err on the other side. I would include the number of extraneous degrees of freedom among the many indicators of software quality.
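
As a rough illustration of an extraneous degree of freedom, consider this Python sketch. The OverFreeCheckbook class is hypothetical and deliberately flawed: it stores the balance as an independent field, so the solution has one more degree of freedom than the problem does.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class OverFreeCheckbook:
    """Hypothetical design with one extraneous degree of freedom."""

    amounts: list[Decimal] = field(default_factory=list)
    # Stored independently, so it can disagree with the amounts.
    balance: Decimal = Decimal("0")

    def is_consistent(self) -> bool:
        # The equation this design fails to enforce.
        return self.balance == sum(self.amounts, Decimal("0"))


book = OverFreeCheckbook()
book.amounts.append(Decimal("40"))  # the balance was not updated
print(book.is_consistent())  # False: a state with no analog in the problem
```

A real checkbook can never have a balance that disagrees with its transactions, so every such state is a bug waiting to be filed.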

I'm working on my Euler sock puppet right now. Video to come. :)