I had a conversation with someone who tracks their team’s velocity in a way that seemed illogical to me. Discussions on velocity often turn into a heated debate on how *exactly* you should calculate it. And it’s funny, because it *feels* like it shouldn’t be that difficult.

That same person summarized velocity very elegantly to me:

Velocity is the distance covered per unit of time. In that sense, a story point is an estimated ‘distance’.

Actually, when I think about it now, the summary is even more beautiful than I thought at the time, because you can extend the metaphor.

Suppose we didn’t have a standardized measure, or any **words** for distances. None. You couldn’t say kilometers. You couldn’t say meters. It would immediately trigger a Babylonian confusion. Now, all you know is a number of cities you have to visit, while walking to Rome.

What do you do? The agile way is: You invent a word for distance. Say, quacks. You estimate the distance in quacks to the first, second, and third city. And once you’ve estimated, you start walking. Once you reach the first city, you look back on your journey. How many quacks did you do in a day? Now, you look at the estimated quacks between city one and two (and the others). When do you expect to arrive at city two? You continue your way, onward to the second city. Of course, the work we do isn’t always sequential and often runs in parallel, but let’s stop here. Because today’s conversation was about how to update your future travel plan once you’ve encountered a detour.

## Unexpected work messes with your plan

Halfway to city number three, you encounter something that needs to be dealt with **now**. You have to stop walking and focus your attention elsewhere. You’re not making progress on reaching city three. When you finally get to city three, you’ve spent many, many more days than you expected to.

How does that affect your planning for the next N cities? How likely is it that you’ll encounter more detours, or a broken shoe lace? Not only that, what if you know, from experience, that you’ll have some distractions along the way that have nothing to do with the journey itself? Like answering a phone call.

Let’s come to the core of what the conversation was about.

My conversation partner told me that his team applies a *fixed* factor of 0.9 to convert from velocity to capacity. (And then one more factor to take into account absence of team members.) Furthermore, if unexpected work comes in, they estimate it in story points and **add this to their velocity**. However, this would only extend to *certain types* of unexpected work. Unexpected work of type A should be budgeted for, unexpected work of type B should not.

Basically, during their sprint, they count the story work in points, but also unexpected work in points. They throw that on a single heap they call *velocity* and then when they are asked to give their capacity, they apply a *fixed factor* to sieve out their ‘feature velocity’ from their aggregated metric.

Now, I have a strong feeling that applying this does not do *anything* for the reliability of your velocity, especially since the fixed factor does not seem to be given by measuring the amount of unexpected work that came in. But, I hadn’t done the math yet. So this blog post is all about doing the math behind this exercise.

Today’s question:

Given that velocity is a measure of how many story points you can complete in an iteration, does including the unexpected work in your velocity, and then applying a constant factor to your capacity to account for unexpected work, make your velocity more reliable?

This Medium post lists multiple ways to deal with unexpected work and suggests that planning a buffer would make your forecasts less reliable. Alternatives are listed, as well as the remark that planning a buffer can be an economical way to deal with unexpected work.

## Two methods of calculating velocity and capacity (maths!)

Here we go! Suppose we have a team that has a fixed, known velocity \(V\) (points per sprint). They are 100% predictable: Every single sprint they will complete exactly \(V\) story points. Suppose we could clone this team, so that we can run them side by side on the same story. So we have identical teams, Team 1 and Team 2, both working at exactly the same speed. Not only that, the team members never take a holiday, so capacity is always at 100%. This is our controlled environment, in which we will test two different methods of tracking velocity. Note: These teams don’t actually **know** their own velocity yet. They’ve never tracked it and are going to track their own velocity for the first time.

Let \(S_n\) be the story points completed in sprint \(n\). These are points for the work on the sprint backlog that the team committed to in advance. Also sometimes called ‘feature work’. So, this is the stuff that you’ve determined should be at the top of your backlog.

**Team 1** calculates their velocity after \(n\) sprints as follows

\( V^1_n = \frac{\sum_{i=1}^{n} S_i}{n} \)

and their capacity for sprint \(n+1\) as follows

\( C^1_{n+1} = V^1_{n} \)

**Team 2** calculates their velocity after \(n\) sprints as follows

\( V^2_n = \frac{\sum_{i=1}^{n} (S_i + RU_i)}{n} \)

They have an extra parameter: \(RU_n\), which is the unexpected or unplanned work that came in during sprint \(n\). Note that this is of a type that they have *reserved time for*. (They ignore the work that does not fit in the predefined bin.) Their capacity for sprint \(n+1\) is

\( C^2_{n+1} = V^2_{n} \cdot (1 - RF) \)

Also here, another extra parameter: \(RF\) is their reservation factor. This is a percentage of their time they set aside for some forms of unexpected work. (In the discussion we had, this was set to 10% or 0.1)
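To make the two bookkeeping methods concrete, here is a minimal Python sketch. The function names and the sprint numbers are made up for illustration; only the formulas come from the definitions above.

```python
from statistics import mean

def velocity_team1(story_points):
    """Team 1: average story points completed per sprint."""
    return mean(story_points)

def velocity_team2(story_points, reserved_unplanned):
    """Team 2: average of story points plus reserved unplanned points per sprint."""
    return mean([s + ru for s, ru in zip(story_points, reserved_unplanned)])

def capacity_team1(velocity):
    """Team 1 plans the next sprint at its tracked velocity."""
    return velocity

def capacity_team2(velocity, reservation_factor=0.1):
    """Team 2 holds back a fixed share of its tracked velocity."""
    return velocity * (1 - reservation_factor)

# Hypothetical history of three sprints (made-up numbers).
S = [16, 14, 18]   # story points completed per sprint
RU = [2, 4, 0]     # reserved unplanned points per sprint

v1 = velocity_team1(S)
v2 = velocity_team2(S, RU)
print(v1, capacity_team1(v1))
print(v2, capacity_team2(v2))
```

Note that the two teams did exactly the same work here, yet they report different velocities and different capacities.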

## Approach

Suppose we have a feature worth \(X\) points. Suppose that these are perfectly estimated points: we have a team that is perfect in every way, in estimating and execution. Furthermore, we’re going to bug this team with unexpected work. We’ll bug them with \(RU\) and \(NU\): the total reserved unplanned work and the total nonreserved unplanned work, both in points. We’ll spread these interruptions across their sprints.

How long will the feature take to complete? Well, we know the teams’ true velocity (they always complete exactly the same work), we know exactly how much interruption we’ll be giving them. So, it will take the two identical teams

\(f = \frac{X+RU+NU}{V}\)

sprints to finish the work. (Assuming all the unexpected work comes in before the story is finished, no ‘trailing’ unexpected work is counted.)
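As a sanity check on this formula, here is a small sprint-by-sprint simulation in Python. It is only a sketch: the `simulate` helper and all numbers are hypothetical, and it assumes the same amount of interruption arrives in every sprint.

```python
def simulate(X, V, ru_per_sprint, nu_per_sprint):
    """Run sprints until X story points are done.

    Returns one (S, RU, NU) tuple per sprint. Assumes the same amount of
    interruption arrives in every sprint (an assumption of this sketch).
    """
    story_per_sprint = V - ru_per_sprint - nu_per_sprint
    if story_per_sprint <= 0:
        raise ValueError("interruptions consume the whole sprint")
    sprints = []
    remaining = X
    while remaining > 0:
        done = min(story_per_sprint, remaining)
        sprints.append((done, ru_per_sprint, nu_per_sprint))
        remaining -= done
    return sprints

# Hypothetical numbers, chosen so the feature ends exactly on a sprint boundary.
X, V = 100, 25
sprints = simulate(X, V, ru_per_sprint=2, nu_per_sprint=3)
RU = sum(ru for _, ru, _ in sprints)
NU = sum(nu for _, _, nu in sprints)
f = (X + RU + NU) / V
print(len(sprints), f)
```

With these numbers the simulated sprint count and the formula agree: five sprints either way.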

But, the team doesn’t **know** this yet. They haven’t tracked their velocity yet. And each of them has a different method to track their velocity.

So, how well do the two methods predict \(f\)? What we can do is feed each of these formulas **back into itself**. So, first we obtain the velocity and capacity that would result after completion of the feature. And then we check how well they compare to \(f\).

## Team 1 velocity

After completion of the feature, team 1’s self-tracked velocity is

\(V^1_f = \frac{\sum_{i=1}^{f} S_i}{f}\)

And, because these conditions are perfectly controlled by us, we know the value of \(f\), and we know that the story points \(S_i\) add up to \(X\). Therefore

\(V^1_f = \frac{X}{\frac{X+RU+NU}{V}} = \frac{VX}{X+RU+NU}\)

## Team 2 velocity

Similarly, team 2’s self-tracked velocity is

\(V^2_f = \frac{\sum_{i=1}^{f} (S_i + RU_i)}{f}\)

And, again, we know the scenario that this team went through, so we know what it is they measured:

\(V^2_f = \frac{X+RU}{\frac{X+RU+ NU }{V}} = \frac{V(X+RU)}{X+RU+ NU }\)
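A quick numeric check of both closed forms, as a Python sketch. The scenario (five identical sprints, true velocity 25) is made up for illustration.

```python
from math import isclose

# Hypothetical scenario (made-up numbers): 5 identical sprints.
V = 25                      # true velocity
S = [20] * 5                # story points per sprint
RU = [2] * 5                # reserved unplanned points per sprint
NU = [3] * 5                # nonreserved unplanned points per sprint
X, RU_total, NU_total = sum(S), sum(RU), sum(NU)
f = len(S)

# What each team measures by averaging its own sprint records.
v1_tracked = sum(S) / f
v2_tracked = sum(s + ru for s, ru in zip(S, RU)) / f

# The closed forms derived above.
v1_closed = V * X / (X + RU_total + NU_total)
v2_closed = V * (X + RU_total) / (X + RU_total + NU_total)

assert isclose(v1_tracked, v1_closed)
assert isclose(v2_tracked, v2_closed)
print(v1_tracked, v2_tracked, V)
```

Both tracked velocities come out below the true velocity of 25, because neither team counts the nonreserved interruptions.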

## Comparison

Now that both teams have tracked their velocity, we can feed the same story back into their own formulas. We wipe their memory so that they don’t know they already implemented the feature, we hand them their own metrics and tell them: Here’s your own velocity and here’s a story of size \(X\). How long do you think it will take? Both teams calculate \(E\), the expected number of sprints.

First, team 1:

\(E^1 = \frac{X}{V^1_f} = \frac{X}{\frac{VX}{X+RU+NU}} = \frac{X(X+RU+NU)}{VX} = \frac{X+RU+NU}{V} = f\)

So, team 1 returns **exactly** the time that it actually took them to execute the story! This is what velocity is all about: Over the course of a number of sprints, their velocity reflects how they *actually performed*.
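The same round trip can be checked numerically. This Python sketch runs it for a few made-up scenarios; the helper names are hypothetical.

```python
from math import isclose

def actual_sprints(X, RU, NU, V):
    """Sprints the feature really takes, interruptions included."""
    return (X + RU + NU) / V

def team1_forecast(X, RU, NU, V):
    """Team 1's forecast for the same feature, using its own tracked velocity."""
    f = actual_sprints(X, RU, NU, V)
    v1 = X / f            # velocity team 1 tracked while building the feature
    return X / v1         # forecast for the same feature

# Hypothetical scenarios as (X, RU, NU, V) tuples.
scenarios = [(100, 10, 15, 25), (60, 0, 30, 20), (200, 50, 5, 30)]
results = [(team1_forecast(*s), actual_sprints(*s)) for s in scenarios]
for forecast, actual in results:
    print(forecast, actual)
```

Whatever mix of interruptions you pick, the forecast and the actual duration match up to floating-point rounding.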

Then, team 2:

They plan against their capacity, which is their velocity times \(1 - RF\):

\(E^2 = \frac{X}{V^2_f (1-RF)} = \frac{X}{\frac{V(X+RU)}{X+RU+NU} (1-RF)} = \frac{X(X+RU+NU)}{V(X+RU)(1-RF)}\)

Compared to team 1… team 2 does not return a true reflection of the work that they did. Their estimate is muddled by two extra factors, \(RU\) and \(RF\).

Under what circumstances would team 2 give an estimate that reflects their learning from previous sprints? Only if \(E^2 \equiv f\). Let’s see when that is the case:

\(E^2 \equiv f\)

Evaluate both left and right hand side

\(\frac{X(X+RU+NU)}{V(X+RU)(1-RF)} \equiv \frac{X+RU+NU}{V}\)

Divide both sides by the right hand side

\(\frac{X}{(X+RU)(1-RF)} \equiv 1\)

Solve for \(1 - RF\)

\(1 - RF \equiv \frac{X}{X+RU}\)

Again:

\(RF \equiv \frac{X+RU}{X+RU} - \frac{X}{X+RU}\)

And lastly, subtract the two fractions:

\(RF \equiv \frac{RU}{X+RU}\)

So, basically, team 2 only gives a proper estimate if \(RF\) **exactly** reflects how much reserved unexpected work actually came in relative to the ‘real work’. The way to get this ratio is, again, by tracking: You’d have to track \(RU\) constantly and update \(RF\) constantly. But I hope you’ll agree with me that this doesn’t do anything for the actual *metrics*. The only thing you would achieve is introducing a factor into your velocity that you then perfectly divide out of it again. It’s pointless…

Of course, for learning purposes, it’s good to track the unexpected work and talk about it. But the second calculation method (with the *fixed* value for \(RF\)) does not give any added benefit in terms of reliability, quite the contrary.
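To see what a fixed factor does to the forecast, here is a last Python sketch. It assumes team 2 forecasts by dividing the feature size by its capacity (tracked velocity times \(1 - RF\)); the function name and all numbers are hypothetical.

```python
def team2_forecast_error(X, RU, NU, V, RF=0.1):
    """Forecast minus actual duration, in sprints, for team 2."""
    f = (X + RU + NU) / V            # what actually happened
    v2 = (X + RU) / f                # velocity team 2 tracked
    capacity = v2 * (1 - RF)         # fixed reservation factor
    forecast = X / capacity          # assumption: plan X against capacity
    return forecast - f

# Same feature, same nonreserved interruptions, varying reserved interruptions.
X, NU, V = 100, 15, 25
errors = {RU: team2_forecast_error(X, RU, NU, V) for RU in (0, 10, 25)}
for RU, err in errors.items():
    print(RU, round(err, 2))
```

In this sketch the forecast is too pessimistic when little reserved work shows up and too optimistic when a lot shows up. The fixed factor only happens to fit when the incoming work matches it.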