How long is this going to take? - The Art & Science of Software Estimation

Déjà vu? It's mostly a dreaded question. Do you remember the last time your project manager asked you to estimate the time frame for a project you knew very little about, that had very few written specifications, and whose stakeholders kept changing their minds about the feature set? How did you go about estimating the time frame? Was it a ballpark estimate, an educated guess or a WAG? How close was it to reality when the software actually got developed? I’ll discuss these questions and share some of my experiences in this article.

In the January 2007 issue of MSDN Magazine, James Waletzky discussed estimation from a SWAG (Silly Wild Ass Guess) perspective and then talked about Wideband Delphi for estimation. Rapid estimates are a reality of today’s fast-paced development environments, especially in vertical markets where software development is a means to an end and not the core business of the organization. What would be the most appropriate method for estimating the time frame, one that is more effective than guessing the number of M&M’s in a jar at a Rubio’s and less time-consuming than a function point estimation via the Matson, Barnett, and Mellichamp model, E = 585.7 + 15.12 FP? Let’s see what popular methodologies have to offer.
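
For a sense of scale, the function point model quoted above is a single linear equation once the function point count is known; counting the function points is the expensive part. A minimal sketch in Python follows. The function name is made up for illustration, the count of 100 is an arbitrary example, and the unit of E is whatever the original model defines, which this article does not state.

```python
def mbm_effort(function_points: float) -> float:
    """Effort E per the Matson, Barnett, and Mellichamp model quoted above.

    E = 585.7 + 15.12 * FP. The unit of E is not stated in this article,
    so the result is left unitless here.
    """
    return 585.7 + 15.12 * function_points


# Arbitrary example: a system counted at 100 function points.
print(round(mbm_effort(100), 1))
```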

In principle, all software estimation methods take something like Gödel's incompleteness theorems as their fundamental axiom, i.e. the model of a system is never perfect. Traditional software engineering approaches to estimation include decomposition techniques such as LOC (lines of code) based estimates, function point based estimates, process-based estimates and use case oriented estimates. These techniques essentially quantify the task breakdown via different methodologies and then estimate the time based on how long a single unit would take. Sounds like common sense, right? The empirical estimation models include function-oriented metrics and the COCOMO model with its variations. Aside from these heavyweight processes, Agile methodology describes the estimation process as follows.

  • Each user scenario is considered separately for estimation purposes.
  • The scenario is decomposed into the set of functions and engineering tasks required to develop them.
  • Each task is estimated separately, based on historical data, an empirical model, or “experience”.
  • Estimates for each task are summed to obtain an estimate for the scenario.
  • The effort estimates for all scenarios are summed for a given increment to obtain the increment estimate.
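
The arithmetic behind these five steps is just two levels of summation. As a rough sketch in Python (the scenario names, task names and person-day figures below are invented purely for illustration):

```python
# Hypothetical increment: scenario -> {task: estimate in person-days}.
increment = {
    "User resets password": {"reset form": 1.0, "token email": 1.5, "audit log": 0.5},
    "Admin exports report": {"report query": 2.0, "CSV writer": 1.0},
}


def scenario_estimate(tasks: dict[str, float]) -> float:
    """Sum the per-task estimates to get the estimate for one scenario."""
    return sum(tasks.values())


def increment_estimate(scenarios: dict[str, dict[str, float]]) -> float:
    """Sum the scenario estimates to get the estimate for the increment."""
    return sum(scenario_estimate(tasks) for tasks in scenarios.values())


for name, tasks in increment.items():
    print(f"{name}: {scenario_estimate(tasks)} person-days")
print(f"Increment: {increment_estimate(increment)} person-days")
```

The per-task numbers are where the real work lies; they come from historical data, an empirical model or experience, as the list says.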

This sounds more realistic: a five-step process and you are done. As Moore et al. put it, “Our preliminary results suggest that complicated methods may not necessarily yield a more accurate estimate, particularly when developers can incorporate their own intuition into the estimate”, or simply put, fancier estimation methods won’t necessarily make you more accurate.

In order to determine the rough order of magnitude (ROM), i.e. a rough estimate of the number of person-days to complete the task, I prefer the agile way. A typical workflow would be as follows.

Identify: Go through the specs you have (purpose statement, business requirements document or any other document) that identify what needs to be done. Identify how you would do it within the scope and boundaries of the current system.

Divide and Factor: With a rough sketch of the identified tasks, factor out the similarities and the components that can be shared across the board. Split the tasks into smaller pieces until they reach ‘sane atomicity’. Try to think in a service context, so that each responsibility is centralized instead of shared across components; it helps the bottom line.

Raise the Unknown: Always identify an unknown as a risk; it helps. If you don’t know whether the new mail server can handle bounce-back processing for your application, or you have not yet evaluated the performance of the ISO format file parser library, make a note of it.

Estimate: Take an educated guess at how long it would take you to complete the tasks. If you are working as a team, ask fellow developers for their estimates, keeping their respective skill sets in mind. Most likely they would be the ones working on some of the components too. Find the median, pad it to include the process factors (documentation, release notes, collaboration delays) and there is your ROM guesstimate.

Repeat: As the tasks become clearer, iterate through the steps again to improve accuracy and effectiveness.
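
The Estimate step above boils down to a median plus padding. Here is a minimal sketch in Python, assuming the padding is applied as a flat percentage on top of the median; the 25% default, the function name and the team figures are my own placeholders, not a recommendation.

```python
from statistics import median


def rom_estimate(estimates_days, process_padding=0.25):
    """Median of the team's estimates, padded for process factors.

    process_padding is an assumed flat fraction covering documentation,
    release notes and collaboration delays; tune it to your own history.
    """
    if not estimates_days:
        raise ValueError("need at least one estimate")
    return median(estimates_days) * (1 + process_padding)


# Hypothetical person-day estimates from four developers for the same work.
team_estimates = [8, 10, 13, 21]
print(rom_estimate(team_estimates))
```

The median is less sensitive than the mean to a single very optimistic or very pessimistic developer, which is handy when the team is small.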

This process does not take too long; depending on the size of the project it could take less than an hour, and it’s much better than “hmmm, there are 1000 M&M’s in this bottle”. It is more like “In one square inch of this jar, I can see almost 20 chocolate goodies. Given the curve and integrating below it, I think it would have…”; you get the idea. You can recalibrate it in future iterations, which is a good learning exercise in estimation. The estimate will get more concrete as your specifications do; as they say, requirements are like water. They're easier to build on when they're frozen.

“I love deadlines. I like the whooshing sound they make as they fly by.”
-- Douglas Adams

References and Further Readings