IT Metrics excellence: avoiding the world turning upside-down


Regardless of how a software development project was contracted (fixed price, time and materials, etc.) or how it was managed from a methodology point of view (Agile, Waterfall, Prince2, …), in the end a software product will be created. This product, irrespective of contract or method, will have a specific cost, a size, various quality indicators, and a set of factors that help to determine whether, for example, the cost is in line with the end product.

It’s essential to distinguish clearly between the concepts of ‘product’ and ‘project’, recalling that the goal of a project is to create a product or a service, or to achieve a certain result. The difference between the two concepts is clear, but when they are looked at through metrics glasses, things can get more complicated because the same metric may be understood in different ways. The size of a project can be seen, for example, as project duration or project effort. The quality of a project can be judged by how well or badly it has been managed and whether a set of predefined or required indicators has been met. The size and the quality of a software product (created in a project) are totally different concepts, far from the project perspective.

Projects, products and metrics

We can collect dozens of project metrics, but project metrics without product metrics sometimes provide isolated indicators, lacking the most valuable and strategic information. One can determine that project estimates are coherent and tally with the actual figures. Similarly, it’s possible to establish that the project schedule was adhered to, and even that performance was better than expected. And there are many other extremely useful metrics, which tell us whether a project is adhering to a set of pre-established rules and indicators. But unfortunately, they don’t do much more than that.

However, if we add a set of apparently straightforward questions, the results can be totally different: is the planned project effort consistent with what it should be? Is the expected and agreed quality consistent with what it should be? Are the different KPIs consistent, and do they provide reliable results and conclusions?

The point is that you need to go beyond project aspects and take a wider view, considering factors that are extremely valuable and make sense in everyday, practical terms – right down to low-level project metrics and indicators, but also addressing strategic and CIO-level outcomes, plus company-wide projections (internal) and market forecasts (external). Bringing all of these aspects together on different levels is an amazing opportunity to improve the insights gained from metrics and enhance the results and conclusions. This is the case whether it applies to projects, products, financial aspects, competitive indicators affecting a company, or future strategic expectations.

There are so many indicators and KPIs that can be applied, but there are always key questions that still need addressing. Despite collecting dozens of important metrics, do we have realistic productivity, quality, product cost, responsiveness, and time-to-market metrics? Can outcomes and their causes be compared to other, previously delivered projects in different fields of technology, perhaps using different development tools or methodologies, or carried out under different contracts? Is there any way to determine why some projects perform better than others? Can we use standard, recognised methods to draw comparisons between our company and others, even on a national level, across sectors, or perhaps worldwide?

To answer these questions correctly, a number of aspects need to be addressed, the first of which is that the information has to be realistic. Are the recorded numbers correct? Sometimes the practice of rewarding (or even punishing) teams or people, or of creating rankings and distributing certain figures internally, can encourage people to ‘engineer the numbers’, a bit like photoshopping: you take a picture of someone and retouch it if the eyes or hair don’t look right. If the numbers are not correct, any derived metrics and conclusions will be wrong and will make no sense; the result is nice dashboards and appealing presentations with pie charts or graphs, but perhaps far from reality.

So once information has been recorded correctly, honestly and fairly at all levels, one of the fascinating challenges is to analyse the “why” and the “how” in order to be more competitive and to create a better product or service for the customer. Why did this set of projects result in lower productivity levels than others? How does our quality compare internally or externally with standard worldwide indicators? How can we add more value for the customer? And what if …? Those questions and their answers are key to improving, to moving to more mature estimation levels, to measuring improvements and ROI, and to providing CIOs with effective dashboards at a strategic level.

Incorrect common denominators; when the world is turned upside-down

What’s sometimes curious in the software industry is that the measurements that people capture don’t always relate to product size. Too often, product size is linked in some erroneous manner to product costs or to development resources. There isn’t necessarily a link between the size of a project and the size of a product: a project that took 5000 hours can deliver a much smaller product than another project that only took 3500 hours, even if the same quality criteria were fulfilled. This is why product size has to be considered a given. Without understanding size, other metrics such as quality only indicate factors such as the number of defects (even the defect concept can be somewhat fragile). If instead of looking at product size, we consider project size (for example the effort of delivering a project, in terms of resources), all subsequent conclusions will be wrong. What’s more, sometimes the worst projects can score best in terms of quality, simply because resource investments were higher (and projects that are expected to deliver a given functionality of the same quality typically need more time). A 5000-hour project that resulted in 25 defects is not equivalent to a 3500-hour project with 25 defects, even if both were the same in technical terms: the first one was less efficient, yet measuring defects against effort makes it look better. This is a bit like turning the world on its head.
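The effect of the denominator can be illustrated with a short calculation. The following is only a sketch: the two projects (5000 and 3500 hours, 25 defects each) come from the example above, while the product size of 400 size units and all names (`Project`, `defects_per_1000_hours`, and so on) are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    effort_hours: float   # project size: effort invested
    product_size: float   # product size, in hypothetical size units
    defects: int


def defects_per_1000_hours(p: Project) -> float:
    """Defects measured against project effort (the misleading denominator)."""
    return 1000 * p.defects / p.effort_hours


def defects_per_size_unit(p: Project) -> float:
    """Defects measured against product size."""
    return p.defects / p.product_size


def hours_per_size_unit(p: Project) -> float:
    """Effort needed to deliver one unit of product size."""
    return p.effort_hours / p.product_size


# Assumption: both projects delivered the same product of 400 size units.
a = Project("A", effort_hours=5000, product_size=400, defects=25)
b = Project("B", effort_hours=3500, product_size=400, defects=25)

for p in (a, b):
    print(
        p.name,
        round(defects_per_1000_hours(p), 2),
        round(defects_per_size_unit(p), 4),
        round(hours_per_size_unit(p), 2),
    )
```

With these assumed figures, project A looks better when defects are divided by effort (5.0 against roughly 7.14 defects per 1,000 hours), although both projects have the same defect density per size unit (0.0625) and A needed 12.5 hours per size unit against 8.75 for B.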

On other occasions, product size may be linked (incorrectly) to the number of lines of code (LOC) that had to be written or edited. Apparently, the higher the LOC number, the more productive the outcome or even the better the quality (even if some lines were unnecessary). As is the case when product size is confused with project size, advantages are wrongly attributed to programmers who create a product with 700,000 LOC over those who deliver the same product, offering the same quality, performance and security, with only 300,000 LOC. Again, quite simply, producing 10 defects in 700,000 LOC is not the same as producing 10 defects in 300,000 LOC. The first project may appear better, but from the point of view of customer functionalities, the product could be exactly the same. Obviously the same logic applies when size is linked to the number of programs, the number of processes, team size, project duration, etc. Again, the world is upside-down.

Looking at LOC aspects from this angle may seem a bit abstract, but unfortunately there is no official standard for what constitutes a line of code. Is it a physical LOC? With or without comments? Sometimes comments are helpful, but on other occasions they are unnecessary, and old code may not have been removed. Or is the LOC count about logical lines, statements, structured vs. unstructured programming, embedded code, etc.? Depending on the criteria used for LOC counts, there can be huge differences, providing completely different results. Sometimes the reason for selecting certain criteria is that they just suit a given situation, contract or derived metrics. We could write dozens of articles and produce many studies on this fascinating topic.
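How much the result depends on the counting criteria can be seen even on a tiny fragment. The sketch below is an illustration only: the C-like sample and the three counting rules (physical lines, non-blank lines, and non-blank lines that are not `//` comments) are assumptions chosen for the example, not an official or complete definition of LOC.

```python
SAMPLE = """\
// compute total
int total = 0;

// old code, never removed
// total = legacy_sum(items);
for (int i = 0; i < n; i++) { total += items[i]; }
return total;
"""


def count_loc(source: str) -> dict[str, int]:
    """Count the same source under three different (assumed) LOC criteria."""
    lines = source.splitlines()
    # Criterion 2: ignore lines that are empty or contain only whitespace.
    non_blank = [ln for ln in lines if ln.strip()]
    # Criterion 3: additionally ignore lines consisting only of a // comment.
    code_only = [ln for ln in non_blank if not ln.strip().startswith("//")]
    return {
        "physical": len(lines),
        "non_blank": len(non_blank),
        "code_only": len(code_only),
    }


print(count_loc(SAMPLE))
```

For this sample the same code counts as 7, 6 or 3 lines depending on the rule applied, so any metric that uses LOC as its denominator can shift by more than a factor of two before a single line is changed.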

High level of confidence and trust

The ultimate goal is to measure the product size using standard, recognised methods – techniques that take a global perspective and instil confidence and trust, not only internally but also when it comes to the relationship between providers and customers. With the required know-how and accreditation, different people will arrive at the same numbers by counting the size of an IT product or the functionalities provided by an IT product – and not the price paid, the effort invested in creating a product, physical code or the number of programs.

By measuring this ‘magic number’ (the size of the product in terms of the functionalities it provides), you end up with a ‘true common denominator’: a figure that can be used as a basis for other metrics (and some derived metrics can even be refined, adding other dimensions such as application criticality or the type of technology used, in order to compare like with like).
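As a rough illustration of how a common denominator supports derived metrics, the sketch below divides effort, cost and defects by product size and then groups deliveries by technology so that like is compared with like. Every figure, the `Delivery` structure and the function names are hypothetical; a real portfolio would use sizes counted with a recognised method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    name: str
    technology: str       # extra dimension used to compare like with like
    product_size: float   # common denominator, in hypothetical size units
    effort_hours: float
    cost: float
    defects: int


def derived_metrics(d: Delivery) -> dict[str, float]:
    """Metrics that all share product size as their denominator."""
    return {
        "hours_per_size_unit": d.effort_hours / d.product_size,
        "cost_per_size_unit": d.cost / d.product_size,
        "defects_per_size_unit": d.defects / d.product_size,
    }


def hours_per_size_unit_by_technology(deliveries: list[Delivery]) -> dict[str, float]:
    """Aggregate effort and size per technology before dividing."""
    hours: dict[str, float] = defaultdict(float)
    size: dict[str, float] = defaultdict(float)
    for d in deliveries:
        hours[d.technology] += d.effort_hours
        size[d.technology] += d.product_size
    return {tech: hours[tech] / size[tech] for tech in hours}


# Hypothetical portfolio, invented for this example only.
PORTFOLIO = [
    Delivery("billing", "java", 500, 5000, 400_000, 20),
    Delivery("portal", "java", 300, 3600, 270_000, 15),
    Delivery("ledger", "cobol", 400, 6000, 420_000, 10),
]

print(derived_metrics(PORTFOLIO[0]))
print(hours_per_size_unit_by_technology(PORTFOLIO))
```

Summing effort and size per group before dividing, rather than averaging per-project ratios, keeps large and small deliveries weighted by their actual size; other dimensions such as application criticality could be handled the same way.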

The functional size can be obtained using different methods, but the IFPUG method (which covers the counting of functional and even non-functional aspects) is the most consolidated and widely used worldwide (together with COSMIC and NESMA, both of them derived years ago from IFPUG), and it is ISO recognised. In addition to having the largest worldwide community, its widespread use means that there is a large number of standard indicators, benchmarks and references that provide guidance at all levels. The functional size metric is required for government software project contracts in countries such as Brazil, Italy, Japan, Malaysia and South Korea, and it is widely used around this small world.