Improving data quality is a priority for many of the organisations I speak to. As one of the three key pillars of making your data trustworthy – discussed in depth in our blog ‘3 Key Strategies to Build Data Trust’ – it has become essential to enabling data-driven decisions.
During these initial discussions on data quality, I’ve often found that there are no clear rules for the data, so a lot of time is spent upfront defining what “good” data looks like and what to measure. In most cases, quality has to be addressed without clear business expectations; in short, what “good” data quality means hasn’t been decided yet.
In this scenario, the obvious approach is to bridge that gap first: define the requirements, then look for the best way forward based on a clear understanding of the desired outcome – a ‘waterfall approach’. Whilst this is sound in principle and has many advantages, in today’s reality it can result in long requirements phases.
So how could we do this differently? Cue an alternative approach – the bottom-up approach we’ve been adopting. In this blog, I want to share our take on improving data quality quickly, and through an approach that feels more ‘logical’.
The Waterfall Methodology
Developed in the 1970s, the waterfall methodology is a popular approach in project management, leaning on five sequential phases to take a project from start to finish. It will be familiar to many in IT and other disciplines through its Requirements-Design-Implement sequence.

From a data quality perspective, its primary advantage is that it ensures all stakeholders are involved and consulted on what “good” means for the business in terms of data quality. With its focus on gathering requirements before design, this approach ensures that the resulting actions are aligned with the desired objective.

On the downside, this ‘top-down’ approach can take a long time to get going, with much planning and definition before any action starts to happen. Particularly when multiple stakeholders are involved, defining ‘what good looks like’ can result in conflicting opinions, scope uncertainty and decision-making paralysis. Following a sequential flow also generally means that results and outcomes are left until the end – extending what we like to call the ‘time-to-value’.

So how do we keep the advantages of the waterfall approach yet enable decisions to be made faster?

The Bottom-Up Approach
Following a similar sequence to the waterfall methodology, the bottom-up approach takes a different view of how requirements are gathered, designed and implemented. The key difference is the focus on starting with a small scope, delivering against it, and then broadening out – allowing you to start small, implement quickly and build the scope organically.

Taking a “data-driven” approach to requirements capture, the idea is to use data to help our stakeholders make decisions rather than presenting them with a blank sheet of paper. By analysing the current data and using the findings to inform the requirements phase, we can ask stakeholders to provide feedback on specific issues, which serves as a starting point for determining their requirements. For example, we could ask pointed questions such as the following (the short profiling sketch after this list shows how such findings might be surfaced):

- “We’ve noticed that the column for customer phone numbers has some missing values. Is this a problem? Should we address this?”
- “There are entries in this column that don’t appear to be phone numbers. Were you expecting this?”
- “There are several records with missing or incomplete order details. Is this affecting your reporting?”
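To make this concrete, below is a minimal sketch of the kind of profiling that could surface these questions. It assumes a hypothetical customers.csv file with a customer_phone column – names you would substitute with your own dataset and fields – and uses pandas, though any profiling tool would do.

```python
import pandas as pd

# A minimal profiling sketch. The file and column names below
# ("customers.csv", "customer_phone") are hypothetical placeholders –
# point these at your own dataset.
df = pd.read_csv("customers.csv")

# 1. Completeness: how many phone numbers are missing?
missing = df["customer_phone"].isna().sum()
print(f"Missing phone numbers: {missing} of {len(df)} rows")

# 2. Validity: which non-null entries don't look like phone numbers?
#    This pattern is a deliberately loose placeholder; a real check
#    should reflect the format(s) the business actually expects.
phones = df["customer_phone"].dropna().astype(str)
looks_like_phone = phones.str.match(r"^\+?[\d\s\-()]{7,15}$")
suspect = phones[~looks_like_phone]
print(f"Entries that don't look like phone numbers: {len(suspect)}")
print(suspect.head())
```

The output of a handful of checks like these is usually enough to turn a blank-sheet requirements workshop into a focused conversation about concrete issues.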