It often feels like the subject of data management is intentionally made opaque and confusing, with lots of its own terminology and language. However, in my experience it’s not actually rocket science, and the terminology is just hiding a lot of common sense that has been built up through shared experience. Instead of following the usual pattern of telling you what all these things are and why you need them, I want to take a different approach and simply try to describe the general idea in a way I hope is easier to understand.
Fundamentally, data management is not an invention; it’s not a specific thing that someone came up with one day and patented. It is simply what it says on the tin – the management of data – and the field covers a number of practices (and technologies to support those practices) that help us manage the challenges surrounding the data we have and its application in the modern world.
The Library Analogy
We’ve been managing data for centuries. Before all this recent computer-related excitement, we used to use books and it didn’t take us too long to realise that we needed a bit of organisation to ensure our information was looked after and available when we needed it. Think about how libraries work. For most aspects of a library, there’s an equivalent we need to consider for managing electronic data.
People go to the library to look for information. It’s useful if there is a way for them to easily find what they are looking for. In some cases, they have a specific book in mind, in some cases they are interested in a topic and are trying to find a good book to help them. When people talk about data cataloguing, this is often the sort of problem they are trying to solve.
Librarians (or even curators) look after the books in the library. They decide which books to include or discard, ensure the books remain in good condition, and organise them properly on the shelves. Say hello to data stewards, who are similarly tasked with the on-the-ground management of data to make sure it is understood, labelled and managed.
I’m yet to find an example where this analogy falls down (if you can call it an analogy, because in both cases we are talking about the management of data).
Yin and Yang
At its core, data management tends to serve two purposes:
- Controlling data to minimise risk.
- Maximising the ability to use data to benefit the organisation, through driving efficiencies or finding new ways to generate income.
These two goals are supported by the activities that tend to crop up when people talk about data management – things like data cataloguing and classification, data quality, data lineage, data access control, data retention and lifecycle management. All of these activities are there to serve those two outcomes, and everything should be viewed through that lens. If you are looking into data lineage, consider how understanding the way your data flows will allow you to gain control of it, or enable new insights by helping people access it.
These two motivations are often perceived to conflict (control stifles enablement, enablement relinquishes control), but a better way to think about it is that good data management can alleviate the conflict and allow both to flourish. If you have a clear understanding of your data, clear policies, and a way to enforce those policies, you can release restrictions with the confidence that the appropriate guardrails are in place – enabling data-driven activities in a controlled manner.
Data Management should be a verb not a noun
A lot of the terminology and language used in data management uses nouns to describe how to deliver against it – data quality, data catalogue, data lineage. This might falsely feed into an idea that data management is delivered by the things you have – if I have a data catalogue then I am doing well. I believe this is one of the pitfalls in this area – that data management is somehow delivered by finite change projects that introduce a “thing” (often by simply buying and installing some software).
I would argue that data management is delivered by the things you do. It is useful to consider all aspects of data management as activities or ongoing processes. Think about data cataloguing and how you will do it rather than how to introduce a data catalogue (or worse, focussing on which data catalogue software product you are going to buy). This should encourage you to think about who it is for and how it will help and therefore steer how you will do it. In the end, this might help you identify the right tool to help you do that.
Metadata Everywhere
At its heart, managing your data is powered by knowing about your data and understanding it. That’s where metadata comes in – the data about your data. How big is it? What type is it? How old is it? Who does it belong to? Does it contain any sensitive information? Is it accurate and up to date? Anything you know about your data is metadata, and one of the foundational pieces of data management is collecting and storing it.
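To make the idea of metadata a little more concrete, here is a minimal Python sketch of what collecting it for a single file might look like. The `MetadataRecord` class and `describe_file` function are hypothetical names invented for this illustration, not part of any product. The size, type and age questions can be answered from the file system; ownership and sensitivity usually cannot, so the sketch leaves them to be filled in by a person or another process.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class MetadataRecord:
    """Hypothetical record answering the basic questions about one file."""
    name: str
    size_bytes: int                            # How big is it?
    file_type: str                             # What type is it?
    modified: datetime                         # How old is it?
    owner: Optional[str] = None                # Who does it belong to?
    contains_sensitive: Optional[bool] = None  # None means "not yet assessed"


def describe_file(path: Path, owner: Optional[str] = None) -> MetadataRecord:
    """Gather what the file system can tell us; the rest must come from people."""
    stat = path.stat()
    return MetadataRecord(
        name=path.name,
        size_bytes=stat.st_size,
        file_type=path.suffix.lstrip(".").lower() or "unknown",
        modified=datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
        owner=owner,
    )
```

Calling `describe_file(Path("customers.csv"), owner="finance")` on an existing file would give you one record to store; doing that routinely across everything you hold is the activity, whatever tool ends up doing it.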
That information then helps define and measure against policies – where should sensitive information be allowed and not allowed? When should data be archived or deleted? At what point is the quality of some data fit for purpose?
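As a rough illustration of measuring against policies, the sketch below checks one dataset's metadata against three rules that mirror the questions above. Everything in it is an assumption made up for the example: the `Dataset` fields, the location name `secure-store`, the two-year idle limit and the 95% completeness threshold. Your own policies would set the real values.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Dataset:
    """Hypothetical metadata held about one dataset."""
    name: str
    location: str
    contains_sensitive: bool
    last_used: date
    completeness: float  # fraction of required fields populated, 0.0 to 1.0


# Illustrative policy settings only; real values are for you to decide.
APPROVED_SENSITIVE_LOCATIONS = {"secure-store"}
MAX_IDLE_DAYS = 365 * 2
MIN_COMPLETENESS = 0.95


def policy_breaches(ds: Dataset, today: date) -> List[str]:
    """Return a description of every policy this dataset currently breaches."""
    breaches = []
    if ds.contains_sensitive and ds.location not in APPROVED_SENSITIVE_LOCATIONS:
        breaches.append("sensitive data in unapproved location")
    if (today - ds.last_used).days > MAX_IDLE_DAYS:
        breaches.append("due for archive or deletion")
    if ds.completeness < MIN_COMPLETENESS:
        breaches.append("quality below fit-for-purpose threshold")
    return breaches
```

The point of the sketch is that each policy only becomes measurable once the relevant metadata exists: without knowing where sensitive data sits or when something was last used, there is nothing to check.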
Finally, once you are measuring against your policies, you can act to resolve any issues – data quality shortcomings, risk resulting from inappropriate storage of sensitive data, cost resulting from the storage of things you don’t need. This normally comes in two guises: dealing with the past and dealing with the present. When you look at the current state for the first time, you will likely discover a lot of historical issues, which will need (a lot of) remedial action to get everything to an acceptable level. In parallel, you also need to manage new issues, either by resolving the root cause to stop them from happening or by intercepting them when they arise.
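The two guises can be sketched with one shared check used in two different ways. This is only an illustration under assumed rules: the record fields (`name`, `owner`, `location`, `contains_sensitive`), the `secure-store` location and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def find_issues(record: Record) -> List[str]:
    """Shared check, using two made-up rules for illustration."""
    issues = []
    if not record.get("owner"):
        issues.append("no owner recorded")
    if record.get("contains_sensitive") and record.get("location") != "secure-store":
        issues.append("sensitive data outside secure-store")
    return issues


def scan_backlog(existing: List[Record]) -> List[Tuple[str, str]]:
    """Dealing with the past: list every (name, issue) pair needing remediation."""
    return [(str(r["name"]), issue) for r in existing for issue in find_issues(r)]


def admit(record: Record, estate: List[Record]) -> bool:
    """Dealing with the present: intercept a new record that fails the checks."""
    if find_issues(record):
        return False
    estate.append(record)
    return True
```

`scan_backlog` produces the remediation list you work through once, while `admit` sits in the path of new data so that the list stops growing.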
In Conclusion
Data management is just managing data – it is not some patented invention that you need special training to understand. You do it because it helps – generally because it minimises risk or enables your organisation to do more or new things. Everything people talk about in the data management world is about things you can do to achieve these outcomes, and it’s all about what YOU need. Software and technology cannot help you work out what will help, although they can help you execute on what you have decided you need to do.
Simply put, think about why things should be organised and orderly, and how it could help if they were. That will help you understand what the world of data management is there for and how it can help you.
At Nephos, we blend our deep technical expertise with a keen understanding of strategic business needs to provide cutting-edge data solutions. Our management services include data discovery and classification, data quality, AI success setup, and data governance.