Until recently, I wasn’t familiar with the Winchester Mystery House (you can check it out on wiki), but when I did hear about it, the similarities between it and the current data landscape were obvious.
If, like me, you weren’t familiar with it, the Winchester Mystery House is a mansion in San Jose, California, once owned by Sarah Winchester, the widow of firearm magnate William Wirt Winchester. Aside from its vast size and the rumour that it’s haunted, it is known for the curiosities in its architecture and construction.
From 1886 to 1922, construction was seemingly continuous, with various building crews working on the site to convert the original eight-room farmhouse into the world’s most unusual and sprawling mansion, featuring:
- 24,000 square feet
- 10,000 window panes
- 2,000 doors
- 160 rooms
- 52 skylights
- 47 stairways and fireplaces
- 17 chimneys
- 13 bathrooms
- 6 kitchens
In my mind, this sprawling mansion is analogous to the problem that organisations have with their data today.
As the volume of data that we generate daily continues to explode, it becomes increasingly fragmented. We have silos and pockets of data in various locations, on different storage media, with no master plan.
We’ve spent years amassing this data, but like the mystery house, without a map (or a data catalogue), finding the data that we need, and securing that data, is at best an arduous and inefficient task and at worst an impossibility.
If I relate this to what we see in customer environments today, it has a significant impact on two major aspects:
Data Science / ML
One of the critical drivers for organisations to run data science initiatives is to drive more value from their data. We often want to correlate data from multiple sources to accomplish that; however, when we don’t know what we have or where we have it, we spend more time wrangling our data than processing it to generate value.
We waste time, become less efficient, and don’t see the gains we expect.
Data Governance and Privacy
Without a doubt, one of the biggest challenges we face in governing our data is the lack of visibility over what we have and where we have it. Without the map, or a data catalogue, we can’t possibly hope to control our data properly. That exposes us to risk, with the potential for fines, data loss and reputational damage.
This data fragmentation problem isn’t going away: we are generating increasingly large volumes of data, which ultimately create more silos and greater complexity.
I guess what I’m saying here is that you need the map. You need to understand your data: where it is, what it is, and how to handle it in the right way. Without that, you can’t unlock its power or adequately protect it in line with governance and privacy requirements.
Don’t let data fragmentation hinder your organisation’s potential. Discover how we can help chart your data map and leverage your data assets effectively: