
With all the current excitement and energy around Artificial Intelligence driven by the rise of GenAI, it’s an interesting opportunity to reflect on one of the more recent waves of technology-driven change, Digital Transformation. From the early stages of Digital Transformation to the rise of Artificial Intelligence (AI) and Machine Learning (ML), our world has undergone a significant shift.

While these advancements bring new opportunities, they also come with challenges. Interestingly, the learnings from the wave of Digital Transformation and our existing experience of working with large data sets from the past decade or so can help us make the most of AI.

Digital Transformation: A Quick Look

Some trace the history of Digital Transformation back to the 1970s, but for me it became a buzzword in the 2000s along with its friends Big Data and the Internet of Things (IoT), promising revolutionary new services based on the ability to access and process vast amounts of new data. It was about all the new signals and messages that could be captured from IoT devices and other sources, and how we could store and process that Big Data at scale to unlock the value within it. It enabled new use cases such as understanding the details of every individual driver to give them a bespoke car insurance policy based on their driving style, rather than insurance providers relying on demographic generalisations to approximate the risk.

Now that we have been through that Digital Transformation wave, these capabilities are in place and available to support any and all new use cases. It could be argued that ChatGPT and other LLMs are just that – an application of Big Data using the biggest data set of them all, the Internet. LLMs are effective because of the vast amounts of data they are trained on and have access to. That brings us to the question – what have we learned about working with large data sets during the Digital Transformation wave that can help drive AI success?

Data Quality: Garbage In Garbage Out

This phrase is as true of data processing now as it was when it was first used in the 1950s. Computers have always simply done what you tell them to, and over the years we’ve been teaching them to do more complex things. No matter how complex the task, if there are flaws in the input then there will be flaws in the output. Digital Transformation did not teach us this, but as datasets and data processing grew in scale and complexity, it became harder to keep track of the quality of the data being used, and the need for automated checking expanded accordingly. This led to a lot of activity to develop and mature data quality practices, for example the development of data quality dimensions to define the different ‘qualities’ of the data that might need to be explored and understood. Many publications covered this topic, starting from the late 1990s and peaking in the 2010s when Digital Transformation was likely providing a lot of the motivation.

There are now mature data quality frameworks we can draw on to structure and address this issue, along with a myriad of tools to help us implement data quality checks and ensure that our new artificial intelligence algorithms are trained on and use high-quality data, giving them the best chance to succeed.
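As a simple illustration of what an automated check across those dimensions might look like, here is a minimal sketch in Python using pandas. It assumes a hypothetical table of driver records with driver_id, age and annual_mileage columns – the field names and thresholds are illustrative, not drawn from any particular framework or tool.

```python
import pandas as pd


def check_quality(df: pd.DataFrame) -> dict:
    """Report on a few common data quality dimensions for a hypothetical
    table of driver records before it is used to train a model."""
    return {
        # Completeness: proportion of non-null values in each column
        "completeness": df.notna().mean().round(2).to_dict(),
        # Uniqueness: how many driver identifiers are duplicated
        "duplicate_driver_ids": int(df["driver_id"].duplicated().sum()),
        # Validity: ages outside a plausible range (nulls count as invalid here)
        "invalid_ages": int((~df["age"].between(17, 110)).sum()),
    }


if __name__ == "__main__":
    drivers = pd.DataFrame({
        "driver_id": [1, 2, 2, 4],
        "age": [34, 17, None, 140],
        "annual_mileage": [8000, 12000, 5000, None],
    })
    print(check_quality(drivers))
```

In practice a mature framework would cover many more dimensions and run such checks continuously as data flows through the pipeline, but the principle is the same: measure the quality of what goes in before trusting what comes out.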

Data Accessibility: Let Me At It!

There has always been a tension between permitting that which can help whilst preventing that which can harm. With data, the tension is between confidentiality and value – sharing all data with everyone can be harmful and is likely to breach regulatory requirements while sharing nothing stifles your ability to generate insight and enable data to help you make the right decisions.  Digital Transformation highlighted and enlarged the potential for driving value – even suggesting that you will wither and die if you fail to harness the power of your data – which correspondingly drove a rise in the risk and associated concern around privacy.

Would GDPR exist if it were not for the rise of digital companies collecting so much information about us, with the ability to use that data in potentially uncontrolled ways? In response to this, we now have new tools and techniques for understanding where the sensitive data is and allowing the right people to see it while preventing the wrong people from seeing it, so that we can climb trees without the risk of falling: we can maximise the volume of ‘safe’ data that we make available to AI.
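As a rough sketch of that idea – the column names and masking rules below are hypothetical, not a reference to any specific product – the snippet removes, generalises or pseudonymises sensitive fields so that the ‘safe’ remainder of a dataset can be shared with a downstream AI pipeline.

```python
import hashlib

import pandas as pd

# Hypothetical classification of sensitive columns and how each should be
# treated before the data is shared more widely.
SENSITIVE_COLUMNS = {
    "email": "drop",           # remove entirely
    "postcode": "generalise",  # keep only the outward part, e.g. "SW1A"
    "customer_id": "hash",     # replace with a stable pseudonym
}


def make_safe(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the data with sensitive fields dropped, generalised
    or pseudonymised, ready to be exposed to a wider audience or an AI pipeline."""
    safe = df.copy()
    for column, rule in SENSITIVE_COLUMNS.items():
        if column not in safe.columns:
            continue
        if rule == "drop":
            safe = safe.drop(columns=[column])
        elif rule == "generalise":
            safe[column] = safe[column].astype(str).str.split().str[0]
        elif rule == "hash":
            safe[column] = safe[column].astype(str).map(
                lambda v: hashlib.sha256(v.encode()).hexdigest()[:12]
            )
    return safe


if __name__ == "__main__":
    customers = pd.DataFrame({
        "customer_id": ["C001", "C002"],
        "email": ["a@example.com", "b@example.com"],
        "postcode": ["SW1A 1AA", "M1 2AB"],
        "annual_mileage": [8000, 12000],
    })
    print(make_safe(customers))
```

Real-world access controls go much further, of course – role-based permissions, consent tracking, audit trails – but the goal is the same: share as much as you safely can, and no more.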

Data Cataloguing: What’s Where

Digital Transformation provided ways to do new things and those new things involved new ‘stuff.’  Suddenly we had object storage and semi-structured databases and data lakes storing all sorts of new formats of data with a Hadoop cluster in the corner for crunching lots of big numbers.  We had to get the data in and out and about which meant more moving parts and more cups to hide balls under.  Not only that, but we were also trying to get a better grip on quality and access (see above!).

Whereas previously cataloguing your data might have been realistically achievable through a spreadsheet or your CMDB, now it was getting a bit chaotic and something more substantial and purpose-built was required. Again, necessity was the mother of invention, and we now find ourselves with a plethora of tools available to help us, and an increasing focus on the semantics of data to ensure we set up our data catalogues in a way that helps those working with data do so more efficiently and effectively. Just as this helps humans find what they are looking for or answer their questions more efficiently, it does the same for our Artificial Intelligence colleagues.
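To make the idea concrete, here is a minimal sketch of what a catalogue entry might capture – owner, location, format, sensitivity and tags – so that humans and AI agents alike can discover and assess a dataset. The fields and the example entry are purely illustrative and not drawn from any particular catalogue product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogueEntry:
    """A minimal, illustrative record describing one dataset in a catalogue."""
    name: str
    description: str
    owner: str
    location: str                 # e.g. an object-store URI or a database table
    format: str                   # e.g. "parquet", "json", "table"
    sensitivity: str              # e.g. "public", "internal", "confidential"
    tags: list = field(default_factory=list)
    last_quality_check: Optional[str] = None  # date of the most recent check


catalogue: dict = {}


def register(entry: CatalogueEntry) -> None:
    catalogue[entry.name] = entry


def find_by_tag(tag: str) -> list:
    """Let a person (or an AI agent) discover datasets by topic."""
    return [e for e in catalogue.values() if tag in e.tags]


register(CatalogueEntry(
    name="telematics_trips",
    description="Per-trip driving telemetry used for usage-based insurance pricing",
    owner="data-platform-team",
    location="s3://example-bucket/telematics/trips/",
    format="parquet",
    sensitivity="confidential",
    tags=["telematics", "insurance"],
))
print([entry.name for entry in find_by_tag("telematics")])
```

Modern catalogue tools layer search, lineage and business glossaries on top of this kind of metadata, but even a simple structure like the above answers the essential question: what data do we have, and where is it?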

Final Thoughts

As we continue to explore and develop new technologies, it is essential to remember the lessons learned in Digital Transformation. Firstly, the understanding that, regardless of computational complexity, poor data in results in poor data out has led to stronger data quality practices that ensure AI systems are fed high-quality data, enhancing their effectiveness and reliability. Secondly, the challenge of data accessibility has provided us with ways to safely open up access to data. Lastly, increases in the variety and distribution of our data have resulted in sophisticated data cataloguing solutions to help us keep track of where everything is. Such advances provide us with the ability to govern our data effectively and efficiently, aiding humans and AI alike.

We have learned a lot as we navigated our way through those years when Digital Transformation was the phrase on everyone’s lips, maturing the approaches and tools we use to manage and exploit large volumes of data. This new wave of interest in AI is simply an evolution of leveraging Big Data and we can go into it with an existing understanding of what it takes to make sure we get the outcomes we are after.

 

Allan Watkins

At the heart of Allan's professional journey, spanning more than 26 years, lies a deep-seated passion for acquiring knowledge and understanding the mechanics behind how things operate. His thought leadership content mirrors this curiosity, enabling readers to broaden their understanding of data governance and its complex web of policies.
