A Purpose-based Data Integration Strategy
If you are reading this article, we can safely guess you are involved in some aspect of big data initiatives. Whether you are at the creation end or the receiving end, the data integration challenge is as daunting as the volume growth is impossible to control. Yet organizations that intend to differentiate themselves have no choice but to transform their established processes and ways of thinking. The question is “How?”
With the omnipresent explosion of technology adoption touching everyone’s life, we have seen an unprecedented growth of data creation, which leads to more technology-based solutions and services, which create more data, and the cycle continues. This is relevant because data truly changes our lives. It provides the foundation for us to make better decisions. It enables us to engage our customers in a more relevant and personal way. There are few successful business processes today that do not leverage their relevant data assets. How would an air traffic controller make thousands of real-time decisions without relevant, real-time data about outgoing and oncoming air traffic? How would the consumer experience feel if Netflix eliminated all reviews, ratings and the optimization of its recommendations based on what you and “people like you” watched and rated? We can’t succeed in business today without the analysis and understanding of our data assets. And depending on the industry, the availability of and access to data vary greatly, and so do the associated challenges.
Luckily, technology came to the rescue and has enabled us to harness the data explosion and adapt at great speed. The volume of data and the need to leverage all data to generate the most precise insight have led us to shift from a model where we select what we know is important today to a model leveraging unfiltered and unrestricted data lake approaches. This evolution also enables us to look at different data. We can no longer require data to be “massaged” before it is stored. We can no longer rely on scheduled load processes with predictable volumes of structured data. We must now embrace the opportunities of unrestricted streams of unstructured data. Ten years ago, this concept would have been met with skepticism at best, or flat-out rejection in most cases. Technology allows us to capitalize on the asset and expand the impact of data on business and people’s lives. Taking it a step further, any organization refusing to take advantage of this opportunity will fail to differentiate itself. Recent innovations have been supported by the integration of external or crowdsourced data assets. Think about how the customer experience has improved when searching for a recipe, as an example. The content (books of recipes) has existed for centuries. The curating of the content (by quality, complexity, etc.) has been made possible by consumers rating and reviewing the content and sharing it via social communities. Would you prefer looking for a crepe recipe in a random book or on Pinterest, which has already filtered out unpopular (un-pinned) content?
The same concept applies to businesses. Would you rather forecast your hotel occupancy based on last year’s reservation log, or combine real-time listening to Facebook feeds, which tell you what travelers are talking about, with the city’s schedule of conventions and sporting events, and in turn have complex science deliver the optimized promotional offer (sent alongside information and offers from a local brewery) that seems to appeal to your target audience?
But that is just scratching the surface of where data and technology will take us. We need to relinquish what were once controlled and predictable data warehouse environments and open up the data infrastructure that we enable. While the technology contributes to resolving the input of data into our systems and solutions, the change in expectations from the end consumer presents an additional set of challenges. Whether we focus on internal analysts and data scientists or on the external customer, we have to recognize that the rules of the game have changed. Data consumed overnight and processed linearly in batch cannot support today’s expectations around latency. The American Express fraud detection algorithm needs to trigger the customer phone call milliseconds after the suspicious transaction. Tomorrow is too late. Similarly, how we consume data has evolved. Writing queries and caching the output for future consumption is pointless when we expect to ask the question using natural language and without having to physically interact. We have adapted to consumers, making data and services available to them via Siri and Alexa. Yet we are slow to bring the capability to business applications. Machines compute data and learn faster than any human can. By leveraging machine learning techniques, we can ensure our growing data assets are turned into relevant insights, which in turn become data that can be used for optimization or customer engagement. Our mission is to make this critical insight available seamlessly.
Because the end consumer expects reliable, always-on, and precise insight or engagement, we are forced to continually rethink our approach to data infrastructure and architecture. A static and predictable environment is nearly impossible to sustain, or will become a burden for business and customers.
In the end, all the challenges we face can be solved through technology. More volume? Buy more storage; it is almost free. Less structure? Store it as is and use text analytics solutions. More complexity? Let the machines learn. More time sensitivity? Implement in-stream data and science capability. Every challenge has its solution. The biggest risk we face is not the impasse of ten years ago, nor the cost or the complexity. Rather, the biggest risk we face is the risk of not choosing.
Data as an asset is a differentiating factor. The ability to create and absorb data, and access to brilliant minds to help us interpret it, are not only game changing; they define the journey that successful organizations must follow. What we won’t be able to create is time. We need to choose what we integrate and what we prioritize, because time is finite. And the only way to succeed is by focusing on the purpose of our data asset. Are we creating insight to drive weekly decisions or insight for real-time actions? Or are we using data to improve how we engage our customers and make their lives easier? This is the critical question: “How is the data relevant to the business or the customer?” The relevance of the data is what must drive our technology and integration decisions. Success will be determined by how focused organizations are on the true purpose of their data asset, not by how good they are at integrating the data. By being laser focused on the purpose, the integration path will become clear and simple, and we will produce value for our customers and our business through precision and relevance. Big data is dead, long live relevant data.