Today many organizations are stuck in data muck: they are drowning in vast amounts of new data while the business demands faster digestion of data that must often be aggregated with other data. In today's outcome-driven world, management doesn’t need more dashboards; it needs fast boards that tell the data story, blending in-the-moment operational data with historical data to create the business context needed for decision-making. One of the main obstacles organizations face is data infrastructure sprawl. To accelerate, organizations need to simplify by reorienting their data landscape around the data mesh architectural concept.
Like it or not, our data management efforts are getting overwhelmed as the amount, complexity, and contexts of data grow fast. Meanwhile, slow legacy data sources are a hidden danger just under the surface. When someone has to traverse the end-to-end data journey from sources to outcomes, the complexity is almost overwhelming even when nothing changes. The problem is that things are changing, and changing fast, creating ever more data to manage. Now imagine a large portion of that data migrating to the cloud as the data sources further distribute and take on new signals, events, patterns, and contexts. Figure 1 attempts to identify all the sources, emergent or not, that either create or contain data to be managed in service of outcome-driven organizations.
Figure 1 The Data Iceberg
We need a dynamic data-view maker that lets organizations configure all of these sources to their changing needs and outcomes. Today, organizations have to over-specify and create the data view ahead of time; what is needed is a genuinely unified, dynamic data experience.

Today’s approach is to think about the data lifecycle in terms of the functional role of the datastore. New data gets created by applications and stored in that application’s OLTP database; then, for analytical purposes, that data is copied and moved to an OLAP database for reporting. These days, there are more application types and systems generating more structured, semi-structured, and unstructured data, and storing and processing that data in a greater variety of single-purpose datastore types, for instance, a document database for product catalog information. This causes unnecessary latency where real-time needs matter, and it carries a greater management cost: maintaining the variety of datastores and the skill sets needed to design and operate them.

A data mesh approach reorients data management efforts around the consumers of “data products” instead of around the many data pipelines and datastore types. A data mesh provides a unified view and architecture for organizational outcomes supported by applications, processes, or dashboards. It combines data inside the cloud with data outside the cloud, hides the complexity and variety of data from the end-user, and manages both fast and slow data, whether organized in a centralized or distributed fashion. Within a data mesh, you have what’s known as “nodes”. Each node corresponds to a data product and defines all the data, metadata, consumers, and providers of that data product. When realizing these nodes, there’s an opportunity to gain efficiency by selecting fewer datastores and using them for a broader range of workloads.
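The node idea above can be sketched in code. This is a minimal, hypothetical Python sketch, not a standard data mesh API: the class name, fields, and example values are all illustrative assumptions. It simply shows how a node bundles a data product together with its metadata, providers, and consumers so that dependencies are explicit.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a data mesh "node" (illustrative, not a standard API).
# Each node bundles one data product with its metadata, upstream providers,
# and downstream consumers.
@dataclass
class DataProductNode:
    name: str
    owner_domain: str                              # business domain that owns this product
    metadata: dict = field(default_factory=dict)   # schema, freshness SLAs, lineage, ...
    providers: list = field(default_factory=list)  # upstream sources feeding the product
    consumers: list = field(default_factory=list)  # dashboards, apps, downstream nodes

    def register_consumer(self, consumer: str) -> None:
        """Track who depends on this product so changes can be coordinated."""
        if consumer not in self.consumers:
            self.consumers.append(consumer)

# Example (made-up names): an "orders" product fed by an OLTP system and a clickstream.
orders = DataProductNode(
    name="orders",
    owner_domain="sales",
    metadata={"freshness": "5m", "format": "parquet"},
    providers=["orders_oltp", "web_clickstream"],
)
orders.register_consumer("revenue_dashboard")
```

Making consumers first-class, as in this sketch, is what lets the mesh serve data products to outcomes rather than forcing every consumer to understand each underlying pipeline.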
For instance, there are now modern, cloud-native, distributed SQL databases that support what I call “Monster Data” while storing and processing real-time streaming data and historical data simultaneously. These can be used in a data mesh to reduce the number of skill sets required and to reduce data infrastructure sprawl, ultimately resulting in a simpler and less costly data landscape. In essence, a data mesh unifies and simplifies data management, and coupled with a modern, cloud-native distributed SQL database it can lower the cost of handling extensive monster data sources that move at lightning speed. See Figure 2 for a depiction of a data mesh.
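To make the fast-plus-slow idea concrete, here is a toy Python sketch, deliberately using an in-memory SQLite database rather than any real distributed SQL product, of one SQL surface serving both historical rows and freshly streamed rows. The table and helper names are illustrative assumptions; the point is that consumers query a single unified view instead of separate OLTP and OLAP copies.

```python
import sqlite3

# Toy stand-in for a unified datastore (SQLite here, purely for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER, source TEXT, amount REAL)")

# Historical ("slow") data, e.g. loaded from an archive.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "archive", 10.0), (2, "archive", 20.0)],
)

# Real-time ("fast") data arriving as a stream of events.
def ingest(event):
    conn.execute("INSERT INTO events VALUES (?, ?, ?)", event)

ingest((3, "stream", 5.0))

# Consumers see one unified view across both fast and slow data.
total = conn.execute("SELECT SUM(amount) FROM events").fetchone()[0]
print(total)  # 35.0
```

A real distributed SQL engine would add scale-out storage and streaming ingestion, but the consumer-facing contract, one query surface over all the data, is the same simplification the paragraph above describes.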
Figure 2 Sample Data Mesh
Net; Net:
The data inventory to manage has gotten unwieldy and continues to be a monster to tame. Organizations will need new architectures, including distributed data cells that carry the intelligence to self-manage and to play well with other data sources from various contexts. As organizations try to leverage cloud data economically, they have to watch out for the pitfalls of hidden cloud costs. All of this is a must while simplifying access to new and emergent data combined with legacy data types and sources. This transformation is a significant accelerator of digital business transformation.