Those who have worked on Machine Learning (ML) projects know
that ML requires a large amount of data to train the resulting models. Some
would say you can never have too much data. There is usually a correlation
between the amount of data and the sophistication of the resulting ML model.
This data hunger is only going to get more intense as AI progresses towards new
benefit pools while leveraging more sophisticated AI capabilities. Since there
are other contributing trends besides the sophistication of AI, the question
that looms for organizations is, "Do we have the right data to fuel
successful AI efforts?" If they don't have enough, should they stockpile
more in anticipation of the AI feast?
Figure 1: The AI / Data Continuum
It’s unlikely that all the big data organizations have been
hoarding is the correct data, but understanding where AI is going
will give an organization a "leg up" on culling what it has and collecting more
of the correct data as AI progresses over the coming decades.
The Progression of AI Changes the Data Game
While ML requires significant amounts of data to self-modify
its behavior, the appetite of AI increases quickly as the sophistication of the
AI capabilities increases. There is a big step from machine learning to Deep Learning
(DL) in that DL requires much more data than ML. The reason is that DL usually
identifies the differences between concepts through layers of neural networks,
and it determines the edges of those concepts only when exposed to millions of
data points. DL allows machines to represent concepts via neural networks,
loosely as the human brain does, thus allowing more complex problem-solving. AI
can also work on fuzzier problems where the answers are more uncertain or
ambiguous. These are typically judgment or recognition problems that can extend
to creation or other right-brained activities. This again requires more data,
which in some cases may be emergent or real-time in nature.
The Shift from Data-Driven to Outcome-Driven
As AI moves up in the sophistication of the problems it
assists with or solves, it will become both data-driven and goal/outcome-driven. This means
that the AI may request data on the fly that it needs to solve a particular
problem or make a specific deduction, thus complicating data management. It may
involve the interaction of inductive, data-driven portions of a solution with the
deductive need for data based on a hypothesis to reach a target. This kind of
dynamic interaction is needed for outcome-oriented problems. It is very
different from just interrogating the data looking for interesting events and
patterns. Decision-driven approaches fit right in the middle of these two
distinct approaches. Some decisions are operationally focused and improved through
matching data with outcomes. More strategic decisions will pick up on both
inductive and deductive approaches. This is just another demand channel that
boosts data usage.
The Shifting Problem Scopes Impact Data Needs
The scope of AI solutions will typically start narrow
and widen over time, thus requiring more data. Complex solutions
typically target more than one answer and will require more data to support the
tributary solution sets that contribute to a complex/hybrid result. As the scope
of decisions, actions, and outcomes spans more contexts inside and outside an
organization, more data will need to be obtained to understand each context and
the interactions among them. Each of these contexts could be changing and morphing at
a different rate, therefore requiring still more data.
Net-Net:
It's clear that more data will be the hallmark of
AI-assisted solutions. The data appetite might come from more challenging
problems, better leverage of advanced AI/analytics, or growing end-to-end
value chains. One thing is for sure: organizations had better get ready for the
new world of “AI/Data Interaction”. It could change or extend data management
policies, methods, techniques, or technologies.