Big Data is the unexpected
resource bonanza of the current century.
Advances in computing power driven by Moore’s Law, the rise of cheap
storage, and advances in algorithm design have enabled the capture, storage, and
processing of many types of data that were previously unavailable for use in
computing systems. Documents, email,
text messages, audio files, and images can now be transformed into a
usable digital format for analysis systems, especially artificial
intelligence. These AI systems can scan
massive amounts of data, find both patterns and anomalies that were
previously unthinkable, and do so in a timeframe that was unimaginable. While most uses of Big Data have been
coupled with AI/machine learning algorithms so that companies can understand their
customers' choices and improve their overall experience (think of
recommendation engines, chatbots, navigation apps, and digital assistants, among
others), there are uses that are truly industry transforming.
In healthcare, big data and
analytics are helping the industry move from a pay-for-service model that
reimburses hospitals, physicians, and other caregivers after a service is
performed to a new approach that reimburses them based on the outcome of the
service, specifically the post-service health of the patient. This approach is only possible if there is
enough data to understand how the patient relates to the vast population of
other patients who have had the same procedure/service and the expected
outcome. While a variety of other factors,
such as the patient’s cooperation with the treatment plan, are involved, those
factors can be tracked and analyzed as well, providing a clear path on best
practices and expected results based on evidence. When this is combined with diagnostic
improvements made possible by using AI to find patterns in blood and tissue
samples, or to scan radiology images and detect anomalies, the physician's ability to determine the exact issue and suggest the best treatment pathway
for a given situation is unparalleled.
For society, the expected result in this example is a dramatic
increase in efficiency, resulting in a lower cost of service. However, the same
technologies that are able to deliver these unparalleled benefits are also
capable of providing the platform for a previously unimaginable set of
fraudulent uses.
Examples of Issues
An interesting case of the
unexpected occurred in the UK, where a group of criminals with very
sophisticated knowledge of AI and big data were able to scam a number of
organizations into transferring large sums of money to fraudulent
accounts. According to the BBC, the
criminals captured a number of voice recordings of CEOs making investor
calls. They analyzed the voice
recordings with an AI pattern-matching program to re-create words and parts of
speech. They then created a new
recording in the CEO’s voice directing the CFO to wire funds to a specific
account on an emergency basis. They sent
the recording via voice mail to the CFO and even spoofed the CEO’s number.
Think of this as an extremely sophisticated fraudulent “robocall” attack using
AI to replicate the voice of a known and trusted person sending explicit
instructions requiring urgent compliance.
While normally this would not work due to organizational processes and
security protections, given the right set of circumstances, it can be
successful. Also, the level of
knowledge, time and money it takes to prepare and launch this type of attack
limits its ability to be easily replicated.
However, as more voice data becomes available and the AI algorithms and
techniques become easier to use, we can expect these types of data and
technology misuse to become more prevalent.
One can imagine a case where the voice of a loved one in distress is
sent to a parent or grandparent, asking for some amount of money to be sent
immediately to a card or account. Here the
same techniques applied over a large population could have devastating results.
Similarly, facial recognition
technology has the potential to identify and authenticate people using
the sophisticated camera technology found in mobile phones and other camera and
video recording devices that have become pervasive in our world. However, few people really understand the
limitations of these devices when it comes to accurately identifying people
under different environmental conditions.
For the best commercially available technology, the accuracy
rate, under sufficient lighting and in a “penned” or confined space, is over 90%.
This drops to around 65% if the lighting conditions change or the person is in
a place like a mall or an outdoor arena.
Now, add to that the significant error rate that occurs for people with
skin tones that are closer in color to their accessories, as well as the technology's
inability to accurately recognize a person wearing a hat, scarf, or sunglasses, or with
facial hair, and it is easy to see why communities such as San Francisco have
banned its use in law enforcement activities.
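To put the accuracy figures above in perspective, here is a rough sketch of the arithmetic. It assumes, purely for illustration, that "accuracy" means the share of identification attempts that are correct and that 10,000 people pass in front of a camera. The function name and the crowd size are hypothetical; only the 90% and 65% rates come from the text above, and since the first is quoted as "over 90%," the first result is an upper bound.

```python
def expected_misidentifications(attempts: int, accuracy: float) -> int:
    """Expected number of wrong identifications for a given accuracy rate.

    Assumes every attempt is independent and that accuracy is simply the
    fraction of attempts that are correct (an illustrative simplification).
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(attempts * (1.0 - accuracy))


if __name__ == "__main__":
    crowd = 10_000  # hypothetical number of people scanned
    # Controlled setting: good lighting, confined space (rate from the text).
    print(expected_misidentifications(crowd, 0.90))
    # Uncontrolled setting: mall or outdoor arena (rate from the text).
    print(expected_misidentifications(crowd, 0.65))
```

Under these assumptions, the drop from 90% to 65% means the expected number of misidentified people more than triples, which is the scale of error a law enforcement agency would have to absorb.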
Efforts to Consider
So,
the question is: what can we do to get the benefits of AI and big data yet
protect ourselves from the downside risk these technologies bring? First, realize that, as the old adage goes,
the genie cannot be put back into the bottle.
We will need to live with and be prepared to manage the risks each of
these technologies brings. In our practice, we work with clients to identify
the critical data types, decision types, and actions/outcomes that require an elevated level of protection. This is a comprehensive effort that results in a digital asset threat matrix with corresponding
required actions. However, every individual or organization, no matter what its size, can start by:
- Understanding the types of data both you and your
organization have in your possession (images/pictures, text, spreadsheets) and
deciding what data you are willing to share and under what circumstances. This is
particularly important for individual biometric data. Keep engaged with papers
and events emerging on the topic of “The Data of You.”
- Developing specific rules for when you will take actions such as transferring
money, who (maybe multiple people) is able to authorize the transaction, and
under what circumstances.
- Asking your analytics vendor or
analytics team to show you the tested current and historical accuracy rates of any software that is used to make
critical decisions. Why would you allow
something with a marginal accuracy rate to aid in the decision-making process,
especially when dealing with something as important as law enforcement? This also applies to other analytical
software, such as that used in blood and urine testing services.
- Safeguarding your data in the context
of use through tracking, mining, and random audits. There are usually trends and
tells in the usage of your data internally and externally.
- Staying abreast of activities
and outcomes from “Deep-Fake” events and publications. The use of AI and
algorithms to fool institutions and individuals is on the rise, leading to
alternative realities.
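The transfer-authorization rule in the list above can be made concrete. The following is a minimal sketch, assuming a hypothetical policy in which an instruction that arrives over a spoofable channel (voicemail or email) must be confirmed by a callback to a known-good number, and any transfer above a threshold needs two distinct approvers. The names `TransferRequest` and `is_authorized`, the channel labels, and the 10,000 threshold are illustrative assumptions, not part of any real system.

```python
from dataclasses import dataclass, field

# Hypothetical policy values, chosen only for illustration.
DUAL_APPROVAL_THRESHOLD = 10_000              # amounts above this need two approvers
UNVERIFIED_CHANNELS = {"voicemail", "email"}  # channels that can be spoofed


@dataclass
class TransferRequest:
    amount: float
    channel: str                         # how the instruction arrived
    verified_by_callback: bool = False   # confirmed via a known-good number
    approvers: set = field(default_factory=set)  # distinct people who signed off


def is_authorized(request: TransferRequest) -> bool:
    """Return True only if the request satisfies the illustrative policy."""
    # An instruction from a spoofable channel must be confirmed out of band.
    if request.channel in UNVERIFIED_CHANNELS and not request.verified_by_callback:
        return False
    # Large transfers need at least two distinct approvers.
    required = 2 if request.amount > DUAL_APPROVAL_THRESHOLD else 1
    return len(request.approvers) >= required


if __name__ == "__main__":
    # An urgent voicemail "from the CEO" with a single approver is rejected.
    urgent = TransferRequest(amount=250_000, channel="voicemail", approvers={"cfo"})
    print(is_authorized(urgent))
```

Under a rule like this, a cloned voice on a voicemail cannot move money by itself, because the check depends on the callback and the second approver rather than on how convincing the recording sounds.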
Net; Net:
Lastly,
on an individual level, remember it is
your data. Do not agree to share it
with just any app or information request, especially online lotteries or emails
that tell you that you are a winner and ask only for your contact information. These may be scams, and you do not want to end
up a victim of the unintended consequences of big data and AI!
This post is a collaboration with Dr. Edward Peters
Edward M.L. Peters, Ph.D. is an award-winning technology entrepreneur and executive. He is the founder and CEO of Data Discovery Sciences, an intelligent automation services firm located in Dallas, TX. As an author and media commentator, Dr. Peters is a frequent contributor on Fox Business Radio and has published articles in The Financial Times, Forbes, IDB, and The Hill. Contact: epeters@datadiscoverysciences.com