Big Data is the unexpected resource bonanza of the current century. Advances in computing power driven by Moore’s Law, the rise of cheap storage, and advances in algorithm design have enabled the capture, storage, and processing of many types of data that were previously unavailable to computing systems. Documents, email, text messages, audio files, and images can now be transformed into a usable digital format for analysis systems, especially artificial intelligence. AI systems can scan massive amounts of data and find both patterns and anomalies that were previously impossible to detect, and do so in a timeframe that was once unimaginable. Most uses of Big Data have been coupled with AI/machine learning algorithms so that companies can understand their customers’ choices and improve their overall experience (think of recommendation engines, chatbots, navigation apps, and digital assistants, among others), but there are also uses that are truly industry-transforming.
In healthcare, big data and analytics are helping the industry move from a pay-for-service model, which reimburses hospitals, physicians, and other caregivers after a service is performed, to a new approach that reimburses them based on the outcome of the service, specifically the post-service health of the patient. This approach is only possible if there is enough data to understand how the patient compares with the vast population of other patients who have had the same procedure or service, and what outcome to expect. A variety of other factors are involved, such as the patient’s cooperation with the treatment plan, but those factors can be tracked and analyzed as well, providing an evidence-based view of best practices and expected results. When this is combined with the diagnostic improvements made possible by using AI to find patterns in blood and tissue samples or to detect anomalies in radiology images, the physician’s ability to determine the exact issue and suggest the best treatment pathway for a given situation is unparalleled. The expected result for society is a dramatic increase in efficiency and, with it, a lower cost of service. However, the same technologies that deliver these benefits can also provide the platform for a previously unimaginable set of fraudulent uses.
Examples of Issues
An interesting case of the unexpected occurred in the UK, where a group of criminals with very sophisticated knowledge of AI and big data was able to scam a number of organizations into transferring large sums of money to fraudulent accounts. According to the BBC, the criminals captured a number of voice recordings of CEOs making investor calls. They analyzed the recordings with an AI pattern-matching program to re-create words and parts of speech. They then created a new recording in the CEO’s voice directing the CFO to wire funds to a specific account on an emergency basis. They sent the recording to the CFO via voice mail and even spoofed the CEO’s number. Think of this as an extremely sophisticated fraudulent “robocall” attack that uses AI to replicate the voice of a known and trusted person giving explicit instructions that require urgent compliance. Normally this would not work because of organizational processes and security protections, but given the right set of circumstances it can succeed. The level of knowledge, time, and money needed to prepare and launch this type of attack also limits how easily it can be replicated. However, as more voice data becomes available and the AI algorithms and techniques become easier to use, we can expect this kind of data and technology misuse to become more prevalent. One can imagine a case where the voice of a loved one in distress is sent to a parent or grandparent, asking for money to be sent immediately to a card or account. The same techniques applied across a large population could have devastating results.
Similarly, facial recognition technology has the potential to identify and authenticate people using the sophisticated cameras found in mobile phones and the other camera and video recording devices that have become pervasive in our world. However, few people really understand the limitations of these devices when it comes to accurately identifying people under different environmental conditions. For the best commercially available technology, the accuracy rate under sufficient lighting and in a “penned” or confined space is over 90%. This drops to around 65% if the lighting conditions change or the person is in a place like a mall or an outdoor arena. Add to that the significant error rate for people with skin tones that are close in color to their accessories, as well as the technology’s inability to accurately recognize a person wearing a hat, scarf, or sunglasses, or a person with facial hair, and it is easy to see why communities such as San Francisco have banned its use in law enforcement activities.
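To put those accuracy figures in perspective, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not in the figures themselves: that "accuracy" means the chance any single person is identified correctly, that each identification is independent, and that the crowd size of 10,000 is purely hypothetical. The function name is likewise illustrative.

```python
def expected_misidentifications(people: int, accuracy: float) -> float:
    """Expected number of people identified incorrectly.

    Assumes each identification is independent and is correct with
    probability `accuracy` (a simplifying assumption, not a vendor metric).
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return people * (1.0 - accuracy)


if __name__ == "__main__":
    crowd = 10_000  # hypothetical crowd size, chosen only for illustration

    # 90%: good lighting in a confined space; 65%: a mall or outdoor arena.
    for label, accuracy in [("controlled", 0.90), ("uncontrolled", 0.65)]:
        errors = expected_misidentifications(crowd, accuracy)
        print(f"{label}: about {errors:.0f} of {crowd} people misidentified")
```

Under these assumptions, moving from the controlled setting to the uncontrolled one raises the expected number of misidentified people from roughly 1,000 to roughly 3,500 in a crowd of 10,000, which helps explain the caution around law enforcement use.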
- Safeguard your data in the context of use through tracking, mining and random audits. There are usually trends and tells in the usage of your data internally and externally.