How fast is real-time? According to Robert Miller's seminal 1968 study on human-computer interaction, we perceive a response time of 100 milliseconds or less as instantaneous. That is much quicker than the average human reaction time, which one online test clocks at 273 milliseconds.
Nowadays, customer-facing applications need to respond in as close to real-time as possible to win our attention and meet standard requirements. For example, 100 milliseconds is a golden rule for credit card processors to meet SLAs. Especially in our new remote-first economy, other areas like media, health care, retail and manufacturing need to take full advantage of real-time data processing to enhance digital business and stay relevant. Yet, leveling up with real-time requires some technical nuances that might be challenging to implement with the current developer skills gap.
According to Manish Devgan, chief product officer at Hazelcast, the problem with most data management techniques is that data isn’t being processed in-stream at the edge. Instead, data gets stored and thrown into a data lake where it accumulates, unfulfilled. To make matters more complicated, decentralized data management means organizations may be overseeing many different data lakes, adding to the potential for unused data and increased latency.
I recently met with Devgan to learn how in-memory stream processing goes beyond real-time data. According to Devgan, the secret sauce is processing incoming data in-stream while tying it to persistent storage to create context. By doing so, companies could unlock more innovative real-time experiences, he said.
Use Cases For Real-Time
Real-time presents many opportunities for business operations and end-user customer engagements. This is especially true in the financial sector. For example, Devgan explains how one bank was able to use real-time data at the moment of customer interaction at its ATMs. The system was able to fetch a history of the user account and extend an offer for a low line of credit, increasing the bank's total loans by 400%. Capturing value at the moment of user intent is a golden opportunity, said Devgan.
Capital One is another example of a financial institution leveraging advanced technology to improve the product experience. “They’re one of the leading banks in terms of technology,” said Devgan. Capital One’s threat detection system calls a machine learning model in real-time to detect and prevent fraud. Only by acting on incoming data in the moment of interaction can preventative responses like these be signaled.
Other scenarios are also beginning to incorporate increased real-time analysis. “This split-second preemptive decision-making is important for many other use cases,” said Devgan. Physical retail might loop into customer loyalty programs at point-of-sale systems to increase in-store spending. Or, automobile manufacturers may want to use real-time processing to detect a problem with their robotic arms sooner rather than later.
Real-time vs. Stream Processing
When an event comes in, it usually carries very little information. But when you look up a customer profile and connect the event to that greater context, you can direct further actions. “Correlating data in motion with data at rest is where the magic happens,” said Devgan. This distinction uncovers the difference between real-time data and stream processing.
The real-time label tends to get thrown around a lot, and it’s quantified differently for different groups. Real-time might be five minutes for some applications but 10 milliseconds for others. Real-time could simply mean that a system accesses data at rest very quickly, delivering perceivably “real-time” results.
On the other hand, stream processing incorporates data at rest and data in motion. It computes incoming, in-stream data and correlates it with data at rest. “You don’t let the data land — you process it as it comes in,” Devgan said.
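To make that distinction concrete, here is a minimal sketch in Python of correlating data in motion with data at rest. The event feed, profile store and field names are all invented for illustration: each incoming event carries only an ID and is enriched in-stream by a lookup against stored context before any action is taken.

```python
# Data at rest: a hypothetical customer profile store (in practice this
# might be an in-memory data grid or a fast key-value store).
profiles = {
    "c42": {"name": "Ada", "segment": "premium", "credit_ok": True},
    "c7":  {"name": "Sam", "segment": "basic",   "credit_ok": False},
}

def enrich(events, profiles):
    """Join each in-motion event with its stored profile as it arrives,
    without landing the event in a data lake first."""
    for event in events:
        profile = profiles.get(event["customer_id"], {})
        # The enriched record now carries enough context to drive an
        # immediate action, e.g. extending a credit offer at the ATM.
        yield {**event, **profile}

# Data in motion: events as they stream in (here, a plain list).
stream = [
    {"customer_id": "c42", "action": "atm_withdrawal", "amount": 200},
    {"customer_id": "c7",  "action": "atm_withdrawal", "amount": 60},
]

for record in enrich(stream, profiles):
    if record.get("credit_ok"):
        print(f"offer credit line to {record['name']}")
```

The point of the sketch is the shape of the computation, not the storage technology: the stream is processed as it arrives, and the lookup supplies the context that a bare event lacks.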
For years, enterprises have been collecting and storing enormous amounts of data. A survey by Seagate and IDC reveals that 43% of collected data goes largely unleveraged, and other studies estimate the share of this “dark data” at over 50%. And data that isn't acted upon effectively becomes just another pile on top of the growing mountain of technical debt.
There is a significant storage expense for holding on to too much data, whether on-premises or in the cloud. But, the paradigm is shifting a bit, said Devgan. Organizations don’t want to hold onto everything. The attitude is beginning to shift toward “garbage in, garbage out.”
Technology Behind Real-Time Processing
So, what sort of technology does stream processing require? Processing data at the edge requires some fundamental changes to design thinking. For one, since data must be computed in-stream, there’s more emphasis on in-memory processing. Devgan explained this as “pushing compute around instead of pushing data around.” To enable this, Devgan pointed to open-source projects like Jet that provide powerful distributed batch and stream processing.
SQL has a storied history of acting upon data at rest. Now, Devgan said, he noticed the reemergence of SQL in a stream processing context. Though the industry is still defining how the SQL dialect will conform in a streaming world, he’s bullish on the prospect of SQL guiding this area. “Streaming SQL will be the lingua franca for streaming,” he insisted. We’re also in an early era of determining how to bridge the stateful and serverless worlds, Devgan added.
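The streaming SQL dialect is, as Devgan notes, still being defined, but the core idea is running familiar aggregations over time windows of an unbounded stream rather than over a finished table. As an illustration only, here is a small Python sketch of a tumbling-window count, roughly what a streaming SQL query grouping by a fixed time window would compute; the event data and field names are invented:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size):
    """Group (timestamp, key) events into fixed, non-overlapping time
    windows and count occurrences per key -- the kind of aggregation a
    streaming SQL GROUP BY over a tumbling window would express."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        # Each event falls into exactly one window, anchored at the
        # start of its window_size-length interval.
        window_start = ts - (ts % window_size)
        windows[window_start][key] += 1
    return {w: dict(counts) for w, counts in sorted(windows.items())}

# Hypothetical (timestamp, card_id) events; a fraud detector might flag
# a card swiped too many times within a single 10-second window.
events = [(1, "card_a"), (3, "card_a"), (8, "card_a"), (12, "card_b")]
print(tumbling_window_counts(events, window_size=10))
```

In a real streaming engine the windows would be emitted incrementally as the stream advances rather than computed over a finished list, but the per-window grouping logic is the same.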
But regardless of computing location, it will come down to the net customer experience outcomes. “People want to build their business applications and are less worried about the operational management of the system,” said Devgan. Whether it’s building real-time offers for e-commerce or implementing fraud detection, it will come down to the numbers. There is also a role for stream processing in health care to optimize physical resources and enhance patient experiences with real-time data. In all these scenarios, rapid access to data and processing at the edge will determine much of their success.