Big Data in the Hands of Users


By Judith Lamont, Ph.D.

This article appears in the April 2016 issue [Volume 25, Issue 4]

A recent survey by Forrester indicated that while nearly 90 percent of IT professionals say their companies plan to use big data and analytics for business decisions within the next year, only 20 percent say that the IT department is driving those initiatives. In an environment of increasing data volume and variety, often combined with inadequate infrastructure, putting information into the hands of business users can be a major challenge. That is where a number of big data software vendors are focusing their efforts.

Kyvos Insights has addressed this issue by organizing data in Hadoop so that business users can access it without having to write code. “Most users of big data have been data scientists and programmers,” says Ajay Anand, VP of products at Kyvos Insights. “But such experts are hard to find, and we wanted to bridge the barrier to bring data to the business users. Our premise was that there should be a seamless way to connect with Hadoop.”

In the more traditional world of analytics, data warehouses stored the information in a well-defined structure, but problems emerged as both the size and types of data increased. “The structure was inflexible,” Anand says, “and it was difficult to incorporate new data sources.” The Hadoop infrastructure gained adoption because it was flexible, cost-effective and scalable. It could eliminate data silos, which allowed a more comprehensive view of data coming from different parts of a company.

Scale, detail and interactivity

“With traditional data warehouses, typically data is aggregated and extracted and put into a data mart so that the scale is more manageable for visualization tools,” Anand says. “We wanted to eliminate the need for that and conduct OLAP-style analyses directly on Hadoop.” The company’s product, Kyvos, allows business users to look at the data, spot an area they want to explore and drill down to any level of granularity. “We want users to be able to look down to a very fine level of granularity, like which customers are watching certain TV shows, so they can be segmented,” Anand says.
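The drill-down idea can be illustrated with a small sketch in plain Python. It is not Kyvos code and does not touch Hadoop; the viewing records, field names and the `roll_up` helper are all hypothetical, chosen only to show how one query moves from a coarse total to individual customers by adding filters and changing the grouping.

```python
from collections import defaultdict

# Hypothetical viewing records: (region, show, customer_id, minutes_watched)
VIEWS = [
    ("East", "News at Nine", "c1", 30),
    ("East", "News at Nine", "c2", 45),
    ("East", "Cooking Hour", "c1", 20),
    ("West", "News at Nine", "c3", 60),
    ("West", "Cooking Hour", "c3", 15),
]

FIELDS = ("region", "show", "customer_id")


def roll_up(records, dimensions, filters=None):
    """Sum minutes watched, grouped by the given dimensions.

    `filters` maps a dimension name to the value it must equal, which is
    how a drill-down narrows the view to one slice of the data.
    """
    filters = filters or {}
    totals = defaultdict(int)
    for record in records:
        row = dict(zip(FIELDS, record[:3]))
        if any(row[name] != value for name, value in filters.items()):
            continue  # record is outside the slice being explored
        key = tuple(row[name] for name in dimensions)
        totals[key] += record[3]
    return dict(totals)


# Start coarse, then drill down one level at a time.
by_region = roll_up(VIEWS, ["region"])
east_by_show = roll_up(VIEWS, ["show"], {"region": "East"})
east_news_viewers = roll_up(
    VIEWS, ["customer_id"], {"region": "East", "show": "News at Nine"}
)
```

Each call answers a narrower question than the one before it, ending with the individual customers who watched a particular show. That customer-level list is the kind of result that could feed segmentation.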

Because the data may come from millions of devices, the tool must handle both large scale and fine detail while remaining interactive. “Having greater scalability lets organizations analyze data over a very long time period, which is difficult to do in a traditional environment,” Anand says. The scalability also allows for measuring the entire target population rather than just samples. He cites an example in which a client wanted to gain an understanding of its Latino market, which previously had to be done through surveys and samples. “Now that they can get full empirical data, they can drill down to individual users and get statistically correct results,” Anand explains.

Kyvos provides its own interface into the big data repository, but it can also be used in conjunction with existing BI tools. “Users of Tableau, for example, can get the same kind of interactivity with big data by using our product,” Anand says. “A lot of big data projects languish because people have created data lakes in Hadoop but the business users cannot access them. Our software overcomes that problem.”

At this stage of the evolution of big data, customers do not always know what the technology can offer. “It is a journey, and they need to get educated over time,” Anand says. “They first look at the low-hanging fruit and then want to know what else they can do that they could not before. In order to explore, interactivity is critical, because otherwise it is really hard to follow a train of thought and get insights. If the user can get an interactive response, a tremendous increase in productivity is possible.”

Streaming analytics

Another challenging area is streaming, which puts the “velocity” in big data. (Volume, velocity and variety are often used as descriptors for big data.) Although a majority of companies in a variety of surveys are planning big data initiatives, only about one-fifth are using or plan to use streaming or complex event processing technology, according to research conducted by Gartner. The motivator for mastering that technology is the ability to gain real-time or near real-time insights that can shorten the lag time for decision-making in applications ranging from marketing to national security.

Impetus Technologies first produced streaming analytics solutions for its customers’ big data initiatives and then decided to develop StreamAnalytix, a commercial technology product that became available in 2015. “We found we were doing the same things over and over,” says Anand Venugopal, head of product for StreamAnalytix at Impetus, “and we wanted to produce a template.” That led to StreamAnalytix and the addition of a product-oriented segment of the business. StreamAnalytix is an open source-based, multi-engine platform that leverages Apache Storm and Apache Spark Streaming for rapid deployment of real-time streaming analytics applications.

“One of the prime applications for streaming analytics is to bring context sensitivity to customer care,” Venugopal explains. “When a customer calls in for assistance, they often tell the same story over and over to different agents. Imagine how much more pleasant the experience would be if the agent knew why the customer was calling and could provide the solution within a few seconds.” For example, in cable television, a common problem is misbehavior of the set-top box. “If agents could minimize the time on this call, they would reduce costs and improve customer care,” he says.

That ability depends on being able to bring in data from all the channels at once, from clickstream that shows everything the customer did on the website, to data from the set-top box and mobile phone input. “The data can be converged, analyzed and the system can make predictions about the most likely problem and present it to the agent,” Venugopal says. “In addition, the recent history can be presented so the agent can see the entire context of the customer and make the best recommendations.”

Operational intelligence is another broad area where streaming data can be valuable. “Predictive maintenance is a huge area,” he explains. “Using streaming data it is possible to correlate multiple factors and detect when an issue goes from green to orange mode, not just looking at a single parameter, but a combination of multiple values in real time.” Those factors are matched up with previous patterns that indicated a problem.
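A minimal sketch of the multi-parameter idea follows. It is not StreamAnalytix code, and it substitutes a simple sliding-window rule for the historical pattern matching described above. The sensor names, thresholds and window size are assumptions made up for illustration.

```python
from collections import deque

# Hypothetical warning levels; the rule below requires both to be exceeded.
TEMP_WARN = 70.0       # degrees Celsius
VIBRATION_WARN = 4.0   # mm/s
WINDOW = 3             # number of recent readings to average


def status_stream(readings):
    """Yield "green" or "orange" for each (temperature, vibration) reading.

    The status turns orange only when the averages of BOTH signals over
    the last WINDOW readings exceed their warning levels.
    """
    recent = deque(maxlen=WINDOW)  # oldest reading drops out automatically
    for reading in readings:
        recent.append(reading)
        avg_temp = sum(r[0] for r in recent) / len(recent)
        avg_vib = sum(r[1] for r in recent) / len(recent)
        window_full = len(recent) == WINDOW
        if window_full and avg_temp > TEMP_WARN and avg_vib > VIBRATION_WARN:
            yield "orange"
        else:
            yield "green"


readings = [(65, 3.0), (72, 3.5), (74, 4.5), (76, 4.8), (75, 5.0)]
statuses = list(status_stream(readings))
# statuses == ["green", "green", "green", "orange", "orange"]
```

At the third reading the average temperature is already above its warning level, but the status stays green because average vibration is not. The switch to orange happens only once both signals are elevated together.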

According to Venugopal, a company’s decision on whether to deploy streaming analytics should focus on its business value rather than on which technology to use. However, usability is also a critical factor. “We have drag and drop for some of the algorithms to build out a model,” he explains. “This makes it very appealing to data scientists. The idea is for a variety of individuals to be able to use the platform to assemble an application.”

Optimizing use of big data

Three elements are essential to make the most of big data, according to Ramon Chen, CMO of Reltio. “Start with clean data,” he says. “Master data management is foundational for any analytics project, including big data.” The second is to personalize information that is delivered. “You may have an enormous amount of data, but only a subset is going to matter to any given individual,” Chen continues. “It’s important to provide access to the most relevant information but also in a very integrated way. For example, a customer service agent should have a full view of the customer, beyond the view provided by the CRM system.”

Finally, big data should be leveraged not only to respond to inquiries but also to proactively deliver information. “The system should be able to make recommendations based on all the information that is available with a continuous feedback loop to improve them with each interaction,” he says.

Reltio provides a sophisticated platform-as-a-service (PaaS) environment, Reltio Cloud, in which enterprise applications can be developed. Among the built-in features are master data management (MDM), the ability to ingest third-party data and Reltio Commercial Graph. The company also offers a variety of industry-specific products that are prepopulated with functionality and relevant content.

Cognitive computing capability supports the ability to make recommendations and improve them over time, based on user responses. Reltio Commercial Graph provides visualization of relationships among people, facts and events. Its approach of storing information about nodes is similar to that used by Facebook (facebook.com) and LinkedIn (linkedin.com), revealing connectivity among elements that would be difficult or impossible to discern from a traditional database.
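To show why a graph of nodes and relationships can surface connections that are awkward to find in flat tables, here is a small generic sketch. It does not reflect how Reltio Commercial Graph is implemented; the entities, the edge list and both helper functions are hypothetical.

```python
from collections import deque

# Hypothetical relationships; each pair links two entities.
EDGES = [
    ("Dr. Lee", "City Hospital"),
    ("Dr. Patel", "City Hospital"),
    ("Dr. Patel", "Cardiology Conference"),
    ("Acme Pharma", "Cardiology Conference"),
]


def build_graph(edges):
    """Store each node with the set of nodes it is directly linked to."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the chain of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(EDGES)
path = shortest_path(graph, "Dr. Lee", "Acme Pharma")
# path == ["Dr. Lee", "City Hospital", "Dr. Patel",
#          "Cardiology Conference", "Acme Pharma"]
```

Finding the same chain in a relational schema would typically take one join per hop, with the number of hops known in advance. The graph traversal simply follows links until it reaches the target.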

“We saw that big data was coming,” says Chen, “and we also saw that many applications were mired in relational databases and did not provide a comprehensive picture.” Business software has not yet reached the level of advanced consumer-aware software such as that of Amazon (amazon.com), which integrates information about customers’ searches, purchases and ratings. “Business software should be able to do the same thing, with suggestions offered based on historical data, and the ability to accurately predict outcomes,” he adds.