That report outlined three key characteristics of Machine Learning Data Catalogs (MLDC):
1. Interpret, define, classify, link, and optimize the use of disparate data sources
Reltio is used by companies globally to define logical business schemas, capture and discover relationships through the Reltio Self-Learning Graph, suggest ongoing improvements, and to organize and bring siloed data together across the enterprise to meet their business objectives. This continuous reliable data foundation feeds better operational execution, predictive analytics, and sets them up to evolve towards a self-learning enterprise.
2. Reconcile policies across data use
Reltio Cloud’s built-in data security and privacy, regulatory, life-cycle, and data quality policies coexist to adapt data for multiple uses through a powerful audit trail and role-based access to data. This is critical in the face of evolving compliance measures such as GDPR. The flexibility and agility to track not just where the data originated, but how it’s being used and the outcomes it generates, is a critical component of any forward-looking ML strategy.
3. Democratize data to the edge of business
Reltio is particularly well suited to meet this requirement through frontline business user facing data-driven applications and workflow and collaboration capabilities that come OOTB with Reltio Cloud. It allows teams to submit comments, suggestions, filter and easily segment information through a UI that’s as easy to use as Facebook and LinkedIn.
Data science teams are then able to use Reltio IQ, with Apache Spark to run their algorithms without the pain associated with cleaning and onboarding data in separate environments. This is increasingly important as enterprises deploy machine learning systems, with data scientists requiring relevant, curated data sources to train algorithms to improve results.
As this video illustrates, the true value comes from being able to synchronize ML-algorithm-derived IQ scores back into master profiles as aggregate attributes, making them available for segmentation by marketing, sales, and even data stewards and compliance teams. Teams can then continuously reconcile results to recommendations in a closed loop to self-learn and improve outcomes.
We are tremendously proud and honored to have been included in the MLDC Wave as it reflects our core belief that machine learning cannot be used in isolation from the overall data organization and management needs of the business.
Whatever your desired outcome, MDM forms the backbone of high quality, reliable data which allows ML to thrive.
ML in turn provides unique capabilities to improve and increase the efficiency of data quality, and enterprise data organization operations. Like the graphic I selected for this post, they go hand in hand, and are interconnected across all points of the data continuum and life cycle.
Here are a few predictions and perspectives from industry experts to learn from and be smarter this year:
The Promise of Artificial Intelligence (AI) and Machine Learning (ML) Continues on
There have been repeated predictions over the last couple of years touting a potential breakthrough in enterprise use of AI and ML. This year is no different as the potential benefits from adding some kind of intelligent AI/ML layer to software emboldens more organizations across industries to adopt these technologies.
ML and predictive analytics are leveraged to suggest next-best-actions for sending relevant and timely information to customers and finding opportunities for up-sell and cross-sell. Insights like churn propensity, life-time-value, preferences and abandonment rates can be delivered to relevant teams, along with recommended actions that allow them to capitalize on this information.
Effective May 25, 2018, the European General Data Protection Regulation (GDPR) will force organizations to meet a standard of managing data that many won’t be able to fulfill. They must evaluate how they’re collecting, storing, updating, and purging customer data across all functional areas and operational applications, to support “the right to be forgotten.” And they must make sure they continue to have valid consent to engage with the customer and capture their data.
Meeting regulations such as GDPR often comes at a high price of doing business, not just for European companies, but multinational corporations in an increasingly global landscape. Companies seeking quick fixes often end up licensing specialized technology to meet such regulations, while others resign themselves to paying fines that may be levied, as they determine that the cost to fix their data outweighs the penalties that might be incurred.
With security and data breaches also making high-profile headlines in 2017, it’s become an increasingly tough environment in which to do business, as the very data that companies have collected in the hopes of executing offensive data-driven strategies, weighs on them heavily, crushing their ability to be agile.
Customer-obsessed, Data-driven Retailers will Thrive
With the Cloud Infrastructure as a Service (IaaS) wars heating up, players such as Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure continue to attempt to outdo each other on all vectors including capabilities, price, and service.
To avoid being “Amazoned,” some retailers have even adopted a non-AWS Cloud policy. For most, however, it’s about efficiency and cost. Multi-cloud means choice and the opportunity to leverage the best technology for the business challenges they face.
Today’s Modern Data Management PaaS are naturally multi-cloud, seamlessly keeping up with the best components and services that solve business problems. Acting as technology portfolio managers for large and small companies who want to focus on nimble and agile business execution, these platforms are democratizing the notion of multi-cloud for everyone’s benefit.
The Deck will be Cleared for Accelerated Enterprise Digital Transformation
The business landscape is changing like never before. New revenue models, new competition, newer regulations and exceeding customer expectations are forcing organizations to rethink how they do business.
Digital transformation is one of the key initiatives for many organizations looking for ways to leverage digital technologies, become agile, more productive and above all, provide a connected digital experience for their customers. For digital transformation to succeed, a solid data management foundation is a must.
Today’s Modern Data Management Platforms as a Service (PaaS) seamlessly power data-driven applications that are both analytical and operational. They deliver contextual, goal-based insights and actions that are specific and measurable, so outcomes can be correlated. That leads to the Return on Investment (ROI) Holy Grail and forms a foundation for machine learning to drive continuous improvement. As an added bonus, multi-tenant Modern Data Management PaaS in the Cloud will also begin to provide industry comparables, so companies can finally understand how they rank relative to their peers.
With the emergence of IDNs, ACOs and MCOs, the approach to healthcare is evolving. The focus is on overall well-being and quality of life, rather than a one-time treatment. This requires a new patient-centric approach, complete understanding of the patient’s needs, behaviors and preferences, and focus on building long-term relationships.
In this changing healthcare environment, a modern approach to data management that enables complete understanding of patients, physicians and other partners across all clinics and facilities, while guaranteeing HIPAA compliance is necessary.
Whatever the industry or business need, most enterprises will need to first focus on IA (Information Augmentation): getting their data organized in a manner that ensures it can be reconciled, refined and related, to uncover relevant insights that support efficient business execution across all departments, while addressing the burden of regulatory compliance.
And that’s just the perception and impact of basic data quality (DQ). Even more critical, business decisions are made every day on uncorrelated data which may not be “dirty,” but missing key information that might have resulted in a better decision and outcome.
What can companies do to not just minimize the impact of dirty data, but thrive by using data as a strategic asset? Here’s a quick checklist that can be used to achieve the best outcomes.
1. Use basic cleansing and Data Quality (DQ) not in isolation, but as a precursor to the correlation of data across different siloed sources
2. Leverage core master data management capabilities to match and merge entities to do the correlation of master profiles of any entity (people, products, organizations, places) in a single multi-domain platform
3. Once a continuous process is in place for reliable data, use graph technology to capture relationships of any type between such entities (e.g. people-to-people, people-to-products, people-to-orgs, products-to-suppliers, locations-to-people-to-orgs-to-suppliers)
4. With this reliable data foundation in place, bring in transaction, interaction and social data related to each entity to get a true 360-degree understanding of the behaviors and patterns that can provide relevant insight and help improve operating efficiency and execution
8. Repeat the process in an efficient, seamless manner while adding new data sources across the enterprise to tie in more silos and solve more business problems
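The first three checklist steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular platform implements them: the record fields, the exact-key matching in `match_and_merge`, and the `link` helper are all hypothetical, and real MDM matching is typically fuzzier than an exact comparison.

```python
from collections import defaultdict

def cleanse(record):
    """Step 1, basic DQ: trim whitespace and lowercase every string field."""
    return {k: v.strip().lower() if isinstance(v, str) else v
            for k, v in record.items()}

def match_and_merge(records, key_fields=("name", "email")):
    """Step 2: group cleansed records that agree on key_fields into one
    master profile. Exact-key matching is a simplifying assumption."""
    profiles = {}
    for raw in records:
        rec = cleanse(raw)
        key = tuple(rec.get(f) for f in key_fields)
        profile = profiles.setdefault(key, {"sources": []})
        profile["sources"].append(rec.pop("source", "unknown"))
        for field, value in rec.items():
            if value is not None:
                profile.setdefault(field, value)  # first value seen survives
    return list(profiles.values())

def link(graph, source, relation, target):
    """Step 3: record a typed relationship between two entities."""
    graph[source].append((relation, target))

records = [
    {"source": "crm", "name": " Ada Lovelace ", "email": "ADA@example.com"},
    {"source": "billing", "name": "ada lovelace", "email": "ada@example.com",
     "city": "London"},
    {"source": "crm", "name": "Alan Turing", "email": "alan@example.com"},
]
profiles = match_and_merge(records)

graph = defaultdict(list)
link(graph, "ada lovelace", "purchased", "analytical engine manual")
```

Here the two Ada Lovelace records from different silos collapse into one profile that remembers both sources, and the graph then attaches a relationship to that mastered entity rather than to either raw record.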
Modern Data Management Platforms as a Service today do all of the above and more. They help both IT and business work together to prevent dirty data from flushing “3 Trillion dollars” worth of wasted effort and lost opportunities down the drain.
“Fake news is a type of yellow journalism or propaganda that consists of deliberate misinformation or hoaxes spread via traditional print and broadcast news media or online social media. Fake news is written and published with the intent to mislead in order to damage an agency, entity, or person, and/or gain financially or politically, often with sensationalist, exaggerated, or patently false headlines that grab attention.”
“Dirty data, also known as rogue data, is inaccurate, incomplete or erroneous data, especially in a computer system or database. Dirty data can contain such mistakes as spelling or punctuation errors, incorrect data associated with a field, incomplete or outdated data, or even data that has been duplicated in the database. It can be cleaned through a process known as data cleansing.”
I was honored to be invited to participate on a panel discussing the evolution of Machine Learning (ML) at the Thomson Reuters Emerging Tech Conference. The panel consisted of luminaries such as moderator Jonathan Weber, Global Tech Editor, Reuters News; Asif Alam, Global Business Director, Thomson Reuters; and Vikram Madan, Senior Product Manager, AWS, Machine Learning.
We covered a lot of topics including deep learning, neural networks, image recognition, reliable data foundation as an ML imperative, digital personal identities, the increasing value of enterprise data, how you should safeguard your private data, GDPR, closed-loop as the last mile in ML, how LinkedIn is an example of the next generation data-driven application, autonomous data management and machine learning for data matching and correlation, classification of different types of data, Gluon and the Microsoft – AWS partnership, how elastic cloud computing with unlimited processing power makes ML a reality, and more.
Here are some key takeaways from the panel:
Machine Learning requires a foundation of continuous reliable data to ensure that algorithms are acting on the right information. Generating reliable data is usually the task of master data management (MDM) tools that blend and correlate profile attributes across disparate siloed sources and applications. However, MDM itself cannot deliver the complete picture, as it’s missing the critical set of interactions and transactions that complete the 360-degree view. Today’s modern data management platforms go beyond MDM, and beyond data lakes that have been largely ineffective, by providing a seamless feed of reliable data to maximize the potential of machine learning.
One challenge, not just for machine learning but for advanced analytics in general, has been the friction of synchronizing data models between operational applications and data sources, and the downstream data warehouses and lakes that are being used as the data pool for analysis and ML. Today it is possible to eliminate that friction by seamlessly transitioning information into Spark on demand, so that machine learning can operate on the latest, most up-to-date data, without the need to wait for data model updates and changes, which have traditionally hindered business agility.
Another critical element of making sense of the output from machine learning and advanced analytics is closing the loop and bridging the gap between insight, action and outcomes. Today’s insights are still siloed from the actual actions that business teams will eventually take based upon the data. Further, the outcomes of any actions taken are rarely correlated back to the originating insights. The added value of a continuous feed of reliable data, relevant insights and recommended actions generated from ML is to have a closed loop where users can contribute to data reliability and provide data on the outcomes, implicitly through their actions or explicitly through feedback responses, so that ROI can be tracked and ML has the historical data to learn and improve.
It was a fun night, with the audience contributing to the discussion. The future for AI and ML is a bright one. Everyone agreed that for such initiatives to deliver true value, a reliable foundation of data must first be established.
After all these years, master data management (MDM) has finally emerged from its awkward teenage years as a pimply-faced young adult, not quite sure if it’s ready to take on the world. A few industry analysts have even said that MDM is officially in the “trough of disillusionment,” confirming that while MDM is no longer in diapers, it is not quite mature enough to get a real job or get married.
Having worked in data management for the past 23 years, with most of that time in MDM, I thought I had seen it all.
Traditional 20th century MDM has certainly seen its ups and downs throughout its short history, but what excited me about joining Reltio was the idea of starting with a clean slate and building a 21st century Modern Data Management solution from the ground up. A solution that not only revolutionizes MDM, but goes beyond the basic single version of the truth.
Fortunately, Reltio doesn’t have any legacy 20th century pieces and parts to “Frankenstein” the next generation MDM offering. A luxury that legacy MDM vendors typically cannot afford.
As part of redefining not just the MDM market, but data management in general, Reltio decided to focus on refining one of the key capabilities of MDM–data matching. Although matching algorithms and techniques haven’t changed much over the years, the way these algorithms and techniques are applied could certainly be improved.
Applying a modern approach, including an ongoing emphasis on leveraging machine learning to improve how matching is done, allows companies to be flexible in the early phases of development.
At Reltio, we are about being right faster. Therefore, our ability to tune and re-match all of your company’s key business entities faster, enables your organization to be more agile and accurate in a way that’s a clear departure from today’s MDM norm.
One example is being able to fire off all match rules at once, versus the traditional way of traversing match rules one at a time and stopping once a match is found.
In another example, a life sciences customer of ours defined over 100 match rules with a non-Reltio MDM solution. When they deployed Reltio Cloud, they were able to reduce the number of match rules to just 16. Reltio Cloud is a clear departure from the norm that provides key stakeholders with a modern, agile and simplified approach to data matching.
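To make the difference between the two rule-evaluation styles concrete, here is a minimal Python sketch. The three rules, the record fields, and the function names are all hypothetical and chosen only for illustration; this is not Reltio Cloud's matching engine, and it evaluates rules in a simple loop rather than in parallel.

```python
def same_email(a, b):
    return a["email"] == b["email"]

def same_name_and_zip(a, b):
    return a["name"] == b["name"] and a["zip"] == b["zip"]

def same_phone(a, b):
    return a["phone"] == b["phone"]

# Hypothetical rule set, listed in priority order.
RULES = [
    ("email", same_email),
    ("name+zip", same_name_and_zip),
    ("phone", same_phone),
]

def first_match(a, b, rules=RULES):
    """Traditional traversal: try rules in order and stop at the first hit."""
    for name, rule in rules:
        if rule(a, b):
            return [name]
    return []

def all_matches(a, b, rules=RULES):
    """Evaluate every rule and report all of the ones that fire."""
    return [name for name, rule in rules if rule(a, b)]

a = {"name": "j. smith", "email": "js@example.com",
     "zip": "94065", "phone": "555-0100"}
b = {"name": "j. smith", "email": "john@example.com",
     "zip": "94065", "phone": "555-0100"}

sequential_result = first_match(a, b)   # ["name+zip"]
exhaustive_result = all_matches(a, b)   # ["name+zip", "phone"]
```

With the first-match approach, the phone agreement is never seen because traversal stops at `name+zip`. Evaluating every rule surfaces both pieces of evidence, which is the kind of information that makes tuning and re-matching easier.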
When you distill all of this information down, you’ll find that today’s traditional MDM solutions suffer from the same fatal flaw–a relational database that is used to manage and store data used in the match process.
Today’s MDM requirements go beyond yesterday’s repository of simple “common” master data in the thousands of records, and necessitate a modern solution that is able to integrate millions of transaction and interaction records across multiple systems.
Trying to manage relational database cross-reference tables, joins, intersection tables and more across newly mastered entities, including millions of transaction and interaction relationships creates a relational “spaghetti” mess that just won’t scale.
In the end, what business users need today is a single place where they can find reliable data and relevant insights that drive recommended actions across their entire enterprise.
How CDOs & CIOs are Driving Digital Transformation
Ajay Khanna, VP of Marketing, Reltio
I recently had the opportunity to present at the MIT CDOIQ Symposium in Cambridge, MA. Here is the outline of my presentation, in which I discussed how today’s CIOs and CDOs are driving digital transformation across their enterprises. It covers the key drivers for digital transformation and how Modern Data Management is helping with these initiatives. You can now download my slides from the event from this page.
Today’s business landscape is more dynamic than ever. New revenue models, stringent regulations and high customer expectations are forcing organizations to evolve or face being overrun by more nimble competitors.
CDOs and CIOs of established businesses are looking to digital transformation as a key initiative. But what exactly does digital transformation entail? At its core, any digital transformation requires clean and consistent data, reconciled across systems and channels. An enterprise-wide data management foundation that ensures real-time access to reliable data of all types at scale is non-negotiable. Data access must be democratized across all groups and divisions so that teams can get a 360-degree view of customers, products, organizations and more.
However, it’s not just about disconnected siloed analytics. It’s about the next generation of operational data-driven applications that allow frontline business users to gain relevant insight and intelligent recommended actions so they can achieve their goals. This session explores how some of the largest companies in the world are transforming themselves using the same modern data management technology used by Internet giants such as Amazon, Facebook, LinkedIn and Google.
The presentation covers the following topics:
Changes in business environment and need for agility
Digital transformation drivers
Digital transformation examples
Data-driven digital transformation with Modern Data Management
The 2017 NBA championship is in the books, and the Warriors triumphed in record-breaking fashion, going an amazing 16-1 in the playoffs. LeBron James, to his credit, averaged a triple-double in the finals, but it wasn’t enough.
Of course many said this win was ordained July 4th, 2016 when the eventual MVP of the finals Kevin Durant, made the decision to join the Warriors. But something else might have been in play.
Pundits pointed to Game 3 of the finals, where the Warriors stormed back to shock the Cavaliers, effectively ending any hope of a comeback. In that game, many commentators remarked that LeBron (and Kyrie Irving) looked “gassed”. Many attributed it to fatigue after a long season.
In fact, the data shows that LeBron played the most minutes of any player in the league during the regular season. The table below shows he was ranked first, and Kyrie 15th. The first Warrior doesn’t appear until Klay Thompson at 28th, and Curry & Durant rank 38th!
Clearly, in 2017 they continued their data-driven ways, emphasizing once again that “Strength in Numbers” is as much about the data (and in this case managing minutes and rest) as it is about fan support and bench strength.
Your own enterprise data has insights which can help drive your business. Before you rush to leverage it, make sure it’s reliable and relevant.
Recap of June 2015 insight into the Warriors’ analysis of resting their players.
Along with many I watched and marveled as the Golden State Warriors took on and defeated all challengers in the regular season. Then in the playoffs, each time they faced adversity, coach Steve Kerr seemed to have all of the answers.
One of the major turning points came with finals MVP Andre Iguodala’s insertion into the starting line-up in game four, which made it much harder for LeBron James to continue his dominance. But how did Andre, a 31-year-old veteran, play so hard after such a long season? The fact that he did not start a single game in the regular season kept him fresh. The Warriors’ deep bench and dominance during the regular season even allowed Kerr to rest league MVP Steph Curry in 20 fourth quarters, when the Warriors were way ahead.
Unsurprisingly, for a Bay Area/Silicon Valley team owned by a venture capitalist who undoubtedly knows a thing or two about tech, analytics and data, resting players goes much deeper than just gut feel. As this article by ESPN, “Biggest winner of the Finals? Rest!”, points out, the data that may have been the most valuable was not about shot percentage, match-ups or even opponent tendencies, but about the amount of effort, rest and potential fatigue of each player on the team. When the Warriors fans donned jerseys that said “Strength in Numbers” there was a double entendre.
As clients of wearable technology provider Catapult Sports, they monitor their players’ workloads in practice with GPS monitors and analyze the data with acute attention to maximizing performance while minimizing injury risk.
The latest project: Led by the training staff, Gelfand and the team’s data programmers, the Warriors have engineered a readiness rating for each player built on a 0-to-100 scale (100 is prime shape and 0 is burnt out).
The idea is to give Kerr a handy all-in-one metric that aggregates various health indicators, including a daily five-question survey given to the players to help assess their soreness. Simple questions like, “How do you feel?” and “What’s your mood?” and “How’d you sleep?” Each question has multiple phrases that the players choose from. Each answer corresponds to a number on a five-point scale. The lower the number, the lower the stress levels.
“It’s research,” Lyles says of the survey. “The wording in the answers are specific so it gives guys a good guide. Each guy is very individual. I may ask you the same questions. We want a low score. The best score you can have is a five. So let’s say your average is an 11, that’s your norm after months of doing it. It’s 5 to 25. One point for each question.
“You come in, now you have two days that are 18 and 19. All right, now that’s a trigger. He’s normally an 11, let’s check in. If it’s sleep, we’ll look at the questions that are bad. We’ll look at the travel.”
The Warriors noticed that player stress was linked to lack of sleep. So they rescheduled their flights to the day after, not the night of games, so they could sleep in and get a full night’s rest.
With the subjective side taken care of, the team then tackles the objective portion. They look at SportVU player-tracking data (for game workloads), Catapult data (for practice workloads) and Omegawave heart variability data (to test neurological stress). With these four inputs (including the subjective side), the Warriors have a dashboard that indicates whether a player should give it a go, and for how long.
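The subjective survey score described in the quotes above is simple enough to sketch in Python. The 5-to-25 range and the example of a player whose norm is 11 posting 18 and 19 come from the article; the function names, the `margin` of 5 points and the two-day window are assumptions made for illustration, since the article does not say exactly what threshold the Warriors use.

```python
from statistics import mean

def survey_total(answers):
    """Sum five answers, each on a 1-5 scale, giving a total from 5 to 25.
    Lower totals mean lower stress."""
    if len(answers) != 5 or any(not 1 <= a <= 5 for a in answers):
        raise ValueError("expected five answers, each between 1 and 5")
    return sum(answers)

def needs_check_in(history, recent, margin=5, days=2):
    """Flag a player whose last `days` totals are all at least `margin`
    points above their personal norm (the mean of `history`).
    Both `margin` and `days` are assumed values, not the team's settings."""
    baseline = mean(history)
    last = recent[-days:]
    return len(last) == days and all(t >= baseline + margin for t in last)

typical_day = survey_total([2, 2, 3, 2, 2])        # 11, this player's norm
history = [11, 10, 12, 11, 11]                     # months of data, abbreviated
flagged = needs_check_in(history, [18, 19])        # two high days in a row
not_flagged = needs_check_in(history, [12, 18])    # only one high day
```

The useful design point is that the trigger is relative to each player's own baseline rather than a fixed cutoff, which matches the remark that "each guy is very individual."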
What the Warriors have done is what every business should be looking to do with its own data. Most companies focus on the “objective portion,” relying on specific facts that have occurred through historical transactions from their systems, or purchased from outside sources. All of this data goes into high performance and visualization analytics that ultimately come up with what they believe to be the right answers or insights to form the basis of strategic decisions for the company.
What’s often missing is the subjective data that can be equally if not more valuable. In the Warriors’ case it was individualized feedback from their players. In business this could be input from sales teams, “the feet on the street,” who are interacting with customers and encountering the latest, most accurate data, but typically have no way of feeding that back into the mix. This is a lost opportunity, as sales people in large corporations could be their biggest data-driven advantage.
What the Warriors have shown is that a fully closed-loop of objective data, complemented by subjective feedback, delivers the most reliable data. Only then can coach Kerr have confidence in the recommended actions derived from relevant insights for each player.
Surprisingly even some of the largest corporations don’t embrace this concept. Billions are being poured into analytics, but they are still siloed from any frontline business applications that are used for day-to-day execution. As the most popular consumer data-driven applications such as LinkedIn and Facebook have shown, the most immersive and complete experience comes from a new generation of apps that are both analytical and operational. Only then can you gather data from those closest to the action, while blending the information with complementary sources to get the complete picture.
Many will point out that Cleveland were shorthanded due to major injuries, and LeBron’s incredible performances to the point of exhaustion were not enough. But the Warriors tipped the scales in their favor through their version of a data-driven application, and rebounded before the playoffs even started, to ultimately become champions.
From hoops to Hadoop, congratulations to the Warriors, now it’s time for not just the rest of the NBA, but corporate America to catch up.
Evolving from traditional to next-gen data management would mean taking these four key steps.
1. Tear Down the Silos Between IT and Business
As per the “State of the CIO 2017 Survey”, 59 percent of CIOs said IT collaborates with business units to build business cases for new MDM (and technology) initiatives, while only 24 percent of business leaders said that was the case. This indicates clear departmental silos between IT and Business. To bridge this gap, you must stop seeing your MDM as a separate siloed discipline, requiring complex IT infrastructure, processes, leading to months of design and implementation. Modern data management encourages an alignment and partnership between business and IT through an extremely user-friendly data governance interface, thus maximizing the ROI of your MDM investment. It has inbuilt collaboration and workflow capabilities to meet your enterprise’s governance framework and way of doing business. In addition, it allows IT and business users to provide feedback in a collaborative and controlled manner thus preserving valuable intelligence and competitive advantage.
2. Close the Loop Between Operations and Analytics
According to a recent McKinsey survey, 86% of executives reported that their organizations were only somewhat effective at meeting the goals they set out for their data and analytics initiatives. The biggest culprit is a gap between analytics and embedding these insights into the operating model of the larger organization. Turning data into real value requires a profound reshaping of your day-to-day workflow and digitization of transactions and processes to generate and collect all useful data. A modern data management solution helps consolidate and cleanse data from all sources, transform it into reliable data, and provide relevant insights and recommended actions in the context of your operational applications using predictive analytics and machine learning. It allows you to apply analytics to improve the performance of your core operations. It doesn’t stop there: it further correlates downstream business actions and results back in an integrated closed loop, thus converting big data into smart data, providing faster Time To Analytics (TTA), measurable ROI, and better outcomes.
Modern Data Management Brings Together Best of Breed Technology in a Unified Platform
3. Take Polyglot Data Storage Approach to Achieve Big Data Scale as well as Performance
Different databases are designed to solve different business problems. Using a single database for all the requirements usually leads to non-performant solutions. A data-driven application should be able to bring together data from different database types to achieve the business objective. Most of the operational business applications run on relational, columnar databases, but they do not manage relationships well. Graph databases, while suited for uncovering and handling relationships, don’t have the horizontal scalability and agility to meet enterprise needs. This limitation is leading to the emergence of modern data management platforms built on columnar-graph hybrid stores. Once you create data-driven applications on a reliable data foundation of a Commercial Graph, you can visualize all relevant information and relationships as well as quickly pivot from one application to the other. For example, you can see all the purchased products and stores visited in a consumer’s profile, and with a simple click, you can drill into the product profile, roll-up dynamic hierarchical information (revenue, value, product usage) or find key influencers in customer networks.
4. Strike the Balance Between Offensive and Defensive Data Strategy
Retailers who fail to comply with data security can be fined up to 4% of their revenue and lose the confidence of their customers forever. On the other hand, personalization can deliver five to eight times the ROI on marketing spend, and can lift sales by 10% or more. Thus, the need of the hour is to strike a balance between your defensive and offensive data strategies and make considered trade-offs between “defensive” (e.g. security, governance, and compliance) and “offensive” (e.g. revenue growth, profitability, and customer satisfaction) uses of data as illustrated in this HBR article. A modern data management platform offers flexible data and information architectures that involve both a single source of truth (SSOT) and multiple versions of the truth (MVOTs). It focuses on reliable data at scale for “defense” and delivers relevant insights for “offense” from complete contextual 360-degree views, for personalized engagement.
Can you prove the ROI of your data management efforts? Are you able to conquer the gap between your analytical insights and operational execution? Can your business teams leverage reliable data and relevant insights to solve their day-to-day challenges without compromising on scalability and performance? Are you able to implement a mix of defensive as well as offensive data strategies to meet your data security as well as personalization goals? If the answer is no to one or more of these questions, you must rethink (and reinvent) your data management philosophy.