Focus on the algorithms. Eliminate data cleansing by leveraging a continuously reliable data foundation for AI, machine learning & deep learning  


As a Data Scientist or Business Analyst, your job is to uncover insights from a massive, ever-growing pool of uncorrelated data. You've created data lakes, data warehouses and marts, and even tried tools that access that data through virtualization. One thing you absolutely need is for any data that you process to be clean and reliable.

Dirty Data, in the form of unreliable, duplicate, or fraudulent information, may have an even larger impact: as much as 3 trillion dollars! Whether those numbers are accurate is debatable. Closer to home for businesses, Experian estimates that, on average, U.S. organizations believe 32 percent of their data is inaccurate.

And that’s just the perception and impact of basic data quality (DQ). Even more critical, business decisions are made every day on uncorrelated data that may not be “dirty” but is missing key information that might have resulted in a better decision and outcome.

So naturally, Dirty Data is a major concern for Data Scientists and Business Analysts alike. The Verge’s article titled “The biggest headache in machine learning? Cleaning dirty data off the spreadsheets” contains a humorous but no doubt close-to-home quote:

“There’s the joke that 80 percent of data science is cleaning the data and 20 percent is complaining about cleaning the data.”
— Kaggle founder and CEO Anthony Goldbloom (via The Verge)

At Reltio, we leverage machine learning in several ways, including:
  1. Continuously improving consistency, accuracy and manageability for better data quality (DQ), uncovering patterns, detecting anomalies and assisting humans such as data stewards to make their jobs more focused and efficient, which in turn provides the continuously reliable data foundation for data science and business analysis.
  2. Enabling a seamless foundation for the generation of relevant insights and contextual recommended actions, which can be operationalized through data-driven applications.
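
To make the first point concrete, here is a minimal sketch of the kind of data quality pass that could surface records for a data steward to review. It is not Reltio's implementation: the record fields, the `normalize`, `find_duplicates` and `find_outliers` helpers, and the 3.5 threshold are all assumptions chosen for illustration. Duplicates are grouped on a normalized email address, and anomalies are flagged with a modified z-score based on the median absolute deviation.

```python
from collections import defaultdict
from statistics import median

# Hypothetical customer records; the field names are assumptions for this sketch.
records = [
    {"id": 1, "name": "Acme Corp.", "email": "sales@acme.com", "annual_spend": 1200.0},
    {"id": 2, "name": "ACME Corp", "email": "Sales@Acme.com ", "annual_spend": 1150.0},
    {"id": 3, "name": "Globex", "email": "info@globex.com", "annual_spend": 980.0},
    {"id": 4, "name": "Initech", "email": "hello@initech.com", "annual_spend": 1010.0},
    {"id": 5, "name": "Umbrella", "email": "contact@umbrella.com", "annual_spend": 95000.0},
]


def normalize(value):
    """Trim whitespace and lowercase so trivially different values compare equal."""
    return value.strip().lower()


def find_duplicates(records, key="email"):
    """Return lists of record ids that share the same normalized key value."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec[key])].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def find_outliers(records, field="annual_spend", threshold=3.5):
    """Flag ids whose modified z-score (median/MAD based) exceeds the threshold."""
    values = [rec[field] for rec in records]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so this rule cannot rank anything as unusual.
        return []
    return [
        rec["id"]
        for rec in records
        if 0.6745 * abs(rec[field] - med) / mad > threshold
    ]


possible_duplicates = find_duplicates(records)
possible_anomalies = find_outliers(records)
```

In practice the matching key and the anomaly rule would be tuned to the business and industry in question; the point is only that both checks narrow what a human needs to look at.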

We all agree that machine learning needs to be applied in the context of a company's specific business and industry, with a focused set of benefits for each business user's role, so that it can be accurately measured and doesn't get labelled as yet another project with limited value.

Just as aggregating data to perform historical or predictive analytics is cumbersome and expensive, so is gathering and blending all of the right data to make machine learning effective. That capability must be in the DNA of any Modern Data Management Platform as a Service (PaaS).

Bolting AI or ML onto legacy master data management (MDM) systems, or using such MDM tools to feed downstream, disparate ML tools, is a path destined for failure. Reliable data, relevant insights and recommended actions via machine learning need to be seamlessly combined into a single cloud application, delivering both analytical intelligence and operational execution.

But it goes beyond just providing technology. An open ecosystem that allows you to choose and partner with the technologies and domain experts of your choice is critical to getting the most out of a still young and evolving landscape. Most companies are already trying to evolve out of their legacy MDM platforms. Getting further locked into a single vendor that delivers both MDM and ML through siloed, disparate tools will not only fail to provide clarity, but may further complicate an already fragmented data management strategy.

Reltio Insights, as part of Reltio Cloud PaaS, brings mastered data together with big-data-scale interactions and transactions in a single environment, such as Apache Spark on demand, to quickly enable agile, closed-loop insights and action. This provides not only faster Time to Analytics (TTA), but also more relevant and accurate information through best-of-breed machine learning tools and technologies.


Reltio puts all the information you need for analytics and machine learning in one place, accessible in real time. Reliable and accurate data from master profiles, interactions, transactions, third-party, public and social media sources is consolidated for deeper analytics. When analytics run on a reliable data foundation, organizations can make better-informed decisions.


Reltio Insights brings aggregated analytics back to master data profiles to enrich and improve data. Unlike analytics-only tools or disparate machine learning technologies, Reltio's bidirectional connectors enable recommended actions for business improvement inside user applications. The key is a "closed loop" of DQ, insights, action and ultimately outcomes to power continuous machine learning.
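
The idea of bringing aggregated analytics back to master profiles can be sketched in a few lines. This is a plain-Python stand-in under stated assumptions, not the Reltio Insights API or a Spark job: the `master_profiles` and `transactions` structures, the `enrich_profiles` helper and the `total_spend` and `order_count` attributes are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mastered profiles keyed by customer id (assumed structure).
master_profiles = {
    "C1": {"name": "Acme Corp"},
    "C2": {"name": "Globex"},
}

# Hypothetical transaction feed; C9 has no matching master profile.
transactions = [
    {"customer_id": "C1", "amount": 250.0},
    {"customer_id": "C1", "amount": 100.0},
    {"customer_id": "C2", "amount": 40.0},
    {"customer_id": "C9", "amount": 10.0},
]


def enrich_profiles(profiles, transactions):
    """Aggregate transactions per customer and attach the results to each profile.

    Returns the enriched profiles plus the customer ids that appear in the
    transactions but have no master profile, so they can be reviewed.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tx in transactions:
        totals[tx["customer_id"]] += tx["amount"]
        counts[tx["customer_id"]] += 1

    enriched = {}
    for customer_id, profile in profiles.items():
        # Copy the profile so the original master record is left untouched.
        enriched[customer_id] = {
            **profile,
            "total_spend": totals.get(customer_id, 0.0),
            "order_count": counts.get(customer_id, 0),
        }

    unmatched = sorted(set(totals) - set(profiles))
    return enriched, unmatched


enriched_profiles, unmatched_ids = enrich_profiles(master_profiles, transactions)
```

The unmatched ids are the data quality half of the loop: transactions that cannot be tied to a mastered profile are exactly the records worth sending back for review before the next round of analytics.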


Faster correlation of accurate, up-to-date profile data with transaction data from multiple sources makes deployments quick and simple. Pre-configured connectors to analytic environments make leveraging reliable data easy. Data is made available to analytics apps and machine learning algorithms in near real time. Deployed in the cloud, you'll be up and running faster than you dreamed possible.


With Reltio Insights you can enable traditional and non-traditional analytics, together with machine learning, on a single reliable source of structured, unstructured and graph data. Automated updates of business model changes help you keep up with dynamic business environments. Support for multiple representations of data based on algorithms and inference helps with the contextual presentation of insights. Reltio Cloud is multi-tenant, horizontally scalable and always available. Companies can scale up and scale down on demand, using compute power according to the type of insight, from machine learning through to recommended actions.