Henrik Liliendahl, Chairman & CTO, Product Data Lake

Henrik Liliendahl is an MDM and PIM expert, speaker and blogger (Liliendahl on Data Quality and The Disruptive MDM/PIM/DQM List), and the Co-Founder, Chairman and CTO of Product Data Lake, a product information exchange service. He has many years of cross-industry experience in consulting with clients across all major MDM and PIM technologies.

What was your route into technology, MDM and PIM?

I got good grades in mathematics at school, but what got me started was an evening class I took in secondary school in what was then called EDP (Electronic Data Processing). We didn’t even have a computer at that school, so the whole class was about things drawn on the blackboard. Then one day we had a trip to a larger college that actually had computers and I was hooked!

I was educated in computer engineering but my first job was on the business side. Since then I’ve bounced back and forth between IT and business. In the late 1990s I got into data matching and developed some algorithms myself, because the internet was not as good then; otherwise I probably could have found some online.

“The situation was that two companies wanted to merge and wanted to know if they had customers in common. To find that out we applied some matching techniques.”

I developed that product and later merged it into a larger company. That was my route into data management, and from there into MDM, PIM and data governance. Now I think I have good coverage of all the different disciplines.
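As an illustration of the kind of matching described above, here is a minimal sketch of finding customers that two companies have in common. It is not the algorithm Henrik built; it uses Python's standard `difflib.SequenceMatcher` as a stand-in, and the company lists, the `normalise` helper and the 0.85 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case the name, turn punctuation into spaces and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0, computed on the normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def common_customers(list_a, list_b, threshold=0.85):
    """Return (a, b, score) for every pair scoring at or above the threshold."""
    matches = []
    for a in list_a:
        for b in list_b:
            score = similarity(a, b)
            if score >= threshold:
                matches.append((a, b, round(score, 2)))
    return matches


# Hypothetical customer lists from the two merging companies.
company_a = ["Acme Trading Ltd.", "Nordic Tools A/S", "Blue Harbour Cafe"]
company_b = ["ACME Trading Ltd", "Nordic Tool AS", "Green Valley Farm"]

for a, b, score in common_customers(company_a, company_b):
    print(f"{a!r} ~ {b!r} (score {score})")
```

Comparing every pair is fine for a small example; real customer files would normally need blocking (for instance by postcode) to keep the number of comparisons manageable, and would match on more than the name.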

How would you define “modern” data management and what does it, or should it, mean for organisations that adopt it?

Getting a more holistic view on data. Looking at data enterprise-wide has always been part of MDM but modern needs mean you must now look across companies and business ecosystems.

“Driven by digital transformation, companies must interact more and share data – they cannot reinvent the wheel, they need to work together. Business ecosystems are surely a part of modern data management.”

Embracing different data stores in the technology space. Entity-relationship databases have ruled the world for thirty years or so but now we are seeing datastores that are not that rigid, and data lakes coming in. There are lots of different types of databases, such as graph and document databases, that can be used for different purposes, and we need to embrace those.

Also of course deployment in the cloud is the thing in modern data management – we will stop running all of this on our own iron and get it from someone else.

“Cloud gives more agility in deploying and new ways of handling data. Funding is also easier because it is a subscription and the funding corresponds to your usage.”

Also DaaS (Data as a Service). I remember back in the old days, if the postcode table changed you had to get a new postcode table and load it into all of your systems. Now with DaaS, you have this kind of data in real time – not just postcodes but all sorts of reference data and many other third-party data sources available immediately in new applications. That’s a huge advantage.

What are your top 3 tips or resources to share for aspiring modern data masters?

  1. Find ways not to manually type in the data – try to find data that is already digitised. You get so many errors when people type it in themselves. Get it from outside sources where it already exists in a refined form.
  2. Connect instead of collect. Don’t harvest all data yourself – get it externally where it is already collected and maintained. You can find third and second-party data – you don’t always need to be collecting it internally.
  3. Data models – look at the real world. Take a customer table, for example. Yes, you have customers, but does your definition of customer make sense in the real world? Often a customer is also another entity at the same time, such as a person, organisation, supplier or employer.

“Try and model the real world into your data stores because then you are prepared for future use cases. That is often something that goes wrong – you make your model or application according to how your business looks today and then in two years you have to scrap everything.”

However, if you look at the real world and think about what might come then you will be better prepared and can probably re-use your data later in new business models.
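One common way to follow the third tip is a party model, where the real-world person or organisation is stored once and "customer" or "supplier" becomes a role attached to it. The sketch below is only an illustration of that idea under assumed names (`Party`, `add_role`, the sample record); it is not taken from any product mentioned in this interview.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """A real-world person or organisation, independent of any business role."""
    party_id: str
    name: str
    kind: str  # "person" or "organisation"
    roles: set = field(default_factory=set)

    def add_role(self, role: str) -> None:
        """Attach a business role such as 'customer' or 'supplier'."""
        self.roles.add(role)

    def has_role(self, role: str) -> bool:
        return role in self.roles


# Hypothetical example: one organisation that is both a customer and a supplier.
acme = Party("P-001", "Acme Trading Ltd", "organisation")
acme.add_role("customer")
acme.add_role("supplier")  # same real-world entity, second role, no duplicate record

print(sorted(acme.roles))
```

Because the roles hang off a single party record, a new business model that needs another role (say, "partner") extends the data rather than forcing a new table and a re-keying exercise.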

You have a lot of experience and success, particularly in product-related data. Can you tell us a little more about the concepts behind Product Data Lake and your vision for how it could be used in the future?

It is based on the theme of business ecosystems and products flowing easily between trading partners. Product data looks very different depending on the product you are talking about and manufacturers see products very differently from the merchants.

“On the technology side, you can exploit cloud services and data lake concepts, which means you can receive the data in the form that, say, a manufacturer might submit it, but consume it in a way that a merchant might want to.”

We do the linking, matching and merging inside the data lake at the time of consumption. We are also starting to use AI in doing this linking, mapping and transformation – it’s a huge task in terms of the data that flows through and very complex because it is not just one organisation but many. We are at the crawling stage of doing this now but we are aiming to walk, then run!
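The idea of storing data as submitted and transforming it only at the time of consumption can be sketched as follows. This is a simplified illustration, not Product Data Lake's implementation: the source names, field names and the `consume` function are hypothetical, and the field mappings are hand-written here rather than derived by matching or AI.

```python
# Raw store: records kept exactly as each (hypothetical) manufacturer submitted them.
raw_store = [
    {"source": "manufacturer_x",
     "payload": {"ArtNo": "DR-100", "Name": "Cordless drill", "WeightKg": 1.8}},
    {"source": "manufacturer_y",
     "payload": {"item_code": "FR-200", "item_name": "Fridge", "weight": 62.0}},
]

# Per-source mappings from submitted field names to the merchant's field names.
field_maps = {
    "manufacturer_x": {"ArtNo": "sku", "Name": "title", "WeightKg": "weight_kg"},
    "manufacturer_y": {"item_code": "sku", "item_name": "title", "weight": "weight_kg"},
}


def consume(store, maps):
    """Translate stored records into the merchant's format at read time."""
    merchant_view = []
    for record in store:
        mapping = maps[record["source"]]
        payload = record["payload"]
        merchant_view.append(
            {target: payload[src] for src, target in mapping.items() if src in payload}
        )
    return merchant_view


for row in consume(raw_store, field_maps):
    print(row)
```

The raw records are never rewritten, so a second merchant with different field names only needs its own set of mappings applied to the same store.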

It is also a bit like social media in terms of linking. That is, trading partners request and accept partnerships as you would do in a social network. We are also using tagging of data to help with the linking and consumption of data – as you do in social networks. Really using and embracing these concepts that are out there in terms of collaboration. We also like to collaborate with suppliers of similar technologies like Reltio and others out there on the market.

“It is a bit like an advanced Dropbox where the manufacturers drop their data in and the merchants take it out. In the middle, we do all the matching and linking and provide the infrastructure that the algorithms or exchange gateways run on.”

Could you also please share an example of where things have not gone so well and what you learned from the experience?

I always like to collaborate and that is a great thing of course but only if you align expectations. You need to be sure that you have the same expectations – not that you have to write a 30-page legal contract – but that you talk through success criteria and what each of you considers to be a win-win. I missed that a few times and hope I will not do that in the future.

It may be that some highly technical people are sometimes not great at talking, but it is probably more because in these projects you mix technical and commercial people. Aligning these two worlds is difficult, and what one side thinks is obvious is not obvious to the other side. Make sure that you are not making assumptions about other people’s understanding.

“As far as specific examples go, I could share a few stories but it’s probably better if I don’t! (Laughs)”

What trends or changes do you predict to the data management arena in the next few years?

We talked about business ecosystems and I really think their importance will increase in the future. We also touched on AI, which is another buzzword but one that I think will be more sustainable than others have been. I think it will have huge growth and a great impact on what we do in the future. We could also talk about the Internet of Things – we will see more and more smart devices, not just phones but refrigerators and drilling machines and of course lots of smarter industrial machines. We will see scope-creep from IoT into data management – it will have a huge impact.

There will be more cross-department and enterprise-wide working – companies are increasingly coming up with global rather than local solutions already. This will then spread into the business ecosystems, so we will see people working with people who are on the payroll in other companies or who are contractors. We will have shifting colleagues and working scenarios around data, especially where it is connected to business outcomes.

Is there anything we have not discussed that you would like to cover?

No, I think we have covered all my pet hobby horses!

What do you like to do outside of work?

I am at an age now where I have grandchildren so I really like to be with them. I like to walk and bicycle – it is good exercise but you can also think while you do these things. It is as much to freshen my mind and do some out-of-the-box thinking.

I am very interested in history so I like to visit historical places – fortunately, I can often combine that with my work so when I travel I try to leave some time for pleasure and to experience different cultures. Things have changed a lot since I was young – it used to be completely different wherever you went but now you have very similar shops on every high street – always an H&M and a Zara. You can tell that I have daughters!

Which 3 people – living or dead, real or fictional – would you invite to a dinner party and why?

I said I was very interested in history so it’s probably not surprising that I would pick some historical figures. The first one would be Herodotus – an ancient Greek historian – perhaps the father of history. He was probably the first person who applied a methodology to writing history – he did not just pick the first anecdote that came along. Instead, he gathered the facts and made an informed decision on what to include.

Charles Darwin – famous for his theory of evolution by natural selection, often summed up as survival of the fittest. Modern science has come up with a slightly different theory which is about the survival of the fit enough – you don’t always have to be a winner.

“There’s a clear analogy to MDM in that – the thing about the golden record being the only data to survive is not true – there is more than one version of the truth.”

It would be good to hear what Charles Darwin would think of these later theories.

And with these clever people, I think Albert Einstein would be the third one.

“There are all these quotes on the internet that are attributed to him and it would be good to know what he really said and what he really meant instead of these quotes that we can’t find the source for. That really irritates a data quality practitioner!”

That sounds great – what are you cooking – can I come too?

If I am cooking, my signature dish is Spanish paella. And sure, you are welcome.