Originally published at Dataconomy at http://dataconomy.com/simplifying-data-acquisition-augmentation/
Despite advances in data integration and the blurring of the lines between batch (ETL) and real-time (EAI) integration, managing and blending third-party or public datasets continues to be a burden for IT and a cost for the business, both in dollars and in missed opportunities. Identifying and blending premium third-party data sources, in particular, has been a painful and costly process.
The advent of Data as a Service (DaaS) built into data-driven applications dramatically changes the game, not just for acquiring external data from multiple sources, but for sharing data internally and providing the opportunity to monetize data through outbound licensing.
DaaS gives business end users real-time access to external data providers and industry sources that they can subscribe to on demand, whether to enrich their data profiles or to investigate segments for a new target market. Data acquisition is critical during a new product launch and often becomes a bottleneck. Quick access to the data sets of choice from readily available data tenants in the cloud, via a simple subscription, reduces time-to-market. The following trends and advancements are simplifying data acquisition.
1. EASY CONNECTION AND CONSUMPTION
Companies struggle to keep their customer, product, and account records in order. It’s a monumental, ongoing task that requires data management practices to keep evolving, and IT organizations are challenged to keep up because the data and the data sources themselves change constantly. Companies do not like to be locked in to a single data source and want the ability to switch when business needs change, but the cost of acquisition and integration often makes switching impractical.
Modern data management technology is making it easier and faster for companies to access third-party data and enrich the data collected from internal applications and systems. Data as a Service provides pre-aligned premium third-party, social media, and public industry sources that are onboarded and combined with internal data according to business policies, with just one click. It is the way forward for agile organizations that need to update and change their data sources in response to a dynamic business environment. DaaS allows companies to choose from a wide range of data providers, pre-integrated and accessible through a simple point and click.
2. REAL-TIME SEARCH AND UPDATES ACROSS DATA SOURCES
Once it is easy to connect to data sources in the cloud, whole new possibilities emerge. On-demand access to data sources allows business users to search directly for data sets across multiple sources. Organizations can search the entire universe of external datasets in real time. They can search and filter the data using hundreds of attributes, data quality metrics, and even the business value of profiles. Business users can then purchase the data they need from multiple vendors and dynamically onboard and combine it with their own. With DaaS, data provider updates are propagated automatically to the end-user application without waiting for batch loads or quarterly refreshes.
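To make the idea of filtering by attributes and data quality metrics concrete, here is a minimal sketch in Python. It assumes a small in-memory catalog; the `DatasetListing` fields, the provider names, and the thresholds are all hypothetical and do not describe any particular DaaS platform, which would expose search through its own interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetListing:
    # All fields are illustrative assumptions, not a real catalog schema.
    provider: str
    name: str
    industry: str
    completeness: float      # share of populated fields, 0.0 to 1.0
    last_updated_days: int   # days since the provider last refreshed the data


def search_catalog(catalog, industry, min_completeness, max_age_days):
    """Return listings that match the industry and quality thresholds, best first."""
    matches = [
        d for d in catalog
        if d.industry == industry
        and d.completeness >= min_completeness
        and d.last_updated_days <= max_age_days
    ]
    return sorted(matches, key=lambda d: d.completeness, reverse=True)


# Hypothetical listings from several providers.
CATALOG = [
    DatasetListing("ProviderA", "us-hospitals", "healthcare", 0.97, 3),
    DatasetListing("ProviderB", "clinic-directory", "healthcare", 0.82, 40),
    DatasetListing("ProviderC", "retail-locations", "retail", 0.91, 7),
    DatasetListing("ProviderD", "physician-registry", "healthcare", 0.93, 10),
]

results = search_catalog(CATALOG, "healthcare", min_completeness=0.9, max_age_days=30)
for listing in results:
    print(listing.provider, listing.name, listing.completeness)
```

In this sketch only the listings that are both fresh enough and complete enough survive the filter, which mirrors how a business user might narrow a large catalog before subscribing to anything.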
3. COLLABORATIVE CURATION OF DATA
Data cleansing is not a one-and-done task. Companies must continuously monitor and maintain their data. Modern data management platforms have built-in collaboration capabilities such as ratings, discussion threads, and even gamification to enable and encourage data users to contribute to data quality. This allows data providers and data consumers to collaborate more closely and keep data current and clean. Data users can participate in data curation, provide feedback, and submit data change requests to the data vendor seamlessly, even from mobile business applications. This real-time feedback keeps the data current. DaaS not only makes it easy to bring in third-party data assets to enrich information for data-driven applications, it also allows data consumers and data providers to curate data collaboratively, improving efficiency and reducing cost on both sides.
4. MACHINE LEARNING HELPS WITH MATCHING
Modern data management platforms use machine learning to match and merge data from various sources in the cloud to create a reliable data foundation. As the number of data sources increases and users gain the ability to switch sources quickly, simple rules-based matching will not be sufficient. Machine learning algorithms will help data stewards match and merge data from dozens of sources to create complete profiles of data entities. However, there is still a level of distrust of black-box machine matching. People want visibility into the matching logic. In this initial phase of machine learning, transparency about the actual rules that drive a merge is necessary. Data consumers want to evaluate the machine-discovered rules and then persist them in the system for ongoing matching.
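One way to picture transparent matching is a rule that reports how much each attribute contributed to a match decision. The sketch below is not a machine learning algorithm; it assumes a learning step has already produced a weighted rule, and the weights, threshold, and field names are hypothetical values chosen for illustration. String similarity comes from Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical rule standing in for one that a learning step might discover.
# A data steward could review it and then persist it for ongoing matching.
RULE = {
    "weights": {"name": 0.5, "city": 0.2, "phone": 0.3},
    "threshold": 0.85,
}


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def explain_match(record_a, record_b, rule=RULE):
    """Score two records and expose each attribute's contribution to the score."""
    contributions = {
        field: round(weight * similarity(record_a[field], record_b[field]), 3)
        for field, weight in rule["weights"].items()
    }
    score = round(sum(contributions.values()), 3)
    return {
        "score": score,
        "is_match": score >= rule["threshold"],
        "contributions": contributions,
    }


crm_record = {"name": "Acme Corp", "city": "Boston", "phone": "555-0100"}
vendor_record = {"name": "ACME Corp ", "city": "boston", "phone": "555-0100"}
unrelated = {"name": "Zyx Ltd", "city": "Denver", "phone": "999-8888"}

print(explain_match(crm_record, vendor_record))
print(explain_match(crm_record, unrelated))
```

Because the result carries per-attribute contributions alongside the decision, a reviewer can see why two profiles were merged instead of having to trust an opaque score.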
Enterprise data is a valuable asset. Chief data officers (CDOs) are thinking about not just using data to improve operational efficiency within their business, but also monetizing data as a significant revenue stream. Companies will increasingly use a reliable data foundation to ensure data quality for internal consumption and to create new revenue streams by monetizing this data.
DaaS capabilities in cloud-based modern data management Platform as a Service (PaaS) solutions enable companies to consume data efficiently, and they also provide an opportunity to distribute data across internal business units and license it externally. The technology used to bring in third-party data can also be used to distribute and license a company’s own data externally, effectively making the company a Data as a Service provider. But companies must provide a full audit trail and lineage showing where the data originated so that licensing rights are clear.
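The lineage requirement can be sketched as metadata carried with every attribute value. The example below assumes each value records its source and whether that source's terms permit outbound licensing; the `AttributeValue` type, the `relicensable` flag, and the source names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeValue:
    value: str
    source: str         # where the value originated (internal system or vendor)
    relicensable: bool  # assumed flag: do the source's terms allow outbound licensing?


def exportable_view(profile):
    """Keep only attribute values whose source terms allow redistribution."""
    return {name: attr.value for name, attr in profile.items() if attr.relicensable}


def lineage_report(profile):
    """List each attribute with its originating source for audit purposes."""
    return sorted((name, attr.source) for name, attr in profile.items())


# A blended profile: some values are internal, one comes from a vendor.
profile = {
    "name": AttributeValue("Acme Corp", "internal-crm", True),
    "revenue_band": AttributeValue("10M-50M", "vendor-x", False),
    "city": AttributeValue("Boston", "internal-crm", True),
}

print(exportable_view(profile))
print(lineage_report(profile))
```

Keeping the source next to each value means the externally licensed view can be derived mechanically, and the audit report shows exactly which provider every attribute came from.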
Data as a Service is changing the way organizations view data subscriptions. Subscribing to data and integrating it into business applications is no longer an arduous task. Data acquisition and consumption are becoming more agile with pre-aligned, one-click onboarding. Modern data management Platform as a Service solutions now offer capabilities such as collaborative curation of data and machine learning for matching, and they empower business users to derive the most value from their data assets through internal use and monetization.