Category | Glossary Topic | Overview |
Artificial Intelligence | What is AI? | Artificial Intelligence (AI) refers to computer systems designed to mimic human intelligence and perform tasks that typically require human cognitive abilities. |
| What is augmented data management? | Augmented data management (ADM) is an advanced approach to handling and processing data that combines artificial intelligence (AI), machine learning (ML), and natural language processing (NLP) technologies. |
| What is ethical AI? | Ethical AI refers to the principles and practices aimed at ensuring that artificial intelligence systems operate in a manner that is fair, responsible, and aligned with societal values. |
| What is explainable AI? | Explainable AI (also known as XAI) is a field within artificial intelligence focused on making the decision-making processes of AI systems understandable and transparent to humans. |
| What is generative AI? | Generative AI is an advanced field of artificial intelligence focused on creating new content by learning patterns and structures from existing data. |
| What is GPT? | Generative Pre-trained Transformer (GPT) is an advanced AI language model developed by OpenAI that uses deep learning to generate human-like text. |
| What are large language models? | Large language models (LLMs) are advanced artificial intelligence systems designed to understand and generate human-like text. |
Data Analytics | What is data analytics? | Data analytics is the practice of examining large and varied datasets to extract useful insights and information. |
| What is big data? | Big data refers to datasets of high volume, velocity, and variety, generated at a high rate from sources such as social media, sensors, and business transactions. |
| What is a data catalog? | A data catalog is a centralized repository that provides information about the data assets an organization has available. |
| What are data hierarchies? | Data hierarchies represent the relationships between data elements in a structured, often tree-like form (see the hierarchy sketch after this table). |
| What is a data model? | A data model is a simplified diagram of a software system, created through the process of data modeling. |
| What is data mapping? | Data mapping is the process of connecting and transforming data from one source to another. |
| What is data observability? | Data observability is a comprehensive approach to monitoring, managing, and ensuring the quality, reliability, and performance of data systems and pipelines across an organization. |
Data Governance | What is data governance? | Data governance is the set of rules applied to incoming and stored data to ensure it adheres to regulatory and quality standards. |
| What is AI governance? | AI is more than a buzzword; it is a transformative technology reshaping industries, driving innovation, and changing the way people work, and AI governance provides the policies, processes, and oversight needed to develop and use it responsibly. |
| What is data architecture? | Data Architecture is a fundamental pillar of Data Management that translates the short- and long-term needs of the business into defined data management requirements. |
| What are data lifecycles? | Data lifecycle management is a comprehensive approach to managing data from creation to disposal, including how data is collected, stored, processed, and analyzed. |
| What is data lineage? | Data lineage is the process of tracing the flow of data through an organization, from its original source to its final destination. |
| What is data stewardship? | Data stewardship refers to the management of data assets within an organization. |
| What is data tracing? | Data tracing is the process of tracking data flows and transformations across systems and applications in order to understand the path that data takes through an organization's infrastructure. |
Data Infrastructure | What is data infrastructure? | Data infrastructure is a critical component of modern data-driven organizations, providing the foundation for managing and analyzing large volumes of data to gain insights and make data-driven decisions. |
| The Difference Between a Data Lake, Data Warehouse, and Data Lakehouse | In the world of big data, there are three main types of data repositories: the data lake, the data warehouse, and the data lakehouse. |
| What is a graph database? | A graph database is a type of database that uses nodes and edges to represent and store data, allowing for the representation of complex and interconnected data structures (see the graph sketch after this table). |
| What is a data cloud? | A data cloud is a modern data architecture that enables organizations to store, manage, integrate, and analyze vast amounts of data seamlessly across multiple environments, including on-premises systems, public clouds, private clouds, and hybrid setups. |
| What is data fabric? | A data fabric is an approach to managing and processing data across distributed environments, providing unified and integrated data management. |
| What is a data lake? | A data lake is a centralized repository that stores large volumes of raw data in its native format; understanding data lakes is important for organizations looking to derive insights from their data assets. |
| What is data mesh? | A data mesh is a modern approach to data architecture that emphasizes the decentralized management of data within an organization. |
| What is data warehousing? | Data warehouses are repositories for processed data pooled from raw data sources and made available for downstream processing, analysis, and business intelligence (BI) reporting to organizational decision makers. A data warehouse is a core element of enterprise data infrastructure. |
Data Integration | What is data integration? | Data integration and interoperability (DII) encompasses the processes related to the movement and ultimate consolidation of enterprise data within data marts, hubs, warehouses, and lakes. |
| What is data consolidation? | Data consolidation is the practice of integrating data from multiple sources into a single target such as a data warehouse to be used by data consumers. |
| What is data extraction? | Data extraction is the first stage in a data integration and interoperability data pipeline. |
| What is DataOps? | Data operations, or DataOps, is the collection of practices and tools businesses use to manage their data life cycles. |
| What are data pipelines? | Data pipelines are the sequential processes that transform source data into data suitable for storage in target locations (see the pipeline sketch after this table). |
| What is data streaming? | Data streaming is the process of continuously and rapidly ingesting, processing, and analyzing large volumes of data as it is generated and received in real time. |
| What are data workflows? | Data workflows encompass the broader groups of processes that act on data to make it useful; in its simplest form, a data workflow is a data pipeline. |
Data Management | What is data management? | Data management refers to the set of practices, techniques, and tools for managing storage of and access to enterprise data assets while ensuring security and governance. |
| What is data interoperability? | Data interoperability is the ability to exchange and process data among different systems and business processes. |
| What is data deduplication? | Data deduplication is a data management technique that identifies and eliminates duplicate or redundant entries within a dataset (see the deduplication sketch after this table). |
| What is customer data management? | Customer data management involves the collection, storage, analysis, and utilization of information about customers to enhance decision-making. |
Data Quality | What is data quality? | Data Quality is a central pillar in any Data Management framework; it assumes data is inherently “dirty” and challenged by issues across multiple dimensions: completeness, accuracy, duplication, consistency, validity, integrity, reasonability, and currency. |
| What is data cleansing? | Data cleansing (also cleaning or scrubbing) is a process within data quality management for transforming data to align with data standards and domain rules. |
| What is data integrity? | Data integrity is the assurance that data meets quality standards at all times. |
| What is data profiling? | Data profiling is a form of statistical analysis used to inspect data sets and determine their inherent quality (see the profiling sketch after this table). |
| What is data standardization? | Data standardization is the data quality process of transforming data to fit a predefined, constrained set of values, relying on uniformity to improve data efficiency (see the standardization sketch after this table). |
| What is data validation? | Data validation is a technique used in data quality management to improve data quality by restricting the values that may be entered in a data field to an acceptable set (see the validation sketch after this table). |
| What is entity resolution? | Entity resolution is the process of determining whether two data entries actually represent the same real-world object (see the entity resolution sketch after this table). |
| What is data preparation? | Data preparation involves cleaning, organizing, and transforming raw data to ensure its quality, consistency, and usability for analysis. |
Data Unification | What is data unification? | Data unification is part of a broad category of services that gather data from different sources, and with different attributes, into a single trusted, unified view. |
Master Data Management | What is master data management? | Master data management (MDM) creates a single, trusted view of an organization’s critical data; businesses that follow a master data approach accept the necessity of using multiple data systems in today’s fast-paced digital world, but recognize the limitations of these systems in working together to support the needs of the business. |
| What are MDM implementation styles? | Effective master data management (MDM) requires a clear implementation plan and an appropriate implementation style that fits the needs of the business. |
| MDM vs CRM vs CDP | CRMs, CDPs, and MDM platforms all rest on a foundation of data, particularly data about customers. |
| What is Customer 360? | Customer 360 is a comprehensive, unified view of each customer assembled from data across an organization’s systems and touchpoints; in the era of big data and heightened customer expectations, companies increasingly focus on managing and leveraging customer data efficiently. |
| What is a customer data platform? | Customer data platforms (CDPs) are marketing-focused technologies that combine customer data from multiple sources to establish a unified view of each individual customer. |
| What is DaaS? | DaaS stands for Data-as-a-Service, a cloud-based storage provisioning and data distribution model accessed, like other “as-a-service” models, via a network connection, often on pay-as-you-go schemes. |
| What is data engineering? | Data Engineering is the set of processes used to turn raw data into usable data for data scientists and data consumers. |
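
A few of the entries above describe concrete structures or techniques, and the short Python sketches below illustrate them. They are minimal, illustrative sketches under stated assumptions, not implementations from any particular product. First, the hierarchy sketch: a data hierarchy represented as a tree, where each element has one parent and any number of children; the region/country/city levels are assumptions chosen for illustration.

```python
# A minimal sketch of a data hierarchy as a tree-like structure:
# each element has one parent and any number of children.
hierarchy = {
    "Europe": {
        "France": ["Paris", "Lyon"],
        "Germany": ["Berlin", "Munich"],
    },
}

def walk(node, depth=0):
    # Print each level of the hierarchy with indentation.
    if isinstance(node, dict):
        for name, children in node.items():
            print("  " * depth + name)
            walk(children, depth + 1)
    else:  # leaf level: a plain list of items
        for name in node:
            print("  " * depth + name)

walk(hierarchy)  # Europe > France/Germany > cities
```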
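The graph sketch: one way the node-and-edge model of a graph database can be represented. The `Graph` class and the `WORKS_FOR` label are hypothetical and chosen for illustration; real graph databases expose much richer storage and query languages.

```python
# A minimal sketch of the node-and-edge model a graph database uses.
# Class and labels here are hypothetical, for illustration only.
class Graph:
    def __init__(self):
        self.nodes = {}   # node_id -> attribute dict
        self.edges = []   # (source_id, relationship, target_id) triples

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs

    def add_edge(self, source, relationship, target):
        self.edges.append((source, relationship, target))

    def neighbors(self, node_id, relationship=None):
        # Follow outgoing edges, optionally filtered by relationship label.
        return [t for s, rel, t in self.edges
                if s == node_id and (relationship is None or rel == relationship)]

g = Graph()
g.add_node("alice", kind="person")
g.add_node("acme", kind="company")
g.add_edge("alice", "WORKS_FOR", "acme")
print(g.neighbors("alice", "WORKS_FOR"))  # ['acme']
```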
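The pipeline sketch: a data pipeline as a sequence of transforms applied in order to each source record. The `clean` and `enrich` stages and their field names are assumed examples, not a prescribed design.

```python
# A minimal sketch of a data pipeline: each record flows through the
# stages in sequence. The clean/enrich stages are assumed examples.
def clean(record):
    # Trim stray whitespace from every string field.
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}

def enrich(record):
    # Derive a new field from existing ones.
    record["full_name"] = f"{record['first']} {record['last']}"
    return record

STAGES = [clean, enrich]

def run(records, stages=STAGES):
    for record in records:
        for stage in stages:   # output of one stage feeds the next
            record = stage(record)
        yield record           # now suitable for the target location

print(list(run([{"first": " Ada ", "last": "Lovelace"}])))
```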
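The deduplication sketch: exact-match deduplication keyed on a field assumed to define identity (here, a hypothetical `email` field). Production systems typically add fuzzy matching on top of this basic idea.

```python
# A minimal sketch of exact-match deduplication. The choice of "email"
# as the identity key is an assumption; real systems vary.
def deduplicate(records, key_fields=("email",)):
    seen = set()
    unique = []
    for record in records:
        key = tuple(record[f].lower() for f in key_fields)
        if key not in seen:      # keep only the first occurrence
            seen.add(key)
            unique.append(record)
    return unique

rows = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Ada L.", "email": "ADA@example.com"},  # duplicate by email
]
print(deduplicate(rows))  # one record remains
```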
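The profiling sketch: basic statistics (row count, nulls, distinct values) computed over a single column to gauge its quality; the statistics chosen are a small assumed subset of what profiling tools report.

```python
# A minimal sketch of data profiling: simple statistics over one column.
def profile(values):
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),                  # total rows
        "nulls": len(values) - len(non_null),  # missing values
        "distinct": len(set(non_null)),        # unique non-null values
    }

print(profile(["a", "b", "a", None]))  # {'count': 4, 'nulls': 1, 'distinct': 2}
```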
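The standardization sketch: free-form variants mapped onto a predefined, constrained set of values. The mapping table and country codes are assumptions for illustration.

```python
# A minimal sketch of data standardization: free-form variants are mapped
# onto a predefined, constrained set of values. The table is an assumption.
CANONICAL = {"usa": "US", "u.s.": "US", "united states": "US", "us": "US"}

def standardize_country(value):
    # Unmapped values pass through unchanged for later review.
    return CANONICAL.get(value.strip().lower(), value)

print(standardize_country(" U.S. "))  # 'US'
```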
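The validation sketch: a data field restricted to an acceptable set of values, with violations reported as errors. The `country` field and its allowed set are assumptions.

```python
# A minimal sketch of data validation: a field is restricted to an
# acceptable set of values. Field name and allowed set are assumptions.
ALLOWED_COUNTRIES = {"US", "CA", "GB", "DE"}

def validate(record):
    errors = []
    if record.get("country") not in ALLOWED_COUNTRIES:
        errors.append(f"country {record.get('country')!r} not in allowed set")
    return errors

print(validate({"country": "US"}))  # [] -> record passes
print(validate({"country": "XX"}))  # one error reported
```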
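The entity resolution sketch: two records' names scored for similarity to decide whether they represent the same real-world object. `difflib` is in the Python standard library; the 0.85 threshold and name-only comparison are assumptions, since real resolvers compare many attributes.

```python
# A minimal sketch of entity resolution: score name similarity and decide
# whether two records refer to the same real-world entity. difflib is in
# the standard library; the 0.85 threshold is an assumption.
from difflib import SequenceMatcher

def same_entity(a, b, threshold=0.85):
    score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return score >= threshold

r1 = {"name": "Jonathan Smith"}
r2 = {"name": "Jonathon Smith"}
print(same_entity(r1, r2))  # True: likely the same person
```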