Data validation is the technique of restricting the possible range of values of a data field in order to eliminate entry errors and inconsistencies and to build context around the field.
Validation rules applied during data entry (such as on electronic forms) can instantly determine whether input data falls within the acceptable range of values and reject erroneous values. For instance, when recording addresses, each US state has a standardized official abbreviation, and the set of these abbreviations can be adopted as the acceptable range of values for state locations. Any value that falls outside this range is rejected, whereas any value inside it is accepted as input, regardless of whether it is accurate. (The state code XY would be rejected, and AZ would be accepted because it is in the range of acceptable state abbreviations.)
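The state-code rule can be sketched as a simple membership test. This is a minimal illustration rather than a complete implementation: the name `is_valid_state_code`, the four-entry sample set, and the choice to trim whitespace and ignore letter case are all assumptions made for the example.

```python
# Illustrative subset only; a real form would load the full official list.
VALID_STATE_CODES = frozenset({"AZ", "CA", "NY", "TX"})


def is_valid_state_code(value: str) -> bool:
    """Return True if value is one of the accepted state codes.

    Surrounding whitespace and letter case are ignored (an assumption
    of this sketch, not a requirement of the rule itself).
    """
    return value.strip().upper() in VALID_STATE_CODES


if __name__ == "__main__":
    for code in ("AZ", "XY"):
        status = "accepted" if is_valid_state_code(code) else "rejected"
        print(code, status)
```

Note that the check only confirms the code is in the accepted set; it cannot tell whether AZ is the correct state for the address being entered.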
Validation can go further than checking an input against a list of valid values. A careful designer will recognize that some fields can be validated against others, such as addresses against ZIP codes. A simple validation rule can restrict the range of state abbreviations, but it still admits inaccuracies: an address may be in Hollywood, yet a state abbreviation other than California (CA) may be entered. By using ZIP codes, the designer can narrow the scope of validation by comparing the address to a stricter standard. Because the US Postal Service knows which addresses fall within each ZIP code, an address can be considered valid if it falls within the right ZIP code. In fact, many address validation tools and services exist simply to augment client data systems in this capacity.
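A cross-field check of this kind might look like the sketch below. The lookup table `ZIP_TO_STATE` is a hypothetical two-entry sample, and `state_matches_zip` is an illustrative name; a production system would rely on postal data or an address validation service instead of a hard-coded dictionary.

```python
from typing import Optional

# Hypothetical sample data; real systems would query postal data
# or an address validation service.
ZIP_TO_STATE = {
    "90028": "CA",  # Hollywood, Los Angeles
    "85001": "AZ",  # Phoenix
}


def state_matches_zip(state: str, zip_code: str) -> Optional[bool]:
    """Compare an entered state code with the state implied by the ZIP code.

    Returns True or False when the ZIP code is in the table, and None
    when it is unknown, so the caller can decide how to treat that case.
    """
    expected = ZIP_TO_STATE.get(zip_code.strip())
    if expected is None:
        return None
    return state.strip().upper() == expected


if __name__ == "__main__":
    print(state_matches_zip("CA", "90028"))  # state agrees with the ZIP code
    print(state_matches_zip("AZ", "90028"))  # valid state code, wrong for this ZIP
```

Returning None for an unknown ZIP code keeps "cannot be checked" separate from "checked and wrong", which a plain boolean would blur.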
Valid data is a key principle of data quality, and as the address example shows, validation can become a complex undertaking, one that many techniques and tools have been developed to make more manageable.