Data profiling is a data hygiene technique that assesses the quality of the data in a formal data set against specific business rules. It is usually performed through statistical analysis: a program examines the content of a relational database, draws conclusions about it, and determines whether the data meets business standards.
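As a rough illustration of the kind of statistical analysis described above, here is a minimal Python sketch. It assumes the data has already been read from the database into a list of dictionaries. The function names (profile_column, rule_pass_rate), the sample customer records, and the age rule are all hypothetical and chosen only for illustration; real profiling tools compute many more statistics.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values, most common value."""
    values = [row.get(column) for row in rows]
    # Treat None and the empty string as missing (an assumption for this sketch).
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


def rule_pass_rate(rows, rule):
    """Return the fraction of rows that satisfy a business rule (a function returning True/False)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if rule(row)) / len(rows)


# Hypothetical sample data standing in for a database table.
customers = [
    {"id": 1, "country": "US", "age": 34},
    {"id": 2, "country": "US", "age": -5},
    {"id": 3, "country": "", "age": 51},
    {"id": 4, "country": "DE", "age": 28},
]

country_profile = profile_column(customers, "country")
# Hypothetical business rule: an age must fall between 0 and 120.
age_rule_rate = rule_pass_rate(customers, lambda r: 0 <= r["age"] <= 120)

print(country_profile)
print(age_rule_rate)
```

In this sample, the country column has one missing value out of four rows, and three of the four rows satisfy the age rule. A low pass rate or a high missing count is the kind of signal that would suggest the data does not yet meet business standards.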
Why use data profiling?
Data profiling reveals a great deal about the quality and utility of a data set with very little human effort. It can tell you what information the data contains, so that you can decide how to use it or whether to retain it. It can also tell you how much data cleansing will be needed before the data can be merged with another data set. In both respects, data profiling promotes good data governance.
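One way to gauge the cleansing effort a merge might need is to compare how many keys match between two data sets as they are with how many match after a simple normalization. The sketch below is only an assumption about how such a check could look: merge_readiness, the default trim-and-lowercase normalization, and the sample email lists are hypothetical.

```python
def merge_readiness(left_keys, right_keys, normalize=lambda k: k.strip().lower()):
    """Count key matches between two data sets before and after normalization."""
    left_raw, right_raw = set(left_keys), set(right_keys)
    left_norm = {normalize(k) for k in left_raw}
    right_norm = {normalize(k) for k in right_raw}
    return {
        "match_as_is": len(left_raw & right_raw),
        "match_after_cleansing": len(left_norm & right_norm),
    }


# Hypothetical join keys from two systems that are about to be merged.
crm_emails = ["ann@example.com", "Bob@Example.com ", "cy@example.com"]
billing_emails = ["ann@example.com", "bob@example.com", "dee@example.com"]

report = merge_readiness(crm_emails, billing_emails)
print(report)
```

Here only one key matches as is, but two match after trimming whitespace and lowercasing. The gap between the two counts gives a rough sense of how many records need cleansing before the merge.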
What are the benefits of data profiling?
The primary benefit of data profiling is an accurate prediction of how much effort a data project may require, which in turn can help strengthen an ROI prediction. Data profiling is an early step for organizations attempting to take control of all enterprise data, especially as part of a Master Data Management initiative.