Best Practices for Data Profiling and Cleansing

Data profiling is the process of examining your source data. Data cleansing is the process of applying the findings of data profiling to standardize the data and remove anomalous patterns.
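
As a rough illustration of cleansing, the sketch below standardizes a country value once profiling has revealed several variants of the same value. The function name cleanse_country and the COUNTRY_CODES mapping are hypothetical and are not part of any Reltio API. The sketch uses only the Python standard library.

```python
import re

# Hypothetical mapping from variants found during profiling to one standard code.
COUNTRY_CODES = {"us": "US", "usa": "US", "united states": "US"}


def cleanse_country(value):
    """Return the standard code for a known variant, or None if unmapped."""
    if value is None:
        return None
    # Lowercase, drop punctuation, and collapse repeated whitespace.
    key = re.sub(r"[^a-z ]", "", value.strip().lower())
    key = re.sub(r"\s+", " ", key).strip()
    # Unmapped values (for example the placeholder "N/A") become None so
    # that they can be flagged for review instead of being loaded as-is.
    return COUNTRY_CODES.get(key)


cleaned = [cleanse_country(v) for v in [" U.S.A. ", "United  States", "N/A"]]
```

Returning None for unmapped values is one possible design choice; a real pipeline might instead keep the original value and route the record to a review queue.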

It is crucial to profile and analyze the data before bringing it into any data management repository, including Reltio. Data profiling supports many aspects of design, including the following:

  • Determining the quality, the range of values, consistency, and completeness of data within a source and across all sources
  • Identifying the source attributes that qualify as good elements for matching purposes
  • Identifying the source attributes that must never be used in the matching process because they may negatively impact the performance or results of matching
  • Identifying the reference data and assessing its consistency and commonality across sources
  • Identifying the attributes that can be used for faceted search
  • Mapping data from customer data sources to the target model within the Reltio Connected Cloud
Note: This document is based on the 2018.2 version of the Reltio Connected Cloud.
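
The profiling checks listed above (completeness, range of values, and consistency across sources) can be sketched in a few lines. The following is a minimal illustration using only the Python standard library. The profile_column function, the sample records, and the metric names are assumptions made for this example and do not come from Reltio.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize completeness, cardinality, and common values for one column."""
    values = [row.get(column) for row in rows]
    # Treat None and blank strings as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    total = len(values)
    distinct = len(set(present))
    return {
        # Share of rows that carry a value at all.
        "completeness": len(present) / total if total else 0.0,
        # Number of different values observed.
        "distinct": distinct,
        # Share of populated values that are different from each other.
        "uniqueness": distinct / len(present) if present else 0.0,
        # Most frequent values, useful for spotting inconsistent variants.
        "top_values": Counter(present).most_common(3),
    }


# Hypothetical sample records from two sources.
rows = [
    {"source": "CRM", "email": "ann@example.com", "country": "US"},
    {"source": "CRM", "email": "bob@example.com", "country": "USA"},
    {"source": "ERP", "email": "", "country": "US"},
    {"source": "ERP", "email": "ann@example.com", "country": None},
]

email_profile = profile_column(rows, "email")
country_profile = profile_column(rows, "country")
```

In this sample, an attribute with high completeness and high uniqueness, such as email, would typically be a stronger candidate for matching. An attribute with only a few distinct values, such as country, is more likely to suit faceted search or reference data. The "US" and "USA" variants in the top values point to a standardization task for cleansing.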