Understanding Rank and Data Quality Score

Learn how to use Rank and Data Quality Score.

Rank helps to identify the key influencers in the entity set, whereas Data Quality Score helps to identify gaps in data quality. When used in conjunction with Rank, Data Quality Score helps you address the data quality of key profiles.

Rank

Rank defines the importance of a profile and can be a number between 1 and 100. Entities may share the same rank if the key components used in the calculation are the same.

It is calculated using the following key components of a profile:

  • Number of sources - more sources indicate that the information is shared more widely, which increases the importance of the profile.
  • Number of relationships - the higher the number of relationships, the higher the importance of the profile.
  • Number of potential matches - the contribution can be either positive or negative (range -1 to 1).
  • Data Quality Score - a high score indicates that the profile's data is valid.
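
This document does not state the formula that combines these components. The following is only a minimal sketch of how four such inputs could be folded into a value between 1 and 100. The function name, the caps, and the equal weights are all hypothetical and are not Reltio's actual calculation.

```python
def sketch_rank(num_sources, num_relationships, match_signal, dq_score,
                max_sources=10, max_relationships=50):
    """Illustrative only: fold the four components into a 1-100 value.

    The caps and the equal weights are hypothetical assumptions,
    not the product's real formula.
    """
    if not -1.0 <= match_signal <= 1.0:
        raise ValueError("match_signal must be in the range -1 to 1")

    # Normalize every component to the range 0..1.
    sources = min(num_sources, max_sources) / max_sources
    relationships = min(num_relationships, max_relationships) / max_relationships
    matches = (match_signal + 1.0) / 2.0   # -1..1 becomes 0..1
    quality = dq_score / 100.0

    # Hypothetical equal weighting of the four components.
    combined = 0.25 * (sources + relationships + matches + quality)

    # Map 0..1 onto the documented 1..100 range.
    return max(1, min(100, round(1 + 99 * combined)))
```

Because the sketch is a pure function of its inputs, two profiles with identical components receive the same value, which mirrors the statement that entities may share the same rank.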

DQ Score

Data Quality (DQ) Score represents the quality of data and can be a number between 1 and 100. Entities may share the same score if the key components used in the calculation are the same.

It is calculated at the profile, entity type, and overall (tenant) levels.

Currently, for Data Quality Score, the following two metrics are provided:

  • Completeness - based on empty vs non-empty values
  • Recency - based on last updated date

Note:
  • By default, if a match rule exists for an entity type, the system chooses and evaluates all attributes in the match rule.
  • If no match rule exists for an entity type, the system picks the first ten attributes specified in the UI configuration.
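
The attribute selection rule and the two metrics above can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: the function names, the attribute names, the linear recency decay, and the 365-day window are hypothetical, and the sketch reports plain percentages because this document does not say how the metrics are scaled into the final score.

```python
from datetime import date

def select_attributes(match_rule_attributes, ui_attributes, limit=10):
    """Use the match-rule attributes if any exist, else the first `limit` UI attributes."""
    if match_rule_attributes:
        return list(match_rule_attributes)
    return list(ui_attributes[:limit])

def completeness(profile, attributes):
    """Percentage of the evaluated attributes that hold a non-empty value."""
    if not attributes:
        return 0.0
    filled = sum(1 for name in attributes
                 if profile.get(name) not in (None, "", [], {}))
    return 100.0 * filled / len(attributes)

def recency(last_updated, today, max_age_days=365):
    """Hypothetical linear decay: 100 if updated today, 0 at max_age_days or older."""
    age = (today - last_updated).days
    age = max(0, min(age, max_age_days))
    return 100.0 * (1 - age / max_age_days)

# Hypothetical profile with two of four attributes filled in.
profile = {"FirstName": "Ana", "LastName": "Ruiz", "Email": "", "Phone": None}
attrs = select_attributes([], ["FirstName", "LastName", "Email", "Phone"])
print(completeness(profile, attrs))  # 50.0
```

No match rule is passed in the usage lines, so the sketch falls back to the UI attribute list, and the two empty values reduce completeness to half.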

Use Cases

Note: Instead of running analysis models on the entire data set, you can restrict them to entities with a high Data Quality Score. This can produce more accurate models and better outcomes.
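
As a minimal sketch of this use case, the following filters a list of exported entities before they are passed to a model. The field names `uri` and `dqScore` and the threshold of 80 are assumptions made for illustration; use whatever field and cutoff apply in your export.

```python
def high_quality_entities(entities, threshold=80):
    """Keep only entities whose Data Quality Score meets the threshold.

    `dqScore` is a hypothetical field name; entities without it are dropped.
    """
    return [e for e in entities if e.get("dqScore", 0) >= threshold]

# Hypothetical exported entities.
entities = [
    {"uri": "entities/a1", "dqScore": 92},
    {"uri": "entities/b2", "dqScore": 41},
    {"uri": "entities/c3", "dqScore": 80},
]
training_set = high_quality_entities(entities)
print([e["uri"] for e in training_set])  # ['entities/a1', 'entities/c3']
```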

A profile is richer and more accurate when there are multiple sources contributing. When a profile is related to many other profiles, it translates to higher importance for that profile. For example, a doctor related to many hospitals can be a potential key opinion leader. Being related to multiple hospitals, the doctor is assumed to have multiple connections. The doctor's profile is thus considered to have a high Rank.

However, for a high Data Quality Score, the doctor's profile must also have good-quality data, that is, an appropriate first name, last name, contact number, email address, and so on. With all this information available, the doctor's profile is considered to have a high Data Quality Score as well.
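
Using the two scores together, a data steward might look for important profiles whose data still needs work. The sketch below assumes each profile is a dictionary with hypothetical `rank` and `dqScore` fields, and the cutoffs are arbitrary examples.

```python
def remediation_queue(profiles, min_rank=70, max_dq=50):
    """Return high-Rank profiles with a low Data Quality Score.

    Results are ordered by Rank (highest first), then by score (lowest first).
    Field names and cutoffs are illustrative assumptions.
    """
    flagged = [p for p in profiles
               if p["rank"] >= min_rank and p["dqScore"] <= max_dq]
    return sorted(flagged, key=lambda p: (-p["rank"], p["dqScore"]))

# Hypothetical doctor profiles.
doctors = [
    {"name": "Dr. A", "rank": 95, "dqScore": 35},
    {"name": "Dr. B", "rank": 88, "dqScore": 90},
    {"name": "Dr. C", "rank": 20, "dqScore": 30},
    {"name": "Dr. D", "rank": 95, "dqScore": 20},
]
print([p["name"] for p in remediation_queue(doctors)])  # ['Dr. D', 'Dr. A']
```

Dr. B is excluded because the data is already of good quality, and Dr. C is excluded because the profile is not important enough to prioritize.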

Note:
  • No configuration is required for calculating the scores. However, it is possible to customize these scores based on specific requirements.
  • The initial Data Quality configuration is created with the attributes that are defined in the match rules. If no match rules are available, the first ten attributes specified in the UI configuration are chosen.
  • Reindexing a profile makes the Data Quality Scores searchable in the Reltio UI. The reindex option can be manual or automated. By default, reindexing is manual. If you want to reindex automatically after Data Quality calculation, contact the Customer Support Team.
    Note: You can also reindex by using the Tenant Management application in the Reltio Console.
  • The weightage and selection of these attributes can be modified in the Data Quality Configuration UI available in the Console.
  • To customize the Data Quality Score, you can use Reltio IQ. You can subscribe any entity type and change the thresholds for the calculated scores through a microservice.
  • Both scores are calculated and refreshed weekly on an automated schedule.
  • By default, all entities defined in L3 are subscribed.