Defining Datasets with Matches

The Data Access API lets you expose match information as Dataset fields.

For a match event, you can expose the following information:
  • Type of match (for example, Potential Match, Manual Match, or Not a Match)
  • Match rules
  • Timestamp

Using .matches

Use the .matches method with the matchType parameter to get information about matches:


import com.reltio.analytics.objects.Match.MatchType._
// Possible values: POTENTIAL_MATCH, NOT_MATCH, MANUAL_MATCH, ANY
af.matches(matchType = POTENTIAL_MATCH).show

Result Dataset:


| matchKey        | sourceId | targetId | matchRules           | timestamp     | type            | matchScore |
| 1EhCWga:1DypjwO | 1EhCWga  | 1DypjwO  | [configuration/en... | 1490104311627 | POTENTIAL_MATCH | 30         |
| 1DypjwO:1EhCWga | 1DypjwO  | 1EhCWga  | [configuration/en... | 1490104311627 | POTENTIAL_MATCH | 30         |
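
The two rows above describe the same pair of entities with sourceId and targetId swapped. When only one row per pair is needed, one possible approach is to keep the row whose sourceId sorts before its targetId. The following is a minimal sketch in plain Scala collections, not part of the Data Access API: the MatchRow type and the uniquePairs helper are hypothetical names introduced for illustration, and only a subset of the columns is modeled.

```scala
// Hypothetical row type mirroring some columns of the result Dataset shown above.
final case class MatchRow(
    matchKey: String,
    sourceId: String,
    targetId: String,
    timestamp: Long,
    matchType: String
)

object UniquePairs {
  // Both directions of each match are present, so keeping only the rows whose
  // sourceId sorts before targetId leaves exactly one row per pair.
  def uniquePairs(rows: Seq[MatchRow]): Seq[MatchRow] =
    rows.filter(r => r.sourceId.compareTo(r.targetId) < 0)

  // The two sample rows from the result above.
  val sample: Seq[MatchRow] = Seq(
    MatchRow("1EhCWga:1DypjwO", "1EhCWga", "1DypjwO", 1490104311627L, "POTENTIAL_MATCH"),
    MatchRow("1DypjwO:1EhCWga", "1DypjwO", "1EhCWga", 1490104311627L, "POTENTIAL_MATCH")
  )

  def main(args: Array[String]): Unit =
    uniquePairs(sample).foreach(r => println(s"${r.sourceId} -> ${r.targetId}"))
}
```

On the Dataset itself, the same idea could probably be expressed as a filter that compares the sourceId and targetId columns, assuming both are string columns.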
    		
Note:
  • Each match represents a bidirectional edge that connects both entities. Two rows exist for each pair of matched entities, where sourceId and targetId are swapped.
  • matchRules contains an array of matchGroup URIs for potential matches. For other match types, this array is empty.
  • matchScore is calculated only for POTENTIAL_MATCH entries that have a non-empty list of matchRules. The calculation is based on the scoreIncremental/scoreStandalone values from the matchGroup configuration and resembles the formula described in Unmark Entities Defined as Matches.
  • timestamp represents the platform's event timestamp indicating when the match was created.

    Exception: if a POTENTIAL_MATCH exists for this mergeKey with timestamp t1, and a MANUAL_MATCH or NOT_MATCH was issued for this mergeKey with timestamp t2 > t1, the POTENTIAL_MATCH entry is removed. If the MANUAL_MATCH or NOT_MATCH is later reset, the POTENTIAL_MATCH entry is restored, but its timestamp now equals the timestamp of the reset event + 1 ms.
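
The timestamp exception above can be illustrated with a small state machine. This is only a sketch of the rule as described here, not the platform's implementation: the event types (PotentialMatchCreated, ManualDecision, DecisionReset) and the visibleTimestamp helper are hypothetical names, and the model assumes all events belong to a single key and that timestamps are in milliseconds.

```scala
// Hypothetical events for a single key; timestamps are in milliseconds.
sealed trait MatchEvent { def timestamp: Long }
final case class PotentialMatchCreated(timestamp: Long) extends MatchEvent
// Stands for either a MANUAL_MATCH or a NOT_MATCH decision.
final case class ManualDecision(timestamp: Long) extends MatchEvent
final case class DecisionReset(timestamp: Long) extends MatchEvent

// State of the POTENTIAL_MATCH entry for that key.
sealed trait State
case object Absent extends State
final case class Visible(timestamp: Long) extends State
case object Suppressed extends State

object PotentialMatchTimeline {
  // Replays the events in time order and returns the timestamp of the
  // POTENTIAL_MATCH entry if it is currently visible, or None otherwise.
  def visibleTimestamp(events: Seq[MatchEvent]): Option[Long] = {
    val finalState = events.sortBy(_.timestamp).foldLeft[State](Absent) {
      // A potential match appears with its own event timestamp (t1).
      case (Absent, PotentialMatchCreated(t)) => Visible(t)
      // A later manual decision (t2 > t1) removes the entry.
      case (Visible(t1), ManualDecision(t2)) if t2 > t1 => Suppressed
      // Resetting the decision restores the entry at reset time + 1 ms.
      case (Suppressed, DecisionReset(t)) => Visible(t + 1)
      // Any other combination leaves the state unchanged in this sketch.
      case (state, _) => state
    }
    finalState match {
      case Visible(t) => Some(t)
      case _          => None
    }
  }
}
```

For example, under these assumptions, a potential match created at 1000, followed by a manual decision at 2000 and a reset at 3000, would be reported with timestamp 3001.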