Data Access API

The Data Access API, part of the Reltio Data Science Spark SDK, lets you define the data scope for Scala-driven analytics.

This API provides a simple and intuitive way to construct Spark Datasets from underlying Reltio objects. Any Reltio object (entity, interaction, relation) can be accessed in Spark.

Usage

Using the Data Access API involves the following high-level steps:

  • Initialization
  • Define a Dataset based on entity, attribute, relationship, or interaction data from MDM
  • Get schema
  • Run queries
Note: If you update the existing tenant configuration, or read a different tenant's configuration from a Qubole notebook, the internal cache may not be updated during initialization. As a result, a newly added attribute may not be found. In that case, refresh the internal cache as follows:
  1. Call the com.reltio.analytics.utils.config.ContextConfig.releaseServices() method.
  2. Log in to the tenant again. The internal cache is then refreshed.
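
Example

The "Get schema" and "Run queries" steps listed under Usage can be illustrated with a minimal sketch that uses only the standard Apache Spark Dataset API. It does not call the Reltio SDK: the EntityRow case class, its field names, the sample records, and the DataAccessSketch helper names are all hypothetical stand-ins for a Dataset that the Data Access API would build from tenant data.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical flattened entity record. Real attribute names come from the
// tenant configuration, not from this sketch.
case class EntityRow(uri: String, entityType: String, firstName: String, country: String)

object DataAccessSketch {

  // Stand-in for the "define Dataset" step: builds a small in-memory Dataset
  // instead of reading entities through the Reltio SDK.
  def sampleEntities(spark: SparkSession): Dataset[EntityRow] = {
    import spark.implicits._
    Seq(
      EntityRow("entities/0001", "Individual", "Ana", "US"),
      EntityRow("entities/0002", "Individual", "Jonas", "DE"),
      EntityRow("entities/0003", "Organization", "Acme", "US")
    ).toDS()
  }

  // "Run queries" step: keep only Individual entities in the given country.
  def individualsIn(entities: Dataset[EntityRow], country: String): Dataset[EntityRow] =
    entities.filter(col("entityType") === "Individual" && col("country") === country)

  def main(args: Array[String]): Unit = {
    // In a notebook a session usually exists already and getOrCreate() reuses it.
    val spark = SparkSession.builder()
      .appName("data-access-sketch")
      .master("local[*]")
      .getOrCreate()

    val entities = sampleEntities(spark)

    // "Get schema" step: print the column names and types of the Dataset.
    entities.printSchema()

    // Query through the Dataset API.
    individualsIn(entities, "US").show()

    // The same Dataset can also be queried with Spark SQL through a temporary view.
    entities.createOrReplaceTempView("entities")
    spark.sql("SELECT country, COUNT(*) AS n FROM entities GROUP BY country").show()

    spark.stop()
  }
}
```

With a Dataset obtained from the SDK, only the sampleEntities stand-in would change; the schema inspection and query calls are ordinary Spark operations.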