Defining Datasets for Relations

Learn about defining datasets for relations.

Dataset for HasAddress relation

The following code creates a new dataframe for the configuration/relationTypes/HasAddress relation type:

val relations: DataFrame = framework.dataAccess
  .dataset(
    new RelationDatasetBuilder()
      .ofType("configuration/relationTypes/HasAddress")
      .select("Id")
      .select("attributes")
      .explode("attributes.AddressType", "AddressType")
      .asTable("relationsTable")
  ).build()

Dataset for HasAddress relation with start or end object and crosswalks

The following code creates a new dataframe for the HasAddress relation type that also includes the start object and crosswalks:

val relations: DataFrame = framework.dataAccess
  .dataset(
    new RelationDatasetBuilder()
      .ofType("configuration/relationTypes/HasAddress")
      .select("Id")
      .select("attributes")
      .select("Start")
      .crosswalks()
      .asTable("relationsTable")
  ).build()

Note: The URI of the end object is stored in the End field. To include it in the dataframe, select this field together with, or instead of, the Start field.

Viewing Data in the Dataset

Perform the following steps to view data in the dataset:

  1. To view all the data in the dataset using SQL, execute the following query:

    %sql select * from relationsTable

  2. Another way to view data in the dataset is to call the native method .show():

    relations.show()
    Note: By default, the .show() method displays only the first 20 rows and truncates cell values longer than 20 characters. To display full cell values, use .show(false).

  3. To get the number of objects in the dataframe, use either an SQL statement or the native count() method:

    relations.count()

  4. To show the dataframe schema, use the printSchema() method:

    relations.printSchema()
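
The viewing steps above can be tried without access to real relation data. The following is a minimal sketch, not the framework's own method: it assumes a local Spark session and uses a small hand-made dataframe in place of the relations dataframe. The names sample and sampleRelationsTable, and the row values, are hypothetical and exist only for illustration. It also shows the SQL form of the count from step 3.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Assumption: a local session for experimentation only.
// In a notebook, a session is usually already available.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("relations-view-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in rows; a real relations dataframe
// comes from the dataset builder shown earlier.
val sample: DataFrame = Seq(
  ("relations/r1", "entities/e1"),
  ("relations/r2", "entities/e2"),
  ("relations/r3", "entities/e1")
).toDF("Id", "Start")

// Register under a separate, hypothetical table name
// so that relationsTable itself is left untouched.
sample.createOrReplaceTempView("sampleRelationsTable")

// Step 2: print the rows without truncating long cell values.
sample.show(false)

// Step 3: count the rows with SQL and with the native method.
val sqlCount: Long =
  spark.sql("select count(*) from sampleRelationsTable").first().getLong(0)
val nativeCount: Long = sample.count()

// Step 4: print the column names and types.
sample.printSchema()
```

Both counts are computed over the same rows, so sqlCount and nativeCount should always match. In a notebook, the SQL form can also be written as a %sql paragraph against the registered table.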