Dataset Terminology

This section explains terms related to the data structures, including the definitions of dataset, component, and attribute.

Data structures

  • SingleDataset: A data type storing input data (i.e. all elements of all components) for a single scenario

  • BatchDataset: A data type storing update and/or output data for one or more scenarios

  • Dataset: Either a single or a batch dataset
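As a rough illustration of the difference between the two dataset kinds, the sketch below models them with plain Python dictionaries and lists. This layout is a simplified assumption for explanation only and is not the library's actual in-memory representation; the helper `is_batch` is hypothetical.

```python
# Simplified stand-ins (assumption): each element is a dict of attribute -> value.

# Single dataset: all elements of all components for one scenario.
single_dataset = {
    "node": [{"id": 1}, {"id": 2}],
    "line": [{"id": 3, "from_node": 1, "from_status": 1, "to_status": 1}],
}

# Batch dataset: per component, one list of elements for each scenario.
batch_dataset = {
    "line": [
        [{"id": 3, "from_status": 1, "to_status": 1}],  # scenario 0
        [{"id": 3, "from_status": 0, "to_status": 1}],  # scenario 1
    ],
}


def is_batch(dataset):
    """Hypothetical helper: True if every component holds per-scenario lists."""
    return all(
        len(component) > 0 and isinstance(component[0], list)
        for component in dataset.values()
    )


print(is_batch(single_dataset), is_batch(batch_dataset))
```

In this sketch a dataset is "either a single or a batch dataset" simply by the nesting depth of its component entries.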

Type of Dataset

The type of a dataset is one of input, update, sym_output, asym_output, sym_sc_output, or asym_sc_output. The examples in brackets are given in the context of a dataset of a line component.

  • input: Contains attributes relevant to the configuration of the grid (e.g. id, from_node, from_status, …).

  • update: Contains attributes relevant to multiple scenarios (e.g. from_status, to_status).

  • sym_output: Contains attributes relevant to the symmetrical steady-state output of a power flow or state estimation calculation (e.g. p_from, p_to, …).

  • asym_output: Contains attributes relevant to the asymmetrical steady-state output of a power flow or state estimation calculation (e.g. p_from, p_to, …). Attributes are similar to sym_output, except that some values of the asymmetrical dataset contain detailed data for each of the 3 phases individually.

  • sym_sc_output: Contains attributes relevant to the output of a symmetrical short circuit calculation (e.g. i_from, i_from_angle, …).

  • asym_sc_output: Contains attributes relevant to the output of an asymmetrical short circuit calculation (e.g. i_from, i_from_angle, …). Attributes are similar to sym_sc_output, except that some values of the asymmetrical dataset contain detailed data for each of the 3 phases individually.

Terms regarding data structures

  • Component: The definition of a part of a grid, e.g. node, source, line, etc. See the highlighted section of the graph in Component Hierarchy.

  • Element: A single instance of a node, source, line etc.

  • Attribute: The definition of id, energized, p, etc. of any component.

  • Value: The value under an attribute, e.g. the value of id, energized, p, etc.

  • Array: All elements of one specific component, for one or more scenarios, e.g. a node array or a line array. An array can be one-dimensional (containing all elements of a single component for one scenario), two-dimensional (containing all elements of a single component for multiple scenarios), or it can be a sparse array, which is essentially a dictionary with a data buffer and an index pointer.
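The three array layouts can be sketched as follows, using plain Python lists in place of the real array types. The dictionary keys `data` and `indptr`, the reading of the index pointer as "scenario i owns data[indptr[i]:indptr[i + 1]]", and the helper `elements_of_scenario` are assumptions made for illustration.

```python
# One-dimensional: all elements of one component for a single scenario.
line_array_1d = [{"id": 3}, {"id": 4}]

# Two-dimensional: the same number of elements for every scenario
# (outer index = scenario, inner index = element).
line_array_2d = [
    [{"id": 3, "from_status": 1}, {"id": 4, "from_status": 1}],  # scenario 0
    [{"id": 3, "from_status": 0}, {"id": 4, "from_status": 1}],  # scenario 1
]

# Sparse (assumed layout): a flat data buffer plus an index pointer.
sparse_line_array = {
    "data": [
        {"id": 3, "from_status": 0},  # belongs to scenario 0
        {"id": 3, "from_status": 1},  # belongs to scenario 2
        {"id": 4, "from_status": 0},  # belongs to scenario 2
    ],
    "indptr": [0, 1, 1, 3],  # one entry more than the number of scenarios
}


def elements_of_scenario(sparse, scenario):
    """Hypothetical helper: slice the data buffer for one scenario."""
    start = sparse["indptr"][scenario]
    stop = sparse["indptr"][scenario + 1]
    return sparse["data"][start:stop]


for scenario in range(len(sparse_line_array["indptr"]) - 1):
    print(scenario, elements_of_scenario(sparse_line_array, scenario))
```

A sparse layout of this kind is useful when scenarios touch different numbers of elements; here scenario 1 has no line elements at all.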

The Power Grid Model can process many scenarios (e.g. time steps, switch states, etc.) at once, which we call a batch. The batch size is the number of scenarios.

  • Scenario: A single time step / switch state topology.

  • Batch: The total set of scenarios. (There is only one batch.)

  • Batch size: The total number of scenarios in the batch.

  • n_scenarios: The total number of scenarios in the batch. (Same as Batch Size)

  • n_component_elements_per_scenario: The number of elements of a specific component for each scenario. This can be an integer (for dense batches), or a list of integers for sparse batches, where each integer in the list represents the number of elements of a specific component for the scenario corresponding to the index of the integer.

  • Sub-batch: Only used internally, in the C++ code, when all scenarios in a batch calculation are distributed over multiple threads, so that each thread can handle a sub-batch, to utilize the calculation power of multi-core processors.
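The batch terms above can be made concrete with a small sketch. It assumes a sparse batch whose index pointer marks where each scenario's elements start and end, and it uses a simple contiguous split to illustrate sub-batches; the actual distribution strategy in the C++ code may differ. All function names are hypothetical.

```python
def n_scenarios_from_indptr(indptr):
    """Number of scenarios in a sparse batch (assumed: len(indptr) - 1)."""
    return len(indptr) - 1


def n_elements_per_scenario_sparse(indptr):
    """List of element counts, one integer per scenario."""
    return [stop - start for start, stop in zip(indptr, indptr[1:])]


def split_into_sub_batches(n_scenarios, n_threads):
    """Illustrative contiguous split of scenario indices over threads."""
    base, extra = divmod(n_scenarios, n_threads)
    sub_batches = []
    start = 0
    for thread in range(n_threads):
        # The first `extra` threads take one additional scenario.
        size = base + (1 if thread < extra else 0)
        sub_batches.append(range(start, start + size))
        start += size
    return sub_batches


indptr = [0, 1, 1, 3]
print(n_scenarios_from_indptr(indptr))
print(n_elements_per_scenario_sparse(indptr))
print([list(r) for r in split_into_sub_batches(5, 2)])
```

For a dense batch the per-scenario element count is a single integer, since every scenario has the same number of elements.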

Attributes of Components

The following characteristics are given for each attribute:

  • name: Name of the attribute. It is exactly the same as the attribute name in power_grid_model.power_grid_meta_data.

  • data type: Data type of the attribute. It is either a type from the table in Native Data Interface or an enumeration as defined above. There are two special, independent data types: RealValueInput and RealValueOutput.

    RealValueInput is used for some input attributes. It is a double for a symmetric class (e.g. sym_load) and a double[3] for an asymmetric class (e.g. asym_load). It is explained in detail in the corresponding types.

    RealValueOutput is used for many output attributes. It is a double in a symmetric calculation and a double[3] in an asymmetric calculation.

  • unit: Unit of the attribute, if applicable. As a general rule, only standard SI units without any prefix are used.

  • description: Description of the attribute.

  • required: Whether the attribute is required. If not, it is optional. Note that if you choose not to specify an optional attribute, it should have the null value as defined in Basic Data Types.

  • update: Whether the attribute can be mutated by the update call PowerGridModel.update on an existing instance. Only applicable when this attribute is part of an input dataset.

  • valid values: If applicable, an indication of which values are valid for the input data.
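To illustrate the RealValueOutput data type described above, the sketch below checks that a value has the expected shape: one double for a symmetric calculation, three doubles (one per phase) for an asymmetric one. The alias `RealValue` and the function `has_valid_shape` are hypothetical and not part of the library; plain Python floats and lists stand in for double and double[3].

```python
from typing import Sequence, Union

# Hypothetical alias for illustration only, not a library type.
RealValue = Union[float, Sequence[float]]


def has_valid_shape(value: RealValue, symmetric: bool) -> bool:
    """True for a scalar (symmetric) or three per-phase values (asymmetric)."""
    if symmetric:
        return isinstance(value, float)
    return (
        isinstance(value, (list, tuple))
        and len(value) == 3
        and all(isinstance(v, float) for v in value)
    )


p_sym = 1.0e6                   # attribute p in a symmetric calculation: one value
p_asym = [0.3e6, 0.3e6, 0.4e6]  # attribute p in an asymmetric calculation: one value per phase

print(has_valid_shape(p_sym, symmetric=True))
print(has_valid_shape(p_asym, symmetric=False))
```

The same scalar-versus-three-values distinction applies to RealValueInput, where it depends on the component class (e.g. sym_load versus asym_load) rather than on the calculation type.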