# Native Data Interface

The calculation core of `power-grid-model` is written in C++. To interface with the Python side, a format is needed to exchange input and output data between Python and the natively compiled C++ code. This library uses a dictionary of [numpy structured arrays](https://numpy.org/doc/stable/user/basics.rec.html) as the data format. Each entry in the dictionary represents one type of component: the key is the component type and the value is a `numpy` structured array. Each element in the array represents a single physical component.

## Structured Array

Taking the component type `node` as an example, the input data of a `node` is defined in C++ as follows (this is a demonstration only, not the exact definition in the real code). The input data of a node contains two attributes: `id` and `u_rated`.

```c++
struct NodeInput {
    int32_t id;
    double u_rated;
};
```

One can create a `std::vector<NodeInput>` to hold the input dataset for multiple nodes. In the example below, a node input dataset is created with two nodes. The `id` values are 1 and 2, and the `u_rated` values are 150 kV and 10 kV.

```c++
std::vector<NodeInput> node_input{
    { 1, 150e3 },
    { 2, 10e3 }
};
```

On the Python side, we create a `numpy` structured array with exactly the same memory layout. We specify the same attributes with the same data types and memory offsets.

```python
import numpy as np

node_dtype = np.dtype(
    {
        'names': [AttributeType.id, AttributeType.u_rated],
        'formats': ['<i4', '<f8'],  # int32_t and double, little endian
        'offsets': [0, 8],  # u_rated starts at byte 8, after 4 bytes of padding
        'itemsize': 16,  # matches sizeof(NodeInput)
        'aligned': True
    }
)
```

The array below holds the same dataset as the `std::vector<NodeInput>` above.

```python
node = np.empty(shape=2, dtype=node_dtype)
node[AttributeType.id] = [1, 2]
node[AttributeType.u_rated] = [150e3, 10e3]
```

## Columnar data format

Additionally, we can represent the contents of the `NodeInput` struct from [Structured Array](#structured-array) for specific attributes only. This is especially useful when the component in question, e.g. a transformer, has many attributes with default values. In that case, the user can save significantly on memory usage. For the `u_rated` attribute, we can call this representation `NodeInputURated`, which is of type `double`.
(Note again that the actual representation in the C++ core might differ from `NodeInputURated`.) One can create a `std::vector<NodeInputURated>` to hold the input for multiple nodes. In a similar example, we create attribute data with the `u_rated` of two nodes, 150 kV and 10 kV.

```c++
using NodeInputURated = double;

std::vector<NodeInputURated> node_u_rated_input{ 150.0e3, 10.0e3 };
```

The same applies to `NodeInputId` and `std::vector<NodeInputId>`. To recreate this in Python using NumPy arrays, we create one array per attribute with the correct dtype, as described in [Structured Array](#structured-array).

```python
# Each columnar array is a plain (non-structured) array,
# so values are assigned by slice rather than by field name.
node_id = np.empty(shape=2, dtype=node_dtype[AttributeType.id])
node_id[:] = [1, 2]

node_u_rated = np.empty(shape=2, dtype=node_dtype[AttributeType.u_rated])
node_u_rated[:] = [150e3, 10e3]
```

## Creating Dataset

We then store the data in a dictionary. Together with other types of components, the dictionary is a valid input dataset for the constructor of `PowerGridModel`, see [Python API Reference](../api_reference/python-api-reference.md).

For the row-based data format:

```python
input_data = {ComponentType.node: node}
```

or for the columnar data format:

```python
input_data_columnar = {
    ComponentType.node: {AttributeType.id: node_id, AttributeType.u_rated: node_u_rated}
}
```

A dataset can also combine the row-based and columnar data formats.

In the `ctypes` wrapper, the pointers to all the array data are retrieved and passed to the C++ code. The same holds for the result dataset: its memory block is allocated using `numpy`, and the pointers are passed into the C++ code so that the C++ program can write the results into those memory blocks.

## Basic Data Types

The basic data types used in the interface between C++ and Python are shown in the table below.

| C++ type    | `numpy.dtype` | null value                                          | usage                                                                                   |
| ----------- | ------------- | --------------------------------------------------- | --------------------------------------------------------------------------------------- |
| `int32_t`   | `'i4'`        | -2^31                                               | ids of physical components                                                              |
| `int8_t`    | `'i1'`        | -2^7                                                | enumeration types, boolean types, and small integers (e.g. tap position of transformer) |
| `double`    | `'f8'`        | NaN ([IEEE 754](https://en.wikipedia.org/wiki/NaN)) | physical quantities (e.g. voltage)                                                      |
| `double[3]` | `'(3, )f8'`   | NaN ([IEEE 754](https://en.wikipedia.org/wiki/NaN)) | three-phase asymmetric physical quantities (e.g. voltage)                               |

*The [endianness](https://en.wikipedia.org/wiki/Endianness) of the C++ and Python sides is also matched. On the `x86-64` platform little endian is used, so that the `numpy.dtype` is `'
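
To illustrate the table of basic data types, the following minimal sketch builds a structured dtype with one attribute of each basic type and fills every attribute with its null value. The attribute names (`id`, `status`, `u_rated`, `u_measured`), the dtype `example_dtype`, and the helper `fill_with_null` are hypothetical and chosen only for this example; they are not part of the `power-grid-model` API. The little-endian `'<'` prefix assumes an `x86-64` platform.

```python
import numpy as np

# Hypothetical component with one attribute of each basic type.
example_dtype = np.dtype(
    [
        ("id", "<i4"),                # int32_t
        ("status", "<i1"),            # int8_t
        ("u_rated", "<f8"),           # double
        ("u_measured", "<f8", (3,)),  # double[3]
    ]
)


def fill_with_null(array: np.ndarray) -> np.ndarray:
    """Hypothetical helper: set every attribute to its null value from the table."""
    for name in array.dtype.names:
        # .base strips the (3,) sub-array shape, leaving the scalar type.
        base = array.dtype[name].base
        if np.issubdtype(base, np.integer):
            array[name] = np.iinfo(base).min  # -2^31 for int32, -2^7 for int8
        else:
            array[name] = np.nan  # NaN for double and double[3]
    return array


example = fill_with_null(np.empty(2, dtype=example_dtype))
```

Because NaN never compares equal to itself, null checks on floating-point attributes should use `np.isnan(...)` rather than `==`, while integer attributes can be compared directly against the minimum value.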