Make Test Dataset

When you encounter unexpected errors in power-grid-model, you will certainly want to report the issue and have the calculation core debugged (possibly by another developer) with a specific dataset. To make this possible, we have implemented a generic mechanism to export/import datasets to/from JSON files, and to debug the calculation core in both Python and C++ with the test dataset.

In this notebook we will learn how test datasets are made in this repository, including:

Structure of validation test datasets in this repository
Format of test datasets (JSON)
Use of helper functions to save and load the datasets

Structure of Validation Datasets

All validation test datasets are located in the tests/data folder. The structure of the folder is as follows:

data
   |
   |
   - power_flow
             |
             - power_flow_testset_1
             - power_flow_testset_2
             - ...
   - state_estimation
             |
             - state_estimation_testset_1
             - state_estimation_testset_2
             - ...

The test sets are split by calculation type: power_flow and state_estimation. Each folder contains one subfolder per test set. The test datasets are used in both the Python and C++ unit tests. Therefore, once you add a test dataset to the folder, you can debug the program in both Python and C++.

Individual Dataset

An individual dataset folder (in either power_flow or state_estimation) consists of the following files:

params.json: calculation parameters, mandatory
input.json: input data, mandatory
sym_output.json: reference symmetric output
asym_output.json: reference asymmetric output
update_batch.json: update batch data, mandatory if sym_output_batch.json or asym_output_batch.json exists.
sym_output_batch.json: reference symmetric batch output
asym_output_batch.json: reference asymmetric batch output

The params.json and input.json are always needed. The test program (in Python and C++) will detect other files and instantiate relevant test calculations. For example, if sym_output.json exists, the test program will run a one-time symmetric calculation and compare the results to the reference results in sym_output.json.
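
The file detection described above can be sketched as follows. This is a minimal illustration using only the standard library, not the actual test program: the function name detect_test_cases and the REFERENCE_FILES table are hypothetical, and only the file names come from the list above.

```python
import tempfile
from pathlib import Path

# Reference file name -> (symmetry, mode); file names follow the list above.
REFERENCE_FILES = {
    "sym_output.json": ("symmetric", "single"),
    "asym_output.json": ("asymmetric", "single"),
    "sym_output_batch.json": ("symmetric", "batch"),
    "asym_output_batch.json": ("asymmetric", "batch"),
}


def detect_test_cases(dataset_dir: Path) -> list[tuple[str, str]]:
    """Return the calculations implied by the files in one dataset folder (hypothetical helper)."""
    for mandatory in ("params.json", "input.json"):
        if not (dataset_dir / mandatory).is_file():
            raise FileNotFoundError(f"{mandatory} is mandatory but missing in {dataset_dir}")
    cases = [case for name, case in REFERENCE_FILES.items() if (dataset_dir / name).is_file()]
    needs_update = any(mode == "batch" for _, mode in cases)
    if needs_update and not (dataset_dir / "update_batch.json").is_file():
        raise FileNotFoundError("update_batch.json is mandatory when a batch output file exists")
    return cases


# Usage: a folder that only contains a symmetric single reference output.
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp)
    for name in ("params.json", "input.json", "sym_output.json"):
        (dataset / name).write_text("{}")
    detected = detect_test_cases(dataset)

print(detected)
```

In this sketch a folder with only sym_output.json yields a single symmetric one-time calculation, which mirrors the example in the paragraph above.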

Test Parameters

The params.json looks something like this:

{
  "calculation_method": "iterative_linear",
  "rtol": 1e-8,
  "atol": {
    "default": 1e-8,
    ".+_residual": 1e-4
  }
}

You need to specify the calculation method, and the relative and absolute tolerances used to compare the calculation results with the reference results. For rtol you always give one number. For atol you can give one number, or a dictionary with regular expressions that match attribute names. This gives you fine control over the tolerance of each attribute (e.g. active/reactive power). The example uses an absolute tolerance of 1e-4 for attributes that end with _residual and 1e-8 for everything else.
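
To make the atol lookup concrete, here is a minimal sketch of how a tolerance could be resolved for one attribute name. The helper resolve_atol is hypothetical, and two details are assumptions rather than documented behavior: patterns must match the whole attribute name (re.fullmatch), and the "default" key is the fallback.

```python
import re


def resolve_atol(atol, attribute: str) -> float:
    """Pick the absolute tolerance for one attribute name (hypothetical helper)."""
    if not isinstance(atol, dict):
        return float(atol)  # a single number applies to every attribute
    for pattern, value in atol.items():
        # Assumption: the pattern has to match the full attribute name.
        if pattern != "default" and re.fullmatch(pattern, attribute):
            return float(value)
    return float(atol["default"])  # assumption: "default" is the fallback entry


atol = {"default": 1e-8, ".+_residual": 1e-4}
print(resolve_atol(atol, "p_residual"))  # matches ".+_residual"
print(resolve_atol(atol, "u"))           # falls back to "default"
print(resolve_atol(1e-6, "u"))           # plain number
```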

The calculation_method can be one string or a list of strings. In the latter case, the test program will run the validation test multiple times, once for each specified method.
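
As a small sketch of the string-or-list handling, a test runner could normalize the parameter before looping over the methods. The helper as_method_list is hypothetical. The params dictionaries below are illustrative, with newton_raphson used as a second method name alongside the one from the example.

```python
def as_method_list(calculation_method) -> list[str]:
    """Normalize a string or a list of strings to a list (hypothetical helper)."""
    if isinstance(calculation_method, str):
        return [calculation_method]
    return list(calculation_method)


single = as_method_list("iterative_linear")
several = as_method_list(["iterative_linear", "newton_raphson"])

for method in several:
    # A real test program would run one validation per method here.
    print("would validate with", method)
```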

JSON Data Format

The JSON data format is object based. You give a dictionary of all component types. Each entry is a list of components, and each component is a dictionary of named attributes and values.
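
The three nesting levels can be walked with the standard json module alone. The snippet below is only an illustration of the structure, using a shortened, made-up dataset. It does not use the power-grid-model helper functions.

```python
import json

text = """
{
  "node": [{"id": 1, "u_rated": 10.0}, {"id": 2, "u_rated": 10.0}],
  "line": [{"id": 4, "from_node": 1, "to_node": 2}]
}
"""

dataset = json.loads(text)

ids_per_type = {}
# Level 1: dictionary of component type -> list of components.
for component_type, components in dataset.items():
    # Level 2: list of components. Level 3: each component is a dictionary of attributes.
    ids_per_type[component_type] = [component["id"] for component in components]

print(ids_per_type)
```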

Input

See below for an example of input.json. Note that you can omit optional attributes you don't need. In the example, r0 and x0 are not specified for the line.

{
  "node": [
    {
      "id": 1,
      "u_rated": 10.0
    },
    {
      "id": 2,
      "u_rated": 10.0
    }
  ],
  "line": [
    {
      "id": 4,
      "from_node": 1,
      "to_node": 2,
      "from_status": 1,
      "to_status": 1,
      "r1": 0.0,
      "x1": 1.0,
      "c1": 0.0,
      "tan1": 0.0,
      "i_n": 1e3
    }
  ],
  "source": [
    {
      "id": 6,
      "node": 1,
      "status": 1,
      "u_ref": 1.0,
      "sk": 100.0,
      "rx_ratio": 0.0
    }
  ],
  "sym_load": [
    {
      "id": 7,
      "node": 2,
      "status": 1,
      "type": 0,
      "p_specified": 0.0,
      "q_specified": 0.0
    },
    {
      "id": 8,
      "node": 2,
      "status": 1,
      "type": 1,
      "p_specified": 0.0,
      "q_specified": 0.0
    }
  ]
}

Output

An example of sym_output.json is shown below. The test program only compares the results that exist in the output file. For the example below, the program only compares the attributes id and u of the component node. Other results are not compared (e.g. u_pu for node or all attributes of line).

{
  "node": [
    {
      "id": 1,
      "u": 10.0
    },
    {
      "id": 2,
      "u": 10.0
    }
  ]
}

It is important that you specify all the components of the same type in the output file (you can skip a type completely), in the same order as in the input file! The set of attributes must also be the same for all the components of a type. The program will fail if you specify the output as in any of the following JSON files.

Not all components are present for a type

{
  "node": [
    {
      "id": 1,
      "u": 10.0
    }
  ]
}

Order of the components is not the same as in the input file

{
  "node": [
    {
      "id": 2,
      "u": 10.0
    },
    {
      "id": 1,
      "u": 10.0
    }
  ]
}

Attributes are not the same.

{
  "node": [
    {
      "id": 1,
      "u_pu": 1.0
    },
    {
      "id": 2,
      "u": 10.0
    }
  ]
}
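
The three failure cases above can be caught before running the test program. The following is a minimal sketch of such a pre-check on already parsed JSON data. The function check_output_against_input is hypothetical and is not part of power-grid-model.

```python
def check_output_against_input(input_data: dict, output_data: dict) -> list[str]:
    """Return a list of problems found in a reference output (hypothetical helper)."""
    problems = []
    for component_type, outputs in output_data.items():
        input_ids = [c["id"] for c in input_data.get(component_type, [])]
        output_ids = [c["id"] for c in outputs]
        if sorted(output_ids) != sorted(input_ids):
            problems.append(f"{component_type}: not all components are present")
        elif output_ids != input_ids:
            problems.append(f"{component_type}: order differs from the input file")
        # frozenset(component) collects the attribute names of one component.
        if len({frozenset(c) for c in outputs}) > 1:
            problems.append(f"{component_type}: attributes are not the same")
    return problems


input_data = {"node": [{"id": 1, "u_rated": 10.0}, {"id": 2, "u_rated": 10.0}]}
good = {"node": [{"id": 1, "u": 10.0}, {"id": 2, "u": 10.0}]}
missing = {"node": [{"id": 1, "u": 10.0}]}
reordered = {"node": [{"id": 2, "u": 10.0}, {"id": 1, "u": 10.0}]}
mixed = {"node": [{"id": 1, "u_pu": 1.0}, {"id": 2, "u": 10.0}]}

for name, output in [("good", good), ("missing", missing), ("reordered", reordered), ("mixed", mixed)]:
    print(name, check_output_against_input(input_data, output))
```

An output file that skips a component type entirely, including the empty dictionary, passes this check, in line with the rule that a type may be skipped completely.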

Batch Dataset

You can also execute batch calculations in the test program. To do so, specify a series of mutations as a list in update_batch.json. See the example below. Each mutation is always applied to the original input dataset. You don’t need to specify the same components in each batch.

[
  {
    "sym_load": [
      {
        "id": 7,
        "q_specified": 6.9444444444444455
      }
    ]
  },
  {
    "sym_load": [
      {
        "id": 8,
        "q_specified": 6.666666666666667
      }
    ]
  },
  {
    "sym_load": [
      {
        "id": 7,
        "q_specified": 2.132231404958678
      },
      {
        "id": 8,
        "q_specified": 2.4199999999999995
      }
    ]
  }
]

With the update dataset, you can then provide a batch reference result file sym_output_batch.json:

[
  {
    "node": [
      {
        "id": 1,
        "u": 9.166666666666666
      },
      {
        "id": 2,
        "u": 8.333333333333334
      }
    ]
  },
  {
    "node": [
      {
        "id": 1,
        "u": 9.411764705882353
      },
      {
        "id": 2,
        "u": 8.823529411764707
      }
    ]
  },
  {
    "node": [
      {
        "id": 1,
        "u": 9.545454545454545
      },
      {
        "id": 2,
        "u": 9.090909090909092
      }
    ]
  }
]

The length of the result list should be the same as the length of the update batch list.
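
The rule that each mutation is applied to the original input dataset, and not to the result of the previous mutation, can be illustrated on plain dictionaries. This is a conceptual sketch only: apply_mutation is a hypothetical helper, the values are made up, and the calculation core does not necessarily work this way internally.

```python
import copy


def apply_mutation(input_data: dict, mutation: dict) -> dict:
    """Return a mutated copy of input_data; the original is left untouched."""
    scenario = copy.deepcopy(input_data)
    for component_type, updates in mutation.items():
        by_id = {component["id"]: component for component in scenario[component_type]}
        for update in updates:
            by_id[update["id"]].update(update)
    return scenario


input_data = {"sym_load": [{"id": 7, "q_specified": 0.0}, {"id": 8, "q_specified": 0.0}]}
update_batch = [
    {"sym_load": [{"id": 7, "q_specified": 1.0}]},  # illustrative values
    {"sym_load": [{"id": 8, "q_specified": 2.0}]},
]

# One scenario per mutation, each starting from the original input.
scenarios = [apply_mutation(input_data, mutation) for mutation in update_batch]

# In the second scenario, load 7 is back at its original value.
print(scenarios[1]["sym_load"])
```

Because there is one scenario per mutation, a matching reference result list has exactly len(update_batch) entries.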

Empty Result File

If you encounter a crash for a certain dataset, you can also save the input data to JSON files. In this case you might not have any reference results to compare against, because you just need to find where the crash happens. You still need an empty (dictionary) result file to trigger the calculation.

For sym_output.json it is just an empty dictionary:

{}

For sym_output_batch.json it is a list of empty dictionaries:

[{}, {}, {}]
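
Such placeholder files can be written with the standard json module. The sketch below writes them to a temporary folder and assumes a batch of three scenarios. In practice, n_scenarios has to equal the number of entries in update_batch.json.

```python
import json
import tempfile
from pathlib import Path

n_scenarios = 3  # assumption: update_batch.json contains three mutations

with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp)
    # Empty dictionary: triggers a single calculation without comparing anything.
    (dataset / "sym_output.json").write_text(json.dumps({}))
    # One empty dictionary per scenario: triggers the batch calculation.
    (dataset / "sym_output_batch.json").write_text(json.dumps([{} for _ in range(n_scenarios)]))
    single_text = (dataset / "sym_output.json").read_text()
    batch_text = (dataset / "sym_output_batch.json").read_text()

print(single_text)
print(batch_text)
```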

Helper Functions to Import and Export

In the module power_grid_model.utils we have some helper functions to import a json file into a power-grid-model compatible dataset, or the other way around.

Please refer to the documentation for the detailed function signatures.

In this notebook we export the example network from Power Flow to json.

# first build the network

import numpy as np
import pandas as pd

from power_grid_model import LoadGenType
from power_grid_model import PowerGridModel, CalculationMethod
from power_grid_model import initialize_array

# network

# node
node = initialize_array("input", "node", 3)
node["id"] = [1, 2, 6]
node["u_rated"] = [10.5e3, 10.5e3, 10.5e3]

# line
line = initialize_array("input", "line", 3)
line["id"] = [3, 5, 8]
line["from_node"] = [1, 2, 1]
line["to_node"] = [2, 6, 6]
line["from_status"] = [1, 1, 1]
line["to_status"] = [1, 1, 1]
line["r1"] = [0.25, 0.25, 0.25]
line["x1"] = [0.2, 0.2, 0.2]
line["c1"] = [10e-6, 10e-6, 10e-6]
line["tan1"] = [0.0, 0.0, 0.0]
line["i_n"] = [1000, 1000, 1000]

# load
sym_load = initialize_array("input", "sym_load", 2)
sym_load["id"] = [4, 7]
sym_load["node"] = [2, 6]
sym_load["status"] = [1, 1]
sym_load["type"] = [LoadGenType.const_power, LoadGenType.const_power]
sym_load["p_specified"] = [20e6, 10e6]
sym_load["q_specified"] = [5e6, 2e6]

# source
source = initialize_array("input", "source", 1)
source["id"] = [10]
source["node"] = [1]
source["status"] = [1]
source["u_ref"] = [1.0]

# all
input_data = {"node": node, "line": line, "sym_load": sym_load, "source": source}

Export to JSON

We can use the function export_json_data to export the input data to a json file.

from power_grid_model.utils import export_json_data
import tempfile
from pathlib import Path

temp_path = Path(tempfile.gettempdir())
export_json_data(temp_path / "input.json", input_data)

# we can display the json file

with open(temp_path / "input.json", "r") as f:
    print(f.read())

{
  "node": [
    {
      "id": 1,
      "u_rated": 10500.0
    },
    {
      "id": 2,
      "u_rated": 10500.0
    },
    {
      "id": 6,
      "u_rated": 10500.0
    }
  ],
  "line": [
    {
      "id": 3,
      "from_node": 1,
      "to_node": 2,
      "from_status": 1,
      "to_status": 1,
      "r1": 0.25,
      "x1": 0.2,
      "c1": 1e-05,
      "tan1": 0.0,
      "i_n": 1000.0
    },
    {
      "id": 5,
      "from_node": 2,
      "to_node": 6,
      "from_status": 1,
      "to_status": 1,
      "r1": 0.25,
      "x1": 0.2,
      "c1": 1e-05,
      "tan1": 0.0,
      "i_n": 1000.0
    },
    {
      "id": 8,
      "from_node": 1,
      "to_node": 6,
      "from_status": 1,
      "to_status": 1,
      "r1": 0.25,
      "x1": 0.2,
      "c1": 1e-05,
      "tan1": 0.0,
      "i_n": 1000.0
    }
  ],
  "sym_load": [
    {
      "id": 4,
      "node": 2,
      "status": 1,
      "type": 0,
      "p_specified": 20000000.0,
      "q_specified": 5000000.0
    },
    {
      "id": 7,
      "node": 6,
      "status": 1,
      "type": 0,
      "p_specified": 10000000.0,
      "q_specified": 2000000.0
    }
  ],
  "source": [
    {
      "id": 10,
      "node": 1,
      "status": 1,
      "u_ref": 1.0
    }
  ]
}

Import JSON

We can use the function import_json_data to import the input data from a json file. To import json you need to specify the type of the data: input, update, sym_output, or asym_output.

# round trip and run power flow

from power_grid_model.utils import import_json_data

imported_data = import_json_data(temp_path / "input.json", "input")

pgm = PowerGridModel(imported_data)
result = pgm.calculate_power_flow()

import pandas as pd

print(pd.DataFrame(result["node"]))

   id  energized      u_pu             u   u_angle
 1          1  0.998988  10489.375043 -0.003039
 2          1  0.952126   9997.325181 -0.026031
 6          1  0.962096  10102.012975 -0.021895

Import and Export Batch Update/Result Dataset

You can use the same function to import and export batch update/result dataset for batch calculation.

# create batch set

load_profile = initialize_array("update", "sym_load", (3, 2))
load_profile["id"] = [[4, 7]]
# this is a scale of load from 0% to 100%
load_profile["p_specified"] = [[30e6, 15e6]] * np.linspace(0, 1, 3).reshape(-1, 1)


time_series_mutation = {"sym_load": load_profile}

# export and print

export_json_data(temp_path / "update_batch.json", time_series_mutation)

with open(temp_path / "update_batch.json", "r") as f:
    print(f.read())

[
  {
    "sym_load": [
      {
        "id": 4,
        "p_specified": 0.0
      },
      {
        "id": 7,
        "p_specified": 0.0
      }
    ]
  },
  {
    "sym_load": [
      {
        "id": 4,
        "p_specified": 15000000.0
      },
      {
        "id": 7,
        "p_specified": 7500000.0
      }
    ]
  },
  {
    "sym_load": [
      {
        "id": 4,
        "p_specified": 30000000.0
      },
      {
        "id": 7,
        "p_specified": 15000000.0
      }
    ]
  }
]
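
The exported values can be checked by hand. The scaling in load_profile multiplies the base load of each sym_load by the factors 0, 0.5 and 1. The pure-Python sketch below reproduces that arithmetic without numpy. The names base_load and factors are introduced here for illustration only.

```python
base_load = [30e6, 15e6]                # p_specified at 100% for loads 4 and 7
factors = [i / 2 for i in range(3)]     # same values as np.linspace(0, 1, 3)

# One row per scenario, one column per load (the same shape as load_profile).
profile = [[factor * p for p in base_load] for factor in factors]

for row in profile:
    print(row)
```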

# import round trip, calculate

imported_batch_update = import_json_data(temp_path / "update_batch.json", "update")

batch_result = pgm.calculate_power_flow(update_data=imported_batch_update)

print(batch_result["sym_load"]["p"])

[[       0.        0.]
 [15000000.  7500000.]
 [30000000. 15000000.]]