Utilities

The physlearn.datasets.google.utils._dataset_helper_functions module provides basic utilities for wrangling, serializing, and deserializing superconducting quantum computing calibration data.

physlearn.datasets.google.utils._helper_functions._json_dump(train_test_data, folder, n_qubits=None)

Serializes the training and test data dictionary as a JSON formatted stream.

Parameters
  • train_test_data (dict) – A dictionary with keys: ‘X_train’, ‘X_test’, ‘y_train’, and ‘y_test’.

  • folder (str) – Directory in which the training and test data is dumped.

  • n_qubits (int or None, optional (default=None)) – Number of qubits. If specified, this value is used in the file name.
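
A minimal sketch of this kind of serialization, using only the standard library. The helper name dump_train_test and the file naming scheme are hypothetical, chosen for illustration, and are not taken from physlearn. The sketch also assumes the dictionary values are already JSON-compatible (for example, lists rather than pandas objects).

```python
import json
import os
import tempfile


def dump_train_test(train_test_data, folder, n_qubits=None):
    """Write the train/test dictionary to a JSON file and return its path.

    Hypothetical stand-in for illustration; the file naming is an assumption.
    """
    # Assumed naming scheme: include the qubit count only when it is given.
    if n_qubits is None:
        name = "train_test_data.json"
    else:
        name = f"train_test_data_{n_qubits}q.json"
    path = os.path.join(folder, name)
    with open(path, "w") as f:
        json.dump(train_test_data, f)
    return path


data = {
    "X_train": [[0.1, 0.2], [0.3, 0.4]],
    "X_test": [[0.5, 0.6]],
    "y_train": [1.0, 2.0],
    "y_test": [3.0],
}
folder = tempfile.mkdtemp()
path = dump_train_test(data, folder, n_qubits=3)
```

pandas objects are not JSON-serializable by json.dump, so a caller holding DataFrames would first need to convert them, for example with DataFrame.to_dict or DataFrame.values.tolist.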

physlearn.datasets.google.utils._helper_functions._json_load(filename)

Deserializes the training and test data dictionary.

The training and test data dictionary was serialized as a JSON formatted stream.

Parameters

filename (str) – Name of the file in which the training and test data dictionary has been dumped.

Returns

train_test_data

Return type

dict
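
A minimal sketch of the matching deserialization step, again with only the standard library. The helper name load_train_test and the example file are hypothetical; the sketch writes a small file first so that the round trip can be shown end to end.

```python
import json
import os
import tempfile


def load_train_test(filename):
    """Read a train/test dictionary back from a JSON file (hypothetical helper)."""
    with open(filename) as f:
        return json.load(f)


# Write a small example file first so there is something to load.
original = {
    "X_train": [[0.1], [0.2]],
    "X_test": [[0.3]],
    "y_train": [1.0, 2.0],
    "y_test": [3.0],
}
filename = os.path.join(tempfile.mkdtemp(), "example.json")
with open(filename, "w") as f:
    json.dump(original, f)

train_test_data = load_train_test(filename)
```

The values come back as plain Python lists, so a caller that needs pandas objects has to rebuild them from the loaded dictionary.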

physlearn.datasets.google.utils._helper_functions._train_test_split(X, y, test_size, random_state)

Splits the X and y data into training and test data.

The split is determined by test_size, the fraction of the data assigned to the test set.

Parameters
  • X (DataFrame or Series) – The design matrix, where each row corresponds to an example and the column(s) correspond to the feature(s).

  • y (DataFrame or Series) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).

  • test_size (float) – The fraction of the data to use as test data, expressed as a decimal.

  • random_state (int, RandomState instance or None) – Determines random number generation in sklearn.model_selection.train_test_split.

Returns

train_test_data

Return type

dict

Notes

As shuffling is handled by sklearn.utils.shuffle, there is no shuffling parameter.
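
A minimal sketch of a split that returns a dictionary with the four keys described above. The helper name split_to_dict and the toy data are hypothetical, and the sketch calls sklearn.model_selection.train_test_split with its default settings, which is an assumption about how the split is configured.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_to_dict(X, y, test_size, random_state=None):
    """Split X and y and collect the pieces in a dictionary (hypothetical helper)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    return {
        "X_train": X_train,
        "X_test": X_test,
        "y_train": y_train,
        "y_test": y_test,
    }


# Toy data: ten examples with one feature and one target.
X = pd.DataFrame({"feature": range(10)})
y = pd.Series(range(10), name="target")
train_test_data = split_to_dict(X, y, test_size=0.2, random_state=0)
```

With ten examples and test_size=0.2, two rows go to the test set and eight to the training set, and the rows of X and y stay aligned because both are split with the same permutation.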

physlearn.datasets.google.utils._helper_functions._shuffle(data, drop=True)

Shuffles the pandas data object.

Parameters
  • data (DataFrame or Series) – The pandas data that is to be shuffled.

  • drop (bool, optional (default=True)) – Resets the index of the pandas data object.

Returns

pandas

Return type

DataFrame or Series
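
A minimal sketch of shuffling a pandas object and resetting its index. It assumes that the drop flag is forwarded to pandas' reset_index, which is one plausible reading of the parameter and is not confirmed by the description above. The column name and values are illustrative only.

```python
import pandas as pd
from sklearn.utils import shuffle

data = pd.DataFrame(
    {"frequency": [5.1, 5.2, 5.3, 5.4]},
    index=["a", "b", "c", "d"],
)

# Shuffle the rows, then replace the permuted index with 0..n-1.
# Passing drop=True to reset_index discards the old index instead of
# keeping it as an extra column.
shuffled = shuffle(data, random_state=0).reset_index(drop=True)
```

With drop=False, reset_index would instead keep the old labels in a new column named index.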

physlearn.datasets.google.utils._helper_functions._iqr_outlier_mask(data)

Computes the interquartile range, then masks the outliers.

Parameters

data (DataFrame or Series) – The pandas data that is to be masked.

Returns

pandas

Return type

DataFrame or Series
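
A minimal sketch of an interquartile-range mask. It assumes the common convention that a value is an outlier when it lies more than 1.5 times the IQR below the first quartile or above the third, and that masked values are replaced with NaN. Neither the multiplier nor the masking behaviour is stated above, and the helper name iqr_mask is hypothetical.

```python
import pandas as pd


def iqr_mask(data, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with NaN (hypothetical helper)."""
    q1 = data.quantile(0.25)
    q3 = data.quantile(0.75)
    iqr = q3 - q1
    inside = (data >= q1 - k * iqr) & (data <= q3 + k * iqr)
    # where() keeps values where the condition holds and inserts NaN elsewhere.
    return data.where(inside)


values = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])
masked = iqr_mask(values)
```

For this series the quartiles are 2.0 and 4.0, so the accepted range is [-1.0, 7.0] and only the value 100.0 is masked.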

physlearn.datasets.google.utils._helper_functions._path_to_google_data()

Finds the path to the Google quantum computer calibration data.

Returns

path

Return type

str

physlearn.datasets.google.utils._helper_functions._path_to_google_json_folder()

Finds the path to the folder with the serialized Google data.

Returns

path

Return type

str