Estimator Checks

The physlearn.supervised.utils._estimator_checks module provides basic utilities for automated estimator checking.

physlearn.supervised.utils._estimator_checks._basic_autocorrect(init_choice, candidate_choices)

Chooses the candidate string that minimizes the edit distance.

Parameters
  • init_choice (str) – Specify the initial choice as a string, e.g., the Scikit-Learn class Ridge as ‘ridge’, ‘Ridge’, ‘RIDGE’, etc.

  • candidate_choices (list) – A list of candidate choices, where each candidate is a string.

Returns

out_choice

Return type

str

Notes

The edit distance between the initial choice and each candidate choice is the Levenshtein distance: the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other.
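
As an illustration of the note above, the following is a minimal sketch of edit-distance autocorrection. It is not the physlearn implementation: the helper names levenshtein and closest_choice are hypothetical, and lowercasing both strings before comparison is an assumption made so that 'RIDGE' and 'Ridge' resolve to 'ridge'.

```python
def levenshtein(a, b):
    """Return the Levenshtein distance between the strings a and b."""
    # previous[j] is the distance between the prefix of a processed so far
    # (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (free on a match)
            ))
        previous = current
    return previous[-1]


def closest_choice(init_choice, candidate_choices):
    """Return the candidate with the smallest edit distance to init_choice."""
    # Assumption: the comparison is case-insensitive.
    return min(
        candidate_choices,
        key=lambda candidate: levenshtein(init_choice.lower(), candidate.lower()),
    )
```

With these helpers, closest_choice('rigde', ['ridge', 'lasso']) selects 'ridge', since the misspelling is two substitutions away from 'ridge' and five edits away from 'lasso'. When two candidates tie, min returns the one listed first.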

physlearn.supervised.utils._estimator_checks._check_estimator_choice(estimator_choice, estimator_type, estimator_choices=None)

Chooses the candidate estimator that minimizes the edit distance.

Parameters
  • estimator_choice (str) – Specify the estimator choice as a string, e.g., the Scikit-Learn class Ridge as ‘ridge’, ‘Ridge’, ‘RIDGE’, etc.

  • estimator_type (str) – Specify the supervised learning task, e.g., regression.

  • estimator_choices (list or None, optional (default=None)) – A list of estimator choices, where each estimator is a string.

Returns

estimator_choice

Return type

str

physlearn.supervised.utils._estimator_checks._check_stacking_layer(stacking_layer, estimator_type)

Chooses the first and second stacking layer estimators.

Parameters
  • stacking_layer (dict) – Specify the estimator(s) in the first stacking layer, and the final estimator in the second stacking layer.

  • estimator_type (str) – Specify the supervised learning task, e.g., regression.

Returns

stacking_layer

Return type

dict

physlearn.supervised.utils._estimator_checks._check_line_search_options(line_search_options)

Checks the line search computation options for base boosting.

Parameters
  • init_guess (int, float, or ndarray) – The initial guess for the expansion coefficient.

  • opt_method (str) – Choice of optimization method: 'minimize' selects scipy.optimize.minimize, and 'basinhopping' selects scipy.optimize.basinhopping.

  • method (str or None) – The type of solver utilized in the optimization method.

  • tol (float or None) – The epsilon tolerance for terminating the optimization method.

  • options (dict or None) – A dictionary of solver options.

  • niter (int or None) – The number of iterations in basin-hopping.

  • T (float or None) – The temperature parameter utilized in basin-hopping, which determines the accept or reject criterion.

  • loss (str) – The loss function utilized in the line search computation, where ‘ls’ denotes the squared error loss function, ‘lad’ denotes the absolute error loss function, ‘huber’ denotes the Huber loss function, and ‘quantile’ denotes the quantile loss function.

  • regularization (int or float) – The regularization strength in the line search computation.
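
The option types listed above can be checked mechanically. The following is a minimal sketch of such a check, not the physlearn implementation: the names OPTION_TYPES, ALLOWED_VALUES, and check_options are hypothetical, the options are assumed to arrive as the keys of a single dictionary, and the ndarray case of init_guess is left out so that the sketch needs no NumPy.

```python
NoneType = type(None)

# Accepted types per option, transcribed from the parameter list above.
# Assumption: init_guess is a scalar here (the ndarray case is omitted).
OPTION_TYPES = {
    "init_guess": (int, float),
    "opt_method": (str,),
    "method": (str, NoneType),
    "tol": (float, NoneType),
    "options": (dict, NoneType),
    "niter": (int, NoneType),
    "T": (float, NoneType),
    "loss": (str,),
    "regularization": (int, float),
}

# String options that are restricted to the values named above.
ALLOWED_VALUES = {
    "opt_method": {"minimize", "basinhopping"},
    "loss": {"ls", "lad", "huber", "quantile"},
}


def check_options(line_search_options):
    """Validate option names, types, and string values; return the dict."""
    for name, value in line_search_options.items():
        if name not in OPTION_TYPES:
            raise KeyError(f"Unknown line search option: {name!r}")
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, OPTION_TYPES[name]):
            raise TypeError(f"Option {name!r} has unsupported type "
                            f"{type(value).__name__}")
        if name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
            raise ValueError(f"Option {name!r} must be one of "
                             f"{sorted(ALLOWED_VALUES[name])}")
    return line_search_options
```

In this sketch an unknown key raises KeyError, a wrongly typed value raises TypeError, and an unrecognized opt_method or loss string raises ValueError, so the three failure modes stay distinguishable for the caller.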

physlearn.supervised.utils._estimator_checks._check_bayesoptcv_param_type(pbounds)

Checks if the Bayesian optimization utility changed the (hyper)parameter type.

Parameters

pbounds (dict) – A dictionary, wherein the keys are the (hyper)parameter names and the values are the (hyper)parameter values.

Returns

pbounds

Return type

dict

Notes

During the sequential Bayesian optimization, the utility occasionally sets the value of a (hyper)parameter with type int to a value with type float.
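
The type drift described in this note can be undone by casting the affected values back to int. The following is a minimal sketch under stated assumptions, not the physlearn implementation: the helper restore_int_params is hypothetical, the caller is assumed to know which (hyper)parameters must be integers, and rounding to the nearest integer (rather than truncating) is a design choice of the sketch.

```python
def restore_int_params(params, int_param_names):
    """Return a copy of params with the named float values cast back to int."""
    restored = dict(params)  # leave the caller's dictionary untouched
    for name in int_param_names:
        value = restored.get(name)
        if isinstance(value, float):
            # Assumption: round to the nearest integer instead of truncating.
            restored[name] = int(round(value))
    return restored


# Example: the optimizer proposed floats for two integer-valued settings.
proposed = {"n_estimators": 100.0, "max_depth": 3.7, "alpha": 0.5}
fixed = restore_int_params(proposed, {"n_estimators", "max_depth"})
```

Here alpha is left as a float because it is not named in int_param_names, while n_estimators and max_depth become Python int values.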

physlearn.supervised.utils._estimator_checks._preprocess_hyperparams(raw_params, multi_target, chain)

Preprocesses the (hyper)parameters.

The preprocessing depends on the regression task and, for multi-target regression, on whether the single-target subtasks are treated as independent or chained.

Parameters
  • raw_params (dict) – The user provided (hyper)parameters.

  • multi_target (bool) – Distinguishes between single-target and multi-target regression. If True, then the expected task is multi-target regression.

  • chain (bool) – Distinguishes between independent single-target regression subtasks and chaining. If True, then the expected multi-target combination is chaining.

Returns

out_params

Return type

dict
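
One plausible form of this preprocessing is renaming the keys so that they address an estimator nested inside a multi-target wrapper, since Scikit-Learn addresses nested parameters as <component>__<parameter>. The following sketch only illustrates that idea and is not taken from physlearn: the helper prefix_params and both prefix strings are hypothetical, and the actual preprocessing may differ.

```python
def prefix_params(raw_params, multi_target, chain):
    """Return a copy of raw_params with keys prefixed for a nested estimator."""
    if not multi_target:
        prefix = ""                  # single-target: keys are used as given
    elif chain:
        prefix = "base_estimator__"  # hypothetical prefix for chaining
    else:
        prefix = "estimator__"       # hypothetical prefix for independent subtasks
    return {prefix + name: value for name, value in raw_params.items()}


# Example: the same user input under the three task configurations.
raw = {"alpha": 0.1}
single = prefix_params(raw, multi_target=False, chain=False)
independent = prefix_params(raw, multi_target=True, chain=False)
chained = prefix_params(raw, multi_target=True, chain=True)
```

Note that chain has no effect in this sketch when multi_target is False, which mirrors the description of chain as a choice between two multi-target combinations.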

physlearn.supervised.utils._estimator_checks._check_search_method(search_method)

Chooses the (hyper)parameter search method that minimizes the edit distance.

Parameters

search_method (str) – Specifies the Scikit-Learn or Bayesian optimization (hyper)parameter search method.

Returns

search_method

Return type

str