The physlearn.loss module computes the average loss or the negative
gradient in either the single-target or the multi-target regression
setting, where the data may be any mix of NumPy and pandas
representations. It includes the physlearn.LeastSquaresError,
physlearn.LeastAbsoluteError, physlearn.HuberLossFunction, and
physlearn.QuantileLossFunction classes, and the helper function
physlearn.loss._difference().
Subtracts the raw predictions from the target(s).
The function accepts any mix of NumPy and pandas data representations.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
diff – The difference between the single-target(s) and the raw predictions.
DataFrame, Series, or ndarray
Examples
>>> import pandas as pd
>>> from sklearn.datasets import load_linnerud
>>> from physlearn.loss import _difference
>>> X, y = load_linnerud(return_X_y=True)
>>> _difference(y=pd.DataFrame(y), raw_predictions=X).iloc[:2]
       0      1     2
0  186.0 -126.0 -10.0
1  187.0  -73.0  -8.0
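To make the mixed-input behaviour concrete, here is a minimal sketch of how such a subtraction could be written. It is not the physlearn implementation: the name `difference_sketch` is hypothetical, and the sketch assumes that `y` and `raw_predictions` already have matching shapes.

```python
import numpy as np
import pandas as pd


def difference_sketch(y, raw_predictions):
    """Hypothetical helper: subtract predictions from targets for mixed inputs."""
    # Convert the predictions to a plain array so that pandas does not try
    # to align on index or column labels during the subtraction.
    predictions = np.asarray(raw_predictions)
    if isinstance(y, (pd.DataFrame, pd.Series)):
        # Keep the pandas container (and its labels) of the targets.
        return y - predictions
    return np.asarray(y) - predictions


# Small made-up values, not the Linnerud data.
y = pd.DataFrame([[10.0, 20.0], [30.0, 40.0]], columns=["a", "b"])
raw_predictions = np.array([[1.0, 2.0], [3.0, 4.0]])

diff = difference_sketch(y, raw_predictions)
print(diff)
```

In this sketch the result keeps the container type of the targets, so a DataFrame in gives a DataFrame out, while two arrays give an array.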
Bases: LeastSquaresError
Least squares loss function.
This class modifies the original scikit-learn LeastSquaresError so that the average loss and pseudo-residual computations accept any mix of NumPy and pandas data representations. The modification also supports both single-target and multi-target data.
References
Alex Wozniakowski, Jayne Thompson, Mile Gu, and Felix C. Binder. “A new formulation of gradient boosting”, Machine Learning: Science and Technology, 2 045022 (2021).
Jerome Friedman. “Greedy function approximation: A gradient boosting machine,” Annals of Statistics, 29(5):1189–1232 (2001).
Computes the average loss.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
sample_weight (float, ndarray, or None, optional (default=None)) – Individual weights for each target. If the weight is a float, then every target will have the same weight.
mse
DataFrame, Series, or ndarray
Examples
>>> from sklearn.datasets import load_linnerud
>>> from physlearn import LeastSquaresError
>>> X, y = load_linnerud(return_X_y=True)
>>> ls = LeastSquaresError()
>>> ls(y=y, raw_predictions=X)
16048.6
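As a rough illustration of what an average squared-error loss looks like, the sketch below averages the squared differences over every entry of the target matrix. The function name `mean_squared_error_sketch`, the choice to average over all entries, and the treatment of a 1-D `sample_weight` as one weight per row are assumptions made for this example, not statements about physlearn's internals.

```python
import numpy as np


def mean_squared_error_sketch(y, raw_predictions, sample_weight=None):
    """Hypothetical average squared-error loss over all entries."""
    diff = np.asarray(y, dtype=float) - np.asarray(raw_predictions, dtype=float)
    squared = diff ** 2
    if sample_weight is None:
        return float(np.mean(squared))
    weights = np.asarray(sample_weight, dtype=float)
    # Assumption: a 1-D weight vector holds one weight per row (example).
    if weights.ndim == 1 and squared.ndim == 2:
        weights = weights[:, np.newaxis]
    # A scalar weight broadcasts to every entry and leaves the mean unchanged.
    weights = np.broadcast_to(weights, squared.shape)
    return float(np.sum(weights * squared) / np.sum(weights))


# Small made-up values, not the Linnerud data.
y = np.array([[3.0, 0.0], [1.0, 2.0]])
raw_predictions = np.array([[1.0, 0.0], [1.0, 5.0]])

mse = mean_squared_error_sketch(y, raw_predictions)
weighted_mse = mean_squared_error_sketch(y, raw_predictions, sample_weight=[1.0, 3.0])
print(mse, weighted_mse)
```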
Computes the pseudo-residuals.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
residual
DataFrame, Series, or ndarray
Examples
>>> import pandas as pd
>>> from sklearn.datasets import load_linnerud
>>> from physlearn import LeastSquaresError
>>> X, y = load_linnerud(return_X_y=True)
>>> ls = LeastSquaresError()
>>> ls.negative_gradient(y=pd.DataFrame(y), raw_predictions=X).iloc[:2]
       0      1     2
0  186.0 -126.0 -10.0
1  187.0  -73.0  -8.0
Bases: LeastAbsoluteError
Absolute error loss function.
This class modifies the original scikit-learn LeastAbsoluteError so that the average loss and pseudo-residual computations accept any mix of NumPy and pandas data representations. The modification also supports both single-target and multi-target data.
References
Alex Wozniakowski, Jayne Thompson, Mile Gu, and Felix C. Binder. “A new formulation of gradient boosting”, Machine Learning: Science and Technology, 2 045022 (2021).
Jerome Friedman. “Greedy function approximation: A gradient boosting machine,” Annals of Statistics, 29(5):1189–1232 (2001).
Computes the average loss.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
sample_weight (float, ndarray, or None, optional (default=None)) – Individual weights for each target. If the weight is a float, then every target will have the same weight.
mae
DataFrame, Series, or ndarray
Examples
>>> from sklearn.datasets import load_linnerud
>>> from physlearn import LeastAbsoluteError
>>> X, y = load_linnerud(return_X_y=True)
>>> lad = LeastAbsoluteError()
>>> lad(y=y, raw_predictions=X)
104.23333333333333
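For comparison with the squared-error case, an unweighted average absolute-error loss can be sketched in a few lines. The name `mean_absolute_error_sketch` is hypothetical, and averaging over every entry of the matrix is an assumption of this example.

```python
import numpy as np


def mean_absolute_error_sketch(y, raw_predictions):
    """Hypothetical average absolute-error loss over all entries."""
    diff = np.asarray(y, dtype=float) - np.asarray(raw_predictions, dtype=float)
    return float(np.mean(np.abs(diff)))


# Small made-up values, not the Linnerud data.
y = np.array([[3.0, 0.0], [1.0, 2.0]])
raw_predictions = np.array([[1.0, 0.0], [1.0, 5.0]])

mae = mean_absolute_error_sketch(y, raw_predictions)
print(mae)
```

Because each difference enters linearly rather than quadratically, a single large difference moves this loss far less than it moves the squared-error loss.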
Computes the pseudo-residuals.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
residual
DataFrame, Series, or ndarray
Examples
>>> import pandas as pd
>>> from sklearn.datasets import load_linnerud
>>> from physlearn import LeastAbsoluteError
>>> X, y = load_linnerud(return_X_y=True)
>>> lad = LeastAbsoluteError()
>>> lad.negative_gradient(y=pd.DataFrame(y), raw_predictions=X).iloc[:2]
     0    1    2
0  1.0 -1.0 -1.0
1  1.0 -1.0 -1.0
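The output above is the sign pattern of the differences between targets and predictions. A minimal sketch of that computation follows; the name `lad_negative_gradient_sketch` is hypothetical, and mapping a zero difference to -1 is an assumption of this sketch rather than documented behaviour.

```python
import numpy as np


def lad_negative_gradient_sketch(difference):
    """Hypothetical pseudo-residuals for the absolute-error loss."""
    diff = np.asarray(difference, dtype=float)
    # +1 where the target exceeds the prediction, -1 otherwise.
    return np.where(diff > 0, 1.0, -1.0)


# The two rows of differences displayed in the _difference example.
difference = np.array([[186.0, -126.0, -10.0], [187.0, -73.0, -8.0]])

residual = lad_negative_gradient_sketch(difference)
print(residual)
```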
Bases: HuberLossFunction
Huber loss function.
This class modifies the original scikit-learn HuberLossFunction so that the average loss and pseudo-residual computations accept any mix of NumPy and pandas data representations. The modification also supports both single-target and multi-target data.
References
Alex Wozniakowski, Jayne Thompson, Mile Gu, and Felix C. Binder. “A new formulation of gradient boosting”, Machine Learning: Science and Technology, 2 045022 (2021).
Jerome Friedman. “Greedy function approximation: A gradient boosting machine,” Annals of Statistics, 29(5):1189–1232 (2001).
Computes the delta threshold.
This threshold determines, for each difference, whether the squared error or the absolute error loss applies.
difference (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The difference between the single-target(s) and the raw prediction(s).
sample_weight (float, ndarray, or None, optional (default=None)) – Individual weights for each target. If the weight is a float, then every target will have the same weight.
delta
np.float64
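One common way to choose such a threshold, following Friedman's paper, is to take a quantile of the absolute differences. The sketch below does this with `np.percentile`. The name `huber_delta_sketch`, the `alpha=0.9` default, the omission of sample weights, and the pooling of all targets into one quantile are assumptions of the example.

```python
import numpy as np


def huber_delta_sketch(difference, alpha=0.9):
    """Hypothetical threshold: the alpha-quantile of the absolute differences."""
    abs_diff = np.abs(np.asarray(difference, dtype=float))
    # np.percentile flattens its input, so all targets share one threshold here.
    return np.percentile(abs_diff, alpha * 100)


# Made-up differences with a single large outlier.
difference = np.array([1, -2, 3, -4, 5, -6, 7, -8, 9, -10, 100], dtype=float)

delta = huber_delta_sketch(difference)
print(delta)
```

With these values the outlier of 100 lies above the threshold, so it would be handled by the absolute-error branch of the loss.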
Computes the average loss.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
sample_weight (float, ndarray, or None, optional (default=None)) – Individual weights for each target. If the weight is a float, then every target will have the same weight.
huber
DataFrame, Series, or ndarray
Examples
>>> from sklearn.datasets import load_linnerud
>>> from physlearn import HuberLossFunction
>>> X, y = load_linnerud(return_X_y=True)
>>> huber = HuberLossFunction()
>>> huber(y=y, raw_predictions=X)
7989.893
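The Huber loss is quadratic for differences within the threshold and linear beyond it. The sketch below spells out that piecewise definition for a given threshold; the name `huber_loss_sketch` is hypothetical, sample weights are left out, and the threshold is passed in directly instead of being estimated from the data.

```python
import numpy as np


def huber_loss_sketch(difference, delta):
    """Hypothetical average Huber loss for a fixed threshold delta."""
    diff = np.asarray(difference, dtype=float)
    abs_diff = np.abs(diff)
    # Quadratic branch for small differences, linear branch for large ones.
    quadratic = 0.5 * diff ** 2
    linear = delta * (abs_diff - 0.5 * delta)
    return float(np.mean(np.where(abs_diff <= delta, quadratic, linear)))


# Made-up differences: two within the threshold, one beyond it.
difference = np.array([1.0, -2.0, 10.0])

loss = huber_loss_sketch(difference, delta=3.0)
print(loss)
```

The two branches agree at `abs_diff == delta`, which keeps the loss continuous at the threshold.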
Computes the pseudo-residuals.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
residual
DataFrame, Series, or ndarray
Examples
>>> import pandas as pd
>>> from sklearn.datasets import load_linnerud
>>> from physlearn import HuberLossFunction
>>> X, y = load_linnerud(return_X_y=True)
>>> huber = HuberLossFunction()
>>> huber.negative_gradient(y=pd.DataFrame(y), raw_predictions=X).iloc[:2]
       0      1     2
0  186.0 -126.0 -10.0
1  187.0  -73.0  -8.0
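The two rows shown above coincide with the plain differences, which is what one would expect for entries that fall within the threshold. A minimal sketch of the clipping rule is given below, assuming a fixed threshold; the name `huber_negative_gradient_sketch` is hypothetical.

```python
import numpy as np


def huber_negative_gradient_sketch(difference, delta):
    """Hypothetical Huber pseudo-residuals for a fixed threshold delta."""
    diff = np.asarray(difference, dtype=float)
    # Keep small differences as they are; clip large ones to +/- delta.
    return np.where(np.abs(diff) <= delta, diff, delta * np.sign(diff))


# Made-up differences: two within the threshold, one beyond it.
difference = np.array([1.0, -2.0, 10.0])

residual = huber_negative_gradient_sketch(difference, delta=3.0)
print(residual)
```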
Bases: QuantileLossFunction
Quantile loss function.
This class modifies the original scikit-learn QuantileLossFunction so that the average loss and pseudo-residual computations accept any mix of NumPy and pandas data representations. The modification also supports both single-target and multi-target data.
References
Alex Wozniakowski, Jayne Thompson, Mile Gu, and Felix C. Binder. “A new formulation of gradient boosting”, Machine Learning: Science and Technology, 2 045022 (2021).
Jerome Friedman. “Greedy function approximation: A gradient boosting machine,” Annals of Statistics, 29(5):1189–1232 (2001).
Computes the average loss.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
sample_weight (float, ndarray, or None, optional (default=None)) – Individual weights for each target. If the weight is a float, then every target will have the same weight.
quantile
DataFrame, Series, or ndarray
Examples
>>> from sklearn.datasets import load_linnerud
>>> from physlearn import QuantileLossFunction
>>> X, y = load_linnerud(return_X_y=True)
>>> quantile = QuantileLossFunction()
>>> quantile(y=y, raw_predictions=X)
174.27
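The quantile (pinball) loss weights under-predictions and over-predictions asymmetrically. The following sketch illustrates this; the name `quantile_loss_sketch`, the `alpha=0.9` default, and the unweighted average over all entries are assumptions of the example.

```python
import numpy as np


def quantile_loss_sketch(y, raw_predictions, alpha=0.9):
    """Hypothetical average quantile (pinball) loss."""
    diff = np.asarray(y, dtype=float) - np.asarray(raw_predictions, dtype=float)
    # Under-predictions (diff > 0) cost alpha per unit;
    # over-predictions cost (1 - alpha) per unit.
    loss = np.where(diff > 0, alpha * diff, (alpha - 1.0) * diff)
    return float(np.mean(loss))


# Small made-up values, not the Linnerud data.
y = np.array([3.0, 1.0, 5.0])
raw_predictions = np.array([1.0, 2.0, 1.0])

pinball = quantile_loss_sketch(y, raw_predictions)
print(pinball)
```

With `alpha=0.9`, predicting too low is penalized nine times as heavily as predicting too high, which pushes a fitted model toward the 0.9 quantile.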
Computes the pseudo-residuals.
y (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The target matrix, where each row corresponds to an example and the column(s) correspond to the single-target(s).
raw_predictions (array-like of shape = [n_samples] or shape = [n_samples, n_targets]) – The estimate matrix, where each row corresponds to an example and the column(s) correspond to the prediction(s) for the single-target(s).
residual
DataFrame, Series, or ndarray
Examples
>>> import pandas as pd
>>> from sklearn.datasets import load_linnerud
>>> from physlearn import QuantileLossFunction
>>> X, y = load_linnerud(return_X_y=True)
>>> quantile = QuantileLossFunction()
>>> quantile.negative_gradient(y=pd.DataFrame(y), raw_predictions=X).iloc[:2]
     0    1    2
0  0.9 -0.1 -0.1
1  0.9 -0.1 -0.1
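The values 0.9 and -0.1 in this output are consistent with a quantile level of 0.9: the pseudo-residual equals the level where the target exceeds the prediction and the level minus one elsewhere. A short sketch under that assumption follows; the name `quantile_negative_gradient_sketch` and the `alpha=0.9` default are hypothetical.

```python
import numpy as np


def quantile_negative_gradient_sketch(difference, alpha=0.9):
    """Hypothetical pseudo-residuals for the quantile loss."""
    diff = np.asarray(difference, dtype=float)
    # alpha where the target exceeds the prediction, alpha - 1 otherwise.
    return np.where(diff > 0, alpha, alpha - 1.0)


# The two rows of differences displayed in the _difference example.
difference = np.array([[186.0, -126.0, -10.0], [187.0, -73.0, -8.0]])

residual = quantile_negative_gradient_sketch(difference)
print(residual)
```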