API
find_best_model
find_best_model(
X,
y,
model,
search_space,
optimizing_metric,
k_outer=5,
skip_outer_folds=None,
k_inner=5,
skip_inner_folds=None,
n_initial_points=5,
n_calls=10,
calibrate="no",
calibrate_params=None,
other_metrics=None,
skopt_func=gp_minimize,
verbose=False,
)
Performs nested cross-validation to find the best classification model.
The inner loop tunes the hyperparameters (using a skopt primitive), and the outer loop computes metrics that assess the quality of the model without the risk of overfitting bias.
After the nested loop, the whole tuning procedure is applied to the full dataset to return a single model trained on all available data.
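The nested structure described above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper names (`k_fold_indices`, `knn_predict`, `accuracy`, `tune`, `nested_cv`) are hypothetical, the toy 1-D nearest-neighbours classifier stands in for a real estimator, and an exhaustive search over a few candidates replaces the skopt optimization.

```python
def k_fold_indices(n_samples, k):
    """Return k (train, test) index pairs built from interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    pairs = []
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        pairs.append((train, test))
    return pairs


def knn_predict(train_X, train_y, x, n_neighbors):
    """Toy 1-D nearest-neighbours classifier using a majority vote."""
    order = sorted(range(len(train_X)), key=lambda i: abs(train_X[i] - x))
    votes = [train_y[i] for i in order[:n_neighbors]]
    return max(sorted(set(votes)), key=votes.count)


def accuracy(X, y, train, test, n_neighbors):
    """Fit on the `train` indices and score accuracy on the `test` indices."""
    train_X = [X[i] for i in train]
    train_y = [y[i] for i in train]
    hits = sum(knn_predict(train_X, train_y, X[i], n_neighbors) == y[i] for i in test)
    return hits / len(test)


def tune(X, y, indices, candidates, k_inner):
    """Inner loop: return the candidate with the best mean validation accuracy."""
    def mean_cv_accuracy(n_neighbors):
        scores = []
        for train_pos, val_pos in k_fold_indices(len(indices), k_inner):
            train = [indices[p] for p in train_pos]
            val = [indices[p] for p in val_pos]
            scores.append(accuracy(X, y, train, val, n_neighbors))
        return sum(scores) / len(scores)
    return max(candidates, key=mean_cv_accuracy)


def nested_cv(X, y, candidates, k_outer=5, k_inner=5):
    """Outer loop: score tuned models on held-out folds, then tune on all data."""
    outer_scores = []
    for train, test in k_fold_indices(len(X), k_outer):
        best = tune(X, y, train, candidates, k_inner)  # the test fold is never seen here
        outer_scores.append(accuracy(X, y, train, test, best))
    final_choice = tune(X, y, list(range(len(X))), candidates, k_inner)
    return final_choice, outer_scores


# Toy data: one feature, two well-separated classes.
X = [float(v) for v in range(20)]
y = [0] * 10 + [1] * 10
best_n_neighbors, outer_scores = nested_cv(X, y, candidates=[1, 3, 5], k_outer=5, k_inner=4)
```

The outer scores estimate how well the whole "tune, then fit" procedure generalizes, because each outer test fold is excluded from the tuning that selects the hyperparameter scored on it.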
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like of shape (n_samples, n_features) | Features. | required |
y | array-like of shape (n_samples,) | Targets to predict. They must be discrete (for classification); both binary and multiclass targets are supported. | required |
model | estimator object | estimator object implementing | required |
search_space | list of tuple | Search space dimensions provided as a list. Each search dimension should be defined as an instance of a | required |
optimizing_metric | str or callable | Strategy to evaluate the performance of the cross-validated model on each inner test set, to find the best hyperparameters. It should follow the | required |
k_outer | int | Number of folds for the outer cross-validation. | 5 |
skip_outer_folds | list | If set, list of outer folds to skip during the loop, to reduce computational cost. | None |
k_inner | int | Number of folds for the inner cross-validation. | 5 |
skip_inner_folds | list | If set, list of inner folds to skip during the loop, to reduce computational cost. | None |
n_initial_points | int | Number of initial points to use in the skopt optimization. | 5 |
n_calls | int | Number of additional calls to use in the skopt optimization. | 10 |
calibrate | str | Whether to calibrate the output probabilities. Options: | 'no' |
calibrate_params | dict | Dictionary of parameters for CalibratedClassifierCV. | None |
other_metrics | dict | If not empty, every metric specified in this parameter is computed in the report output, showing the results over the inner folds (during tuning) and over the outer folds (during performance evaluation). Provide it as a dictionary with metric names as keys and callables or strings as values. See the examples. | None |
skopt_func | callable | Minimization function of the skopt library to be used. Available options are: | gp_minimize |
verbose | bool | Whether to trace progress or not. | False |
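To get a feel for how the fold and search parameters interact, the following rough sketch estimates the number of model fits spent on tuning. It rests on assumptions that this page does not confirm: each search evaluation fits one model per non-skipped inner fold, tuning runs once per non-skipped outer fold plus once more on the full dataset, and skipped inner folds are also skipped in that final run. The function `estimate_tuning_fits` is hypothetical and not part of the library.

```python
def estimate_tuning_fits(
    k_outer=5,
    skip_outer_folds=None,
    k_inner=5,
    skip_inner_folds=None,
    n_initial_points=5,
    n_calls=10,
):
    """Rough count of model fits spent on hyperparameter tuning (assumption-based)."""
    active_outer = k_outer - len(set(skip_outer_folds or []))
    active_inner = k_inner - len(set(skip_inner_folds or []))
    # Per the parameter table, n_calls counts calls made in addition to the initial points.
    evaluations = n_initial_points + n_calls
    # Assumed: one tuning run per active outer fold, plus one on the full dataset.
    tuning_runs = active_outer + 1
    return tuning_runs * evaluations * active_inner


default_cost = estimate_tuning_fits()
reduced_cost = estimate_tuning_fits(skip_outer_folds=[3, 4], skip_inner_folds=[4])
print(default_cost, reduced_cost)  # 450 240
```

Under these assumptions, skipping two outer folds and one inner fold cuts the tuning work roughly in half, at the price of noisier performance estimates.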
Returns:
Name | Type | Description |
---|---|---|
model | estimator | Model trained with the full dataset using the same procedure as in the inner cross-validation. |
params | dict | Dictionary of (hyper)parameters of the best model. |
loop_info | dataclass | Dataclass with information about the optimization process. The opt_info object has a method |
Source code in nestedcvtraining/api.py