- `BayesSearchCV`). -1 means using all available processors.
- Defaults to -1.
- time_limit_param (list): A parameter for future use, intended to set
- time limits on model fitting. Currently not implemented.
- Defaults to [3].
- random_state_val (int): A seed value for random number generation to
- ensure reproducibility across runs. Defaults to 1234.
- n_jobs_model_val (int): The number of parallel jobs for models that
- support it (e.g., RandomForest). -1 means using all available
- processors. Defaults to -1.
- max_param_space_iter_value (int): A hard limit on the number of
- parameter combinations to evaluate in `RandomizedSearchCV` or
- `BayesSearchCV`. Prevents excessively long run times.
- Defaults to 10.
- store_models (bool): Whether to save trained models to disk.
- metric_list (Dict[str, Union[str, Callable]]): A dictionary of scoring
- metrics to evaluate models during cross-validation. Keys are metric
- names and values are scikit-learn scorer strings or callable objects.
  """

  _instance=None

  # Class attributes with type hints
  debug_level: int
+ """The verbosity level for debugging. Not widely used. Defaults to 0."""
  knn_n_jobs: int
+ """The number of parallel jobs to run for KNN algorithms. -1 means using all available processors. Defaults to -1."""
  verbose: int
+ """Controls the verbosity of output during the pipeline run. Higher values produce more detailed logs. Defaults to 0."""
  rename_cols: bool
+ """If True, renames DataFrame columns to remove special characters (e.g., '[, ], <') that can cause issues with some models like XGBoost. Defaults to True."""
  error_raise: bool
+ """If True, the pipeline will stop and raise an exception if an error occurs during model training or evaluation. If False, it will log the error and continue. Defaults to False."""
  random_grid_search: bool
+ """If True and `bayessearch` is False, uses `RandomizedSearchCV` instead of `GridSearchCV`. Defaults to False."""
  bayessearch: bool
+ """If True, uses `BayesSearchCV` from `scikit-optimize` for hyperparameter tuning, which can be more efficient than grid or random search. Defaults to True."""
  sub_sample_param_space_pct: float
+ """The percentage of the total parameter space to sample when using `RandomizedSearchCV`. For example, 0.1 means 10% of the combinations will be tried. Defaults to 0.0005."""
  grid_n_jobs: int
+ """The number of jobs to run in parallel for hyperparameter search (`GridSearchCV`, `RandomizedSearchCV`, `BayesSearchCV`). -1 means using all available processors. Defaults to -1."""
  time_limit_param: List[int]
+ """A parameter for future use, intended to set time limits on model fitting. Currently not implemented. Defaults to [3]."""
  random_state_val: int
+ """A seed value for random number generation to ensure reproducibility across runs. Defaults to 1234."""
  n_jobs_model_val: int
+ """The number of parallel jobs for models that support it (e.g., RandomForest). -1 means using all available processors. Defaults to -1."""
  max_param_space_iter_value: int
+ """A hard limit on the number of parameter combinations to evaluate in `RandomizedSearchCV` or `BayesSearchCV`. Prevents excessively long run times. Defaults to 10."""
  store_models: bool
+ """Whether to save trained models to disk. Defaults to True."""
  metric_list: Dict[str, Union[str, Callable]]
+ """A dictionary of scoring metrics to evaluate models during cross-validation. Keys are metric names and values are scikit-learn scorer strings or callable objects."""
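As an illustration only, the way the docstrings describe `bayessearch`, `random_grid_search`, `sub_sample_param_space_pct`, and `max_param_space_iter_value` interacting might be sketched as follows. `SearchSettings`, `choose_search`, and `n_iterations` are hypothetical names that do not appear in the changed file, and the rule that combines the sampling fraction with the hard cap is an assumption, not the project's confirmed logic.

```python
from dataclasses import dataclass


@dataclass
class SearchSettings:
    """Hypothetical stand-in holding four of the documented attributes."""

    random_grid_search: bool = False
    bayessearch: bool = True
    sub_sample_param_space_pct: float = 0.0005
    max_param_space_iter_value: int = 10


def choose_search(settings: SearchSettings) -> str:
    """Return the name of the search class the documented flags would select."""
    # Per the docstrings, random_grid_search only matters when bayessearch is False.
    if settings.bayessearch:
        return "BayesSearchCV"
    if settings.random_grid_search:
        return "RandomizedSearchCV"
    return "GridSearchCV"


def n_iterations(settings: SearchSettings, param_space_size: int) -> int:
    """Assumed rule: sample a fraction of the space, at least 1, capped by the hard limit."""
    sampled = int(param_space_size * settings.sub_sample_param_space_pct)
    return max(1, min(sampled, settings.max_param_space_iter_value))
```

With the documented defaults, `choose_search(SearchSettings())` selects `BayesSearchCV`, and a large parameter space is cut down to `max_param_space_iter_value` iterations under the assumed rule.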