No path specified. Models will be saved in: "AutogluonModels/ag-20231201_114515/"
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20231201_114515/"
AutoGluon Version: 0.8.2
Python Version: 3.10.13
Operating System: Linux
Platform Machine: x86_64
Platform Version: #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2
Disk Space Avail: 248.33 GB / 490.57 GB (50.6%)
Train Data Rows: 1338
Train Data Columns: 6
Label Column: charges
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == float and many unique label-values observed).
Label info (max, min, mean, stddev): (63770.42801, 1121.8739, 13270.42227, 12110.01124)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
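The dtype-based rule the log describes (label column is float with many unique values, so the task is treated as regression) can be sketched in plain Python. This is a deliberately simplified illustration, not AutoGluon's actual inference logic, which handles many more edge cases:

```python
def infer_problem_type(labels, unique_threshold=20):
    """Simplified sketch of dtype-based problem-type inference.

    Not AutoGluon's real implementation -- just the rule the log
    message describes: two unique values -> binary; float labels
    with many unique values -> regression; otherwise multiclass.
    """
    unique_values = set(labels)
    if len(unique_values) == 2:
        return "binary"
    if all(isinstance(v, float) for v in labels) and len(unique_values) > unique_threshold:
        return "regression"
    return "multiclass"
```

For the `charges` column here (1338 float values, far more than 20 unique), this rule lands on `'regression'`, matching the log.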
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 124932.12 MB
Train Data (Original) Memory Usage: 0.28 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
	Fitting AsTypeFeatureGenerator...
		Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
	Fitting FillNaFeatureGenerator...
Stage 3 Generators:
	Fitting IdentityFeatureGenerator...
	Fitting CategoryFeatureGenerator...
		Fitting CategoryMemoryMinimizeFeatureGenerator...
Stage 4 Generators:
	Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
	Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
	('float', [])  : 1 | ['bmi']
	('int', [])    : 2 | ['age', 'children']
	('object', []) : 3 | ['sex', 'smoker', 'region']
Types of features in processed data (raw dtype, special dtypes):
	('category', [])  : 1 | ['region']
	('float', [])     : 1 | ['bmi']
	('int', [])       : 2 | ['age', 'children']
	('int', ['bool']) : 2 | ['sex', 'smoker']
0.0s = Fit runtime
6 features in original data used to generate 6 features in processed data.
Train Data (Processed) Memory Usage: 0.04 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.04s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped so that higher is always better; multiply the reported score by -1 to recover the original metric value.
To change the metric, specify the eval_metric parameter of TabularPredictor()
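The sign convention can be illustrated with a small stdlib sketch: the validation score reported for each model below is the negated RMSE, so a larger (less negative) score is better:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error over paired true/predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def validation_score(y_true, y_pred):
    # AutoGluon reports -RMSE so that "higher is better" holds for
    # every eval_metric; multiply by -1 to get the raw RMSE back.
    return -rmse(y_true, y_pred)
```

Under this convention, CatBoost's score of -3902.3094 below corresponds to a raw validation RMSE of 3902.3094.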
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 1070, Val Rows: 268
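The automatic 80/20 holdout above (1070 train rows and 268 validation rows from 1338 total) can be sketched as a plain random split. This is a simplification, not AutoGluon's exact splitter (which, for classification, also stratifies by label):

```python
import random

def holdout_split(n_rows, holdout_frac=0.2, seed=0):
    """Sketch of a random train/validation index split.

    Illustrative only: AutoGluon's internal splitter differs in
    detail, but the row counts come out the same way.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_val = round(n_rows * holdout_frac)
    return indices[n_val:], indices[:n_val]
```

With `n_rows=1338` and `holdout_frac=0.2`, this yields 1070 training and 268 validation rows, matching the log line above.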
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': {},
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
	'CAT': {},
	'XGB': {},
	'FASTAI': {},
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif ...
-11788.3793 = Validation score (-root_mean_squared_error)
0.01s = Training runtime
0.02s = Validation runtime
Fitting model: KNeighborsDist ...
-11888.3553 = Validation score (-root_mean_squared_error)
0.01s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
-3998.3761 = Validation score (-root_mean_squared_error)
0.38s = Training runtime
0.0s = Validation runtime
Fitting model: LightGBM ...
-4015.3178 = Validation score (-root_mean_squared_error)
0.22s = Training runtime
0.0s = Validation runtime
Fitting model: RandomForestMSE ...
-4482.3325 = Validation score (-root_mean_squared_error)
0.3s = Training runtime
0.05s = Validation runtime
Fitting model: CatBoost ...
-3902.3094 = Validation score (-root_mean_squared_error)
0.42s = Training runtime
0.0s = Validation runtime
Fitting model: ExtraTreesMSE ...
-4150.3287 = Validation score (-root_mean_squared_error)
0.28s = Training runtime
0.11s = Validation runtime
Fitting model: NeuralNetFastAI ...
-4198.263 = Validation score (-root_mean_squared_error)
1.95s = Training runtime
0.02s = Validation runtime
Fitting model: XGBoost ...
-4185.5391 = Validation score (-root_mean_squared_error)
0.18s = Training runtime
0.0s = Validation runtime
Fitting model: NeuralNetTorch ...
-4125.3087 = Validation score (-root_mean_squared_error)
6.37s = Training runtime
0.02s = Validation runtime
Fitting model: LightGBMLarge ...
-4295.031 = Validation score (-root_mean_squared_error)
0.45s = Training runtime
0.0s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
-3873.2752 = Validation score (-root_mean_squared_error)
0.21s = Training runtime
0.0s = Validation runtime
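WeightedEnsemble_L2 is a level-2 model that combines the level-1 models' predictions via a learned weighted average, which is why its score (-3873.2752) beats every individual base model. A minimal sketch of the combination step; the weights below are illustrative, not the fitted ones:

```python
def weighted_ensemble_predict(model_preds, weights):
    """Combine base-model predictions with a convex weight vector.

    Sketch of the level-2 weighted-ensemble idea only; AutoGluon fits
    the weights on the validation set (greedy ensemble selection).
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    n = len(model_preds[0])
    return [sum(w * preds[i] for w, preds in zip(weights, model_preds))
            for i in range(n)]
```

For example, averaging two base models' predictions with equal (hypothetical) weights of 0.5 blends each row's two predictions into one.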
AutoGluon training complete, total runtime = 11.3s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20231201_114515/")
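As the final log line indicates, the fitted predictor can be reloaded from the run directory for inference. A short usage sketch, assuming AutoGluon 0.8 is installed and `test_data` is a hypothetical pandas DataFrame with the same six feature columns (age, sex, bmi, children, smoker, region):

```python
from autogluon.tabular import TabularPredictor

# Reload the fitted ensemble from the run directory shown in the log
predictor = TabularPredictor.load("AutogluonModels/ag-20231201_114515/")

# `test_data` is an assumed DataFrame with the training feature columns;
# the label column `charges` may be present or absent at predict time.
predictions = predictor.predict(test_data)

# Per-model validation scores, mirroring the fit log above
leaderboard = predictor.leaderboard(silent=True)
```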