What is Optuna
Machine learning models have many hyperparameters, and their accuracy can vary greatly depending on how those hyperparameters are set. The task of finding optimal hyperparameters is called hyperparameter tuning. The following search algorithms have been proposed for hyperparameter tuning:
- Grid Search
- Random Search
- Bayesian Optimization
Grid Search tries every combination of hyperparameters within a defined range. Random Search tries random combinations of hyperparameters. Bayesian Optimization efficiently searches for hyperparameter combinations based on the results of previously tried combinations.
Optuna is a Python framework for hyperparameter tuning. It mainly uses an algorithm called TPE (Tree-structured Parzen Estimator), a kind of Bayesian Optimization, to find optimal values.
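TPE is Optuna's default sampler, so it is used even when no sampler is specified. It can also be selected explicitly when creating a study; a minimal sketch (not from the original article; the seed argument is only there to make sampling reproducible):
import optuna

# TPESampler is the default sampler; passing it explicitly documents the
# choice and lets you fix a seed for reproducible sampling.
sampler = optuna.samplers.TPESampler(seed=42)
study = optuna.create_study(direction="minimize", sampler=sampler)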
Optuna terminology
Optuna uses the following terms:
- Study: a set of optimization trials
- Trial: a single evaluation of the objective function
How to use Optuna
First, install Optuna.
$ pip install optuna
Optimization with Optuna proceeds in the following three main steps:
- Define an objective function that wraps the function to be optimized
- Create a variable of type Study
- Optimize with the optimize method
The following code searches for the x that minimizes (x - 2) ** 2.
import optuna

# Step 1: define the objective function.
def objective(trial: optuna.Trial):
    x = trial.suggest_uniform('x', -10, 10)
    score = (x - 2) ** 2
    print('x: %1.3f, score: %1.3f' % (x, score))
    return score

# Step 2: create a Study.
study = optuna.create_study(direction="minimize")

# Step 3: run the optimization.
study.optimize(objective, n_trials=100)
study.best_value holds the minimum value of (x - 2) ** 2.
>> study.best_value
0.00026655993028283496
study.best_params holds the parameter x at which (x - 2) ** 2 is minimal.
>> study.best_params
{'x': 2.016326663170496}
study.best_trial holds the trial at which (x - 2) ** 2 reached its minimum.
>> study.best_trial
FrozenTrial(number=46, state=TrialState.COMPLETE, values=[0.00026655993028283496], datetime_start=datetime.datetime(2023, 1, 20, 11, 6, 46, 200725), datetime_complete=datetime.datetime(2023, 1, 20, 11, 6, 46, 208328), params={'x': 2.016326663170496}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'x': FloatDistribution(high=10.0, log=False, low=-10.0, step=None)}, trial_id=46, value=None)
study.trials holds all trials that have been run.
>> study.trials
[FrozenTrial(number=0, state=TrialState.COMPLETE, values=[48.70102052494164], datetime_start=datetime.datetime(2023, 1, 20, 6, 4, 39, 240177), datetime_complete=datetime.datetime(2023, 1, 20, 6, 4, 39, 254344), params={'x': 8.978611647379559}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'x': FloatDistribution(high=10.0, log=False, low=-10.0, step=None)}, trial_id=0, value=None),
.
.
.
FrozenTrial(number=99, state=TrialState.COMPLETE, values=[1.310544492087495], datetime_start=datetime.datetime(2023, 1, 20, 6, 4, 40, 755667), datetime_complete=datetime.datetime(2023, 1, 20, 6, 4, 40, 763725), params={'x': 0.8552098480125299}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'x': FloatDistribution(high=10.0, log=False, low=-10.0, step=None)}, trial_id=99, value=None)]
Trial settings
Which parameters to optimize, and how to sample them, is specified as shown below.
optimizer = trial.suggest_categorical('optimizer', ['MomentumSGD', 'Adam'])
num_layers = trial.suggest_int('num_layers', 1, 3)
dropout_rate = trial.suggest_uniform('dropout_rate', 0.0, 1.0)
learning_rate = trial.suggest_loguniform('learning_rate', 1e-5, 1e-2)
drop_path_rate = trial.suggest_discrete_uniform('drop_path_rate', 0.0, 1.0, 0.1)
Optuna provides the following methods on Trial.
| Method | Description |
| --- | --- |
| suggest_categorical(name, choices) | Suggest a value for the categorical parameter. |
| suggest_discrete_uniform(name, low, high, q) | Suggest a value for the discrete parameter. |
| suggest_float(name, low, high[, step, log]) | Suggest a value for the floating point parameter. |
| suggest_int(name, low, high[, step, log]) | Suggest a value for the integer parameter. |
| suggest_loguniform(name, low, high) | Suggest a value for the continuous parameter. |
| suggest_uniform(name, low, high) | Suggest a value for the continuous parameter. |
The arguments to these functions are as follows:
- name: the name of the hyperparameter
- low: the minimum value of the parameter range
- high: the maximum value of the parameter range
- step: the interval between possible values of the parameter
- q: the discretization interval
- log: True if the parameter is sampled from a logarithmic domain
- choices: the list of categorical values for the parameter
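Note that newer Optuna releases deprecate suggest_uniform, suggest_loguniform, and suggest_discrete_uniform in favor of suggest_float with the log and step arguments. A sketch of the same search space as above in the newer style (assuming a recent Optuna version):
import optuna

def objective(trial: optuna.Trial):
    optimizer = trial.suggest_categorical('optimizer', ['MomentumSGD', 'Adam'])
    num_layers = trial.suggest_int('num_layers', 1, 3)
    dropout_rate = trial.suggest_float('dropout_rate', 0.0, 1.0)                 # uniform
    learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-2, log=True)  # log-uniform
    drop_path_rate = trial.suggest_float('drop_path_rate', 0.0, 1.0, step=0.1)  # discretized
    ...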
Convenience features of Optuna
Optuna offers the following convenience features:
- Pruner
- Distributed optimization
- Dashboard functionality
Pruner
Optuna has a feature called the Pruner, which can automatically stop unpromising trials early.
study = optuna.create_study(
    pruner=optuna.pruners.MedianPruner(),
)
The code above specifies a Pruner called MedianPruner(), but other Pruners are available as well. See the official documentation for details.
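As a minimal sketch of how a Pruner hooks into an objective (a toy example reusing the (x - 2) ** 2 setup, not from the original article): intermediate values are reported with trial.report, and trial.should_prune asks the Pruner whether the trial should be stopped early.
import optuna

def objective(trial: optuna.Trial):
    x = trial.suggest_float("x", -10, 10)
    for step in range(100):
        intermediate = (x - 2) ** 2 / (step + 1)  # toy intermediate score
        trial.report(intermediate, step)          # report this step's value
        if trial.should_prune():                  # the Pruner decides, e.g. MedianPruner
            raise optuna.TrialPruned()
    return (x - 2) ** 2

study = optuna.create_study(direction="minimize", pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=20)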
Distributed optimization
By passing study_name and storage as arguments to create_study, the trial history can be shared across processes, making distributed optimization easy to implement.
study = optuna.create_study(
    study_name="example-study",
    storage="sqlite:///example.db",
    load_if_exists=True,
)
Setting load_if_exists to True also allows the Study to be loaded and resumed when a Study with the same name already exists in the DB.
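Every worker process can then run the same script against the shared storage. A minimal sketch (the objective here is just a placeholder; example.db must be reachable by every worker):
import optuna

def objective(trial: optuna.Trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2

# Each worker attaches to the same study; the trial history is shared through
# the DB, so several processes collectively perform one optimization.
study = optuna.load_study(study_name="example-study", storage="sqlite:///example.db")
study.optimize(objective, n_trials=50)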
Dashboard functionality
Optuna provides a dashboard that lets you track the progress of a search.
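At the time of writing, the dashboard is distributed as the separate optuna-dashboard package; assuming a study stored in example.db as above, a typical invocation looks like this:
$ pip install optuna-dashboard
$ optuna-dashboard sqlite:///example.db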
Optimizing a PyTorch model
In the following example, we optimize the validation accuracy of fashion product recognition using PyTorch and FashionMNIST.
import os

import optuna
from optuna.trial import TrialState
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch.utils.data
from torchvision import datasets
from torchvision import transforms

DEVICE = torch.device("cpu")
BATCHSIZE = 128
CLASSES = 10
DIR = os.getcwd()
EPOCHS = 10
N_TRAIN_EXAMPLES = BATCHSIZE * 30
N_VALID_EXAMPLES = BATCHSIZE * 10
def define_model(trial: optuna.Trial):
    # We optimize the number of layers, hidden units and dropout ratio in each layer.
    n_layers = trial.suggest_int("n_layers", 1, 3)
    layers = []

    in_features = 28 * 28
    for i in range(n_layers):
        out_features = trial.suggest_int("n_units_l{}".format(i), 4, 128)
        layers.append(nn.Linear(in_features, out_features))
        layers.append(nn.ReLU())
        p = trial.suggest_float("dropout_l{}".format(i), 0.2, 0.5)
        layers.append(nn.Dropout(p))
        in_features = out_features
    layers.append(nn.Linear(in_features, CLASSES))
    layers.append(nn.LogSoftmax(dim=1))

    return nn.Sequential(*layers)
def get_mnist():
    # Load FashionMNIST dataset.
    train_loader = torch.utils.data.DataLoader(
        datasets.FashionMNIST(DIR, train=True, download=True, transform=transforms.ToTensor()),
        batch_size=BATCHSIZE,
        shuffle=True,
    )
    valid_loader = torch.utils.data.DataLoader(
        datasets.FashionMNIST(DIR, train=False, transform=transforms.ToTensor()),
        batch_size=BATCHSIZE,
        shuffle=True,
    )
    return train_loader, valid_loader
def objective(trial: optuna.Trial):
    # Generate the model.
    model = define_model(trial).to(DEVICE)

    # Generate the optimizers.
    optimizer_name = trial.suggest_categorical("optimizer", ["Adam", "RMSprop", "SGD"])
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    optimizer = getattr(optim, optimizer_name)(model.parameters(), lr=lr)

    # Get the FashionMNIST dataset.
    train_loader, valid_loader = get_mnist()

    # Training of the model.
    for epoch in range(EPOCHS):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            # Limiting training data for faster epochs.
            if batch_idx * BATCHSIZE >= N_TRAIN_EXAMPLES:
                break
            data, target = data.view(data.size(0), -1).to(DEVICE), target.to(DEVICE)
            optimizer.zero_grad()
            output = model(data)
            loss = F.nll_loss(output, target)
            loss.backward()
            optimizer.step()

        # Validation of the model.
        model.eval()
        correct = 0
        with torch.no_grad():
            for batch_idx, (data, target) in enumerate(valid_loader):
                # Limiting validation data.
                if batch_idx * BATCHSIZE >= N_VALID_EXAMPLES:
                    break
                data, target = data.view(data.size(0), -1).to(DEVICE), target.to(DEVICE)
                output = model(data)
                # Get the index of the max log-probability.
                pred = output.argmax(dim=1, keepdim=True)
                correct += pred.eq(target.view_as(pred)).sum().item()
        accuracy = correct / min(len(valid_loader.dataset), N_VALID_EXAMPLES)

        trial.report(accuracy, epoch)
        # Handle pruning based on the intermediate value.
        if trial.should_prune():
            raise optuna.exceptions.TrialPruned()

    return accuracy
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100, timeout=600)
pruned_trials = study.get_trials(deepcopy=False, states=[TrialState.PRUNED])
complete_trials = study.get_trials(deepcopy=False, states=[TrialState.COMPLETE])
print("Study statistics: ")
print(" Number of finished trials: ", len(study.trials))
print(" Number of pruned trials: ", len(pruned_trials))
print(" Number of complete trials: ", len(complete_trials))
print("Best trial:")
trial = study.best_trial
print(" Value: ", trial.value)
print(" Params: ")
for key, value in trial.params.items():
print(" {}: {}".format(key, value))
Study statistics:
Number of finished trials: 100
Number of pruned trials: 64
Number of complete trials: 36
Best trial:
Value: 0.8484375
Params:
n_layers: 1
n_units_l0: 77
dropout_l0: 0.2621844457931539
optimizer: Adam
lr: 0.0051477826780949205
Optimizing LightGBM
The following example optimizes the validation accuracy of cancer detection using LightGBM.
import numpy as np
import optuna
import lightgbm as lgb
import sklearn.datasets
import sklearn.metrics
from sklearn.model_selection import train_test_split
def objective(trial):
    data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
    train_x, valid_x, train_y, valid_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)

    param = {
        "objective": "binary",
        "metric": "binary_logloss",
        "verbosity": -1,
        "boosting_type": "gbdt",
        "lambda_l1": trial.suggest_float("lambda_l1", 1e-8, 10.0, log=True),
        "lambda_l2": trial.suggest_float("lambda_l2", 1e-8, 10.0, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.4, 1.0),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.4, 1.0),
        "bagging_freq": trial.suggest_int("bagging_freq", 1, 7),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
    }

    gbm = lgb.train(param, dtrain)
    preds = gbm.predict(valid_x)
    pred_labels = np.rint(preds)
    accuracy = sklearn.metrics.accuracy_score(valid_y, pred_labels)
    return accuracy
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)
[I 2023-01-20 09:18:25,197] A new study created in memory with name: no-name-26038220-3fba-4ada-9237-9ad9e0a7eff4
[I 2023-01-20 09:18:25,278] Trial 0 finished with value: 0.951048951048951 and parameters: {'lambda_l1': 3.6320373475789714e-05, 'lambda_l2': 0.0001638841686303377, 'num_leaves': 52, 'feature_fraction': 0.5051855392259837, 'bagging_fraction': 0.48918754678745996, 'bagging_freq': 4, 'min_child_samples': 30}. Best is trial 0 with value: 0.951048951048951.
.
.
.
[I 2023-01-20 09:18:37,148] Trial 99 finished with value: 0.972027972027972 and parameters: {'lambda_l1': 4.921752856772178e-06, 'lambda_l2': 5.0633857392202624e-08, 'num_leaves': 28, 'feature_fraction': 0.48257699231443446, 'bagging_fraction': 0.7810382257111896, 'bagging_freq': 3, 'min_child_samples': 28}. Best is trial 36 with value: 0.993006993006993.
print(f"Number of finished trials: {len(study.trials)}")
print("Best trial:")
trial = study.best_trial
print(f" Value: {trial.value}")
print(" Params: ")
for key, value in trial.params.items():
print(f" {key}: {value}")
Number of finished trials: 100
Best trial:
Value: 0.993006993006993
Params:
lambda_l1: 2.2820624207211886e-06
lambda_l2: 4.100655307616414e-08
num_leaves: 253
feature_fraction: 0.6477416602072985
bagging_fraction: 0.7393534933706116
bagging_freq: 5
min_child_samples: 36
LightGBM Tuner
Specifically for LightGBM, Optuna offers the LightGBM Tuner, which makes tuning LightGBM even easier.
Note, however, that the LightGBM Tuner only tunes the following hyperparameters:
- lambda_l1
- lambda_l2
- num_leaves
- feature_fraction
- bagging_fraction
- bagging_freq
- min_child_samples
import numpy as np
import optuna.integration.lightgbm as lgb
from lightgbm import early_stopping
from lightgbm import log_evaluation
import sklearn.datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
if __name__ == "__main__":
    data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
    train_x, val_x, train_y, val_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)
    dval = lgb.Dataset(val_x, label=val_y)

    params = {
        "objective": "binary",
        "metric": "binary_logloss",
        "verbosity": -1,
        "boosting_type": "gbdt",
    }

    model = lgb.train(
        params,
        dtrain,
        valid_sets=[dtrain, dval],
        callbacks=[early_stopping(100), log_evaluation(100)],
    )

    prediction = np.rint(model.predict(val_x, num_iteration=model.best_iteration))
    accuracy = accuracy_score(val_y, prediction)

    best_params = model.params
    print("Best params:", best_params)
    print("  Accuracy = {}".format(accuracy))
    print("  Params: ")
    for key, value in best_params.items():
        print("    {}: {}".format(key, value))
Best params: {'objective': 'binary', 'metric': 'binary_logloss', 'verbosity': -1, 'boosting_type': 'gbdt', 'feature_pre_filter': False, 'lambda_l1': 3.9283033758323693e-07, 'lambda_l2': 0.11914982777201996, 'num_leaves': 4, 'feature_fraction': 0.4, 'bagging_fraction': 0.46448877892449625, 'bagging_freq': 3, 'min_child_samples': 20}
Accuracy = 0.9790209790209791
Params:
objective: binary
metric: binary_logloss
verbosity: -1
boosting_type: gbdt
feature_pre_filter: False
lambda_l1: 3.9283033758323693e-07
lambda_l2: 0.11914982777201996
num_leaves: 4
feature_fraction: 0.4
bagging_fraction: 0.46448877892449625
bagging_freq: 3
min_child_samples: 20
Other optimization examples
Many more example implementations of Optuna with other models are available on GitHub.