
2023-01-20

Optuna

What is Optuna

Machine learning models have several hyperparameters, and their accuracy can vary greatly depending on how those hyperparameters are set. The task of finding optimal hyperparameters is called hyperparameter tuning. The following search algorithms have been proposed for hyperparameter tuning:

  • Grid Search
  • Random Search
  • Bayesian Optimization

Grid Search tries every combination of hyperparameters within a predefined range. Random Search tries random combinations of hyperparameters. Bayesian Optimization uses the results of previous trials to focus the search on promising combinations. The sketch below contrasts the first two approaches on a toy problem.
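As a minimal illustration in plain Python (not Optuna, and assuming nothing beyond the standard library), here is how Grid Search and Random Search would each look for the x that minimizes the toy function (x - 2) ** 2 used later in this article.

import random

def objective(x):
    return (x - 2) ** 2

# Grid Search: evaluate every point on a fixed grid over [-10, 10].
grid = [x * 0.5 for x in range(-20, 21)]
best_grid = min(grid, key=objective)

# Random Search: evaluate the same number of randomly sampled points.
samples = [random.uniform(-10, 10) for _ in range(len(grid))]
best_random = min(samples, key=objective)

print(f"grid search best x:   {best_grid:.3f}")
print(f"random search best x: {best_random:.3f}")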

Optuna is a Python framework for hyperparameter tuning. It mainly uses an algorithm called TPE (Tree-structured Parzen Estimator), a kind of Bayesian Optimization, to find optimal values.
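TPE is Optuna's default sampler, so you normally do not need to select it, but it can also be passed explicitly when you want to configure it; the seed below is only for reproducibility.

import optuna

# TPESampler is the default; passing it explicitly allows setting options such as seed.
study = optuna.create_study(sampler=optuna.samplers.TPESampler(seed=42))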

Optuna terminology

Optuna uses the following terms.

  • Study: a series of optimization trials
  • Trial: a single evaluation of the objective function

How to use Optuna

First, install Optuna.

$ pip install optuna

Optimization with Optuna proceeds in the following three main steps:

  1. Define an objective function that wraps the function to be optimized
  2. Create a Study object
  3. Run the optimization with the optimize method

The following code searches for the x that minimizes (x - 2) ** 2.

import optuna

# step 1
def objective(trial: optuna.Trial):
    # Sample x uniformly from [-10, 10]
    # (suggest_float replaces suggest_uniform, which is deprecated since Optuna 3.0).
    x = trial.suggest_float('x', -10, 10)
    score = (x - 2) ** 2
    print('x: %1.3f, score: %1.3f' % (x, score))
    return score

# step 2
study = optuna.create_study(direction="minimize")

# step 3
study.optimize(objective, n_trials=100)

study.best_value holds the minimum value of (x - 2) ** 2.

>> study.best_value

0.00026655993028283496

study.best_params holds the parameter x that minimizes (x - 2) ** 2.

>> study.best_params

{'x': 2.016326663170496}

study.best_trial holds the trial that achieved the minimum of (x - 2) ** 2.

>> study.best_trial

FrozenTrial(number=46, state=TrialState.COMPLETE, values=[0.00026655993028283496], datetime_start=datetime.datetime(2023, 1, 20, 11, 6, 46, 200725), datetime_complete=datetime.datetime(2023, 1, 20, 11, 6, 46, 208328), params={'x': 2.016326663170496}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'x': FloatDistribution(high=10.0, log=False, low=-10.0, step=None)}, trial_id=46, value=None)

study.trials holds all trials that have been run.

>> study.trials

[FrozenTrial(number=0, state=TrialState.COMPLETE, values=[48.70102052494164], datetime_start=datetime.datetime(2023, 1, 20, 6, 4, 39, 240177), datetime_complete=datetime.datetime(2023, 1, 20, 6, 4, 39, 254344), params={'x': 8.978611647379559}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'x': FloatDistribution(high=10.0, log=False, low=-10.0, step=None)}, trial_id=0, value=None),
.
.
.
 FrozenTrial(number=99, state=TrialState.COMPLETE, values=[1.310544492087495], datetime_start=datetime.datetime(2023, 1, 20, 6, 4, 40, 755667), datetime_complete=datetime.datetime(2023, 1, 20, 6, 4, 40, 763725), params={'x': 0.8552098480125299}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'x': FloatDistribution(high=10.0, log=False, low=-10.0, step=None)}, trial_id=99, value=None)]
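If pandas is installed, the same history can also be inspected as a DataFrame via study.trials_dataframe(); column names such as params_x follow Optuna's flattening convention.

df = study.trials_dataframe()
print(df[["number", "value", "params_x", "state"]].head())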

Trial settings

How to specify which parameters to optimize, and how to sample them, is shown below.

optimizer = trial.suggest_categorical('optimizer', ['MomentumSGD', 'Adam'])
num_layers = trial.suggest_int('num_layers', 1, 3)
dropout_rate = trial.suggest_uniform('dropout_rate', 0.0, 1.0)
learning_rate = trial.suggest_loguniform('learning_rate', 1e-5, 1e-2)
drop_path_rate = trial.suggest_discrete_uniform('drop_path_rate', 0.0, 1.0, 0.1)

Optuna provides the following methods on Trial.

Method                                        Description
suggest_categorical(name, choices)            Suggest a value for the categorical parameter.
suggest_discrete_uniform(name, low, high, q)  Suggest a value for the discrete parameter.
suggest_float(name, low, high[, step, log])   Suggest a value for the floating-point parameter.
suggest_int(name, low, high[, step, log])     Suggest a value for the integer parameter.
suggest_loguniform(name, low, high)           Suggest a value for the continuous parameter (log domain).
suggest_uniform(name, low, high)              Suggest a value for the continuous parameter.

The arguments of these methods are as follows:

  • name: the name of the hyperparameter
  • low: the lower bound of the parameter range
  • high: the upper bound of the parameter range
  • step: the interval between possible values of the parameter
  • q: the discretization interval
  • log: True if the parameter is sampled from a log domain
  • choices: the list of candidate values for a categorical parameter
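Note that suggest_uniform, suggest_loguniform, and suggest_discrete_uniform are deprecated as of Optuna 3.0; the earlier calls can be rewritten with suggest_float as follows.

dropout_rate = trial.suggest_float('dropout_rate', 0.0, 1.0)                # replaces suggest_uniform
learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-2, log=True)  # replaces suggest_loguniform
drop_path_rate = trial.suggest_float('drop_path_rate', 0.0, 1.0, step=0.1)  # replaces suggest_discrete_uniform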

Convenience features of Optuna

Optuna offers the following convenience features:

  • Pruner
  • Distributed optimization
  • Dashboard functionality

Pruner

Optuna has a feature called Pruner that can automatically stop unpromising trials early.

study = optuna.create_study(
    pruner=optuna.pruners.MedianPruner(),
)

The code above selects the MedianPruner, but other Pruners are available as well. See the following official documentation for details.

https://optuna.readthedocs.io/en/stable/reference/pruners.html
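A Pruner only has an effect when the objective reports intermediate values with trial.report and checks trial.should_prune; a minimal sketch of that pattern (the PyTorch example later in this article uses the same mechanism, and the intermediate score here is just a dummy):

import optuna

def objective(trial: optuna.Trial):
    x = trial.suggest_float("x", -10, 10)
    for step in range(100):
        intermediate = (x - 2) ** 2 / (step + 1)  # dummy intermediate score
        trial.report(intermediate, step)          # report progress to the pruner
        if trial.should_prune():                  # stop unpromising trials early
            raise optuna.TrialPruned()
    return (x - 2) ** 2

study = optuna.create_study(pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=50)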

Distributed optimization

By passing study_name and storage as arguments to create_study, the trial history is shared across processes, and distributed optimization can be implemented easily.

study = optuna.create_study(
    study_name="example-study",
    storage="sqlite:///example.db",  # note the three slashes in the SQLite URL
    load_if_exists=True
)

Setting load_if_exists to True allows loading and resuming a Study when one with the same name already exists in the DB.
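A worker process can then attach to the same Study with optuna.load_study; this sketch assumes example.db is reachable from each worker and that objective is defined as in the earlier example.

import optuna

# Each worker joins the shared study by name and storage URL,
# and its trials are recorded in the common database.
study = optuna.load_study(
    study_name="example-study",
    storage="sqlite:///example.db",
)
study.optimize(objective, n_trials=100)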

Dashboard functionality

Optuna provides a dashboard that lets you track the progress of a search.

https://github.com/optuna/optuna-dashboard
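Following the optuna-dashboard README, the dashboard is installed as a separate package and pointed at the study's storage URL; for example, with the SQLite storage from the previous section:

$ pip install optuna-dashboard
$ optuna-dashboard sqlite:///example.db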

Optimizing a PyTorch model

In the following example, we optimize the validation accuracy of fashion product recognition using PyTorch and FashionMNIST.

import os

import optuna
from optuna.trial import TrialState
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch.utils.data
from torchvision import datasets
from torchvision import transforms


DEVICE = torch.device("cpu")
BATCHSIZE = 128
CLASSES = 10
DIR = os.getcwd()
EPOCHS = 10
N_TRAIN_EXAMPLES = BATCHSIZE * 30
N_VALID_EXAMPLES = BATCHSIZE * 10


def define_model(trial: optuna.Trial):
    # We optimize the number of layers, hidden units and dropout ratio in each layer.
    n_layers = trial.suggest_int("n_layers", 1, 3)
    layers = []

    in_features = 28 * 28
    for i in range(n_layers):
        out_features = trial.suggest_int("n_units_l{}".format(i), 4, 128)
        layers.append(nn.Linear(in_features, out_features))
        layers.append(nn.ReLU())
        p = trial.suggest_float("dropout_l{}".format(i), 0.2, 0.5)
        layers.append(nn.Dropout(p))

        in_features = out_features
    layers.append(nn.Linear(in_features, CLASSES))
    layers.append(nn.LogSoftmax(dim=1))

    return nn.Sequential(*layers)


def get_mnist():
    # Load FashionMNIST dataset.
    train_loader = torch.utils.data.DataLoader(
        datasets.FashionMNIST(DIR, train=True, download=True, transform=transforms.ToTensor()),
        batch_size=BATCHSIZE,
        shuffle=True,
    )
    valid_loader = torch.utils.data.DataLoader(
        datasets.FashionMNIST(DIR, train=False, transform=transforms.ToTensor()),
        batch_size=BATCHSIZE,
        shuffle=True,
    )

    return train_loader, valid_loader


def objective(trial: optuna.Trial):

    # Generate the model.
    model = define_model(trial).to(DEVICE)

    # Generate the optimizers.
    optimizer_name = trial.suggest_categorical("optimizer", ["Adam", "RMSprop", "SGD"])
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    optimizer = getattr(optim, optimizer_name)(model.parameters(), lr=lr)

    # Get the FashionMNIST dataset.
    train_loader, valid_loader = get_mnist()

    # Training of the model.
    for epoch in range(EPOCHS):
        model.train()
        for batch_idx, (data, target) in enumerate(train_loader):
            # Limiting training data for faster epochs.
            if batch_idx * BATCHSIZE >= N_TRAIN_EXAMPLES:
                break

            data, target = data.view(data.size(0), -1).to(DEVICE), target.to(DEVICE)

            optimizer.zero_grad()
            output = model(data)
            loss = F.nll_loss(output, target)
            loss.backward()
            optimizer.step()

        # Validation of the model.
        model.eval()
        correct = 0
        with torch.no_grad():
            for batch_idx, (data, target) in enumerate(valid_loader):
                # Limiting validation data.
                if batch_idx * BATCHSIZE >= N_VALID_EXAMPLES:
                    break
                data, target = data.view(data.size(0), -1).to(DEVICE), target.to(DEVICE)
                output = model(data)
                # Get the index of the max log-probability.
                pred = output.argmax(dim=1, keepdim=True)
                correct += pred.eq(target.view_as(pred)).sum().item()

        accuracy = correct / min(len(valid_loader.dataset), N_VALID_EXAMPLES)

        trial.report(accuracy, epoch)

        # Handle pruning based on the intermediate value.
        if trial.should_prune():
            raise optuna.exceptions.TrialPruned()

    return accuracy

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100, timeout=600)

pruned_trials = study.get_trials(deepcopy=False, states=[TrialState.PRUNED])
complete_trials = study.get_trials(deepcopy=False, states=[TrialState.COMPLETE])

print("Study statistics: ")
print("  Number of finished trials: ", len(study.trials))
print("  Number of pruned trials: ", len(pruned_trials))
print("  Number of complete trials: ", len(complete_trials))

print("Best trial:")
trial = study.best_trial

print("  Value: ", trial.value)

print("  Params: ")
for key, value in trial.params.items():
    print("    {}: {}".format(key, value))

Study statistics:
  Number of finished trials:  100
  Number of pruned trials:  64
  Number of complete trials:  36
Best trial:
  Value:  0.8484375
  Params:
    n_layers: 1
    n_units_l0: 77
    dropout_l0: 0.2621844457931539
    optimizer: Adam
    lr: 0.0051477826780949205

Optimizing LightGBM

The following example optimizes the validation accuracy of cancer detection using LightGBM.

import numpy as np
import optuna

import lightgbm as lgb
import sklearn.datasets
import sklearn.metrics
from sklearn.model_selection import train_test_split

def objective(trial):
    data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
    train_x, valid_x, train_y, valid_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)

    param = {
        "objective": "binary",
        "metric": "binary_logloss",
        "verbosity": -1,
        "boosting_type": "gbdt",
        "lambda_l1": trial.suggest_float("lambda_l1", 1e-8, 10.0, log=True),
        "lambda_l2": trial.suggest_float("lambda_l2", 1e-8, 10.0, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.4, 1.0),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.4, 1.0),
        "bagging_freq": trial.suggest_int("bagging_freq", 1, 7),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
    }

    gbm = lgb.train(param, dtrain)
    preds = gbm.predict(valid_x)
    pred_labels = np.rint(preds)
    accuracy = sklearn.metrics.accuracy_score(valid_y, pred_labels)
    return accuracy

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)

[I 2023-01-20 09:18:25,197] A new study created in memory with name: no-name-26038220-3fba-4ada-9237-9ad9e0a7eff4
[I 2023-01-20 09:18:25,278] Trial 0 finished with value: 0.951048951048951 and parameters: {'lambda_l1': 3.6320373475789714e-05, 'lambda_l2': 0.0001638841686303377, 'num_leaves': 52, 'feature_fraction': 0.5051855392259837, 'bagging_fraction': 0.48918754678745996, 'bagging_freq': 4, 'min_child_samples': 30}. Best is trial 0 with value: 0.951048951048951.
.
.
.
[I 2023-01-20 09:18:37,148] Trial 99 finished with value: 0.972027972027972 and parameters: {'lambda_l1': 4.921752856772178e-06, 'lambda_l2': 5.0633857392202624e-08, 'num_leaves': 28, 'feature_fraction': 0.48257699231443446, 'bagging_fraction': 0.7810382257111896, 'bagging_freq': 3, 'min_child_samples': 28}. Best is trial 36 with value: 0.993006993006993.
print(f"Number of finished trials: {len(study.trials)}")
print("Best trial:")

trial = study.best_trial

print(f"  Value: {trial.value}")

print("  Params: ")
for key, value in trial.params.items():
    print(f"    {key}: {value}")

Number of finished trials: 100
Best trial:
  Value: 0.993006993006993
  Params:
    lambda_l1: 2.2820624207211886e-06
    lambda_l2: 4.100655307616414e-08
    num_leaves: 253
    feature_fraction: 0.6477416602072985
    bagging_fraction: 0.7393534933706116
    bagging_freq: 5
    min_child_samples: 36

LightGBM Tuner

Specifically for LightGBM, Optuna offers the LightGBM Tuner, which makes tuning LightGBM even easier.

Note, however, that the LightGBM Tuner tunes only the following hyperparameters.

  • lambda_l1
  • lambda_l2
  • num_leaves
  • feature_fraction
  • bagging_fraction
  • bagging_freq
  • min_child_samples

import numpy as np
import optuna.integration.lightgbm as lgb

from lightgbm import early_stopping
from lightgbm import log_evaluation
import sklearn.datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


if __name__ == "__main__":
    data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
    train_x, val_x, train_y, val_y = train_test_split(data, target, test_size=0.25)
    dtrain = lgb.Dataset(train_x, label=train_y)
    dval = lgb.Dataset(val_x, label=val_y)

    params = {
        "objective": "binary",
        "metric": "binary_logloss",
        "verbosity": -1,
        "boosting_type": "gbdt",
    }

    model = lgb.train(
        params,
        dtrain,
        valid_sets=[dtrain, dval],
        callbacks=[early_stopping(100), log_evaluation(100)],
    )

    prediction = np.rint(model.predict(val_x, num_iteration=model.best_iteration))
    accuracy = accuracy_score(val_y, prediction)

    best_params = model.params
    print("Best params:", best_params)
    print("  Accuracy = {}".format(accuracy))
    print("  Params: ")
    for key, value in best_params.items():
        print("    {}: {}".format(key, value))

Best params: {'objective': 'binary', 'metric': 'binary_logloss', 'verbosity': -1, 'boosting_type': 'gbdt', 'feature_pre_filter': False, 'lambda_l1': 3.9283033758323693e-07, 'lambda_l2': 0.11914982777201996, 'num_leaves': 4, 'feature_fraction': 0.4, 'bagging_fraction': 0.46448877892449625, 'bagging_freq': 3, 'min_child_samples': 20}
  Accuracy = 0.9790209790209791
  Params:
    objective: binary
    metric: binary_logloss
    verbosity: -1
    boosting_type: gbdt
    feature_pre_filter: False
    lambda_l1: 3.9283033758323693e-07
    lambda_l2: 0.11914982777201996
    num_leaves: 4
    feature_fraction: 0.4
    bagging_fraction: 0.46448877892449625
    bagging_freq: 3
    min_child_samples: 20
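Optuna also provides LightGBMTunerCV, which performs the same stepwise tuning with cross-validation instead of a single validation split. A minimal sketch, reusing params and dtrain from the script above (the nfold value is just an example):

from optuna.integration.lightgbm import LightGBMTunerCV

# Run the stepwise tuner with 5-fold cross-validation.
tuner = LightGBMTunerCV(params, dtrain, nfold=5)
tuner.run()

print("Best score:", tuner.best_score)
print("Best params:", tuner.best_params)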

Other optimization examples

The following GitHub repository contains many examples of applying Optuna to other models.

https://github.com/optuna/optuna-examples

References

https://optuna.readthedocs.io/en/stable/index.html
https://optuna.readthedocs.io/en/stable/reference/generated/optuna.integration.lightgbm.LightGBMTuner.html
https://optuna.readthedocs.io/en/stable/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial
https://github.com/optuna/optuna-examples
https://github.com/optuna/optuna-dashboard
