params['learning_rate'] = 0.1
params['boosting_type'] = 'gbdt'
params['objective'] = 'gamma'
params['metric'] = 'l1'
params['sub_feature'] = 0.5
params['num_leaves'] = 40
params['min_data'] = 50
params['max_depth'] = 30
lgb_model = lgb.train(params, d_train, 1000)
# Prediction
y_pred = lgb_model.predict(X_test)
mae_error = mean_absolute_error(y_test,y_pred)
print(mae_error)
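But when I move on to GridSearchCV, I run into problems. I'm not entirely sure how to set it up correctly. I found some useful sources, for example here, but they seem to be working with a classifier.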
1st try:
from sklearn.metrics import make_scorer
score_func = make_scorer(mean_absolute_error, greater_is_better=False)
model = lgb.LGBMClassifier(
    boosting_type="gbdt",
    objective='regression',
    is_unbalance=True,
    random_state=10,
    n_estimators=50,
    num_leaves=30,
    max_depth=8,
    feature_fraction=0.5,
    bagging_fraction=0.8,
    bagging_freq=15,
    learning_rate=0.01)
params_opt = {'n_estimators':range(200, 600, 80), 'num_leaves':range(20,60,10)}
gridSearchCV = GridSearchCV(estimator=model,
                            param_grid=params_opt,
                            scoring=score_func)
gridSearchCV.fit(X_train,y_train)
gridSearchCV.grid_scores_, gridSearchCV.best_params_, gridSearchCV.best_score_
This gives me a bunch of errors, ending with:
"ValueError: Unknown label type: 'continuous'"
Update: I got the code to run by switching from LGBMClassifier to LGBMModel. Should I also try LGBMRegressor, or does it not matter? (Source: LGBMModel, https://lightgbm.readthedocs.io/en/latest/_modules/lightgbm/sklearn.html)