Bayesian networks: structure learning in Python is very slow compared to R


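I am currently working on a problem where I classify images using Bayesian networks. I have tried pomegranate, pgmpy, and bnlearn. My dataset contains more than 200,000 images; I ran some feature-extraction algorithms on them and obtained feature vectors of size 1026.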

pgmpy

from pgmpy.models import BayesianModel
from pgmpy.estimators import HillClimbSearch, BicScore, K2Score
est = HillClimbSearch(feature_df, scoring_method=BicScore(feature_df[:20]))
best_model = est.estimate()
edges = best_model.edges()
model = BayesianModel(edges)

pomegranate

from pomegranate import *
model = BayesianNetwork.from_samples(feature_df[:20], algorithm='exact')

bnlearn

library(bnlearn)
df <- read.csv('conv_encoded_images.csv')
df$Age = as.numeric(df$Age)
res <- hc(df)
model <- bn.fit(res,data = df)

The bnlearn program in R finishes in a couple of minutes, while the pgmpy version runs for hours and pomegranate freezes my system after a few minutes. As you can see from my code, I am passing only the first 20 rows for training in both the pgmpy and pomegranate programs, while bnlearn takes the whole data frame. Since I do all my image preprocessing and feature extraction in Python, it is difficult for me to switch between R and Python for training.

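My data consists of continuous values between 0 and 1. I have also tried discretizing the data to 0 and 1, but that did not solve the problem.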
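For reference, here is a minimal sketch of the kind of binarization described above. The 0.5 threshold and the small stand-in DataFrame are illustrative assumptions, not my actual data or exact code:

```python
import pandas as pd

# Hypothetical stand-in for feature_df: a few continuous features in [0, 1].
feature_df = pd.DataFrame({
    "f0": [0.10, 0.80, 0.55, 0.30],
    "f1": [0.95, 0.20, 0.49, 0.70],
})

# Assumed threshold: values >= 0.5 become 1, everything else becomes 0.
THRESHOLD = 0.5
binary_df = (feature_df >= THRESHOLD).astype(int)

print(binary_df)
```

The result has the same shape as the input, with every column holding only 0 or 1, so it can be passed to the same estimators in place of the continuous frame.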

Is there any way to speed up training with these Python packages, or am I doing something wrong in my code?

Thanks in advance for any help.