Principles + Code Explained | Dense Reconstruction: The SGM/tSGM Algorithms
Contents
- Stereo matching algorithms
- SGM algorithm
- tSGM algorithm
- SGM/tSGM code implementation
Stereo Matching Algorithms
Global stereo matching
- Global algorithms estimate disparity via global optimization: a global energy function is built over the whole image, and the optimal disparity is obtained by minimizing it.
- Constraints between the disparities of neighboring pixels in 2D (e.g. smoothness) yield better matches, but memory consumption is high and the methods are too slow for real-time use. The main algorithms are graph cuts, belief propagation, and dynamic programming.
Local stereo matching
- Local algorithms (e.g. SAD, SSD) estimate disparity via local optimization. Like global methods, they minimize an energy function, but the energy contains only a data term and no smoothness term.
- Since each pixel is computed independently, these algorithms parallelize well and can run in real time; however, the underlying assumption that disparity is constant within the local window often does not hold, so match quality is poorer.
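The local data-term-only idea can be sketched with a minimal SAD (Sum of Absolute Differences) matcher. All names here are hypothetical illustrations, not part of the library code below; the images are flat grayscale arrays and the winner-takes-all search runs per pixel with no smoothness term:

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstdint>
#include <cstdlib>
#include <vector>

// SAD cost over a (2r+1)x(2r+1) window: compare the left window at (x,y)
// against the right window shifted left by disparity d. Data term only.
int SAD(const std::vector<uint8_t>& left, const std::vector<uint8_t>& right,
        int width, int x, int y, int d, int r) {
    int cost = 0;
    for (int dy = -r; dy <= r; ++dy)
        for (int dx = -r; dx <= r; ++dx)
            cost += std::abs(int(left[(y+dy)*width + (x+dx)]) -
                             int(right[(y+dy)*width + (x+dx) - d]));
    return cost;
}

// Winner-takes-all: every pixel independently picks the disparity with the
// smallest window cost, which is why local methods parallelize so well.
int BestDisparity(const std::vector<uint8_t>& left, const std::vector<uint8_t>& right,
                  int width, int x, int y, int maxDisp, int r) {
    int best = 0, bestCost = INT_MAX;
    for (int d = 0; d <= maxDisp && d <= x - r; ++d) {
        const int c = SAD(left, right, width, x, y, d, r);
        if (c < bestCost) { bestCost = c; best = d; }
    }
    return best;
}
```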
Semi-global matching (SGM)
- Combining the strengths and weaknesses of the local and global approaches above, the semi-global algorithm keeps the global framework but, when minimizing the energy function, replaces the 2D minimization with efficient 1D path aggregation, approximating the 2D optimum by 1D optima along several paths. The resulting disparity maps differ little from those of global methods, while the runtime improves enormously.
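The 1D path aggregation mentioned above can be sketched as follows — a minimal single-path (left-to-right along one scanline) version with hypothetical names; the full algorithm sums 8 or 16 such paths:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Aggregate matching costs along one 1D path. For pixel p and disparity d:
//   L(p,d) = C(p,d) + min( L(p-1,d),
//                          L(p-1,d-1)+P1, L(p-1,d+1)+P1,
//                          min_k L(p-1,k)+P2 ) - min_k L(p-1,k)
// P1 penalizes disparity changes of 1, P2 penalizes larger jumps;
// subtracting min_k L(p-1,k) keeps the values from growing without bound.
// costs[p][d] holds the per-pixel matching cost C(p,d).
std::vector<std::vector<int>> AggregatePath(
    const std::vector<std::vector<int>>& costs, int P1, int P2) {
    const size_t numPixels = costs.size();
    const size_t numDisp = costs[0].size();
    std::vector<std::vector<int>> L(costs);  // L(0,d) = C(0,d)
    for (size_t p = 1; p < numPixels; ++p) {
        const int prevMin = *std::min_element(L[p-1].begin(), L[p-1].end());
        for (size_t d = 0; d < numDisp; ++d) {
            int best = L[p-1][d];                        // same disparity: no penalty
            if (d > 0)           best = std::min(best, L[p-1][d-1] + P1);
            if (d + 1 < numDisp) best = std::min(best, L[p-1][d+1] + P1);
            best = std::min(best, prevMin + P2);         // larger jump: penalty P2
            L[p][d] = costs[p][d] + best - prevMin;
        }
    }
    return L;
}
```

Each path is a simple dynamic program over one dimension, which is why this approximation is so much cheaper than a true 2D minimization.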
SGM Algorithm
References:
- [1] Stereo Processing by Semi-global Matching and Mutual Information
- [2] SURE: Photogrammetric Surface Reconstruction from Imagery
SGM in Detail
Matching cost computation
The cost computation in reference [1] is based on Mutual Information (MI), a similarity measure that is insensitive to brightness changes between images. However, MI is complex and must be computed iteratively, making it inefficient; in practice, simpler and effective measures such as the Census transform are used instead, so MI is not covered further here.
Census transform:
- Convert each pixel's intensity into a bit string using the local intensity differences within its neighborhood; this bit string is the pixel's census value.
- The Census-based matching cost is the Hamming distance between the census values of the two corresponding pixels in the left and right images (the number of bit positions where the two strings differ: XOR the strings, then count the 1s in the result).
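The two steps above can be sketched directly — a minimal 3x3 Census transform plus Hamming distance, with hypothetical names (the production code below uses the library's own Census implementation):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Census transform of the pixel at (x,y): compare each neighbor in a 3x3
// window against the center pixel and pack the results into a bit string.
// A bit is 1 when the neighbor is darker than the center.
uint32_t CensusTransform(const std::vector<uint8_t>& img, int width, int x, int y) {
    const uint8_t center = img[y*width + x];
    uint32_t census = 0;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) continue;  // skip the center itself
            census = (census << 1) | (img[(y+dy)*width + (x+dx)] < center ? 1u : 0u);
        }
    return census;
}

// Matching cost = Hamming distance between two census bit strings:
// XOR the strings, then count the set bits in the result.
int HammingDistance(uint32_t a, uint32_t b) {
    uint32_t v = a ^ b;
    int count = 0;
    while (v) { v &= v - 1; ++count; }  // clear the lowest set bit each step
    return count;
}
```

Because the census value encodes only local intensity ordering, this cost is robust to radiometric differences between the two views, which is exactly why it replaces MI in practice.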
Cost aggregation
Aggregation adopts the global stereo-matching formulation: find, for each pixel, the disparity that minimizes the overall energy.
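The energy function, as defined in reference [1], can be written as follows ($P_1$ and $P_2$ are the smoothness penalties for disparity jumps of exactly 1 and of more than 1, respectively):

```latex
E(D) = \sum_{p} \Big( C(p, D_p)
     + \sum_{q \in N_p} P_1 \, T\big[\,|D_p - D_q| = 1\,\big]
     + \sum_{q \in N_p} P_2 \, T\big[\,|D_p - D_q| > 1\,\big] \Big)
```

Here $C(p, D_p)$ is the matching cost of pixel $p$ at disparity $D_p$, $N_p$ are its neighbors, and $T[\cdot]$ equals 1 when its condition holds and 0 otherwise. SGM approximates the minimization of $E(D)$ by aggregating costs along 1D paths $r$ and summing them, $S(p,d) = \sum_r L_r(p,d)$, then taking the disparity with minimal $S(p,d)$ per pixel.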
SGM/tSGM Code Implementation
// References: SGM:  Stereo Processing by Semiglobal Matching and Mutual Information
//             tSGM: SURE: Photogrammetric surface reconstruction from imagery
// This implements the two variants, classic SGM and tSGM. The main difference is the disparity
// search range used during cost aggregation, hence the aggregated costs differ: classic SGM uses
// one global disparity range shared by all pixels, whereas tSGM assigns each pixel its own range,
// which greatly reduces the cost-aggregation workload.
void SemiGlobalMatcher::Match(const Scene& scene, IIndex idxImage, IIndex numNeighbors, unsigned minResolution)
const Image& leftImage = scene.images[idxImage];
const float fMinScore(MAXF(leftImage.neighbors.front().score* (OPTDENSE::fViewMinScoreRatio*0.1f), OPTDENSE::fViewMinScore));
FOREACH(idxNeighbor, leftImage.neighbors) {
const ViewScore& neighbor = leftImage.neighbors[idxNeighbor];
// exclude neighbors that are over the limit or have too small a score
ASSERT(scene.images[neighbor.idx.ID].IsValid());
if ((numNeighbors && idxNeighbor >= numNeighbors) ||
(neighbor.score < fMinScore))
break;
// check if the disparity-map was already estimated for the same image pairs
// skip the pair if its disparity map has already been computed
const Image& rightImage = scene.images[neighbor.idx.ID];
const String pairName(MAKE_PATH(String::FormatString("%04u_%04u", leftImage.ID, rightImage.ID)));
if (File::isPresent((pairName+".dimap").c_str()) || File::isPresent(MAKE_PATH(String::FormatString("%04u_%04u.dimap", rightImage.ID, leftImage.ID))))
continue;
TD_TIMER_STARTD();
IndexArr points;
Matrix3x3 H; Matrix4x4 Q;
ViewData leftData, rightData;
MaskMap leftMaskMap, rightMaskMap; {
// fetch pairs of corresponding image points
//TODO: use precomputed points from SelectViews()
//!!! find the 3D points visible in both images; these could be precomputed to avoid repeated work
Point3fArr leftPoints, rightPoints;
FOREACH(idxPoint, scene.pointcloud.points) {
const PointCloud::ViewArr& views = scene.pointcloud.pointViews[idxPoint];
if (views.FindFirst(idxImage) != PointCloud::ViewArr::NO_INDEX) {
points.push_back((uint32_t)idxPoint);
if (views.FindFirst(neighbor.idx.ID) != PointCloud::ViewArr::NO_INDEX) {
const Point3 X(scene.pointcloud.points[idxPoint]);
leftPoints.emplace_back(leftImage.camera.TransformPointW2I3(X));
rightPoints.emplace_back(rightImage.camera.TransformPointW2I3(X));
// stereo-rectify image pair
// Step 3_2_2_1 epipolar rectification: H maps pixel coordinates from the original image to the rectified image; Q maps [x' y' disparity 1] on the rectified image back to original-image coordinates [x*z y*z z 1]*w
if (!Image::StereoRectifyImages(leftImage, rightImage, leftPoints, rightPoints, leftData.imageColor, rightData.imageColor, leftMaskMap, rightMaskMap, H, Q))
continue;
ASSERT(leftData.imageColor.size() == rightData.imageColor.size());
#ifdef _USE_FILTER_DEMO
// run openCV implementation
const LPCSTR argv[] = {"disparityFiltering", "-algorithm=sgbm", "-max_disparity=160", "-no-downscale"};
disparityFiltering(leftData.imageColor, rightData.imageColor, (int)SizeOfArray(argv), argv);
#endif
// color to gray conversion
#if SGM_SIMILARITY == SGM_SIMILARITY_CENSUS
leftData.imageColor.toGray(leftData.imageGray, cv::COLOR_BGR2GRAY, false, true);
rightData.imageColor.toGray(rightData.imageGray, cv::COLOR_BGR2GRAY, false, true);
#else
leftData.imageColor.toGray(leftData.imageGray, cv::COLOR_BGR2GRAY, true, true);
rightData.imageColor.toGray(rightData.imageGray, cv::COLOR_BGR2GRAY, true, true);
#endif
// compute scale used for the disparity estimation
REAL scale(1);
if (minResolution) {
unsigned resolutionLevel(8);
Image8U::computeMaxResolution(leftData.imageGray.width(), leftData.imageGray.height(), resolutionLevel, minResolution);
scale = REAL(1)/MAXF(2,POWI(2,resolutionLevel));
// if scale != 1, use the tSGM (pyramid) algorithm described in the paper above
const bool tSGM(!ISEQUAL(scale, REAL(1)));
DisparityMap leftDisparityMap, rightDisparityMap; AccumCostMap costMap;
do {
#if 0
// export the intermediate disparity-maps
if (!leftDisparityMap.empty()) {
const REAL _scale(scale*0.5);
Matrix3x3 _H(H); Matrix4x4 _Q(Q);
Image::ScaleStereoRectification(_H, _Q, _scale);
ExportDisparityDataRawFull(String::FormatString("%s_%d.dimap", pairName.c_str(), log2i(ROUND2INT(REAL(1)/_scale))), leftDisparityMap, costMap, Image8U::computeResize(leftImage.GetSize(), _scale), H, _Q, 1);
#endif
// initialize
// Step 3_2_2_2 disparity-map initialization
const ViewData leftDataLevel(leftData.GetImage(scale));
const ViewData rightDataLevel(rightData.GetImage(scale));
const cv::Size size(leftDataLevel.imageGray.size());
const cv::Size sizeValid(size.width-2*halfWindowSizeX, size.height-2*halfWindowSizeY);
const bool bFirstLevel(leftDisparityMap.empty()); // flag: first pyramid level
if (bFirstLevel) {
// initialize the disparity-map with a rough estimate based on the sparse point-cloud
//TODO: remove DepthData::ViewData dependency
// initialize the depth from the sparse point cloud, same as the PatchMatch initialization
Image leftImageLevel(leftImage.GetImage(scene.platforms, scale*0.5, false));
DepthData::ViewData image;
image.pImageData = &leftImageLevel; // used only for avgDepth
image.image.create(leftImageLevel.GetSize());
image.camera = leftImageLevel.camera;
DepthMap depthMap;
Depth dMin, dMax;
TriangulatePoints2DepthMap(image, scene.pointcloud, points, depthMap, dMin, dMax);
points.Release();
Matrix3x3 H2(H); Matrix4x4 Q2(Q);
Image::ScaleStereoRectification(H2, Q2, scale*0.5);
const cv::Size sizeHalf(Image8U::computeResize(size, 0.5));
// the valid disparity area excludes border pixels within half the matching window
const cv::Size sizeValidHalf(sizeHalf.width-2*halfWindowSizeX, sizeHalf.height-2*halfWindowSizeY);
leftDisparityMap.create(sizeValidHalf);
// convert depth to disparity; for H and Q see the epipolar rectification in StereoRectifyImages above
Depth2DisparityMap(depthMap, H2.inv(), Q2.inv(), 1, leftDisparityMap);
// resize masks
cv::resize(leftMaskMap, leftMaskMap, size, 0, 0, cv::INTER_NEAREST);
cv::resize(rightMaskMap, rightMaskMap, size, 0, 0, cv::INTER_NEAREST);
const cv::Rect ROI(halfWindowSizeX,halfWindowSizeY, sizeValid.width,sizeValid.height);
leftMaskMap(ROI).copyTo(leftMaskMap);
rightMaskMap(ROI).copyTo(rightMaskMap);
} else {
// upscale masks
UpscaleMask(leftMaskMap, sizeValid);
UpscaleMask(rightMaskMap, sizeValid);
// Step 3_2_2_3 estimate the right-left disparity-map
Index numCosts;
if (tSGM) {
// upscale the disparity-map from the previous level
// flip the left-right disparity map into a right-left one
FlipDirection(leftDisparityMap, rightDisparityMap);
// compute the per-pixel disparity search range (see Sec. 2.2.2 of SURE: Photogrammetric Surface Reconstruction from Imagery)
numCosts = Disparity2RangeMap(rightDisparityMap, rightMaskMap, bFirstLevel?11:5, bFirstLevel?33:7);
} else {
// extract global min and max disparities
// standard SGM: every pixel shares the same disparity range, so a single global min/max suffices
Range range{std::numeric_limits<Disparity>::max(), std::numeric_limits<Disparity>::min()};
ASSERT(leftDisparityMap.isContinuous());
const Disparity* pd = leftDisparityMap.ptr<const Disparity>();
const Disparity* const pde = pd+leftDisparityMap.area();
do {
const Disparity d(*pd);
if (range.minDisp > d)
range.minDisp = d;
if (range.maxDisp < d)
range.maxDisp = d;
} while (++pd < pde);
// set disparity search range to the global min/max range
//? reason for the numDisp+16 padding?
const Disparity numDisp(range.numDisp()+16);
const Disparity disp(range.minDisp+range.maxDisp);
range.minDisp = disp-numDisp;//2*minDisp-16
range.maxDisp = disp+numDisp;//2*maxDisp+16
maxNumDisp = range.numDisp();
numCosts = 0;//(size.width-2*halfWindowSizeX, size.height-2*halfWindowSizeY)*(range.maxDisp-range.minDisp)
imagePixels.resize(sizeValid.area());
for (PixelData& pixel: imagePixels) {
pixel.range = range;
pixel.idx = numCosts;
numCosts += maxNumDisp;
imageCosts.resize(numCosts);
imageAccumCosts.resize(numCosts);
// build the right-left disparity map
Match(rightDataLevel, leftDataLevel, rightDisparityMap, costMap);
// estimate left-right disparity-map
if (tSGM) {
numCosts = Disparity2RangeMap(leftDisparityMap, leftMaskMap, bFirstLevel?11:5, bFirstLevel?33:7);
imageCosts.resize(numCosts);
imageAccumCosts.resize(numCosts);
} else {
// left/right images are swapped, so the disparity range is negated
for (PixelData& pixel: imagePixels) {
const Disparity maxDisp(-pixel.range.minDisp);
pixel.range.minDisp = -pixel.range.maxDisp;
pixel.range.maxDisp = maxDisp;
// build the left-right disparity map
Match(leftDataLevel, rightDataLevel, leftDisparityMap, costMap);
// Step 3_2_2_4 check disparity-map cross-consistency
#if 0
if (ISEQUAL(scale, REAL(1))) {
cv::Ptr<cv::ximgproc::DisparityWLSFilter> filter = cv::ximgproc::createDisparityWLSFilterGeneric(true);
const cv::Rect rcValid(halfWindowSizeX,halfWindowSizeY, sizeValid.width,sizeValid.height);
Image32F leftDisparityMap32F, rightDisparityMap32F, filtered32F;
leftDisparityMap.convertTo(leftDisparityMap32F, CV_32F, 16);
rightDisparityMap.convertTo(rightDisparityMap32F, CV_32F, 16);
filter->filter(leftDisparityMap32F, leftData.imageColor(rcValid), filtered32F, rightDisparityMap32F);
filtered32F.convertTo(leftDisparityMap, CV_16S, 1.0/16, 0.5);
} else
#endif
if (bFirstLevel) {
// perform a rigorous filtering of the estimated disparity maps in order to
// estimate the common region or interest and set the validity masks
// outlier rejection uses the left-right consistency check: swap the two images (the left becomes
// the right and vice versa) and match again to obtain a second disparity map; since each disparity
// value encodes a correspondence between two pixels, the uniqueness constraint requires that, for
// every pixel in the left disparity map, the disparity found at its corresponding pixel in the
// right image differs by at most a threshold (typically 1 pixel); pixels that satisfy the
// constraint are kept, the rest are rejected
ConsistencyCrossCheck(leftDisparityMap, rightDisparityMap);
ConsistencyCrossCheck(rightDisparityMap, leftDisparityMap);
// post-processing: remove speckles from the disparity maps
cv::filterSpeckles(leftDisparityMap, NO_DISP, OPTDENSE::nSpeckleSize, 5);
cv::filterSpeckles(rightDisparityMap, NO_DISP, OPTDENSE::nSpeckleSize, 5);
ExtractMask(leftDisparityMap, leftMaskMap);
ExtractMask(rightDisparityMap, rightMaskMap);
} else {
// simply run a left-right consistency check
ConsistencyCrossCheck(leftDisparityMap, rightDisparityMap);
} while ((scale*=2) < REAL(1)+ZEROTOLERANCE<REAL>());
#if 0
// remove speckles
if (OPTDENSE::nSpeckleSize > 0)
cv::filterSpeckles(leftDisparityMap, NO_DISP, OPTDENSE::nSpeckleSize, 5);
#endif
// sub-pixel disparity-map estimation
// Step 3_2_2_5 sub-pixel extraction: refine disparities to sub-pixel precision via quadratic interpolation
RefineDisparityMap(leftDisparityMap);
#if 1
// export disparity-map for the left image
DEBUG_EXTRA("Disparity-map for images %3u and %3u: %dx%d (%s)", leftImage.ID, rightImage.ID,
leftImage.width, leftImage.height, TD_TIMER_GET_FMT().c_str());
ExportPointCloud(pairName+".ply", leftImage, leftDisparityMap, Q, subpixelSteps);
ExportDisparityMap(pairName+".png", leftDisparityMap);
ExportDisparityDataRawFull(pairName+".dimap", leftDisparityMap, costMap, leftImage.GetSize(), H, Q, subpixelSteps);
#else
// convert disparity-map to final depth-map for the left image
DepthMap depthMap(leftImage.image.size()); ConfidenceMap confMap;
Disparity2DepthMap(leftDisparityMap, costMap, H, Q, subpixelSteps, depthMap, confMap);
DEBUG_EXTRA("Depth-map for images %3u and %3u: %dx%d (%s)", leftImage.ID, rightImage.ID,
depthMap.width(), depthMap.height(), TD_TIMER_GET_FMT().c_str());
ExportDepthMap(pairName+".png", depthMap);