Multiview plus depth (MVD) videos are widely used in free-viewpoint TV systems. The best-known technique for determining depth information is based on stereo vision. In this paper, we propose a novel local stereo matching algorithm that is invariant to radiometric changes. The key idea is to use a combined matching cost built from intensity-based and gradient-based similarity measures. In addition, we realize an adaptive cost aggregation scheme by constructing an adaptive support window for each pixel, which addresses the boundary and low-texture problems. In the disparity refinement process, we propose a four-step post-processing technique to handle outliers and occlusions. Moreover, we conduct stereo reconstruction tests to verify the performance of the algorithm more intuitively. Experimental results show that the proposed method is effective and robust against local radiometric distortion. It has an average error of 5.93% on the Middlebury benchmark and is comparable to state-of-the-art local methods.
1. Introduction
In recent years, three-dimensional TV (3DTV) and free-viewpoint TV (FTV) have become promising technologies for the next generation of home and entertainment services. The key point in 3DTV and FTV is calculating depth information of the scenes or objects. Binocular stereo vision is a popular technique for building a three-dimensional description of a scene observed from two slightly different viewpoints. By finding corresponding pixels in the reference and target images, depth information can be obtained from disparity. This process is called stereo matching. Stereo matching is a classical and challenging problem in computer vision and has long been an active research topic. In the last decade, researchers have put forward a large number of algorithms to solve this problem, but because the problem is ill-posed, there is no perfect solution yet. Most stereo matching algorithms establish an energy function and minimize it to estimate disparities, so stereo matching is essentially an optimization problem. The solution is obtained by establishing a reasonable energy function, adding constraints and adopting an optimization algorithm, which is also the general approach to ill-posed problems. A thorough survey and taxonomy of dense stereo techniques was provided by Scharstein and Szeliski [1]. They summarized the stereo matching process in four steps: matching cost computation, cost aggregation, disparity computation and disparity refinement. They also divided stereo matching algorithms into local and global methods according to the way costs are aggregated. Global methods generally achieve higher accuracy, but are less efficient. Local methods, in contrast, are fast and easy to implement, but it is difficult to choose a proper matching cost function [2] and construct suitable support windows.
Matching cost is the similarity measure of corresponding points between the left and right images. Most stereo matching algorithms use intensity-based similarity measures. For instance, the sum of absolute differences (SAD), the sum of squared differences (SSD) [1], AdaptWeight [3] and Segment Support [4] all fall into this category. For ideal images, they produce results with high precision, but they are very sensitive to radiometric distortion. When the illumination conditions or exposure time change, their accuracy drops quickly, which makes them hard to apply to real images. Fortunately, some matching costs are robust to radiometric distortion. The normalized cross-correlation (NCC), gradient-based costs [5] [6] [7], and the Rank and Census transforms [8] [9] are the most commonly used ones.
Local stereo methods need to aggregate the matching costs of single pixels in a support region defined by a window. Inevitably, they run into problems when deciding which window size to use. Small windows do not contain enough information and can lead to noisy results, while large windows contain enough texture information but encompass pixels at different depths near depth discontinuities, resulting in the foreground fattening effect. Fusiello and Roberto [10] proposed selecting the best window among multiple predefined windows as the support window. Veksler [11] presented a variable window method that explores a useful range of window shapes and sizes. Zhang [12] constructed a cross-based adaptive window for every pixel according to the color correlation of adjacent pixels and achieved good results. Qu [13] developed a binary support window by calculating the mean intensity in a predefined fixed window, but this binary support window may have a disconnected structure, which degrades accuracy.
Global stereo methods consider stereo matching as a labeling problem where the pixels of the reference image are nodes and the estimated disparities are labels. They typically skip the cost aggregation step and define a global energy function that includes a data term and a smoothness term. The former sums pixel-wise matching costs, while the latter favors piecewise smooth disparity selection. The labeling problem is solved by energy function minimization, using dynamic programming, graph cuts, or belief propagation. Some of the newest global stereo matching algorithms can be found in [14] [15] [16] [17].
To address the above matching cost computation and window size selection problems, this paper proposes a stereo matching algorithm based on an improved gradient cost and adaptive cost aggregation. Our main contributions are twofold. First, we improve the gradient matching cost by incorporating phase information and propose a hybrid cost function that combines gradient and color matching costs. Second, we develop a four-step disparity refinement method to eliminate mismatches.
The remainder of this paper is organized as follows. Section 2 describes the proposed method in detail. Section 3 presents the experimental results, and Section 4 concludes the paper.
2. Proposed Method
According to Scharstein and Szeliski's taxonomy, the stereo matching process can be divided into four steps: matching cost computation, cost aggregation, disparity computation and disparity refinement. We follow this classification to describe our algorithm in detail. The outline of the proposed algorithm is shown in Fig. 1. Given two rectified images, we first calculate the corresponding gradient images, which are the prerequisite for computing the matching cost. Then an adaptive window is constructed for every pixel for use in cost aggregation. After this, the initial disparity maps are obtained with the winner-takes-all strategy. Finally, the depth images are produced by disparity refinement.
Outline of the proposed algorithm.
 2.1 Matching cost computation
Matching cost is the similarity measure of corresponding points between the left and right images. Different cost functions give different discriminative power over disparities. As discussed above, gray or color intensity-based matching costs are very sensitive to radiometric distortion and noise, while gradient-based matching costs are more robust to these factors and have been widely used.
The gradient of an image points in the direction along which the gray value of the image changes most rapidly. In other words, the change of image intensity can be described by the image gradient. Mathematically, the image gradient is defined as the vector of first-order partial derivatives of the image intensity with respect to x and y:

$$G = (G_{x}, G_{y})^{T} = \left( \frac{\partial I}{\partial x}, \frac{\partial I}{\partial y} \right)^{T}$$

where I(x, y) is the image intensity of an anchor pixel (x, y). In practical applications, G can be calculated by convolving the image with gradient masks. Here we just use the simplest gradient mask:
Thus, we can get the gradient images of both the left and right images: G_{L} = (G_{Lx}, G_{Ly})^{T} and G_{R} = (G_{Rx}, G_{Ry})^{T}. For rectified images, if p = (x, y) is a pixel in the left image, then p_{d} = (x - d, y) is the corresponding pixel in the right image with disparity d. Hence, the gradient matching cost function C_{G} can be defined as:
The above cost function only considers the modulus of the gradient vector. Here, we develop an improved cost function which also incorporates the gradient phase, similar to [6]. Using the two components G_{x} and G_{y} of the gradient vector, the modulus m and the phase φ are computed as:

$$m = \sqrt{G_{x}^{2} + G_{y}^{2}}, \qquad \varphi = \arctan\left( \frac{G_{y}}{G_{x}} \right)$$
Generally, the modulus m represents the rate of change and the phase φ represents its direction. To show them intuitively, Fig. 2 gives an example of the computed m and φ for the Tsukuba image. The gradient values reflect the image edges or skeleton to some extent, and the differences between m and φ are clearly visible.
(a) Modulus of gradient; (b) Phase of gradient.
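As a minimal sketch of how modulus and phase maps like those in Fig. 2 can be computed, the following Python code works on a grayscale image stored as a list of rows. It is illustrative only: the central-difference mask [-1, 0, 1] with replicated borders is an assumption (the mask used in the paper is not reproduced here), and the function name `gradient_mod_phase` is hypothetical.

```python
import math


def gradient_mod_phase(img):
    """Return (modulus, phase) maps of a 2-D grayscale image (list of rows).

    Assumption: central-difference mask [-1, 0, 1], borders replicated.
    """
    h, w = len(img), len(img[0])
    mod = [[0.0] * w for _ in range(h)]
    pha = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mod[y][x] = math.hypot(gx, gy)
            if gx != 0:
                # atan of the ratio keeps the phase within one pi-wide period
                pha[y][x] = math.atan(gy / gx)
            else:
                # vertical gradient (or no gradient at all)
                pha[y][x] = math.pi / 2 if gy != 0 else 0.0
    return mod, pha
```

Applying the function to an image, to the same image plus a constant offset, and to the same image multiplied by a gain gives a quick numerical check of how each quantity reacts to offset and gain changes.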
As m and φ provide different information about the neighborhood of a pixel, they have different invariance properties with respect to radiometric distortion. For instance, neither the modulus nor the phase is affected by additive (offset) changes in the input images, while multiplicative (gain) variations affect the modulus but not the phase. It is therefore more appropriate to consider them separately, and our method is based on this idea. To make full use of the gradient information, we combine the modulus and phase linearly with a weight parameter α, forming our new cost function:
where m^{c} and φ^{c} are the modulus and phase of the gradient operator applied to each color band c ∈ {R, G, B} respectively, and α ∈ [0, 1] is the weight of the modulus. Considering the π-periodicity of the phase, we employ f to normalize it into a single period:
Because of the weight parameter α, the behavior of the algorithm is easy to adjust. This is important because different lighting and exposure times lead to different degrees of radiometric distortion and noise. From (6), we can see that the larger α is, the greater the influence of the modulus; conversely, the phase dominates if α is small. The proper value of α can be set empirically according to the degree of radiometric distortion.
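The role of α can be illustrated with a short Python sketch. The exact form of the cost function and of f is not reproduced here, so the code makes two assumptions: the cost is a per-band sum of α times the absolute modulus difference plus (1 - α) times the wrapped phase difference, and the wrapping maps a phase difference onto [0, π/2] by treating the phase as π-periodic. The names `wrap_phase` and `gradient_cost` are hypothetical.

```python
import math


def wrap_phase(delta):
    """Map a phase difference onto [0, pi/2], treating the phase as pi-periodic."""
    delta = abs(delta) % math.pi
    return min(delta, math.pi - delta)


def gradient_cost(m_left, phi_left, m_right, phi_right, alpha):
    """Assumed combined cost for one pixel pair.

    Each argument except alpha is a sequence with one value per color band.
    alpha in [0, 1] weights the modulus term; (1 - alpha) weights the phase term.
    """
    cost = 0.0
    for ml, pl, mr, pr in zip(m_left, phi_left, m_right, phi_right):
        cost += alpha * abs(ml - mr) + (1.0 - alpha) * wrap_phase(pl - pr)
    return cost
```

With α = 1 only the modulus differences contribute, and with α = 0 only the phase differences do, which mirrors the trade-off described above.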
As the color intensities of an image directly reflect the brightness of pixels, using the gradient similarity alone may lose many details of the scene. Thus, we propose a combination of the color-based SAD cost and the improved gradient cost, which is simple but very effective because the two costs compensate for each other and yield a more reliable similarity measure. The color-based SAD matching cost can be represented as:
Then we use a robust function to normalize the costs into [0, 1]:
where λ is a controlling parameter. The final integrated matching cost of pixel p corresponding to disparity d is defined as:
In this way, both G(p, d) and C(p, d) are in the range [0, 1] and their contributions to the final cost can be adjusted by setting different values of λ_{c} and λ_{G}. The proper values of these parameters can be determined empirically.
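The normalization and combination step can be sketched as follows. The robust function itself is not reproduced here, so the code assumes the commonly used exponential form 1 - exp(-cost / λ), and it assumes that the two normalized costs are simply added. Both choices, as well as the names `robust` and `combined_cost`, are illustrative assumptions rather than the paper's definitions.

```python
import math


def robust(cost, lam):
    """Map a non-negative cost into [0, 1]; assumed form 1 - exp(-cost / lam)."""
    return 1.0 - math.exp(-cost / lam)


def combined_cost(sad_cost, grad_cost, lam_c, lam_g):
    """Assumed integrated cost: sum of the normalized color and gradient costs."""
    return robust(sad_cost, lam_c) + robust(grad_cost, lam_g)
```

A small λ makes the corresponding term saturate quickly, so that large outlier costs have limited influence, while a large λ keeps the term close to linear over a wider range.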
 2.2 Adaptive window construction
As the discriminative power of a single pixel's matching cost is weak, we need to aggregate the matching costs of adjacent pixels to improve accuracy. The neighborhood is determined by a local support window, and the pixels in the window are included in the aggregation. It is therefore natural to ask how large the window should be. In fact, a fixed window cannot give satisfactory results everywhere, because image regions with different characteristics need different windows. In textureless regions, larger windows are needed to provide enough pixels. In contrast, regions with high texture and depth discontinuities need smaller windows to avoid over-smoothing. To address this problem, Zhang proposed a cross-based adaptive window construction method which alters the window's shape and size adaptively. Such a cross-based support region is obtained by expanding a cross-shaped skeleton around each pixel p to create four segments, defining two sets of pixels H(p) and V(p) in the horizontal and vertical directions. More details about the method can be found in [12]. In the original implementation, only one threshold for color similarity and one threshold for spatial closeness are used, which cannot satisfy all cases. Motivated by [18], we present a modification of the original cross-based support region approach in this paper.
The key idea of the cross-based support region is to determine an upright cross for every pixel p in the input image, based on color similarity and spatial closeness. As shown in Fig. 3, the pixel-wise adaptive cross consists of two orthogonal line segments intersecting at the anchor pixel p. We use H(p) and V(p) to represent the horizontal and vertical segments respectively. Thus, four arms (left, right, up and down) are constructed for each pixel. By changing the length of the arms adaptively, we can effectively capture an adaptive support region for each pixel. Here, we use enhanced rules to decide each pixel's arm length.
Construction process of the adaptive window.
Taking the left arm of p as an example, the arm is extended pixel by pixel and stops when it reaches an endpoint pixel p_{i} that violates one of the following three rules:
1. D_{c}(p_{i}, p) < τ_{1} and D_{c}(p_{i}, p_{i} + (1, 0)) < τ_{1};
2. D_{s}(p_{i}, p) < L_{1};
3. D_{c}(p_{i}, p) < τ_{2}, if L_{2} < D_{s}(p_{i}, p) < L_{1}.
where D_{s}(p_{i}, p) is the spatial distance between p_{i} and p, D_{c}(p_{i}, p) is the color difference between them, and τ_{1}, τ_{2} and L_{1}, L_{2} are the predefined color thresholds and spatial thresholds. Rule 1 restricts the color difference between p_{i} and p as well as between p_{i} and its predecessor p_{i} + (1, 0) on the same arm, which prevents the arm from running across edges in the image. Rules 2 and 3 provide multiple choices for the arm length. In textureless regions, we use the larger thresholds L_{1} and τ_{1} to guarantee enough pixels. But when the arm length exceeds the smaller value L_{2}, Rule 3 takes effect and applies the much more stringent threshold τ_{2}, so that the arm extends further only in regions with very similar colors.
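The three rules can be sketched in Python for the left arm of a single pixel. This is an illustrative sketch under stated assumptions: the color difference is taken to be the largest absolute per-channel difference, the spatial distance is the number of pixels stepped along the row, and the function names and any threshold values passed in are hypothetical.

```python
def color_diff(a, b):
    """Assumed color difference: largest absolute per-channel difference."""
    return max(abs(u - v) for u, v in zip(a, b))


def left_arm_length(row, x, tau1, tau2, L1, L2):
    """Length of the left arm of the pixel at column x of one image row.

    row is a list of (R, G, B) tuples. The arm grows leftwards one pixel at
    a time and stops at the first pixel that violates one of the rules.
    """
    p = row[x]
    length = 0
    for step in range(1, x + 1):
        p_i = row[x - step]
        pred = row[x - step + 1]  # p_i + (1, 0): previous pixel on the arm
        # Rule 1: similar to the anchor and to the predecessor
        if not (color_diff(p_i, p) < tau1 and color_diff(p_i, pred) < tau1):
            break
        # Rule 2: maximum arm length
        if not step < L1:
            break
        # Rule 3: beyond L2, require the stricter color threshold
        if L2 < step and not color_diff(p_i, p) < tau2:
            break
        length = step
    return length
```

The other three arms follow the same logic with the stepping direction changed, and the predecessor offset mirrored accordingly.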
After the above process, we obtain the end pixels of the four arms, and H(p) and V(p) can then be obtained by:
Finally, by repeating this construction for every pixel q along V(p), we obtain the local support window U(p):

$$U(p) = \bigcup_{q \in V(p)} H(q)$$
Fig. 4 shows an example of the adaptive local support windows, which approximate local image structures appropriately.
Example of the adaptive local support windows
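Given per-pixel arm lengths, the support window of a pixel can be assembled by walking its vertical segment and collecting the horizontal segment of every pixel on it. The Python sketch below assumes the arm lengths are already stored in four 2-D lists indexed as `[y][x]`; the function name `support_window` and this data layout are hypothetical.

```python
def support_window(x, y, left, right, up, down):
    """Return the set of (x, y) pixels in the support window of pixel (x, y).

    left, right, up and down are 2-D lists of per-pixel arm lengths.
    """
    region = set()
    # Walk the vertical segment V(p) of the anchor pixel ...
    for qy in range(y - up[y][x], y + down[y][x] + 1):
        # ... and add the horizontal segment H(q) of every pixel q on it.
        for qx in range(x - left[qy][x], x + right[qy][x] + 1):
            region.add((qx, qy))
    return region
```

Because each row of the window is a contiguous run of pixels, the shape adapts to local structure while remaining cheap to enumerate.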
 2.3 Cost aggregation
Traditional local algorithms only take the support region of the reference image into account. In contrast, we symmetrically consider the support regions of both the target and reference images. For two corresponding pixels p = (x, y) and p_{d} = (x - d, y) in the reference and target images, we obtain two local support regions U(p) and U′(p_{d}). We combine them to define the union support region:
Once the support region is prepared, the aggregated matching cost of p is computed as follows:
where N is the total number of pixels in the support region U_{d}(p), and e(q, d) is the raw matching cost of pixel q at disparity d. Finally, we employ the winner-takes-all (WTA) strategy to select the disparity with the lowest matching cost in the disparity range:
where d ∈ [0, d_{max}] represents the disparity range and d^{0}(p) is chosen as the initial disparity of p.
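Aggregation followed by winner-takes-all selection can be sketched in Python as below. For brevity the sketch uses the same support region for every disparity, whereas the method described above uses a disparity-dependent region; the cost volume layout (one dictionary from pixel to raw cost per disparity) and the function names are hypothetical.

```python
def aggregate(raw_cost, region):
    """Mean of the raw per-pixel costs e(q, d) over the support region."""
    return sum(raw_cost[q] for q in region) / len(region)


def initial_disparity(cost_volume, region):
    """Winner-takes-all: index of the disparity with the lowest aggregated cost.

    cost_volume[d] maps each pixel q to its raw matching cost at disparity d.
    """
    aggregated = [aggregate(raw, region) for raw in cost_volume]
    return min(range(len(aggregated)), key=aggregated.__getitem__)
```

Dividing by the number of pixels keeps costs comparable between pixels whose support regions differ in size.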
 2.4 Disparity refinement
The disparity maps obtained after the previous three steps still contain some mismatches and unreliable values, so post-processing is required for further refinement. Our post-processing consists of four steps:
First, to remove isolated outliers, we apply a 5×5 median filter to both d_{L} and d_{R}, which represent the left and right disparity maps respectively.
Second, we apply the widely used left-right consistency check. A pixel p is marked as valid if the constraint d_{L}(p) = d_{R}(p - (d_{L}(p), 0)) holds; otherwise, p is marked as invalid and needs further handling. The invalid disparities fall into two classes: occlusions and mismatches. We employ Hirschmüller's approach [19] to decide whether an invalid point is an occlusion or a mismatch.
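The left-right consistency check can be sketched in Python on integer disparity maps stored as lists of rows. Treating pixels whose match falls outside the right image as invalid is an assumption of this sketch, and the function name `lr_check` is hypothetical. The classification of invalid pixels into occlusions and mismatches is not covered here.

```python
def lr_check(d_left, d_right):
    """Return a mask (list of rows of bools) of consistent pixels.

    A pixel at (x, y) is valid when d_left[y][x] equals the right-map
    disparity at its matched position (x - d_left[y][x], y).
    """
    h, w = len(d_left), len(d_left[0])
    valid = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xr = x - d_left[y][x]
            # Matches that leave the image are treated as invalid (assumption).
            if 0 <= xr < w and d_right[y][xr] == d_left[y][x]:
                valid[y][x] = True
    return valid
```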
Third, we present a disparity refinement method based on the local disparity histogram to recover the invalid disparities. For a pixel p in the disparity image, we build a local disparity histogram φ_{p}(d) in the neighborhood of p by counting how many times each disparity occurs. The histogram therefore has d_{max} + 1 bins, one for each disparity. We do not need to search for a new neighborhood; instead, we reuse the local support region U(p) already constructed for pixel p, so this step adds little computational cost. Let H(i) be the count in the i-th bin, i = 0 to d_{max}. We take d* as the disparity with the maximum normalized histogram value:
Statistically, this disparity is the locally optimal one, and h(d*) represents its confidence level. The initial disparity d^{0}(p) of pixel p is replaced by the new value d* if h(d*) is greater than τ_{h}; otherwise, it is left unchanged:
where τ_{h} ∈ [0, 1] is a confidence threshold. This step is repeated until no disparity in the map changes.
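One pass of the histogram-based update for a single pixel can be sketched in Python. The sketch assumes that the histogram is normalized by the number of pixels in the support region and that every pixel in the region votes; the dictionary-based disparity map and the name `refine_pixel` are hypothetical.

```python
def refine_pixel(disp, p, region, d_max, tau_h):
    """Return the updated disparity of pixel p after one voting pass.

    disp maps pixels to integer disparities; region lists the pixels of the
    support region of p. The most frequent disparity replaces the current
    one only if its normalized count exceeds the confidence threshold tau_h.
    """
    hist = [0] * (d_max + 1)
    for q in region:
        hist[disp[q]] += 1
    d_star = max(range(d_max + 1), key=hist.__getitem__)
    confidence = hist[d_star] / len(region)  # assumed normalization
    return d_star if confidence > tau_h else disp[p]
```

Repeating this update over the whole map until no value changes gives the iterative behavior described above.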
Finally, as some invalid disparities may remain unchanged after step 3, there are still invalid points that need to be filled. We introduce an interpolation strategy which treats occlusion and mismatch points differently. Interpolation is performed by propagating valid disparities to neighboring invalid areas. For an invalid pixel p, we find the nearest valid pixels along 8 directions and store their disparities d_{pi}. The final disparity of p is given by:
If p is occluded, we select the second lowest value (seclow d_{pi}), which avoids a bias towards either the foreground or the background. If p is mismatched, we use the median (med d_{pi}), which preserves discontinuities when the mismatched area is located at an object boundary. Experiments show that this strategy gives better results.
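The selection rule for filling one invalid pixel can be sketched in Python once the nearest valid disparities along the 8 directions have been collected. Two details are assumptions of this sketch: "second lowest" is taken by rank rather than among distinct values, and for an even number of candidates the lower of the two middle values is used as the median so that the result is always one of the observed disparities. The function name `fill_invalid` is hypothetical.

```python
import statistics


def fill_invalid(neighbour_disps, occluded):
    """Disparity for an invalid pixel from its nearest valid neighbours.

    neighbour_disps holds the disparities found along the 8 directions.
    Returns None when no valid neighbour was found.
    """
    values = sorted(neighbour_disps)
    if not values:
        return None
    if occluded:
        # Second lowest value; fall back to the only value if there is one.
        return values[1] if len(values) > 1 else values[0]
    # Mismatch: median of the candidates (lower middle value for even counts).
    return statistics.median_low(values)
```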
3. Experimental Results and Discussions
 3.1 Accuracy of the proposed algorithm
This section presents experimental results obtained with a C++ implementation of the algorithm. To verify the performance of the proposed method, our experiments use the rectified stereo images from the Middlebury stereo benchmark [20]. It offers 4 stereo pairs: Tsukuba, Venus, Teddy and Cones, with sizes of 384×288, 434×383, 450×375 and 450×375 respectively. Their disparity ranges are also given: 0-15, 0-19, 0-59 and 0-59 pixels respectively. By comparing the results with the ground truth disparity images, we obtain quantitative errors and can make an objective evaluation. The parameters of the algorithm are set as in Table 1 and are kept constant unless otherwise stated.
Parameter settings for all experiments
Fig. 5 shows the experimental results of our method on all four stereo pairs of the Middlebury stereo database. The leftmost column contains the original left images of the four stereo pairs. The ground truth disparity images are shown in the second column, our estimated disparity images are displayed in the third column, and the fourth column gives the error maps computed against the ground truth. In the error maps, white regions denote correctly estimated disparities, which differ from the ground truth by no more than 1 pixel. If the estimated disparity differs from the ground truth by more than 1 pixel, it is marked as an error and displayed in black or gray, where black represents errors in non-occluded regions and gray represents errors in occluded regions.
Table 2 lists the objective evaluation of our method and other methods with the error threshold δ_{d} = 1 pixel, which means that bad pixels are those whose absolute disparity errors are above 1 pixel. The columns Nonocc, All and Disc give the percentage of bad pixels in non-occluded regions, over all pixels and in regions near depth discontinuities, respectively.
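The bad-pixel percentage can be sketched in Python as follows. The optional mask stands in for the evaluation regions (for example non-occluded pixels or pixels near discontinuities); the function name `bad_pixel_rate` and the list-of-rows layout are hypothetical.

```python
def bad_pixel_rate(est, gt, mask=None, delta=1.0):
    """Percentage of evaluated pixels whose absolute disparity error exceeds delta.

    est and gt are disparity maps (lists of rows). mask, if given, is a map
    of booleans selecting the pixels to evaluate.
    """
    bad = total = 0
    for y, row in enumerate(gt):
        for x, truth in enumerate(row):
            if mask is not None and not mask[y][x]:
                continue
            total += 1
            if abs(est[y][x] - truth) > delta:
                bad += 1
    return 100.0 * bad / total if total else 0.0
```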
Experimental results on Middlebury datasets. From left to right in each row are the original left images, the ground truth disparity maps, the produced disparity maps by our algorithm and the error maps respectively.
Objective evaluation of matching results.
Overall, the proposed method achieves satisfactory results. Our algorithm correctly estimates the disparities of both textureless and textured surfaces. For instance, the large uniform surfaces in the Venus and Teddy stereo pairs are successfully recovered while the disparity edges are well preserved. In the quantitative comparison, the proposed method outperforms many classical global and local methods, such as Enhanced BP [21], GC+occ [22], SemiGlob [19] and AdaptWeight [3]. Although the NonLocalFilter [2] and PlinearS [23] methods have a lower average error than ours, they do not consider image amplitude distortion and, being intensity-based algorithms, are sensitive to radiometric differences. In the next subsection, we demonstrate the robustness of our method to radiometric distortion.
To clarify the contribution of our improved gradient matching cost, we conduct a quantitative comparison between the proposed method and the traditional method that uses only the modulus information. In addition, to eliminate interference and show the effect of our four-step disparity refinement method, we use the results without disparity refinement. For simplicity, we only present the errors of the estimated disparities in non-occluded regions in Table 3. The proposed matching cost clearly improves the results. Moreover, compared with the results after disparity refinement in Table 2, the effectiveness of our refinement method is also evident, as the error percentages of the disparity maps without refinement are much higher in non-occluded areas.
Comparison of the proposed matching cost with traditional gradient cost
 3.2 Sensitivity to radiometric distortion
To test the sensitivity of stereo algorithms to radiometric differences, Hirschmüller and Scharstein [20] created 6 datasets: Art, Books, Dolls, Moebius, Laundry and Reindeer, which are shown in Fig. 6 together with their ground truth disparity maps. We also present the disparity maps produced by the proposed method. Each dataset is taken using three different exposures and under three different configurations of the light sources. Thus, there are 9 different image combinations that exhibit significant radiometric differences. To demonstrate the performance of the proposed method under radiometric distortion, we keep the right image unchanged and alter the exposure and lighting conditions of the left image, so that the two factors can be considered separately. We show the experimental results for "Reindeer" as an example in Fig. 7. The quality of the produced disparity maps is very stable throughout the experiments, which shows the strong robustness of the proposed method.
More experimental results without radiometric difference. From top to bottom are the Art, Books, Dolls, Moebius, Laundry, and Reindeer stereo datasets. From left to right are the original color images, ground truth and disparity maps produced by the proposed method.
Experimental results of the Reindeer pairs by the proposed method with radiometric difference. The first row shows the left images under three different exposures and the second row shows the corresponding disparity maps. The third row shows the left images under three different light conditions, with the corresponding disparity maps shown in the last row.
As the sensitivity to radiometric distortion is mainly determined by the similarity measure or matching cost, we test three different matching costs including our proposed one. To isolate the effect of the matching cost, all three compared methods use the adaptive window based cost aggregation, which excludes the influence of the aggregation scheme. The resulting curves are shown in Fig. 8. The experiments cover all 3×3 combinations of exposure and light changes, which are denoted 1/1 to 3/3. The error rates are averaged over all 6 datasets. As the plots show, in every exposure and lighting configuration the proposed method performs best while the SAD method performs worst. All three methods perform better when the two images are taken under the same exposure and lighting configuration than when the configurations differ. The SAD method is very sensitive to radiometric distortion, as its error percentage rises dramatically when the left and right images are under different configurations. The gradient method is much better but still not satisfactory. The proposed method is very robust to radiometric distortion, as its error rates stay at a low level and vary little when the exposure and lighting conditions differ. This is because SAD is an intensity-based similarity measure and depends on the color or gray intensities of pixels, which are highly sensitive to radiometric differences. In contrast, the proposed method utilizes gradient information and uses a new matching cost function that integrates the gradient modulus and phase. Hence, our method is not sensitive to color variation and remains robust to radiometric distortion.
Performance comparison under 3×3 left/right image combinations that differ in exposure and lighting conditions. (a). Different lightings; (b). Different exposures.
 3.3 Stereo scene reconstruction
There are many applications of stereo matching, and three-dimensional (3D) scene reconstruction is an important one. By reprojecting image pixels into 3D space using their depth information, we can reconstruct a complete 3D object model from the 2D images. The quality of scene reconstruction is influenced to a large extent by the accuracy of the acquired depth map. To illustrate the quality of the derived matching results, we present reconstructed views of the previous test images in Fig. 9, which give a further impression of the accuracy and detail of the computed depth information. The reconstruction results show that our estimated depth maps are adequate for 3D reconstruction tasks.
3D scene reconstruction results by using the produced disparity images.
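Reprojection of a pixel with known disparity into 3D space can be sketched with the standard relation for a rectified pinhole stereo pair, in which depth equals focal length times baseline divided by disparity. The camera parameters (focal length in pixels, baseline, principal point) are hypothetical inputs that are not given in the text, and the function name `reproject` is likewise illustrative.

```python
def reproject(x, y, d, focal, baseline, cx, cy):
    """Return the 3-D point (X, Y, Z) for pixel (x, y) with disparity d.

    Assumes a rectified pinhole stereo pair: focal is the focal length in
    pixels, baseline the camera separation, (cx, cy) the principal point.
    Returns None for non-positive disparities, which have no finite depth.
    """
    if d <= 0:
        return None
    z = focal * baseline / d
    return ((x - cx) * z / focal, (y - cy) * z / focal, z)
```

Because depth is inversely proportional to disparity, a one-pixel disparity error matters far more for distant points than for near ones, which is why the accuracy of the disparity map strongly affects the reconstruction.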
4. Conclusion
This paper presents a novel stereo matching method based on a combined cost function and adaptive window cost aggregation. The improved cost function integrates both the modulus and phase components of the gradient vector and combines them with the SAD cost, leading to higher accuracy. To address the window size selection problem, we introduce an adaptive window solution. The algorithm constructs an adaptive support region for every pixel according to local color similarity and spatial closeness, so that every pixel gets a proper support region for aggregation. In addition, this support region is reused in the later disparity refinement step. We propose a four-step refinement process consisting of median filtering, left-right consistency checking, invalid pixel recovery and hole filling. We evaluate our algorithm on the stereo pairs from the Middlebury database. The proposed algorithm matches textureless as well as textured surfaces well and preserves depth discontinuities at the same time. The experimental comparisons demonstrate that the proposed method outperforms many local and global methods. Furthermore, the proposed algorithm copes well with radiometric differences, showing strong robustness to radiometric distortion of the input images.
Although the proposed method achieves good performance, some aspects can still be improved. Reducing redundancy in the disparity search range, a more sophisticated disparity refinement process and a parallel implementation of the proposed method will be considered in future work.
BIO
Shiping Zhu received the B.Sc. and M.Sc. degrees from Xi’an University of Technology, Xi’an, China, in 1991 and 1994, and the Ph.D. degree from Harbin Institute of Technology, Harbin, China, in 1997. From 1997 to 1999, he was a Postdoctoral Fellow with Beihang University, Beijing, China. From 2000 to 2002, he was a Postdoctoral Fellow with the Brain and Cognition Research Center, Université Paul Sabatier, Toulouse, France. From 2002 to 2004, he was a Postdoctoral Fellow with the Department of Computer Science and Department of Electrical and Computer Engineering, Université de Sherbrooke, Sherbrooke, QC, Canada. Since 2005, he has been an associate professor with the Department of Measurement Control and Information Technology, School of Instrumentation Science and Optoelectronics Engineering, Beihang University, Beijing, China. (Email: spzhu@163.com)
Zheng Li received the B.Sc. degree in Measurement and Control Technology and Instrumentation from China University of Geosciences, Wuhan, China in 2012, and he is currently pursuing the M.Sc. degree in Instrumentation Science and Technology at Beihang University, Beijing, China. His research interests include stereo vision, view synthesis and image processing. (Email: lizheng900911@163.com)
References
[1] Scharstein Daniel, Szeliski Richard, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, vol. 47, no. 1, pp. 7-42, 2002. DOI: 10.1023/A:1014573219977
[2] Yang Qingxiong, "A non-local cost aggregation method for stereo matching," in Proc. of IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1402-1409, June 16-21, 2012.
[3] Yoon Kuk-Jin, Kweon In-So, "Locally adaptive support weight approach for visual correspondence search," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 924-931, 2006.
[4] Tombari Federico, Mattoccia Stefano, Di Stefano Luigi, "Segmentation based adaptive support for accurate stereo correspondence," in Proc. of the 2nd Pacific Rim Conference on Advances in Image and Video Technology, no. 4872, pp. 427-438, December 17, 2007.
[5] Scharstein Daniel, "View synthesis using stereo vision," PhD thesis, 1997.
[6] De-Maeztu Leonardo, Villanueva Arantxa, Cabeza Rafael, "Stereo matching using gradient similarity and locally adaptive support-weight," Pattern Recognition Letters, vol. 32, no. 13, pp. 1643-1651, 2011. DOI: 10.1016/j.patrec.2011.06.027
[7] Zhou Xiaozhou, Boulanger Pierre, "Radiometric invariant stereo matching based on relative gradients," in Proc. of IEEE International Conference on Image Processing, pp. 2989-2992, September 30-October 3, 2012.
[8] Zabih Ramin, Woodfill John, "Non-parametric local transforms for computing visual correspondence," in Proc. of European Conference on Computer Vision, pp. 151-158, May 2-6, 1994.
[9] Humenberger Martin, Zinner Christian, Weber Michael, Kubinger Wilfried, Vincze Markus, "A fast stereo matching algorithm suitable for embedded real-time systems," Computer Vision and Image Understanding, vol. 114, no. 11, pp. 1180-1202, 2010. DOI: 10.1016/j.cviu.2010.03.012
[10] Fusiello Andrea, Roberto Vito, Trucco Emanuele, "Efficient stereo with multiple windowing," in Proc. of IEEE International Conference on Computer Vision and Pattern Recognition, pp. 858-863, June 17-19, 1997.
[11] Veksler Olga, "Fast variable window for stereo correspondence using integral image," in Proc. of IEEE International Conference on Computer Vision and Pattern Recognition, pp. 556-561, June 18-20, 2003.
[12] Zhang Kang, Lu Jiangbo, Lafruit Gauthier, "Cross-based local stereo matching using orthogonal integral images," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 7, pp. 1073-1079, 2009. DOI: 10.1109/TCSVT.2009.2020478
[13] Qu Yufu, Jiang Ji Xiang, Deng Xiangjin, "Robust local stereo matching under varying radiometric conditions," IET Computer Vision, vol. 8, no. 4, pp. 263-276, 2014. DOI: 10.1049/iet-cvi.2013.0117
[14] Besse Frederic, Rother Carsten, Fitzgibbon Andrew, Kautz Jan, "PMBP: PatchMatch belief propagation for correspondence field estimation," International Journal of Computer Vision, vol. 110, no. 1, pp. 2-13, 2014. DOI: 10.1007/s11263-013-0653-9
[15] Wang Liang, Yang Ruigang, "Global stereo matching leveraged by sparse ground control points," in Proc. of IEEE International Conference on Computer Vision and Pattern Recognition, pp. 3033-3040, June 20-25, 2011.
[16] Barzigar Nafise, Roozgard Aminmohammad, Cheng Samuel, Verma Pramood, "SCoBeP: Dense image registration using sparse coding and belief propagation," Journal of Visual Communications and Image Representation, vol. 24, no. 2, pp. 137-147, 2013. DOI: 10.1016/j.jvcir.2012.08.002
[17] Papadakis Nicolas, Caselles Vicent, "Multi-label depth estimation for graph cuts stereo problems," Journal of Mathematical Imaging and Vision, vol. 38, no. 1, pp. 70-82, 2010. DOI: 10.1007/s10851-010-0212-8
[18] Mei Xing, Sun Xun, Zhou Mingcai, Jiao Shaohui, Wang Haitao, Zhang Xiaopeng, "On building an accurate stereo matching system on graphics hardware," in Proc. of IEEE International Conference on Computer Vision Workshops, pp. 467-474, November 6-13, 2011.
[19] Hirschmüller Heiko, "Stereo processing by semiglobal matching and mutual information," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 328-341, 2008. DOI: 10.1109/TPAMI.2007.1166
[20] Scharstein Daniel, Szeliski Richard, "The Middlebury stereo vision page," http://vision.Middlebury.edu/stereo/
[21] Larsen E. Scott, Mordohai Philippos, Pollefeys Marc, Fuchs Henry, "Temporally consistent reconstruction from multiple video streams using enhanced belief propagation," in Proc. of IEEE International Conference on Computer Vision, pp. 1-8, October 14-21, 2007.
[22] Kolmogorov Vladimir, Zabih Ramin, "Computing visual correspondence with occlusions using graph cuts," in Proc. of IEEE International Conference on Computer Vision, pp. 508-515, July 7-14, 2001.
[23] De-Maeztu Leonardo, Mattoccia Stefano, Villanueva Arantxa, Cabeza Rafael, "Linear stereo matching," in Proc. of IEEE International Conference on Computer Vision, pp. 1708-1715, November 6-13, 2011.