I built OpenCV 3.3.1 with OpenCV_Extra on a Jetson TX2, with CUDA disabled, using the following configuration:
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr \
  -DBUILD_PNG=OFF -DBUILD_TIFF=OFF -DBUILD_TBB=OFF -DBUILD_JPEG=OFF -DBUILD_JASPER=OFF -DBUILD_ZLIB=OFF \
  -DBUILD_EXAMPLES=ON -DBUILD_opencv_java=OFF -DBUILD_opencv_python2=ON -DBUILD_opencv_python3=ON \
  -DENABLE_PRECOMPILED_HEADERS=OFF \
  -DWITH_CUDA=OFF -DWITH_OPENCL=OFF -DWITH_OPENMP=OFF -DWITH_FFMPEG=ON \
  -DWITH_GSTREAMER=OFF -DWITH_GSTREAMER_0_10=OFF -DWITH_GTK=ON -DWITH_VTK=OFF \
  -DWITH_TBB=ON -DWITH_1394=OFF -DWITH_OPENEXR=OFF \
  -DINSTALL_C_EXAMPLES=ON -DINSTALL_TESTS=ON -DCPACK_GENERATOR_DEB=ON \
  -DOPENCV_TEST_DATA_PATH=../opencv_extra/testdata \
  ../opencv
I then ran the OpenCV test script and found two failing tests, 'Calib3d_Affine3f.accuracy' and 'match_bestOf2Nearest.bestOf2Nearest', with the following output:
[opencv_test_calib3d] RUN : /usr/bin/opencv_test_calib3d --perf_min_samples=1 --perf_force_samples=1 --gtest_output=xml:opencv_test_calib3d.xml
[opencv_test_calib3d] CTEST_FULL_OUTPUT
[opencv_test_calib3d] OpenCV version: 3.3.1
[opencv_test_calib3d] OpenCV VCS version: 3.3.1
[opencv_test_calib3d] Build type: release
[opencv_test_calib3d] Parallel framework: tbb
[opencv_test_calib3d] CPU features: neon fp16
[opencv_test_calib3d] [ RUN ] Calib3d_Affine3f.accuracy
[opencv_test_calib3d] /home/nvidia/build-opencv/opencv/modules/calib3d/test/test_affine3.cpp:57: Failure
[opencv_test_calib3d] Expected: 0
[opencv_test_calib3d] To be equal to: cvtest::norm(cv::Mat(affine.matrix, false).colRange(0, 3).rowRange(0, 3) != expected, cv::NORM_L2)
[opencv_test_calib3d] Which is: 441.673
[opencv_test_calib3d] [ FAILED ] Calib3d_Affine3f.accuracy (0 ms)

[opencv_perf_stitching] RUN : /usr/bin/opencv_perf_stitching --perf_min_samples=1 --perf_force_samples=1 --gtest_output=xml:opencv_perf_stitching.xml
[opencv_perf_stitching] Time compensation is 0
[opencv_perf_stitching] CTEST_FULL_OUTPUT
[opencv_perf_stitching] OpenCV version: 3.3.1
[opencv_perf_stitching] OpenCV VCS version: 3.3.1
[opencv_perf_stitching] Build type: release
[opencv_perf_stitching] Parallel framework: tbb
[opencv_perf_stitching] CPU features: neon fp16
[opencv_perf_stitching] [----------] 1 test from match_bestOf2Nearest
[opencv_perf_stitching] [ RUN ] match_bestOf2Nearest.bestOf2Nearest/0
[opencv_perf_stitching] Expected:
[opencv_perf_stitching] [0.9970975582816909, 0.01136054503288174;
[opencv_perf_stitching] -0.002557125266879237, 1.02781673911756;
[opencv_perf_stitching] 0.0002463026627945361, -1.679576661348132e-05]
[opencv_perf_stitching] Actual:
[opencv_perf_stitching] [0.9986402872172184, -0.01796581492545124;
[opencv_perf_stitching] -0.00230781842425846, 1.031312571169278;
[opencv_perf_stitching] 0.0002375978666932171, 9.189439617496881e-05]
[opencv_perf_stitching] /home/nvidia/build-opencv/opencv/modules/ts/src/ts_perf.cpp:571: Failure
[opencv_perf_stitching] Failed
[opencv_perf_stitching] Difference (=0.029326359958332979) between argument1 "R" and expected value is greater than 0.014999999999999999
[opencv_perf_stitching] params = "orb"
[opencv_perf_stitching] termination reason: reached maximum number of iterations
[opencv_perf_stitching] bytesIn = 96896
[opencv_perf_stitching] bytesOut = 0
[opencv_perf_stitching] samples = 1
[opencv_perf_stitching] outliers = 0
[opencv_perf_stitching] frequency = 1000000000
[opencv_perf_stitching] min = 306334005 = 306.33ms
[opencv_perf_stitching] median = 306334005 = 306.33ms
[opencv_perf_stitching] gmean = 306334005 = 306.33ms
[opencv_perf_stitching] gstddev = 0.00000000 = 0.00ms for 97% dispersion interval
[opencv_perf_stitching] mean = 306334005 = 306.33ms
[opencv_perf_stitching] stddev = 0 = 0.00ms
[opencv_perf_stitching] [ FAILED ] match_bestOf2Nearest.bestOf2Nearest/0, where GetParam() = "orb" (577 ms)
[opencv_perf_stitching] [----------] 1 test from match_bestOf2Nearest (577 ms total)
I tried both an Ubuntu 16.04-based image and an Ubuntu 18.04-based image, with different GCC versions. Here is a summary of my findings:
The ‘Calib3d_Affine3f.accuracy’ failure appears on every platform and image I tried whenever the toolchain is based on gcc-7. With any platform and image combination, the test always passes when a GCC version lower than 7 (for example, gcc-6.4.0) is used.
The ‘match_bestOf2Nearest.bestOf2Nearest’ failure appears consistently, regardless of the image or the GCC version used.
My understanding is that ‘Calib3d_Affine3f.accuracy’ fails because the test compares the expected and actual values for exact equality (the test builds a mask with != and expects its norm to be 0), which leaves no room for variation in floating-point results across platforms and compilers. With gcc < 7 the exact comparison passes, but with gcc-7 it fails because the two floating-point values differ slightly, by something on the order of 1e-17.
- Does the IEEE floating-point specification mandate identical precision across architectures?
- Is the comparison method used in this OpenCV test (exact equality instead of comparison within a tolerance) typical for floating-point values under the IEEE specification?
- Why would the floating-point results differ between gcc-7 and gcc < 7?
- How can I make the test pass with gcc-7 and later versions?
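To make the difference between the two comparison styles concrete, here is a minimal sketch in plain C++ (it does not use OpenCV, and the function names are hypothetical). It assumes that an element-wise != on a matrix produces a mask holding 255 where elements differ and 0 elsewhere. Under that assumption, the reported value 441.673 is consistent with 255 * sqrt(3), that is, three mismatching elements, so the number reflects how many elements differ rather than how far apart the values are.

```cpp
#include <cmath>
#include <cstddef>

// Exact check: counts the elements that are not exactly equal.
std::size_t countMismatches(const float* actual, const float* expected, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (actual[i] != expected[i]) ++count;
    }
    return count;
}

// L2 norm of a 0/255 mismatch mask (assumed model of norm(a != b, NORM_L2)).
double mismatchMaskNorm(const float* actual, const float* expected, std::size_t n) {
    return 255.0 * std::sqrt(static_cast<double>(countMismatches(actual, expected, n)));
}

// Tolerance check: passes when every element-wise absolute difference is at most eps.
bool nearlyEqual(const float* actual, const float* expected, std::size_t n, double eps) {
    for (std::size_t i = 0; i < n; ++i) {
        double diff = std::fabs(static_cast<double>(actual[i]) - static_cast<double>(expected[i]));
        if (diff > eps) return false;
    }
    return true;
}
```

With three elements that differ only in the last bit, the exact check reports a norm of about 441.673, while nearlyEqual with a small eps such as 1e-5 still passes.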
The second failure, ‘match_bestOf2Nearest.bestOf2Nearest’, also appears to be related to floating-point precision in some OpenCV functions. I found that OpenCV functions sometimes convert values between CV_32F and CV_64F internally. If CV_32F provides enough precision, this should not matter. In this case, the test compares the expected and actual values within a tolerance (an epsilon) rather than for exact equality. What I observe is that the test passes or fails depending on the random seed given to the functions, which implies either that the predefined epsilon is not large enough, or that floating-point operations on our CPU vary more than the OpenCV test assumes.
- Could this be related to CPU architecture-specific characteristics of the Jetson TX2? If so, which ones?
- How can I resolve this issue?
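As a sanity check on the numbers in the log, here is a small C++ sketch (plain C++, hypothetical helper names, matrix values copied from the Expected and Actual output above). It shows that the reported difference of 0.029326359958332979 matches the largest element-wise absolute difference between the two matrices, which comes from the second element of the first row. For scale, it also measures the error introduced by storing a double in a 32-bit float and reading it back.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Values copied from the log (3 rows x 2 columns, row-major).
const double kExpected[6] = {
    0.9970975582816909,    0.01136054503288174,
   -0.002557125266879237,  1.02781673911756,
    0.0002463026627945361, -1.679576661348132e-05};
const double kActual[6] = {
    0.9986402872172184,   -0.01796581492545124,
   -0.00230781842425846,   1.031312571169278,
    0.0002375978666932171, 9.189439617496881e-05};

// Largest element-wise absolute difference between two arrays.
double maxAbsDiff(const double* a, const double* b, std::size_t n) {
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    }
    return worst;
}

// Error from converting a double to a 32-bit float and back.
double floatRoundTripError(double x) {
    return std::fabs(static_cast<double>(static_cast<float>(x)) - x);
}
```

Calling maxAbsDiff(kExpected, kActual, 6) gives a value above the 0.015 tolerance from the log, while floatRoundTripError for a value close to 1 stays below 1e-7.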
It would be wonderful if you could share your wisdom with me!