4476 Commits

Author SHA1 Message Date
Alexander Alekhin
aece3e732e Merge pull request #18507 from sizeofvoid:openbsd 2020-10-05 17:02:38 +00:00
Mario Emmenlauer
102d8f67cd matrix.cpp::setSize(): fixed out-of-bounds access on cv::Mat steps 2020-10-05 10:19:53 +02:00
Rafael Sadowski
3acf8cfd63 Add an OpenBSD check 2020-10-05 08:23:23 +02:00
Alexander Alekhin
d34717d8c9 core: allow to disable including of unsupported/Eigen/CXX11/Tensor
- define OPENCV_DISABLE_EIGEN_TENSOR_SUPPORT
2020-10-04 15:14:46 +00:00
Alexander Alekhin
233030e417 core: force check that string literals are used in the message 2020-09-27 06:37:44 +00:00
Alexander Alekhin
5e90802b1a Merge pull request #18363 from alalek:issue_18349 2020-09-19 16:53:34 +00:00
Alexander Alekhin
261ad78122 core: emit clearer messages in OutputArray::create() 2020-09-18 15:25:29 +00:00
Alexander Alekhin
4fa82809df ocl: avoid rescheduling of async kernels 2020-09-18 14:53:50 +00:00
Alexander Alekhin
50ff40d684 pre: OpenCV 3.4.12 (version++) 2020-09-06 22:26:32 +00:00
Alexander Alekhin
efcf307b4c ocl: cleanup dead code in case of disabled OpenCL 2020-08-31 11:30:42 +00:00
Alexander Alekhin
f53ff0d01c Merge pull request #18151 from alalek:core_trace_fix_location 2020-08-21 18:54:40 +00:00
Clement Courbet
da555a2c9b Optimize opencv dft by vectorizing radix2 and radix3.
This is useful for non power-of-two sizes when WITH_IPP is not an option.

This shows consistent improvement over OpenCV benchmarks, and we measure
even larger improvements on our internal workloads.

For example, for 320x480, `32FC*`, we can see a ~5% improvement, as
`320=2^6*5` and `480=2^5*3*5`, so the improved radix3 version is used.
`64FC*` is flat as expected, as we do not specialize the functors for `double`
in this change.

```
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, 0, false)                                1.239  1.153     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, 0, true)                                 0.991  0.926     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_COMPLEX_OUTPUT, false)               1.367  1.281     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_COMPLEX_OUTPUT, true)                1.114  1.049     1.06
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_INVERSE, false)                      1.313  1.254     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_INVERSE, true)                       1.027  0.977     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   1.296  1.217     1.06
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    1.039  0.963     1.08
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_ROWS, false)                         0.542  0.524     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_ROWS, true)                          0.293  0.277     1.06
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_SCALE, false)                        1.265  1.175     1.08
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_SCALE, true)                         1.004  0.942     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, 0, false)                                1.292  1.280     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, 0, true)                                 1.038  1.030     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_COMPLEX_OUTPUT, false)               1.484  1.488     1.00
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_COMPLEX_OUTPUT, true)                1.222  1.224     1.00
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_INVERSE, false)                      1.380  1.355     1.02
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_INVERSE, true)                       1.117  1.133     0.99
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   1.372  1.383     0.99
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    1.117  1.127     0.99
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_ROWS, false)                         0.546  0.539     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_ROWS, true)                          0.293  0.299     0.98
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_SCALE, false)                        1.351  1.339     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_SCALE, true)                         1.099  1.092     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, 0, false)                                2.235  2.123     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, 0, true)                                 1.843  1.727     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_COMPLEX_OUTPUT, false)               2.189  2.109     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_COMPLEX_OUTPUT, true)                1.827  1.754     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_INVERSE, false)                      2.392  2.309     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_INVERSE, true)                       1.951  1.865     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   2.391  2.293     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    1.954  1.882     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_ROWS, false)                         0.811  0.815     0.99
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_ROWS, true)                          0.426  0.437     0.98
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_SCALE, false)                        2.268  2.152     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_SCALE, true)                         1.893  1.788     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, 0, false)                                4.546  4.395     1.03
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, 0, true)                                 3.616  3.426     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_COMPLEX_OUTPUT, false)               4.843  4.668     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_COMPLEX_OUTPUT, true)                3.825  3.748     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_INVERSE, false)                      4.720  4.525     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_INVERSE, true)                       3.743  3.601     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   4.755  4.527     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    3.744  3.586     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_ROWS, false)                         1.992  2.012     0.99
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_ROWS, true)                          1.048  1.048     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_SCALE, false)                        4.625  4.451     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_SCALE, true)                         3.643  3.491     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, 0, false)                                4.499  4.488     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, 0, true)                                 3.559  3.555     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_COMPLEX_OUTPUT, false)               5.155  5.165     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_COMPLEX_OUTPUT, true)                4.103  4.101     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_INVERSE, false)                      5.484  5.474     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_INVERSE, true)                       4.617  4.518     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   5.547  5.509     1.01
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    4.553  4.554     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_ROWS, false)                         2.067  2.018     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_ROWS, true)                          1.104  1.079     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_SCALE, false)                        4.665  4.619     1.01
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_SCALE, true)                         3.698  3.681     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, 0, false)                                8.774  8.275     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, 0, true)                                 6.975  6.527     1.07
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_COMPLEX_OUTPUT, false)               8.720  8.270     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_COMPLEX_OUTPUT, true)                6.928  6.532     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_INVERSE, false)                      9.272  8.862     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_INVERSE, true)                       7.323  6.946     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   9.262  8.768     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    7.298  6.871     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_ROWS, false)                         3.766  3.639     1.03
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_ROWS, true)                          1.932  1.889     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_SCALE, false)                        8.865  8.417     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_SCALE, true)                         7.067  6.643     1.06
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, 0, false)                              10.014 10.141    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, 0, true)                               7.600  7.632     1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_COMPLEX_OUTPUT, false)             11.059 11.283    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_COMPLEX_OUTPUT, true)              8.475  8.552     0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_INVERSE, false)                    12.678 12.789    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_INVERSE, true)                     10.445 10.359    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 12.626 12.925    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  10.538 10.553    1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_ROWS, false)                       5.041  5.084     0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_ROWS, true)                        2.595  2.607     1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_SCALE, false)                      10.231 10.330    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_SCALE, true)                       7.786  7.815     1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, 0, false)                              13.597 13.302    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, 0, true)                               10.377 10.207    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_COMPLEX_OUTPUT, false)             15.940 15.545    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_COMPLEX_OUTPUT, true)              12.299 12.230    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_INVERSE, false)                    15.270 15.181    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_INVERSE, true)                     12.757 12.339    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 15.512 15.157    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  12.505 12.635    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_ROWS, false)                       6.359  6.255     1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_ROWS, true)                        3.314  3.248     1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_SCALE, false)                      13.937 13.733    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_SCALE, true)                       10.782 10.495    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, 0, false)                              18.985 18.926    1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, 0, true)                               14.256 14.509    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_COMPLEX_OUTPUT, false)             18.696 19.021    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_COMPLEX_OUTPUT, true)              14.290 14.429    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_INVERSE, false)                    20.135 20.296    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_INVERSE, true)                     15.390 15.512    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 20.121 20.354    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  15.341 15.605    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_ROWS, false)                       8.932  9.084     0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_ROWS, true)                        4.539  4.649     0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_SCALE, false)                      19.137 19.303    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_SCALE, true)                       14.565 14.808    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, 0, false)                              22.553 21.171    1.07
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, 0, true)                               17.850 16.390    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_COMPLEX_OUTPUT, false)             24.062 22.634    1.06
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_COMPLEX_OUTPUT, true)              19.342 17.932    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_INVERSE, false)                    28.609 27.326    1.05
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_INVERSE, true)                     24.591 23.289    1.06
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 28.667 27.467    1.04
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  24.671 23.309    1.06
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_ROWS, false)                       9.458  9.077     1.04
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_ROWS, true)                        4.709  4.566     1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_SCALE, false)                      22.791 21.583    1.06
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_SCALE, true)                       18.029 16.691    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, 0, false)                              25.238 24.427    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, 0, true)                               19.636 19.270    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_COMPLEX_OUTPUT, false)             28.342 27.957    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_COMPLEX_OUTPUT, true)              22.413 22.477    1.00
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_INVERSE, false)                    26.465 26.085    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_INVERSE, true)                     21.972 21.704    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 26.497 26.127    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  22.010 21.523    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_ROWS, false)                       11.188 10.774    1.04
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_ROWS, true)                        6.094  5.916     1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_SCALE, false)                      25.728 24.934    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_SCALE, true)                       20.077 19.653    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, 0, false)                              43.834 40.726    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, 0, true)                               35.198 32.218    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_COMPLEX_OUTPUT, false)             43.743 40.897    1.07
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_COMPLEX_OUTPUT, true)              35.240 32.226    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_INVERSE, false)                    46.022 42.612    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_INVERSE, true)                     36.779 33.961    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 46.396 42.723    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  37.025 33.874    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_ROWS, false)                       17.334 16.832    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_ROWS, true)                        9.212  8.970     1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_SCALE, false)                      44.190 41.211    1.07
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_SCALE, true)                       35.900 32.888    1.09
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, 0, false)                              40.948 38.256    1.07
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, 0, true)                               33.825 30.759    1.10
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_COMPLEX_OUTPUT, false)             53.210 53.584    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_COMPLEX_OUTPUT, true)              46.356 46.712    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_INVERSE, false)                    47.471 47.213    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_INVERSE, true)                     40.491 41.363    0.98
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 46.724 47.049    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  40.834 41.381    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_ROWS, false)                       14.508 14.490    1.00
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_ROWS, true)                        7.832  7.828     1.00
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_SCALE, false)                      41.491 38.341    1.08
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_SCALE, true)                       34.587 31.208    1.11
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, 0, false)                              65.155 63.173    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, 0, true)                               56.091 54.752    1.02
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_COMPLEX_OUTPUT, false)             71.549 70.626    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_COMPLEX_OUTPUT, true)              62.319 61.437    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_INVERSE, false)                    61.480 59.540    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_INVERSE, true)                     54.047 52.650    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 61.752 61.366    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  54.400 53.665    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_ROWS, false)                       20.219 19.704    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_ROWS, true)                        11.145 10.868    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_SCALE, false)                      66.220 64.525    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_SCALE, true)                       57.389 56.114    1.02
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, 0, false)                              86.761 88.128    0.98
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, 0, true)                               75.528 76.725    0.98
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_COMPLEX_OUTPUT, false)             86.750 88.223    0.98
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_COMPLEX_OUTPUT, true)              75.830 76.809    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_INVERSE, false)                    91.728 92.161    1.00
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_INVERSE, true)                     78.797 79.876    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 92.163 92.177    1.00
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  78.957 79.863    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_ROWS, false)                       24.781 25.576    0.97
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_ROWS, true)                        13.226 13.695    0.97
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_SCALE, false)                      87.990 89.324    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_SCALE, true)                       76.732 77.869    0.99
```
2020-08-21 14:06:09 +02:00
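
The size factorizations quoted in the radix2/radix3 commit message above (`320=2^6*5`, `480=2^5*3*5`) can be checked with a small stand-alone sketch. `primeFactors` and `usesRadix3` are hypothetical helper names introduced here for illustration only; they are not OpenCV functions, and the actual DFT code may select its stages differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of OpenCV): split n into prime factors,
// smallest first. A mixed-radix DFT needs one stage per factor.
std::vector<int> primeFactors(int n)
{
    assert(n > 0);
    std::vector<int> factors;
    for (int p = 2; p * p <= n; ++p)
    {
        while (n % p == 0)
        {
            factors.push_back(p);
            n /= p;
        }
    }
    if (n > 1)
        factors.push_back(n); // remaining prime factor
    return factors;
}

// True if the size needs a radix-3 stage, i.e. 3 divides n.
bool usesRadix3(int n)
{
    return n % 3 == 0;
}
```

Under this sketch, 480 contains a factor of 3 while 320 does not, which matches the commit's remark that the radix3 path is exercised for the 320x480 case.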
Alexander Alekhin
cd00d8f3f0 core(trace): lazy querying for OPENCV_TRACE_LOCATION
- fixes proper initialization of non-trivial variable
2020-08-20 21:48:05 +00:00
Alexander Alekhin
b3755e617c ocl: silence warning in case of async cleanup
- OpenCL kernel cleanup processing is asynchronous and can be called even after forced clFinish()
- buffers are released later in asynchronous mode
- silence these false positive cases for asynchronous cleanup
2020-08-20 19:33:37 +00:00
nhlsm
68f527267b
Merge pull request #18080 from nhlsm:improve-mat-operator-assign-scalar
* improve Mat::operator=(Scalar)

* touch

* remove trailing whitespace

* TEST: check whether old code passes the test or not

* remove CV_Error

* remove warning

* fix: is -> Scalar

* 1) Mat *mat -> Mat &mat 2) return bool, add output param

* add comment
2020-08-14 17:21:23 +00:00
Alexander Alekhin
00890aecdf core(ocl): fix ocl::Image2d::isFormatSupported()
in case of OPENCV_OPENCL_DEVICE=disabled
2020-08-13 18:33:18 +00:00
Alexander Alekhin
3f65c12d0c Merge pull request #17982 from nglee:dev_cudaGpuMatConvertToInplaceFix 2020-08-09 20:21:17 +00:00
Gabriel
96ce65f021 Document PatchNANs input type 2020-08-03 22:57:18 -03:00
Namgoo Lee
2241bfb0df Use "src" not "*this" for source GpuMat 2020-07-30 01:03:34 +09:00
Alexander Alekhin
284d26da05 Merge tag '3.4.11' 2020-07-17 02:06:19 +00:00
Alexander Alekhin
e8d4259f9a release: OpenCV 3.4.11 2020-07-17 00:34:46 +00:00
Alexander Alekhin
e54040d540 core: use lazy on-demand initialization for param_traceEnable 2020-07-12 11:53:46 +00:00
Alexander Alekhin
e0f9eac521 cmake: backport CUDA scripts 2020-07-08 07:33:54 +00:00
Alexander Alekhin
eb6678ebef
Merge pull request #17699 from alalek:build_core_cuda
* core(cuda): fix build

- MSVS 19.25.28612.0
- CUDA release 11.0, V11.0.167

* cmake(cuda): backport workaround for CUDA 11

* cmake(cuda): call CUDA_BUILD_CLEAN_TARGET() on finalize

* cmake(cuda): use CMAKE_SUPPRESS_REGENERATION with MSVS
2020-07-06 22:58:17 +00:00
dev-tronifier
9b727fa1f3 Increased portability of CV_Func 2020-06-26 19:45:58 +00:00
Yuriy Obukh
456e88a8a4 fix VS Windows build with eigen. https://github.com/opencv/opencv/issues/17548 2020-06-18 14:31:11 +03:00
Alexander Alekhin
9755ab160d Merge pull request #17556 from nglee:dev_optFlowTVL1Async 2020-06-16 20:06:56 +00:00
Namgoo Lee
2043e06102 cuda optflow tvl1 : async safety
also modify cuda canny to use createTextureObjectPitch2D, etc.
2020-06-17 01:04:22 +09:00
Alexander Alekhin
442999dcdb core: fix handling of ND-arrays in dumpInputArray() helpers 2020-06-12 10:23:32 +00:00
Rasmus
781fbde449
Merge pull request #17368 from themightyoarfish:cv2eigen-doc
* Add documentation about usage of cv2eigen functions in eigen.hpp

* Fixed Doxygen syntax.

Co-authored-by: Alexander Smorkalov <smorkalov.a.m@gmail.com>
2020-06-10 07:53:18 +00:00
Alexander Alekhin
a43e3bebe6 pre: OpenCV 3.4.11 (version++) 2020-06-08 18:46:27 +00:00
Maksim Shabunin
59608907b8 Added countNonZero test for big arrays and disable IPP for some cases 2020-06-03 18:58:41 +03:00
Alexander Alekhin
f68654a204 Merge pull request #17438 from alalek:fix_eigen_builds 2020-06-01 18:02:07 +00:00
Vadim Pisarevsky
5489735258
Merge pull request #17436 from vpisarev:fix_python_io
* fixed #17044
1. fixed Python part of the tutorial about using OpenCV XML-YAML-JSON I/O functionality from C++ and Python.
2. added startWriteStruct() and endWriteStruct() methods to FileStorage
3. modified FileStorage::write() methods to make them work well inside sequences, not only mappings.

* try to fix the doc builder

* added Python regression test for FileStorage I/O API ([TODO] iterating through long sequences can be very slow)

* fixed yaml testing
2020-06-01 11:33:09 +00:00
Alexander Alekhin
74020a084b core: fix builds with eigen helper header 2020-05-31 15:41:42 +00:00
Egor Pugin
1bec7ca540
Merge pull request #17352 from egorpugin:patch-2
* Fix integer overflow in parseOption().

Previous code does not work for values like 100000MB.

* Fix warning during 32-bit build on inactive code path.

* fix build without C++11
2020-05-25 20:25:18 +00:00
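
The overflow described in the `parseOption()` fix above (values like 100000MB) comes from scaling the number in a type that is too narrow. The following is only a sketch of the general remedy, assuming a 64-bit accumulator and an explicit range check; `parseSizeOption` is a hypothetical name and is not OpenCV's `parseOption()`.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a size-option parser; not OpenCV's parseOption().
// Accepts a non-negative integer with an optional KB/MB suffix and returns bytes.
std::uint64_t parseSizeOption(const std::string& text)
{
    std::size_t pos = 0;
    // std::stoull parses into a type of at least 64 bits.
    const unsigned long long value = std::stoull(text, &pos);
    const std::string suffix = text.substr(pos);

    std::uint64_t scale = 1;
    if (suffix == "KB" || suffix == "kb")
        scale = 1024ull;
    else if (suffix == "MB" || suffix == "mb")
        scale = 1024ull * 1024ull;
    else if (!suffix.empty())
        throw std::invalid_argument("unknown size suffix: " + suffix);

    // Guard against wrap-around before multiplying in 64 bits.
    if (value > UINT64_MAX / scale)
        throw std::out_of_range("size option is too large");
    return static_cast<std::uint64_t>(value) * scale;
}
```

With a 32-bit `int`, `100000 * 1024 * 1024` does not fit; in 64 bits the same product is 104857600000 bytes.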
Josh Bradley
9fef09fe89
Merge pull request #17320 from jgbradley1:add-eigen-tensor-conversions
* add eigen tensor conversion functions

* add eigen tensor conversion tests

* add support for column major order

* update eigen tensor tests

* fix coding style and add conditional compilation

* fix conditional compilation checks

* remove whitespace

* rearrange functions for easier reading

* reformat function documentation and add tensormap unit test

* cleanup documentation of unit test

* remove condition duplication

* check Eigen major version, not minor version

* restrict to Eigen v3.3.0+

* add documentation note and add type checking to cv2eigen_tensormap()
2020-05-23 18:25:01 +00:00
Alexander Alekhin
a3b109eca0 imgproc: enable GaussianBlur IPP parallel processing 2020-05-17 11:40:34 +00:00
Alexander Alekhin
74e4cfd1da core(MatExpr): fix warning in case of e.s == (0, 0, 0, 0) 2020-05-01 07:29:57 +00:00
Alexander Alekhin
3c14a8c507 Merge pull request #17149 from alalek:core_simd_suppress_coverity 2020-04-24 17:46:54 +00:00
Alexander Alekhin
cd7db168e0 core(SIMD): suppress coverity UNINIT_CTOR on SIMD vectors 2020-04-24 16:36:35 +00:00
Paul Jurczak
a748eba42e Added descriptions of randu and randn 2020-04-20 07:13:37 +00:00
Alexander Alekhin
acf1be547d Merge pull request #17046 from alalek:core_inputarray_matexpr_cleanup 2020-04-18 21:41:59 +00:00
Alexander Alekhin
fbaae7ac37 Merge pull request #17041 from alalek:core_simd_vector_ctors 2020-04-17 21:22:08 +00:00
Alexander Alekhin
dcf7eb972e core(SIMD): align behavior of vector constructors
- setzero() calls are dropped due to low-level API nature
- initialization is mandatory if necessary (not an output of other calls)
2020-04-17 14:34:34 +00:00
Maksim Shabunin
f84cae833a TickMeter: added FPS and AvgTime, improved docs, reformatted 2020-04-16 21:33:29 +03:00
Alexander Alekhin
c8f1948d58 core: drop EXPR handling code in InputArray 2020-04-14 18:02:19 +00:00
Alexander Alekhin
49a75079f2 Merge pull request #17047 from alalek:fix_permissions 2020-04-13 12:34:08 +00:00
Alexander Alekhin
f0ffc52435 fix files permissions 2020-04-13 04:29:55 +00:00
Alexander Alekhin
9c58a7cb1e Merge pull request #16653 from alalek:core_inputarray_matexpr 2020-04-10 16:57:17 +00:00
Alexander Alekhin
d7abb641ca core(test): add InputArray(MatExpr) fetch test 2020-04-10 11:35:42 +00:00
Alexander Alekhin
936428cb3b core(MatExpr) fetch result before InputArray wrap
- avoid multiple expression evaluations
- avoid issues with reduced support of InputArray::EXPR
2020-04-06 15:28:32 +00:00
Adam Fowles
8334932a26
Merge pull request #16992 from afowles:fix-forEach-segfault
* Fixed divide by zero error in forEach

* Dedicated assertion for !empty
2020-04-06 14:49:02 +00:00
Alexander Alekhin
0812207db7 Merge tag '3.4.10' 2020-04-03 11:24:31 +00:00
Alexander Alekhin
1cc1e6fa56 release: OpenCV 3.4.10 2020-04-02 19:59:58 +00:00
Alexander Alekhin
54063c40de core(ocl): options to control buffer access flags
- control use of clEnqueueMapBuffer or clEnqueueReadBuffer[Rect]
- added benchmarks with OpenCL buffer access use cases
2020-04-02 11:11:06 +00:00
Alexander Alekhin
09134ac881 core: emit warning ONCE on ambiguous MatExpr processing 2020-04-01 18:34:20 +00:00
Alexander Alekhin
353273579b Merge pull request #16918 from alalek:build_warnings_3.4 2020-03-27 16:43:23 +00:00
Alexander Alekhin
e661ad2a67 eliminate build warnings 2020-03-27 11:39:07 +00:00
cyy
bdc29cccb6 fix freebsd build 2020-03-27 18:12:10 +08:00
Alexander Alekhin
c920b45fb8 core(persistence): fix resource leaks - force closing files
backporting commit 673eb2b006
2020-03-25 10:49:16 +00:00
Alexander Alekhin
377dd04224 core: fix .begin()/.end() of empty Mat 2020-03-20 14:08:45 +00:00
Alexander Alekhin
77d1c20fb7 core(buffer_area): handle 'OPENCV_ENABLE_MEMORY_SANITIZER=ON' case 2020-03-16 19:34:08 +03:00
RAJKIRAN NATARAJAN
3b2e409fa7
Merge pull request #16779 from saskatchewancatch:issue-16777
* Fixes issue 16777.

* core: update Concurrency getNumThreads()
2020-03-16 17:12:29 +03:00
Alexander Alekhin
71ec112093 Merge pull request #16786 from alalek:issue_16398 2020-03-15 19:49:50 +00:00
Sayed Adel
9ea62bfddb core:vsx reimplement v_broadcast_element()
There's no need to use `vec_perm()` instead of `vec_splat()`,
  since instruction `vperm` is quite heavy compared to `vsplt[b,h,w]`.
2020-03-14 22:54:22 +02:00
Alexander Alekhin
4e56c1326f core: adjust type of allocator_stats counter, allow to disable 2020-03-11 20:12:29 +03:00
Alexander Alekhin
9f82b74788 Merge pull request #16774 from alalek:core_update_cpus_detection 2020-03-10 22:39:30 +00:00
Alexander Alekhin
612746b4e5 Merge pull request #16744 from alalek:fix_mat_aug_operators_use_after_free 2020-03-10 22:02:47 +00:00
Alexander Alekhin
83e1d79403 core: update CPUs detection
- cache value, evaluate once
- better support for MINGW
- anything in 'cv' namespace
- test: dump number of active threads
2020-03-10 21:29:08 +00:00
Alexander Alekhin
b7ecaceda8 pre: OpenCV 3.4.10 (version++)
- Android Manager version is not increased (stuck on 3.49)
2020-03-10 14:53:43 +03:00
Alexander Alekhin
34530da66e core: fix coverity issues 2020-03-06 18:12:45 +00:00
Alexander Alekhin
3a2f40ac6f core: don't allow reallocation in add/div/sub/bitwise aug operators 2020-03-06 13:00:40 +00:00
Manoj Gupta
880d2afb67 Fix building with ToT libc++
ToT libc++ (LLVM) no longer includes <sstream>
as part of <complex> which breaks building opencv.
Include <sstream> header explicitly to fix this.
2020-03-05 17:10:43 -08:00
Alexander Alekhin
a694e5074f Merge pull request #16723 from jansol:master 2020-03-05 12:25:20 +00:00
Alexander Alekhin
4f288a1e28
Merge pull request #16704 from alalek:core_log_once_log_if
* core(logger): add CV_LOG_ONCE_xxx() CV_LOG_IF_xxx() macros

* core(logger): keep tests disabled
2020-03-04 20:42:41 +00:00
Jan Solanti
ad16c243ca core(ocl): Don't query image formats when none exist
clGetSupportedImageFormats returns CL_INVALID_VALUE if called with
num_entries 0 and a non-NULL image_formats pointer so let's not do that.
2020-03-04 14:15:33 +02:00
Alexander Alekhin
4d0f13544d
Merge pull request #16700 from alalek:fix_core_matexpr_size_gemm
core: fix MatExpr::size() for gemm()

* core(test): MatExpr::size() test for gemm()

* core: fix MatExpr::size() for gemm()
2020-03-02 17:13:02 +03:00
Peter Würtz
5012fc5d23
Merge pull request #16684 from pwuertz:ignore_clang_mat_inl
* Ignore clang warnings for deprecated enum+enum operations in mat.inl.hpp

* build: added customization macros, cmake flags for OpenCV build
2020-02-28 21:21:03 +03:00
Alexander Alekhin
af9ded89d0 core: fix build getNumberOfCPUs for JavaScript 2020-02-26 18:54:23 +03:00
Alexander Alekhin
c13a62ce10 Merge pull request #16638 from mshabunin:use-safe-buffers 2020-02-26 14:54:57 +00:00
Ganesh Kathiresan
09df7810d1
Merge pull request #16457 from ganesh-k13:bugfix/getCPUCount-fix
* Fixed getCPUCount

Minor new line changes

Android fix | efficient linux checks

Android fix 2

Fixed cpu logic for non linux platforms

Android fix 3

Android fix 4

* No v1 case handle | Refactor long lines

* Refined Cgroups logic | Combine Android and Linux

* Fixed directives

* Added support for --cpus | Fixed minor bug in Android | Change file read logic

* Added macro checks for apple errors

* Fixed macro to include android

* Addressed review comments

* Fixed android macro

* Refined return values

* Fixed apple warning

* Addressed review comments

* Fixed whitespace

* Android Fix try 1

* Android Fix try 2

* Android Fix try 3

* Removed unwanted endif

* Android Fix try 4

* Android Fix try 5

* Macro Restructure

* core: updates to CPUs detection (minor)
2020-02-26 17:48:50 +03:00
Alexander Alekhin
f48c84eaee Merge pull request #16656 from alalek:issue_16655 2020-02-26 12:47:46 +00:00
Maksim Shabunin
bf96d8239d Use BufferArea in more places 2020-02-26 11:45:19 +03:00
Alexander Alekhin
d54d01ca46 core(MatExpr): fix .type() bug 2020-02-23 17:05:05 +00:00
Alexander Alekhin
01048e5603
Merge pull request #16616 from alalek:dnn_fix_input_shape
* dnn: fix processing of input shapes

- importer: avoid using of .setInput() => .setInputShape()
- setInput: shape limitation check (partial)

* dnn(test): test .setInput() in readNet()
2020-02-21 22:39:54 +03:00
Alexander Alekhin
966c2191cb
Merge pull request #13928 from catree:add_matx_div_operations 2020-02-21 22:35:03 +03:00
Alexander Alekhin
a0f5eb282c Merge pull request #16635 from mshabunin:fix-avx512-cvt 2020-02-21 13:15:40 +00:00
Vadim Pisarevsky
07b475062f
Merge pull request #16608 from vpisarev:fix_mac_ocl_tests
* fixed several problems when running tests on Mac:
* OCL_pyrUp
* OCL_flip
* some basic UMat tests
* histogram badarg test (out of range access)

* retained the storepix fix in ocl_flip only for 16U/16S datatype, where the OpenCL compiler on Mac generates incorrect code

* moved deletion of ACCESS_FAST flag to non-SVM branch (where SVM is shared virtual memory (in OpenCL 2.x), not support vector machine)

* force OpenCL to use read/write for GPU<=>CPU memory transfers on machines with discrete video only on Macs. On Windows/Linux the drivers are seemingly smart enough to implement map/unmap properly (and maybe more efficiently than explicit read/write)
2020-02-21 16:13:41 +03:00
Maksim Shabunin
8b2c499be6 intrin: fixed int64->double conversion for AVX-512 2020-02-21 15:20:00 +03:00
Alexander Smorkalov
c87b99e82b Added test for new MatX division. 2020-02-21 10:08:55 +03:00
atinfinity
f81fdd58da
Merge pull request #16445 from atinfinity:fixed-typo
* fixed typo

* add compatibility code to handle migration
2020-02-16 19:16:33 +03:00
Pavel Rojtberg
e13a73d084 core: export getCPUFeaturesLine to bindings 2020-02-10 14:06:43 +01:00
Alexander Alekhin
eb14f9a464 Merge pull request #16463 from alalek:core_strong_ptr_alignment 2020-02-08 19:45:43 +00:00
Maksim Shabunin
55cdeaa6dd BufferArea: initial version, usage in StereoBM
New class BufferArea is used to hide complexity of buffers allocations and allow instrumentation with valgrind and sanitizers.
2020-02-07 14:57:36 +03:00
gapry
ac9f8c1f41 Fixed Compilation warnings | Issue #16336 2020-02-01 03:32:42 +08:00
Alexander Alekhin
591f427003 Merge pull request #16459 from nh2:patch-1 2020-01-30 14:25:18 +00:00
Alexander Alekhin
a4bd7506a5 core: CV_STRONG_ALIGNMENT macro
Should be used to guard unsafe type casts of pointers
2020-01-29 18:44:17 +03:00
Niklas Hambüchen
70cbc3d883 cvdef.h: Don't use C's limits.h under C++
Just like with the other headers in the rest of the file.

See e.g. https://stackoverflow.com/questions/36831465/what-difference-does-it-make-when-i-include-limits-or-limits-h-in-my-c-cod
for the reasons, the most important one being that limits.h does not respect
namespaces, which can make problems for downstream consumers of cvdef.h.
2020-01-29 16:41:31 +01:00
Sayed Adel
ec033330df core:vsx workaround for the unexpected results of vec_vbpermq in gcc4.9 2020-01-29 15:05:12 +02:00
Sayed Adel
bd531bd828 core:vsx fix inline asm constraints
generalize constraints to 'wa' for VSX registers
2020-01-28 15:48:00 +02:00
Alexander Alekhin
e83438c23d core(build): fix i386 compilation 2020-01-26 00:00:25 +00:00
Chip Kerchner
4d2da2debe Merge pull request #16375 from ChipKerchner:vectorizeMultTranspose
* Reduce LLC loads, stores and multiplies on MulTransposed - 8% faster on VSX

* Add is_same method so c++11 is not required

* Remove trailing whitespaces.

* Change is_same to DataType depth check
2020-01-24 18:00:49 +03:00
Alexander Alekhin
d42e04d0df core(SIMD): fix MSA build - add v_reduce_min/max for u8/s8 2020-01-20 15:10:03 +03:00
Chip Kerchner
301626ba26 Merge pull request #15488 from ChipKerchner:vectorizeMinMax2
Vectorize minMaxIdx functions

* Updated documentation and intrinsic tests for v_reduce

* Add other files back in from the forced push

* Prevent a constant overflow with v_reduce for int8 type

* Another alternative to fix constant overflow warning.

* Fix another compiler warning.

* Update comments and change comparison form to be consistent with other vectorized loops.

* Change return type of v_reduce_min & max for v_uint8 and v_uint16 to be same as lane type.

* Cast v_reduce functions to int to avoid overflow. Reduce number of parameters in MINMAXIDX_REDUCE macro.

* Restore cast type for v_reduce_min & max to LaneType
2020-01-17 19:37:35 +03:00
Alexander Alekhin
a9f3acb125 core(simd): fix NEON alignment issue 2020-01-11 18:39:50 +00:00
Alexander Alekhin
e180cc050b
Merge pull request #16236 from alalek:fix_core_simd_emulator
* core: fix intrin_cpp, allow to build modules with SIMD emulator

* core(arithm): fix v_zero initialization

* core(simd): 'strict' types for binary/bitwise operations

* features2d: avoid aligned load issue in GCC 5.4 with emulated SIMD

* core(simd): alignment checks in SIMD emulator
2020-01-10 21:31:02 +03:00
Nuzhny007
7d484d21f7 Fixed compilation on windows with openvx 2020-01-06 06:32:56 +03:00
Alexander Alekhin
523f081923 core(check): add Size_<int> 2019-12-28 13:50:39 +00:00
Brian Wignall
f9c514b391 Fix spelling typos
backport commit 659ffaddb4
2019-12-27 12:46:53 +00:00
Alexander Alekhin
5e2bcc9149 Merge tag '3.4.9' 2019-12-20 12:44:15 +03:00
Alexander Alekhin
64e6cf9fe5 release: OpenCV 3.4.9 2019-12-19 18:16:47 +03:00
Alexander Alekhin
dff8e29f98 Merge pull request #16139 from alalek:core_flip_avoid_unaligned 2019-12-19 10:29:07 +00:00
Alexander Alekhin
8d22ac200f core: workaround flipHoriz() alignment issues 2019-12-19 00:05:23 +00:00
Tatsuro Shibamura
971ae00942 Merge pull request #16027 from shibayan:arm64-windows10
* Support ARM64 Windows 10 platform

* Fixed detection issue for ARM64 Windows 10

* Try enabling ARM NEON intrin

* build: disable NEON with MSVC compiler

* samples(directx): gdi32 dependency
2019-12-17 00:23:30 +03:00
Alexander Alekhin
a45928045a
Merge pull request #16150 from alalek:cmake_avoid_deprecated_link_private
* cmake: avoid deprecated LINK_PRIVATE/LINK_PUBLIC

see CMP0023 (CMake 2.8.12+)

* cmake: fix 3rdparty list

- don't include OpenCV modules
2019-12-13 17:52:40 +03:00
RAJKIRAN NATARAJAN
e6ce752da1 Merge pull request #15966 from saskatchewancatch:issue-15760
Add checks for empty operands in Matrix expressions that don't check properly

* Starting to add checks for empty operands in Matrix expressions that
don't check properly.

* Adding checks and declarations for checker functions

* Fix signatures and add checks for each class of Matrix Expr operation

* Make it catch the right exception

* Don't expose helper functions to public API
2019-12-12 19:23:57 +03:00
Alexander Alekhin
f2cce5fd8c Merge pull request #16125 from alalek:core_safe_xadd 2019-12-11 14:15:46 +00:00
Alexander Alekhin
7d61426279 Merge pull request #16124 from alalek:issue_13354 2019-12-11 14:15:23 +00:00
Alexander Alekhin
416848066c core: provide safe implementations of CV_XADD() only 2019-12-11 00:48:45 +00:00
Alexander Alekhin
76b5e19eb3 core: add "namespace cv" in CV_StaticAssert fallback implementation 2019-12-11 00:35:13 +00:00
Alexander Alekhin
a675c4937a core: OPENCV_INCLUDE_PORT_FILE for custom platform configuration 2019-12-11 00:31:45 +00:00
Alexander Alekhin
a2642d83d3 Merge pull request #16093 from alalek:core_itt_thread_name_16072 2019-12-09 18:29:53 +00:00
Paul Murphy
a011035ed6 Merge pull request #15257 from pmur:resize
* resize: HResizeLinear reduce duplicate work

There appears to be a 2x unroll of the HResizeLinear against k,
however the k value is only incremented by 1 during the unroll. This
results in k - 1 duplicate passes when k > 1.

Likewise, the final pass may not respect the work done by the vector
loop. Start it with the offset returned by the vector op if
implemented. Note, no vector ops are implemented today.

The performance gain is most noticeable on a linear downscale. A set of
performance tests is added to characterize this. The performance
improvement is 10-50% depending on the scaling.

* imgproc: vectorize HResizeLinear

Performance is mostly gated by the gather operations
for x inputs.

Likewise, provide a 2x unroll against k, this reduces the
number of alpha gathers by 1/2 for larger k.

While not a 4x improvement, it still performs substantially
better under P9 for a 1.4x improvement. P8 baseline is
1.05-1.10x due to reduced VSX instruction set.

For float types, this results in a more modest
1.2x improvement.

* Update U8 processing for non-bitexact linear resize

* core: hal: vsx: improve v_load_expand_q

With a little help, we can do this quickly without gprs on
all VSX enabled targets.

* resize: Fix cn == 3 step per feedback

Per feedback, ensure we don't overrun. This was caught via the
failure observed in Test_TensorFlow.inception_accuracy.
2019-12-09 14:54:06 +03:00
Alexander Alekhin
816f82682b core(trace/itt): avoid calling __itt_thread_set_name() by default
- don't override current application thread names
- set name for own threads only
2019-12-07 21:41:15 +00:00
Alexander Alekhin
76a27e3399 pre: OpenCV 3.4.9 (version++) 2019-12-05 18:28:38 +00:00
Alexander Alekhin
72f35e0626
Merge pull request #16052 from alalek:issue_16040
* calib3d: use normalized input in solvePnPGeneric()

* calib3d: java regression test for solvePnPGeneric

* calib3d: python regression test for solvePnPGeneric
2019-12-05 15:36:39 +03:00
Alexander Alekhin
f21bde4d9f
Merge pull request #16046 from alalek:issue_15990
* core: disable invalid constructors in C API by default

- C API objects will lose their default initializers through constructors

* samples: stop using the C API
2019-12-05 14:48:18 +03:00
Alexander Alekhin
95e36fd488 Merge pull request #16055 from alalek:issue_16041 2019-12-05 07:54:17 +00:00
Alexander Alekhin
4dfa0a0383 bindings: basic support for #if preprocessor directives
- #if 0
- #ifdef __OPENCV_BUILD
2019-12-04 18:42:31 +03:00
Alexander Alekhin
818585fd12 core(tls): unblock TlsAbstraction destructor call
- required to unregister callbacks from system
2019-12-04 08:27:01 +00:00
Vadim Levin
8d74101f07 Merge pull request #15955 from VadimLevin:dev/vlevin/generator_tests
Tests for argument conversion of Python bindings generator

* Tests for parsing elemental types from Python bindings

  - Add positive and negative tests for int, float, double, size_t,
    const char*, bool.
  - Tests with wrong conversion behavior are skipped.

* Move implicit conversion of bool to integer/floating types to wrong
conversion behavior.
2019-11-29 16:24:13 +03:00
Alexander Alekhin
70146700aa Merge pull request #15839 from alalek:core_simd_v_setall_template 2019-11-27 19:19:35 +00:00
Brian Wignall
af997529a1 Fix some typos 2019-11-26 18:41:19 +03:00
Alexander Alekhin
50ac880335 Merge pull request #15971 from alalek:core_kmeans_handle_overflow 2019-11-22 21:36:02 +00:00
Natsu
54e6f5c237 Merge pull request #15970 from akemimadoka:master
* Fix android armv7 c++_static init crash

* core: move initialization of 'ios_base::Init' for Android
2019-11-22 18:42:25 +03:00
Alexander Alekhin
3266ac7667 core(kmeans): bailout if can't select cluster center 2019-11-22 14:40:02 +00:00
Alexander Alekhin
ec55b6f6db core: fix MSA build 2019-11-21 18:59:41 +03:00
Everton Constantino
75315fb297 Merge pull request #15494 from everton1984:hal_vector_get_n
Improving VSX performance of integral function

* Adding support for vector get function on VSX datatypes so the
integral function gains a bit of performance.

* Removing get as a datatype member function and implementing a new HAL
instruction v_extract_n to get the n-th element of a vector register.

* Adding SSE/NEON/AVX intrinsics.

* Implement new HAL instruction v_broadcast_element on VSX/AVX/NEON/SSE.

* core(simd): add tests for v_extract_n/v_broadcast_element

- updated docs
- commented out code to repair compilation
- added WASM and MSA default implementations

* core(simd): fix compilation

- x86: avoid _mm256_extract_epi64/32/16/8 with MSVS 2015
- x86: _mm_extract_epi64 is 64-bit only

* cleanup
2019-11-20 13:41:07 +03:00
Alexander Alekhin
e07a488012
Merge pull request #15925 from alalek:core_test_simd_cpp_emulation
core(test): extending tests with SIMD C++ emulation code (intrin_cpp.hpp)

* core(test): test SIMD CPP emulation code (intrin_cpp.hpp)

* core(simd): eliminate build warnings from intrin_cpp.hpp
2019-11-19 21:08:45 +03:00
clunietp
2185bce4b7 Fix 13577 2019-11-18 07:41:34 -05:00
Alexander Alekhin
6773b938b3 Merge pull request #15896 from alalek:build_gcc_9 2019-11-14 14:22:02 +00:00
Christoph Bachhuber
c638f085aa Refactor for clarity and avoiding code duplication
Implement GArik's comments

Remove unnecessary c_str()

Fix brace position
2019-11-12 19:22:42 +01:00
Alexander Alekhin
7ecdcf6ca6 build: GCC9 compilation 2019-11-12 18:49:34 +03:00
Alexander Alekhin
e9dcecf9b4 Merge pull request #15826 from alalek:cmake_fix_itt_define_condition 2019-11-10 09:22:22 +00:00
Alexander Alekhin
d66aa2e0ff Merge pull request #15848 from alalek:backport_test_15842 2019-11-07 16:48:41 +00:00
Alexander Alekhin
54d9597522 Merge pull request #15814 from i-murzov:3.4-ocl-cleanup 2019-11-06 09:54:29 +00:00
Igor Murzov
cdbfdcc363 Fix OpenCL device detection when some OpenCL platform has no devices
It's not an error if some OpenCL platform has no devices. This makes
OpenCL device detection work correctly in the following scenario:

$ OPENCV_OPENCL_DEVICE=:GPU: ./opencv_test_dnn

OpenCV version: 4.1.2-dev
OpenCV VCS version: 4.1.2-80-g467748ee98-dirty
Build type: Debug
Compiler: /usr/bin/g++  (ver 7.4.0)
Parallel framework: pthreads
CPU features: SSE SSE2 SSE3 *SSE4.1 *SSE4.2 *FP16 *AVX *AVX2 *AVX512-SKX?
Intel(R) IPP version: ippIP AVX2 (l9) 2019.0.0 Gold (-) Jul 24 2018
OpenCL Platforms:
    AMD Accelerated Parallel Processing
    Portable Computing Language
        CPU: pthread-AMD Ryzen 7 2700X Eight-Core Processor (OpenCL 1.2 pocl HSTR: pthread-x86_64-pc-linux-gnu-znver1)
    NVIDIA CUDA
        dGPU: GeForce GTX 1080 (OpenCL 1.2 CUDA)
Current OpenCL device:
    Type = dGPU
    Name = GeForce GTX 1080
    Version = OpenCL 1.2 CUDA
    Driver version = 430.26
2019-11-05 20:02:39 +03:00
TH3CHARLie
2c2716de0f core(test): add test for YAML parse multiple documents
- added removal of temporary file
2019-11-05 18:39:07 +03:00
Igor Murzov
6d5b900324 Simplify OpenCL info dumping code:
* Reduce code nesting
* Drop redundant .c_str() calls
2019-11-05 14:49:49 +03:00