Pierre Chatelier
60b806f9b8
Merge pull request #22947 from chacha21:hasNonZero
...
Added cv::hasNonZero() #22947
`cv::hasNonZero()` is semantically equivalent to `cv::countNonZero() > 0`, but it stops scanning the image as soon as a non-zero value is found, which yields a performance gain.
- [X] I agree to contribute to the project under Apache 2 License.
- [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [X] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
This pull request might be refused, but I submit it to know if further work is needed or if I just stop working on it.
The idea is only a performance gain vs `countNonZero()>0` at the cost of more code.
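As a minimal illustration of the idea, here is a plain C++ sketch on a `std::vector` (not the code of this PR; the names `countNonZeroSketch` and `hasNonZeroSketch` are hypothetical). The counting version always visits every element, while the early-exit version returns at the first non-zero value:

```cpp
#include <cstddef>
#include <vector>

// Reference behaviour: always visits every element.
std::size_t countNonZeroSketch(const std::vector<int>& data)
{
    std::size_t count = 0;
    for (int v : data)
        count += (v != 0);
    return count;
}

// Early-exit variant: returns as soon as one non-zero element is seen.
bool hasNonZeroSketch(const std::vector<int>& data)
{
    for (int v : data)
        if (v != 0)
            return true;
    return false;
}
```

Both functions agree on the boolean answer for any input; only the amount of work differs.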
Reasons why it might be refused:
- this is just more code
- the execution time is "unfair"/"unpredictable" since it depends on the position of the first non-zero value
- the user must be aware that the search always runs from the first row/col to the last row/col and cannot be customized, even when the use case indicates where a non-zero value is likely to be found
- the PR in its current state uses a mere `countNonZero()>0` for the ocl implementation; there is little point in trying to exit the ocl kernel call early when a non-zero value is encountered. So the ocl implementation does not bring any improvement.
- there is no IPP function that can help (`countNonZero()` is based on `ippCountInRange`)
- the PR in its current state might be slower than a call to `countNonZero()>0` in some cases (see "challenges" below)
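To make the "unpredictable" execution time concrete, here is a small instrumented sketch (plain C++, hypothetical names `ScanResult` and `scanRowMajor`, not part of this PR) that reports how many elements a first-to-last scan reads before it can answer:

```cpp
#include <cstddef>
#include <vector>

struct ScanResult
{
    bool found;            // true if a non-zero element was seen
    std::size_t inspected; // number of elements read before returning
};

// Scans a row-major buffer from the first element to the last one
// and stops at the first non-zero value.
ScanResult scanRowMajor(const std::vector<unsigned char>& data)
{
    for (std::size_t i = 0; i < data.size(); ++i)
        if (data[i] != 0)
            return {true, i + 1};
    return {false, data.size()};
}
```

For a 32x32 buffer (1024 elements), a non-zero value at index 0 means 1 element is inspected, at index 512 it means 513, and at index 1023 (or with no non-zero value at all) it means all 1024.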
Reasons why it might be accepted:
- the performance gain is huge on average, if we consider that "on average" means "non zero in the middle of the image"
- the "missing" IPP implementation is replaced by an "OpenCV universal intrinsics" implementation
- the PR in its current state is almost always faster than a call to `countNonZero()>0`, is only slightly slower in the worst cases, and not even for all matrices
**Challenges**
The worst case is either an all-zero matrix or a non-zero value at the very last position. In such a case, the `hasNonZero()` implementation will parse the whole matrix like `countNonZero()` would do. Ideally the performance should be the same in this case, but `ippCountInRange` is hard to beat!
There is also the case of very small 8-bit matrices (<=32x32...), where the SIMD units can be hard to feed.
For all cases but the worst, my custom `hasNonZero()` performs better than `ippCountInRange()`.
For the worst case, my custom `hasNonZero()` performs better than `ippCountInRange()` *except for large matrices of type CV_32S or CV_64F* (but surprisingly, not CV_32F).
The difference is small, but it exists (and I don't understand why).
For very small CV_8U matrices `ippCountInRange()` seems unbeatable.
Here is the code that I use to check timings
```
// Test cv::hasNonZero() vs (cv::countNonZero()>0) for different matrix sizes, types, strides...
// Requires <opencv2/core.hpp>, <algorithm>, <cstdio>, <iostream> and <vector>.
{
    cv::setRNGSeed(1234);
    const std::vector<cv::Size> sizes = {{32, 32}, {64, 64}, {128, 128}, {320, 240}, {512, 512}, {640, 480}, {1024, 768}, {2048, 2048}, {1031, 1000}};
    const std::vector<int> types = {CV_8U, CV_16U, CV_32S, CV_32F, CV_64F};
    const size_t iterations = 1000;
    for(const cv::Size& size : sizes)
    {
        for(const int type : types)
        {
            for(int c = 0 ; c<2 ; ++c)
            {
                // c == 0: continuous matrix; c == 1: ROI of a twice-wider matrix (non-continuous)
                const bool continuous = !c;
                for(int i = 0 ; i<4 ; ++i)
                {
                    // i: 0 = non-zero near the beginning, 1 = middle, 2 = end, 3 = all-zero matrix
                    cv::Mat m = continuous ? cv::Mat::zeros(size, type) : cv::Mat(cv::Mat::zeros(cv::Size(2*size.width, size.height), type), cv::Rect(cv::Point(0, 0), size));
                    const bool nz = (i <= 2);
                    const unsigned int nzOffsetRange = 10;
                    const unsigned int nzOffset = cv::randu<unsigned int>()%nzOffsetRange;
                    const cv::Point pos =
                        (i == 0) ? cv::Point(nzOffset, 0) :
                        (i == 1) ? cv::Point(size.width/2-nzOffsetRange/2+nzOffset, size.height/2) :
                        (i == 2) ? cv::Point(size.width-1-nzOffset, size.height-1) :
                        cv::Point(0, 0);
                    std::cout << "============================================================" << std::endl;
                    std::cout << "size:" << size << " type:" << type << " continuous = " << (continuous ? "true" : "false") << " iterations:" << iterations << " nz=" << (nz ? "true" : "false");
                    std::cout << " pos=" << ((i == 0) ? "begin" : (i == 1) ? "middle" : (i == 2) ? "end" : "none");
                    std::cout << std::endl;
                    // Set a single element (selected by the mask) to 1, or leave the matrix all-zero when nz is false
                    cv::Mat mask = cv::Mat::zeros(size, CV_8UC1);
                    mask.at<unsigned char>(pos) = 0xFF;
                    m.setTo(cv::Scalar::all(0));
                    m.setTo(cv::Scalar::all(nz ? 1 : 0), mask);
                    std::vector<bool> results;
                    std::vector<double> timings;
                    // Timing of cv::hasNonZero() (index 0 in results/timings)
                    {
                        bool res = false;
                        auto ref = cv::getTickCount();
                        for(size_t k = 0 ; k<iterations ; ++k)
                            res = cv::hasNonZero(m);
                        auto now = cv::getTickCount();
                        const bool error = (res != nz);
                        if (error)
                            printf("!!ERROR!!\r\n");
                        results.push_back(res);
                        timings.push_back(1000.*(now-ref)/cv::getTickFrequency());
                    }
                    // Timing of cv::countNonZero()>0 (index 1 in results/timings)
                    {
                        bool res = false;
                        auto ref = cv::getTickCount();
                        for(size_t k = 0 ; k<iterations ; ++k)
                            res = (cv::countNonZero(m)>0);
                        auto now = cv::getTickCount();
                        const bool error = (res != nz);
                        if (error)
                            printf("!!ERROR!!\r\n");
                        results.push_back(res);
                        timings.push_back(1000.*(now-ref)/cv::getTickFrequency());
                    }
                    // Only print details when cv::hasNonZero() is not the fastest, or when a result is wrong
                    const size_t bestTimingIndex = (std::min_element(timings.begin(), timings.end())-timings.begin());
                    if ((bestTimingIndex != 0) || (std::find_if_not(results.begin(), results.end(), [&](bool r) {return (r == nz);}) != results.end()))
                    {
                        std::cout << "cv::hasNonZero\t\t=>" << results[0] << ((results[0] != nz) ? " ERROR" : "") << " perf:" << timings[0] << "ms => " << (iterations/timings[0]*1000) << " im/s" << ((bestTimingIndex == 0) ? " * " : "") << std::endl;
                        std::cout << "cv::countNonZero\t=>" << results[1] << ((results[1] != nz) ? " ERROR" : "") << " perf:" << timings[1] << "ms => " << (iterations/timings[1]*1000) << " im/s" << ((bestTimingIndex == 1) ? " * " : "") << std::endl;
                    }
                }
            }
        }
    }
}
```
Here is a report of this benchmark (it only reports timings when `cv::countNonZero()` is faster)
My CPU is an Intel Core i7-4790 @ 3.60 GHz.
```
============================================================
size:[32 x 32] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:0 continuous = false iterations:1000 nz=true pos=middle
cv::hasNonZero =>1 perf:0.353764ms => 2.82674e+06 im/s
cv::countNonZero =>1 perf:0.282044ms => 3.54555e+06 im/s *
============================================================
size:[32 x 32] type:0 continuous = false iterations:1000 nz=true pos=end
cv::hasNonZero =>1 perf:0.610478ms => 1.63806e+06 im/s
cv::countNonZero =>1 perf:0.283182ms => 3.5313e+06 im/s *
============================================================
size:[32 x 32] type:0 continuous = false iterations:1000 nz=false pos=none
cv::hasNonZero =>0 perf:0.630115ms => 1.58701e+06 im/s
cv::countNonZero =>0 perf:0.282044ms => 3.54555e+06 im/s *
============================================================
size:[32 x 32] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:5 continuous = false iterations:1000 nz=true pos=end
cv::hasNonZero =>1 perf:0.607347ms => 1.64651e+06 im/s
cv::countNonZero =>1 perf:0.467037ms => 2.14116e+06 im/s *
============================================================
size:[32 x 32] type:5 continuous = false iterations:1000 nz=false pos=none
cv::hasNonZero =>0 perf:0.618162ms => 1.6177e+06 im/s
cv::countNonZero =>0 perf:0.468175ms => 2.13595e+06 im/s *
============================================================
size:[32 x 32] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[32 x 32] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[32 x 32] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[32 x 32] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[32 x 32] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[64 x 64] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[64 x 64] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[64 x 64] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[64 x 64] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[128 x 128] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[128 x 128] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[128 x 128] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[128 x 128] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[320 x 240] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[320 x 240] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[320 x 240] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[320 x 240] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[512 x 512] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[512 x 512] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[512 x 512] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[512 x 512] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[640 x 480] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[640 x 480] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[640 x 480] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[640 x 480] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1024 x 768] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1024 x 768] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1024 x 768] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1024 x 768] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[2048 x 2048] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[2048 x 2048] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[2048 x 2048] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[2048 x 2048] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[2048 x 2048] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[2048 x 2048] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[2048 x 2048] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[2048 x 2048] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[2048 x 2048] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:4 continuous = true iterations:1000 nz=true pos=end
cv::hasNonZero =>1 perf:895.381ms => 1116.84 im/s
cv::countNonZero =>1 perf:882.569ms => 1133.06 im/s *
============================================================
size:[2048 x 2048] type:4 continuous = true iterations:1000 nz=false pos=none
cv::hasNonZero =>0 perf:899.53ms => 1111.69 im/s
cv::countNonZero =>0 perf:870.894ms => 1148.24 im/s *
============================================================
size:[2048 x 2048] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[2048 x 2048] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[2048 x 2048] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[2048 x 2048] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[2048 x 2048] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[2048 x 2048] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[2048 x 2048] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:6 continuous = true iterations:1000 nz=true pos=end
cv::hasNonZero =>1 perf:2018.92ms => 495.313 im/s
cv::countNonZero =>1 perf:1966.37ms => 508.552 im/s *
============================================================
size:[2048 x 2048] type:6 continuous = true iterations:1000 nz=false pos=none
cv::hasNonZero =>0 perf:2005.87ms => 498.537 im/s
cv::countNonZero =>0 perf:1992.78ms => 501.812 im/s *
============================================================
size:[2048 x 2048] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[2048 x 2048] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[2048 x 2048] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[2048 x 2048] type:6 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:0 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:0 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:0 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:0 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:0 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:0 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:0 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:0 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:2 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:2 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:2 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:2 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:2 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:2 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:2 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:2 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:4 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:4 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:4 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:4 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:4 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:4 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:4 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:4 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:5 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:5 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:5 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:5 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:5 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:5 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:5 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:5 continuous = false iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:6 continuous = true iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:6 continuous = true iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:6 continuous = true iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:6 continuous = true iterations:1000 nz=false pos=none
============================================================
size:[1031 x 1000] type:6 continuous = false iterations:1000 nz=true pos=begin
============================================================
size:[1031 x 1000] type:6 continuous = false iterations:1000 nz=true pos=middle
============================================================
size:[1031 x 1000] type:6 continuous = false iterations:1000 nz=true pos=end
============================================================
size:[1031 x 1000] type:6 continuous = false iterations:1000 nz=false pos=none
done
```
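In the entries of the log above that still carry timings (`pos=end` and `pos=none`), both functions have to scan the whole image, and the two measurements stay within a few percent of each other. A minimal sketch of why that is, assuming a flat buffer instead of a `cv::Mat`: the names `countNonZeroSketch`, `hasNonZeroSketch`, `Pos` and `makeBuffer` are hypothetical and are not part of OpenCV, and this is not the implementation from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Full scan: visits every element, in the spirit of countNonZero().
// (Hypothetical helper, not the OpenCV implementation.)
template <typename T>
std::size_t countNonZeroSketch(const std::vector<T>& data) {
    return static_cast<std::size_t>(std::count_if(
        data.begin(), data.end(), [](const T& v) { return v != T(0); }));
}

// Early exit: std::any_of stops at the first non-zero element,
// which is the idea behind hasNonZero().
template <typename T>
bool hasNonZeroSketch(const std::vector<T>& data) {
    return std::any_of(data.begin(), data.end(),
                       [](const T& v) { return v != T(0); });
}

// Mirrors the pos=begin/middle/end/none cases of the log (assumed meaning:
// where the single non-zero element sits in the buffer).
enum class Pos { Begin, Middle, End, None };

template <typename T>
std::vector<T> makeBuffer(std::size_t n, Pos pos) {
    std::vector<T> data(n, T(0));
    if (n == 0 || pos == Pos::None) return data;
    const std::size_t idx =
        pos == Pos::Begin ? 0 : (pos == Pos::Middle ? n / 2 : n - 1);
    data[idx] = T(1);
    return data;
}
```

With `Pos::End` or `Pos::None` the early-exit version walks the entire buffer just like the counting version, so it cannot be faster there; any gain can only come from `Pos::Begin` and `Pos::Middle`.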
2023-06-09 13:37:20 +03:00
Tinson Lai
f8f425e34c
Change custom_hal.hpp output location
2023-02-03 18:21:15 +08:00
Yuantao Feng
c63d79c5b1
Merge pull request #23095 from fengyuentau:fix_omp_macos
...
* fix openmp include and link issue on macos
* turn off have_openmp if OpenMP_CXX_INCLUDE_DIRS is empty
* test commit
* use condition HAVE_OPENMP and OpenMP_CXX_LIBRARIES for linking
* remove trailing whitespace
* remove notes
* update conditions
* use OpenMP_CXX_LIBRARIES for linking
2023-01-16 12:44:13 +03:00
Alexander Alekhin
2ebdc04787
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2022-08-14 15:50:42 +00:00
Alexander Alekhin
d0d115321d
Merge pull request #22350 from alalek:rework_psabi_warning
2022-08-13 15:05:41 +00:00
Alexander Alekhin
44b2f9637a
Revert "suppress warning on GCC 7 and later"
...
This reverts commit a630ad73cb.
2022-08-07 15:43:10 +03:00
Tomoaki Teshima
b3269b08a1
neon: add dotprod dispatch implementation
...
* read vector at runtime
* add enum
2022-07-20 19:25:39 +09:00
Tomoaki Teshima
a630ad73cb
suppress warning on GCC 7 and later
2022-07-06 23:31:31 +09:00
Joel Winarske
0769bf416f
highgui Wayland xdg_shell
...
-enable using -DWITH_WAYLAND=ON
-adapted from https://github.com/pfpacket/opencv-wayland
-using xdg_shell stable protocol
-overrides HAVE_QT if HAVE_WAYLAND and WITH_WAYLAND are set
Signed-off-by: Joel Winarske <joel.winarske@gmail.com>
Co-authored-by: Ryo Munakata <afpacket@gmail.com>
2022-06-26 12:11:09 -07:00
Alexander Alekhin
c80b270678
cmake: force lowercase plugins internal names
2021-12-21 16:34:48 +00:00
Alexander Alekhin
d24befa0bc
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2021-12-11 15:18:57 +00:00
Alexander Alekhin
65392d5e6b
cmake: fix OPENGL_LIBRARIES handling
2021-12-07 12:12:42 +00:00
Francesco Petrogalli
d29c7e7871
Merge pull request #20392 from fpetrogalli:aarch64-semihosting
...
AArch64 semihosting
* [ts] Disable filesystem support in the TS module.
Because of this change, all the tests loading data will fail, but at
least the core module can be tested with the following line:
opencv_test_core --gtest_filter=-"*Core_InputOutput*:*Core_globbing.accuracy*"
* [aarch64] Build OpenCV for AArch64 semihosting.
This patch provides a toolchain file that allows building the library
for semihosting applications [1]. Minimal changes have been applied to
the code to be able to compile with a bare-metal toolchain.
[1] https://developer.arm.com/documentation/100863/latest
The option `CV_SEMIHOSTING` is used to guard the bits in the code that
are specific to the target.
To build the code:
cmake ../opencv/ \
-DCMAKE_TOOLCHAIN_FILE=../opencv/platforms/semihosting/aarch64-semihosting.toolchain.cmake \
-DSEMIHOSTING_TOOLCHAIN_PATH=/path/to/baremetal-toolchain/bin/ \
-DBUILD_EXAMPLES=ON -GNinja
A bare-metal toolchain for targeting aarch64 semihosting can be found
at [2], under `aarch64-none-elf`.
[2] https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-a/downloads
The folder `samples/semihosting` provides two example semihosting
applications.
The two binaries can be executed on the host platform with:
qemu-aarch64 ./bin/example_semihosting_histogram
qemu-aarch64 ./bin/example_semihosting_norm
Similarly, the test and perf executables of the modules can be run
with:
qemu-aarch64 ./bin/opencv_[test|perf]_<module>
Notice that filesystem support is disabled by the toolchain file,
hence some of the tests that depend on filesystem support will fail.
* [semihosting] Remove blank line at the end of file. [NFC]
The spurious blank line was reported by
https://pullrequest.opencv.org/buildbot/builders/precommit_docs/builds/31158 .
* [semihosting] Make the raw pixel file generation OS independent.
Use the facilities provided by Cmake to generate the header file
instead of a shell script, so that the build doesn't fail on systems
that do not have a unix shell.
* [semihosting] Rename variable for semihosting compilation.
* [semihosting] Move the cmake configuration to a variable file.
* [semihosting] Make the guard macro private for the core module.
* [semihosting] Remove space. [NFC]
* [semihosting] Improve comment with information about semihosting. [NFC]
* [semihosting] Update license statement on top of source file. [NFC]
* [semihosting] Replace BM_SUFFIX with SEMIHOSTING_SUFFIX. [NFC]
* [semihosting] Remove double space. [NFC]
* [semihosting] Add some text output to the sample applications.
* [semihosting] Remove duplicate entry in cmake configuration. [NFCI]
* [semihosting] Replace `long` with `int` in sample apps. [NFCI]
* [semihosting] Use `configure_file` to create the random pixels. [NFCI]
* [semihosting][bugfix] Fix name of cmakedefine variable.
* [semihosting][samples] Use CV_8UC1 for grayscale images. [NFCI]
* [semihosting] Add readme file.
* [semihosting] Remove blank line at the end of README. [NFC]
This fixes the failure at
https://pullrequest.opencv.org/buildbot/builders/precommit_docs/builds/31272 .
2021-07-21 18:46:05 +03:00
Francesco Petrogalli
b928ebdd53
Merge pull request #19985 from fpetrogalli:disable_threads
...
* [build][option] Introduce `OPENCV_DISABLE_THREAD_SUPPORT` option.
The option forces the library to build without thread support.
* update handling of OPENCV_DISABLE_THREAD_SUPPORT
- reduce amount of #if conditions
* [to squash] cmake: apply mode vars in toolchains too
Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
2021-07-08 20:21:21 +00:00
Alexander Alekhin
a41394c885
core(parallel): fix plugins handling if no filesystem available
2021-03-18 23:05:12 +00:00
Alexander Alekhin
cbfd38bd41
core: rework code locality
...
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat::eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above reduce the FFmpeg wrapper size by ~1.5 MB (the initial size of the OpenCV part is about 3 MB)
backport is done to improve merge experience (fewer conflicts)
backport of commit: 65eb946756
2021-03-02 23:24:28 +00:00
Alexander Alekhin
65eb946756
core: rework code locality
...
- to reduce binaries size of FFmpeg Windows wrapper
- MinGW linker doesn't support -ffunction-sections (used for FFmpeg Windows wrapper)
- move code to improve locality with its used dependencies
- move UMat::dot() to matmul.dispatch.cpp (Mat::dot() is already there)
- move UMat::inv() to lapack.cpp
- move UMat::mul() to arithm.cpp
- move UMat::eye() to matrix_operations.cpp (near setIdentity() implementation)
- move normalize(): convert_scale.cpp => norm.cpp
- move convertAndUnrollScalar(): arithm.cpp => copy.cpp
- move scalarToRawData(): array.cpp => copy.cpp
- move transpose(): matrix_operations.cpp => matrix_transform.cpp
- move flip(), rotate(): copy.cpp => matrix_transform.cpp (rotate90 uses flip and transpose)
- add 'OPENCV_CORE_EXCLUDE_C_API' CMake variable to exclude compilation of C-API functions from the core module
- matrix_wrap.cpp: add compile-time checks for CUDA/OpenGL calls
- the steps above reduce the FFmpeg wrapper size by ~1.5 MB (the initial size of the OpenCV part is about 3 MB)
2021-03-02 11:27:58 +00:00
Alexander Alekhin
3dd55d284d
core(libva): use dynamic loader
2021-02-19 10:32:59 +00:00
Alexander Alekhin
cc73c36e32
core(parallel): plugins support
2021-02-15 17:07:36 +00:00
Alexander Alekhin
37c12db366
Merge pull request #19365 from alalek:parallel_api
2021-01-27 18:12:15 +00:00
Alexander Alekhin
b73bf03bfc
core: parallel backends API
...
- allow replacing the parallel_for() backend
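A minimal sketch of what a replaceable parallel-for backend can look like. Every name here (`ParallelBackend`, `SerialBackend`, `ChunkedBackend`, `setBackend`, `parallelFor`) is hypothetical and chosen for illustration only; this is not the API introduced by the commit. The chunked backend runs stripes one after another and merely stands in for the place where a real backend would hand work to a thread pool.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Body invoked on a half-open index range [begin, end).
using RangeBody = std::function<void(int begin, int end)>;

// Hypothetical backend interface: anything able to execute a body over a range.
struct ParallelBackend {
    virtual ~ParallelBackend() = default;
    virtual void run(int begin, int end, const RangeBody& body) = 0;
};

// Default backend: a single call covering the whole range.
struct SerialBackend : ParallelBackend {
    void run(int begin, int end, const RangeBody& body) override {
        if (begin < end) body(begin, end);
    }
};

// Alternative backend: splits the range into stripes of at most `chunk` items.
// A real backend would dispatch each stripe to a worker instead of looping.
struct ChunkedBackend : ParallelBackend {
    explicit ChunkedBackend(int chunkSize) : chunk(chunkSize > 0 ? chunkSize : 1) {}
    void run(int begin, int end, const RangeBody& body) override {
        for (int s = begin; s < end; s += chunk)
            body(s, std::min(s + chunk, end));
    }
    int chunk;
};

// Process-wide backend slot, serial by default.
inline std::shared_ptr<ParallelBackend>& currentBackend() {
    static std::shared_ptr<ParallelBackend> backend = std::make_shared<SerialBackend>();
    return backend;
}

// Swap the active backend; callers of parallelFor() do not change.
inline void setBackend(std::shared_ptr<ParallelBackend> backend) {
    assert(backend && "backend must not be null");
    currentBackend() = std::move(backend);
}

// Entry point used by algorithms; forwards to whichever backend is active.
inline void parallelFor(int begin, int end, const RangeBody& body) {
    currentBackend()->run(begin, end, body);
}
```

The point of the indirection is that code calling `parallelFor()` stays untouched when the backend is swapped via `setBackend()`.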
2021-01-27 14:15:33 +00:00
Alexander Alekhin
cd68cc1f46
Merge pull request #19195 from diablodale:win32AlignAlloc
2020-12-23 17:33:58 +00:00
Dale Phurrough
109255a730
add windows native aligned malloc + unit test case
...
* implements https://github.com/opencv/opencv/issues/19147
* CAUTION: this PR will only function safely in the
4+ branches that already include PR 19029
* CAUTION: this PR requires thread-safe startup of the alloc.cpp
translation unit as implemented in PR 19029
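For illustration, a rough sketch of the general technique of wrapping the platform's native aligned allocator. The helper names `alignedAlloc` and `alignedFree` are hypothetical and are not the functions touched by this PR; the sketch assumes `_aligned_malloc`/`_aligned_free` on Windows and `posix_memalign`/`free` elsewhere, and leaves out the thread-safe startup concern mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free (POSIX)
#if defined(_WIN32)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// Hypothetical helper: returns `size` bytes aligned to `alignment`, or nullptr on failure.
// `alignment` must be a power of two and at least sizeof(void*).
inline void* alignedAlloc(std::size_t size, std::size_t alignment)
{
    assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
#if defined(_WIN32)
    // Native Windows aligned allocation; must be released with _aligned_free().
    return _aligned_malloc(size, alignment);
#else
    void* ptr = nullptr;
    if (posix_memalign(&ptr, alignment, size) != 0)
        return nullptr;
    return ptr;
#endif
}

// Hypothetical helper: releases memory obtained from alignedAlloc().
inline void alignedFree(void* ptr)
{
#if defined(_WIN32)
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

Memory from the Windows branch must never be passed to plain `free()`, which is why allocation and release are paired behind one pair of helpers.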
2020-12-23 14:59:28 +01:00
Alexander Alekhin
de385009ae
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2020-12-09 18:09:00 +00:00
Alexander Alekhin
26e8048a0a
core: update handling of allocator stats type
...
- don't use OPENCV_ALLOCATOR_STATS_COUNTER_TYPE definition in non C++11 builds
- don't use with MinGW
2020-12-05 20:54:47 +00:00
Odianosen Ejale
862fc06b6f
Fixed and updated OpenCL-VA interoperability
2020-09-25 16:11:50 +03:00
Giles Payne
02385472b6
Merge pull request #17165 from komakai:objc-binding
...
Objc binding
* Initial work on Objective-C wrapper
* Objective-C generator script; update manually generated wrappers
* Add Mat tests
* Core Tests
* Imgproc wrapper generation and tests
* Fixes for Imgcodecs wrapper
* Miscellaneous fixes. Swift build support
* Objective-C wrapper build/install
* Add Swift wrappers for videoio/objdetect/feature2d
* Framework build;iOS support
* Fix toArray functions;Use enum types whenever possible
* Use enum types where possible;prepare test build
* Update test
* Add test runner scripts for iOS and macOS
* Add test scripts and samples
* Build fixes
* Fix build (cmake 3.17.x compatibility)
* Fix warnings
* Fix enum name conflicting handling
* Add support for document generation with Jazzy
* Swift/Native fast accessor functions
* Add Objective-C wrapper for calib3d, dnn, ml, photo and video modules
* Remove IntOut/FloatOut/DoubleOut classes
* Fix iOS default test platform value
* Fix samples
* Revert default framework name to opencv2
* Add converter util functions
* Fix failing test
* Fix whitespace
* Add handling for deprecated methods;fix warnings;define __OPENCV_BUILD
* Suppress cmake warnings
* Reduce severity of "jazzy not found" log message
* Fix incorrect #include of compatibility header in ios.h
* Use explicit returns in subscript/get implementation
* Reduce minimum required cmake version to 3.15 for Objective-C/Swift binding
2020-06-08 18:32:53 +00:00
Alexander Alekhin
ca23c0e630
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2020-03-17 13:23:33 +03:00
Alexander Alekhin
4e56c1326f
core: adjust type of allocator_stats counter, allow to disable
2020-03-11 20:12:29 +03:00
Alexander Alekhin
ba7b0f4c54
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-12-15 11:23:46 +00:00
Alexander Alekhin
a45928045a
Merge pull request #16150 from alalek:cmake_avoid_deprecated_link_private
...
* cmake: avoid deprecated LINK_PRIVATE/LINK_PUBLIC
see CMP0023 (CMake 2.8.12+)
* cmake: fix 3rdparty list
- don't include OpenCV modules
2019-12-13 17:52:40 +03:00
Alexander Alekhin
b6a58818bb
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-11-11 20:25:42 +00:00
Alexander Alekhin
657c17bb8c
cmake: fix ITT define condition
2019-11-01 15:07:49 +03:00
Alexander Alekhin
65573784c4
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-10-09 19:46:18 +00:00
Sayed Adel
f2fe6f40c2
Merge pull request #15510 from seiko2plus:issue15506
...
* core: rework and optimize SIMD implementation of dotProd
- add new universal intrinsics v_dotprod[int32], v_dotprod_expand[u&int8, u&int16, int32], v_cvt_f64(int64)
- add a boolean param to all v_dotprod&_expand intrinsics that changes the order of addition between
pairs on some platforms, to reach maximum optimization when only the sum across all lanes matters
- fix clang build on ppc64le
- support wide universal intrinsics for dotProd_32s
- remove raw SIMD and activate universal intrinsics for dotProd_8
- implement SIMD optimization for dotProd_s16&u16
- extend performance test data types of dotprod
- fix GCC VSX workaround of vec_mule and vec_mulo (in little-endian it must be swapped)
- optimize v_mul_expand(int32) on VSX
* core: remove boolean param from v_dotprod&_expand and implement v_dotprod_fast&v_dotprod_expand_fast
these changes were made following the review by "terfendail"
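A scalar model of why a "fast" dot-product variant may pair elements differently. This is only a sketch: the 8-lane layout, the pairing used by `dotprodFast`, and all function names are assumptions for illustration, not the actual universal intrinsics. Individual output lanes differ between the two variants, but the sum over all lanes is identical, which is all a full dot product needs.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed model: eight int16 lanes reduced to four int32 lanes.
using Vec16 = std::array<std::int16_t, 8>;
using Vec32 = std::array<std::int32_t, 4>;

// Ordered variant: lane i holds the sum of the ADJACENT pair (2i, 2i+1).
// Inputs are assumed not to hit the INT16_MIN * INT16_MIN pair overflow case.
inline Vec32 dotprodOrdered(const Vec16& a, const Vec16& b)
{
    Vec32 r{};
    for (int i = 0; i < 4; ++i)
        r[i] = std::int32_t(a[2 * i]) * b[2 * i]
             + std::int32_t(a[2 * i + 1]) * b[2 * i + 1];
    return r;
}

// "Fast" variant (hypothetical pairing): lane i sums elements i and i + 4.
// A platform may prefer such a pairing when it avoids extra shuffles.
inline Vec32 dotprodFast(const Vec16& a, const Vec16& b)
{
    Vec32 r{};
    for (int i = 0; i < 4; ++i)
        r[i] = std::int32_t(a[i]) * b[i]
             + std::int32_t(a[i + 4]) * b[i + 4];
    return r;
}

// Horizontal sum across all lanes.
inline std::int64_t reduceSum(const Vec32& v)
{
    std::int64_t s = 0;
    for (std::int32_t x : v) s += x;
    return s;
}
```

Callers that inspect individual lanes need the ordered variant; callers that only reduce the result can use either.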
2019-10-07 22:01:35 +03:00
Alexander Alekhin
19a4b51371
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-08-16 18:48:08 +03:00
Alexander Alekhin
5ef548a985
cmake: update initialization
2019-08-08 15:23:16 +03:00
Alexander Alekhin
f3de2b4be7
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-06-05 19:11:52 +03:00
Vitaly Tuzov
3b015dfc7d
Merge pull request #14210 from terfendail:wui_512
...
AVX512 wide universal intrinsics (#14210)
* Added implementation of 512-bit wide universal intrinsics(WIP)
* Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP)
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetic
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks
* Added implementation of 512-bit wide universal intrinsics(WIP): build fixes
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros
* Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask()
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines
* Added implementation of 512-bit wide universal intrinsics(WIP): Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable.
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask.
* Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512()
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build
* Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.
2019-06-03 18:05:35 +03:00
Alexander Alekhin
142a524d2f
cmake: fix CUDA world build
2019-03-31 19:56:30 +00:00
Alexander Alekhin
c7c987843c
cmake: emit error if CUDA is enabled without opencv_contrib
2019-03-28 16:51:11 +03:00
Alexander Alekhin
8c25a8eb7b
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-03-22 19:31:31 +03:00
Sayed Adel
f41359688b
core:vsx Add support for VSX3 half precision conversions
2019-03-20 10:19:42 +02:00
Alexander Alekhin
c3cf35ab63
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-02-26 17:34:42 +03:00
Alexander Alekhin
fd49ee5f39
core: dispatch merge.cpp
2019-02-23 15:42:26 +00:00
Alexander Alekhin
91d152e2c2
core: dispatch split.cpp
2019-02-22 09:54:31 +00:00
Alexander Alekhin
8bde6aea4b
Merge remote-tracking branch 'upstream/3.4' into merge-3.4
2019-02-19 19:49:13 +00:00
Alexander Alekhin
dc84cf9914
core: dispatch mean.cpp
2019-02-19 16:58:32 +03:00
Alexander Alekhin
cd66f6e3db
core: dispatch matmul
...
- gemm: keep baseline only (lapack is 10x+ faster, let's reduce binary size)
- transform / distTransform
- scaleAdd (32f/64f only)
- Mahalanobis: keep baseline only (no perf tests)
- mulTransposed: keep baseline only (no perf tests)
- dot
2019-02-18 14:36:46 +03:00
Alexander Alekhin
e3633ec4a2
core: dispatch count_non_zero
2019-02-14 13:16:20 +03:00