04-22-2024 04:22 PM - edited 04-22-2024 04:26 PM
@paul_a_cardinale wrote:
More lookup, less multiply:
...
By the way, your new algorithm produces a slightly different result than the old one (at least the equality comparison fails, so something may be off with rounding), and it is slower than the previous one (roughly 20% on my PC: 120 ms vs. 100 ms before, when a 256x256 image is scaled by a factor of 0.9):
Anyway, I checked this with a checkerboard and a test image, and now I see the effect of the gamma (the screenshot may look wrong if the browser scales it). This is gamma 2.2 and scale factor 0.9:
and a regular image looks normal with this gamma, at least visually:
So, you're on the right track.
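For readers wondering why a checkerboard makes the gamma visible: when a resampler blends a black and a white pixel, blending the stored 8-bit values and blending the linear light they represent give noticeably different grays. The following is a minimal sketch of that difference only, not the VI from this thread; the function names are made up for illustration and a pure power-law gamma of 2.2 is assumed.

```python
GAMMA = 2.2  # assumed pure power law, matching the gamma used in the test above


def to_linear(v8, gamma=GAMMA):
    """Convert an 8-bit encoded value (0..255) to linear light (0.0..1.0)."""
    return (v8 / 255.0) ** gamma


def to_encoded(lin, gamma=GAMMA):
    """Convert linear light (0.0..1.0) back to an 8-bit encoded value."""
    return round(255.0 * lin ** (1.0 / gamma))


def average_naive(a, b):
    """Blend two pixels directly on the encoded values (ignores gamma)."""
    return (a + b) // 2


def average_linear(a, b, gamma=GAMMA):
    """Blend two pixels in linear light, then re-apply the gamma."""
    mixed = (to_linear(a, gamma) + to_linear(b, gamma)) / 2.0
    return to_encoded(mixed, gamma)


if __name__ == "__main__":
    # One black and one white checkerboard cell merged into a single pixel.
    print("naive :", average_naive(0, 255))
    print("linear:", average_linear(0, 255))
```

A fine checkerboard scaled without gamma handling therefore comes out darker than the same pattern looks from a distance, while ordinary photos, which have few such extreme neighbors, look nearly the same either way.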
04-22-2024 07:52 PM
I took a picture of a gray scale with my Nikon D780, then checked the RGB values:
% Reflectance    RGB value
2.5              27
5                38
10               68
20               113
40               157
80               203
Which works out to a gamma of about 1.9
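One way to turn such a table into a gamma estimate is a straight-line fit in log-log space, assuming the camera response is a pure power law, value ∝ reflectance^(1/gamma). The sketch below is only that: `fit_gamma` is a hypothetical helper, the fit leaves the white point free, and the result depends on the fitting method and on which patches are included, so it need not reproduce the figure quoted above.

```python
import math

# (reflectance in percent, measured 8-bit value) from the table above
SAMPLES = [(2.5, 27), (5, 38), (10, 68), (20, 113), (40, 157), (80, 203)]


def fit_gamma(samples):
    """Least-squares slope of log(value) vs. log(reflectance); gamma = 1/slope.

    Assumes value = k * reflectance**(1/gamma) with an unknown scale k,
    so only the slope of the log-log line matters.
    """
    xs = [math.log(r / 100.0) for r, _ in samples]
    ys = [math.log(v / 255.0) for _, v in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return 1.0 / slope


if __name__ == "__main__":
    print("estimated gamma: %.2f" % fit_gamma(SAMPLES))
```

Real cameras apply a tone curve rather than a pure power law, so the brightest and darkest patches pull the estimate in different directions; fitting only the mid-tones is another reasonable choice.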
04-23-2024 07:56 AM - edited 04-23-2024 07:59 AM
@paul_a_cardinale wrote:
More lookup, less multiply:
I hadn't noticed that you turned on parallelization for the RGB loop, therefore the overall performance was a bit slower. You also have 1024 weighting steps; now everything is clear to me.
As an idea for a small performance improvement, I would recommend replacing the Threshold 1D Array where you reapply gamma with a binary-search-based version, as was shown in the DLL. This will gain you a little. In addition, if you don't need very high precision, you can omit the "fractional part calc" and output the integer index directly, which will be almost the same (the max difference compared with the "reference" Threshold 1D Array will be only 1).
As an exercise, I've made a Malleable VI that accepts both U32 and U64. The only challenging part was how to switch between the fast and accurate versions. I added a "Switch" input, which is numeric: if it is not connected, the fast version is used, and if a BOOLEAN is connected to it, the accurate version is used, because we will be in the "Ignored" case:
The other side of the coin:
The advantage is that we can avoid the comparison (which also takes time). I'm not sure this is the most elegant way to do it, but I like it more than a polymorphic VI (comments and notes are welcome).
On my PC this VI runs 3-5 times faster than the original. Source is in the attachment.
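To make the two variants concrete in text form, here is a minimal sketch of a binary-search lookup over a non-decreasing table, with and without the interpolated fractional part. It is not the attached VI or the DLL code; the function names and the 8-bit to 16-bit gamma table are assumptions chosen for illustration.

```python
from bisect import bisect_right


def threshold_index_fast(table, y):
    """Integer index i with table[i] <= y < table[i + 1], clamped to the ends.

    Binary search only, no interpolation (the "fast" variant).
    """
    if y <= table[0]:
        return 0
    if y >= table[-1]:
        return len(table) - 1
    return bisect_right(table, y) - 1


def threshold_index_accurate(table, y):
    """Fractional index, linearly interpolated between neighboring entries.

    Mirrors what an interpolating threshold lookup returns for a
    non-decreasing table (the "accurate" variant).
    """
    i = threshold_index_fast(table, y)
    if y <= table[0] or i >= len(table) - 1:
        return float(i)
    lo, hi = table[i], table[i + 1]  # hi > lo is guaranteed by the search
    return i + (y - lo) / (hi - lo)


# Hypothetical table: 8-bit encoded value -> 16-bit linear value, gamma 2.2
TABLE = [round((i / 255.0) ** 2.2 * 65535) for i in range(256)]

if __name__ == "__main__":
    y = 30000
    print(threshold_index_fast(TABLE, y), threshold_index_accurate(TABLE, y))
```

The fast variant truncates, so after rounding the accurate result the two differ by at most one index; whether that is acceptable depends on where the lookup is used.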
04-23-2024 08:27 AM
Thank you. You anticipated my next question (I was going to ask if anyone already had a replacement for "Threshold 1D Array" that used a binary search and didn't interpolate).
Note however that in this case, the "accurate" version is essential. The fast version gives wrong results near the edges:
Fast version:
Accurate version:
04-23-2024 08:39 AM
I think I'm done.
Many thanks to all those who contributed.
04-23-2024 09:23 AM
@paul_a_cardinale wrote:
I think I'm done.
Many thanks to all those who contributed.
You're welcome and congratulations!
The only question is why you have array multiplication here:
and not there:
?
04-23-2024 10:51 AM
Probably negligible, but:
Why build a 12-byte (3 x I32) array and convert it to a 3-byte (3 x U8) array?
Using Split Array or Array Subset will be more efficient. Delete From Array always copies the array to the outputs, where the alternatives return subarrays.
The last 3 Delete From Arrays are redundant, as the for loop won't loop over the 4th element anyway.
04-23-2024 11:33 AM
wiebe@CARYA wrote:
Using Split Array or Array Subset will be more efficient. Delete From Array always copies the array to the outputs, where the alternatives return subarrays.
This one will not improve much:
But this one gives roughly a 10% performance improvement:
04-23-2024 01:23 PM - edited 04-23-2024 01:26 PM
@Andrey_Dmitriev wrote:
But this one gives roughly a 10% performance improvement:
Nitpicking: To remove some diagram clutter, all we need is a single "1" diagram constant, and the "3" can be left unwired. 😄
I also doubt that the "Delete From Array" adds any value, because the deleted element is probably zero.
04-23-2024 01:53 PM
OK. Maybe now I'm done.