Why is C code so much faster? (Image Processing)

altenbach · ‎04-10-2016

@altenbach wrote:
One of the big expenses is the operation on DBLs. All values are quantized to 256 possibilities, so instead of all these multiplications, all you need is a tiny LUT (lookup table) for each color that gives a U8 result for all 256 possible multiplications. This keeps everything in U8. I am sure it would be faster.

OK, here's what I had in mind. I have not benchmarked it, but speed should be OK. Try it! Note that the code is 100% U8.

(The subVI is inlined and the outer loop is parallelized. Also test without parallelization to see whether it gains anything.)
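For readers without LabVIEW, the per-color LUT idea can be sketched in C roughly as follows. This is only an illustration, not the posted VI: it assumes the operation is a weighted sum of R, G and B (a grayscale conversion), the weights 0.299/0.587/0.114 are an assumption the thread does not state, and the function and table names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed weights (ITU-R BT.601 luma); the thread does not state them. */
static const double W_R = 0.299, W_G = 0.587, W_B = 0.114;

/* One 256-entry U8 table per color: 3 * 256 = 768 bytes in total. */
static uint8_t lut_r[256], lut_g[256], lut_b[256];

/* Fill the tables once: entry v holds weight * v, rounded to a U8. */
void init_gray_luts(void)
{
    for (int v = 0; v < 256; ++v) {
        lut_r[v] = (uint8_t)(W_R * v + 0.5);
        lut_g[v] = (uint8_t)(W_G * v + 0.5);
        lut_b[v] = (uint8_t)(W_B * v + 0.5);
    }
}

/* Convert interleaved RGB (3 bytes per pixel) using only lookups and adds.
 * With these weights the three table maxima sum to 255, so the result
 * always fits in a U8. */
void rgb_to_gray_lut(const uint8_t *rgb, uint8_t *gray, size_t n_pixels)
{
    for (size_t i = 0; i < n_pixels; ++i) {
        gray[i] = (uint8_t)(lut_r[rgb[3 * i]] +
                            lut_g[rgb[3 * i + 1]] +
                            lut_b[rgb[3 * i + 2]]);
    }
}
```

Call init_gray_luts() once, then rgb_to_gray_lut() per image. Because each table entry is rounded on its own, the sum can differ from the rounded floating-point result by a count or two.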

LabVIEW Champion.

crossrulz · ‎04-10-2016

@altenbach wrote:
OK, here's what I had in mind. I have not benchmarked it, but speed should be OK. Try it! Note that the code is 100% U8.

I thought you were going to use a 3D array and then just index the values with a single Index Array and you magically have the final value. No need to add either.

There are only two ways to tell somebody thanks: Kudos and Marked Solutions
Unofficial Forum Rules and Guidelines
"Not that we are sufficient in ourselves to claim anything as coming from us, but our sufficiency is from God" - 2 Corinthians 3:5

altenbach · ‎04-10-2016

@crossrulz wrote:
I thought you were going to use a 3D array and then just index the values with a single Index Array and you magically have the final value. No need to add either.

Then the LUT would no longer be tiny (24 bits!) 😮 This is 16 MB (compared to my 768 bytes) and won't fit into the cache of a typical CPU.
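The two sizes follow directly from the entry counts. A small C sketch of the arithmetic, assuming one byte (U8) per entry in both cases; the constant names are made up for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* Three 1D tables, one per color: 3 * 256 one-byte entries. */
static const size_t per_color_lut_bytes = 3u * 256u * sizeof(uint8_t);

/* One entry for every 24-bit RGB value: 2^24 one-byte entries. */
static const size_t full_rgb_lut_bytes = ((size_t)1 << 24) * sizeof(uint8_t);

/* How many times larger the full table is (integer division). */
size_t lut_size_ratio(void)
{
    return full_rgb_lut_bytes / per_color_lut_bytes;
}
```

That gives 768 bytes against 16,777,216 bytes, so the full table is more than 21,000 times larger.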

As an alternative to my solution above, we can also index directly into the 2D array. Not sure which is better.
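In C terms, indexing directly into the 2D array might look like the sketch below. As before, the grayscale weights are an assumption and every name is hypothetical; only the table layout (one row per color, 256 columns) reflects the post.

```c
#include <stddef.h>
#include <stdint.h>

enum { CH_R = 0, CH_G = 1, CH_B = 2, N_CHANNELS = 3 };

/* Assumed weights (ITU-R BT.601 luma); not given in the thread. */
static const double weights[N_CHANNELS] = {0.299, 0.587, 0.114};

/* A single 3 x 256 U8 table: row = color, column = input value. */
static uint8_t lut2d[N_CHANNELS][256];

/* Fill every row once with weight * value, rounded to a U8. */
void init_lut2d(void)
{
    for (int c = 0; c < N_CHANNELS; ++c)
        for (int v = 0; v < 256; ++v)
            lut2d[c][v] = (uint8_t)(weights[c] * v + 0.5);
}

/* Same conversion as with three 1D tables, but each lookup indexes the
 * 2D table by (color, value) instead of using a pre-split row. */
void rgb_to_gray_lut2d(const uint8_t *rgb, uint8_t *gray, size_t n_pixels)
{
    for (size_t i = 0; i < n_pixels; ++i) {
        const uint8_t *p = rgb + 3 * i;
        gray[i] = (uint8_t)(lut2d[CH_R][p[0]] +
                            lut2d[CH_G][p[1]] +
                            lut2d[CH_B][p[2]]);
    }
}
```

A C compiler will most likely generate nearly identical code for both layouts; whether the same holds for the LabVIEW compiler is exactly the open question here.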


Yamaeda · ‎04-11-2016

@altenbach wrote:
One of the big expenses is the operation on DBLs. All values are quantized to 256 possibilities, so instead of all these multiplications, all you need is a tiny LUT (lookup table) for each color that gives a U8 result for all 256 possible multiplications. This keeps everything in U8. I am sure it would be faster.

In this case converting to SGL should be plenty and a bit faster. It'd be interesting to compare the SGL, DBL and your lookup table.
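To make the comparison concrete, here is how the SGL and DBL variants might look in C, where SGL corresponds to float and DBL to double. This is a sketch under the same assumption as the LUT idea (a weighted RGB sum with assumed 0.299/0.587/0.114 weights); the function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* DBL variant: every multiply and add runs in double precision. */
void rgb_to_gray_dbl(const uint8_t *rgb, uint8_t *gray, size_t n_pixels)
{
    for (size_t i = 0; i < n_pixels; ++i) {
        double y = 0.299 * rgb[3 * i] +
                   0.587 * rgb[3 * i + 1] +
                   0.114 * rgb[3 * i + 2];
        gray[i] = (uint8_t)(y + 0.5);   /* round to nearest U8 */
    }
}

/* SGL variant: the 'f' suffix keeps the arithmetic in single precision. */
void rgb_to_gray_sgl(const uint8_t *rgb, uint8_t *gray, size_t n_pixels)
{
    for (size_t i = 0; i < n_pixels; ++i) {
        float y = 0.299f * rgb[3 * i] +
                  0.587f * rgb[3 * i + 1] +
                  0.114f * rgb[3 * i + 2];
        gray[i] = (uint8_t)(y + 0.5f);  /* round to nearest U8 */
    }
}
```

Single precision has a 24-bit significand, which is plenty for 8-bit pixel data. Any speed difference between the variants would have to be measured on the target machine.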

/Y

G# - Award winning reference based OOP for LV, for free! - Qestit VIPM GitHub

Qestit Systems

Hooovahh · ‎04-11-2016

@altenbach wrote:

As an alternative to my solution above, we can also index directly into the 2D array. Not sure which is better.

I'd guess the above example would be better, since constant folding will just start with 3 1D arrays, but who knows, the compiler can be crazy. I do like the 3D array idea. In either case, all of these options have to be faster than anything we've come up with so far involving doubles (or floating point in general).

Unofficial Forum Rules and Guidelines
Get going with G! - LabVIEW Wiki.

17 Part Blog on Automotive CAN bus. - Hooovahh - LabVIEW Overlord

Yamaeda · ‎04-11-2016

@Hooovahh wrote:
I'd guess the above example would be better, since constant folding will just start with 3 1D arrays, but who knows, the compiler can be crazy. I do like the 3D array idea. In either case, all of these options have to be faster than anything we've come up with so far involving doubles (or floating point in general).

Sometimes logic plays a trick on us. When Doom 3 (I think) was developed, they created a lookup table for Sin to improve frame rate. By that time the FPUs were so good that they were quite a bit faster than the memory access needed to read the table.

The SGL idea could be surprisingly fast, especially if the lookup table gets large.

Though, this sounds like the type of idea that should be run on a GPU. 🙂

/Y

