08-31-2023 07:22 AM
Greetings,
I'm converting an image into a 2D array. The image is 8600 x 2600 pixels, and the conversion takes a considerable amount of time: the image-to-array conversion method is fast enough for smaller images, but it takes around 500 ms at this size. I'm looking for a way to speed up the conversion in both directions. If anyone has suggestions for a faster approach to convert images to 2D arrays and vice versa, I would greatly appreciate it.
08-31-2023 07:50 AM - edited 08-31-2023 12:41 PM
Can you define what you mean by "image"? A 2D picture datatype? Something from the vision toolkit? An external image in a specific format (PNG, GIF, TIFF, BMP, etc.)?
Where does the image come from? What type is it (B&W, 8-bit, paletted, RGB, etc.)?
The code of "unflatten pixmap" is visible and ancient. There might be places that can be optimized, but don't expect miracles.
08-31-2023 10:31 AM
I'm guessing you use a loop with RGB to color? Copy the code out, or better, make a copy of the function that is inlined; then the loop can be parallelized.
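LabVIEW diagrams can't be quoted as text, so here is a rough Python illustration of why such a loop can be parallelized: each output row depends only on its own slice of the flat pixel data. This is a minimal sketch that assumes a flat 24-bit RGB byte buffer and packs each pixel as 0x00RRGGBB; the names `row_to_colors` and `pixmap_to_2d` are hypothetical, and this is not the code inside the stock VI. In CPython the thread pool will not actually make this faster (the interpreter lock serializes the work); it only shows that rows are independent work items.

```python
from concurrent.futures import ThreadPoolExecutor


def row_to_colors(flat_rgb, width, row):
    """Pack one row of 24-bit RGB bytes into 0x00RRGGBB integers."""
    start = row * width * 3
    chunk = flat_rgb[start:start + width * 3]
    # Strided slices pull out the R, G, and B channels of this row.
    return [(r << 16) | (g << 8) | b
            for r, g, b in zip(chunk[0::3], chunk[1::3], chunk[2::3])]


def pixmap_to_2d(flat_rgb, width, height, workers=2):
    """Convert a flat RGB buffer to a height x width array of colors.

    Rows share no state, so they can be handed to workers in any order.
    """
    if len(flat_rgb) != width * height * 3:
        raise ValueError("buffer size does not match width * height * 3")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in row order regardless of completion order.
        return list(pool.map(lambda r: row_to_colors(flat_rgb, width, r),
                             range(height)))


# Tiny 2 x 2 example (assumed data): red, green / blue, white.
pixels = bytes([255, 0, 0,  0, 255, 0,
                0, 0, 255,  255, 255, 255])
image = pixmap_to_2d(pixels, width=2, height=2)
```

Because no row reads or writes another row's data, the same split should map onto a parallelized loop without extra synchronization, though that is worth verifying with timing on the real code.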
This might give some ideas:
08-31-2023 12:30 PM - edited 08-31-2023 12:51 PM
I did some testing, and the only thing holding back the conversion to RGB is the fact that the unflatten tool has debugging enabled. Once I disable it, it speeds up by a factor of about two and is competitive with loop-free versions, e.g. as follows.
Yes, things can probably be parallelized if done right, but be careful: there might be some landmines. Even seemingly small changes can seriously impact the speed, so do some testing.
As I said, the stock code is quite good, except for some cosmetic issues, such as the shift-by-zero no-op and the index array that is not resized, which is probably a legacy issue. 😄
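One way to act on the "do some testing" advice is to run each variant several times and compare the fastest run, since a single measurement can be skewed by one-time overhead or background activity. A small Python sketch of that habit follows; the helper `best_of` and the stand-in workload are hypothetical and not taken from this thread.

```python
import time


def best_of(func, repeats=5):
    """Run func several times and return the fastest wall time in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append((time.perf_counter() - start) * 1000.0)
    return min(timings)


def stand_in_conversion():
    # Placeholder workload; swap in the conversion variant under test.
    return [x & 0xFF for x in range(100_000)]


elapsed_ms = best_of(stand_in_conversion)
print(f"best of 5: {elapsed_ms:.2f} ms")
```

Timing every variant with the same harness and the same input size keeps the comparison fair when a small change turns out to matter.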
On my ancient 2-core laptop (W=8600, H=2600):
Stock: ~320 ms (debugging enabled)
Stock: ~170 ms (debugging disabled)
CA1: ~160 ms (debugging disabled or enabled)
Debugging has a large impact on tight loop stacks because it adds additional code to allow probing the wires anywhere. It has no effect on my loop-free alternative. I have not tested on a more powerful CPU; I will do that later.