10-29-2010 09:48 AM
I have a multi-rate filter that has over 120 local variables in it because
I need to move data between SCTLs that are running at different rates.
I also need to build up to a much larger filter (made up of the 32 tap section
with its 120+ local variables). I can't build a subVI because the local
variables create ports and control status indicators on the front panel.
I don't need most (maybe not even any) of the controls/indicators that were
created with the local variables, but I don't know how to handle them so that
I can create the subVI.
The project is attached in case that helps. resourceShareWithBuffers is the
32 tap section. resourceShareWithBufferV2 is a larger (128 tap) section.
I need to get to a 512 tap design. My original intention was to create a
32 tap subVI and instantiate it 16 times.
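The plan of building a 512 tap filter from 16 copies of a 32 tap section can be checked in software before committing to it on the FPGA. The following is only a sketch under assumptions: `FirSection`, `fir_cascaded`, and `fir_direct` are hypothetical names, real integers stand in for the complex fixed-point data, and each section is assumed to pass the sample leaving its delay line on to the next section while the partial sums are added together.

```python
def fir_direct(coeffs, samples):
    """Reference FIR: y[n] = sum over k of coeffs[k] * x[n-k], with x[n<0] = 0."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


class FirSection:
    """One fixed-size section with its own slice of the delay line (hypothetical model)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.delay = [0] * len(self.coeffs)

    def step(self, x_in):
        """Shift in one sample; return (partial sum, sample shifted out the far end)."""
        x_out = self.delay[-1]
        self.delay = [x_in] + self.delay[:-1]
        partial = sum(c * d for c, d in zip(self.coeffs, self.delay))
        return partial, x_out


def fir_cascaded(coeffs, samples, section_taps=32):
    """Chain identical sections: delay lines in series, partial sums added together."""
    sections = [FirSection(coeffs[i:i + section_taps])
                for i in range(0, len(coeffs), section_taps)]
    out = []
    for x in samples:
        total = 0
        for section in sections:
            partial, x = section.step(x)  # x becomes the input of the next section
            total += partial
        out.append(total)
    return out


if __name__ == "__main__":
    coeffs = [(k * 7) % 11 - 5 for k in range(512)]    # arbitrary test coefficients
    samples = [(n * 13) % 17 - 8 for n in range(600)]  # arbitrary test input
    print("sections:", len(coeffs) // 32)
    print("match:", fir_cascaded(coeffs, samples) == fir_direct(coeffs, samples))
```

The model says nothing about timing or resource use; it only shows that the section needs two outputs (its partial sum and the sample leaving its delay line) for the instances to chain.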
10-29-2010 10:25 AM
Generally using local variables to move data around is a bad idea. Shift registers or queues are much better.
To reduce the number of controls I would make arrays of the FXP data. Arrays are automatically scalable to any size. You likely can refactor your code to handle the data as arrays and make it much simpler to read.
The style guides recommend that the size of a panel or diagram be kept to the size of one screen. No scrolling to see everything!
Lynn
10-29-2010 10:37 AM
Thanks. I'll take a look at using arrays. I was disappointed to find out that I needed ANY
memory element to move the data around. I chose the local variable because I didn't want
to deal with addressing for the RAM approach or keeping track of the order for the FIFO
solution.
I don't think I can use a shift register because I need the data from the shift registers to
go to a SCTL running at a faster rate. Can I really use a shift register (somehow)? That
seems like the best solution since it wouldn't require any additional memory or add any
latency to the filter.
Thanks
10-29-2010 11:13 AM
I do not have any experience with the FPGA stuff so you will need to ask someone else about that. My comments were about LV in general.
Lynn
10-29-2010 12:05 PM
I don't quite understand what you're trying to do, but I do have some FPGA experience, so maybe I can help. My first thought is, why do you have so many separate SCTLs running at the same clock rate? If you combine the 160 MHz loops you could eliminate a lot of the local variables. If you can't do the sum operation in the same clock cycle as the cmult4x, then take the output of cmult4x, make it an array, put that in a shift register, and do the sum in the next loop cycle in parallel with cmult4x. Although arrays are discouraged as front panel FPGA controls, they are efficient in code. Many array operations are free, especially when working with constant indices.
Also, this might just be my preference, but I like the standard add function unless there's a compelling reason to use the "high-throughput" version.
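The pipelining suggestion in this reply (put the multiplier outputs in a shift register as an array, then sum them on the next loop iteration while the multiplier works on new data) can be modeled in software. This is a rough sketch, not the actual VI: `pipelined_mac` is a hypothetical name, a plain multiply stands in for cmult4x, a Python list stands in for the FPGA array, and the shift register is assumed to be initialized to zeros.

```python
def pipelined_mac(input_vectors, coeffs):
    """Model one loop with a single pipeline register between multiply and sum.

    On each iteration the sum reads the products stored by the PREVIOUS
    iteration, and the multiply works on the CURRENT inputs. Neither stage
    depends on the other's result within the iteration, so in hardware the
    two run side by side and the longest path is only one stage long.
    """
    shift_reg = [0] * len(coeffs)  # assumed initial value of the shift register
    outputs = []
    for vec in input_vectors:
        summed = sum(shift_reg)                           # stage 2: previous products
        products = [c * x for c, x in zip(coeffs, vec)]   # stage 1: current inputs
        outputs.append(summed)
        shift_reg = products                              # value seen next iteration
    return outputs


if __name__ == "__main__":
    inputs = [[1, 2, 3, 4], [5, 6, 7, 8], [0, 0, 0, 1]]
    print(pipelined_mac(inputs, [1, 1, 2, 2]))
```

Each output is the dot product of the previous iteration's inputs, so the cost of the shorter timing path is one loop iteration of latency.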
10-29-2010 12:25 PM
So the design is a FIR filter. The 40 MHz SCTL across the top is my tapped delay line. The 160 MHz SCTL below it
is my 4:1 resource shared complex multiplier. The subsequent 160 MHz SCTLs below that are my adder tree. I
don't know that I need the extra stages of pipelining for the final design, but at some point, I wasn't able to make
timing through the adder tree (the original/target design is 512 taps). Putting the adder tree in the same SCTL as the
complex multiplier adds to the timing path. Similarly, I chose the "high throughput" adder thinking that it was
faster and was better from a timing perspective.
I'll try the array approach.
Thanks.
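For readers following along, the structure described in this post (tapped delay line at the sample rate, multipliers shared 4:1 at four times that rate, then an adder tree) can be written as a behavioral model. This is a sketch under assumptions, not the attached project: `shared_multiply`, `adder_tree`, and `fir_resource_shared` are hypothetical names, Python complex numbers stand in for the complex fixed-point data, each physical multiplier is assumed to serve four adjacent taps, and pipeline latency is ignored.

```python
def adder_tree(values):
    """Sum a non-empty list pairwise in stages, as a hardware adder tree would.

    Each pass of the while loop is one level of the tree; a pipeline
    register could be placed between any two levels.
    """
    level = list(values)
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd-length levels
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def shared_multiply(delay_line, coeffs, share=4):
    """Compute every tap product using len(coeffs) // share multipliers.

    The outer loop models the fast-clock ticks inside one sample period
    (4 ticks for 160 MHz against 40 MHz); the inner loop models the
    physical multipliers, which all work in parallel on a given tick.
    Assumes len(coeffs) is a multiple of share.
    """
    n_mult = len(coeffs) // share
    products = [0] * len(coeffs)
    for tick in range(share):
        for m in range(n_mult):
            tap = m * share + tick  # assumed mapping: adjacent taps per multiplier
            products[tap] = delay_line[tap] * coeffs[tap]
    return products


def fir_resource_shared(coeffs, samples, share=4):
    """FIR output per input sample, ignoring pipeline latency."""
    delay_line = [0] * len(coeffs)
    out = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]  # slow-clock tapped delay line
        out.append(adder_tree(shared_multiply(delay_line, coeffs, share)))
    return out


if __name__ == "__main__":
    coeffs = [complex(k % 5 - 2, k % 3 - 1) for k in range(32)]
    samples = [complex(n % 7 - 3, n % 4 - 2) for n in range(8)]
    print(fir_resource_shared(coeffs, samples))
```

With 32 taps and 4:1 sharing the model uses 8 multipliers and a 5-level adder tree; at 512 taps that becomes 128 multipliers and 9 levels, which is where the extra pipeline stages mentioned above would go.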
10-29-2010 12:33 PM
@creed wrote:
Putting the adder tree in the same SCTL as the complex multiplier adds to the timing path. Similarly, I chose the "high throughput" adder thinking that it was
faster and was better from a timing perspective.
If you pipeline it - put the results of the complex multiplier into a shift register, then do the additions in the same loop but in parallel - then you are not increasing the timing path.
My suspicion, although I can't verify it, is that the high-throughput addition is just there for completeness with the other high-throughput math functions and executes identically to the standard addition node; it may also be a holdover from the time when fixed-point math was first introduced on the FPGA and the standard math functions did not accept fixed-point data.