LabVIEW


FPGA Adder tidbit - Optimise using DSP Adders with additional functionality

I think most people are aware that we can use DSPs for multiplication, right?  Did you know they also have built-in adders?  They can add numbers up to 48 bits wide.  The path to unlocking this functionality is the Xilinx IP cores (I use a target compiled with the ISE compiler, so I don't know whether the same functions are available in Vivado or whether they differ).  It's worth the learning curve.

 

How many people have done the following in FPGA code?

 

2016-09-30 18_10_01-Conditional Adder.vi Block Diagram on Nanonis V5.lvproj_FPGA V5 _.png

The above code utilised 33 LUTs.  The following code uses zero LUTs (1 DSP).

2016-09-30 18_10_44-Conditional Adder.vi Block Diagram on Nanonis V5.lvproj_FPGA V5 _.png
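
Since the screenshots are not reproduced here, the following is only a rough behavioural sketch of a conditional add, written in Python rather than LabVIEW. The wiring is an assumption: it adds b to a when the condition is true and otherwise passes a through unchanged. The 48-bit two's-complement wrap-around on overflow is also assumed, and the names `wrap_signed` and `conditional_add` are hypothetical.

```python
WIDTH = 48  # adder width in bits, as quoted in the post


def wrap_signed(value, width=WIDTH):
    """Wrap an integer to a signed two's-complement value of `width` bits."""
    value &= (1 << width) - 1
    if value >= 1 << (width - 1):
        value -= 1 << width
    return value


def conditional_add(a, b, add_enable):
    """Return a + b when add_enable is true, else a (assumed wiring)."""
    if add_enable:
        return wrap_signed(a + b)
    return wrap_signed(a)


print(conditional_add(5, 3, True))   # condition true: sum
print(conditional_add(5, 3, False))  # condition false: a passes through
```

In fabric logic the select and the adder both cost LUTs; the point of the post is that a single DSP can absorb both.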

 

It gets even better: the DSP also supports a Bypass node, where input b is simply copied to output s (respecting latency settings).

2016-09-30 18_11_51-Conditional Adder.vi Block Diagram on Nanonis V5.lvproj_FPGA V5 _.png

 

 

And if you knew that, did you know it can also be configured for Adder / Subtracter operation?

2016-09-30 18_12_41-Conditional Adder.vi Block Diagram on Nanonis V5.lvproj_FPGA V5 _.png
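
The bypass and add/subtract behaviour described above can be summarised in a small Python model. This is a sketch under assumptions, not the IP core's actual interface: the parameter names `subtract` and `bypass` are hypothetical, the subtract direction is taken to be a - b, and overflow is assumed to wrap at 48 bits.

```python
def dsp_add_sub(a, b, subtract=False, bypass=False, width=48):
    """Model of an adder/subtracter with bypass.

    bypass=True   -> s = b (input b copied to the output)
    subtract=True -> s = a - b (assumed direction)
    otherwise     -> s = a + b
    The result wraps to a signed `width`-bit value (assumed behaviour).
    """
    if bypass:
        result = b
    elif subtract:
        result = a - b
    else:
        result = a + b
    # Wrap to signed two's complement of the given width.
    result &= (1 << width) - 1
    if result >= 1 << (width - 1):
        result -= 1 << width
    return result


print(dsp_add_sub(10, 4))                 # add
print(dsp_add_sub(10, 4, subtract=True))  # subtract
print(dsp_add_sub(10, 4, bypass=True))    # bypass: b copied through
```

Because the mode inputs can change at run time, one block of this kind can replace an adder, a subtracter and the select logic between them.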

 

 

Finally (or at least the final piece I am going to write today), the DSP has built-in input and output registers which can be used for pipelining at no LUT cost (the registers are purely internal to the DSP).

2016-09-30 18_14_06-Conditional Adder.vi Block Diagram on Nanonis V5.lvproj_FPGA V5 _.png
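
To illustrate what the pipeline registers mean for timing, here is a minimal Python model of an adder whose result appears a fixed number of clock cycles after its inputs. The class name `PipelinedAdder` and the `latency` parameter are hypothetical; the real register configuration is set in the IP core dialog and is not shown in this thread.

```python
from collections import deque


class PipelinedAdder:
    """Adder whose result appears `latency` clock cycles after the inputs."""

    def __init__(self, latency):
        if latency < 1:
            raise ValueError("latency must be at least 1 in this model")
        # One slot per register stage; None marks "no valid data yet".
        self._stages = deque([None] * latency, maxlen=latency)

    def clock(self, a, b):
        """Advance one clock cycle and return the value leaving the pipeline."""
        oldest = self._stages[0]
        # Appending to a full deque drops the oldest entry automatically.
        self._stages.append(a + b)
        return oldest


adder = PipelinedAdder(latency=2)
outputs = [adder.clock(a, b) for a, b in [(1, 2), (3, 4), (5, 6), (7, 8)]]
print(outputs)  # first two outputs are None while the pipeline fills
```

Any code that consumes the result has to account for this delay, which is why the bypass path is described as respecting the latency settings.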

 

The LUT examples listed are for 32-bit values; if the values were actually 48 bits wide, the resource savings would be even greater (about 50% more, since the LUT version grows with bit width while the DSP count stays at one).

 

So if you find that your code is running out of LUTs, it might be worthwhile looking at places where you may be doing some of these operations.  The conditional add in particular is a nice function to be able to implement fully with a DSP.

Message 1 of 7
(4,654 Views)

Nice tip!

 

Historically I always found that DSPs were the limited resource but the series 7 targets are such a step up in DSP numbers this could be useful.

James Mc
========
CLA and cRIO Fanatic
My writings on LabVIEW Development are at devs.wiresmithtech.com
Message 2 of 7
(4,631 Views)

Our design is the opposite. Desperate for more LUTs but DSPs to spare....

Message 3 of 7
(4,622 Views)

Nice FPGA tip. Moving FPGA functionality from using one resource to another is always much easier said than done so I appreciate the comparisons between the two implementations (with actual #LUTs too).

 

Do you find yourself often using these DSP blocks from the start when you program on FPGA or is this just something you move to when you realize you are running out of one particular resource?

Matt J | National Instruments | CLA
Message 4 of 7
(4,619 Views)

We have only one perpetually improving FPGA code, so the beginning was like 10 or 12 years ago...  Every time we think we have filled the FPGA target, refactoring breathes life back into it.  I have started using DSPs generously because we don't actually use many DSPs otherwise.

Message 5 of 7
(4,599 Views)

Thanks for the nugget!!

 

I did not need to refactor or save resources but the tip led me to the extremely useful "Binary Counter" function that I could reconfigure with a built in reset.

 

Using this function in a SCTL I was able to count pulses using the master clock for the synchronized Delta-Sigma modules. I used this counter to generate an occurrence every Nth cycle to start data acquisition on the SAR module. Now I have modules running at different sample rates using the same clock, 50 kSa/s and 800 kSa/s.  (The timer functions, i.e. the 40 MHz FPGA clock, would drift relative to the Delta-Sigma clock.)
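
The every-Nth-cycle idea can be sketched as a counter that resets itself when it reaches its terminal count. This is a Python model only, not the Binary Counter core's actual interface; the class name, the `period` parameter and the example value of 4 are all hypothetical, since the post does not give the real N.

```python
class ResettableCounter:
    """Counts clock ticks and flags every `period`-th tick."""

    def __init__(self, period):
        self.period = period
        self.count = 0

    def clock(self, reset=False):
        """Advance one tick; return True on every `period`-th tick."""
        if reset:
            # Synchronous reset: restart the count, no trigger this tick.
            self.count = 0
            return False
        self.count += 1
        if self.count == self.period:
            self.count = 0
            return True
        return False


counter = ResettableCounter(period=4)
triggers = [counter.clock() for _ in range(8)]
print(triggers)  # True on the 4th and 8th ticks
```

Because the trigger is derived by counting the master clock itself, the slower acquisition cannot drift relative to the faster one.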

 

Cheers,

mcduff

Message 6 of 7
(4,558 Views)

The DSP48 Macro lets you do even more with a DSP. Here's an example that performs a linear scaling operation. The input is a +/-20,5 fixed-point value (range -16..+16), the slope is an I16, and the intercept and output are +/-16,11 fixed-point values after some manipulation. The functions to add bits to the intercept, and remove them from the output, consume no FPGA resources (according to the documentation), so the entire operation occurs on the DSP in a single clock cycle. Note that this does need to be wrapped in a single-cycle timed loop.

DSP48 linear scale.png
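
The arithmetic behind this scaling can be sketched with plain integers in Python. The details are inferred from the stated formats rather than taken from the screenshot: a +/-20,5 value has 15 fractional bits and a +/-16,11 value has 5, so the intercept is assumed to be shifted left by 10 bits before the multiply-add and the result shifted right by 10 bits afterwards. Output overflow handling is not modelled, and all function names are hypothetical.

```python
X_FRAC_BITS = 15    # +/-20,5: 20-bit word, 5 integer bits, 15 fractional bits
OUT_FRAC_BITS = 5   # +/-16,11: 16-bit word, 11 integer bits, 5 fractional bits
ALIGN_SHIFT = X_FRAC_BITS - OUT_FRAC_BITS  # 10 bits added, then removed


def to_raw(value, frac_bits):
    """Convert a real number to its raw fixed-point integer."""
    return int(round(value * (1 << frac_bits)))


def from_raw(raw, frac_bits):
    """Convert a raw fixed-point integer back to a real number."""
    return raw / (1 << frac_bits)


def linear_scale(x_raw, slope, intercept_raw):
    """Compute slope * x + intercept as a single multiply-add on integers."""
    aligned_intercept = intercept_raw << ALIGN_SHIFT  # match product scaling
    p = x_raw * slope + aligned_intercept             # P = A * B + C
    return p >> ALIGN_SHIFT                           # drop the extra bits


y_raw = linear_scale(to_raw(1.5, X_FRAC_BITS), 3, to_raw(2.25, OUT_FRAC_BITS))
print(from_raw(y_raw, OUT_FRAC_BITS))  # 1.5 * 3 + 2.25
```

The two shifts are pure rewiring in hardware, which is consistent with the statement that they consume no FPGA resources.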

Message 7 of 7
(3,626 Views)