09-30-2016 11:20 AM
I think most people are aware that we can use DSPs for multiplication, right? Did you know they also have built-in adders? They can add numbers up to 48 bits wide. The path to unlocking this functionality is the Xilinx IP cores (I use a target compiled with the ISE compiler, so I don't know whether the same functions are available in Vivado or whether they differ). It's worth the learning curve.
How many people have done the following in FPGA code?
The above code utilised 33 LUTs. The following code uses zero LUTs (1 DSP).
It gets even better: the DSP also supports a Bypass mode where input b is simply copied to output s (respecting latency settings).
And if you knew that, did you know it can be configured for Adder / Subtracter operation also?
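Since the diagrams are graphical, here is a rough behavioural sketch in Python of what the adder, Bypass and Adder / Subtracter options described above amount to. It is only a model under stated assumptions, not the IP core's interface: the function names are made up for illustration, subtraction is assumed to be a - b, and overflow is assumed to wrap at 48 bits (the real behaviour depends on how the core is configured).

```python
WIDTH = 48  # adder width mentioned in the post


def to_signed(value, width=WIDTH):
    """Wrap an integer into a signed two's-complement range of `width` bits."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def dsp_add_sub(a, b, add=True, bypass=False):
    """Hypothetical model: s = b when bypassed, else a + b or a - b."""
    if bypass:
        return to_signed(b)  # input b is copied straight to output s
    return to_signed(a + b if add else a - b)


def conditional_add(a, b, enable):
    """One reading of the 'conditional add': add a only when enabled."""
    return dsp_add_sub(a, b, add=True, bypass=not enable)


print(dsp_add_sub(5, 3))               # add
print(dsp_add_sub(5, 3, add=False))    # subtract
print(conditional_add(5, 3, False))    # bypass: b passes through
```

The point of the model is that the select-then-add pattern collapses into a single call with a bypass flag, which is why it can live entirely inside one DSP.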
Finally (or at least the final piece I am going to write today), the DSP has built-in input and output registers which can be used for pipelining at no LUT cost (the registers are purely internal to the DSP).
The LUT examples listed are for 32-bit values. If the values were actually 48 bits wide, the resource savings would be even greater (roughly 50% more, since LUT usage for an adder scales with bit width).
So if you find that your code is running out of LUTs, it might be worthwhile looking at places where you may be doing some of these operations. The conditional add in particular is a nice function to be able to implement fully with a DSP.
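To make the latency point concrete, here is a small Python sketch of an adder with a configurable number of register stages. The class name and the `latency` parameter are made up for illustration, and registers are assumed to power up as zero; it only shows that a result appears a fixed number of clock ticks after its inputs.

```python
from collections import deque


class PipelinedAdder:
    """Hypothetical model of an adder followed by `latency` register stages."""

    def __init__(self, latency=2):
        # Registers are assumed to reset to zero.
        self._stages = deque([0] * latency)

    def clock(self, a, b):
        """Advance one clock tick and return the value at the output."""
        self._stages.append(a + b)
        return self._stages.popleft()


adder = PipelinedAdder(latency=2)
outputs = [adder.clock(1, 2), adder.clock(10, 20),
           adder.clock(0, 0), adder.clock(0, 0)]
print(outputs)  # the sums 3 and 30 show up two ticks after their inputs
```

Any code consuming the output has to account for that delay, which is what "respecting latency settings" means for the Bypass case as well.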
09-30-2016 04:09 PM
Nice tip!
Historically I always found that DSPs were the limited resource but the series 7 targets are such a step up in DSP numbers this could be useful.
09-30-2016 05:50 PM
Our design is the opposite. Desperate for more LUTs but DSPs to spare....
09-30-2016 06:04 PM - edited 09-30-2016 06:06 PM
Nice FPGA tip. Moving FPGA functionality from using one resource to another is always much easier said than done so I appreciate the comparisons between the two implementations (with actual #LUTs too).
Do you find yourself often using these DSP blocks from the start when you program on FPGA or is this just something you move to when you realize you are running out of one particular resource?
10-01-2016 03:53 AM
We have only one perpetually improving FPGA code so the beginning was like 10 or 12 years ago...... Every time we think we have filled the FPGA target, refactoring breathes life back into it. I have started generously using DSPs because we don't actually use many DSPs otherwise.
10-03-2016 11:26 AM
Thanks for the nugget!!
I did not need to refactor or save resources but the tip led me to the extremely useful "Binary Counter" function that I could reconfigure with a built in reset.
Using this function in a SCTL I was able to count pulses using the master clock for the synchronized Delta-Sigma modules. I used this counter to generate an occurrence every Nth cycle to start data acquisition on the SAR module. Now I have modules running at different sample rates using the same clock, 50 kSa/s and 800 kSa/s. (The timer functions, i.e. the 40 MHz FPGA clock, would drift relative to the Delta-Sigma clock.)
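For anyone reading along, the every-Nth-cycle idea can be sketched in Python as below. This is only a behavioural model, not the Binary Counter core's interface: the class name is made up, and N = 16 is an assumption taken from the ratio of the two sample rates quoted above (800 / 50).

```python
class EveryNth:
    """Hypothetical model: count clock pulses, fire once every n pulses."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def tick(self):
        """Call once per master-clock pulse; True means 'start acquisition'."""
        self.count += 1
        if self.count == self.n:
            self.count = 0  # built-in reset when the terminal count is reached
            return True
        return False


divider = EveryNth(16)  # assumed ratio: 800 kSa/s / 50 kSa/s
fired_on = [i for i in range(48) if divider.tick()]
print(fired_on)
```

Because both rates are derived from the same pulse train, they cannot drift relative to each other.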
Cheers,
mcduff
11-13-2017 06:17 PM
The DSP48 Macro lets you do even more with a DSP. Here's an example that performs a linear scaling operation. The input is a +/-20,5 fixed-point (range -16..+16), the slope is an I16, and the intercept and output are +/-16,11 fixed-point values after some manipulation. The functions to add bits to the intercept, and remove them from the output, consume no FPGA resources (according to the documentation), so the entire operation occurs on the DSP in a single clock cycle. Note that this does need to be wrapped in a single-cycle timed loop.
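As a rough illustration of the arithmetic, here is a Python sketch of one way to read that scaling. It is an interpretation, not the actual diagram: it assumes the +/-20,5 input has 15 fractional bits, the +/-16,11 intercept and output have 5, the intercept is shifted left by 10 bits to line up with the product, the result is shifted right by 10 bits (truncating), and overflow wraps at 16 bits. All names are made up for the example.

```python
IN_FRAC = 15    # +/-20,5: 20-bit word, 5 integer bits -> 15 fractional bits
OUT_FRAC = 5    # +/-16,11: 16-bit word, 11 integer bits -> 5 fractional bits
SHIFT = IN_FRAC - OUT_FRAC  # bits added to the intercept, removed from the output


def wrap16(value):
    """Wrap into a signed 16-bit range (assumed overflow behaviour)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def linear_scale(x_raw, slope, intercept_raw):
    """Compute slope * x + intercept on raw fixed-point integers."""
    product = x_raw * slope              # still 15 fractional bits
    aligned = intercept_raw << SHIFT     # intercept padded to 15 fractional bits
    return wrap16((product + aligned) >> SHIFT)


x_raw = int(1.5 * 2**IN_FRAC)            # input 1.5
intercept_raw = int(2.25 * 2**OUT_FRAC)  # intercept 2.25
y_raw = linear_scale(x_raw, 10, intercept_raw)
print(y_raw / 2**OUT_FRAC)               # 1.5 * 10 + 2.25
```

The two shifts are pure rewiring of bits, which fits the note above that they consume no FPGA resources.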