07-21-2015 01:14 PM
Hi,
I'm hoping for help on a timing violation.
I have a relatively small SCTL that I need to run at a derived clock of 200 MHz on our cRIO 9082. When compiling our (largish) project I almost always get a timing violation on this loop (I think it worked 1 out of 15 times). I have tuned the logic (pipelining etc.) as much as I can, but I go over the 5 ns clock period not with logic (~1 ns) but with routing (~4 ns), according to the LabVIEW violation breakdown. However, we're only using about 50% of our FPGA resources. I was under the impression that placing/routing only gets problematic when you're running out of gates(?). Indeed, when I compile smaller test programs with the same SCTL, they compile fine.
This target does not support removing the implicit enable signal.
Any ideas? Any tricks to locating a subVI to change its placement/routing (e.g., reentrancy)?
Thanks!
Steve
07-21-2015 01:44 PM - edited 07-21-2015 01:45 PM
To give you any meaningful feedback we'd have to see your code. Can you create a simple test VI that duplicates the problem?
Have you tried adjusting the compile settings for timing optimization? (right click the build, select properties, Xilinx Options page)
07-21-2015 02:15 PM
@shansen1 wrote:
To give you any meaningful feedback we'd have to see your code. Can you create a simple test VI that duplicates the problem?
Have you tried adjusting the compile settings for timing optimization? (right click the build, select properties, Xilinx Options page)
Thanks for the response.
Code is attached. Simple test programs compile without problem (I just repeated this now). I only have issues when compiling the same VI within a larger project. Yes, I have tried the timing optimization many times.
I'm using 2014 SP1 for everything and, I believe, the latest Xilinx tools, BTW.
Steve
07-21-2015 02:57 PM
Are you including this VI within another top level VI? If so, is this VI in another loop or is it just standalone within the top level VI? The SCTL should be a standalone loop within the top level VI.
When you compile, where is it complaining that the timing violation occurs? You said that the violation is in routing, but does it tell you where specifically? I assume it is failing within your 25 MHz case. There may be some opportunity to pipeline here (assuming you can handle additional latency). You will need to adjust your case selector logic so the 25 MHz case runs multiple times in a row until the logic has completed. At a minimum you can try adding feedback nodes after the increment/decrement operations.
07-21-2015 03:04 PM
@shansen1 wrote:
Are you including this VI within another top level VI? If so, is this VI in another loop or is it just standalone within the top level VI? The SCTL should be a standalone loop within the top level VI.
It is included in a top level and is a standalone loop.
When you compile, where is it complaining that the timing violation occurs? You said that the violation is in routing, but does it tell you where specifically? I assume it is failing within your 25 MHz case. There may be some opportunity to pipeline here (assuming you can handle additional latency). You will need to adjust your case selector logic so the 25 MHz case runs multiple times in a row until the logic has completed. At a minimum you can try adding feedback nodes after the increment/decrement operations.
I will recreate the issue and respond.
Thanks
07-21-2015 05:54 PM
Attached is the result of a fresh compile. I've highlighted the Tunnel Controller that is called out. Funny how my total delay is less than the required 5 ns on this compile (!)
I can look into pipelining more, but - speaking out loud here - why do I need to work so hard when there are so many resources available (see attachment) and this compiles fine in smaller projects? I don't want to create unmaintainable uber-pipelined code if I don't need to.
Thanks for the help.
Steve
07-21-2015 06:33 PM
Hopefully someone else can help you understand why your code fails timing even though you have <5ns total delay. That is a mystery to me. What is the "non-diagram component" shown in the timing violation? It shows the largest fanout, and this will affect your timing.
If you are short on resources, it is true that it can be more difficult to meet timing requirements. However, timing also depends on routing because the top level clock (200 MHz in this case) must be routed to each logic element. There are inherent delays in the routing process, and these delays add up. At 200 MHz, it doesn't take very much routing to overrun your 5 ns requirement.
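To put rough numbers on that budget, here is a minimal Python sketch of the arithmetic. It is only a simplified model: the function names are made up for illustration, the path delays are hypothetical rather than taken from a real report, and a real timing analysis also accounts for things like setup time and clock skew.

```python
def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(freq_mhz, logic_ns, routing_ns):
    """Positive slack: the path fits in one period. Negative: a violation."""
    return clock_period_ns(freq_mhz) - (logic_ns + routing_ns)


# Illustrative path delays only (hypothetical numbers, not from a real report).
print(clock_period_ns(200))     # period budget at 200 MHz, in ns
print(clock_period_ns(40))      # period budget at 40 MHz, in ns
print(slack_ns(200, 1.0, 3.5))  # path that fits, with a little margin
print(slack_ns(200, 1.0, 4.5))  # same logic, slightly longer routing
```

With only 5 ns to work with, half a nanosecond of extra routing is enough to turn a passing path into a failing one, whereas the same path at 40 MHz would have about 20 ns to spare.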
07-23-2015 01:13 PM
maherhome wrote:
I can look into pipelining more, but - speaking out loud here - why do I need to work so hard when there are so many resources available (see attachment) and this compiles fine in smaller projects? I don't want to create unmaintainable uber-pipelined code if I don't need to.
Sorry, I'm not on LabVIEW 2014 yet so I can't open your code. Have you considered slowly removing elements from the larger VI until it compiles? You might find that there's some particular element that's causing the problem.
It's not simply an issue of having the resources available, it's also where they're located on the chip. Especially at higher clock speeds, the distance between elements starts to make a difference. When all you have in your VI is the fast loop, the compiler has complete freedom as to where to place the logic, and can keep it grouped together to minimize delays. When you put that into a larger VI, you may have some contention for specific locations on the chip, and there's no way to prioritize that.
07-26-2015 07:15 AM
Don't have LV to open your code here, but at a guess, it could be a fan-out issue. Are you summing or otherwise combining a lot of parallel logic routes at any stage?
If so, instead of, for example, adding four streams at a single point, sum them into two sets of two, put each result through a shift register (flip-flop), and then sum the two subsets. This gives place and route a bit of extra connection length to play with.
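As a rough illustration of that split-and-register idea, here is a small Python model of the behavior (not LabVIEW FPGA code). The function name is hypothetical, and the registers are assumed to reset to zero. Each loop iteration stands for one clock cycle.

```python
def pipelined_sum4(samples):
    """Model a two-stage adder tree with a register between the stages."""
    reg_ab, reg_cd = 0, 0  # register contents after reset (assumed zero)
    out = []
    for a, b, c, d in samples:
        # Stage 2: add the pair sums registered on the previous clock cycle.
        out.append(reg_ab + reg_cd)
        # Stage 1: two pairwise sums, captured in registers for the next cycle.
        reg_ab, reg_cd = a + b, c + d
    return out


# Each tuple is one clock cycle of the four input streams.
print(pipelined_sum4([(1, 2, 3, 4), (5, 6, 7, 8), (0, 0, 0, 0)]))
# [0, 10, 26]
```

The result is the same sum, just one cycle late: the first output is the reset value, and each later output is the total for the previous cycle's inputs. The trade is one cycle of latency for a shorter path between registers.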
08-09-2015 08:45 AM
So I dug in and found there was a lot more pipelining I could do. Attached is what I've come up with, which uses six stages of pipelining.
For a week or so it compiled fine within our larger project. However, now I'm getting timing violations again, after some small modifications to other parts of the project. Interestingly, the timing violations are not for my 200 MHz SCTL, but instead are for other non-SCTL code on the onboard 40 MHz clock. I understand that there are cases where timing violations can occur at 40 MHz (e.g. large types), but this code has compiled fine in the past. We are getting concerned that the compilation is getting very fragile in terms of timing.
Side question: we are procuring a new cRIO 9038. Will life be easier with its shiny new FPGA?
Steve