08-16-2012 08:29 PM
Hi all,
My application involves grabbing images from a 3-tap, 16-bit camera using FlexRIO. The PXI controller I am using is Windows-based, and the FlexRIO module is a PXI-7954 with an NI 1483 adapter. The image I am grabbing is 2560 x 2160, U16, and the clock rate is 100 MHz. I've been trying for over a week, and I still cannot get an image from the camera because I keep getting the DMA Write Timeout error. Right now, the DMA size in the FPGA is set to 130k, but whenever I try to increase it further, I get a compilation error. I've tried to have the host program grab 100k data points from the FPGA DMA every millisecond, but the fastest I can manage seems to be about once every 10-15 ms. Perhaps Windows has its own limitation... Attached is the program that I am using, modified from the LabVIEW shipped example. Please advise: how do I move forward from here? Or is it possible to increase the DMA buffer size to 10x the current limit?
Also, has anybody used the example before? I did another test with the example using the same camera. This time, I ignored two of the taps and assumed that I have only one 8-bit tap, to see whether it is possible to stream data (even if only 8-bit) from the camera to Windows. In the host example program, I deleted the portion where the 1D array of data is resized into 2D and transformed into an image. I also added a second DMA read: the first DMA read reads 0 elements with a timeout of 0 and passes the number of remaining elements to the second DMA read... only to find that I still get a write timeout error. Has anybody managed to get this working?
Please advise.
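To put the numbers in this post side by side, here is a rough back-of-the-envelope sketch in Python. It assumes one pixel per DMA FIFO element and reads "130k" and "100k" as 130,000 and 100,000 elements; neither assumption is confirmed by the attached program.

```python
# Figures quoted in the post. One pixel per FIFO element is an assumption.
WIDTH, HEIGHT = 2560, 2160
BYTES_PER_PIXEL = 2               # U16
FIFO_DEPTH = 130_000              # "130k" read as 130,000 elements
READ_SIZE = 100_000               # elements requested per host read
READ_PERIODS_S = (0.010, 0.015)   # observed host loop period, 10-15 ms

pixels_per_frame = WIDTH * HEIGHT
bytes_per_frame = pixels_per_frame * BYTES_PER_PIXEL
fifo_fraction = FIFO_DEPTH / pixels_per_frame


def host_rate(period_s):
    """Elements per second the host drains at a given loop period."""
    return READ_SIZE / period_s


def seconds_per_frame(period_s):
    """Time to move one full frame through the FIFO at that drain rate."""
    return pixels_per_frame / host_rate(period_s)


if __name__ == "__main__":
    print(f"pixels per frame: {pixels_per_frame}")
    print(f"bytes per frame:  {bytes_per_frame}")
    print(f"FIFO holds {fifo_fraction:.2%} of one frame")
    for period in READ_PERIODS_S:
        print(f"{period * 1e3:.0f} ms loop: {host_rate(period) / 1e6:.2f} M elements/s, "
              f"{seconds_per_frame(period):.3f} s per frame")
```

Under these assumptions the FIFO holds only a small fraction of one frame, so it has to be drained continuously while the frame is still arriving.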
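For anyone reading without the attachment, the two-read pattern described above can be sketched in Python roughly as follows. `FakeFifo` is a hypothetical in-memory stand-in written for this illustration, not a real driver API; it only mimics a read that returns the data plus the number of elements still queued.

```python
from collections import deque


class FakeFifo:
    """Hypothetical stand-in for a host-side DMA FIFO handle (not a real driver API)."""

    def __init__(self):
        self._buf = deque()

    def push(self, values):
        """Simulate the FPGA side writing elements into the FIFO."""
        self._buf.extend(values)

    def read(self, count, timeout_ms):
        """Return (data, elements_remaining); raise if fewer than count are queued."""
        if count > len(self._buf):
            raise TimeoutError("not enough elements before timeout")
        data = [self._buf.popleft() for _ in range(count)]
        return data, len(self._buf)


def drain(fifo):
    """Read whatever is currently queued, using the two-step pattern."""
    # Step 1: a zero-element, zero-timeout read only reports the backlog.
    _, available = fifo.read(0, timeout_ms=0)
    # Step 2: read exactly that many elements, so the read cannot time out.
    data, _ = fifo.read(available, timeout_ms=0)
    return data


if __name__ == "__main__":
    fifo = FakeFifo()
    fifo.push(range(5))
    print(drain(fifo))
```

This keeps the host read from timing out, but it does not change how fast the host empties the FIFO. If the FPGA side fills the FIFO faster on average than the host drains it, a write timeout can still occur.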
08-17-2012 07:35 PM
Hi,
I guess by now I have realized that I may not be able to acquire all the pixels in one shot. I'm currently exploring the use of the onboard memory as DRAM, which is a first for me. I've studied the logic flow of one of the examples and I think I have a basic understanding of how it works. However, when I try to compile the program, I keep getting a timing violation error. Attached is the modified code; it may not be final, but it is close. I would really appreciate it if somebody familiar with FPGA and DRAM could comment on the code. After putting so many feedback loops here and there, I got even more confused.
Many thanks!
08-20-2012 08:49 AM
You'd need to attach the whole DRAM project for someone to directly replicate your results and see your timing violation. If you are still having trouble working that out and need another pair of eyes on it, you'll need to post it all.
On the subject of DRAM in general, you'll find that the DRAM primitive does not perform batching of transactions. The way this loop is constructed, it will send the reads and writes to the DRAM one at a time, and in a round-robin fashion (one read, one write, one read, etc). This will cause the DRAM access to be inefficient because there are penalties to changing your access type and to accessing data that isn't "local" in its address, to the point that it is unlikely that the DRAM will be able to respond to the access pattern quickly enough to work for your application. In order to efficiently use DRAM, you have to perform reads and writes in bursts to adjacent addresses. You also need to work on the logic that decides what to do when the DRAM primitive tells you to not read/write (not ready for data). You will want this to not advance the address, and to buffer data up.
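The "don't advance the address, and buffer data up" rule can be illustrated with a small cycle-by-cycle simulation in Python. This is only a behavioural sketch: `write_burst` and the `ready_per_cycle` flag list are made up for the illustration and do not model the actual DRAM primitive or its timing.

```python
from collections import deque


def write_burst(samples, ready_per_cycle, start_addr=0):
    """Simulate one sample arriving per cycle and a memory that accepts
    a write only on cycles where its ready flag is True."""
    pending = deque()   # local buffer that holds samples while memory is busy
    memory = {}         # address -> stored sample
    addr = start_addr
    incoming = iter(samples)
    for ready in ready_per_cycle:
        sample = next(incoming, None)
        if sample is not None:
            pending.append(sample)          # always capture incoming data
        if ready and pending:
            memory[addr] = pending.popleft()
            addr += 1                       # advance only after an accepted write
    return memory, list(pending)


if __name__ == "__main__":
    mem, left = write_burst([10, 20, 30, 40],
                            [True, False, False, True, True, True])
    print(mem, left)
```

Because the address moves only when a write is accepted, the stored data stays in adjacent addresses with no gaps even when the memory stalls. The local buffer has to be deep enough to absorb the longest stall.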
All that said: in general, you might find it easier to start from the DRAM FIFO CLIP. It seems that you are storing and accessing the data in a linear fashion, and using this will abstract away most of these problems for you. You may still need to locally buffer data because the DRAM FIFO cannot read and write simultaneously at 100 MHz. However, you can further optimize this by bundling your i16 data, four samples at a time, into one 64-bit integer to write to the DRAM, which will result in only writing data in 1 out of 4 clock cycles. At that point, the DRAM should easily be able to keep up.
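The "1 out of 4 clock cycles" figure follows from simple arithmetic, sketched here in Python. It assumes one 16-bit sample arrives on every 100 MHz clock cycle, which is an assumption about the data path rather than something measured.

```python
CLOCK_HZ = 100_000_000   # 100 MHz clock from the original post
SAMPLE_BITS = 16         # one i16 sample, assumed to arrive every cycle
WORD_BITS = 64           # width of the packed word written to DRAM

samples_per_word = WORD_BITS // SAMPLE_BITS            # samples per 64-bit write
unpacked_writes_per_s = CLOCK_HZ                       # one write on every cycle
packed_writes_per_s = CLOCK_HZ // samples_per_word     # one write every 4th cycle
free_cycle_fraction = 1 - 1 / samples_per_word         # cycles left over for reads

if __name__ == "__main__":
    print(f"samples per word:       {samples_per_word}")
    print(f"writes/s without pack:  {unpacked_writes_per_s}")
    print(f"writes/s with packing:  {packed_writes_per_s}")
    print(f"cycles free for reads:  {free_cycle_fraction:.0%}")
```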
As a plug, the FlexRIO Instrument Development Library has a few elements that you might find helpful in building this. It does not contain examples of using the DRAM FIFO CLIP, but it contains a set of data repackers in its Data Manipulation library that would make it a lot easier to use the DRAM FIFO efficiently. It also contains an acquisition engine that uses the DRAM to store multiple, segmented records - but that may be overkill for what you are doing right now.
https://decibel.ni.com/content/docs/DOC-15799
If you build a stream that looks like this:
Write side: 1 x i16 -> Packer.vi -> 4 x i16 -> u64 -> DRAM FIFO
Read side: DRAM FIFO -> u64 -> 4 x i16 -> Unpacker.vi -> 1 x i16
... you will not have to deal with addresses at all.
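As a rough software model of that stream, here is a Python sketch that packs four i16 values into one 64-bit word, pushes the words through a queue standing in for the DRAM FIFO, and unpacks them on the other side. The function names and the bit order (first sample in the lowest 16 bits) are assumptions made for the illustration; the library's Packer.vi and Unpacker.vi may order the samples differently.

```python
from collections import deque

MASK16 = 0xFFFF


def pack4(samples):
    """Pack four signed 16-bit values into one unsigned 64-bit word.
    Assumed layout: first sample in the lowest 16 bits."""
    assert len(samples) == 4
    word = 0
    for i, s in enumerate(samples):
        assert -32768 <= s <= 32767
        word |= (s & MASK16) << (16 * i)
    return word


def unpack4(word):
    """Split one 64-bit word back into four signed 16-bit values."""
    out = []
    for i in range(4):
        v = (word >> (16 * i)) & MASK16
        out.append(v - 0x10000 if v & 0x8000 else v)   # restore the sign
    return out


def stream_through_fifo(samples):
    """Write side packs into the FIFO, read side unpacks from it.
    Trailing samples that do not fill a whole word are dropped here."""
    fifo = deque()   # stand-in for the DRAM FIFO, no addresses involved
    whole = len(samples) - len(samples) % 4
    for i in range(0, whole, 4):
        fifo.append(pack4(samples[i:i + 4]))
    out = []
    while fifo:
        out.extend(unpack4(fifo.popleft()))
    return out


if __name__ == "__main__":
    data = [1, -2, 300, -32768, 5, 6, 7, 8]
    print(stream_through_fifo(data) == data)
```

Note that neither side ever computes an address; ordering is preserved purely by the FIFO.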