NIDAQmx threading multichannel streaming lag

v_pondfroth · ‎08-20-2024

I am building a GUI in Python 3.12.4 to stream data in real time from a BNC-2090A connected to an NI PCIe-6363. I want to sample 16 channels at 1000 Hz, and it works for 1 channel. The problem: when I add more than one AI voltage channel, the stream lags, gaps appear in the data, and the loop can no longer keep up. The problem gets worse with each additional channel [Case A: SampleRate = 1000]. I can work around it by dividing the sampling rate by the number of channels [Case B: SampleRate = 1000/n_chan]. But the hardware should be able to keep up with 16 channels at 1000 Hz without issue, so I think this is operator error in my code. I'm open to suggestions and any advice, please!

I am using the main thread to update a matplotlib canvas with tkinter and listen to a toggle button callback (to start or stop acquisition). I created a background thread to run a DAQ function that uses AnalogMultiChannelReader functionality to record at least 1 channel at 1 kHz sampling rate. In the comparison below, both cases use n_chan=3.

Case A: SampleRate = 1000

Size of data array being plotted ("ai_data"): (3,3000)

Mean iteration duration at 20 seconds: 212.7 ms

Size of buffer queue ("buff_data"): (4, 30)

Case B: SampleRate = 1000/n_chan

Size of data array being plotted: (3,3000)

Mean iteration duration at 20 seconds: 5.2 ms

Size of buffer queue ("buff_data"): (4, 30)

In writing this, I realized (perhaps incorrectly, or by fortunate coincidence) that the size of the buffer queue should not be the same in Case A and Case B if the sampling rates differ. So I changed the number of samples per channel passed to cfg_samp_clk_timing in the DAQ function (the "Nsamples" parameter) to keep the acquisition duration consistent (i.e. sampling 3x faster means acquiring 3x as many samples per read to give the task the same amount of time). Changing the definition from "Nsamples = 30" to "Nsamples = 30*n_chan" reduced the lag for n_chan=3, but the solution does not scale to 7 or 16 channels.
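The timing arithmetic behind the two cases can be checked in a few lines. This is only a sketch: `read_period_ms` is a hypothetical helper name, and it assumes that each loop iteration blocks until `Nsamples` samples per channel are available and that the sample rate applies per channel.

```python
def read_period_ms(sample_rate_hz: float, samples_per_read: int) -> float:
    """Time the hardware needs to acquire one read's worth of samples per channel."""
    return 1000.0 * samples_per_read / sample_rate_hz

n_chan = 3

# Case A: 1000 Hz, 30 samples per channel per read
case_a = read_period_ms(1000, 30)
# Case B: rate divided by the channel count, same 30 samples per read
case_b = read_period_ms(1000 / n_chan, 30)
# Case A again, but with Nsamples = 30 * n_chan
case_a_scaled = read_period_ms(1000, 30 * n_chan)

print(f"Case A:            {case_a:.0f} ms per read")
print(f"Case B:            {case_b:.0f} ms per read")
print(f"Case A, Nsamples*3: {case_a_scaled:.0f} ms per read")
```

Under these assumptions, Case B and the scaled `Nsamples` both give each loop iteration about 90 ms instead of 30 ms, which would be consistent with both changes reducing the lag for n_chan=3.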

Snippet below, full code attached.

with open(os.path.join('meta_data','config.yml'), 'r') as file:
    yaml_config = yaml.safe_load(file)
for each_var in yaml_config:
    for k,v in each_var.items(): locals()[k] = v

def now():
    return Quantity(time.time_ns(),'ns',scale='ms')

last_update = now()

def DAQ(q,n_chan,chan_names,stop_event):
    global last_update
    with nidaqmx.Task() as task:
        # task.ai_channels.add_ai_voltage_chan(
        #     physical_channel = "Dev3/ai0",
        #     name_to_assign_to_channel = "Dummy channel"
        # )
        for nc,cn in zip(range(n_chan), chan_names):
            task.ai_channels.add_ai_voltage_chan(
                physical_channel = "Dev3/ai{0}".format(nc),
                name_to_assign_to_channel = cn
            )

        task.timing.cfg_samp_clk_timing(SampleRate, source="", sample_mode=AcquisitionType.CONTINUOUS, samps_per_chan=Nsamples)
        reader = AnalogMultiChannelReader(task.in_stream)
        d_out = np.zeros([n_chan, Nsamples])
        task.start()

        total_read=0
        itercount=0
        old_time = Quantity(time.time_ns(),'ns',scale='ms')
        
        while not stop_event.is_set():
            # try:
            reader.read_many_sample(data=d_out, number_of_samples_per_channel=Nsamples)# read from DAQ
            now_time = Quantity(time.time_ns(),'ns',scale='ms')
            t_out = np.linspace(0,(now_time-old_time).real,np.shape(d_out)[1]).reshape(1,np.shape(d_out)[1])
            out = np.around(np.append(t_out,d_out.astype(np.uint8),axis=0), 2) # Round all values to 2 decimals to avoid overflow
            q.put_nowait(out)
            old_time = Quantity(time.time_ns(),'ns',scale='ms')
            itercount+=1
            total_read+=np.shape(d_out)[1]
            # except:
            #     task.stop()
            #     print('Task crashed after {0} samples.'.format(total_read))
            #     var_sizes(list(locals().items()))
            #     task.start()
            
            last_update = now()

ft_06 · ‎08-20-2024

My 2 cents

It seems to me you are in the same situation as in https://forums.ni.com/t5/Multifunction-DAQ/Slow-python-DAQmx-reading-amp-writing/td-p/4389116

30 samples per channel at 1 kHz means your loop executes every 30 ms. Since this is for a GUI, go for 300 ms, for example. (I assume this is a GUI to configure the streaming, not to display it, in which case you don't even care and can go higher.)

nidaqmx-python examples do:

task.timing.cfg_samp_clk_timing(1000, sample_mode=AcquisitionType.CONTINUOUS)

task.register_every_n_samples_acquired_into_buffer_event(1000, callback)

So the callback runs once every second.

v_pondfroth · ‎08-20-2024

Thank you for the feedback! The GUI does two things: configure streaming and display data in real time. I want a fast execution rate. Regarding "register_every_n_samples_acquired_into_buffer_event", how would this work with a callback? I am currently writing to a buffer Queue() [https://docs.python.org/3/library/queue.html#queue.Queue.put_nowait] inside a while loop that runs until an event occurs (the event is triggered by a button push on the GUI).

I can't understand from the documentation [https://nidaqmx-python.readthedocs.io/en/latest/task.html#nidaqmx.task.Task.register_every_n_samples...] how "register_every_n_samples_acquired_into_buffer_event" and its callback are expected to work. Would I still use a queue to transfer the data between threads? It seems like this is introducing a second buffer step that could be slowing things down or creating problems for myself.

ft_06 · ‎08-20-2024

Sorry, I did not mean that you should use register_every_n_samples_acquired_into_buffer_event(). I had in fact put in bold the values used in the API calls: the sample rate is 1 kHz and the callback is registered for 1000 samples, meaning data handling is performed once per second. So basically, handle big chunks of data, not too often.
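To illustrate how a per-chunk callback and a queue could fit together, here is a hardware-free sketch. `FakeTask` is a hypothetical stand-in and not part of nidaqmx; it only imitates the idea of a function being called each time a chunk of samples is ready. In real nidaqmx-python code the callback would be registered on the task and would read from it, which is not shown here.

```python
import queue
import threading

class FakeTask:
    """Hypothetical stand-in for a DAQ task: calls `callback` once per chunk."""

    def __init__(self, n_chan, chunk, n_chunks):
        self.n_chan = n_chan
        self.chunk = chunk
        self.n_chunks = n_chunks

    def run(self, callback):
        for i in range(self.n_chunks):
            # One row per channel, `chunk` dummy samples each
            data = [[float(i)] * self.chunk for _ in range(self.n_chan)]
            callback(data)

def make_callback(q):
    def on_chunk(data):
        # Hand the chunk off quickly; no plotting or heavy work here
        q.put_nowait(data)
    return on_chunk

def drain(q):
    """Consumer side (e.g. a GUI timer): take everything pending in one go."""
    chunks = []
    while True:
        try:
            chunks.append(q.get_nowait())
        except queue.Empty:
            return chunks

q = queue.Queue()
task = FakeTask(n_chan=3, chunk=1000, n_chunks=5)
producer = threading.Thread(target=task.run, args=(make_callback(q),))
producer.start()
producer.join()

pending = drain(q)
print(len(pending), "chunks,", len(pending[0]), "channels,", len(pending[0][0]), "samples each")
```

In this arrangement the queue is still the only hand-off between threads; the callback replaces the while loop as the producer, so it does not add a second buffering step on the Python side.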

I should have used the other example in nidaqmx-python (sampling rate is 1 kHz), which leads to the same conclusion:

while True:
    data = task.read(number_of_samples_per_channel=1000)

I don't know enough about the API internals to recommend one API over the other, but I would expect read_many_sample(Nsample) to be efficient, that is, your program is not polling but is woken up once Nsample samples have been acquired.

As you want to display data in real time, you face a tradeoff on the Nsample value (typical stuff given my Power & Perf engineering background). Your value is probably too low; you should go for hundreds of ms. This should still give a good user experience (this is not a first-person shooter, always challenge the need 😉) and relax the processing constraints.

The data display framework must also be efficient; a lot of factors come into play.

In the end, a fast execution rate may simply not be possible in Python; C and an efficient data display framework would be needed (dev/setup complexity vs. performance).

At the very least, you should start with a refresh period that is too long, like 1 s, to make sure you can acquire all the channels, then reduce the period in steps to find the most responsive setting you can get in Python. This may turn out to be good enough for you.
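The step-down approach can be sketched as a small search. Everything here is illustrative: `handle_chunk` is a hypothetical placeholder for the real per-chunk work (reading, queueing, plotting), and the 0.5 safety margin is an arbitrary assumption, not a recommendation from any NI documentation.

```python
import time

def handle_chunk(n_chan, n_samples):
    """Placeholder for the real per-chunk work (hypothetical): sums dummy data."""
    data = [[0.0] * n_samples for _ in range(n_chan)]
    return sum(sum(row) for row in data)

def measured_cost_ms(n_chan, n_samples, repeats=5):
    """Average wall-clock cost of handling one chunk, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        handle_chunk(n_chan, n_samples)
    return 1000.0 * (time.perf_counter() - start) / repeats

def smallest_sustainable_period(periods_ms, sample_rate_hz, n_chan,
                                measure=measured_cost_ms, margin=0.5):
    """Step from the longest period down; return the shortest one whose
    handling cost stays within margin * period, or None if none does."""
    best = None
    for period in sorted(periods_ms, reverse=True):
        n_samples = int(sample_rate_hz * period / 1000)
        if measure(n_chan, n_samples) <= margin * period:
            best = period
        else:
            break  # shorter periods would only be harder to sustain
    return best

candidates_ms = [1000, 500, 300, 100, 30]
print(smallest_sustainable_period(candidates_ms, sample_rate_hz=1000, n_chan=16))
```

The printed value depends on the machine and on what `handle_chunk` really does, so swapping in the actual read-and-plot path is what makes the result meaningful.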
