Why are my TDMS files so large?

I’ve been generating a lot of data lately which I am saving in the TDMS format and then analyzing with DIAdem.

My question is about the file size…why are they so large?

For example, I generated a .tdms file that had about 83K rows x 45 columns of data. This weighed in at a staggering 247 MB for the .tdms file and 217 MB for the index file! The same amount of data in a flat-file format weighs in at about 29 MB…an order of magnitude less. Is there something I’m doing wrong?

I was under the impression that the .tdms file format had a fairly small footprint. At this rate, a 4000-hour test (our target run time) will produce approximately 80 GB of files combined. Something doesn’t seem right here…
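(That estimate is just scaling up the numbers above: ~83,000 rows at the one-row-per-second rate I describe below is about 23 hours of data, 4000 / 23 ≈ 174, and 174 × (247 MB + 217 MB) ≈ 81 GB.)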

Just to give you the background...I am running a test where I am appending data to a TDMS file once a second. I am writing the same data each time (i.e. same columns). I can furnish the .vi I am using to generate the data if it will help.

I read somewhere that .tdms files are written in segments and that each segment contains index and header information. Could part of the problem be that each time I write new data to the file (every second), it is treated as a new segment and all of the header info is stored with it (even though it would be identical to the previous write)? I read that "The index file is an exact copy of the *.tdms file, except in that it does not contain any raw data and every segment starts with a TDSh tag instead of a TDSm tag" (http://zone.ni.com/devzone/cda/tut/p/id/5696). If I subtract the size of the index file from the size of the .tdms file, I get around 30 MB of raw data, which corresponds with the flat files I was comparing against earlier. Is there a way to minimize this extra index and header metadata within the file?
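As a sanity check on the raw-data number: assuming each value is an 8-byte double, 83,000 rows × 45 columns × 8 bytes ≈ 29.9 MB, which lines up with both the flat file and the .tdms-minus-index figure.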

Message 1 of 19
Every time you call TDMS Write, you will create a header on disc, which for single-value acquisition is not very efficient. In order to deal with that, TDMS has a built-in buffer feature. You can set the property NI_MinimumBufferSize for each channel to the number of values this channel is supposed to buffer before actually storing to disc. That way, you can keep your code logging scan after scan, but internally, your values will be collected and written in larger chunks. Good ballpark numbers for NI_MinimumBufferSize are 1000 or 10000. If data is still left in memory by the time you call TDMS Close or you abort VI execution, it will automatically be written to disc.
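
If it helps to see the pattern outside of LabVIEW, here is a minimal sketch of the same buffering idea in Python using the nptdms package. It only illustrates "collect values, then write one segment per chunk": the group/channel names, buffer size, and file name are placeholders, and nptdms itself knows nothing about NI_MinimumBufferSize.

    # Each call to write_segment() produces one TDMS segment (one header on disk),
    # so collecting values and writing them in chunks keeps the header overhead small.
    import numpy as np
    from nptdms import TdmsWriter, ChannelObject

    BUFFER_SIZE = 1000        # plays the role of NI_MinimumBufferSize
    buffer = []               # values collected since the last write to disk

    with TdmsWriter("buffered_example.tdms") as writer:
        for i in range(10000):            # stand-in for the once-a-second scan loop
            buffer.append(float(i))       # one new value per scan
            if len(buffer) >= BUFFER_SIZE:
                writer.write_segment([ChannelObject("Group", "Channel", np.array(buffer))])
                buffer = []
        if buffer:                        # write whatever is left before closing
            writer.write_segment([ChannelObject("Group", "Channel", np.array(buffer))])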

Hope that helps,
Herbert
Message 2 of 19
Do I set this using the "TDMS Set Properties" VI within my LabVIEW program?
Message 3 of 19
Yes. Before you start writing to the file, once you have the list of channel names you'll be working with, loop over all of them and set the property (the point of setting it on a per-channel basis is that there are situations where you might want different buffer sizes on different channels).
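
As a rough textual analogue of that "loop over the channels and set the property before the first write" step, here is a Python sketch using nptdms. Keep in mind that NI_MinimumBufferSize is only interpreted by the LabVIEW TDMS primitives; nptdms merely records the property in the file, and the channel names, group name, and sizes below are made up.

    import numpy as np
    from nptdms import TdmsWriter, ChannelObject

    # Hypothetical channel list with per-channel buffer sizes
    buffer_sizes = {"Temperature": 1000, "Pressure": 1000, "Vibration": 10000}

    with TdmsWriter("example.tdms") as writer:
        # Attach NI_MinimumBufferSize to each channel before any data is written
        channels = [
            ChannelObject("Data", name, np.zeros(0),
                          properties={"NI_MinimumBufferSize": np.int32(size)})
            for name, size in buffer_sizes.items()
        ]
        writer.write_segment(channels)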

Herbert
Message 4 of 19

Excellent.  Thanks for the help!

I just ran a 2-minute (120 rows) sample with the buffer set to 1000 and compared it to a run with no buffer. My TDMS file size went from 358K to 52K, and the index file went from 315K to 9K. It seems that the larger the buffer I set, the smaller the file will ultimately be, correct? You stated that when TDMS Close is called or the VI is aborted, the remaining data in the buffer is written to the file. I'm assuming that in the event of a power loss, all the data in the buffer is lost. Therefore, would you recommend setting the buffer size to the amount of data I can tolerate losing in the event of some unforeseen event (power loss, etc.)?
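
For a rough sense of scale at my one-scan-per-second rate: a buffer of 1000 values is about 1000 seconds ≈ 17 minutes of data at risk per channel if the power fails, and 10000 values would be roughly 2.8 hours.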

Message 5 of 19
You're right in both cases.

A larger buffer will make for a smaller file, plus it will speed up reading from the file.

In case of a power loss, you will lose the buffered data. You can force the buffer to be written to disc by calling TDMS Flush, but that will bring you back to having a more fragmented, larger file.

Herbert
Message 6 of 19
What would defragmenting the TDMS file afterwards mean for the file size?

Ton
Free Code Capture Tool! Version 2.1.3 with comments, web-upload, back-save and snippets!
Dutch LabVIEW user group www.lvug.nl
My LabVIEW Ideas

LabVIEW, programming like it should be!
Message 7 of 19
Defragmenting should make the file even smaller, unless it only contains the contents of a single buffer.
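
As a rough illustration of what defragmenting amounts to for file size, here is a Python sketch using the nptdms package that reads a fragmented file and rewrites all of its channels into a single segment, so only one set of headers remains. The file names are placeholders, channel properties are not copied, and this is not the LabVIEW defragment feature itself.

    from nptdms import TdmsFile, TdmsWriter, ChannelObject

    # Read the fragmented file entirely into memory
    fragmented = TdmsFile.read("fragmented.tdms")

    # Collect every channel's data (properties omitted for brevity)
    channels = [
        ChannelObject(group.name, channel.name, channel[:])
        for group in fragmented.groups()
        for channel in group.channels()
    ]

    # Write everything back out as one segment with a single header
    with TdmsWriter("defragmented.tdms") as writer:
        writer.write_segment(channels)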
Herbert
Message 8 of 19
How does one access this "NI_MinimumBufferSize" property? I'm running LabVIEW RT 8.2. I don't think I have the 8.2.1 upgrade... is it only available in 8.2.1? Is this upgrade free?

Thanks.
Message 9 of 19
You'll need 8.2.1 for that. It's not enough to just switch runtime engines. You also need to compile your app with 8.2.1.

Not sure this is free. You need to contact your local sales person for that.

Herbert
Message 10 of 19