10-13-2015 12:28 AM
I have been using datalog files for this purpose, but just for reference I would like to know what the efficient methods are for storing and retrieving data to/from a large array structure on disk. Some of the constraints are:
- Must not load the whole file into memory, due to its large size.
- Must be able to walk through the elements in the data structure, e.g., retrieve element #123 from the array.
- Able to perform array-like operations: read, write, replace, ...
- Element size is not known in advance and varies wildly (like flattened strings...)
- Efficient storage (minimal wasted space)
Regards,
10-13-2015 02:19 AM
I expect that existing formats like the datalog you mentioned or TDM already handle this better than you would, so I would not suggest rolling your own. A database also comes to mind as something which will probably be more efficient than you will. There are specialists who deal with this. Let them do their work.
If you still want a reference, I don't have any particular material for you (I've never done anything like this myself), but the most obvious way to balance your somewhat conflicting requirements is to use a method similar to how data is stored in memory: keep a dedicated space at the beginning of the file for all the "pointers" and use that to index into specific positions in the file (or into multiple files, if that works for you). You then trade off between having multiple files (having to open handles, etc.) and having to track the space in a single file (finding empty blocks, resizing the file, defragmenting, etc.). You could also search for what people have suggested in the past (I'm guessing there's material on this), but my recommendation would still be to use established tools.
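Something along these lines, as a minimal sketch in Python of the "pointers at the front of the file" idea. The file layout, the (offset, length) slot format, and the class name are illustrative assumptions, not an established format, and it deliberately ignores the free-space/defragmentation problem mentioned above:

```python
import struct

# Hypothetical layout: a fixed-capacity index region at the start of the file
# (one (offset, length) slot per element), followed by variable-length records
# appended at the end.

SLOT_FMT = "<QQ"                      # 8-byte offset + 8-byte length per slot
SLOT_SIZE = struct.calcsize(SLOT_FMT)

class IndexedRecordFile:
    def __init__(self, path, capacity=1024):
        self.capacity = capacity
        self.index_bytes = capacity * SLOT_SIZE
        try:
            self.f = open(path, "r+b")
        except FileNotFoundError:
            self.f = open(path, "w+b")
            # Reserve the index region up front; offset 0 marks an empty slot.
            self.f.write(b"\x00" * self.index_bytes)

    def _read_slot(self, i):
        self.f.seek(i * SLOT_SIZE)
        return struct.unpack(SLOT_FMT, self.f.read(SLOT_SIZE))

    def _write_slot(self, i, offset, length):
        self.f.seek(i * SLOT_SIZE)
        self.f.write(struct.pack(SLOT_FMT, offset, length))

    def read(self, i):
        offset, length = self._read_slot(i)
        if offset == 0:
            raise KeyError(f"element #{i} is empty")
        self.f.seek(offset)
        return self.f.read(length)

    def write(self, i, data: bytes):
        # Append the payload at the end of the file and point the slot at it.
        # Replacing an element with a larger one strands the old bytes, so a
        # real implementation would need free-space tracking / compaction.
        assert 0 <= i < self.capacity
        self.f.seek(0, 2)             # seek to end of file
        offset = self.f.tell()
        self.f.write(data)
        self._write_slot(i, offset, len(data))
        self.f.flush()
```

Usage would be something like store = IndexedRecordFile("records.bin"); store.write(123, b"some flattened string"); store.read(123). Only the 16-byte slot and the requested record are ever read, so the whole file never has to be loaded into memory.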
10-13-2015 02:23 AM
Would you be so kind as to comment on one of my other topics, Datalog Write can only append and not replace record?
10-13-2015 07:23 AM
No, because I know absolutely nothing about the datalog format.