12-15-2010 07:59 AM
Hi,
As always, I hope someone can help.
I need to read a file and, for each line of the file, put the contents into an array.
This has been done in Python:
firstline = 1
aline = inline1.split()      # splits inline1 on the spaces into aline
atime = int(aline[2])        # item 3 of the array is the time
areg0 = hex2dec(aline[3])    # converts item 4 of aline to decimal and stores it in areg0
areg1 = hex2dec(aline[4])
This is the first line, which has 600 elements in the array.
It then moves to the next line and reads it into a second array, also 600 elements.
Element 3 of the first array is compared with element 3 of the second array.
This is repeated for all 220000 lines of the file.
I have a simple file that I can read line by line, but I cannot manage to split each line on the spaces and save the result into an array.
I have been looking through the Discussion Forums but have been unable to get this sorted.
I have seen mention of .ini files but I do not have the structure that they might have.
How do I split a line and store the result into an array for use?
Thanks for the help
Simon
12-15-2010 09:48 AM
Hey Simon -
One way to do this would be to use the strtok function from the ANSI C library. It will traverse a string looking for a set of characters you define, and when it finds one, it will "tokenize" the string by replacing the character with a NUL character. By saving off the return value of strtok, you're able to split the string based on any set of separators you choose.
I've written and attached a little example that takes an ASCII file, splits each line based on spaces or tabs, assigns each word to an array element, and then prints each array element. It could be extended to meet your needs. It's not multibyte safe, but could easily be extended to be so. Also, if this makes its way into something official, you'll want to be sure to add some error handling... 🙂
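A minimal sketch of the strtok approach described above. This is not the attached example; the function name split_line and the separator set are assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits `line` in place on spaces, tabs and line endings.
   Stores up to max_items token pointers in items[] and returns the count.
   The pointers refer into `line`, so `line` must outlive items[].
   split_line is a hypothetical name, not the function from the attachment. */
static int split_line(char *line, char *items[], int max_items)
{
    int count = 0;   /* int: a line may hold hundreds of items */
    char *tok;

    assert(line != NULL && items != NULL);
    for (tok = strtok(line, " \t\r\n");
         tok != NULL && count < max_items;
         tok = strtok(NULL, " \t\r\n")) {
        items[count++] = tok;   /* strtok has already NUL-terminated it */
    }
    return count;
}
```

Because strtok writes NUL characters into the buffer, the line has to be a writable array (for example the buffer filled by fgets), not a string literal.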
NickB
National Instruments
12-15-2010 10:29 AM
NickB,
Thanks that appears to be the start of something I can use.
There is one problem that I can see so far.
Apart from the first 600 lines, which have 6 items each, every other line of the file has 600 items, and it would appear I can only get a fraction of these longer lines.
I can only raise the #define MAX_LINE_SIZE to 636; above that I get an error.
Any thoughts on how this can be increased to get the full line?
Thanks for the help
Simon
12-15-2010 10:39 AM
Simon,
MAX_LINE_SIZE is used to dimension the buffer into which lines are read. If you know your lines are longer than 512 characters, you can simply use a larger buffer by setting a larger value for that macro. Nowadays computers have a lot of memory, so the size of that single buffer will have no major impact on system resources.
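As a rough sketch of that idea, assuming lines are read with fgets: the value 8192 and the helper name read_full_line are assumptions, not taken from the attached example. The helper also reports when a line did not fit, so a buffer that is too small shows up as an error instead of a silently shortened line:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed value: 600 items of 4 hex digits plus a space need about 3000 bytes. */
#define MAX_LINE_SIZE 8192

/* Reads one line into buf (hypothetical helper).
   Returns 1 for a complete line, 0 at end of file, and -1 if the line
   did not fit, in which case the rest of that line is discarded. */
static int read_full_line(FILE *fp, char *buf, size_t size)
{
    size_t len;
    int c;

    assert(fp != NULL && buf != NULL && size > 1);
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        return 1;                      /* newline seen: the whole line fit */
    c = fgetc(fp);
    if (c == EOF || c == '\n')
        return 1;                      /* only the line ending was missing */
    while ((c = fgetc(fp)) != EOF && c != '\n')
        ;                              /* skip the part that did not fit */
    return -1;
}
```

A caller would declare `char line[MAX_LINE_SIZE];` and pass `sizeof line` as the size.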
12-15-2010 10:41 AM
Eeek - there was a bit of a bug in my example code. I mistakenly defined 'i' as a char in the function SplitLine. It should be an int instead. My guess is that making that change should fix things for you.
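To illustrate why the type matters: a char can typically count only up to 127 (or 255 if char is unsigned), which is enough for a 6-item line but not for a 600-item one. The helper below is hypothetical and only makes that limit explicit:

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if a counter of type char can reach n_items, otherwise 0.
   Hypothetical helper, shown only to make the range limit visible. */
static int char_index_can_hold(int n_items)
{
    assert(n_items >= 0);
    return n_items <= CHAR_MAX;   /* CHAR_MAX is usually 127 or 255 */
}
```

With the usual 8-bit char, char_index_can_hold(6) is true and char_index_can_hold(600) is false, which is why the counter should be declared as int.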
NickB
National Instruments
12-16-2010 03:54 AM
Nick,
Changing the declaration of “i” did the job, as I can now read in a whole line. Thanks a lot for that.
At the risk of pushing my luck can I also ask for direction on the following?
As I need to compare line 1 with line 2, then line 2 with line 3, and so on through the file, I need to copy the arrays and compare them.
I am once more struggling to copy the original array into a copy.
I only want two arrays on the go, as a file can run to 200000 lines.
Would it be better to perform the read line twice and then do the compare, assuming I can compare an element of one array with the same element of the second?
Thanks for the help
Simon
Learning and struggling with C!!!
12-16-2010 09:35 AM
Hey Simon -
Do you need to compare each individual item, or would it be sufficient to compare each line as a whole? Will strcpy work for comparison, or do you need to do more complex comparison for dates/times, etc? Do you only need to compare line x to line x+1?
NickB
National Instruments
12-16-2010 10:08 AM
NickB,
Thanks for the interest and the help.
Line 1 element 1 compared to line 2 element 1 and so on till the end of the line.
The result of the compare, in this case a difference, for each element is checked.
Then line 2 element 1 compared to line 3 element 1 and so on till the end of the line.
Same check as above.
This is done till the end of the file.
The result of ALL compares must be a pass; all fails are counted, and the count is displayed at the end.
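A sketch of that element-by-element check in C, working on two lines that have already been split into items. The pass criterion is not stated here, so the sketch assumes an element passes when the difference equals a caller-supplied expected_diff; that parameter, the starting column `first`, and both function names are assumptions:

```c
#include <assert.h>
#include <stdlib.h>

/* Converts one hexadecimal item such as "d6b0" to a number,
   the role hex2dec plays in the Python version. */
static long hex_to_long(const char *item)
{
    assert(item != NULL);
    return strtol(item, NULL, 16);
}

/* Compares items first..n_items-1 of two split lines and returns the
   number of fails. Assumed criterion: cur - prev must equal expected_diff. */
static int count_fails(char *const prev[], char *const cur[],
                       int first, int n_items, long expected_diff)
{
    int i;
    int fails = 0;

    for (i = first; i < n_items; i++) {
        long diff = hex_to_long(cur[i]) - hex_to_long(prev[i]);
        if (diff != expected_diff)
            fails++;               /* this element did not pass */
    }
    return fails;
}
```

Adding the return value for every pair of adjacent lines to a running total gives the fail count to display at the end of the file.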
Reading in lines and dividing them into columns is a core requirement for the data we use; in some cases what is required is all the data in one column across all lines of the file.
I have come to the realisation that this is no easy task. In Python it would appear a lot more straightforward, so once more thanks for the help and direction on this problem.
A very small snippet from a file (600 entries across with 200000 lines):
1 0 99 d6b0 d6b1 d6ba d714 d715 d716 d717 d718
1 0 99 d909 d90a d913 d96d d96e d96f d970 d971
1 0 99 db62 db63 db6c dbc6 dbc7 dbc8 dbc9 dbca
1 0 99 ddbb ddbc ddc5 de1f de20 de21 de22 de23
1 0 99 e014 e015 e01e e078 e079 e07a e07b e07c
1 0 99 e26d e26e e277 e2d1 e2d2 e2d3 e2d4 e2d5
Thanks for the help
Simon
Learning and struggling with C!!!
12-16-2010 11:19 AM
Srm27:
How often do you expect failures?
If you don't have many failures, then given the relatively large size of your dataset you may save yourself some execution time by comparing the entire line, as NickB suggested, using strcmp().
Do you have any reason to split the lines other than to identify failures?
If finding failures is your only reason to split the line, then I'd suggest comparing the whole line first, and only if they don't match should you split the line and start looking for individual failures.
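A sketch of that two-stage check, under the assumption that "failure" here simply means an item that differs between the two lines. Since strtok keeps a single internal position and cannot walk two strings at once, this version steps through both lines with strspn and strcspn and leaves them unmodified; the function name and separator set are made up for illustration:

```c
#include <assert.h>
#include <string.h>

#define SEPARATORS " \t\r\n"

/* Returns the number of items that differ between two lines.
   Identical lines are detected with one strcmp() and never split. */
static int count_differing_items(const char *a, const char *b)
{
    int diffs = 0;

    assert(a != NULL && b != NULL);
    if (strcmp(a, b) == 0)
        return 0;                          /* fast path: nothing to split */

    for (;;) {
        size_t len_a, len_b;

        a += strspn(a, SEPARATORS);        /* skip leading separators */
        b += strspn(b, SEPARATORS);
        if (*a == '\0' && *b == '\0')
            break;                         /* both lines exhausted */
        len_a = strcspn(a, SEPARATORS);    /* length of the current item */
        len_b = strcspn(b, SEPARATORS);
        if (len_a != len_b || strncmp(a, b, len_a) != 0)
            diffs++;                       /* also counts a missing item */
        a += len_a;
        b += len_b;
    }
    return diffs;
}
```

The fast path only pays off when most adjacent lines really are identical; if every line is expected to differ from the one before it, the strcmp call is just extra work.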
12-16-2010 05:07 PM - edited 12-16-2010 05:11 PM
Hey Simon -
I'm still a little unclear on exactly what your requirements are. It seems as if you do have to do an item-by-item compare, but not for all the items in the line? Anyhow, I modified the example I attached above to remember the previous line and compare each of its items against the items of each new line. I've attached my modified example and the file I used to test it. I also use some of the List API from the Programmer's Toolbox in this modified version. I suspect you will find this API to be very useful in your project.
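For readers without the attachment, here is a rough sketch of the "remember the previous line" idea in plain ANSI C. It does not use the List API and is not the attached code; the buffer size and function name are assumptions. Only two line buffers exist at any time, and their roles are swapped after each line instead of copying one into the other:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SIZE 8192   /* assumed to be large enough for one line */

/* Walks the file once, keeping only the previous and the current line in
   memory, and returns how many lines differ from the line before them.
   The static buffers make this hypothetical helper non-reentrant. */
static long count_changed_lines(FILE *fp)
{
    static char buf_a[LINE_BUF_SIZE], buf_b[LINE_BUF_SIZE];
    char *prev = buf_a, *cur = buf_b, *tmp;
    long changed = 0;

    assert(fp != NULL);
    if (fgets(prev, LINE_BUF_SIZE, fp) == NULL)
        return 0;                        /* empty file */
    while (fgets(cur, LINE_BUF_SIZE, fp) != NULL) {
        if (strcmp(prev, cur) != 0)      /* placeholder for a per-item check */
            changed++;
        tmp = prev;                      /* swap roles instead of copying */
        prev = cur;
        cur = tmp;
    }
    return changed;
}
```

The strcmp call marks the spot where an item-by-item comparison of prev and cur would go.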
I'm sure what I've attached below is far from what you actually need, but hopefully you'll be able to modify it to end up with a good result.
Let me know if you have any questions,
NickB
National Instruments
PS: Al, looks like you were able to understand my intentions even though I used the wrong function name. Thanks for clarifying for me 🙂