06-23-2014 04:40 PM
I'm developing on LabWindows/CVI 2013.
I'm having trouble with the Timer() function.
I'm using the Timer() function as a "time-out".
Before I go into a while loop I compute a timeout time. Each time around the loop the code calls Timer().
If Timer() returns a time that is greater than the timeout it exits with an error.
Very simple and I've used this often.
The problem is that every once in a while the Timer() function adds 2^32 (4294967296) to the value it returns.
This causes the timeout to happen when it should not.
It does not happen on a Win7 64 bit computer that I'm testing on.
But it does happen on a Win7 32 bit computer.
Another thing that may be significant is that this loop runs in and Async timer callback.
Anyone have any ideas on what is causing this?
Thanks,
Kirk
06-24-2014 02:41 AM
Hmmm, Timer returns a double, so there shouldn't be any rollover. There must be some bug in the function with an internal rollover.
06-24-2014 03:28 AM
Well, it used to exist a rollover problem in the very past (before CVI5.5 if I'm not wrong), when timer rolled over in about 49.5 days (2^32 msecs). It has been corrected in Timer () function and in UI timers. See this thread and this one for reference.
In both threads, however, Luis says that async timers are implemented differently and could be subject to this problem. I don't know if it has been resolved since 2007.
In the past I used to use async timers for long-lasting tests and I had properly handled the rollover, next I moved to a different approach so I can't tell if and when it has been corrected.
06-24-2014 09:47 AM
Since then, I don't have any new information regarding async timers. However, it sounds as if the problem in this case is wiith the Timer() function. If so, it doesn't matter that the main loop is taking place within an async timer callback, since that does not impact the values returned by Timer().
Also, there is no indication in the problem description that this program or loop runs for a very long time (49 days or greater), so I'd be disinclined to suspect a rollover problem.
At this point I'm stumped as to what might be causing this. If it is system-dependent, I suppose it could be being caused by some misfiring of the performance counter (I've seen that happen only once before, because of some faulty hardware, but that's still vague speculation on my part). Is this problem consistently reproducible, and does the failure happen relatively early in the process? If so, would it be possible to test a simple program in this computer that does nothing but call the Timer function in a loop and logs the return values, in order to analyze the bad behavior in a more deterministic manner, and also hopefully rule out possible interference from other areas of the program?
Luis
06-24-2014 12:29 PM
Roberto, Luis,
Luis you are correct, the problem is with the return from a call to Timer(). In about 2 hours of running it jumped up by 2^32 about 20 times.
For now I've coded a work-around that looks for the jump and recalculates my timeout time so it doesn't exit the loop early.
This seems to work but I'd like to know why this is happening.
Here's a sample of what I'm seeing in my application for Timer() values when it "jumps":
getAccLogTimeNow = 4295079.367815
getAccLogTimePrevious = 112.071812
getAccLogDelta Now - Previous = 4294967.296003
getAccLogTimeNow = 8590396.950761
getAccLogTimePrevious = 4295429.371964
getAccLogDelta Now - Previous = 4294967.578797
getAccLogTimeNow = 12885613.959814
getAccLogTimePrevious = 8590646.663812
getAccLogDelta Now - Previous = 4294967.296002
getAccLogTimeNow = 17181056.510927
getAccLogTimePrevious = 12886088.959695
getAccLogDelta Now - Previous = 4294967.551232
getAccLogTimeNow = 21476063.514672
getAccLogTimePrevious = 17181096.218672
getAccLogDelta Now - Previous = 4294967.296000
getAccLogTimeNow = 25771130.848408
getAccLogTimePrevious = 21476163.552405
getAccLogDelta Now - Previous = 4294967.296003
Today I created a simple application and I'm running it at the same time as the program that has the problem. So far the simple app. does not show any problem.
One thing to note is that the Timer() value starts at zero when the program starts.
This means each app that runs must save some kind of offset from the "master" timer.
I thought maybe that once I saw the Timer() value jump in the problem app that my test application would also see the jump but that is not the case. It continues normally.
I wonder if there is something my code is doing to mess up the offset or something???
Also if it matters the app with the problem is doing lots of TCP communication. My simple test app is not.
Any other things you can think of that I can do to dig in and diagnose the cause of this?
Thanks,
Kirk
06-25-2014 02:14 AM
Kirk,
Timer () starting from zero is the expected behaviour, as the function returns the time difference from the first call to any time-related function. See the help for a detailed explanation. If you need some absolute time reference you have a series of resources to choose from, from C time () function up to CVI Absolute Time functions like GetCurrentCVIAbsoluteTime.
I'm curious about your timing function: I am using this paradigm very often and never had any strange behaviour:
double tini; tini = Timer (); while (Timer () - tini < timeoutValue) { doSomething (); // Your code here }
Can you post your code? It's not necessary to post the complete function, just the time-related instructions.
06-25-2014 01:01 PM
Roberto,
Here is the code that exhibits the problem. You can see the logPrintf(..) calls that generated the data in my previous post.
I can post the "simple" test app but it does not exhibit the problem.
Let me know what you think.
Thanks,
Kirk
TStat getAccLogSamples( TStreamControl * pSc, unsigned int numSamplesToGet, short int * * pSampleBlock, double timeoutSecs )
{
TStat status = STS_OK;
double timeOutTime;
time_t currentTime;
int i;
// this do while is not really a loop. It is used so the code can exit usging a break statement at any point if there is a function that returns status
// notice the break; at the bottom just above the while(1)
do
{
getAccLogElapsedTime = 0.0;
getAccLogStartTime = Timer();
// the caller specifies how long to wait before it gives up
timeOutTime = Timer() + timeoutSecs;
pSc->totalSamplesToSave = numSamplesToGet;
// figure out how many blocks to transfer
if( (numSamplesToGet % NUM_ACCEL_SAMPLES_PER_XFER) == 0 )
{
// if it goes in evenly
pSc->totalStreamBlocksToTransfer = numSamplesToGet / NUM_ACCEL_SAMPLES_PER_XFER;
}
else
{
// to make sure at least the required number are transferred have to add one block
pSc->totalStreamBlocksToTransfer = (numSamplesToGet / NUM_ACCEL_SAMPLES_PER_XFER) + 1;
}
// start the data moving
status = startStreamAccLog( pSc );
if( status ) break;
// initialize the timer that paces the polling so it doesn't happen too often
accLogPollTimeNow = Timer();
accLogPollNextPollTime = accLogPollTimeNow + ACCLOG_POLL_INTERVAL_SEC;
// **** this is the while loop that times out prematurely when the timer jumps by 2^32
// wait for the sample to complete
while( pSc->streamingActiveFlag )
{
// is it time to poll the streaming data?
accLogPollTimeNow = Timer();
if( accLogPollTimeNow >= accLogPollNextPollTime )
{
// yes, advance the poll time
accLogPollNextPollTime = accLogPollTimeNow + ACCLOG_POLL_INTERVAL_SEC;
// keep the data moving while waiting for the block to complete
status = pollStreamAccLog( &streamControlAccLog );
if( status )
{
// attempt to shut it down
stopStreamAccLog( &streamControlAccLog, 1.0 );
break;
}
}
// debug log print when timer jumps
getAccLogTimeNow = Timer();
if( (getAccLogTimeNow - getAccLogTimePrevious) > 10000.0 )
{
// here is where it recognizes the jump in the return from Timer()
logPrintf( "getAccLogTimeNow = %f\n", getAccLogTimeNow );
logPrintf( "getAccLogTimePrevious = %f\n", getAccLogTimePrevious );
logPrintf( "getAccLogDelta Now - Previous = %f\n", getAccLogTimeNow - getAccLogTimePrevious );
// the timer did the "skip up by 2^32 thing so reset it and keep going
timeOutTime = getAccLogTimeNow + timeoutSecs;
}
getAccLogTimePrevious = getAccLogTimeNow;
// timeout if it waits too long (this is the "real" timeout)
if( getAccLogTimeNow > timeOutTime )
{
status = STS_WAIT_ACCLOG_SAMPLES_TIMEOUT;
// go clear the protocol and wait long enough for any extraneous data to come in
stopStreamAccLog( &streamControlAccLog, 1.0 );
break;
}
}
if( status ) break;
// give the caller the pointer to the samples in malloc'ed RAM
*pSampleBlock = pAccLogSamps;
// if it falls through here, then everything was successful
break;
}
while(1);
return( status );
}
06-26-2014 01:48 AM - edited 06-26-2014 01:54 AM
I see nothing that warns anything in your code.
I was in the line of some overflow due to different data types (like using time_t to calculate time instead of doubles. time_t in a 32-bit environment is an unsigned int which means it will overflow at 2^32) but I see all your variables are doubles. Except for checking type of variables that are not defined in this callback and that you are not manipulating your globals elsewhere in your code I have no suggestions for your problem.
06-26-2014 04:26 PM
OK,
I realize this is a tough problem. It is Intermittent, doesn't happen on all computers, takes several minutes (at least) between failures.
I'm continuing to run tests and it is still showing the problem. But not on my simplified example which is more like your example. Even when it is running at the exact same time along with the problem code.
There must be something else going on in the code that is messing up the timer values.
I'll keep trying to narrow it down.
Thanks much for taking a look and trying to help out,
Kirk
06-27-2014 11:43 PM - edited 06-27-2014 11:44 PM
Instead of looping on Timer(), can you try calling SyncWait function to determine the next time to poll?