01-15-2020 02:34 PM
So back to the original crash:
An application written entirely in LabVIEW (i.e. no external code such as third-party DLLs of varying quality) should never crash. If it does, something is wrong and NI should look into it.
@msoflaherty wrote:
I added the little bit of code attached, but now my program intermittently crashes about once a day. Disable it, and the problem goes away.
So how regular is that? Does it depend on the amount of user interaction or uptime?
If you look at the task manager, is there a constant increase in memory use?
Does the program misbehave in any way before the crash happens?
Is there anything in the Windows event log?
Is there any way you could reduce the problem to a very simple example that still causes a crash?
That said, I still have that feeling that you are doing this in a weird way and there are better ways to achieve whatever you are trying to do.
01-15-2020 02:59 PM
Sorry for the sidetracking. Often on the boards people are trying to find patches to the problem rather than fixing the root problem.
@altenbach wrote:
That said, I still have that feeling that you are doing this in a weird way and there are better ways to achieve whatever you are trying to do.
I agree with this.
01-15-2020 03:34 PM
An authorisation error stopped my message from posting 😠
The system said it had auto-saved the post... which it hadn't.
I lost a pretty long comment explaining all the details. Pissed. Off.
Why should I expect the program to work if the website doesn't? Argh!
01-15-2020 03:45 PM
So excuse this message for being terse, but let's try all that again.
LabVIEW crashed again. It ran for about 23 hours before dying. As with all previous crashes, there's nothing in the application logs or Windows logs to suggest it's about to go down, as far as I can tell. All standard application logging was going along as expected until the timestamp of the crash. No user was touching the system - UI interaction is pretty infrequent.
I checked the resource usage a few hours ago and it was about 30% CPU and 30% RAM. I'd need to monitor it more regularly over time to see whether that's going up, but if it is, it's going up very slowly - there was no discernible increase while I watched for about 10 minutes.
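A ten-minute watch is unlikely to reveal a slow leak, so one option is to log readings over many hours and fit a trend line. The following is a minimal Python sketch of that idea, not part of the application: `slope_per_hour` is a hypothetical helper, and the sample readings are invented purely for illustration. How the readings are collected (Task Manager, Performance Monitor, a script) is left open.

```python
def slope_per_hour(samples):
    """Least-squares slope of (elapsed_seconds, value) pairs, in units per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    # Covariance of time and value, divided by the variance of time.
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return 3600.0 * num / den


# Hypothetical readings (seconds since start, RAM %), for illustration only.
example = [(0, 30.0), (1800, 30.5), (3600, 31.0)]
print(f"{slope_per_hour(example):+.2f} % per hour")
```

A slope close to zero over a full crash interval would argue against a simple memory leak; a steady positive slope would be worth chasing.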
There are three Windows events in the Event Viewer following each crash. They are:
The previous system shutdown at 1:10:36 PM on 1/15/2020 was unexpected.
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="EventLog" />
<EventID Qualifiers="32768">6008</EventID>
<Level>2</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2020-01-15T19:12:21.000000000Z" />
<EventRecordID>55885</EventRecordID>
<Channel>System</Channel>
<Computer>annieHV</Computer>
<Security />
</System>
<EventData>
<Data>1:10:36 PM</Data>
<Data>1/15/2020</Data>
<Data />
<Data />
<Data>189901</Data>
<Data />
<Data />
<Binary>E407010003000F000D000A0024005002E407010003000F0013000A0024005002600900003C000000010000006009000000000000B00400000100000000000000</Binary>
</EventData>
</Event>
The computer has rebooted from a bugcheck. The bugcheck was: 0x000000d1 (0x00000000000000b8, 0x0000000000000002, 0x0000000000000000, 0xfffff88005331161). A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 011520-6084-01.
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-WER-SystemErrorReporting" Guid="{ABCE23E7-DE45-4366-8631-84FA6C525952}" EventSourceName="BugCheck" />
<EventID Qualifiers="16384">1001</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2020-01-15T19:12:22.000000000Z" />
<EventRecordID>55889</EventRecordID>
<Correlation />
<Execution ProcessID="0" ThreadID="0" />
<Channel>System</Channel>
<Computer>ANNIEHV</Computer>
<Security />
</System>
<EventData>
<Data Name="param1">0x000000d1 (0x00000000000000b8, 0x0000000000000002, 0x0000000000000000, 0xfffff88005331161)</Data>
<Data Name="param2">C:\Windows\MEMORY.DMP</Data>
<Data Name="param3">011520-6084-01</Data>
</EventData>
</Event>
The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-Kernel-Power" Guid="{331C3B3A-2005-44C2-AC5E-77220C37D6B4}" />
<EventID>41</EventID>
<Version>2</Version>
<Level>1</Level>
<Task>63</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000002</Keywords>
<TimeCreated SystemTime="2020-01-15T19:12:17.607600000Z" />
<EventRecordID>55890</EventRecordID>
<Correlation />
<Execution ProcessID="4" ThreadID="8" />
<Channel>System</Channel>
<Computer>annieHV</Computer>
<Security UserID="S-1-5-18" />
</System>
<EventData>
<Data Name="BugcheckCode">209</Data>
<Data Name="BugcheckParameter1">0xb8</Data>
<Data Name="BugcheckParameter2">0x2</Data>
<Data Name="BugcheckParameter3">0x0</Data>
<Data Name="BugcheckParameter4">0xfffff88005331161</Data>
<Data Name="SleepInProgress">false</Data>
<Data Name="PowerButtonTimestamp">0</Data>
</EventData>
</Event>
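For anyone who wants to compare these events across several crashes, the bugcheck values can be pulled out of the exported XML programmatically. This is a minimal Python sketch using only the standard library; `read_bugcheck` is a hypothetical helper name, and the embedded XML is a trimmed copy of the Kernel-Power event 41 above.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <Event> element in the Event Viewer export.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the Kernel-Power event 41 posted above.
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<EventID>41</EventID>
</System>
<EventData>
<Data Name="BugcheckCode">209</Data>
<Data Name="BugcheckParameter1">0xb8</Data>
<Data Name="BugcheckParameter2">0x2</Data>
<Data Name="BugcheckParameter3">0x0</Data>
<Data Name="BugcheckParameter4">0xfffff88005331161</Data>
</EventData>
</Event>"""


def read_bugcheck(event_xml):
    """Return (bugcheck_code, [param1..param4]) as integers."""
    root = ET.fromstring(event_xml)
    data = {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}
    code = int(data["BugcheckCode"])  # stored as decimal in event 41
    params = [int(data[f"BugcheckParameter{i}"], 16) for i in range(1, 5)]
    return code, params


code, params = read_bugcheck(EVENT_XML)
print(f"bugcheck 0x{code:08X}", [hex(p) for p in params])
```

The decimal 209 in event 41 is the same value as the 0x000000d1 reported in event 1001, so both events describe the same bugcheck.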
The application is too extensive to post here - there are many VIs and it does a whole lot.
It uses just one third-party library that makes external DLL calls - the ZMQ for LabVIEW library. It has been stable for the last 2-3 years, though, so I see no reason to suspect it.
As a minimal reproducer, I suppose I would try the snippet from the first post and just that, or that plus another event structure handling other buttons (that's the top-level structure of the application). Unfortunately, though, this is an actively used system, and I can't really run it 24-28 hours doing nothing just to see if it crashes. Having it go down several times over the last week is bad enough, and at worst I know how to "fix" that - get rid of the new bit of code.
If there are better ways to achieve this, I'm open to any options that don't require a substantial amount of work. I'd like to remove that restriction, but needs must.
01-15-2020 03:50 PM - edited 01-15-2020 03:58 PM
@johntrich1971 wrote:
Sorry for the sidetracking. Often on the boards people are trying to find patches to the problem rather than fixing the root problem.
@altenbach wrote:
That said, I still have that feeling that you are doing this in a weird way and there are better ways to achieve whatever you are trying to do.
I agree with this.
Setting aside the details of the context, it doesn't seem to me such an odd thing to want to execute an action after 10 minutes of UI inactivity.
And there is a 'Wait on FP Activity' block, which does exactly that. Surely, then, this is not such a "weird way" of doing it? The snippet I posted is very simple, and it does that job well... until it crashes the OS.
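To pin down the behaviour being discussed, independent of how it is wired in LabVIEW, here is a minimal Python sketch of an inactivity timer. It is not the snippet from the first post and not LabVIEW code; `InactivityTimer`, `activity` and `poll` are hypothetical names, and the fake clock exists only so the example runs instantly.

```python
import time


class InactivityTimer:
    """Reports once when no activity has been seen for timeout_s seconds."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_activity = clock()
        self._fired = False

    def activity(self):
        """Call on any UI event; restarts the idle period."""
        self._last_activity = self._clock()
        self._fired = False

    def poll(self):
        """Return True exactly once per idle period, when the timeout is reached."""
        if self._fired:
            return False
        if self._clock() - self._last_activity >= self._timeout_s:
            self._fired = True
            return True
        return False


# Demonstration with a fake clock so nothing actually waits 10 minutes.
now = [0.0]
timer = InactivityTimer(600, clock=lambda: now[0])
now[0] = 599.0
early = timer.poll()   # still inside the 10-minute window
now[0] = 600.0
fired = timer.poll()   # timeout reached, action should run now
again = timer.poll()   # does not fire a second time until new activity
print(early, fired, again)
```

Nothing in this logic touches hardware or drivers, which is part of why an OS-level crash from such a small addition is surprising.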
01-15-2020 04:41 PM
Bug Check 0xD1: DRIVER_IRQL_NOT_LESS_OR_EQUAL. It would seem that a low-level driver has coughed up a hairball. Maybe the whole "Wait For Front Panel Activity" thing is a red herring. Maybe what's crashing the computer is whatever you do once it has been inactive.
I would turn off the auto-reboot on blue screen thing so you can catch the blue screen.
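The four values logged with the bugcheck can be read against Microsoft's description of 0xD1: parameter 1 is the memory address referenced, parameter 2 the IRQL at the time, parameter 3 the access type, and parameter 4 the address of the code that made the reference. A small Python sketch applying that to the logged values follows; `describe_d1` is a hypothetical helper, and only the read/write access codes are mapped here.

```python
def describe_d1(params):
    """Label the four parameters of a 0xD1 bugcheck."""
    address, irql, access, instruction = params
    return {
        "memory_referenced": hex(address),
        "irql": irql,
        # IRQL 2 is DISPATCH_LEVEL on Windows.
        "irql_name": "DISPATCH_LEVEL" if irql == 2 else "other",
        # Only 0 (read) and 1 (write) are mapped in this sketch.
        "access": {0: "read", 1: "write"}.get(access, "other"),
        "faulting_instruction": hex(instruction),
    }


# Values taken from the event log posted earlier in the thread.
logged = [0xB8, 0x2, 0x0, 0xFFFFF88005331161]
for name, value in describe_d1(logged).items():
    print(f"{name}: {value}")
```

Read this way, the log describes a read of address 0xb8 at DISPATCH_LEVEL by code at 0xfffff88005331161. Which driver owns that code address is not visible in the event log; opening C:\Windows\MEMORY.DMP in a kernel debugger should be able to name it.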
01-15-2020 04:41 PM
"The first step in solving a problem is clearly defining what the problem is."
I did not see anything in that dump that pointed at LabVIEW being the cause.
If you suspect the offending code is the user "Wait on ..." then toss it temporarily and wait 2-3 times the crash interval. If it stops crashing, we know what the "problem is". If it still crashes, then we have to look elsewhere.
The next step would be to run just that "special" code on another machine and wait for it to crash. If it does not crash, then the problem may be an interaction issue ... which takes us in yet another direction.
I am still leaning toward an issue with power since that is the only thing that jumps out of the log you posted.
I agree with you that the web-site provided by "Khoros" sucks.
If Khoros wrote LV, I would be coding in C++.
Ben
01-15-2020 04:49 PM
@billko wrote:
Bug Check 0xD1: DRIVER_IRQL_NOT_LESS_OR_EQUAL. It would seem that a low-level driver has coughed up a hairball. Maybe the whole "Wait For Front Panel Activity" thing is a red herring. Maybe what's crashing the computer is whatever you do once it has been inactive.
I would turn off the auto-reboot on blue screen thing so you can catch the blue screen.
That prompts the question "what else has changed?" since it last worked. Did the LV version or OS version change?
Still smells like a hardware-related issue and not a LV issue.
Ben
01-15-2020 04:51 PM
@Ben wrote:
@billko wrote:
Bug Check 0xD1: DRIVER_IRQL_NOT_LESS_OR_EQUAL. It would seem that a low-level driver has coughed up a hairball. Maybe the whole "Wait For Front Panel Activity" thing is a red herring. Maybe what's crashing the computer is whatever you do once it has been inactive.
I would turn off the auto-reboot on blue screen thing so you can catch the blue screen.
That prompts the question "what else has changed?" since it last worked. Did the LV version or OS version change?
Still smells like a hardware-related issue and not a LV issue.
Ben
I'm with you on the hardware thing.
01-15-2020 05:23 PM
After more thought about the link Bill provided...
Cut-n-paste from that link:
"The DRIVER_IRQL_NOT_LESS_OR_EQUAL bug check has a value of 0x000000D1. This indicates that a kernel-mode driver attempted to access pageable memory at a process IRQL that was too high."
Interrupts are initiated by hardware. When one occurs, the current instruction stream is suspended and an interrupt vector is used to determine where to point the program counter to handle that interrupt. This operation is performed in kernel mode, where the OS does the work of making the hardware work (reacting to mouse moves, discovering whether a disk I/O has completed, handling a page fault, etc.).
At an elevated IRQL the page fault handler cannot run, so paged-out memory is not available: the interrupt service routine and anything it touches MUST already be resident in memory.
So it seems that hardware has to be involved somewhere and somehow.
Which prompts me to ask more about the details of this old hardware and what it has to do with a VT 100?
Do you actually have a VT 100 wired to the PC?
Is it the weird hardware that emulates a VT 100?
Curious but not sure what to suggest next.
Ben