"Acceptable" or extremely "low risk" situations for the use of synchronous messaging?

Taggart · ‎11-17-2024

Stephen is very smart and has thought about this a lot. He is the AF guru and this is the AF channel, so I would probably listen to him.

Someone needs to be contrarian though, so I'll pick up the mantle.

I think asynchronous is way oversold. I watch people do backflips to synchronize things in actor-oriented systems and I always come back to: why did you make it asynchronous in the first place? I've seen DQMH modules where every message was a request-and-wait-for-reply. I've seen plenty of small/midsize projects where they have a whole bunch of actors and again jump through hoops to synchronize things and I'm like "You could do all that in one loop." Why would you add all that complexity? Usually it's because they read somewhere (maybe here) that advanced programmers do asynchronous programming and they think they need to in order to be advanced.

Sync vs Async, neither is right or wrong. They are both tools. Use them appropriately. Yes, AF/DQMH are great hammers for Async, but that doesn't mean everything is an Async nail.

Unfortunately AF is contagious and makes you want to turn everything into Actors. I'm not sure that's a great design principle. I'm a firm believer that not everything needs to be an Actor and that not everything should be.

Sam Taggart
CLA, CPI, CTD, LabVIEW Champion
DQMH Trusted Advisor
Read about my thoughts on Software Development at sasworkshops.com/blog

AristosQueue · ‎11-17-2024

I'll be honest: I've extremely rarely had to jump through any hoops to handle asynchronicity in messaging applications, AF or otherwise. At some point, could someone post some actual code they've got where they had to do something exotic to maintain async messaging so we can talk about a concrete scenario? Hardware is generally required to be async because there's a network link in between two devices. UI is inherently async in event handling. Data processing is so much more laminar as dataflow, so handing off between processors is a lot more natural than multiple processors trying to manipulate the same object at the same time.

Overall, the only barrier I ever faced in getting things to be generally async was having a language that made expressing the async message easy. With AF, I finally had a tool that worked well for that, at least for me. Once the syntax was as easy as "Send this message", I've never had a parallel application that wasn't *far easier* with async.

Your mileage may vary. I'd like to know where those potholes are so I adjust the teaching I do to account for them.

drjdpowell · ‎11-18-2024

@AristosQueue wrote:

Hardware is generally required to be async because there's a network link in between two devices. UI is inherently async in event handling. Data processing is so much more laminar as dataflow, so handing off between processors is a lot more natural than multiple processors trying to manipulate the same object at the same time.

There is a concept of "System State" that arises with Actors: the collective state of multiple actors working together. The stuff you're describing here is steady-state message flow, with unchanging System State. With Messenger Library, I would implement all this with asynchronous messaging, just as you would with AF. Actor "A" takes data and passes to "B", who analyses and passes to "C", who saves to disk. All Async.

Synchronous comes in when handling changes to System State. Here I think all-async can be difficult to pull off easily without getting into problems.

For example, suppose "Main", the Caller of A, B and C above, needs to change the parameters of A and B while changing the file C saves to, and must ensure all data is analysed and saved to the correct file. This is a "System State" change.

I would:

Tell A to stop collecting and wait till it replies (Sync): tells me all data is in B's Queue
Ping B and wait (Sync): this tells me that all data has been analysed and in C's Queue
Tell C to change its file (this could be Async, but I would wait to see if there was an error opening the new file)
Tell A and B their new parameters (Async)
Tell A to start collecting (Async)

Now, one CAN do this in an all-async fashion, but I think it is more difficult and it is easy to introduce bugs (at least from those not very experienced).
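
Since LabVIEW block diagrams have no text form, the sequence above can be sketched in Python. This is not Messenger Library or AF code: `Actor`, `send`, `ask`, the three actor classes, and the `offset`/`gain` parameters are all hypothetical stand-ins, and each actor is assumed to be one thread draining one FIFO inbox.

```python
import queue
import threading


class Actor:
    """Hypothetical minimal actor: one daemon thread draining one FIFO inbox."""

    def __init__(self):
        self.inbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, name, payload=None):
        """Asynchronous: enqueue and return immediately."""
        self.inbox.put((name, payload, None))

    def ask(self, name, payload=None, timeout=2.0):
        """Synchronous: enqueue, then block until the actor has handled it."""
        reply = queue.Queue(maxsize=1)
        self.inbox.put((name, payload, reply))
        return reply.get(timeout=timeout)

    def _run(self):
        while True:
            name, payload, reply = self.inbox.get()
            result = self.handle(name, payload)
            if reply is not None:
                reply.put(result)


class Collector(Actor):  # "A": takes data and passes it to B
    def __init__(self, downstream):
        self.downstream, self.offset, self.collecting = downstream, 0, True
        super().__init__()

    def handle(self, name, payload):
        if name == "sample" and self.collecting:
            self.downstream.send("analyse", payload + self.offset)
        elif name in ("stop", "start"):
            self.collecting = name == "start"
        elif name == "params":
            self.offset = payload
        return "ok"  # "ping" needs no action, only the reply


class Analyser(Actor):  # "B": analyses and passes to C
    def __init__(self, downstream):
        self.downstream, self.gain = downstream, 2
        super().__init__()

    def handle(self, name, payload):
        if name == "analyse":
            self.downstream.send("save", payload * self.gain)
        elif name == "params":
            self.gain = payload
        return "ok"


class Saver(Actor):  # "C": saves to a "file" (a list in a dict here)
    def __init__(self):
        self.files, self.current = {"run1": []}, "run1"
        super().__init__()

    def handle(self, name, payload):
        if name == "save":
            self.files[self.current].append(payload)
        elif name == "set_file":
            self.current = payload
            self.files.setdefault(payload, [])
        return "ok"


def change_system_state(a, b, c, new_file, offset, gain):
    a.ask("stop")                # sync: all collected data is now in B's queue
    b.ask("ping")                # sync: all analysed data is now in C's queue
    c.ask("set_file", new_file)  # sync here so a file error would surface to Main
    a.send("params", offset)     # async
    b.send("params", gain)       # async
    a.send("start")              # async


c = Saver()
b = Analyser(c)
a = Collector(b)
for x in (1, 2, 3):
    a.send("sample", x)
change_system_state(a, b, c, "run2", offset=10, gain=3)
for x in (1, 2):
    a.send("sample", x)
for actor in (a, b, c):  # flush the pipeline so the demo can inspect results
    actor.ask("ping")
print(c.files)  # {'run1': [2, 4, 6], 'run2': [33, 36]}
```

Because `change_system_state` blocks its caller through the first three steps, a second change request in this sketch cannot begin until the first has drained the pipeline.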

drjdpowell · ‎11-18-2024

FYI: I went into System State in this talk of many years ago:

https://youtu.be/pZ8w1AhDApE?si=1rlJEwklYbu8UGl3&t=762

Here I was showing how the NI Template example could be made simpler with Synchronous messaging to shut down the system. And that was a very simple example that just needed to stop two actor-like things in parallel (I didn't mention it in the talk, but the original design would be vulnerable to race conditions if the example was ever further developed).

AristosQueue · ‎11-18-2024

Still not discussing actual code, but I'll take a shot at explaining why this doesn't strike me as a synchronous messaging situation.

> Actor "A" takes data and passes to "B", who analyses and passes to "C", who saves to disk.

And there's the bug from AF perspective. There is no "A passes to B passes to C where Main can message all three directly". That's arbitrary graph topology of actors, not a tree. If you abandon the tree structure, yes, I agree, the state of the whole system is nearly impossible to define.

The path would be either

"A passes to Main who passes to B who passes to Main who passes to C"

or

"Main passes down to A who passes down to B who passes down to C who passes up to B who passes up to A who passes up to Main".

In both cases in AF, Main tells A "parameter has changed" which instigates the next step of the state transition. Either Main is brokering the whole thing (case 1) or Main doesn't even know that B and C exist (case 2). The state transition of the system is well defined for each of the nested actors by its caller.

In both cases, there's a strong chance that instead of changing the parameter, I just kill A, B, and C and spin up new A2, B2, and C2 actors. Let A, B, and C come to a natural stop async.

AristosQueue · ‎11-18-2024

@drjdpowell wrote:

FYI: I went into System State in this talk of many years ago:

https://youtu.be/pZ8w1AhDApE?si=1rlJEwklYbu8UGl3&t=762

Here I was showing how the NI Template example could be made simpler with Synchronous messaging to shut down the system. And that was a very simple example that just needed to stop two actor-like things in parallel (I didn't mention it in the talk, but the original design would be vulnerable to race conditions if the example was ever further developed).

This example also violates the tree. It's not an AF example. I agree -- the state explosion into six states indicates a hard-to-implement/hard-to-maintain design. But a simple example like this in AF does exist in the project template, and its stop behavior is trivial, both for normal stop and emergency stop.

drjdpowell · ‎11-18-2024

AQ, a question: how would your two tree designs handle receiving two parameter-changing messages in short succession? So, "change to A3,B3" being received while "change to A2,B2" is still in progress? Would there be potential for a bug? It is difficult for me to tell. I think your A calls B calls C design should be OK, but I'm unsure about the other design. My design obviously works, as the second change isn't handled till the first is finished.

BertMcMahan · ‎11-18-2024

I'll throw out an actual example in hopes that someone can either validate me or educate me on a better way to do it.

I have a test executive that runs a variety of user-selectable tests. Each test is an Actor that runs the test start to finish. When the user selects a test to run, it's loaded into a subpanel. The user can adjust parameters, run the test, view the data, etc. All of this is contained within the test itself, since each test has its own particular parameters to tweak and plots to display.

Each of these actors is kept alive so the user can view the results interactively. They also accept "Lock" messages that prevent the user from changing the data accidentally. These tests are shown in a list that the user can browse to view the test history of the DUT.

When the user wants to save this to a file, they click "Save" and select a path to a new TDMS file location. The test executive sends this path to each Test actor, which exports itself to that TDMS file. Each Test gets its own subgroup and can handle its own data.

Here's the synchronous part: when saving to this file, the test system should be locked so that the user can't do anything until the Save is complete. This means the user can't close the program, or run a new test, or delete something from the list, etc. So, I implemented this functionality as a synchronous message that's intentionally blocking until all of the test actors have completed their Save operation.

This behavior is inherently synchronous, so I used synchronous messages. I considered using async messages to handle this. The Main actor could maintain a "Saving" or "Not saving" state. When you trigger a Save, it would lock out all buttons and make a list of all of the test actors. When each one returned, it would check them off the list, and would allow button presses once again. That would work, but it adds yet another state to the "main" executive, and blocks all incoming user interactions anyway. With sync messages, I could call a modal popup, wait for all of the tests to return "Yep, I'm done saving", then move along. That seemed like a much cleaner solution, and it's been working so far and hasn't caused me any headaches.
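
The two options weighed above can be put side by side in a rough Python sketch. Everything here is a hypothetical stand-in rather than AF code: `ResultActor` plays a Test actor, a plain dict plays the TDMS file, and `SaveChecklist` is the extra "Saving"/"Not saving" state that the async design would have Main carry.

```python
import queue
import threading


class ResultActor:
    """Hypothetical test actor: one thread, one FIFO inbox, exports on request."""

    def __init__(self, name, results):
        self.name, self.results = name, results
        self.inbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            store, reply = self.inbox.get()
            store[self.name] = list(self.results)  # stands in for writing a TDMS group
            reply.put(self.name)                   # "Yep, I'm done saving"

    def request_save(self, store):
        """Enqueue a save request; return the queue the reply will arrive on."""
        reply = queue.Queue(maxsize=1)
        self.inbox.put((store, reply))
        return reply


def save_all_blocking(tests, store, timeout=5.0):
    """Sync version: fan the request out, then block until every test replies."""
    replies = [t.request_save(store) for t in tests]  # saves proceed in parallel
    return [r.get(timeout=timeout) for r in replies]  # the caller is blocked here


class SaveChecklist:
    """Async version: the state Main would have to keep between messages."""

    def __init__(self):
        self.pending = set()

    def begin(self, names):
        self.pending = set(names)   # UI stays locked while this is non-empty

    def on_saved(self, name):
        self.pending.discard(name)  # called from Main's "save done" handler

    @property
    def locked(self):
        return bool(self.pending)


tests = [ResultActor("leak", [0.1, 0.2]), ResultActor("burst", [9.5])]
store = {}
done = save_all_blocking(tests, store)
print(done)  # ['leak', 'burst']

checklist = SaveChecklist()
checklist.begin(["leak", "burst"])
checklist.on_saved("leak")
print(checklist.locked)  # True
checklist.on_saved("burst")
print(checklist.locked)  # False
```

The blocking version keeps the whole "wait for everyone" rule in one function; the checklist version spreads it across a start handler and a reply handler, which is the extra state described above.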

Loading these back into the viewer from a TDMS file works synchronously as well, but that synchronization happens by making sure all tests are loaded successfully in Pre-launch Init. The "Loader" gets a list of all tests to load in a given TDMS file and launches them serially.

As I understand it, PLI is intentionally non-reentrant and is blocking so the caller can know the child launched correctly. This seems like one of the only parts of AF that's synchronous between a parent and a child. I feel like I could argue that "launch" could be viewed as a type of "message" that needs to be handled synchronously, which the AF does. (Otherwise, Launch Actor would be always non-blocking and would require the parent to handle the Last Ack message to detect when a child failed to start.)

DoctorAutomatic · ‎11-18-2024

@AristosQueue wrote:

I'll be honest: I've extremely rarely had to jump through any hoops to handle asynchronicity in messaging applications, AF or otherwise. At some point, could someone post some actual code they've got where they had to do something exotic to maintain async messaging so we can talk about a concrete scenario? Hardware is generally required to be async because there's a network link in between two devices. UI is inherently async in event handling. Data processing is so much more laminar as dataflow, so handing off between processors is a lot more natural than multiple processors trying to manipulate the same object at the same time.

Overall, the only barrier I ever faced in getting things to be generally async was having a language that made expressing the async message easy. With AF, I finally had a tool that worked well for that, at least for me. Once the syntax was as easy as "Send this message", I've never had a parallel application that wasn't *far easier* with async.

Your mileage may vary. I'd like to know where those potholes are so I adjust the teaching I do to account for them.

AQ, while I'm sorry to disappoint with not having code to share, I think a very common circumstance people encounter, especially just starting out with AF, has to do with configuring a hardware interface actor and then "starting" the hardware interaction.

Let's say you have actor "Main" and actor "HW Controller" (HWC). Main launches HWC. Main must use a combination of config file parameters and user inputs to configure HWC before HWC actually runs the hardware. Main sends these config parameters on to HWC as they become available to Main. When is it safe to "start" HWC's hardware interaction? Assume HWC can never know whether the last config message it received was the final one, and so cannot know for itself whether it is fully configured to execute its "run". Main can know whether it's done sending configuration messages to HWC, but because there's no guarantee of message order, Main can't simply send the last config message immediately followed by a "run" message (well, it can, but "technically" there's no guarantee the "run" message will arrive after the last config message).

I realize this is totally pedantic, because messages do in fact generally arrive in the order in which they're sent, especially in small to mid-sized systems. You could implement some sort of config echo message that confirms back to Main that the last config message was received before Main sends the start message. Or, if the scenario allows (and it certainly does NOT always allow), you simply configure everything on the static actor object before launch (using, ahem, synchronous accessor methods 😉) and the actor jumps right into "running" at the start of Actor Core. It's just that to most people, something like stringing together synchronous config VIs that return right there on your calling BD gives you an instant guarantee of order and completeness.

Well, I feel like that illustrates at least one scenario where the asynchronous way feels more complex/convoluted/difficult than if one were using a bunch of synchronous subVIs instead of an actor. I think part of it is that it's difficult to receive an asynchronous response to something. It is received out of context of the exact request. You have to plan for how to interpret the response and possibly "remember" why you wanted it in the first place (though making these sorts of arrangements can be a significant code smell in many instances, IMO).
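
One way to picture the "config echo" idea is the Python sketch below. It is only an illustration under stated assumptions: `HardwareController`, the message names, and the `"cfg-1"` token are invented for the example, and the inbox is a `queue.Queue`, which is FIFO, so this sketch says nothing about what ordering AF itself guarantees. Main is written as a plain function that waits on its own inbox; in an actor, each wait would be a separate message handler.

```python
import queue
import threading


class HardwareController:
    """Hypothetical 'HWC' actor: stores config, echoes it on request, then runs."""

    def __init__(self, to_caller):
        self.inbox = queue.Queue()
        self.to_caller = to_caller  # the caller's inbox, for replies
        self.config = {}
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, name, payload=None):
        self.inbox.put((name, payload))

    def _run(self):
        while True:
            name, payload = self.inbox.get()
            if name == "config":
                key, value = payload
                self.config[key] = value
            elif name == "echo":
                # Report what has actually been received so far.
                self.to_caller.put(("config_echo", payload, dict(self.config)))
            elif name == "run":
                self.to_caller.put(("running", None, dict(self.config)))


def configure_and_run(required, timeout=2.0):
    main_inbox = queue.Queue()
    hwc = HardwareController(main_inbox)

    for key, value in required.items():
        hwc.send("config", (key, value))  # async, sent as values become available
    hwc.send("echo", "cfg-1")             # token ties the reply to this request

    while True:                           # skip any reply that is not our echo
        name, token, snapshot = main_inbox.get(timeout=timeout)
        if name == "config_echo" and token == "cfg-1":
            break
    if snapshot != required:
        raise RuntimeError("HWC is not fully configured")

    hwc.send("run")                       # only now is it safe to start
    _, _, ran_with = main_inbox.get(timeout=timeout)
    return ran_with


ran_with = configure_and_run({"port": "COM3", "rate_hz": 1000})
print(ran_with)  # {'port': 'COM3', 'rate_hz': 1000}
```

The token is the "remember why you wanted it" bookkeeping mentioned above: without it, Main could not tell this echo from an older one.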

I agree async is always preferred, but it's true that devs find it harder to reason about.

D_Hooks · ‎11-18-2024

The example that comes to mind for me is actors that coordinate activity between other actors, but are not their direct caller. (Similar apologies for no demo code AQ).

So you might have a system with a XYZ Stage actor and a Camera actor. In my mind those are pretty clearly asynchronous devices that you might want to use several ways within an application. They run in parallel, they do things independently. Great.

Where I've seen developers get hung up "jumping through hoops" is when you want to introduce something like an Autofocus actor into the system. This is just one possible use of the XYZ Stage and Camera, and I'm going to say all three of these actors are launched by a common Main caller. Autofocus doesn't own the XYZ Stage or Camera, but it uses them.

When an autofocus operation starts, the Autofocus actor will send a message intended for the XYZ Stage actor to move to a new z-axis location. Once that move is complete it will send a message intended for the Camera actor to acquire an image. When the image is returned it will need to perform some processing and determine next steps, repeating until the autofocus operation has completed with some result.

The way I've often seen it coded up is the Autofocus actor sends a message intended for XYZ Stage to move, then sets a state enum in its private data to "Waiting on XYZ Stage Update". When a stage update message arrives at the Autofocus actor it might need to check the contents to see if the update was sent after the move was requested, if the stage is stopped, if the z-axis position matches the requested position (maybe within some limits?), if the x-axis and y-axis positions have remained unchanged, etc. Since Autofocus isn't the parent of XYZ Stage it can't know for certain that no one else has asked the XYZ Stage to do something else, either before or while it was trying to use it.

If all that checks out then the Autofocus actor sends a message intended for Camera, and sets the state enum to "Waiting on Camera Image". When a camera image message arrives at the Autofocus actor another round of verification needs to happen to try and figure out if the image received was the one it wanted, etc. If the image checks out then some processing is done, and additional messages are sent out for the remaining steps in the process.

The more hardware resources you want to use in a concerted manner, the more complicated all of the state transitions and validation checks need to be. Plus the Autofocus actor might need to check on some sort of timeout, where if no acceptable update is received within a given time it needs to bail out of the autofocus operation and inform its caller that the autofocus routine failed.
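
A stripped-down Python sketch of the state-enum pattern described above may make the bookkeeping concrete. All names are hypothetical; the `request_id` tag is one assumed way of checking that an update was sent after the move was requested, and the timeout handling mentioned above is left out to keep it short. Handlers are called directly here instead of being driven by real Stage and Camera actors.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    WAITING_ON_STAGE = auto()
    WAITING_ON_CAMERA = auto()
    DONE = auto()


class AutofocusActor:
    """Hypothetical Autofocus actor; each on_* method is one message handler."""

    def __init__(self, send_to_stage, send_to_camera, z_positions, tolerance=0.01):
        self.send_to_stage = send_to_stage
        self.send_to_camera = send_to_camera
        self.remaining = list(z_positions)
        self.tolerance = tolerance
        self.state = State.IDLE
        self.request_id = 0
        self.target_z = None
        self.scores = {}
        self.best_z = None

    def start(self):
        self._request_next_move()

    def _request_next_move(self):
        if not self.remaining:
            self.best_z = max(self.scores, key=self.scores.get)
            self.state = State.DONE
            return
        self.target_z = self.remaining.pop(0)
        self.request_id += 1
        self.state = State.WAITING_ON_STAGE
        self.send_to_stage(("move_z", self.target_z, self.request_id))

    def on_stage_update(self, z, stopped, request_id):
        # Validation: is this the update we are waiting for?
        if self.state is not State.WAITING_ON_STAGE:
            return
        if request_id != self.request_id or not stopped:
            return
        if abs(z - self.target_z) > self.tolerance:
            return
        self.state = State.WAITING_ON_CAMERA
        self.send_to_camera(("snap", self.request_id))

    def on_image(self, sharpness, request_id):
        if self.state is not State.WAITING_ON_CAMERA or request_id != self.request_id:
            return
        self.scores[self.target_z] = sharpness
        self._request_next_move()


# Drive the handlers by hand; the lists stand in for the other actors' inboxes.
stage_out, camera_out = [], []
af = AutofocusActor(stage_out.append, camera_out.append, [1.0, 2.0, 3.0])
af.start()
af.on_stage_update(z=0.0, stopped=True, request_id=0)  # stale update: ignored
for sharpness in (0.2, 1.0, 0.5):
    _, z, rid = stage_out[-1]
    af.on_stage_update(z=z, stopped=True, request_id=rid)
    af.on_image(sharpness, camera_out[-1][1])
print(af.state, af.best_z)  # State.DONE 2.0
```

Even in this reduced form, the routine's logic is split across three methods and a state field, which is the readability cost being described.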

If a new developer comes onboard and needs to understand the autofocus routine, the logic and state transitions are often spread out through numerous message payload methods and other subVIs, and it can be tough to sort it all out.

I use AF all the time and love it, but I definitely understand how developers can look at something like my theoretical Autofocus actor and think it's all too much of a headache. They want to see a while loop with blocking control of the XYZ Stage and Camera, where you can just call Move Stage, Snap Image, and Process, and continue looping until you decide that focus is complete. That doesn't allow for all of the benefits of parallelism that you can get by controlling your devices with actors, but I have to admit that it's vastly more readable code and easier to understand.
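
For contrast, the blocking while-loop version might look like the Python sketch below. `FakeStage`, `FakeCamera`, and the step-halving search are assumptions made for illustration: the camera returns a sharpness score instead of an image, and the search strategy is just one simple choice, not a recommended autofocus algorithm.

```python
class FakeStage:
    """Hypothetical stage whose move call blocks until the move is complete."""

    def __init__(self):
        self.z = 0.0

    def move_z(self, z):
        self.z = z


class FakeCamera:
    """Hypothetical camera: snap() returns a sharpness score for the current z."""

    def __init__(self, stage, focal_plane):
        self.stage, self.focal_plane = stage, focal_plane

    def snap(self):
        # Score peaks at 1.0 when the stage sits on the focal plane.
        return 1.0 / (1.0 + (self.stage.z - self.focal_plane) ** 2)


def autofocus(stage, camera, start, step, min_step=0.01, max_moves=100):
    """Move Stage, Snap Image, Process, repeat: all in one readable loop."""
    stage.move_z(start)
    best_z, best_score = start, camera.snap()
    moves = 0
    while abs(step) >= min_step:
        if moves >= max_moves:
            raise TimeoutError("autofocus did not converge")
        candidate = best_z + step
        stage.move_z(candidate)    # Move Stage
        moves += 1
        score = camera.snap()      # Snap Image
        if score > best_score:     # Process
            best_z, best_score = candidate, score
        else:
            step = -step / 2       # overshot: reverse and refine
    stage.move_z(best_z)
    return best_z


stage = FakeStage()
camera = FakeCamera(stage, focal_plane=3.0)
focus_z = autofocus(stage, camera, start=0.0, step=1.0)
print(focus_z)  # 3.0
```

The whole routine reads top to bottom in one function, at the cost that the calling thread owns the stage and camera for the duration.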

I'm personally not always certain how to implement and document the asynchronous logic to make it easy for people to understand and work with. Maybe others have some good suggestions?
