One of the common questions I receive is why a process does not wait for the right amount of time when using a standard Wait activity: sometimes it waits for the right amount of time, sometimes a little bit longer, and sometimes a lot longer (maybe up to an hour!). Today, I’ll look at how the Wait activity works and why the actual wait varies before the process continues.
The standard Wait activity (WF_STANDARD.wait) first determines whether this is the first or second time that the activity has been visited. If it’s the first time that the activity has been visited by the process, then the different activity attributes are checked and the “wake up” time is calculated (i.e. when the process should next do something). Once the time has been calculated, the procedure sets the result to deferred until that time and ends. If it’s the second time that the activity has been visited, then the procedure sets the result to complete and ends.
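The two visits can be sketched in a few lines of Python. This is only a model of the logic described above, not the actual WF_STANDARD.wait code: the `ActivityState` class, the `wait_activity` function and the single `delay` argument (standing in for the various activity attributes) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ActivityState:
    """Hypothetical per-activity state; not an Oracle Workflow structure."""
    visited: bool = False
    wake_up: Optional[datetime] = None


def wait_activity(state: ActivityState, now: datetime, delay: timedelta) -> str:
    """Model the two visits to the Wait activity."""
    if not state.visited:
        # First visit: calculate the wake up time and defer until then.
        state.visited = True
        state.wake_up = now + delay
        return "DEFERRED"
    # Second visit: nothing left to calculate, the activity just completes.
    return "COMPLETE"
```

Note that nothing in this sketch triggers the second visit. Something outside the activity has to come back and visit it again once the wake up time has passed.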
So, rather than a “Wait” activity, it is actually a “Defer Until” activity. That may not sound like much of a difference, but there is a fundamental one. If I suggested that the process will “wait”, then the implication is that as soon as the process has waited for a given period of time, it will continue. If I were telling a piece of PL/SQL code to wait for a minute then I would call DBMS_LOCK.sleep(60); and as soon as that minute was up, the code would automatically continue.
However, the standard Wait activity doesn’t actually wait and then continue; it defers processing, and the Workflow Engine moves on and processes something else instead. Rather than waiting, the thread has been deferred until a certain point in time, which corresponds to the wake up time.
Unfortunately, this means that you need to run something to process those deferred activities: the Background Engine (WF_ENGINE.Background). One or more Background Engines need to be scheduled to run at the right frequency to process your deferred activities. That is why sometimes you wait for the right period of time (the wake up time fell just before the Background Engine ran), sometimes a little longer (the wake up time passed and the process waited a little while until the next scheduled run of the Background Engine started), and sometimes a lot longer (the Background Engine isn’t scheduled at the right frequency to meet the needs of the process).
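To see where the extra time comes from, here is a small Python sketch of the timing. It is a simplified model, not Oracle code: it assumes a single Background Engine that runs at a fixed interval and picks up the deferred activity on its first run at or after the wake up time, and it ignores how long the run itself takes. The function name and parameters are hypothetical, and all times are whole seconds.

```python
def resume_time(wake_up: int, first_run: int, interval: int) -> int:
    """Return the time of the first engine run at or after wake_up."""
    if wake_up <= first_run:
        return first_run
    # Integer ceiling division: how many intervals after first_run are needed.
    runs_needed = -(-(wake_up - first_run) // interval)
    return first_run + runs_needed * interval


# Engine scheduled every 5 minutes (300 seconds), starting at time 0.
just_before = resume_time(wake_up=299, first_run=0, interval=300)  # resumes at 300
just_after = resume_time(wake_up=301, first_run=0, interval=300)   # resumes at 600

# Engine scheduled hourly: a one-minute wait resumes at the next hourly run.
hourly = resume_time(wake_up=60, first_run=0, interval=3600)       # resumes at 3600
```

In this model the overshoot is anything from zero up to one full scheduling interval, which matches the three cases above.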
Determining the number of Background Engines required, and how frequently to schedule them, is a bit of a dark art. Each enterprise needs to make sure that the Background Engines run often enough to meet the needs of its processes, but not so often that they thrash the database into submission. If, for example, you need the process to wait for 10 seconds before continuing, then you need to schedule a Background Engine to run every 9 seconds or less, so that the wait cannot get to more than 10 seconds.
Having something wait for a short period of time sounds wonderful in theory (“Oh, we’ll just have the process pause for a few seconds while XYZ happens”) but can have a significant impact on the overall performance of the system. As soon as I hear someone suggest such a thing (which is fairly often!), the question I always ask is “Why?” What is so important to the process that you need to wait for 10 seconds, but can’t wait for 5 minutes? This is particularly appropriate if you are then sending notifications which require a response: you want to pause the process for a minute, to get all the data, so you can send it in an email?
If you really (and I do mean really) need a short wait, then you need to consider what the triggering mechanism is: exactly what are you waiting for? If that’s something that can be determined by a change of state somewhere, then you are better off pausing the process via a Business Event and having your triggering condition raise the event to show that it has completed the work and the process can restart. Some developers (more than I care to remember!) have a tendency to include a short wait, then a check to see whether the job has completed, and if not, to loop back and wait again. If you change the development approach to an event-driven one, then this becomes redundant. It also means that if something happens to prevent the action taking place, your Workflow isn’t going to loop and loop and loop, adding far more audit and history to the process than you really want.
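The difference in audit and history can be illustrated with a rough count. The Python sketch below is a toy model with hypothetical names and numbers, not Oracle Workflow code: it counts two activity visits per loop for the polling design (one Wait plus one check), and a single visit for the event-driven design, where the process blocks once and is resumed by the raised event.

```python
from typing import Optional


def polling_visits(job_done_at: Optional[int], poll_interval: int, max_loops: int) -> int:
    """Count activity visits for a wait, check, loop-back design."""
    visits = 0
    elapsed = 0
    for _ in range(max_loops):
        elapsed += poll_interval
        visits += 2  # one Wait visit plus one "has it finished?" check
        if job_done_at is not None and elapsed >= job_done_at:
            break
    return visits


def event_driven_visits() -> int:
    """The process blocks once; the raised event resumes it."""
    return 1


finished_job = polling_visits(job_done_at=95, poll_interval=10, max_loops=1000)   # 20 visits
stuck_job = polling_visits(job_done_at=None, poll_interval=10, max_loops=1000)    # 2000 visits
```

When the job never finishes, the polling design keeps adding history until something stops the loop, whereas the event-driven process simply stays blocked at its single activity.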
As a final thought: I have NEVER used a standard Wait activity since I started with Workflow development. There are better ways to do it, and always have been (even before the Business Event System was invented).