What’s the best way to archive Workflow data?

Some customers are required, for audit purposes, to keep the Workflow runtime data for quite some time – I worked with a bank in the UK which wanted to keep everything in the Workflow tables for seven years! So here are a couple of tips for consideration.

Firstly, if you need to keep runtime data for completed workflows for more than a month or so, then you shouldn’t keep it in your Workflow system – archive it somewhere else. If you leave large quantities of data in tables which are being hit frequently at runtime, then there will be a significant performance hit. I’ve seen a number of posts in various lists which say something like “I’ve got 2 million rows in my item attribute values table, we’ve never purged anything and the system is slow – what can I do?”

By the time you have this many records, purging the data will also take a long time (some customers have reported that it takes over a day on some systems!) – a Catch-22 situation, since you can’t purge to remove the data because it takes too long, and in the meantime lots more data is being written to the Workflow tables…

If you are archiving the data elsewhere, there are two different ways you can approach this – either push the data from the Workflow tables into the archive system, or pull it into the archive system from the Workflow system.

For a push, you would need to write triggers on each table that Workflow writes to, so that whenever data is inserted or updated, the change is replicated in the archive system. This would be very processor-intensive, since you are effectively running the same job twice. I would not recommend this method.
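To make the cost of the push approach concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for the real database. The table and trigger names (`wf_item_attribute_values`, `arch_item_attribute_values`, `wf_iav_after_insert`, `wf_iav_after_update`) and the cut-down columns are assumptions for illustration only, not the actual Workflow schema, and real Oracle triggers would be written in PL/SQL rather than like this.

```python
import sqlite3

# Hypothetical, cut-down schema: one live table and one archive table.
PUSH_SCHEMA = """
CREATE TABLE wf_item_attribute_values (
    item_type TEXT, item_key TEXT, name TEXT, text_value TEXT);
CREATE TABLE arch_item_attribute_values (
    item_type TEXT, item_key TEXT, name TEXT, text_value TEXT, change_type TEXT);

-- Every insert on the live table is replicated to the archive.
CREATE TRIGGER wf_iav_after_insert AFTER INSERT ON wf_item_attribute_values
BEGIN
    INSERT INTO arch_item_attribute_values
    VALUES (NEW.item_type, NEW.item_key, NEW.name, NEW.text_value, 'INSERT');
END;

-- Every update is replicated as well.
CREATE TRIGGER wf_iav_after_update AFTER UPDATE ON wf_item_attribute_values
BEGIN
    INSERT INTO arch_item_attribute_values
    VALUES (NEW.item_type, NEW.item_key, NEW.name, NEW.text_value, 'UPDATE');
END;
"""


def count_rows(conn, table):
    # The table name comes from fixed strings in this sketch, never user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def demo_push():
    """Run one insert and one update, then return (live rows, archive rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PUSH_SCHEMA)
    conn.execute(
        "INSERT INTO wf_item_attribute_values VALUES ('REQ', '1', 'APPROVER', 'ALICE')"
    )
    conn.execute(
        "UPDATE wf_item_attribute_values SET text_value = 'BOB' WHERE item_key = '1'"
    )
    conn.commit()
    return (
        count_rows(conn, "wf_item_attribute_values"),
        count_rows(conn, "arch_item_attribute_values"),
    )
```

In this sketch a single insert followed by a single update leaves one row in the live table and two in the archive: every runtime write triggers a second write, and all of that extra work happens while the system is busy.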

For a pull, you need to write something that can run at the end of the day to copy everything from the Workflow system into the archive. This is significantly better for performance, since it can be scheduled to run when the system workload is low, and so should not impact the operation of the system. Once you have successfully archived the data, you should then purge the Workflow data. If the requirement is just to archive the runtime data (and not to archive changes to the workflow definition), then you can use the queries which are executed in the standard wfstat.sql script for any completed workflows to determine what information you need to keep.
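As a rough illustration of the pull approach, the sketch below copies completed items into archive tables and then removes them from the live tables, all in one transaction. It uses Python's built-in `sqlite3` module purely as a stand-in for the real database. The table names, the cut-down columns and the `archive_completed` function are assumptions for the example; in particular, it assumes a completed workflow is one with a non-null `end_date`, and it uses a plain `DELETE` where a real system would call the supported Workflow purge routines.

```python
import sqlite3

# Hypothetical, cut-down schema: live tables plus empty archive copies.
SCHEMA = """
CREATE TABLE wf_items (
    item_type TEXT, item_key TEXT, begin_date TEXT, end_date TEXT);
CREATE TABLE wf_item_attribute_values (
    item_type TEXT, item_key TEXT, name TEXT, text_value TEXT);
CREATE TABLE arch_items AS SELECT * FROM wf_items WHERE 0;
CREATE TABLE arch_item_attribute_values AS
    SELECT * FROM wf_item_attribute_values WHERE 0;
"""


def archive_completed(conn):
    """Copy completed items to the archive, purge them, return items archived."""
    with conn:  # commits on success, rolls back if any statement fails
        cur = conn.cursor()
        # 1. Copy the attribute values belonging to completed items.
        cur.execute("""
            INSERT INTO arch_item_attribute_values
            SELECT v.* FROM wf_item_attribute_values v
            JOIN wf_items i
              ON i.item_type = v.item_type AND i.item_key = v.item_key
            WHERE i.end_date IS NOT NULL""")
        # 2. Copy the completed items themselves.
        cur.execute(
            "INSERT INTO arch_items SELECT * FROM wf_items WHERE end_date IS NOT NULL"
        )
        archived = cur.rowcount
        # 3. Purge only after the copy succeeded (child rows first).
        cur.execute("""
            DELETE FROM wf_item_attribute_values
            WHERE EXISTS (
                SELECT 1 FROM wf_items i
                WHERE i.item_type = wf_item_attribute_values.item_type
                  AND i.item_key = wf_item_attribute_values.item_key
                  AND i.end_date IS NOT NULL)""")
        cur.execute("DELETE FROM wf_items WHERE end_date IS NOT NULL")
    return archived
```

Because the copy and the purge share a transaction in this sketch, a failure part-way through leaves the live tables untouched, which matches the advice to purge only once the archive has succeeded. A job like this could be scheduled for a quiet period, for example overnight.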

My recommendation would be to take the second option, and pull the data from Workflow into your archive system. The archive can even be kept in the same database as the Workflow system, but in completely separate tables which are not used during regular operation, or it could be on a completely separate system, using something like a database link to connect to the other database. You could even write something to pull all the information from the runtime tables and create a payload which can be enqueued onto an Oracle Advanced Queue – from there you can determine what other environments pick up the message, or you could just leave the message on the queue for processing / viewing later.
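For the payload idea, one possible shape is a single self-describing document per completed workflow item. The sketch below builds such a document as JSON, again using Python's built-in `sqlite3` module as a stand-in for the real database. The `build_payload` function, the JSON layout and the cut-down tables are all assumptions for illustration; the enqueue step itself is left out, since it depends on how the queue and its payload type are defined.

```python
import json
import sqlite3


def build_payload(conn, item_type, item_key):
    """Return a JSON string describing one workflow item and its attributes."""
    item = conn.execute(
        "SELECT begin_date, end_date FROM wf_items"
        " WHERE item_type = ? AND item_key = ?",
        (item_type, item_key),
    ).fetchone()
    if item is None:
        raise LookupError(f"no workflow item {item_type}/{item_key}")

    # Each (name, text_value) pair becomes one entry in the attributes map.
    attributes = conn.execute(
        "SELECT name, text_value FROM wf_item_attribute_values"
        " WHERE item_type = ? AND item_key = ? ORDER BY name",
        (item_type, item_key),
    ).fetchall()

    payload = {
        "item_type": item_type,
        "item_key": item_key,
        "begin_date": item[0],
        "end_date": item[1],
        "attributes": dict(attributes),
    }
    # sort_keys gives a stable layout, which makes archived messages easy to diff.
    return json.dumps(payload, sort_keys=True)
```

The resulting string is what would be handed to the queue; any environment that picks up the message then gets the whole item in one piece, without needing access to the Workflow tables.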

As ever, any comments or views on my suggestions are more than welcome!
