Activity Log or Activity Lock pattern?

Today at NAV TechDays 2015 in Antwerp, I attended a presentation by Nikola Kukrika, Ciprian Iordache, and Gary Winter, at which Ciprian presented a new design pattern based on a new feature in NAV 2016: Activity Log.

In a nutshell, this new feature, and thus the design pattern, aim at helping trace issues and facilitate troubleshooting by leaving information in the database about what was done, by whom, if it failed, and if it did, why.

All pretty and nice, but – as Ciprian pointed out – there is a bad aspect about it: when an error happens, and everything is rolled back – the activity entry is rolled back together with everything else. Nikola later explained how this could be solved through writing into temporary tables, but this either requires redesign by Microsoft or special coding techniques by you.

However, there is a bigger caveat than this. Non-persisting data in case of errors is a problem, but the bigger problem lies elsewhere. Let me explain that bigger problem here, and then give a silver-bullet solution to solve it all.

The biggest issue here is locking and the biggest price we pay here is concurrency.

When you write to the Activity Log table, you lock that table so nobody else who wants to write to Activity Log can’t do that until you release your locks.

Concurrency suffers the most when there are more locked tables. Think of this: if there are five people each writing to a random table, and there are around 1100 tables in NAV, you have one in 220 chance two of them will want to write to the same table. If each user is locking five tables, you have one in 44 chance. That’s if writes are done randomly.

And we all know writes are not done randomly. Some tables are busier than others, and locking them causes more chances that your lock will cause timeout for somebody else.

Tables like Sales Line, Purchase Line, Gen. Journal Line, Item Journal Line, are busier than Company Information, Report Selection, Vendor Posting Group or Payment Terms.

If you have five users working in five different functional areas, they are less likely to encounter locking issues because they work on different underlying tables. Five users all working in the same functional area, for example all entering sales lines at the same time, are more likely to encounter these issue.

Let’s call these vertical and horizontal tables. Vertical tables are those that pertain to a single feature, whereas horizontal pertain to many features. You could say that Production Order Line is very vertical,  Sales Line is fairly vertical, G/L Entry is fairly horizontal.  In this landscape, Activity Log is extremely horizontal.

Horizontal tables have potential to cause more locks, and the likelihood that you’ll encounter a locking issue is proportional to the amount of time the locks stays in place.

To achieve better concurrency, you need to place locks as late as possible into the transaction. This is called optimistic locks (you don’t lock just in case; in fact you are optimistic about possibility that when you want to lock, you will be able to lock). Microsoft has invested great deal of effort, time and money into making G/L posting procedure as optimistic as possible, and they have done fairly good progress with it.

As opposed to this we have a concept of pessimistic locking: placing a lock as early as possible, just in case, so that you make sure immediately at the beginning of a transaction that you will have access to the table once you need it. Pessimistic locking causes a ton of concurrency issues, as chances that somebody else will want to access exactly the same table are higher.

Activity logging itself is not an expensive operation. It takes next to no time itself, but the issue is that in a long transaction a lock on Activity Log will remain there and prevent other users from logging their activities until you complete your whole transaction.

Activity Log is progressively more useful as it logs more information from more processes – because then your troubleshooting is simpler. However, this also means that likelihood of it causing locking issue is higher with every new feature you build it into.

The other problem, as Ciprian and Nikola pointed out, is that if there is an error – something you’d want to see in the log – the log is not written to the database because it itself is rolled back together with everything else.

As I said earlier – I have a simple solution to propose for this – which takes care of both of those issues in a single go. However, since I don’t want this post to be too long, and I want the solution (which applies to more scenarios than just activity logging) to be searchable by itself, I’ll write a separate post for that, too. I’ll post it right away.

Vjeko

Vjeko has been writing code for living since 1995, and he has shared his knowledge and experience in presentations, articles, blogs, and elsewhere since 2002. Hopelessly curious, passionate about technology, avid language learner no matter human or computer.

This Post Has 5 Comments

  1. GrumpyNAVdev

    Why do you say this cause lock issues? There is no problem in 2 users inserting records in a table with AutoIncrement at the same time. Have you tried it? It works fine in NAV2009R2.

    1. Vjeko

      Whoops 🙂 I didn’t notice it was an AutoIncrement field. You are absolutely right – it won’t cause concurrency issues.

  2. mgm

    Statistics has never been my cup of tea, but the simple conclusion that ‘the chance of 2 people writing to the same table is 1 in 220 when 5 people are writing to a database with 1100 tables at the same time’ got my attention.
    So I did some ‘research’. Let’s assume you can compare this question to the birthday problem: https://en.wikipedia.org/wiki/Birthday_problem.
    If we follow this theory, the above will not be approximately 0,5%, but 1.0%!
    Note that we do not take into account the time-factor: what is ‘at the same time’ ?
    In reality it makes a difference if the write transaction takes a few milliseconds, or a couple of seconds. So in fact, it is very hard to make any estimation of chances in writing to the database ‘at the same time’.
    Just a note.

    1. Vjeko

      “Same time” in my example was simultaneously. Yes, time factor should be taken into account, but all my examples here are illustrative. The purpose was not to show hard-core statistics, but to show that “horizontal” tables have much higher chances of causing locks than “vertical” tables. But thanks for pointing out the Birthday Problem.

Leave a Reply