Updating triggers online with MySQL and MariaDB

Landslide danger
This is what can happen if triggers are updated while applications are running

Updating triggers online means that we are able to do this in production, while our applications are running, without causing a disruption.

To achieve this, we need to run CREATE TRIGGER and DROP TRIGGER inside a transaction, but this is not supported by MySQL and MariaDB. If we try to do so, the current transaction is implicitly committed. MariaDB has a variable called in_transaction that allows us to demonstrate this clearly:

MariaDB [test]> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

MariaDB [test]> SELECT @@in_transaction;
+------------------+
| @@in_transaction |
+------------------+
| 1                |
+------------------+
1 row in set (0.000 sec)

MariaDB [test]> DROP TRIGGER bi_t;
Query OK, 0 rows affected (0.001 sec)

MariaDB [test]> SELECT @@in_transaction;
+------------------+
| @@in_transaction |
+------------------+
| 0                |
+------------------+
1 row in set (0.000 sec)

The problem is actually more generic: MySQL and MariaDB don’t support transactional DDL. But let’s focus on triggers. I’ll explain what problems this causes, and then I will propose a solution.

The problems

The problems are:

  • Phantom trigger – it’s about every single updated trigger;
  • Inconsistent triggers – it’s about a table’s triggers as a whole;
  • Atomicity – it’s about failures.

Phantom triggers

If I want to replace one trigger with another, as mentioned, I will need to run DROP TRIGGER followed by CREATE TRIGGER. This implies that, in the (hopefully brief) interval between these expressions, the trigger will simply not exist.

If I run these statements in a transaction, this is not a problem. In fact, for other users, both the statements will take effect together when I COMMIT the transaction. But again, this cannot happen in MySQL and MariaDB. On a heavily used server, we should definitely expect some users to write the table without activating a trigger.

However, MariaDB has a solution for this problem: CREATE OR REPLACE TRIGGER. In this way the existing trigger (if any) will be destroyed and the new one will be created, but this operation is atomic. No other user can see the table without a trigger.

Inconsistent triggers

Even with the first problem solved, in most cases we have a second problem: we have multiple triggers associated to a table, and we always want them to be consistent each other.

This is better explained with an example. Suppose that we have triggers BEFORE INSERT and UPDATE. These triggers perform some data validation, and return an error if some data in not valid. If you want to know how to do this, take a look at an article I wrote some years ago in my personal blog: Validating data using a TRIGGER.

Anyway, the problem here is that, if we change the validation rules in one trigger, we want to change them in the other trigger also. In no point in time we want the triggers to be inconsistent.

Atomicity

Our triggers operations should completely succeed or completely fail. If, for example, we lose our connection in the middle of these changes, all the changes we’ve done should be undone, to avoid inconsistencies.

The solution

If you used MySQL for centuries like me, you may remember the LOCK TABLES statement. And probably you remember it with horror. It was developed in ancient times as an alternative to transactions, when MyISAM (or probably its ancestor, ISAM) was the default storage engine and InnoDB simply didn’t exist. Basically, since you could not use a real transaction, you locked a whole table with a read lock or a write lock. Then you did some stuff, and then you ran UNLOCK TABLES. Operations executed inside theLOCK block were not atomic, but at least other users could not interfere with them. Nowadays, I only see that command in legacy code – but yes, I still see it.

Why doesLOCK TABLES help in this case? Because the lock survives statements like DROP TRIGGER and CREATE TRIGGER.

So basically what you would do is:

  • LOCK TABLE something WRITE;
  • the command is queued until all existing metadata locks (transactions or DDL in progress) are released; this would happen even with a DROP TRIGGERS;
  • you DROP old triggers;
  • you CREATE new triggers;
  • UNLOCK TABLES.

EDIT: LOCK TABLES and Galera

In the comments, a concern was raised about LOCK TABLES not working as expected as Galera.

Actually, LOCK TABLES seems to work perfectly on the node where it is executed. The problem is that the lock is not propagated to other nodes.

So I would advise it, if one of these best practices is followed:

  • All writes go to the same node;
  • Writes to each set of databases (or at least tables) go to the same node.

If so, you are reasonably safe. If you have automated failover and you want to be extra-safe, run LOCK TABLES and UNLOCK TABLES on all nodes.

See also

Related articles:

Related courses:

Conclusions

What are the drawbacks of this solution?

  • If the connection is lost at some point, no rollback happens. In other word, the Atomicity problem mentioned above is not solved in any way.
  • If a long transaction is running, our LOCK TABLE statement will be queued for a long time. And it will cause several other statements (including SELECTs) to be queue. But this also happens with a simple DROP TRIGGER.
  • For a brief moment, the table will be completely write-locked.

What are the pro’s?

  • No disruption.
  • Isolation between concurrent sessions.
  • Once the lock is acquired, the operations on triggers will be fast.

Other methods are possible, but they are more complicated and not necessarily less risky. For example, if you use row-based replication, you could update the triggers on a slave, promote it to master, and then update the triggers on the master. But then you have a risk of conflicts during the failover.

All in all, I believe that this method is good enough for most situations.

If you found this article useful,
you may consider to contact me for the training:
MySQL Stored Procedures

Toodle pip,
Federico Razzoli

Photo credit: Stefano Trucco

Comments (5)

    1. Good point Karl. I’ll explain what can be done with Galera (at the expense of making triggers slow) in another post. Unfortunately, I don’t have a good and clean solution.

    2. Karl, I’ve made a test. It is half-true that LOCK TABLES is not supported in Galera. Actually it works on the same node, but the lock is not propagated to other nodes. So, you can use it in these situations:
      * You write to only one node;
      * You write to multiple nodes, but each set of databases/tables is written by a single node.
      Of course there is still a risk of a crash followed by an automated failover after you’ve ran LOCK TABLES. But I think it’s ok to take this risk.

Leave a Reply

Your email address will not be published. Required fields are marked *