How Postgres Audit Tables Saved Us From Taking Down Production

Audit tables record the changes that happen to rows in another table. They’re like commit logs for database tables, and they’re typically used to figure out who made what changes when. But surprisingly, we’ve found them useful for keeping our distributed Postgres cluster stable.
To convince you of the stability-related value of audit tables, we’ll cover how audit tables helped us avoid a serious incident. Then, we’ll briefly cover how you can create audit tables in your own Postgres databases.
The almost-incident
Heap is a product analytics tool that automatically captures web and mobile behavior like page views, clicks, and mobile touches. We’re operating at a scale of billions of events per day, which we store across a distributed Postgres cluster.
The cluster has a coordinator node, which contains metadata about what data lives on which worker nodes. All our data is either a user or an event, so our metadata looks like this:
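Here’s a simplified, illustrative view of that metadata (in our Citus-based cluster, the real thing lives in coordinator catalog tables like the pg_dist_shard table we cover later; the shard ids and node names below are made up):

| shardid | table  | worker node |
|---------|--------|-------------|
| 102008  | users  | worker-1    |
| 102009  | events | worker-1    |
| 102010  | users  | worker-2    |
| 102011  | events | worker-2    |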
Our write path uses this metadata to determine where incoming users and events should be written.
Notice that each node is the home for a users table and a corresponding events table. Each users table is placed with an events table that contains the events performed by those users. It’s important that these two tables are co-located on the same machine, so that we can join them during queries without having to send data over the network. The Kafka consumers that write to these workers check that this invariant is satisfied before doing any writing to workers.
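To illustrate why co-location matters, a query like the following sketch can be answered entirely on one worker, joining each users shard to its co-located events shard without shipping rows across the network. The table and column names here are illustrative, not Heap’s actual schema:

```sql
-- Count events performed by each user over the last week. Because each
-- users shard lives on the same worker as its matching events shard,
-- this join runs locally on every worker.
SELECT u.user_id, count(*)
FROM users u
JOIN events e ON e.user_id = u.user_id
WHERE e.time > now() - interval '7 days'
GROUP BY u.user_id;
```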
This makes altering the coordinator metadata a potentially dangerous affair. If we accidentally violate the invariant, the consumers won’t ingest any more data until the metadata is fixed. This is a problem because we want our customers to be able to answer questions about how their product is being used in real time; they shouldn’t have to wait hours for their event data to show up in Heap analyses. We call this delay “ingestion latency,” and we strive to keep it on the order of a minute or two.
Although altering the coordinator metadata is potentially dangerous, it’s unavoidable. Sometimes we need to re-balance data across the cluster or, as in our case, perform a schema migration across many shards in the cluster. It was during a schema migration that we nearly caused some severe ingestion latency.
We wanted to perform our schema migration on our distributed events table with no downtime and no ingestion latency. Since you can’t write to a table while changing its schema, we decided that for each shard, we should (see the sketch after this list):
- Copy data to a new shard with the new schema on the same worker
- Add the new shard id to the coordinator metadata table
- Drop the old shard from the coordinator metadata table and the worker
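In SQL terms, the per-shard work looked roughly like this sketch. The real orchestration was Node code, and every table name and shard id below is illustrative:

```sql
-- On the worker: copy the shard's rows into a new shard table that was
-- created ahead of time with the new schema.
INSERT INTO events_1102008 SELECT * FROM events_102008;

-- On the coordinator: register the new shard by copying the old shard's
-- metadata row under the new shard id.
INSERT INTO pg_dist_shard (logicalrelid, shardid, shardstorage, shardminvalue, shardmaxvalue)
SELECT logicalrelid, 1102008, shardstorage, shardminvalue, shardmaxvalue
FROM pg_dist_shard
WHERE shardid = 102008;

-- On the coordinator: drop the old shard's metadata...
DELETE FROM pg_dist_shard WHERE shardid = 102008;
-- ...and on the worker: drop the old shard table.
DROP TABLE events_102008;
```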
Unfortunately, the Node code we wrote to do this had a bug: it treated a string as if it were a number. After migrating a shard to a new schema, we created the new shard id by adding a fixed offset to the old shardid and then dropping the entry for the old shard from the coordinator. However, because the shardid was a string, adding the offset resulted in concatenation (in JavaScript, '123' + 456 yields '123456', not 579), which in turn produced a new shardid that the coordinator rejected because it was too large to fit into a bigint.
Although we use TypeScript to mitigate these kinds of bugs, when you’re working with external systems, it’s easy to find yourself in a situation where the types are lying to you. Fortunately, our code didn’t delete any shard data unless metadata for the migrated shard had been added to the coordinator. Unfortunately, because of this bug, the coordinator metadata still wound up in a state where it was missing event shard metadata for a particular worker.
A few minutes later, the on-call engineer was paged with a link to our ingestion latency metrics.

Ingestion latency was spiking, but once we discovered the bug in our code, we were able to quickly reverse the removal of those shard metadata entries using our audit tables. The SQL to restore the deleted metadata looked a little like this:
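Here’s a sketch of that restore, assuming the audit table layout we describe in the next section; the shard ids are placeholders:

```sql
-- Re-insert the metadata rows that the buggy migration deleted, straight
-- from the audit table's record of the DELETE operations.
INSERT INTO pg_dist_shard (logicalrelid, shardid, shardstorage, shardminvalue, shardmaxvalue)
SELECT logicalrelid, shardid, shardstorage, shardminvalue, shardmaxvalue
FROM pg_dist_shard_audit
WHERE operation = 'DELETE'
  AND shardid IN (102008, 102009);  -- placeholder ids for the dropped shards
```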
Without the audit table, this minor incident would have been far more severe: it would have been much more difficult to track down which shards had been deleted and to quickly re-add them to the table. Logs would have been our only way to fix this, and we can all imagine how difficult it would have been to reconstruct the correct table from logs.
How to create audit tables
To make an audit table, start by creating an ordinary table with:
- Columns that match the columns of the table you want to audit
- Some additional columns that describe the changes made to the audited table
For example, an audit table for the above pg_dist_shard table would look like this:
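Here’s a minimal sketch: the first five columns mirror pg_dist_shard, and the last three describe each change (the names of the change-tracking columns are our own convention, not a Postgres requirement):

```sql
CREATE TABLE pg_dist_shard_audit (
    -- columns mirroring the audited pg_dist_shard table
    logicalrelid  regclass,
    shardid       bigint,
    shardstorage  "char",
    shardminvalue text,
    shardmaxvalue text,
    -- columns describing the change itself
    operation     text        NOT NULL,                       -- 'INSERT', 'UPDATE', or 'DELETE'
    changed_by    text        NOT NULL DEFAULT current_user,  -- who made the change
    changed_at    timestamptz NOT NULL DEFAULT now()          -- when the change happened
);
```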
The trigger procedure would look like this:
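A minimal PL/pgSQL sketch; the function name is illustrative:

```sql
CREATE OR REPLACE FUNCTION audit_pg_dist_shard() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        -- For deletes, record the row as it looked before it was removed.
        INSERT INTO pg_dist_shard_audit
            (logicalrelid, shardid, shardstorage, shardminvalue, shardmaxvalue, operation)
        VALUES
            (OLD.logicalrelid, OLD.shardid, OLD.shardstorage,
             OLD.shardminvalue, OLD.shardmaxvalue, TG_OP);
        RETURN OLD;
    ELSE
        -- For inserts and updates, record the new version of the row.
        INSERT INTO pg_dist_shard_audit
            (logicalrelid, shardid, shardstorage, shardminvalue, shardmaxvalue, operation)
        VALUES
            (NEW.logicalrelid, NEW.shardid, NEW.shardstorage,
             NEW.shardminvalue, NEW.shardmaxvalue, TG_OP);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;
```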
This code uses several special variables that are automatically defined for triggers:
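- TG_OP: a string naming the operation that fired the trigger: 'INSERT', 'UPDATE', or 'DELETE'
- OLD: the row as it looked before an UPDATE or DELETE (null for an INSERT)
- NEW: the row as it looks after an INSERT or UPDATE (null for a DELETE)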
After defining a trigger procedure like the one above, we can wire it up with SQL along these lines:
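```sql
-- Fire the audit procedure after every row-level change to pg_dist_shard.
-- (The trigger name is illustrative; on Postgres 10 and older, write
-- EXECUTE PROCEDURE instead of EXECUTE FUNCTION.)
CREATE TRIGGER pg_dist_shard_audit_trigger
AFTER INSERT OR UPDATE OR DELETE ON pg_dist_shard
FOR EACH ROW EXECUTE FUNCTION audit_pg_dist_shard();
```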
Conclusion
So, audit tables aren’t just useful for seeing who changed what in your database. They also make it easy to undo and/or redo changes that cause problems downstream in your system. If that sounds appealing and you’re okay with the overhead of adding a few triggers to the table you’re auditing, consider creating audit tables in your own system. They’re a great fit for metadata tables like these, which are small and contain key information.
If you like nerding out about underrated Postgres features, @-me on Twitter @philosohacker, and if you like working on large data systems, we’re hiring! Check out our team and open roles.
I’d like to thank John Krauss and Daniel Robinson for their essential feedback on earlier drafts of this post.