1. Overview
PostgreSQL supports many System Information Functions, such as Session Information Functions, Access Privilege Inquiry Functions, Schema Visibility Inquiry Functions, System Catalog Information Functions, Transaction ID and Snapshot Information Functions, etc. However, you may want build some special functions and integrate them into the PostgreSQL. This blog is going to walk through the steps about how to build your own System Information Functions into PostgreSQL.
2. Analyze the requirement
Since there are so many functions built-in PostgreSQL already, you should perform some research and analysis before you decide to create a new one. In this blog, I want to check the transaction id after each savepoints in an ongoing transaction, so that I can perform some visibility check before the whole transaction is committed. For example, I have a query like below, but I can’t figure out the transaction id after each savepoint if I use existing System Information Function .
1 | postgres=# CREATE TABLE tbl (data text); |
As you can see, I always get the same Transaction ID even after a savepoint using existing txid_current() function. In this case, I decide to create my own system information function to retrieve the information I want.
3. System Information Function template
To create your own System Information Function, you need to check the System Catalogs to see in which category it can fit. For the case mentioned above, I chose the catalog pg_proc which stores information about functions, procedures, aggregate functions, and window functions. The document System Catalog Initial Data provides more detailed description and examples about the .dat
file format and the rules to define your own OID.
Now, let’s define a function, say current_xid_list();
. Below is an example about how the function initial data may look like.
1 | # function to get the top and sub transactions XIDs |
Here, you need to generate your OID which doesn’t create any conflict. In the official PostgreSQL document, 69.2.2. OID Assignment, it descripts how the OIDs are managed and also provides a script src/include/catalog/unused_oids
to list the unused OIDs. For example, if you run the script like below, you will get a list of OIDs that are not used by PostgreSQL yet.
1 | $ src/include/catalog/unused_oids |
In my case, I picked the OID 5566 just for easy to remember. However, if you are planning to generate a System Information Function as patch and later submit to PostgreSQL community, then you better follow the rule to minimize the risk of OID collisions with other patches. What the PostgreSQL community recommended is a random OID number between 8000—9999
. Here how it is described in the official document, When choosing OIDs for a patch that is not expected to be committed immediately, best practice is to use a group of more-or-less consecutive OIDs starting with some random choice in the range 8000—9999.
After this system information initial data definition, we need to define the actual c function which can help to retrieve the information we need. In the case above, there might be a bunch of SAVEPOINTs in one transaction, so we need return a list of those ongoing transaction IDs. Below is an example, which refer to the way that how the function txid_current_snapshot
was built.
1 | /* |
In the above function, you can define your data format. We decided to define a function which can help retrieve the transaction IDs within current transaction.
1 | /* |
Now, if you recompile PostgreSQL with above changes (of course, you need to declare the functions in proper files), you should be able to use your own System Information Function to see the difference based on what we has discussed at the beginning.
4. Testing
Once build success, restart the PostgreSQL server. Now, let’s try the previous SQL query again. As you can see, a different transaction ID 489
after a checkpoint is showing up.
1 | postgres=# INSERT INTO tbl VALUES('HelloWorld-1'); |
If you add another savepoint, and then compare your newly added function current_xid_list()
with existing txid_current()
, you can see that current_xid_list()
displays not only the top transaction ID, but also all the transaction IDs after each savepoint.
8. Summary
In this blog, we discussed PostgreSQL System Information Functions, and perform a simple walk-through about how to create your own System Information Function following the official document, and run some simple tests.