1. Overview
Like PostgreSQL, the Lustre file system is an open source project, and it started about 20 years ago. According to Wikipedia, Lustre is a parallel distributed file system designed for large-scale cluster computing, with native Remote Direct Memory Access (RDMA) support. Lustre file systems are scalable and can be part of multiple computer clusters with tens of thousands of client nodes, tens of petabytes (PB) of storage on hundreds of servers, and more than a terabyte per second (TB/s) of aggregate I/O throughput. This blog explains how to set up a simple Lustre file system on CentOS 7 and run PostgreSQL on it.
2. Lustre file system
To deliver parallel file access and improve I/O performance, the Lustre file system separates metadata services from data services. From a high-level architecture point of view, it consists of the following basic components:
Management Server (MGS): provides information about how the file system is configured, notifies clients about changes in the file system configuration, and plays a role in the Lustre recovery process.
Metadata Server (MDS): manages the file system namespace and provides metadata services to clients, such as filename lookup, directory information, file layouts, and access permissions.
Metadata Target (MDT): stores metadata information and holds the root information of the file system.
Object Storage Server (OSS): stores file data objects and makes the file contents available to Lustre clients.
Object Storage Target (OST): stores the contents of user files.
Lustre Client: mounts the Lustre file system and makes the contents of the namespace visible to users.
Lustre Networking (LNet): the network protocol used for communication between Lustre clients and servers, with native RDMA support.
If you want to know more about how Lustre works inside, you can refer to Understanding Lustre Internals.
3. Setup Lustre on CentOS 7
To set up a simple Lustre file system for PostgreSQL, we need four machines: an MGS-MDS-MDT server, an OSS-OST server, and two Lustre clients, client1 and client2 (the Postgres servers). In this blog, I used CentOS 7 virtual machines with the network settings below:
    MGS-MDS-MDT: 10.10.1.1
3.1. Install Lustre
To avoid dealing with firewall and SELinux policy issues, I simply disabled both: set SELINUX=disabled in /etc/selinux/config, and run the commands below:
    systemctl stop firewalld
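The SELinux edit can be scripted. Below is a minimal sketch; the function name set_selinux_disabled is my own, and the config path is a parameter so you can try it on a copy first. The commented setenforce and systemctl disable lines are my additions rather than part of the original steps: a change to /etc/selinux/config only takes effect after a reboot, and systemctl stop alone does not keep firewalld off after one.

```shell
# Sketch: set SELINUX=disabled in an SELinux config file.
# Only the SELINUX= line is rewritten; SELINUXTYPE= and comments are kept.
set_selinux_disabled() {
    local cfg="$1" tmp status=0
    tmp="$(mktemp)" || return 1
    # Write the edited copy to a temp file, then copy it back over the
    # original so the original file's ownership and mode are preserved.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg" || status=1
    rm -f "$tmp"
    return "$status"
}

# On each machine, as root (not executed by this sketch):
#   set_selinux_disabled /etc/selinux/config
#   setenforce 0                  # addition: permissive until the next reboot
#   systemctl stop firewalld
#   systemctl disable firewalld   # addition: keep it off after a reboot
```

Disabling both is acceptable for a throwaway test cluster like this one; on anything shared you would open the required ports and write a policy instead.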
Add the Lustre release information to /etc/yum.repos.d/lustre.repo:
    [lustre-server]
Then update yum and install the file system utilities e2fsprogs to deal with ext4:
    yum update && yum upgrade -y e2fsprogs
If there are no errors, install the Lustre server and tools with yum install -y lustre-tests
3.2. Setup lnet network
Depending on how your network interfaces are set up, add the matching lnet configuration. For example, all of my CentOS 7 machines have a network interface named enp0s8, so I added the line options lnet networks="tcp0(enp0s8)" to /etc/modprobe.d/lnet.conf as my Lustre lnet network configuration.
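If your interface has a different name, the configuration line can be generated instead of typed. This is only a sketch; lnet_options_line is a hypothetical helper name, and the values in the usage comment are the ones from my machines.

```shell
# Sketch: print the modprobe options line for one LNet network.
#   $1 = LNet network name (for example tcp0)
#   $2 = network interface that carries the traffic (for example enp0s8)
lnet_options_line() {
    local net="$1" iface="$2"
    printf 'options lnet networks="%s(%s)"\n' "$net" "$iface"
}

# As root on every Lustre node (not executed by this sketch):
#   ip -o link show enp0s8        # confirm the interface exists first
#   lnet_options_line tcp0 enp0s8 > /etc/modprobe.d/lnet.conf
```

Keeping the network name and the interface as two arguments makes it harder to end up with a different network name on one node, which would leave that node unable to reach the others.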
Then we need to load the lnet driver into the kernel and start the lnet network by running the commands below:
    modprobe lustre
You can check whether the lnet network is running on your Ethernet interface with the command lctl list_nids; you should see something like this:
    10.10.1.1@tcp
You can try to ping the other Lustre servers over the lnet network by running the command lctl ping 10.10.1.2@tcp. The network name after the @ has to match the one configured above (tcp0 is displayed as tcp). If the lnet network is working, you should see output like this:
    12345-0@lo
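The NID check is easy to turn into a small test that a setup script can run before going further. A minimal sketch, assuming lctl list_nids prints one NID per line as in the output above; has_nid is a hypothetical helper name, not a Lustre command.

```shell
# Sketch: succeed if the expected NID appears in a NID list read from stdin.
#   $1 = expected NID, for example 10.10.1.1@tcp
has_nid() {
    local want="$1"
    # Trim surrounding whitespace, then require an exact whole-line match
    # (-F fixed string, -x whole line, -q quiet).
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -Fxq -- "$want"
}

# On the MGS-MDS-MDT server (not executed by this sketch):
#   lctl list_nids | has_nid "10.10.1.1@tcp" && echo "LNet is up"
#   lctl ping 10.10.1.2@tcp
```

The exact match matters here: a substring match would also accept 10.10.1.1@tcp1, which is a different LNet network.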
3.3. Setup MGS/MDS/MDT and OSS/OST servers
To set up the storage for the MGS/MDS/MDT server, I added one dedicated virtual disk (/dev/sdb), created one partition (/dev/sdb1), and formatted it as ext4.
    fdisk /dev/sdb
You need to repeat the same process on the OSS/OST server to add the disk that stores the actual file data.
If everything goes fine, it is time to format the disks as Lustre targets and mount them on the Lustre servers. First, on the MGS/MDS/MDT server, run the command below:
    mkfs.lustre --reformat --fsname=lustrefs --mgs --mdt --index=0 /dev/sdb1
Second, we do the same on the OSS/OST server with the commands below, pointing --mgsnode at the NID of the MGS:
    mkfs.lustre --reformat --ost --fsname=lustrefs --mgsnode=10.10.1.1@tcp --index=0 /dev/sdb1
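Once a partition has been formatted with mkfs.lustre, the target is started by mounting it with file system type lustre. The sketch below wraps that step; the helper name mount_lustre_target and the mount points /mnt/mdt and /mnt/ost are my assumptions, so pick whatever paths suit your servers.

```shell
# Sketch: create the mount point and mount a formatted Lustre target.
#   $1 = block device formatted with mkfs.lustre (for example /dev/sdb1)
#   $2 = mount point directory (assumed paths, see the usage below)
mount_lustre_target() {
    local dev="$1" dir="$2"
    mkdir -p "$dir" && mount -t lustre "$dev" "$dir"
}

# As root (not executed by this sketch). Bring up the MGS/MDT before the
# OST, because the OST registers with the MGS when it first starts.
#   mount_lustre_target /dev/sdb1 /mnt/mdt    # on the MGS-MDS-MDT server
#   mount_lustre_target /dev/sdb1 /mnt/ost    # on the OSS-OST server
```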
3.4. Setup Lustre clients
After the Lustre servers are set up, we can simply mount the Lustre file system on each client by running the commands below:
    mkdir /mnt/lustre
If there are no errors, you can verify the mount by creating a text file with some content on one client and reading it from the other client.
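The client-side steps can be sketched as follows. A Lustre client mounts the file system as MGS NID, then ":/", then the file system name. The helper names and the marker file name lustre-check.txt are hypothetical, and the NID uses @tcp to match the lctl list_nids output shown earlier.

```shell
# Sketch: build the mount source "<MGS NID>:/<fsname>" for a Lustre client.
lustre_mount_source() {
    local mgs_nid="$1" fsname="$2"
    printf '%s:/%s\n' "$mgs_nid" "$fsname"
}

# Sketch: write a marker file naming the host that wrote it.
write_marker() {
    local dir="$1"
    printf 'hello from %s\n' "$(uname -n)" > "$dir/lustre-check.txt"
}

# Sketch: print the marker file; fails if no other client has written it.
read_marker() {
    local dir="$1"
    cat "$dir/lustre-check.txt"
}

# As root on each client (not executed by this sketch):
#   mkdir -p /mnt/lustre
#   mount -t lustre "$(lustre_mount_source 10.10.1.1@tcp lustrefs)" /mnt/lustre
# Then on client1:  write_marker /mnt/lustre
# And on client2:   read_marker /mnt/lustre    # should name client1
```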
3.5. Setup Postgres on Lustre file system
As there are so many tutorials about how to set up Postgres on CentOS, I will skip that part. Assuming you have installed Postgres, either from an “official release” or compiled from the source code yourself, run the tests below from client1:
    initdb -D /mnt/lustre/pgdata
    pg_ctl -D /mnt/lustre/pgdata -l /tmp/logfile start
    select count(*) from test;
    pg_ctl -D /mnt/lustre/pgdata -l /tmp/logfile stop
From these simple tests, you can confirm that the table created and the records inserted by client1 are stored on the remote Lustre file system, and that if the Postgres server on client1 stops, you can start a Postgres server on client2 and query all the records inserted by client1.
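One caution when moving between clients: two Postgres servers must never run on the same data directory at the same time. Postgres guards against this with postmaster.pid, but that check compares process IDs on the local host, so I would not rely on it across two machines. The sketch below adds a cruder guard before starting on client2; pgdata_in_use and start_if_free are hypothetical helper names, and a postmaster.pid left behind by a crash will also make it refuse, in which case you need to confirm by hand that client1 is really down.

```shell
# Sketch: succeed if the data directory still has a postmaster.pid,
# which a clean "pg_ctl stop" removes.
pgdata_in_use() {
    local pgdata="$1"
    [ -e "$pgdata/postmaster.pid" ]
}

# Sketch: start Postgres only if no other server appears to own the directory.
start_if_free() {
    local pgdata="$1"
    if pgdata_in_use "$pgdata"; then
        echo "refusing to start: $pgdata/postmaster.pid exists" >&2
        return 1
    fi
    pg_ctl -D "$pgdata" -l /tmp/logfile start
}

# On client2, after stopping Postgres on client1 (not executed by this sketch):
#   start_if_free /mnt/lustre/pgdata
```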
4. Summary
In this blog, I explained how to set up Lustre, a parallel distributed file system, in a local environment and how to verify it with PostgreSQL servers. I hope this blog helps if you want to evaluate distributed file systems.