Self Managed High Availability Clusters

END OF LIFE

Support for GreenArrow’s High Availability Cluster ends on June 30, 2024.

Table of Contents
Overview
Failover
Reboots
IP Address Management
Data Synchronization
- PostgreSQL Streaming Replication
- DRBD
  - DRBD Troubleshooting

Overview

This page provides additional technical details of how GreenArrow’s High Availability Cluster works, beyond what’s described in the High Availability Cluster Administration page.

This information is intended for experienced Linux systems administrators who are tasked with managing the cluster. The cluster manager should be comfortable working with DRBD and PostgreSQL, managing RPM packages, working with kernel modules, reading shell scripts and log files, and configuring a Linux server’s IP addresses.

Note for Managed Customers

As part of our Managed Services, we perform the tasks described on this page for you. You should only perform these tasks yourself if they have been pre-approved by, and scheduled with, GreenArrow.

After-hours support for issues determined not to be Critical Malfunctions may be subject to our after-hours service rate. This includes being paged by our monitoring system as a result of unscheduled maintenance.

Failover

Failover is the process of switching which server is running GreenArrow’s production services. This might be done if, for example, you need to perform hardware maintenance on the server that’s currently acting as Primary.

Failover usually takes about a minute to complete, during which GreenArrow services are offline.

Here’s how to fail over. All commands should be run as the root user:

Demote the server that’s currently filling the Primary role so that it becomes a Secondary. This step is required because only one server can fill the Primary role at a time:
```
hvmail_make_secondary
```
Wait for the above command to finish executing before moving on to the next step.

If the hvmail_make_secondary command fails, it’s safe to correct the underlying issue and then run the command again.
Promote the desired server to the Primary role:
```
hvmail_make_primary
```
Complete the steps in the “Diagnostics” section of the High Availability Cluster Administration page to verify that services are running normally.
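The ordering of the steps above can be sketched as a small wrapper. This is only an illustration, not part of GreenArrow: the failover function, the REMOTE variable, and the host names are hypothetical, and it assumes that both scripts are on root’s PATH in a non-interactive SSH session and that you connect through addresses that do not move between servers.

```shell
#!/bin/sh
# Hypothetical wrapper around the two documented commands.
# REMOTE defaults to ssh; set REMOTE=echo to print the commands instead (dry run).
REMOTE="${REMOTE:-ssh}"

failover() {
    old_primary="$1"   # host currently acting as Primary
    new_primary="$2"   # host that should become Primary

    # Demote first: only one server can fill the Primary role at a time.
    if ! $REMOTE "root@$old_primary" hvmail_make_secondary; then
        echo "demotion of $old_primary failed; not promoting $new_primary" >&2
        return 1
    fi

    # Promote only after the demotion has finished successfully.
    $REMOTE "root@$new_primary" hvmail_make_primary
}
```

For example, `REMOTE=echo failover ha1.example.com ha2.example.com` prints the two commands in order without running anything. If the demotion fails, the promotion is skipped so that two servers never end up as Primary.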

Reboots

Each time a server in the cluster is rebooted, either the hvmail_make_secondary or the hvmail_make_primary script (described in this document’s “Failover” section) needs to be run on it as the root user.

It’s safe to automatically run the hvmail_make_secondary script at boot time, but caution should be used when automating the execution of the hvmail_make_primary script. This is because only one server should be Primary at any given time.

IP Address Management

All of GreenArrow’s IPs should be assigned to whichever server is acting as the Primary at any given point. This means that your Linux distribution’s standard IP address configuration files should only be used for IPs that don’t move between servers. You should list all IPs that should be assigned to the server that’s acting as the Primary in the /etc/hvmail_primary_ips file, along with each IP’s subnet mask in CIDR format.

Here’s an example of what that file would look like if you were assigning the 1.2.3.4 and 1.2.3.5 IPs, each with a /24 subnet mask (255.255.255.0):

1.2.3.4/24
1.2.3.5/24

Be sure to synchronize any changes made to the /etc/hvmail_primary_ips file between servers. For example, you might use scp to copy the file between servers after each edit.
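As a minimal sketch, the file can be sanity-checked before it is copied. The check_primary_ips function is hypothetical, and it assumes the file contains nothing but one IPv4 address with a CIDR prefix per line, with no comments or blank lines. It checks only the shape of each line, not whether each octet is in range.

```shell
#!/bin/sh
# Hypothetical helper: fail if any line is not of the form a.b.c.d/prefix.
check_primary_ips() {
    file="${1:-/etc/hvmail_primary_ips}"

    # -v selects the lines that do NOT match; -n prefixes them with line numbers.
    bad=$(grep -Evn '^([0-9]{1,3}\.){3}[0-9]{1,3}/([0-9]|[12][0-9]|3[0-2])$' "$file")

    if [ -n "$bad" ]; then
        echo "unexpected lines in $file:" >&2
        echo "$bad" >&2
        return 1
    fi
}
```

It could then be combined with scp, for example `check_primary_ips && scp /etc/hvmail_primary_ips root@peer.example.com:/etc/hvmail_primary_ips`, where peer.example.com is a placeholder for the other server in the cluster.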

After adding or removing IPs in the configuration file, apply the changes manually by using the ip command. This way you won’t create downtime by failing over to apply the changes. Here are a few examples:

To add 1.2.3.4/24 to the eth0 interface, run:
```
ip addr add 1.2.3.4/24 dev eth0
```

To add 1.2.3.2 through 1.2.3.100 to the eth0 interface, run:

for i in $(seq 2 100); do
    ip addr add 1.2.3.$i/24 dev eth0
done

To remove 1.2.3.4/24 from the eth0 interface, run:
```
ip addr del 1.2.3.4/24 dev eth0
```
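To find out which addresses from the configuration file are not yet assigned on the Primary, something like the following sketch may help. The function names are hypothetical, and the file is assumed to hold one address/prefix entry per line. The script only prints the ip addr add commands for review and never runs them. It deliberately does not suggest removals, because the interface usually also carries addresses that don’t move between servers and are not listed in the file.

```shell
#!/bin/sh
# List the IPv4 addresses (with prefix) currently assigned to an interface.
list_ips() {
    ip -o -4 addr show dev "$1" | awk '{print $4}'
}

# Hypothetical helper: print an "ip addr add" command for every entry in the
# file that is not currently assigned to the interface. Nothing is executed.
missing_primary_ips() {
    dev="$1"
    file="${2:-/etc/hvmail_primary_ips}"
    current=$(list_ips "$dev")

    # The "|| [ -n ... ]" part handles a final line without a trailing newline.
    while read -r cidr || [ -n "$cidr" ]; do
        [ -n "$cidr" ] || continue
        # -x matches whole lines only; -F treats the address as a fixed string.
        printf '%s\n' "$current" | grep -qxF "$cidr" ||
            echo "ip addr add $cidr dev $dev"
    done < "$file"
}
```

Running `missing_primary_ips eth0` on the Primary prints one command per missing address, which can then be reviewed and run by hand.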

Remember to also configure (or remove) the corresponding IP Address VirtualMTAs.

Data Synchronization

A High Availability Cluster synchronizes its data using the following methods:

- PostgreSQL Streaming Replication
- DRBD

The following sections provide more details on each.

PostgreSQL Streaming Replication

GreenArrow’s PostgreSQL database is synchronized using PostgreSQL streaming replication. PostgreSQL 16 is used in recent installations. Some legacy installations use older PostgreSQL releases and can be migrated to 16 upon request.

As part of the initial configuration process, GreenArrow configures streaming replication by updating the following configuration files:

The PostgreSQL Network Service page contains more details on GreenArrow’s PostgreSQL installation.
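Replication state can be inspected with PostgreSQL’s own statistics views. The sketch below is an assumption-laden illustration: the function names and the PSQL variable are hypothetical, it assumes a psql client that can connect as a sufficiently privileged user (the path and connection settings in a GreenArrow installation may differ), and the column names shown apply to PostgreSQL 10 and later.

```shell
#!/bin/sh
# Hypothetical: override PSQL to point at the psql binary and connection
# options appropriate for your installation.
PSQL="${PSQL:-psql}"

# Run on the Primary: prints one row per connected standby.
replication_status() {
    $PSQL -X -A -t -c "SELECT client_addr, state, sent_lsn, replay_lsn FROM pg_stat_replication;"
}

# Run on the Secondary: succeeds while the server is a standby in recovery.
is_standby() {
    [ "$($PSQL -X -A -t -c 'SELECT pg_is_in_recovery();')" = "t" ]
}
```

On a healthy cluster, replication_status on the Primary would normally show the Secondary with a state of streaming, and is_standby would succeed on the Secondary.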

DRBD

GreenArrow’s non-PostgreSQL data is synchronized using DRBD 8.4.

DRBD’s configuration file is located at /etc/drbd.conf.
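DRBD 8.4 reports its state in /proc/drbd, with one line per resource containing the connection state (cs:), roles (ro:), and disk states (ds:). The following is a minimal sketch of a health check built on that format. The drbd_healthy function is hypothetical, and it treats anything other than a connected resource with both disks up to date as unhealthy, which includes a resync in progress.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every DRBD resource is connected
# and both the local and peer disks are up to date.
drbd_healthy() {
    status_file="${1:-/proc/drbd}"

    awk '
        # Resource lines look like " 0: cs:Connected ro:... ds:.../... C r-----"
        /^ *[0-9]+: cs:/ {
            seen = 1
            if ($0 !~ /cs:Connected/ || $0 !~ /ds:UpToDate\/UpToDate/) {
                print "not healthy:" $0
                bad = 1
            }
        }
        # Fail if any resource was unhealthy or no resource was found at all.
        END { if (bad || !seen) exit 1 }
    ' "$status_file"
}
```

Calling `drbd_healthy` with no arguments reads /proc/drbd on the local server. The same information is available interactively through `cat /proc/drbd`.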

DRBD Troubleshooting

If DRBD fails with a “Can not load the drbd module” error following a package update, then the kmod-drbd83 or kmod-drbd84 package may be incompatible with the running kernel.

There are two known ways to resolve this:

- Downgrade to a kmod-drbd83 or kmod-drbd84 package that’s compatible with the running kernel.
- Reboot so that you’re running a later kernel.
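Before choosing between these options, it can help to confirm whether a drbd module exists for a given kernel. The sketch below is a hypothetical helper that assumes kernel modules live under /lib/modules/KERNEL_VERSION, as they do on RPM-based distributions. It looks for any file named drbd.ko (optionally compressed) in that tree.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a drbd kernel module is installed for the
# given kernel version (defaults to the running kernel).
drbd_module_present() {
    kver="${1:-$(uname -r)}"
    modroot="${2:-/lib/modules}"

    # Matches drbd.ko as well as compressed variants such as drbd.ko.xz.
    [ -n "$(find "$modroot/$kver" -name 'drbd.ko*' 2>/dev/null | head -n 1)" ]
}
```

If `drbd_module_present` fails for the running kernel, `rpm -q kmod-drbd84` shows which package version is installed, and `ls /lib/modules` lists the kernel versions that could be checked as reboot candidates.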

Here’s an example of how to upgrade other packages while skipping kmod-drbd84:

yum update --exclude=kmod-drbd84