Troubleshooting Disk Space Issues

Table of Contents
Overview
Warning Banner
Get Notifications When Free Disk Space Is Low
View Available Space
What to Do If GreenArrow’s Queue Is Causing a Filesystem to Be at Risk of Filling Up
What to Do If PostgreSQL Is Causing a Filesystem to Be at Risk of Filling Up
What to Do If Logs Are Causing a Filesystem to Be at Risk of Filling Up
Remove Bloat from Postgres Tables
What to Do If Redis Is Causing a Filesystem to Be at Risk of Filling Up

Overview

These instructions are meant to help you prevent and detect disk space issues, identify the cause(s), and implement corrective actions. Feel free to contact GreenArrow technical support if you’d like help with a disk space issue.

It is important that you follow these instructions carefully. If you have a low disk situation, please do not start deleting files. Read this documentation and follow all the mitigation actions that match your situation. If you feel unsure of what actions to follow, then it is a good sign that you should contact GreenArrow technical support.

Warning Banner

GreenArrow can display a warning banner in its web interfaces when disk space runs low on a filesystem. The banner has two inputs:

If Studio throttles sending to prevent disk space exhaustion, it displays a notice.
If a configurable ui_disk_space_warning_threshold is reached, both Engine and Studio display a notice.

When thresholds are crossed, warnings are displayed in the following places:

All Studio pages for System Admin users.
All Engine pages except for the Send Statistics Overview Page and its child pages, regardless of which user is logged in.

You may optionally disable both types of warnings by setting ui_disk_space_warning_hide to yes.

GreenArrow has a background process that compares disk usage to the ui_disk_space_warning_threshold configuration and updates the warning banner at the top of each minute. As a result, disk usage stats are usually updated once a minute, but if filesystems are slow to respond, as could be the case with a hard drive failure or a disconnected NFS volume, the update process could be slowed or stopped. If the warning banner has information to display, and that information is more than two minutes old, the banner will indicate how old the information is.

Get Notifications When Free Disk Space Is Low

To be notified when your system is running low on disk space, we recommend that you add one or more email addresses to the error notification addresses list. You can find instructions in the General Settings page’s “Error Notification Address” section.

By default, the error notification will only send emails when a mounted filesystem’s utilization exceeds 80%. If you wish to set a different threshold, follow the instructions in the General Settings page’s “Disk Space Depletion Notifications” section.

View Available Space

Use the df command to view available disk space. This command reports on all mounted filesystems:

df -h

The following example shows the output. Note: the -h option presents values in human-readable format:

Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        18G  5.0G   12G  30% /
tmpfs           1.9G     0  1.9G   0% /dev/shm
/dev/loop6      100M  1.7M   98M   2% /var/hvmail/qmail-bounce/queue
/dev/loop5      396M  2.4M  394M   1% /var/hvmail/qmail-ram/queue

The filesystem mounted on / has 12GB of available space. 30% of the filesystem is in use, which amounts to 5.0GB of its 18GB total size. The bounce (/var/hvmail/qmail-bounce/queue) and RAM (/var/hvmail/qmail-ram/queue) queues are almost empty. NOTE: the RAM and bounce queues are not stored on disk.
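As a rough sketch of how the df output above could be checked automatically, the following helper reads `df -P` style output and prints each mount point whose utilization exceeds a threshold. The function name `fs_over_threshold` is hypothetical and not part of GreenArrow. The default of 80 is only an illustration, and mount points containing spaces are not handled.

```shell
# Hypothetical helper, not a GreenArrow command.
# Reads `df -P` output on stdin and prints "<mount point> <use%>" for every
# filesystem whose utilization is above the given percentage (default: 80).
fs_over_threshold() {
    threshold="${1:-80}"
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "30%" -> "30"
            if (use + 0 > limit + 0) print $6 " " use "%"
        }'
}
```

To check all mounted filesystems, run `df -P | fs_over_threshold 80`. The `-P` option is used instead of `-h` because it guarantees one line per filesystem, which makes the columns safe to parse.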

Use the du command to identify which directories are using the most space. du reports the space taken by all files and subdirectories of the specified directory, including filesystems mounted under the path that is reported.

This command will report GreenArrow’s total disk usage as well as the usage of the three subdirectories that usually account for the majority of GreenArrow’s disk usage:

du -hs /var/hvmail /var/hvmail/log /var/hvmail/postgres /var/hvmail/qmail-disk

This is an example of the output:

1.2G	/var/hvmail
21M	/var/hvmail/log
94M	/var/hvmail/postgres
35M	/var/hvmail/qmail-disk

In this example, the total space used by GreenArrow is 1.2GB, while the three subdirectories are using 21MB, 94MB and 35MB.

Here’s more information on the above three subdirectories:

Messages that are deferred or throttled are stored in /var/hvmail/qmail-disk
The PostgreSQL database stores its data in /var/hvmail/postgres
GreenArrow stores logs in /var/hvmail/log
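If the three directories listed above do not explain the usage, a sketch like the following can rank the immediate subdirectories of any directory by size. The function name `largest_subdirs` is hypothetical. The glob skips hidden directories, and sizes are reported in kilobytes so that they sort numerically.

```shell
# Hypothetical helper, not a GreenArrow command.
# Prints the N largest immediate subdirectories of a directory (size in KB).
largest_subdirs() {
    dir="$1"
    count="${2:-5}"
    # The trailing slash in the glob matches directories only.
    # -s summarizes each directory, -k reports kilobytes.
    du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, `largest_subdirs /var/hvmail 5` would list the five largest subdirectories of /var/hvmail. As with du in general, the totals include any filesystems mounted under those paths.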

What to Do If GreenArrow’s Queue Is Causing a Filesystem to Be at Risk of Filling Up

If you’ve determined that the queue is the main cause for the filesystem to be at risk of filling up, then the following actions may help you get the system back to normal. Make sure that you follow the order written here.

Pause Injections

Try to stop the source application from injecting more messages. This will give GreenArrow Engine time to process the queue and either deliver or bounce the messages.

Dump Messages

Sometimes a few sends take up a lot of space in the queue, for example when they have large attachments or are constantly being deferred. If this is the case, it is possible to dump those messages from the queue. GreenArrow will remove the messages that are already in the queue on their next delivery attempt. It will also dump new messages injected with the same SendID. Follow the instructions in the Pausing and Dumping Queues document to dump messages.

Decrease the Queue Lifetime

By default, GreenArrow will retry delivering deferred and throttled messages for up to two days. Decreasing this value will reduce the amount of disk space that’s used by these messages.

See the queue_lifetime directive to customize how long messages will be retried.

Contact GreenArrow Technical Support

If you followed these instructions and couldn’t solve the problem, or if you feel unsure of what to do, please contact GreenArrow’s technical support.

What to Do If PostgreSQL Is Causing a Filesystem to Be at Risk of Filling Up

If PostgreSQL is using a lot of space, then we recommend that you do the following:

If you use GreenArrow Studio, then verify that your Data Retention Settings are at the desired levels.
Contact GreenArrow’s technical support.

What to Do If Logs Are Causing a Filesystem to Be at Risk of Filling Up

If log files represent a high percentage of the disk usage, you can:

Reduce the Space Available for Delivery Attempt Log Files

Use the following command to determine the space that is being used by the delivery attempt log files:

du -hc /var/hvmail/log/*-qmail-send/ | grep total

The above figure can be reduced by using the hvmail_set command, as described in the General Settings page’s “Amount of Logs to Store” section, to set the amount of space that GreenArrow will use for keeping logs of delivery attempts.

Managing /var/hvmail/log/send-summary Files

Files in the /var/hvmail/log/send-summary directory store the statistics for the Engine Send Statistics pages and some data for Studio Campaign Statistics. Busy systems, especially systems that have been in use for a while, can accumulate many of these files, and their total space used can grow quite large.

The following command shows the space that is being used by the /var/hvmail/log/send-summary files:

du -hs /var/hvmail/log/send-summary

If the amount of space used by these files is concerning to you, you have two options.

Option 1 - Move Older send-summary Files to a Different Filesystem

If your server has a separate filesystem with enough free space, you can move older copies of these files to another filesystem:

Verify that you’re running a version of GreenArrow that supports this by checking for the existence of the /var/hvmail/bin/hvmail_move_old_send_summary_files file. If the file is not present, then please contact GreenArrow’s technical support to request an update to your installation.
Determine how much free space will be needed in the directory that you’d like to move these files to:
```
du -hs /var/hvmail/log/send-summary
```
Create the directory that you would like old /var/hvmail/log/send-summary files to be moved to. For example:
```
mkdir /media/scratch/var-hvmail-log-send-summary
```
Run the hvmail_move_old_send_summary_files command, specifying the minimum age of files (in days) to be moved as the first argument, and the destination directory as the second argument. For example, to move all /var/hvmail/log/send-summary files that were last modified 7 or more days ago to /media/scratch/var-hvmail-log-send-summary, run:
```
/var/hvmail/bin/hvmail_move_old_send_summary_files 7 /media/scratch/var-hvmail-log-send-summary
```
If you’re using Unmanaged Backups, verify that you’re using the latest version of the Unmanaged Backups script, which will automatically back up files at the new location. Managed Backups will also automatically back up files at the new location.

Option 2 - Delete Older send-summary Files

Files matching the glob /var/hvmail/log/send-summary/*/*.db that have not been updated in 30 days may be deleted. Since these files are the source of data for the Engine Send Statistics pages and some Studio Campaign Statistics pages, deleting them will cause the Engine statistics for older “sends” to become unavailable, and Studio statistics for older “sends” to be incomplete.

Deleting any send-summary files that have been updated more recently than 30 days can cause operational issues with your system.

The following command will find and delete send-summary files that are older than 365 days:

find /var/hvmail/log/send-summary -name '*.db' -type f -mtime +365 -delete

The above command can be modified to your needs by changing the mtime parameter, and the command can either be run on an ad-hoc basis or run routinely from cron.
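Before deleting anything, it can help to preview which files a given age would match. The following is a minimal sketch of such a dry run. The function name `list_old_send_summary` is hypothetical, and the refusal to accept ages under 30 days is an illustrative guard based on the warning above, not a GreenArrow feature.

```shell
# Hypothetical helper, not a GreenArrow command.
# Lists (does not delete) .db files under a directory that were last
# modified more than the given number of days ago.
list_old_send_summary() {
    dir="$1"
    days="$2"
    if [ "$days" -lt 30 ]; then
        echo "refusing: minimum age must be at least 30 days" >&2
        return 1
    fi
    # The quoted pattern stops the shell from expanding *.db itself.
    find "$dir" -name '*.db' -type f -mtime +"$days" -print
}
```

Running `list_old_send_summary /var/hvmail/log/send-summary 365` prints the files that the delete command would remove with the same age, so the list can be reviewed first.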

Contact GreenArrow Technical Support

If you followed these instructions and couldn’t solve the problem, or if you feel unsure of what to do, please contact GreenArrow’s technical support.

Remove Bloat from Postgres Tables

This procedure is only available on Postgres 9.5 or later. To check what version of Postgres you’re running, run cat /var/hvmail/postgres/default/data/PG_VERSION.

This procedure is considered experimental and could potentially result in data loss. In a future release, GreenArrow will incorporate this procedure into a greenarrow command, at which time this will no longer be considered experimental.

Postgres tables can have bloat when they contain a significant amount of data which is then removed. The disk space formerly used by that data is, under normal circumstances, not automatically released back to the operating system. Instead, the space is reserved for future data that is added to that table.

If the following command reveals significantly bloated Postgres tables, there may be disk space you could free.

greenarrow disk_usage --postgres-bloat

Here’s an example of some tables that show varying amounts of bloat:

[Image: postgres-bloat]

In the image above, the size printed is the disk space that the table is currently using. The percentage to the right is how much of that size is bloated space.

Check available disk space: Before running the following procedure, ensure that you have enough disk space to double the size of the table (as reported by the greenarrow disk_usage command) you’re going to de-bloat (meaning, this process temporarily uses double the table’s disk space).
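The free-space check described above can be sketched as a small helper that compares a required size against what df reports for a path. The function name `has_free_kb` is hypothetical. The table size has to be taken by hand from the greenarrow disk_usage output and converted to kilobytes, and filesystem device names containing spaces are not handled.

```shell
# Hypothetical helper, not a GreenArrow command.
# Succeeds when the filesystem holding the given path has at least the
# given number of kilobytes available.
has_free_kb() {
    needed_kb="$1"
    path="$2"
    # df -Pk prints one line per filesystem; column 4 is the available KB.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$needed_kb" ]
}
```

For a table reported as 2GB, for instance, `has_free_kb 2097152 /var/hvmail/postgres && echo "enough space"` would confirm that an additional 2GB is free on the filesystem that holds the database.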

Take a backup before proceeding: Before running the following procedure, ensure you have taken a recent backup. You can take a backup of only the table you’ll be working on by running:

/var/hvmail/postgres/default/bin/pg_dump -U greenarrow -t TABLE --data-only greenarrow | gzip > pg_backup.TABLE.`date +"%Y%m%d%H%M%S"`.sql.gz

To free the bloat, run the following:

/var/hvmail/postgres/default/bin/psql -U postgres greenarrow -c ' CREATE EXTENSION pg_repack '
/var/hvmail/postgres/default/bin/pg_repack -U postgres -t TABLE_NAME_TO_DEBLOAT greenarrow
/var/hvmail/postgres/default/bin/psql -U postgres greenarrow -c ' DROP EXTENSION pg_repack '

The above process can take anywhere from a few seconds to a few hours to complete, depending on the size of the bloated table. For this reason, we recommend running the above in a terminal session running screen or tmux. GreenArrow will continue operating as normal during the above process.

Restore a single table backup: To restore a backup taken of a single table using pg_dump like demonstrated above (replacing TABLE and BACKUP_FILENAME appropriately):

(
    echo "BEGIN;"
    echo "SET CONSTRAINTS ALL DEFERRED;"
    echo "TRUNCATE TABLE;"
    zcat BACKUP_FILENAME
    echo "commit;"
) | /var/hvmail/postgres/default/bin/psql -U greenarrow greenarrow

What to Do If Redis Is Causing a Filesystem to Be at Risk of Filling Up

In GreenArrow version 4.251.0, we fixed an issue where subscriber imports could leave temporary data behind permanently, possibly leading to bloat in GreenArrow’s internal Redis server. The fix in that version cleans up the old temporary data, but you may wish to take the additional steps below to reclaim disk space if the issue created significant bloat.

This process consumes some CPU and disk I/O resources, which could add some system load, but overall the impact of running this process is minimal and there is no downtime caused by running these steps.

Check the starting data size:
```
du -sh /var/hvmail/data/redis
```
Start the background rewrite process:
```
/var/hvmail/redis/bin/redis-cli -s /var/hvmail/var/redis.sock bgrewriteaof
```
This process runs in the background, so you will not see any status or progress updates after running that command. It will just report back to you that the background append only file rewriting started.
Run this command to check on the progress of the background process - this command only reads the status, so it is safe to re-run it as often as you need until the background rewrite is complete:
```
/var/hvmail/redis/bin/redis-cli -s /var/hvmail/var/redis.sock info | grep aof_rewrite
```
While the process is still running, this is the output you will see:
```
aof_rewrite_in_progress:1
aof_rewrite_scheduled:0
aof_rewrite_buffer_length:0
```
Once the process is complete, this is the output you will see:
```
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_rewrite_buffer_length:0
```
Re-check the data size:
```
du -sh /var/hvmail/data/redis
```
If you had bloat in Redis, the disk usage should now be lower than when you ran this command in the first step.
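Instead of re-running the status command by hand, the check can be wrapped in a polling loop. This is a minimal sketch: the function names and the 10-second interval are assumptions, and the redis-cli path and socket are the ones used in the steps above. If redis-cli cannot connect, the loop keeps waiting, so interrupt it with Ctrl-C in that case.

```shell
# Hypothetical helpers, not GreenArrow commands.
# Succeeds when `redis-cli info` output on stdin shows no rewrite running.
aof_rewrite_finished() {
    # No end-of-line anchor: redis-cli ends each INFO line with CRLF.
    grep -q '^aof_rewrite_in_progress:0'
}

# Polls the status every 10 seconds until the background rewrite completes.
wait_for_aof_rewrite() {
    until /var/hvmail/redis/bin/redis-cli -s /var/hvmail/var/redis.sock info \
            | aof_rewrite_finished; do
        sleep 10
    done
    echo "AOF rewrite finished"
}
```

After starting the rewrite with bgrewriteaof, calling `wait_for_aof_rewrite` returns once the status shows the rewrite is no longer in progress, at which point the data size can be re-checked.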
