If you work with PostgreSQL in production, backups are not optional.
In this lab, I built a complete PostgreSQL 18 backup and recovery environment on Ubuntu 24.04 and tested all three major backup approaches:
- logical backups with pg_dump
- physical backups with pg_basebackup
- production-grade backups with pgBackRest, including WAL archiving and Point-In-Time Recovery (PITR)
Everything was tested on a real VM with actual recovery drills — not simulations.
The Environment
| Component | Value |
|---|---|
| OS | Ubuntu 24.04 |
| PostgreSQL | 18 (PGDG repository) |
| Backup Tools | pg_dump, pg_basebackup, pgBackRest |
| Lab Type | Single VM |
| Repository | Local filesystem |
Installing PostgreSQL 18
I used the official PGDG repository rather than Ubuntu's default package, which ships an older version.
Install the repository helper:
sudo apt install -y postgresql-common
Run the PGDG setup script:
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
Install PostgreSQL 18:
sudo apt install -y postgresql-18
Verify the cluster is running:
pg_lsclusters
Expected output:
Ver Cluster Port Status Owner Data directory
18  main    5432 online postgres /var/lib/postgresql/18/main
Creating the Lab Database
Create the application user:
sudo -u postgres psql -c "CREATE USER banking WITH PASSWORD 'banking';"
Create the database:
sudo -u postgres psql -c "CREATE DATABASE bankingdb OWNER banking;"
Connect as the banking user and create the schema:
psql -h localhost -U banking -d bankingdb -c "CREATE SCHEMA banking AUTHORIZATION banking;"
Creating the Tables
I created five tables representing a small banking schema: branches, employees, customers, accounts, and transactions. The foreign-key chain between them exercises the constraints that matter in real restore scenarios.
-- Connected as banking@bankingdb
CREATE TABLE banking.branches (
branch_id INTEGER PRIMARY KEY,
branch_name VARCHAR(100) NOT NULL,
city VARCHAR(50),
country VARCHAR(50),
opened_date DATE
);
CREATE TABLE banking.employees (
employee_id INTEGER PRIMARY KEY,
branch_id INTEGER NOT NULL REFERENCES banking.branches(branch_id),
full_name VARCHAR(100) NOT NULL,
role VARCHAR(50),
hire_date DATE
);
CREATE TABLE banking.customers (
customer_id INTEGER PRIMARY KEY,
full_name VARCHAR(100) NOT NULL,
email VARCHAR(100),
country VARCHAR(50),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE banking.accounts (
account_id INTEGER PRIMARY KEY,
customer_id INTEGER NOT NULL REFERENCES banking.customers(customer_id),
branch_id INTEGER NOT NULL REFERENCES banking.branches(branch_id),
account_type VARCHAR(20),
balance NUMERIC(15,2) DEFAULT 0,
opened_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE banking.transactions (
txn_id INTEGER PRIMARY KEY,
account_id INTEGER NOT NULL REFERENCES banking.accounts(account_id),
employee_id INTEGER REFERENCES banking.employees(employee_id),
txn_type VARCHAR(20),
amount NUMERIC(15,2) NOT NULL,
txn_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
description VARCHAR(200)
);
I inserted 18 rows across the five tables — three branches, four employees, three customers, four accounts, four transactions. This is the baseline state every backup in the lab must capture and every restore must reproduce.
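The actual values aren't reproduced here, but the inserts follow the obvious pattern. A hypothetical flavour of them (sample values, not the lab's real data):

```sql
-- Hypothetical sample rows; the lab's actual values differ.
INSERT INTO banking.branches (branch_id, branch_name, city, country, opened_date)
VALUES (1, 'Central', 'Lisbon', 'Portugal', '2010-03-15');

INSERT INTO banking.customers (customer_id, full_name, email, country)
VALUES (1, 'Ana Silva', 'ana@example.com', 'Portugal');

-- Child rows must reference existing parents, in FK order:
-- branches before employees/accounts, accounts before transactions.
INSERT INTO banking.accounts (account_id, customer_id, branch_id, account_type, balance)
VALUES (1, 1, 1, 'checking', 2500.00);
```

The foreign-key chain dictates the insert order, which is exactly what makes partial restores interesting later on.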
Creating Backup Directories
# pg_dump output — runs as the fernando OS user
mkdir -p /home/fernando/backups/pg-dump
# pg_basebackup output — runs as postgres
sudo mkdir -p /var/lib/postgresql/backups/pg-basebackup
sudo chown -R postgres:postgres /var/lib/postgresql/backups
sudo chmod -R 750 /var/lib/postgresql/backups
The pgBackRest repository directory is created separately in Part 3.
Part 1 — pg_dump
pg_dump creates a logical export of a database — SQL statements or a compressed binary format that can be replayed into any compatible PostgreSQL instance.
Best for:
- cross-version migrations
- single-database or single-table restores
- schema-only exports for documentation or DDL versioning
Not suitable for PITR or large databases where restore time must be measured in minutes rather than hours.
Full Database Backup
I used custom format (-F c) — compressed binary, restorable only with pg_restore, supports parallel restore and table-selective options.
PGPASSWORD=banking pg_dump \
-h localhost \
-U banking \
-F c \
-d bankingdb \
-f /home/fernando/backups/pg-dump/bankingdb.dump
Inspect the dump contents without restoring:
pg_restore -l /home/fernando/backups/pg-dump/bankingdb.dump | head -30
The table of contents showed schemas, tables, sequences, constraints, and table data for all five tables.
Schema-Only Backup
PGPASSWORD=banking pg_dump \
-h localhost \
-U banking \
-F c \
--schema-only \
-d bankingdb \
-f /home/fernando/backups/pg-dump/bankingdb-schema.dump
Useful for migrations, DDL versioning, and recreating empty target environments before a data load.
Single-Table Backup
PGPASSWORD=banking pg_dump \
-h localhost \
-U banking \
-F c \
-t banking.transactions \
-d bankingdb \
-f /home/fernando/backups/pg-dump/transactions.dump
The dump contains only the transactions table and its constraints — not the parent tables. Restoring it requires the parent tables to already exist in the target.
Full Restore Test
I created a new empty database and restored the full dump into it:
sudo -u postgres psql -c "CREATE DATABASE bankingdb_restore OWNER banking;"
PGPASSWORD=banking pg_restore \
-h localhost \
-U banking \
-d bankingdb_restore \
/home/fernando/backups/pg-dump/bankingdb.dump
Verify row counts match the source across all five tables:
PGPASSWORD=banking psql -h localhost -U banking -d bankingdb_restore -c "
SELECT 'branches' AS table_name, COUNT(*) FROM banking.branches
UNION ALL SELECT 'employees', COUNT(*) FROM banking.employees
UNION ALL SELECT 'customers', COUNT(*) FROM banking.customers
UNION ALL SELECT 'accounts', COUNT(*) FROM banking.accounts
UNION ALL SELECT 'transactions', COUNT(*) FROM banking.transactions;"
Then dropped the test database:
sudo -u postgres psql -c "DROP DATABASE bankingdb_restore;"
Disaster Recovery Drill — Restore One Table
Simulated an accidental table drop:
PGPASSWORD=banking psql -h localhost -U banking -d bankingdb -c "
DROP TABLE banking.transactions CASCADE;"
Restored only that table from the dedicated transactions dump:
PGPASSWORD=banking pg_restore \
-h localhost \
-U banking \
-d bankingdb \
-t transactions \
/home/fernando/backups/pg-dump/transactions.dump
Verified the four rows were back. This is where pg_dump genuinely shines — granular, object-level recovery without touching the rest of the database.
Part 2 — pg_basebackup
pg_basebackup makes a binary filesystem-level copy of the entire PostgreSQL cluster. It does not understand databases or tables — it copies files.
Best for:
- bootstrapping streaming replicas
- full-cluster disaster recovery snapshots
- cluster migration to a new server
Not suitable for table-level or database-level recovery — restore is all-or-nothing at the cluster level.
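For the replica-bootstrap case, pg_basebackup is typically run with -R, which writes standby.signal and the primary connection settings into the new data directory. A sketch, assuming a hypothetical primary host and replication user:

```shell
# Sketch: bootstrap a standby from a primary.
# primary.example.com and replicator are placeholders, not from this lab.
pg_basebackup \
  -h primary.example.com \
  -U replicator \
  -D /var/lib/postgresql/18/standby \
  -R \
  -P \
  -c fast
```

With -R, the new cluster starts directly in standby mode and begins streaming from the primary.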
Checking WAL Configuration
sudo -u postgres psql -c "SHOW wal_level; SHOW max_wal_senders;"
Expected output:
wal_level
-----------
replica
max_wal_senders
-----------------
 10
Both are set correctly by default on PostgreSQL 18.
Creating the Physical Backup
I used tar format (-F t) — produces compressed base.tar.gz and pg_wal.tar.gz archives rather than an unpacked directory tree.
sudo -u postgres pg_basebackup \
-D /var/lib/postgresql/backups/pg-basebackup/cluster \
-F t \
-z \
-P \
-c fast
-c fast issues a checkpoint at backup start instead of waiting for the next scheduled one — adds a brief I/O spike but is the right choice for an interactive lab.
Verifying the Backup
ls -lh /var/lib/postgresql/backups/pg-basebackup/cluster/
Expected files:
| File | Purpose |
|---|---|
| base.tar.gz | cluster data directory archive |
| pg_wal.tar.gz | WAL captured during backup |
| backup_manifest | file list with CRC32C checksums (PostgreSQL 13+) |
Verify integrity before relying on the backup:
sudo -u postgres /usr/lib/postgresql/18/bin/pg_verifybackup --no-parse-wal \
/var/lib/postgresql/backups/pg-basebackup/cluster
A backup you have never verified is a backup you cannot trust.
Full Physical Restore Drill
Stop PostgreSQL:
sudo systemctl stop postgresql
Move the existing data directory aside:
sudo mv /var/lib/postgresql/18/main /var/lib/postgresql/18/main.before-restore
sudo mkdir -p /var/lib/postgresql/18/main
sudo chown postgres:postgres /var/lib/postgresql/18/main
sudo chmod 700 /var/lib/postgresql/18/main
Extract the base archive:
sudo -u postgres tar -xzf \
/var/lib/postgresql/backups/pg-basebackup/cluster/base.tar.gz \
-C /var/lib/postgresql/18/main
Extract the WAL archive:
sudo -u postgres mkdir -p /var/lib/postgresql/18/main/pg_wal
sudo -u postgres tar -xzf \
/var/lib/postgresql/backups/pg-basebackup/cluster/pg_wal.tar.gz \
-C /var/lib/postgresql/18/main/pg_wal
Start PostgreSQL:
sudo systemctl start postgresql
sudo systemctl status postgresql@18-main
Verify row counts match the baseline across all five tables:
PGPASSWORD=banking psql -h localhost -U banking -d bankingdb -c "
SELECT 'branches' AS table_name, COUNT(*) FROM banking.branches
UNION ALL SELECT 'employees', COUNT(*) FROM banking.employees
UNION ALL SELECT 'customers', COUNT(*) FROM banking.customers
UNION ALL SELECT 'accounts', COUNT(*) FROM banking.accounts
UNION ALL SELECT 'transactions', COUNT(*) FROM banking.transactions;"
Remove the moved-aside directory:
sudo rm -rf /var/lib/postgresql/18/main.before-restore
Part 3 — pgBackRest
pgBackRest is the production-grade backup solution for PostgreSQL. It combines physical backup with built-in WAL archiving, incremental and differential backup types, retention management, parallel I/O, and point-in-time recovery — all in one tool.
Installing pgBackRest
sudo apt install -y pgbackrest
Creating the Repository Directory
pgBackRest does not create the repository directory automatically — it must exist and be owned by postgres before the stanza is created.
sudo mkdir -p /var/lib/pgbackrest
sudo chown postgres:postgres /var/lib/pgbackrest
sudo chmod 750 /var/lib/pgbackrest
Configuring pgBackRest
sudo mkdir -p /etc/pgbackrest
sudo tee /etc/pgbackrest/pgbackrest.conf > /dev/null << 'EOF'
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
log-level-console=info
log-level-file=detail
[main]
pg1-path=/var/lib/postgresql/18/main
EOF
The stanza name main matches the Ubuntu PostgreSQL cluster name — the default on any Ubuntu PostgreSQL install.
Configuring WAL Archiving
Three parameters are needed in /etc/postgresql/18/main/postgresql.conf:
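These are the standard settings for pushing WAL to pgBackRest (the archive_command shown is pgBackRest's documented archive-push invocation for this stanza):

```ini
wal_level = replica
archive_mode = on
archive_command = 'pgbackrest --stanza=main archive-push %p'
```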
archive_mode requires a full PostgreSQL restart; it cannot be changed with a reload.
sudo systemctl restart postgresql
Verify the parameters took effect:
sudo -u postgres psql -c "SHOW archive_mode; SHOW archive_command; SHOW wal_level;"
Creating the Stanza and Running Check
A stanza is pgBackRest's named configuration for one PostgreSQL cluster. stanza-create initializes the repository directory structure — run it exactly once.
sudo -u postgres pgbackrest --stanza=main stanza-create
Then run check — pgBackRest forces a WAL switch, archives the segment, and reads it back:
sudo -u postgres pgbackrest --stanza=main check
This is the single most important pre-backup verification step. If check fails, backups will not be recoverable.
Full Backup
sudo -u postgres pgbackrest --stanza=main --type=full backup
Differential Backup
After inserting a new transaction row, I took a differential backup:
sudo -u postgres pgbackrest --stanza=main --type=diff backup
The backup size was a small fraction of the full: only files changed since the full backup were stored.
Incremental Backup
After inserting another row:
sudo -u postgres pgbackrest --stanza=main --type=incr backup
The backup chain became: Full → Differential → Incremental.
Inspect the full catalog:
sudo -u postgres pgbackrest info
Point-In-Time Recovery (PITR)
This was the most important part of the lab.
Capture the current timestamp before the disaster:
RECOVERY_TARGET=$(sudo -u postgres psql -d bankingdb -Atc "SELECT now()::text;")
echo "Recovery target: $RECOVERY_TARGET"
sleep 2
RECOVERY_TARGET is a bash shell variable. It stores the exact PostgreSQL timestamp captured before the disaster.
All the PITR steps run in the same terminal session, so the variable stays alive between commands.
It is passed directly to pgBackRest's --target parameter, which tells it to replay WAL up to that exact moment and stop.
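The mechanism is plain shell command substitution: the command's stdout becomes the variable's value. A self-contained sketch, with `date` standing in for the psql call (which needs a live cluster):

```shell
# Command substitution: $(...) captures the command's stdout.
# `date -u` stands in for `psql -Atc "SELECT now()::text;"` here.
RECOVERY_TARGET=$(date -u '+%Y-%m-%d %H:%M:%S+00')
echo "Recovery target: $RECOVERY_TARGET"
```

Because the variable only exists in the current shell, every later command that references $RECOVERY_TARGET must run in the same session.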
Simulate the disaster — drop the transactions table:
PGPASSWORD=banking psql -h localhost -U banking -d bankingdb -c "
DROP TABLE banking.transactions CASCADE;"
Stop PostgreSQL:
sudo systemctl stop postgresql
Restore to the timestamp before the drop:
sudo -u postgres pgbackrest --stanza=main \
--type=time \
"--target=$RECOVERY_TARGET" \
--target-action=promote \
--delta \
restore
--delta compares the existing data directory against the backup and replaces only changed files — much faster than a full re-extraction when most files are unchanged.
Start PostgreSQL — it enters recovery, replays WAL up to the target timestamp, then promotes to read/write:
sudo systemctl start postgresql
sudo tail -20 /var/log/postgresql/postgresql-18-main.log
Verify the table is back:
PGPASSWORD=banking psql -h localhost -U banking -d bankingdb -c "
SELECT COUNT(*) FROM banking.transactions;"
Expected: 6 rows — 4 baseline plus the 2 inserted during the differential and incremental steps. PITR worked.
Summary
| Property | pg_dump | pg_basebackup | pgBackRest |
|---|---|---|---|
| Backup level | Logical (SQL) | Physical (filesystem) | Physical (filesystem) |
| Granularity | Database / schema / table | Whole cluster | Whole cluster |
| Cross-version restore | Yes | No | No |
| Incremental backups | No | Limited (PG 17+, --incremental with pg_combinebackup) | Yes |
| PITR | No | Only with separate WAL archiving | Yes — built in |
| Suitable for daily production | No | No | Yes |
| Suitable for cross-version migration | Yes | No | No |
| Suitable for table-level recovery | Yes | No | No |
Practical decision rules:
- Use pg_dump for cross-version migrations, table-selective backups, and ad-hoc snapshots.
- Use pg_basebackup primarily to bootstrap streaming replicas.
- Use pgBackRest for everything in production — daily incremental, PITR, retention, WAL archiving, all in one tool.
In real environments you combine them: pgBackRest for continuous protection, pg_dump for migrations and object-level recovery.
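A typical pgBackRest schedule, sketched as /etc/cron.d entries (the times and cadence are illustrative, not from this lab):

```
# Illustrative schedule: weekly full on Sunday, daily differential otherwise.
# repo1-retention-full=2 in pgbackrest.conf then keeps two weeks of recoverability.
0 1 * * 0    postgres    pgbackrest --stanza=main --type=full backup
0 1 * * 1-6  postgres    pgbackrest --stanza=main --type=diff backup
```

WAL archiving runs continuously via archive_command, so PITR coverage is not tied to the backup schedule itself.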
Final Thoughts
A backup is only useful if recovery actually works.
The most important discipline in this lab was not running the backup commands — it was running the restore drills and verifying the row counts afterward.
Testing restores is not optional.
If you are planning an Oracle to PostgreSQL migration and need help designing the backup and recovery strategy for the target environment, see my services →
