{"id":6863,"date":"2026-05-19T21:58:37","date_gmt":"2026-05-19T19:58:37","guid":{"rendered":"https:\/\/rootfan.com\/?p=6863"},"modified":"2026-05-19T23:14:28","modified_gmt":"2026-05-19T21:14:28","slug":"configuracion-de-repmgr-en-postgresql","status":"publish","type":"post","link":"https:\/\/rootfan.com\/es\/postgresql-repmgr-setup\/","title":{"rendered":"PostgreSQL repmgr Setup: Replication with Automatic Failover"},"content":{"rendered":"<p>PostgreSQL does not include automatic failover out of the box.<br>When the primary goes down, someone has to promote the standby manually, which means downtime.<br>repmgr adds an automatic failover daemon (repmgrd) that monitors the cluster and promotes the standby within seconds when the primary fails.<br>This guide walks through setting up a two-node PostgreSQL 18 cluster with streaming replication and automatic failover on Ubuntu 24.04, using repmgr 5.x.<br>Every step was run live on a real cluster, and the output was verified.<\/p>\n\n\n\n<!--more-->\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>A streaming standby that requires manual promotion is not a high-availability setup \u2014 it is a disaster recovery setup.<\/p>\n\n\n\n<p>The difference matters during an incident at 3 AM.<\/p>\n\n\n\n<p>PostgreSQL ships with the building blocks for replication, but the automatic failover logic lives in a separate tool.<\/p>\n\n\n\n<p>repmgr is the tool most PostgreSQL DBAs reach for first: it is lightweight, well-documented, and integrates cleanly with systemd.<\/p>\n\n\n\n<p>This guide builds the full stack: streaming replication from scratch, repmgrd running as a daemon, and a tested automatic failover that recovers the cluster without human intervention.<\/p>\n\n\n\n<hr 
class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<div class=\"wp-block-rank-math-toc-block\" id=\"rank-math-toc\"><h2>Table of Contents<\/h2><nav><ul><li><a href=\"#the-environment\">The Environment<\/a><\/li><li><a href=\"#step-1-install-postgre-sql-18-and-repmgr\">Step 1 \u2014 Install PostgreSQL 18 and repmgr<\/a><\/li><li><a href=\"#step-2-configure-postgre-sql-for-streaming-replication\">Step 2 \u2014 Configure PostgreSQL for Streaming Replication<\/a><\/li><li><a href=\"#step-3-allow-replication-connections-in-pg-hba-conf\">Step 3 \u2014 Allow Replication Connections in pg_hba.conf<\/a><\/li><li><a href=\"#step-4-create-the-repmgr-user-and-database\">Step 4 \u2014 Create the repmgr User and Database<\/a><\/li><li><a href=\"#step-5-configure-repmgr-on-both-nodes\">Step 5 \u2014 Configure repmgr on Both Nodes<\/a><\/li><li><a href=\"#step-6-set-up-ssh-keys-and-passwordless-sudo\">Step 6 \u2014 Set Up SSH Keys and Passwordless sudo<\/a><\/li><li><a href=\"#step-7-register-the-primary-and-clone-the-standby\">Step 7 \u2014 Register the Primary and Clone the Standby<\/a><\/li><li><a href=\"#step-8-start-repmgrd-for-automatic-failover\">Step 8 \u2014 Start repmgrd for Automatic Failover<\/a><\/li><li><a href=\"#step-9-test-automatic-failover\">Step 9 \u2014 Test Automatic Failover<\/a><\/li><li><a href=\"#step-10-rejoin-the-failed-node-as-a-standby\">Step 10 \u2014 Rejoin the Failed Node as a Standby<\/a><\/li><li><a href=\"#frequently-asked-questions\">Frequently Asked Questions<\/a><ul><li><a href=\"#faq-question-1779206653802\">Does repmgr work with PostgreSQL 18?<\/a><\/li><li><a href=\"#faq-question-1779206654802\">Why does my planned switchover fail with \u201cprimary shutdown could not be confirmed\u201d?<\/a><\/li><li><a href=\"#faq-question-1779206655802\">Why does repmgrd not trigger failover even though the primary is down?<\/a><\/li><li><a href=\"#faq-question-1779206656802\">Does the demoted node restart automatically after a 
switchover?<\/a><\/li><li><a href=\"#faq-question-1779206657802\">What is pg_rewind and why is it needed for node rejoin?<\/a><\/li><\/ul><\/li><li><a href=\"#in-summary\">In Summary<\/a><\/li><\/ul><\/nav><\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"the-environment\">The Environment<\/h2>\n\n\n\n<p>This guide uses two Ubuntu 24.04 servers on the same subnet.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th>Host<\/th><th>IP<\/th><th>Initial role<\/th><\/tr><\/thead><tbody><tr><td>server1<\/td><td>192.168.0.181<\/td><td>Primary<\/td><\/tr><tr><td>server2<\/td><td>192.168.0.182<\/td><td>Standby<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>PostgreSQL 18 and repmgr 5.5.0 are installed from the PGDG repository on both servers.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-1-install-postgre-sql-18-and-repmgr\">Step 1 \u2014 Install PostgreSQL 18 and repmgr<\/h2>\n\n\n\n<p>PostgreSQL 18 is not in the Ubuntu 24.04 default repository.<br>The <code>postgresql-common<\/code> package includes an official script that adds the PGDG APT repository and signing key automatically.<\/p>\n\n\n\n<p>On both servers:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo apt install -y postgresql-common\nsudo \/usr\/share\/postgresql-common\/pgdg\/apt.postgresql.org.sh\n\nsudo apt update\nsudo apt install -y postgresql-18 postgresql-18-repmgr\n<\/pre><\/div>\n\n\n<p>Verify the installation:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\npsql --version\n# Expected: psql (PostgreSQL) 18.x\n\nrepmgr --version\n# Expected: repmgr 5.x\n<\/pre><\/div>\n\n\n<p>On server2, stop PostgreSQL and remove the default data 
directory.<br>repmgr will clone the primary\u2019s data directory to server2 in a later step \u2014 if a data directory already exists, <code>repmgr standby clone<\/code> will refuse to proceed.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo systemctl stop postgresql\nsudo rm -rf \/var\/lib\/postgresql\/18\/main\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-2-configure-postgre-sql-for-streaming-replication\">Step 2 \u2014 Configure PostgreSQL for Streaming Replication<\/h2>\n\n\n\n<p>These parameters must be set in <code>\/etc\/postgresql\/18\/main\/postgresql.conf<\/code> on both servers before setting up replication.<br>repmgr requires <code>wal_log_hints<\/code> and <code>shared_preload_libraries<\/code> \u2014 without them, the node rejoin process (pg_rewind) and the repmgrd daemon will not work.<\/p>\n\n\n\n<p>On both servers, edit <code>\/etc\/postgresql\/18\/main\/postgresql.conf<\/code>:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nlisten_addresses = &#039;*&#039;\nwal_level = replica\nmax_wal_senders = 10\nmax_replication_slots = 10\nhot_standby = on\nwal_log_hints = on\nshared_preload_libraries = &#039;repmgr&#039;\n<\/pre><\/div>\n\n\n<p>These parameters require a PostgreSQL restart to take effect.<br>Do not restart yet \u2014 add the <code>pg_hba.conf<\/code> entries first so that you do not restart twice.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-3-allow-replication-connections-in-pg-hba-conf\">Step 3 \u2014 Allow Replication Connections in pg_hba.conf<\/h2>\n\n\n\n<p>On both servers, append these lines to <code>\/etc\/postgresql\/18\/main\/pg_hba.conf<\/code>.<br>These allow the <code>repmgr<\/code> user to connect for both management queries and WAL 
streaming from either node.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo tee -a \/etc\/postgresql\/18\/main\/pg_hba.conf &gt; \/dev\/null &lt;&lt; &#039;EOF&#039;\n\nhost    repmgr          repmgr          192.168.0.181\/32        scram-sha-256\nhost    repmgr          repmgr          192.168.0.182\/32        scram-sha-256\nhost    replication     repmgr          192.168.0.181\/32        scram-sha-256\nhost    replication     repmgr          192.168.0.182\/32        scram-sha-256\nEOF\n<\/pre><\/div>\n\n\n<p>Now restart PostgreSQL on server1 (server2 has no data directory yet):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server1\nsudo systemctl restart postgresql\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-4-create-the-repmgr-user-and-database\">Step 4 \u2014 Create the repmgr User and Database<\/h2>\n\n\n\n<p>Run these commands on server1 (the primary) only.<br>The standby will receive the repmgr schema through replication in a later step.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres psql -c &quot;CREATE USER repmgr WITH SUPERUSER REPLICATION LOGIN PASSWORD &#039;repmgr&#039;;&quot;\nsudo -u postgres psql -c &quot;CREATE DATABASE repmgr OWNER repmgr;&quot;\n<\/pre><\/div>\n\n\n<p>Add a <code>.pgpass<\/code> file for the postgres OS user on both servers so that repmgr can connect without a password prompt.<\/p>\n\n\n\n<p>On server1:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres bash -c &#039;cat &gt; \/var\/lib\/postgresql\/.pgpass 
&lt;&lt;EOF\n192.168.0.181:5432:repmgr:repmgr:repmgr\n192.168.0.182:5432:repmgr:repmgr:repmgr\n192.168.0.181:5432:replication:repmgr:repmgr\n192.168.0.182:5432:replication:repmgr:repmgr\nEOF&#039;\nsudo chmod 600 \/var\/lib\/postgresql\/.pgpass\n<\/pre><\/div>\n\n\n<p>On server2:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres bash -c &#039;cat &gt; \/var\/lib\/postgresql\/.pgpass &lt;&lt;EOF\n192.168.0.181:5432:repmgr:repmgr:repmgr\n192.168.0.182:5432:repmgr:repmgr:repmgr\n192.168.0.181:5432:replication:repmgr:repmgr\n192.168.0.182:5432:replication:repmgr:repmgr\nEOF&#039;\nsudo chmod 600 \/var\/lib\/postgresql\/.pgpass\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-5-configure-repmgr-on-both-nodes\">Step 5 \u2014 Configure repmgr on Both Nodes<\/h2>\n\n\n\n<p>Create <code>\/etc\/repmgr.conf<\/code> on each server.<br>The <code>pg_bindir<\/code> parameter is required on Ubuntu \u2014 the <code>pg_rewind<\/code> binary is not in the default PATH for the postgres OS user, and repmgr needs it during node rejoin after failover.<br>The <code>service_start_command<\/code> and <code>service_stop_command<\/code> parameters are required: without <code>service_stop_command<\/code>, planned switchovers fail with \u201cprimary shutdown could not be confirmed\u201d; without <code>service_start_command<\/code>, a node cannot rejoin the cluster automatically after being rewound.<\/p>\n\n\n\n<p>On server1:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo tee \/etc\/repmgr.conf &gt; \/dev\/null &lt;&lt; &#039;EOF&#039;\nnode_id=1\nnode_name=&#039;server1&#039;\nconninfo=&#039;host=192.168.0.181 user=repmgr dbname=repmgr 
connect_timeout=2&#039;\ndata_directory=&#039;\/var\/lib\/postgresql\/18\/main&#039;\n\nfailover=automatic\npromote_command=&#039;repmgr standby promote -f \/etc\/repmgr.conf --log-to-file&#039;\nfollow_command=&#039;repmgr standby follow -f \/etc\/repmgr.conf --upstream-node-id=%n --log-to-file&#039;\n\nuse_replication_slots=yes\nmonitoring_history=yes\nlog_file=&#039;\/var\/log\/repmgr\/repmgr.log&#039;\n\npg_bindir=&#039;\/usr\/lib\/postgresql\/18\/bin&#039;\nservice_start_command=&#039;sudo systemctl start postgresql&#039;\nservice_stop_command=&#039;sudo systemctl stop postgresql&#039;\n\nnode_rejoin_timeout=120\nstandby_reconnect_timeout=120\nEOF\n\nsudo chown postgres:postgres \/etc\/repmgr.conf\nsudo chmod 640 \/etc\/repmgr.conf\n<\/pre><\/div>\n\n\n<p>On server2, only <code>node_id<\/code>, <code>node_name<\/code>, and the <code>host<\/code> address in <code>conninfo<\/code> differ:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo tee \/etc\/repmgr.conf &gt; \/dev\/null &lt;&lt; &#039;EOF&#039;\nnode_id=2\nnode_name=&#039;server2&#039;\nconninfo=&#039;host=192.168.0.182 user=repmgr dbname=repmgr connect_timeout=2&#039;\ndata_directory=&#039;\/var\/lib\/postgresql\/18\/main&#039;\n\nfailover=automatic\npromote_command=&#039;repmgr standby promote -f \/etc\/repmgr.conf --log-to-file&#039;\nfollow_command=&#039;repmgr standby follow -f \/etc\/repmgr.conf --upstream-node-id=%n --log-to-file&#039;\n\nuse_replication_slots=yes\nmonitoring_history=yes\nlog_file=&#039;\/var\/log\/repmgr\/repmgr.log&#039;\n\npg_bindir=&#039;\/usr\/lib\/postgresql\/18\/bin&#039;\nservice_start_command=&#039;sudo systemctl start postgresql&#039;\nservice_stop_command=&#039;sudo systemctl stop postgresql&#039;\n\nnode_rejoin_timeout=120\nstandby_reconnect_timeout=120\nEOF\n\nsudo chown postgres:postgres \/etc\/repmgr.conf\nsudo chmod 640 \/etc\/repmgr.conf\n<\/pre><\/div>\n\n\n<p>Create the log directory on both servers:<\/p>\n\n\n<div 
class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo mkdir -p \/var\/log\/repmgr\nsudo chown postgres:postgres \/var\/log\/repmgr\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-6-set-up-ssh-keys-and-passwordless-sudo\">Step 6 \u2014 Set Up SSH Keys and Passwordless sudo<\/h2>\n\n\n\n<p>Planned switchovers require repmgr to SSH from one node to the other as the postgres OS user and run <code>systemctl stop postgresql<\/code>.<br>This step is mandatory \u2014 without it, switchovers fail at the point where repmgr tries to stop the current primary remotely.<\/p>\n\n\n\n<p><strong>SSH key setup:<\/strong><\/p>\n\n\n\n<p>On server1 \u2014 generate a key and display the public key:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server1\nsudo -u postgres ssh-keygen -t ed25519 -N &#039;&#039; -f \/var\/lib\/postgresql\/.ssh\/id_ed25519\nsudo -u postgres cat \/var\/lib\/postgresql\/.ssh\/id_ed25519.pub\n<\/pre><\/div>\n\n\n<p>On server2 \u2014 authorise server1\u2019s public key:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo -u postgres mkdir -p \/var\/lib\/postgresql\/.ssh\n# Paste the public key from server1 in place of &lt;server1-public-key&gt;\nsudo -u postgres bash -c &#039;echo &quot;&lt;server1-public-key&gt;&quot; &gt;&gt; \/var\/lib\/postgresql\/.ssh\/authorized_keys&#039;\nsudo chmod 700 \/var\/lib\/postgresql\/.ssh\nsudo chmod 600 \/var\/lib\/postgresql\/.ssh\/authorized_keys\nsudo chown -R postgres:postgres \/var\/lib\/postgresql\/.ssh\n<\/pre><\/div>\n\n\n<p>Repeat in the other direction \u2014 on server2 generate a key, then authorise it on server1:<\/p>\n\n\n<div 
class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo -u postgres ssh-keygen -t ed25519 -N &#039;&#039; -f \/var\/lib\/postgresql\/.ssh\/id_ed25519\nsudo -u postgres cat \/var\/lib\/postgresql\/.ssh\/id_ed25519.pub\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server1\nsudo -u postgres mkdir -p \/var\/lib\/postgresql\/.ssh\nsudo -u postgres bash -c &#039;echo &quot;&lt;server2-public-key&gt;&quot; &gt;&gt; \/var\/lib\/postgresql\/.ssh\/authorized_keys&#039;\nsudo chmod 700 \/var\/lib\/postgresql\/.ssh\nsudo chmod 600 \/var\/lib\/postgresql\/.ssh\/authorized_keys\nsudo chown -R postgres:postgres \/var\/lib\/postgresql\/.ssh\n<\/pre><\/div>\n\n\n<p>Verify from each node (type <code>yes<\/code> if prompted to accept the host key \u2014 only the first time):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server1\nsudo -u postgres ssh postgres@192.168.0.182 &quot;echo OK&quot;\n# Expected: OK\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo -u postgres ssh postgres@192.168.0.181 &quot;echo OK&quot;\n# Expected: OK\n<\/pre><\/div>\n\n\n<p><strong>Passwordless sudo \u2014 on both servers:<\/strong><\/p>\n\n\n\n<p>Create <code>\/etc\/sudoers.d\/postgres-repmgr<\/code> to allow the postgres user to start and stop PostgreSQL without a password.<br>The path must be <code>\/usr\/bin\/systemctl<\/code> \u2014 on Ubuntu 24.04 this is where systemctl lives, and sudo validates the exact path.<br>The file must have permissions 440 \u2014 sudo silently ignores files with world-readable permissions.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" 
data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo tee \/etc\/sudoers.d\/postgres-repmgr &gt; \/dev\/null &lt;&lt; &#039;EOF&#039;\npostgres ALL=(ALL) NOPASSWD: \/usr\/bin\/systemctl start postgresql, \/usr\/bin\/systemctl stop postgresql, \/usr\/bin\/systemctl restart postgresql\nEOF\nsudo chmod 440 \/etc\/sudoers.d\/postgres-repmgr\n<\/pre><\/div>\n\n\n<p>Verify \u2014 no password prompt expected:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# List the commands the postgres user may run through sudo; -n fails instead of prompting\nsudo -u postgres sudo -n -l\n# The three systemctl commands above should be listed with NOPASSWD\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-7-register-the-primary-and-clone-the-standby\">Step 7 \u2014 Register the Primary and Clone the Standby<\/h2>\n\n\n\n<p>On server1, register the running PostgreSQL instance as the primary node:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server1\nsudo -u postgres repmgr -f \/etc\/repmgr.conf primary register\n<\/pre><\/div>\n\n\n<p>On server2, run a dry run first to confirm connectivity:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo -u postgres repmgr -h 192.168.0.181 -U repmgr -d repmgr \\\n  -f \/etc\/repmgr.conf standby clone --dry-run\n# Expected: &quot;STANDBY CLONE (target node \\&quot;server2\\&quot;) would complete successfully&quot;\n<\/pre><\/div>\n\n\n<p>Then run the actual clone:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo -u postgres repmgr -h 192.168.0.181 -U repmgr -d repmgr \\\n  -f \/etc\/repmgr.conf standby clone\n<\/pre><\/div>\n\n\n<p>On Ubuntu, 
<code>postgresql.conf<\/code> and <code>pg_hba.conf<\/code> live in <code>\/etc\/postgresql\/18\/main\/<\/code> \u2014 outside the data directory.<br><code>pg_basebackup<\/code> only clones the data directory, so these config files are not copied automatically.<br>Copy them from server1 to server2:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server1 \u2014 stage the files for transfer\nsudo cp \/etc\/postgresql\/18\/main\/postgresql.conf \/tmp\/postgresql.conf\nsudo cp \/etc\/postgresql\/18\/main\/pg_hba.conf \/tmp\/pg_hba.conf\nsudo chmod 644 \/tmp\/postgresql.conf \/tmp\/pg_hba.conf\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2 \u2014 copy and install\n# Replace fernando with an SSH user that exists on server1\nscp fernando@192.168.0.181:\/tmp\/postgresql.conf \/tmp\/postgresql.conf\nscp fernando@192.168.0.181:\/tmp\/pg_hba.conf \/tmp\/pg_hba.conf\nsudo cp \/tmp\/postgresql.conf \/etc\/postgresql\/18\/main\/postgresql.conf\nsudo cp \/tmp\/pg_hba.conf \/etc\/postgresql\/18\/main\/pg_hba.conf\n<\/pre><\/div>\n\n\n<p>Start PostgreSQL on server2 and register it as a standby:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo systemctl start postgresql\nsudo -u postgres repmgr -f \/etc\/repmgr.conf standby register\n<\/pre><\/div>\n\n\n<p>Verify the cluster from either node:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres repmgr -f \/etc\/repmgr.conf cluster show\n<\/pre><\/div>\n\n\n<p>Expected output:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n ID | Name    | Role    | Status    | 
Upstream | Location | Priority | Timeline | Connection string\n----+---------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------\n 1  | server1 | primary | * running |          | default  | 100      | 1        | host=192.168.0.181 user=repmgr dbname=repmgr connect_timeout=2\n 2  | server2 | standby |   running | server1  | default  | 100      | 1        | host=192.168.0.182 user=repmgr dbname=repmgr connect_timeout=2\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-8-start-repmgrd-for-automatic-failover\">Step 8 \u2014 Start repmgrd for Automatic Failover<\/h2>\n\n\n\n<p>repmgrd is the monitoring daemon that triggers automatic failover.<br>On Ubuntu 24.04, repmgr does not ship with a systemd unit file for repmgrd \u2014 start it manually using <code>--daemonize<\/code>.<\/p>\n\n\n\n<p>On both servers:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres repmgrd -f \/etc\/repmgr.conf --daemonize\n<\/pre><\/div>\n\n\n<p>Verify the daemon is running and not paused:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres repmgr daemon status\n<\/pre><\/div>\n\n\n<p>Expected output:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n ID | Name    | Role    | Status    | repmgrd Active | PID  | Paused? | Upstream\n----+---------+---------+-----------+----------------+------+---------+---------\n 1  | server1 | primary | * running | yes            | XXXX | no      | n\/a\n 2  | server2 | standby |   running | yes            | XXXX | no      | server1\n<\/pre><\/div>\n\n\n<p>The <code>Paused? 
no<\/code> column is critical \u2014 repmgrd pauses itself after a failed failover attempt and will not attempt another failover while paused.<br>Before any failover test, verify both nodes show <code>Paused? no<\/code>.<br>If a node shows <code>Paused? yes<\/code>, run:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres repmgr daemon unpause\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-9-test-automatic-failover\">Step 9 \u2014 Test Automatic Failover<\/h2>\n\n\n\n<p>Stop PostgreSQL on server1 to simulate a primary failure:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server1\nsudo systemctl stop postgresql\n<\/pre><\/div>\n\n\n<p>On server2, watch repmgrd react:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo tail -f -n 50 \/var\/log\/repmgr\/repmgr.log\n<\/pre><\/div>\n\n\n<p>repmgrd waits through a configurable number of reconnect attempts (default: 6 attempts \u00d7 10 seconds = ~60 seconds) before promoting the standby.<br>When promotion completes, the log shows:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nNOTICE: promoting standby to primary\nNOTICE: STANDBY PROMOTE successful\n<\/pre><\/div>\n\n\n<p>Verify the new topology:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server2\nsudo -u postgres repmgr -f \/etc\/repmgr.conf cluster show\n<\/pre><\/div>\n\n\n<p>Expected: server2 is now primary, server1 shows as unavailable.<\/p>\n\n\n\n<hr 
class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-10-rejoin-the-failed-node-as-a-standby\">Step 10 \u2014 Rejoin the Failed Node as a Standby<\/h2>\n\n\n\n<p>After the failover, server1 needs to be reintegrated.<br>PostgreSQL may have advanced on server2 while server1 was down \u2014 the data directories are now diverged.<br>repmgr uses <code>pg_rewind<\/code> to sync server1 back to the new primary\u2019s timeline before starting replication.<\/p>\n\n\n\n<p>On server1, verify PostgreSQL is stopped, then run:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# On server1\nsudo systemctl status postgresql@18-main\n# Expected: inactive (dead) \u2014 if active, stop it first: sudo systemctl stop postgresql\n\nsudo -u postgres repmgr node rejoin \\\n  -d &#039;host=192.168.0.182 user=repmgr dbname=repmgr&#039; \\\n  -f \/etc\/repmgr.conf \\\n  --force-rewind \\\n  --verbose\n<\/pre><\/div>\n\n\n<p><code>--force-rewind<\/code> tells repmgr to run <code>pg_rewind<\/code> regardless of the timeline divergence check.<br>On Ubuntu 24.04, the rejoin sometimes times out before PostgreSQL finishes starting as a standby.<br>If the command exits with a timeout message but the log shows pg_rewind completed successfully, start PostgreSQL manually:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo systemctl start postgresql\n<\/pre><\/div>\n\n\n<p>Register server1 in the repmgr metadata and start the repmgrd daemon.<br>The <code>--force<\/code> flag is required because server1 was previously registered as the primary:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres repmgr -f \/etc\/repmgr.conf standby register --force\nsudo -u 
postgres repmgrd -f \/etc\/repmgr.conf --daemonize\n<\/pre><\/div>\n\n\n<p>Verify both nodes are running:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code\" data-no-translation=\"\"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nsudo -u postgres repmgr -f \/etc\/repmgr.conf cluster show\n<\/pre><\/div>\n\n\n<p>Expected: server1 listed as standby, server2 as primary.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"frequently-asked-questions\">Frequently Asked Questions<\/h2>\n\n\n<div id=\"rank-math-faq\" class=\"rank-math-block\">\n<div class=\"rank-math-list\">\n<div id=\"faq-question-1779206653802\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>Does repmgr work with PostgreSQL 18?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>Yes.<br \/>repmgr 5.5.0 supports PostgreSQL 18.<br \/>Install from the PGDG repository using the postgresql-18-repmgr package.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1779206654802\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>Why does my planned switchover fail with \u201cprimary shutdown could not be confirmed\u201d?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>The service_stop_command parameter is not set in \/etc\/repmgr.conf.<br \/>repmgr needs to stop the current primary via SSH during a switchover and requires an explicit command to do so.<br \/>Add service_stop_command='sudo systemctl stop postgresql' to \/etc\/repmgr.conf on both nodes.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1779206655802\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>Why does repmgrd not trigger failover even though the primary is down?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>repmgrd is paused.<br \/>The daemon pauses itself after a failed failover attempt to prevent cascading promotion in split-brain scenarios.<br \/>Run repmgr daemon 
status to check the Paused? column, and repmgr daemon unpause to resume monitoring.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1779206656802\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>Does the demoted node restart automatically after a switchover?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>No.<br \/>repmgr stops the old primary during switchover but does not restart it.<br \/>The official repmgr documentation states: The original primary will be shut down in any case, and will need to be manually reintegrated into the replication cluster.<br \/>After the switchover completes, run sudo systemctl start postgresql on the demoted node to bring it back as a standby.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1779206657802\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>What is pg_rewind and why is it needed for node rejoin?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>pg_rewind resynchronises a PostgreSQL data directory that has diverged from the primary timeline.<br \/>This happens after failover \u2014 the old primary may have committed transactions that were not replicated before it failed.<br \/>pg_rewind replaces those diverged blocks with the correct data from the new primary, allowing the node to resume replication without a full base backup.<br \/>wal_log_hints = on must be set in postgresql.conf for pg_rewind to work.<\/p>\n\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"in-summary\">In Summary<\/h2>\n\n\n\n<p>PostgreSQL streaming replication is built in; automatic failover requires repmgr.<br>The setup involves six moving parts that must all be correct: PostgreSQL configuration, repmgr configuration, SSH keys between the postgres users, passwordless sudo for service management, replication slots, and repmgrd running and unpaused.<br>The most common failure points are 
the sudoers file permissions (must be 440, not 644) and the missing <code>service_stop_command<\/code> for switchovers.<\/p>\n\n\n\n<p>Once the cluster is running, the next step is to add a floating Virtual IP with Keepalived so applications always connect to the current primary automatically &#8212; see <a href=\"https:\/\/rootfan.com\/es\/postgresql-keepalived-vip\/\">PostgreSQL repmgr with Keepalived: Adding a Floating VIP<\/a>.<\/p>\n\n\n\n<p>If you are building a PostgreSQL high-availability cluster for a production environment and want a second opinion on the architecture before you commit, <a href=\"https:\/\/rootfan.com\/es\/servicios\/\">get in touch<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>PostgreSQL does not include automatic failover out of the box. When the primary goes down, someone has to promote the standby manually \u2014 which means downtime. repmgr adds an automatic failover daemon (repmgrd) that monitors the cluster and promotes the standby within seconds when the primary fails. This guide walks through setting up a two-node PostgreSQL 18 cluster &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/rootfan.com\/es\/postgresql-repmgr-setup\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;PostgreSQL repmgr Setup Replication with Automatic Failover&#8221;<\/span><\/a><\/p>","protected":false},"author":1,"featured_media":6864,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"rank_math_focus_keyword":"postgresql repmgr automatic failover","rank_math_title":"PostgreSQL Streaming Replication with Automatic Failover Using repmgr","rank_math_description":"Step-by-step guide to setting up a two-node PostgreSQL 18 cluster with streaming replication and automatic failover using repmgr 5.x on Ubuntu 24.04. 
Includes common pitfalls and how to avoid them.","rank_math_robots":"","rank_math_og_title":"","rank_math_og_description":"","_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[126],"tags":[127,81],"class_list":["post-6863","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-postgresql","tag-architecture","tag-step-by-step"],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/rootfan.com\/wp-content\/uploads\/pexels-photo-37455405.jpeg?fit=975%2C1300&ssl=1","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/posts\/6863","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/comments?post=6863"}],"version-history":[{"count":3,"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/posts\/6863\/revisions"}],"predecessor-version":[{"id":6869,"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/posts\/6863\/revisions\/6869"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/media\/6864"}],"wp:attachment":[{"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/media?parent=6863"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/categories?post=6863"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rootfan.com\/es\/wp-json\/wp\/v2\/tags?post=6863"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}