Cluster management

Crunchy Bridge helps you manage your clusters through a variety of cluster management operations. These operations are forms of maintenance that keep your clusters operational and secure.

A brief service interruption is required to perform cluster management operations. Please ensure that your applications are able to automatically reconnect to the database.
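As a minimal sketch of what automatic reconnection can look like, the example below retries a connection attempt with exponential backoff. The `connect` callable, the retry counts, and the delays are illustrative assumptions rather than Crunchy Bridge recommendations. A real Postgres driver raises its own exception types, so the `except` clause would need adjusting to match it.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Simulated driver (hypothetical): fails twice, as it might during a failover,
# then succeeds on the third attempt.
calls = {"n": 0}


def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("server is restarting")
    return "connection"


waits = []  # record the requested delays instead of actually sleeping
result = connect_with_retry(flaky_connect, sleep=waits.append)
print(result, waits)  # connection [0.5, 1.0]
```

Many drivers and connection pools offer built-in retry or health-check options; where those exist, they are usually preferable to a hand-written loop.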

Info

A cluster's connection string will remain the same across cluster management operations unless you explicitly rotate the credentials.

When required to ensure the health of your cluster, we may schedule these operations on your behalf (for example, Cluster Resizes or minor version upgrades).

See the Cluster settings page for details about enabling or disabling cluster settings like the protected flag, maintenance window, or High Availability (HA).

Overview of cluster management

During cluster management operations, the Crunchy Bridge platform will execute the following steps:

  1. Create replica of current cluster

  2. Migrate existing data (duration is relative to data size)

  3. Complete any additional operation-specific steps

  4. Fail over to updated cluster

More information about the states a new cluster passes through while it is being built can be found in the Cluster states section below.

Important Notes:

  • You can cancel cluster management operations or change the run time at any time before the cluster failover is about to begin.
  • A cluster failover causes a service interruption that should last no longer than a few minutes.
  • Clusters that have HA enabled are not faster to resize than equivalent non-HA clusters (the HA is not used for the resize operation).
  • If you have no maintenance window set, and have not specified a run time, cluster failover will occur as soon as the new cluster is populated and ready.
  • If you need assistance scheduling maintenance, or are concerned about a maintenance in progress, please contact support.
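The scheduling rules in the notes above can be sketched as a small function. This is only an illustration under stated assumptions: it models the maintenance window as a daily start hour in UTC, and it assumes that an explicit run time takes precedence over the window. Neither detail is confirmed on this page, and `failover_time` is a hypothetical helper, not part of any Crunchy Bridge API.

```python
from datetime import datetime, timedelta, timezone


def failover_time(ready_at, run_at=None, window_start_hour=None):
    """Estimate when failover happens, given when the new cluster became ready."""
    if run_at is not None:
        # Assumption: an explicit run time wins, but failover cannot happen
        # before the new cluster is populated and ready.
        return max(ready_at, run_at)
    if window_start_hour is not None:
        # Assumption: the window is a daily start hour in UTC.
        candidate = ready_at.replace(hour=window_start_hour, minute=0, second=0, microsecond=0)
        if candidate < ready_at:
            candidate += timedelta(days=1)
        return candidate
    # No window and no run time: failover occurs as soon as the cluster is ready.
    return ready_at


ready = datetime(2024, 1, 10, 14, 30, tzinfo=timezone.utc)
immediate = failover_time(ready)
windowed = failover_time(ready, window_start_hour=3)
print(immediate.isoformat())  # 2024-01-10T14:30:00+00:00
print(windowed.isoformat())   # 2024-01-11T03:00:00+00:00
```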

Cluster states

Most of the time, when a cluster management operation is initiated, a new replica cluster is created behind the scenes. Even if you have High Availability (HA) enabled, a completely separate, fresh cluster is built and populated with the existing data in the current cluster.

Any cluster management operation will take some time to complete. The exact duration of the end-to-end process depends on many factors, including your data and schema sizes, and how busy your cluster is. You can follow the progress of an ongoing operation by tracking the state of the replacement cluster as it builds. A cluster's current state is shown in the dashboard or on the CLI using cb info.
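One way to follow an operation from a script is to poll the cluster state until it reaches Ready. The sketch below assumes a hypothetical `get_state` callable that returns the current state name as a string. How you obtain that value (for example by wrapping `cb info` or the dashboard) is left open, because the output format is not described here. The polling interval and timeout are arbitrary defaults.

```python
import time


def wait_until_ready(get_state, interval=30.0, timeout=3600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns "Ready"; return the distinct states seen, in order."""
    deadline = clock() + timeout
    seen = []
    while True:
        state = get_state()
        if not seen or seen[-1] != state:
            seen.append(state)  # record each state change once
        if state == "Ready":
            return seen
        if clock() >= deadline:
            raise TimeoutError(f"cluster still {state!r} after {timeout} seconds")
        sleep(interval)


# Simulated state source standing in for a real lookup.
fake_states = iter(["Creating", "Restoring", "Restoring", "Starting",
                    "Replaying", "Finalizing", "Ready"])
history = wait_until_ready(lambda: next(fake_states), sleep=lambda _: None)
print(history)
# ['Creating', 'Restoring', 'Starting', 'Replaying', 'Finalizing', 'Ready']
```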

All possible cluster states are listed below. During most cluster management operations, the replacement cluster will go through all of the states listed in the first table. Note that when a fresh empty cluster is being created, you will see it pass through some but not all of the states listed.

States seen during create, resize, update, refresh, or fork:

State | What's happening | Typical duration | Next state
Creating | A new underlying server is being created | 1-2 minutes | Restoring
Restoring | Latest base backup is being restored to the server | Variable | Starting
Starting | Postgres is being started on the cluster; WAL that accumulated during base backup is being applied | Variable | Replaying
Replaying | Accumulated WAL since last base backup is being replayed | Variable | Finalizing
Finalizing | Cluster configuration is being finalized and the server is being made available | 1-2 minutes | Ready
Ready | New cluster matches source cluster and is ready for the operation to proceed. If scheduled for an upcoming maintenance window, cluster is kept Ready until that time. If scheduled for now (or "Run Now" is clicked), the operation proceeds once it reaches Ready. Running clusters normally show the Ready state. | N/A | N/A
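The "Next state" column of the first table above amounts to a simple linear state machine. As an illustrative sketch, the mapping below encodes those transitions so that a script can work out which states are still ahead of a building cluster. The names `BUILD_TRANSITIONS` and `remaining_states` are hypothetical.

```python
# Transitions taken from the "Next state" column; Ready has no next state.
BUILD_TRANSITIONS = {
    "Creating": "Restoring",
    "Restoring": "Starting",
    "Starting": "Replaying",
    "Replaying": "Finalizing",
    "Finalizing": "Ready",
    "Ready": None,
}


def remaining_states(current):
    """Return the states a building cluster still has to pass through after `current`."""
    path = []
    state = BUILD_TRANSITIONS[current]  # raises KeyError for states outside this table
    while state is not None:
        path.append(state)
        state = BUILD_TRANSITIONS[state]
    return path


remaining = remaining_states("Starting")
print(remaining)  # ['Replaying', 'Finalizing', 'Ready']
```

A fresh empty cluster passes through only some of these states, so this mapping is best treated as the expected path for replacement clusters rather than a strict rule.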

Other cluster states you may see on the platform:

State | What's happening | Typical duration | Next state
Restarting | Underlying server is being restarted | 1-2 minutes | Ready
Resuming | A new server is being built and a suspended cluster is being resumed | 3-5 minutes | Ready
Suspending | Cluster is being suspended | 3-5 minutes | Suspended
Suspended | Cluster is currently suspended | Until resumed | Resuming

Available operations

Cluster Refresh - Replace instance, update to latest minor version, get the newest OS version, and enable the latest features

Cluster Resize - Change the instance size or storage size of the cluster

Postgres Upgrade - Upgrade the Postgres major version

Cluster Suspend/Resume - Spin down the Postgres server but retain data on disk

Cluster refresh

You can refresh your cluster instance at any time using the "Refresh Instance" button in the Cluster Settings tab, using cb maintenance create --cluster CLUSTER_ID on the CLI, or by using the Cluster Upgrade API endpoint without any parameters. Doing this causes minimal impact and will not change your connection string. This is a simple way to perform a minor version upgrade or to enable the newest Crunchy Bridge features on your cluster.

Cluster resize

You can resize a cluster in place with minimal impact and no changes to your connection string. During a cluster resize, you can change the instance tier and size, as well as the amount of storage. Scaling the instance tier, memory, and storage up or down is supported.

You can change the instance size and/or storage size of your cluster using the Dashboard Cluster Action - Resize, cb upgrade start on the CLI, or by using the Cluster Upgrade API endpoint.

If you plan to decrease the storage size of your cluster, note that the new storage size must currently be greater than or equal to 1.4x the current disk usage. This headroom reduces alerting and avoids an immediate need to resize back up. If you have questions about your instance size, please contact support and we'd be happy to assist you.
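As a worked example of the 1.4x rule: a cluster using 100 GiB of disk could be shrunk to 140 GiB or more, but not to 120 GiB. The sketch below does the same arithmetic. It assumes sizes are expressed in whole GiB and rounds the minimum up to the next whole unit. Both are assumptions for illustration, and the helper names are hypothetical.

```python
def min_storage_gib(disk_usage_gib):
    """Smallest whole-GiB storage size that is at least 1.4x the current disk usage."""
    if disk_usage_gib < 0:
        raise ValueError("disk usage cannot be negative")
    # Integer ceiling of usage * 1.4, which avoids floating-point rounding.
    return (disk_usage_gib * 14 + 9) // 10


def can_shrink_to(target_gib, disk_usage_gib):
    """Check a proposed storage size against the 1.4x minimum."""
    return target_gib >= min_storage_gib(disk_usage_gib)


smallest = min_storage_gib(100)
allowed = can_shrink_to(150, 100)
rejected = can_shrink_to(120, 100)
print(smallest, allowed, rejected)  # 140 True False
```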

Postgres upgrade

Crunchy Bridge allows you to schedule your major version upgrades through the Dashboard, cb upgrade start on the CLI, or by using the Upgrade Cluster API endpoint. When scheduling a major version upgrade, you'll be prompted to choose the new Postgres version.

Postgres major version upgrades work differently than other Cluster Management operations. Once you initiate the process, Crunchy Bridge will execute the following steps:

  1. Create a replica of your current cluster

  2. Migrate existing data (duration is relative to data size)

  3. When your maintenance window arrives:

     • Lock primary cluster to prevent writes
     • Upgrade the new cluster (duration depends on the number of objects in your database, not data size)

  4. Fail cluster over once the upgrade is complete

Important Notes:

  • Major version changes can affect application compatibility. We recommend testing your application against the new PostgreSQL version before upgrading.
  • Read Replicas are automatically upgraded when performing a major version upgrade, but only once their primary is upgraded and a fresh backup is taken. Until then, replicas will remain available but in a stale state.
  • If you have no maintenance window set, and have not specified a run time, the upgrade will commence as soon as the new cluster is populated and ready.
  • This operation creates a service interruption that should last no longer than a few minutes.
  • If an upgrade fails, your cluster will automatically revert to the original cluster and you will be notified by email.
  • Contact support if your upgrade is taking longer than expected, or if it fails to complete and you need help determining the cause.

Minor version upgrade

A minor version upgrade can be done simply with a Cluster Refresh. For non-critical fixes, we automatically update clusters to the latest point release gradually through:

  • Updating HA standbys
  • Installing the latest minor version during other Cluster Management operations

We examine all security-related issues and bugs with every Postgres release. For any issues that are deemed critical, we will prioritize your upgrade to ensure your data is safe.

If an emergency update is required, we will perform that update on your behalf during your cluster's next maintenance window.

Cluster suspend and resume

Crunchy Bridge allows you to suspend your cluster. You can perform this action through the Dashboard, cb suspend on the CLI, or by using the Suspend Cluster API endpoint.

Suspending a cluster will deactivate the virtual machine it's running on but keep its disk image in storage so that it can later be resumed. Normal billing for the cluster is suspended, but storage costs will continue to accrue. The existing 10 days' worth of backups are also retained.

You can resume your cluster at any time. The time it takes to resume a cluster depends on the instance and the size of the dataset. When you resume a cluster, normal billing and backups will also recommence.