Scaling CDC ClickPipes via OpenAPI
The default configuration of CDC ClickPipes is designed to handle most use cases as-is. If your workload exceeds 1 TB for the initial load or 5,000 row changes per second, or if the data needs to be moved as quickly as possible regardless of cost, this API might be what you need.
Other signs that scaling may be necessary:
- The initial load is taking longer than 24 hours while the load on the source DB is low
  - Consider tweaking the initial load parallelism and partitioning first
- New rows take more than 2× the sync interval to appear in the destination table
  - This applies only as long as there are no long-running transactions on the source
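The thresholds and signs above can be collected into a small checklist. The following is a minimal sketch in Python, not part of any ClickHouse tooling: the `Workload` type, its field names, and the 60-second default sync interval are hypothetical and only illustrate the conditions named in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    """Hypothetical description of a CDC workload (all names are illustrative)."""
    initial_load_tb: float            # size of the initial load, in TB
    row_changes_per_sec: float        # sustained change rate on the source
    initial_load_hours: float = 0.0   # how long the initial load has been running
    source_db_load_low: bool = False  # True if the source DB is not the bottleneck
    replication_lag_s: float = 0.0    # delay before new rows reach the destination
    sync_interval_s: float = 60.0     # placeholder; use your pipe's sync interval
    long_running_txns: bool = False   # long transactions on the source explain lag


def scaling_signals(w: Workload) -> List[str]:
    """Return the reasons from the text above that apply to this workload."""
    reasons = []
    if w.initial_load_tb > 1:
        reasons.append("initial load exceeds 1 TB")
    if w.row_changes_per_sec > 5000:
        reasons.append("more than 5,000 row changes per second")
    if w.initial_load_hours > 24 and w.source_db_load_low:
        reasons.append("initial load over 24 hours with low source load")
    if w.replication_lag_s > 2 * w.sync_interval_s and not w.long_running_txns:
        reasons.append("rows lag more than 2x the sync interval")
    return reasons
```

An empty list suggests the default configuration is probably sufficient; a non-empty list is a prompt to look at scaling, not a guarantee that it will help.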
For more information about the underlying infrastructure and costs, see Postgres CDC Pricing.
Prerequisites for this process
Before you get started, you will need:
- A ClickHouse API key with Admin permissions on the target ClickHouse Cloud service.
- A CDC ClickPipe (Postgres, MySQL, or MongoDB) that has been provisioned in the service at some point. The CDC infrastructure is created along with the first ClickPipe, and the scaling endpoints become available from that point onwards.
Steps to scale CDC ClickPipes
Set the following environment variables before running any commands:
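As a minimal sketch in Python, assuming the variables are named `ORG_ID`, `SERVICE_ID`, `KEY_ID`, and `KEY_SECRET` (hypothetical names, not confirmed by this page), the settings could be read and checked before any request is made:

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names; use whatever names your own commands expect.
REQUIRED_VARS = ("ORG_ID", "SERVICE_ID", "KEY_ID", "KEY_SECRET")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the required settings and fail early if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.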
Fetch the current scaling configuration (optional):
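A request for the current configuration might look like the sketch below, written with Python's standard library. The base URL, the `clickpipesCdcScaling` path, and the use of HTTP Basic authentication with the key ID and secret are assumptions here; verify them against the ClickHouse Cloud OpenAPI reference before relying on them.

```python
import base64
import json
import urllib.request

# Assumed base URL and path; check the ClickHouse Cloud OpenAPI reference.
API_BASE = "https://api.clickhouse.cloud/v1"


def scaling_url(org_id: str, service_id: str) -> str:
    """Build the (assumed) URL of the CDC scaling endpoint for one service."""
    return f"{API_BASE}/organizations/{org_id}/services/{service_id}/clickpipesCdcScaling"


def build_get_request(org_id: str, service_id: str,
                      key_id: str, key_secret: str) -> urllib.request.Request:
    """Prepare a GET request authenticated with HTTP Basic auth (assumed scheme)."""
    token = base64.b64encode(f"{key_id}:{key_secret}".encode()).decode()
    return urllib.request.Request(
        scaling_url(org_id, service_id),
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


def fetch_scaling(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Building the request separately from sending it lets you inspect the URL and headers without touching the network.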
Set the desired scaling. Supported configurations are 1 to 16 CPU cores, with memory in GB equal to 4× the core count:
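The constraint stated above (1 to 16 cores, memory in GB equal to 4× the core count) can be enforced before a request body is built. This is a minimal sketch; the field names `replicaCpuMillicores` and `replicaMemoryGb`, and the expression of CPU in millicores, are assumptions rather than something this page confirms.

```python
import json


def scaling_payload(cpu_cores: int) -> dict:
    """Return a request body for the given core count, or raise ValueError."""
    if isinstance(cpu_cores, bool) or not isinstance(cpu_cores, int):
        raise ValueError("cpu_cores must be an integer")
    if not 1 <= cpu_cores <= 16:
        raise ValueError("cpu_cores must be between 1 and 16")
    return {
        # Assumed field names; check the OpenAPI reference for the real schema.
        "replicaCpuMillicores": cpu_cores * 1000,
        "replicaMemoryGb": cpu_cores * 4,  # memory is always 4x the core count
    }


def scaling_body(cpu_cores: int) -> bytes:
    """Serialize the payload as UTF-8 JSON, ready to send to the scaling endpoint."""
    return json.dumps(scaling_payload(cpu_cores)).encode("utf-8")
```

Deriving memory from the core count, instead of accepting both values, makes an unsupported combination impossible to construct.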
Wait a minute or two for the change to propagate. Once the scaling has finished, the GET endpoint will reflect the new values:
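Instead of checking by hand, the GET endpoint can be polled until it reports the requested values. The sketch below assumes a caller-supplied `fetch` function that returns the scaling fields as a flat dictionary; the response shape, the 5-minute timeout, and the 10-second polling interval are illustrative choices, not documented behavior.

```python
import time
from typing import Callable, Dict


def wait_for_scaling(fetch: Callable[[], Dict],
                     expected: Dict,
                     timeout_s: float = 300.0,
                     interval_s: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll fetch() until every expected key matches, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        current = fetch()
        if all(current.get(key) == value for key, value in expected.items()):
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("scaling did not reach the expected values in time")
        sleep(interval_s)  # wait before asking again
```

The `sleep` parameter exists only so the loop can be exercised without real delays.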