Migrate Timescale Hypertables To Vanilla Postgres
This post documents a migration from TimescaleDB hypertables to vanilla PostgreSQL tables. The decision to migrate was driven by the need to reduce overall system complexity, particularly given my insufficient understanding of TimescaleDB's internal hypertable maintenance mechanisms and a data loss incident that occurred during an upgrade.
TimescaleDB, Hypertables and the Migration Problem
TimescaleDB enhances PostgreSQL for time-series data by introducing hypertables,
which are virtual tables that automatically partition data into smaller,
manageable tables called chunks, typically organized by time dimension. Standard
tools such as pg_dump
interact only with the main hypertable, which functions as
an empty parent table containing metadata and pointers to its chunks. Since the
actual data resides within child chunk tables, a standard pg_dump
operation
captures only the schema of the empty parent table while omitting the data-filled chunks.
Solution
While the data remains accessible through direct psql
queries, enabling a COPY
with TO 'table.csv'
approach, my database experiences high concurrency with
approximately 98% of operations being inserts. Restoring this data to a new
database while clients continue performing operations would result in conflicts
during the restoration process without proper conflict resolution mechanisms
(such as ON CONFLICT .. DO NOTHING
). A more effective approach involves
leveraging the database's internal conflict resolution capabilities.
Given these considerations, I employed a migration strategy that involves
creating a new table with identical structure and content. Assuming a hypertable
named my_table
, the migration process proceeds as follows:
CREATE TABLE my_table_raw AS SELECT * FROM my_table;
ALTER TABLE my_table RENAME TO my_table_hypertable;
ALTER TABLE my_table_raw RENAME TO my_table;
Subsequently, pg_dump
can properly export the database for migration to a new
vanilla PostgreSQL instance:
pg_dump \
--data-only \
--column-inserts \
--table=ticker \
-U "user" -h "postgres-host.domain.com" -p "5432" \
-d "database_name" \
| sed 's/);$/) ON CONFLICT (col1, col2) DO NOTHING;/' > table.sql
Alternative Solution: CSV Export
For completeness, an alternative CSV export approach is documented below:
psql \
-U "user" -h "postgres-host.domain.com" -p "5432" \
-d "database_name"
-c "\COPY (SELECT * FROM my_table) TO 'data.csv' WITH CSV HEADER"