pg_flo

The easiest way to move and transform data between PostgreSQL databases The project is in active development. Follow the GitHub repo for more details 😊

  • Real-time Streaming

    Stream data changes in near real-time using PostgreSQL Logical Replication.

  • Bulk Copy

    Parallelizable bulk copy for fast initial data synchronization.

  • Powerful Transformations

    Apply regex-based transformations, mask sensitive data, and filter based on column values before the changes reach the destination.

  • Flexible Routing

    Route data seamlessly between tables - whether to differently named tables or the same table with custom column mappings.


# Install using Docker
docker pull shayonj/pg_flo:latest

# Start NATS server
docker run -d --name pg_flo_nats \
  --network host \
  nats:latest

# Start replicator
docker run -d --name pg_flo_replicator \
  --network host \
  shayonj/pg_flo:latest \
  replicator

# Start worker with PostgreSQL sink
docker run -d --name pg_flo_worker \
  --network host \
  shayonj/pg_flo:latest \
  worker postgres

Transformation Rules

Mask Email Addresses Replaces email characters with asterisks, preserving the domain


- type: transform
  column: email
  parameters:
    type: mask
    mask_char: "*"
  operations: [INSERT, UPDATE]
              

Format Phone Numbers Formats 10-digit phone numbers to (XXX) XXX-XXXX format


- type: transform
  column: phone
  parameters:
    type: regex
    pattern: "^(\\d{3})(\\d{3})(\\d{4})$"
    replace: "($1) $2-$3"
  operations: [INSERT, UPDATE]
              

Table Routing & Column Mapping

Flexible Table Routing Route data between differently named tables and columns


users:
  source_table: users
  destination_table: customers
  column_mappings:
    - source: id
      destination: customer_id
    - source: username
      destination: customer_name
  operations: [INSERT, UPDATE]

Multi-Table Routing Route multiple tables with different mappings simultaneously


orders:
  source_table: orders
  destination_table: transactions
  column_mappings:
    - source: id
      destination: transaction_id
    - source: total_amount
      destination: amount
  operations: [INSERT, UPDATE, DELETE]

Use Cases

Real-time Replication

Stream data changes between PostgreSQL databases in real-time with built-in transformation capabilities.

Shadow Tables

Create shadow tables within the same database for data migrations, allowing time to verify data integrity before switching over.

Data Privacy & ETL

Build ETL pipelines with data masking, filtering, and transformation for analytics and security compliance.

Database Migration

Perform zero-downtime database migrations with initial bulk copy followed by continuous replication.

Audit Logging

Maintain comprehensive audit trails by automatically replicating changes to dedicated audit tables with timestamps and change tracking.

How It Works

pg_flo leverages PostgreSQL's logical replication system to capture and stream data changes. It uses NATS as a message broker to decouple reading from the WAL through the replicator and worker processes, providing flexibility and scalability. Transformations and filtrations are applied before the data reaches the destination.