Scaling PostgreSQL Without Hassle: Meet PgDog

PgDog, founded in 2025 and headquartered in San Francisco, offers a new approach to PostgreSQL scalability. Built from real-world experience at Instacart during periods of intense growth, it addresses a persistent challenge: how to scale PostgreSQL without extensive application changes. By combining automatic sharding, query routing, and load balancing in one tool, PgDog aims to make database scaling simpler and faster.

What Makes PgDog Different from Traditional Sharding Solutions?

Unlike traditional sharding methods that often require intrusive changes to database schemas or application logic, PgDog operates completely outside the database. It supports managed environments such as AWS RDS and Google Cloud SQL without requiring extensions or modifications. PgDog understands SQL at a deep level, parsing and analyzing queries to distribute them intelligently among shards, maintaining transparency for client applications.

How Does PgDog Handle Query Routing?

PgDog’s core innovation lies in its advanced query routing system. By integrating the PostgreSQL parser directly through the pg_query crate in Rust, PgDog can interpret every valid SQL command. It identifies sharding keys within queries and routes them efficiently to the appropriate database shard. Even complex queries involving joins or foreign keys are managed seamlessly. When a query lacks a clear sharding key, PgDog executes cross-shard queries, gathering and merging results before presenting them to the client.
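The routing decision described above can be illustrated with a small sketch. It is not PgDog's implementation: a toy regular expression stands in for the real PostgreSQL parser, and the column name `user_id`, the shard count, and the CRC32-based hash are all assumptions chosen for the example.

```python
import re
import zlib

NUM_SHARDS = 3               # hypothetical cluster size
SHARDING_COLUMN = "user_id"  # hypothetical sharding key column


def extract_sharding_key(sql):
    """Return the sharding key value as a string, or None if the query has none.

    A toy regex stands in for a real SQL parser here.
    """
    pattern = rf"\b{SHARDING_COLUMN}\s*=\s*(\d+)"
    match = re.search(pattern, sql, flags=re.IGNORECASE)
    return match.group(1) if match else None


def route(sql):
    """Return the list of shard indexes the query should be sent to."""
    key = extract_sharding_key(sql)
    if key is None:
        # No sharding key found: fall back to a cross-shard query.
        return list(range(NUM_SHARDS))
    # Illustrative hash only; a real router would use its own hash function.
    return [zlib.crc32(key.encode()) % NUM_SHARDS]
```

A query such as `SELECT * FROM orders WHERE user_id = 42` goes to exactly one shard, while `SELECT * FROM orders` fans out to all of them.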

What Happens When Queries Lack Sharding Information?

In scenarios where queries cannot be automatically routed to a single shard, PgDog transparently distributes the query to all shards. It then consolidates the results, handling complexities such as sorting and aggregation internally. Whether a query demands ordered results or involves aggregate functions like count, min, or max, PgDog ensures accurate, efficient data retrieval without additional burden on the client.
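As a rough sketch of the merge step for ordered results, assume each shard has already returned its rows sorted by the ORDER BY column. The function name and row layout below are hypothetical; the point is only that sorted per-shard results can be combined without re-sorting everything.

```python
import heapq


def merge_sorted(shard_results, key):
    """Merge per-shard row lists, each already sorted by `key`, into one sorted list."""
    return list(heapq.merge(*shard_results, key=key))


# Hypothetical rows of (id, name), sorted by id within each shard.
shard_a = [(1, "ann"), (4, "dan")]
shard_b = [(2, "bob"), (3, "cat")]
merged = merge_sorted([shard_a, shard_b], key=lambda row: row[0])
```

Because the inputs are already sorted, the merge only compares the next row from each shard, so the client sees one ordered result set.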

Can PgDog Load Balance Queries Across Multiple Databases?

Yes, PgDog doubles as a powerful load balancer and connection pooler, effectively replacing tools like PgBouncer and RDS Proxy. It operates at the application layer and offers several load balancing strategies, including random distribution, least active connections, and round robin. This flexibility allows organizations to optimize resource usage and improve database performance under varying workloads.
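The three strategies named above can be sketched in a few lines. The `Replica` class and function names are hypothetical and only illustrate the selection logic, not PgDog's configuration or internals.

```python
import itertools
import random


class Replica:
    """A database in the pool, tracking how many connections are in use."""

    def __init__(self, name):
        self.name = name
        self.active = 0  # connections currently checked out


def pick_random(replicas, rng=random):
    """Random distribution: any replica, chosen uniformly."""
    return rng.choice(replicas)


def pick_least_active(replicas):
    """Least active connections: the replica with the fewest connections in use."""
    return min(replicas, key=lambda r: r.active)


def round_robin(replicas):
    """Round robin: an endless iterator cycling through replicas in order."""
    return itertools.cycle(replicas)
```

Random and round robin spread queries evenly over time, while least active connections adapts when one database is slower than the others.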

How Does PgDog Manage Transactions Across Shards?

Transactions are harder to route in a sharded environment because a BEGIN statement on its own carries no sharding key. PgDog buffers the initial BEGIN statement and waits for the subsequent query to determine the appropriate shard. This buffering is transparent to the client, ensuring that the entire transaction can be routed correctly without introducing latency or inconsistency. The exception is specific time-sensitive cases, where the delay between the client's BEGIN and the actual start of the transaction on the shard can matter.
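The buffering idea can be sketched as a small state machine. The class, its `choose_shard` callback, and the return shape are hypothetical, and the sketch ignores details such as error handling and multi-shard transactions.

```python
class TransactionRouter:
    """Holds back BEGIN until the next statement reveals which shard to use."""

    def __init__(self, choose_shard):
        self.choose_shard = choose_shard  # hypothetical routing callback
        self.buffered_begin = None        # BEGIN not yet sent to any shard
        self.shard = None                 # shard pinned for the open transaction

    def handle(self, sql):
        """Return (shard, statements_to_forward), or None if nothing is sent yet."""
        keyword = sql.strip().rstrip(";").upper()
        if keyword == "BEGIN":
            self.buffered_begin = sql
            return None
        statements = [sql]
        if self.buffered_begin is not None:
            # First statement after BEGIN: pick the shard and send both together.
            self.shard = self.choose_shard(sql)
            statements.insert(0, self.buffered_begin)
            self.buffered_begin = None
        shard = self.shard if self.shard is not None else self.choose_shard(sql)
        if keyword in ("COMMIT", "ROLLBACK"):
            self.shard = None  # transaction finished, release the pin
        return shard, statements
```

The client sees an ordinary transaction; only the moment at which BEGIN reaches a database changes.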

Is PgDog Suitable for Existing Databases?

One of PgDog’s standout features is its ability to shard existing PostgreSQL databases without downtime. By leveraging logical replication, it can split databases while maintaining service continuity. This functionality dramatically reduces the risks and complexities typically associated with database migrations and scalability upgrades.

What About Copy Operations and Bulk Data Transfers?

PgDog simplifies bulk data ingestion with intelligent sharding for COPY commands. Large datasets can be distributed across multiple shards without manual intervention from clients. This feature is crucial for businesses that handle extensive data uploads and need efficient, automated data distribution.
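One way to picture sharded bulk loading is to split incoming rows by sharding key before forwarding them. The sketch below assumes CSV input with a header row and uses an illustrative CRC32 hash; the function name and the column name in the usage example are hypothetical, and PgDog's actual handling of the COPY protocol may differ.

```python
import csv
import io
import zlib


def shard_copy_rows(csv_text, key_column, num_shards):
    """Split CSV rows into one bucket per shard based on the sharding key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    buckets = [[] for _ in range(num_shards)]
    for row in reader:
        # Illustrative hash only; rows with the same key always share a bucket.
        shard = zlib.crc32(row[key_column].encode()) % num_shards
        buckets[shard].append(row)
    return buckets


data = "user_id,name\n1,ann\n2,bob\n3,cat\n1,dan\n"
buckets = shard_copy_rows(data, "user_id", 2)
```

Each bucket would then be streamed to its shard, so the client issues a single bulk load and never needs to know where each row lands.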

Does PgDog Support Schema Changes Across Shards?

Schema modifications, such as CREATE TABLE statements, are automatically propagated to all shards simultaneously. This uniformity ensures that the entire database cluster maintains a consistent structure, simplifying management and minimizing errors. While all shards typically share the same schema, manual routing options are available for more complex scenarios.

How Does PgDog Handle Aggregate Functions and Sorting?

PgDog provides built-in support for several aggregate functions across shards, including count, min, max, and sum. Sorting operations are also handled post-query execution, ensuring that even complex, cross-shard queries deliver properly ordered results. Although some aggregate functions like avg() require query rewrites to function correctly, PgDog’s ability to manage these computations across distributed data sets is a significant advantage.
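A short sketch shows why count, min, max, and sum combine directly across shards while avg() needs a rewrite. The function and dictionary layout are hypothetical; the sketch assumes every shard returned at least one row.

```python
def merge_aggregates(partials):
    """Combine per-shard partial aggregates into cluster-wide values.

    Each partial is a dict with "count", "sum", "min", and "max" for one shard.
    """
    total_count = sum(p["count"] for p in partials)
    total_sum = sum(p["sum"] for p in partials)
    return {
        "count": total_count,  # counts add up
        "sum": total_sum,      # sums add up
        "min": min(p["min"] for p in partials),
        "max": max(p["max"] for p in partials),
        # avg cannot be merged from per-shard averages; it is derived
        # from the merged sum and count instead.
        "avg": total_sum / total_count if total_count else None,
    }


partials = [
    {"count": 2, "sum": 10, "min": 3, "max": 7},
    {"count": 3, "sum": 30, "min": 5, "max": 15},
]
result = merge_aggregates(partials)
```

Here the per-shard averages are 5 and 10, and averaging those gives 7.5, while the true average is 40 / 5 = 8. Rewriting avg() into a sum and a count per shard avoids that error.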

What Are the Limitations of PgDog?

While PgDog covers a vast range of scalability and routing needs, certain limitations exist. Full support for advanced aggregates and GROUP BY queries involving hidden fields is still under development. Additionally, PgDog does not currently implement two-phase commit protocols for cross-shard transactional integrity, although work is underway to introduce this capability.

How Does PgDog Achieve Speed Comparable to NoSQL Systems?

By employing a share-nothing architecture where each shard operates independently, PgDog enables horizontal scaling similar to NoSQL systems. Data and query operations are parallelized across shards, allowing PostgreSQL to achieve performance levels traditionally associated with key-value stores, without sacrificing the advantages of a relational database.

How Flexible Is PgDog’s Configuration?

PgDog is designed for maximum flexibility and ease of use. Load balancing features activate automatically when multiple databases are present, and administrators can fine-tune load balancing strategies to match their specific needs. Whether organizations require random query distribution, active connection tracking, or orderly round robin routing, PgDog’s configuration options are intuitive and adaptable.

How Transparent Is PgDog for Application Developers?

One of PgDog’s primary goals is to minimize the impact of scaling on application development. Developers do not need to modify their applications or database access layers. PgDog transparently manages sharding, routing, and load balancing behind the scenes, allowing developers to focus on building features rather than worrying about database scaling complexities.

Why Should Companies Choose PgDog for Scaling PostgreSQL?

For companies looking to scale PostgreSQL without compromising reliability, performance, or developer productivity, PgDog offers a compelling solution. It combines the ease-of-use of managed services, the power of SQL, and the scalability of modern distributed architectures. With minimal operational overhead, support for existing deployments, and innovative features tailored to real-world needs, PgDog positions itself as the missing link in PostgreSQL’s evolution toward limitless scalability.

What Is the Future for PgDog?

With continued development focused on enhancing cross-shard transaction support, expanding aggregate function capabilities, and further refining performance optimizations, PgDog is poised to become a critical infrastructure component for organizations relying on PostgreSQL. As demand for high-performance, scalable relational databases grows, PgDog’s approach to effortless scaling will likely set a new industry standard.

In short, with PgDog, PostgreSQL doesn’t just scale—it scales elegantly, transparently, and powerfully, unlocking new possibilities for developers and businesses alike.