> sudo echo $NUM_PAGES > /proc/sys/vm/nr_hugepages
This won't work :) echo will run as root but the redirection is still running as the unprivileged user. Needs to be run from a privileged shell or by doing something like sudo sh -c "echo $NUM_PAGES > /proc/sys/vm/nr_hugepages"
The point gets across, though, technicality notwithstanding.
Or echo $NUM_PAGES | sudo tee /proc/sys/vm/nr_hugepages
I've always found it odd that there isn't a standard command to write stdin to a file that doesn't also write it to stdout. Or that tee doesn't have an option to suppress writing to stdout.
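For what it's worth, the usual workaround is just to throw tee's stdout away yourself; a minimal sketch, reusing $NUM_PAGES from the quoted command:

    # tee (running as root) does the privileged write; its copy of stdin goes to /dev/null
    echo "$NUM_PAGES" | sudo tee /proc/sys/vm/nr_hugepages > /dev/null

(moreutils' sponge is about the closest thing to a standalone "write stdin to a file" command, though it's hardly standard.)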
Seems like a prime example of a service that naturally peaks at round hours.
We have a habit of never scheduling long-running processes at round hours, usually because they tend to be busier.
https://hakibenita.com/sql-tricks-application-dba#dont-sched...
I wish more applications would adopt the "H" option that Jenkins uses in its cron notation - essentially it is a randomiser, based on some sort of deterministic hashing function. So you say you want this job to run hourly and it will always run at the same minute past the hour, but you don't know (or care) what that minute is. Designed to prevent the thundering herd problem with scheduled work.
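A minimal sketch of the idea outside Jenkins (the job name and hashing scheme here are made up, not Jenkins' actual implementation): hash the job name into a stable minute, so every job gets a fixed but effectively random offset.

    # derive a stable minute (0-59) from the job name; same name => same minute every run
    JOB="nightly-report"                                          # hypothetical job name
    MINUTE=$(( 0x$(printf '%s' "$JOB" | md5sum | cut -c1-8) % 60 ))
    echo "$MINUTE * * * * /usr/local/bin/$JOB"                    # hourly, offset by a hash-chosen minute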
Note that they were running Postgres on a 32-CPU box with 256 GB of RAM.
I'm actually surprised that it handled that many connections. The data implies that they have 4000 new connections/sec...but is it 4000 connections handled/sec?
I'm a bit confused here: do they have a single database they're writing to? Wouldn't it be easier and more reliable to shard the data per customer?
I wouldn't call that "easier" per se.
Isn't this kind of the reason teams tend to put database proxies in front of their Postgres instances, to handle massive sudden influxes of potentially short-lived connections?
This sounds exactly like the problem tools like pgbouncer were designed to solve. If you're on AWS, you could look at RDS Proxy.
The article is very well written but is somewhat lacking at the end.
The conclusion lists pgbouncer as one of the solutions but it does not explain it clearly.
> Many pieces of wisdom in the engineering zeitgeist are well preached but poorly understood. Postgres connection pooling falls neatly into this category. In this expedition we found one of the underlying reasons that connection pooling is so widely deployed on postgres systems running at scale. [...] an artificial constraint that has warped the shape of the developer ecosystem (RDS Proxy, pgbouncer, pgcat, etc) around it.
The artificial constraint is the single-core nature of the postmaster.
Other points at the end of the article that can be improved:
> we can mechanically reason about a solution.
Mechanically as in letting an AI find a solution, or as in reasoning like a mechanic, or? Furthermore:
> * Implementing jitter in our fleet of EC2 instances reduced the peak connection rate
How? Did they wait a random number of milliseconds before sending queries to the db? (Something like the sketch after these questions, maybe?)
> * Eliminating bursts of parallel queries from our API servers
How?
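On the jitter question, my guess (purely illustrative; the article doesn't say how they implemented it) is a wrapper along these lines on each instance, so the top-of-the-interval spike gets smeared out:

    # hypothetical wrapper: sleep a random 0-299 seconds, then run the real job
    sleep $(( RANDOM % 300 ))
    exec /usr/local/bin/run-report    # made-up job name; this is where it connects to Postgres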
> Mechanically as in letting an AI find a solution, or as in reasoning like a mechanic, or?
As in it's fully characterized, so you can use only math and logic rather than relying on experience and guesswork.
Also check out ProxySQL [1][2]; it's an extremely powerful and battle-tested proxy. Originally it was only for MySQL/MariaDB, where it is very widely used at scale, despite MySQL already having excellent built-in scalable threaded connection management. But ProxySQL added Postgres support in 2024, and that has become a major focus.
[1] https://proxysql.com/
[2] https://github.com/sysown/proxysql
And lets you rewrite queries on the fly. :)
One of the many problems PgDog will solve for you!
The article addresses this, sort of. I don't understand how you can run multiple postmasters.
> Most online resources chalk this up to connection churn, citing fork rates and the pid-per-backend yada, yada. This is all true but in my opinion misses the forest from the trees. The real bottleneck is the single-threaded main loop in the postmaster. Every operation requiring postmaster involvement is pulling from a fixed pool, the size of a single CPU core. A rudimentary experiment shows that we can linearly increase connection throughput by adding additional postmasters on the same host.
You don't need multiple postmasters to spawn connection processes, if you have a set of Postgres proxies each maintaining a set pool of long-standing connections, and parceling them out to application servers upon request. When your proxies use up all their allocated connections, they throttle the application servers rather than overwhelming Postgres itself (either postmaster or query-serving systems).
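To make that concrete, a rough pgbouncer-style config for the model described above (the host, database name, and sizes are invented, not from the article):

    [databases]
    appdb = host=127.0.0.1 port=5432 dbname=appdb

    [pgbouncer]
    listen_port = 6432
    ; app servers connect here; up to 5000 clients share ~50 real server connections
    max_client_conn = 5000
    default_pool_size = 50
    ; transaction pooling hands a server connection out per transaction, then reclaims it
    pool_mode = transaction

Clients that can't get a server connection wait in the proxy's queue, which is exactly the throttling behaviour described above.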
That said, proxies aren't perfect. https://jpcamara.com/2023/04/12/pgbouncer-is-useful.html outlines some dangers of using them (particularly when you might need session-level variables). My understanding is that PgDog does more tracking that mitigates some of these issues, but some of these are fundamental to the model. They're not a drop-in component the way other "proxies" might be.
Very stupid question: similar to how we had a GIL replacement in Python, can't we replace the postmaster with something better?
Specifically on the cost of forking a process for each connection (vs using threads), there are active efforts to make Postgres multi-threaded.
Since Postgres is a mature project, this is a non-trivial effort. See the Postgres wiki for some context: https://wiki.postgresql.org/wiki/Multithreading
But I'm hopeful that 2-3 years from now we'll see this bear fruit. The recent asynchronous read I/O improvements in Postgres 18 show that Postgres can evolve; one just needs to be patient, potentially help contribute, and find workarounds (connection pooling, in this case).
Maybe this is silly, but these days cloud resources are so cheap. Just loading up instances, putting this stuff into memory, and processing it is so fast and scalable. Even if you have billions of things to process daily, you can just split it if needed.
You can keep things synced across databases easily and keep it super duper simple.
I think this is the kind of investigation that AI can really accelerate. I imagine it did. I would love to see someone walk through a challenging investigation assisted by AI.