Having built a ticketing system that sold some Oasis-level concerts, there are a few misconceptions here:
Selling an event out frequently takes a long time, because tickets are very often not actually purchased--they're just reserved and then fall back into open seating. This is done by true fans, but also frequently by bots run by professional brokers or amateur resellers. And Cloudflare and every other state-of-the-art bot-detection platform don't detect them. Hell, some of the bots are built on Cloudflare Workers themselves, in my experience...
So whatever velocity you achieve in the lab--in the real world you'll do a fraction of it when it comes to actual purchases. That depends upon the event really. Events that fly under the radar may get you a higher actual conversion rate.
Also, an act like Oasis is going to have a lot of reserved seating. Running through algorithms to find contiguous seats is going to be tougher than this example, and it's difficult to parallelize if you're truly giving the next person in the queue the actual best seats remaining (see the sketch after this comment).
There are also many other business rules that accrue after years of adding features to win Oasis-like business, and unfortunately they will result in more DB calls and add contention.
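As a toy illustration of the contiguous-seat point above (not how any real ticketing system does it), here is a minimal Go sketch of scanning one row's availability for a run of adjacent free seats. The row-as-boolean-slice layout is an assumption; real allocators also weigh price tiers, sight lines and "best remaining" ordering, which is exactly what makes serving the queue strictly in order hard to parallelize.

    package main

    import "fmt"

    // findContiguous returns the starting index of the first run of `need`
    // adjacent free seats in one row, or -1 if no such run exists.
    func findContiguous(rowFree []bool, need int) int {
        run := 0
        for i, free := range rowFree {
            if free {
                run++
                if run == need {
                    return i - need + 1
                }
            } else {
                run = 0
            }
        }
        return -1
    }

    func main() {
        row := []bool{true, false, true, true, true, false, true}
        fmt.Println(findContiguous(row, 3)) // prints 2
    }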
Agree with the above. We built and run a ticketing platform, and the actual transaction of purchasing the ticket at the final step in the funnel is not the bottleneck.
The shopping process and queuing process put considerably more load on our systems than the final purchase transaction, which ultimately is constrained by the size of the venue, which we can control by managing the queue throughput (see the sketch after this comment).
Even with a queue system in place, you inevitably end up with the thundering herd problem when ticket sales open, as a large majority of users will refresh their browsers regardless of instructions to the contrary.
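To make "managing the queue throughput" concrete, here is a rough Go sketch with made-up types and rates (an illustration, not anyone's production code): waiting users are admitted into the shopping flow at a fixed rate, so downstream load follows the admission rate rather than raw demand.

    package main

    import (
        "fmt"
        "time"
    )

    // releaseFromQueue admits waiting users into the shopping flow at a fixed
    // rate, so downstream load is bounded by admissions, not by raw demand.
    func releaseFromQueue(waiting <-chan string, admitPerSecond int, admit func(string)) {
        ticker := time.NewTicker(time.Second / time.Duration(admitPerSecond))
        defer ticker.Stop()
        for user := range waiting {
            <-ticker.C // wait for the next admission slot
            admit(user)
        }
    }

    func main() {
        waiting := make(chan string, 3)
        for _, u := range []string{"alice", "bob", "carol"} {
            waiting <- u
        }
        close(waiting)
        releaseFromQueue(waiting, 2, func(u string) { fmt.Println("admitted", u) })
    }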
Does that mean that there is some smoke and mirrors when, e.g., Taylor Swift says they sold out the concert in minutes? Or are the mega acts truly in that much demand?
You can get the seats into "baskets" (reserved) in minutes. In my experience they will not sell out for some time, as they usually keep dropping back into inventory. "Sold out" is a matter of opinion: there are usually lots of single seats left, sometimes for weeks or months. The promoter decides when to label the event as "sold out".
I recently did performance testing of Tigerbeetle for a financial transactions company. The key thing to understand about Tigerbeetle's speed is that it achieves its very high throughput by batching transactions.
----
In our testing:
For batch transactions, Tigerbeetle delivered truly impressive speeds: ~250,000 writes/sec.
For processing transactions one by one, we found a large slowdown: ~105 writes/sec.
This is much slower than PostgreSQL, which handled individual row updates at ~5,495/sec. (However, in practice PostgreSQL row updates will be much lower in real-world OLTP workloads due to hot fee accounts and aggregate accounts for sub-accounts.)
One way to keep those faster speeds in Tigerbeetle for real-time workloads is to micro-batch incoming real-time transactions, sending them to Tigerbeetle on an interval of a second or less, to take advantage of its blazing-fast batch processing. Nonetheless, this remains an important caveat to understand about its speed.
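A minimal Go sketch of that micro-batching idea: request handlers hand transfers to a single batcher goroutine, which flushes when the batch fills or a short timer fires, whichever comes first. The Transfer type, the flush callback (which would wrap one batched create-transfers call on a shared client), and the 8,189 / 10 ms numbers are all illustrative assumptions.

    package main

    import (
        "fmt"
        "time"
    )

    // Transfer is a stand-in for a real transfer record.
    type Transfer struct{ ID uint64 }

    // batcher collects incoming transfers and flushes them as one batch when
    // either maxBatch is reached or maxWait elapses, whichever comes first.
    func batcher(in <-chan Transfer, maxBatch int, maxWait time.Duration, flush func([]Transfer)) {
        var batch []Transfer
        timer := time.NewTimer(maxWait)
        defer timer.Stop()
        for {
            select {
            case t, ok := <-in:
                if !ok { // channel closed: flush what's left and stop
                    if len(batch) > 0 {
                        flush(batch)
                    }
                    return
                }
                batch = append(batch, t)
                if len(batch) >= maxBatch {
                    flush(batch)
                    batch = nil
                }
            case <-timer.C:
                if len(batch) > 0 {
                    flush(batch)
                    batch = nil
                }
                timer.Reset(maxWait)
            }
        }
    }

    func main() {
        in := make(chan Transfer)
        go batcher(in, 8189, 10*time.Millisecond, func(b []Transfer) {
            // in a real setup this would be one batched call on a long-lived client
            fmt.Println("flushing", len(b), "transfers")
        })
        for i := 0; i < 3; i++ {
            in <- Transfer{ID: uint64(i)}
        }
        time.Sleep(30 * time.Millisecond) // let the timer-driven flush fire
        close(in)
    }

A production version would also hand each caller a reply channel so the request doesn't return until its transfer was actually accepted as part of a flushed batch.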
Doesn't the Tigerbeetle client automatically batch requests?
We didn't observe any automatic batching when testing Tigerbeetle with their Go client. I think we initiated a new Go client for every new transaction when benchmarking, which is typically how one uses such a client in app code. This ties into our other complaint: it handles so little that you will have to roll a lot of custom logic around it to batch real-time transactions quickly.
Interesting, I thought I had heard that this is automatically done, but I guess it's only through concurrent tasks/threads. It is still necessary to batch in application code.
https://docs.tigerbeetle.com/coding/clients/go/#batching
Nonetheless, it seems weird to test it with singular queries, because Tigerbeetle's whole point is shoving 8,189 items into the DB as fast as possible. So if you populate that buffer with only one item, you're throwing away all that space and efficiency.
We certainly are losing that efficiency, but this is typically how real-time transactions work. You write real-time endpoints to send off transactions as they come in. Needing to roll more than that is a major introduction of complexity.
We concluded that where Tigerbeetle really shines is when you're a large entity like a central bank or corporation sending massive transaction files between entities. Tigerbeetle is amazing for moving large numbers of batch transactions at once.
We found other quirks with Tigerbeetle that made it difficult as a drop-in replacement for handling transactions in PostgreSQL. E.g. Tigerbeetle's primary key isn't a UUIDv7 or ULID; it's a custom ID they engineered for performance. The most metadata you can save on a transaction is a 128-bit unsigned integer in the user_data_128 field. While this lets them achieve lightning-fast batch processing benchmarks, the database allows you to save so little metadata that you risk being bottlenecked by all the attributes you'll need to wrap around the transaction in PostgreSQL to make it work in a real application.
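One workaround (a sketch under my own assumptions, not an official Tigerbeetle pattern) is to keep the rich attributes in a PostgreSQL side table keyed by the transfer's 128-bit ID, so Tigerbeetle carries only the ledger movement. The table and column names below are invented, and the driver is Go's database/sql with lib/pq.

    package ledger

    import (
        "context"
        "database/sql"

        _ "github.com/lib/pq" // registers the "postgres" driver for database/sql
    )

    // saveTransferMetadata stores the application attributes that don't fit in
    // Tigerbeetle's user_data_128/64/32 fields, keyed by the transfer's 128-bit
    // ID (passed here in its decimal string form for simplicity).
    func saveTransferMetadata(ctx context.Context, db *sql.DB, transferID, orderID, memo string) error {
        _, err := db.ExecContext(ctx,
            `INSERT INTO transfer_metadata (transfer_id, order_id, memo)
             VALUES ($1, $2, $3)`,
            transferID, orderID, memo)
        return err
    }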
I'm a bit worried you think instantiating a new client for every request is common practice. If you did that to Postgres or MySQL clients, you would also have degradation in performance.
PHP created mysqli and PDO to deal with this specifically, because it's known to be expensive to recreate client connections on every request.
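Same principle in Go with database/sql: open one pooled handle at startup and share it across request handlers instead of constructing a client per request. The DSN, table and route below are placeholders.

    package main

    import (
        "database/sql"
        "log"
        "net/http"

        _ "github.com/lib/pq" // registers the "postgres" driver for database/sql
    )

    var db *sql.DB // one pooled handle shared by all request handlers

    func main() {
        var err error
        // sql.Open returns a connection pool; it is created once, not per request.
        db, err = sql.Open("postgres", "postgres://user:pass@localhost/app?sslmode=disable")
        if err != nil {
            log.Fatal(err)
        }
        db.SetMaxOpenConns(20)

        http.HandleFunc("/buy", func(w http.ResponseWriter, r *http.Request) {
            // handlers reuse the shared pool instead of dialing a new client
            _, err := db.ExecContext(r.Context(),
                "UPDATE tickets SET sold = true WHERE id = $1", r.URL.Query().Get("id"))
            if err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            w.WriteHeader(http.StatusNoContent)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }

A long-lived Tigerbeetle client would be handled the same way: created once at startup and reused by every handler.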
Did the company end up using it?
We didn't rule out using Tigerbeetle, but the drop in non-batch performance was disappointing and a reason we haven't prioritised switching our transaction ledger from PostgreSQL to Tigerbeetle.
There was also poor Ruby support for Tigerbeetle at the time, but that has improved recently and there is now a (3rd party) Ruby client: https://github.com/antstorm/tigerbeetle-ruby/
It seems to me that, in practice, you'd want the "LiveBatcher" to have some durability as well. Is there a scenario where a customer could lose their place because of a horribly timed server shutdown, where those transfers hadn't even been sent to TigerBeetle as pending yet? Or am I misunderstanding the architecture here?
Edit: Yes, I think I misunderstood something here. The user wouldn't even see their request return a valid "pending" ticket sale, since the batcher runs while the request is active. The request won't return until its own transfer has been sent off to TigerBeetle as pending.
Why can't this be done with PostgreSQL?
Obligatory Jepsen report https://jepsen.io/analyses/tigerbeetle-0.16.11
Is FastAPI just bad with SQLite? I would have expected SQLite to smoke Postgres in terms of ops/s.
I think Python is bad in general if you want “high-performance”
SQLite is in-process, but concurrent write performance is a complex matter: https://sqlite.org/wal.html
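For reference, a minimal sketch of turning on WAL mode from Go, assuming the mattn/go-sqlite3 driver (writes are still serialized; WAL mainly lets readers proceed alongside a single writer).

    package main

    import (
        "database/sql"
        "fmt"
        "log"

        _ "github.com/mattn/go-sqlite3" // CGo SQLite driver; others work too
    )

    func main() {
        db, err := sql.Open("sqlite3", "app.db")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // The PRAGMA returns the resulting journal mode as a single row.
        var mode string
        if err := db.QueryRow("PRAGMA journal_mode=WAL;").Scan(&mode); err != nil {
            log.Fatal(err)
        }
        fmt.Println("journal_mode:", mode) // expect "wal"
    }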
Yes, that's why I would expect it to smoke Postgres here; in-process is orders of magnitude faster. Do you really need concurrency here when you can do 10-100k+ inserts per second?
If 100k users each hit the purchase button at the same time, will SQLite write it all within 1 second?
That's different from 1 user doing the purchase for 100k fans.
Also surprised. My yardstick was this post which showed SQLite beating Postgres in a Django app. Benchmarking is hard, and the author said the Postgres results were not tuned to the same degree as SQLite, so buyer beware. https://blog.pecar.me/django-sqlite-benchmark