I used to agree, but I recently tried out ClickHouse for high-ingestion-rate time series data in the financial sector and I’m super impressed by it. Postgres was struggling, so we migrated.
This isn’t to say that it’s better overall by any means, but simply that I did actually find a better tool at a certain limit.
I’ve been using ClickHouse too and it’s significantly faster than Postgres for certain analytical workloads. I benchmarked it: for one query on the OpenFoodFacts dataset (~9GB), Postgres took 47 seconds while ClickHouse finished within 700ms. Interestingly enough, TimescaleDB (a Postgres extension) took 6 seconds.
All actions were performed through DataGrip.
1 Insertion speed is influenced by reduced networking overhead due to the databases being in-process.
Updates and deletes don’t work as well as in Postgres, and not being able to perform an upsert can be quite annoying. However, I’ve found the ReplacingMergeTree and AggregatingMergeTree table engines to be good replacements so far.
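To illustrate the ReplacingMergeTree idea for anyone who hasn't used it: instead of an upsert you insert a newer version of the row, and rows sharing the same sorting key are collapsed to the latest version when parts get merged in the background. Here is a rough toy model of that behaviour in plain Python. It is not ClickHouse code, and the `Row` fields and the `merge_replacing` helper are made up for the example.

```python
from typing import Dict, Iterable, List, NamedTuple


class Row(NamedTuple):
    key: str       # stands in for the sorting key (hypothetical: a ticker symbol)
    version: int   # stands in for the version column; the highest version wins
    price: float


def merge_replacing(rows: Iterable[Row]) -> List[Row]:
    """Keep one row per key, the way a background merge would collapse them."""
    latest: Dict[str, Row] = {}
    for row in rows:
        current = latest.get(row.key)
        # ">=" so that among equal versions the most recently inserted row wins
        if current is None or row.version >= current.version:
            latest[row.key] = row
    return sorted(latest.values(), key=lambda r: r.key)


inserted = [
    Row("AAPL", 1, 189.10),
    Row("MSFT", 1, 410.50),
    Row("AAPL", 2, 189.35),  # the "upsert": just insert a newer version
]
merged = merge_replacing(inserted)
```

One caveat worth knowing: the collapse only happens when a merge runs, so until then a query can still see both versions of a row unless it deduplicates at read time.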
Also there’s !clickhouse@programming.dev
If you can, share your experience!
I also do finance, so if there is anything more to explore, I’m here to listen and learn.
ClickHouse gives you a unique performance gain when your system isn’t holding operational data that is normalized and updated often, but rather tables of time series data that are ingested write-only.
For example, stock prices or order books in real time, at tens of thousands of records per second. ClickHouse can write, merge, and aggregate records really nicely.
Then selects with aggregates against the ordered data are lightning fast. It has lots of nuances to learn and some really powerful capabilities, but only for this type of use case.
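A rough way to picture why aggregates over ordered data are fast: if rows are stored sorted by the key you filter on, the matching range can be located by binary search and only that slice needs to be read. The Python sketch below is only an analogy using an in-memory list, not how ClickHouse is implemented, and the `ticks` data and `avg_price` helper are invented for the example.

```python
from bisect import bisect_left, bisect_right

# Rows kept sorted by (symbol, timestamp), mimicking a table ordered by that key.
ticks = sorted([
    ("AAPL", 1, 189.10),
    ("AAPL", 2, 189.35),
    ("AAPL", 3, 189.20),
    ("MSFT", 1, 410.50),
    ("MSFT", 2, 411.00),
])


def avg_price(rows, symbol, t_start, t_end):
    """Average price for one symbol over [t_start, t_end], found by binary search."""
    # A shorter tuple sorts before any row sharing its prefix, so this is the
    # first row with (symbol, timestamp) >= (symbol, t_start).
    lo = bisect_left(rows, (symbol, t_start))
    # Infinity as the price makes this sort after every row at t_end.
    hi = bisect_right(rows, (symbol, t_end, float("inf")))
    window = rows[lo:hi]
    if not window:
        return None
    return sum(price for _, _, price in window) / len(window)
```

Calling `avg_price(ticks, "MSFT", 1, 2)` only touches the two MSFT rows; the AAPL rows are skipped without being scanned.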
It doesn’t have atomic transactions, and updates and deletes perform very poorly.
For high ingestion (really high) you have to start sharding. It’s nice to have a DB that can do that natively; MongoDB and Influx are very popular, depending on the exact application.
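For anyone unfamiliar with sharding, the core idea is that each record is routed to one of several nodes based on a key, so writes spread out across machines. A minimal sketch of hash-based routing in Python follows, assuming the ticker symbol as the sharding key; `shard_for` and `route` are hypothetical helpers, and a database with native sharding handles this step for you.

```python
import zlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (same key, same shard)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def route(rows, num_shards):
    """Group (symbol, price) rows by the shard their symbol hashes to."""
    batches = defaultdict(list)
    for symbol, price in rows:
        batches[shard_for(symbol, num_shards)].append((symbol, price))
    return dict(batches)


batches = route([("AAPL", 189.10), ("MSFT", 410.50), ("AAPL", 189.35)], num_shards=4)
```

Because the hash is stable, all rows for one symbol land on the same shard, which keeps per-symbol queries local to a single node.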