Scaling PostgreSQL for analytics data world

Author: @Potturi Prathap Chowdary

Introduction

One of the true pillars of the tech revolution, PostgreSQL is an OLTP database designed primarily to handle transactional workloads. The technology has been around for over 25 years and has clearly demonstrated that it's stable, performant, and feature-rich enough to support significant analytical workloads too.

It's not uncommon for organizations to use two separate systems to handle transactional and analytical workloads. While transactional workloads demand millisecond speeds, analytical workloads focus on complex queries on large volumes of historical data. The infrastructure requirements to deliver on this have always resulted in the adoption of two different types of databases and large ETL pipelines.

This blog is an attempt to showcase that PostgreSQL can be scaled for various analytical use cases. We will go through the journey below. Grab a cup of coffee and let's dive in :)

Let's revise what OLTP and OLAP are

OLTP systems excel at millisecond response times and simple queries such as single-record selects, inserts, updates, and deletes. Ex: Find by X, Update X for Y.

OLAP, or Online Analytical Processing, refers to analytical databases that handle complex queries and large-scale scans. Users accept longer response times, since the system is querying across larger volumes of historical data to discover and interpret findings. Ex: a month-to-month report of all company sales.

Challenges of separate systems for both

  • Too many disparate moving (literally) systems lead to inconsistencies

  • Moving data is slow and painful

  • Delayed development & higher infra spending

HTAP

Here come HTAP database solutions. Hybrid transactional/analytical processing (HTAP) is a data architecture that joins online transactional processing (OLTP) and online analytical processing (OLAP) workloads, allowing one system to support both. Companies already offering this in some form (paid offerings):

  • Snowflake Unistore

  • Singlestore (aka MemSQL)

  • Citus (Acquired by Microsoft)

  • Hydra (Y Combinator '21 batch)

How HTAP works under the hood

The ideal would be a single store supporting both OLTP and OLAP workloads, but we know this is very difficult to achieve. What an HTAP solution does instead is this: when data is stored in a hybrid table to support fast single-row transactions, it is also copied automatically into a columnar format. This means transactional data is also (almost) immediately available for analytics. When querying large quantities of data, going through the columnar format can make the query up to 50x faster. Writing to two different data stores internally, supporting ACID transactions, and ensuring strong consistency for the internal replication are obvious challenges, and we may in fact pay a price in performance.
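
To make this concrete, here is a rough sketch of the same pattern expressed in plain PostgreSQL terms, using the Citus columnar storage we set up later in this post. The orders and orders_columnar tables are hypothetical, and this is only an approximation of the idea, not how the commercial products above implement it internally.

-- row-oriented (heap) table: fast single-row inserts/updates for OLTP
CREATE TABLE orders (
    order_id    bigint PRIMARY KEY,
    customer_id int,
    amount      numeric,
    created_at  timestamptz DEFAULT now()
);

-- columnar copy of the same data: scan- and compression-friendly for OLAP
CREATE TABLE orders_columnar (
    order_id    bigint,
    customer_id int,
    amount      numeric,
    created_at  timestamptz
) USING columnar;

-- periodically append newly arrived rows to the columnar copy
-- (later sections show an upsert-based variant of this pattern)
INSERT INTO orders_columnar
SELECT * FROM orders
WHERE created_at >= now() - interval '1 minute';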

How we can tune PostgreSQL for analytics use cases

We have to understand our data and query patterns to apply various techniques and get the best out of PostgreSQL. Here we will go through the following cases:

  1. TimescaleDB extension of PostgreSQL for time series data analytics

  2. Citus / cstore_fdw for columnar storage and sharding the data

Things in action

Enough talking, let's look at PostgreSQL in action. To test out what we discussed so far, I quickly fetched historic Indian stocks (F&O) data for the past 10 years. I had to persist more than 100M records in PostgreSQL running on a humble MacBook Air M1 (8 GB RAM, 256 GB SSD). PostgreSQL version 14 and the Datadog agent were set up locally to benchmark the various cases.

Tech stack
PostgreSQL, TimescaleDB, Citus, Docker, Datadog

Hardware
Apple M1 silicon, 8 GB RAM, 256 GB SSD

Leap#1: TimescaleDB (An extension of PostgreSQL)

TimescaleDB is an open-source database designed to make SQL scalable for time-series data. It is engineered from PostgreSQL and packaged as a PostgreSQL extension, providing automatic partitioning across time and space.

Let’s try to understand why we will need it by trying to operate on a huge dataset for both insertions and queries.

Usecase#1: Write Performance

Consider we are dealing with historic stock minute data (each minute record has: stock_name, open, high, low, close, candle_at). Let's try saving 50M records to a table in PostgreSQL.
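
For reference, the table in question could look roughly like this (the column names are the ones listed above; the types are my assumptions):

CREATE TABLE historic_candle_sticks (
    stock_name text,
    open       numeric,
    high       numeric,
    low        numeric,
    close      numeric,
    candle_at  timestamptz
);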

We quickly hit the limits of the PostgreSQL server (running on our machine) after inserting around 10M records into our table.

This is where the TimescaleDB extension of PostgreSQL comes to our rescue with efficient partitioning techniques. The good thing about it is that it's an extension of PostgreSQL (not a fork) and works seamlessly on our existing PostgreSQL setup. All we need to do is install the TimescaleDB extension and convert our table to a hypertable (that's what TimescaleDB tables are called :)

# setup timescale (Timescale's Homebrew tap ships the extension and tooling)
brew tap timescale/tap
brew install timescaledb
timescaledb-tune            # adjusts postgresql.conf (shared_preload_libraries, memory)
timescaledb_move.sh         # copies the extension files into the PostgreSQL directories
brew services restart postgresql@14

# enable the extension and convert our table to a hypertable (run in psql)
CREATE EXTENSION IF NOT EXISTS timescaledb;
SELECT create_hypertable('historic_candle_sticks', 'candle_at', migrate_data => true);

We can see our ingestion pipeline stays stable without any degradation in performance, and I was able to insert 50M records into a single table with ease.

PostgreSQL can handle a much higher QPS; here I have rate-limited the ingestion service to stay within the rate limits set by the data provider.

Usecase#2: Read performance

Let's get to a complex analytical aggregation query: finding a 5-minute moving average for all stocks over a year.
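
The exact SQL used for the benchmark isn't reproduced here, but one reasonable formulation of such a query looks like this (the year boundaries are only illustrative; with 1-minute candles, a 5-row window per stock gives the 5-minute moving average):

SELECT stock_name,
       candle_at,
       avg(close) OVER (
           PARTITION BY stock_name
           ORDER BY candle_at
           ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
       ) AS moving_avg_5m
FROM historic_candle_sticks
WHERE candle_at >= '2022-01-01' AND candle_at < '2023-01-01';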

The above query did much better on the hypertable.

Leap#2: cstore_fdw (columnar storage for PostgreSQL)

What if we are dealing with non-time-series data? How can we scale PostgreSQL for analytics then? One attempt to solve this is ‘cstore_fdw’, a storage extension suited for OLAP-/DWH-style queries and data-intensive applications. It is now part of Citus, which has been acquired by Microsoft; Citus is available both as open source (for self-hosting) and as a cloud offering (Azure).

Let's revise what columnar storage is and when it is beneficial

Columnar storage for database tables is an important factor in optimizing analytical query performance because it drastically reduces overall disk I/O: it cuts down the amount of data you need to load from disk. Columnar analytical databases have unique characteristics compared to row-oriented data access. Columnar tables are organized differently from row tables: instead of rows being added one after another, rows are inserted in bulk into a stripe, and within each stripe the data from each column is stored next to each other.

Imagine rows of a table containing:

| a | b | c | d |

| a | b | c | d |

| a | b | c | d |

This data would be stored as follows in columnar:

| a | a | a |

| b | b | b |

| c | c | c |

| d | d | d |

  • Columnar is optimized for table scans — in fact, it doesn’t use indexes at all. Using columnar, it’s much quicker to obtain all data for a particular column. The database does not need to read data that you are not interested in.

  • It uses the metadata about the values in the chunks to eliminate reading data. This is a form of “auto-indexing” the data.

  • Because similar data is stored next to each other, very high data compression is possible. Data compression is an important benefit because columnar is often used to store huge amounts of data. By compressing the data, you can effectively read data more quickly from disk, which both reduces I/O and increases effective fetch speed. It also has the effect of making better use of disk caching as data is cached in its compressed form. Lastly, you greatly reduce your storage costs.
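
To make the compression point concrete, here is a minimal sketch using the Citus columnar access method that we set up in the next section (events_columnar is a hypothetical table used only for illustration):

-- hypothetical columnar table
CREATE TABLE events_columnar (
    event_id   bigint,
    payload    text,
    created_at timestamptz
) USING columnar;

-- per-table columnar settings, e.g. pick the compression codec
-- ('none', 'pglz', 'lz4' and 'zstd' are supported)
SELECT alter_columnar_table_set('events_columnar', compression => 'zstd');

-- check the resulting on-disk footprint
SELECT pg_size_pretty(pg_total_relation_size('events_columnar'));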

Usecase#1

Let’s try setting up citus using docker and inserting data in a table to benchmark query performance.

# citus setup and run
docker run -d --name citus -p 5432:5432 -e POSTGRES_PASSWORD=<password_here> citusdata/citus:12.0
docker exec -it citus psql -U postgres

# create the columnar table (column names/types are my assumptions, chosen to
# match the randomly generated CSV below)
CREATE TABLE page_views (
    tenant_id int,
    page_id int,
    host_ip inet,
    view_time timestamptz
) USING columnar;

# randomly generate 100M records and load them into the page_views table
\COPY (SELECT s % 307, (random()*5000)::int, '203.0.113.' || (s % 251), now() + random() * interval '60 seconds' FROM generate_series(1,100000000) s) TO '/tmp/views.csv' WITH CSV
\COPY page_views FROM '/tmp/views.csv' WITH CSV

Now, let's try running an aggregate query to fetch the view count grouped by tenant_id and page_id.
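
The benchmarked query itself isn't reproduced here, but it is along these lines:

SELECT tenant_id, page_id, count(*) AS view_count
FROM page_views
GROUP BY tenant_id, page_id
ORDER BY view_count DESC
LIMIT 10;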

In a little over 1s, we are able to see the result. We can even distribute tables (sharding) for better performance, as shown below.

CREATE TABLE daily_page_views (
    tenant_id int,
    day date,
    page_id int,
    view_count bigint,
    primary key (tenant_id, day, page_id)
);

SELECT create_distributed_table('daily_page_views', 'tenant_id');
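
One step worth spelling out: daily_page_views has to be populated from the raw page_views data before there is anything to query. A sketch of that rollup, following the pattern from the Citus documentation referenced at the end of this post:

-- roll the raw 100M rows up into daily per-tenant, per-page counts
INSERT INTO daily_page_views (tenant_id, day, page_id, view_count)
  SELECT tenant_id, view_time::date AS day, page_id, count(*) AS view_count
  FROM page_views
  GROUP BY tenant_id, view_time::date, page_id;

-- subsequent aggregations run against this much smaller, pre-aggregated,
-- distributed table, e.g.:
SELECT tenant_id, sum(view_count) AS total_views
FROM daily_page_views
GROUP BY tenant_id
ORDER BY total_views DESC
LIMIT 10;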

In less than 50ms we are able to aggregate 100M records. Well, that’s impressive.

Usecase#2

Let's come back to the stock data store (around 50M records) we persisted earlier. Let's try a few more queries and benchmark TimescaleDB and Citus to understand where columnar storage shines.

We previously queried the hypertable for the moving average of all stocks over a given period. Now we will try to do the same for a particular stock. In TimescaleDB, the data is partitioned by date, and the moment we filter on a column outside the partitioning scheme (the stock symbol in the WHERE clause), performance takes a hit.
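
For example (the symbol is only illustrative, and the column names follow the assumed schema from earlier):

SELECT candle_at,
       avg(close) OVER (
           ORDER BY candle_at
           ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
       ) AS moving_avg_5m
FROM historic_candle_sticks
WHERE stock_name = 'RELIANCE'
  AND candle_at >= '2022-01-01' AND candle_at < '2023-01-01';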

Now, the same query on Citus (columnar) performed noticeably better.

We typically want to keep aggregations up to date, even as the current day progresses. We can achieve this by expanding our original rollup command to consider only new rows and to update existing rows with the new data using ON CONFLICT ("upsert"): if we insert data for a primary key that already exists in the aggregation table, the counts are added together instead. With a simple PL/pgSQL function we can copy data from a regular table to our partitioned columnar table (for analytics) with less than a minute of delay, as sketched below.
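
Here is a minimal sketch of that incremental rollup. rollup_page_views is a hypothetical helper, not something Citus provides, and in practice you would track the last processed view_time instead of a fixed interval to avoid double counting:

CREATE OR REPLACE FUNCTION rollup_page_views(since timestamptz)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
  INSERT INTO daily_page_views (tenant_id, day, page_id, view_count)
    SELECT tenant_id, view_time::date AS day, page_id, count(*) AS view_count
    FROM page_views
    WHERE view_time >= since
    GROUP BY tenant_id, view_time::date, page_id
  ON CONFLICT (tenant_id, day, page_id) DO UPDATE
    SET view_count = daily_page_views.view_count + EXCLUDED.view_count;
END;
$$;

-- call it every minute, e.g. via pg_cron or an external scheduler:
-- SELECT rollup_page_views(now() - interval '1 minute');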

Future improvements

  • In some cases, we might need to maintain duplicate tables (an analytical query table separate from the transactional table). Context-aware query execution and data movement would reduce this redundant storage; Snowflake Unistore already does this well.

  • Citus parallel processing for even better aggregation results.

  • Easy-to-use tools for migrating data from other database solutions to PostgreSQL powered by TimescaleDB and Citus.

Conclusion

Not every use case needs complex ETLs and costly data warehouse solutions. If you look closely enough, the humble PostgreSQL ecosystem will surprise you for many of your analytical use cases. Several big tech firms are already embracing PostgreSQL to the core. Here you can find how Zerodha (a leading stock broker in India) uses PostgreSQL for reporting use cases with more than 20 TB of data.

The secret is understanding your data and query patterns, and then applying the techniques we learned so far. That's a wrap. Goodbye!!

References

  • GitHub - timescale/timescaledb: An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.

  • GitHub - citusdata/cstore_fdw: Columnar storage extension for Postgres built as a foreign data wrapper. See https://github.com/citusdata/citus for a modernized columnar storage implementation built as a table access method.

  • Making Postgres stored procedures 9X faster in Citus

  • Single-Node Citus - Citus 12.1 documentation