Exploiting SQL/JSON Enhancements for Modern Workloads in PostgreSQL 16

The latest iteration of PostgreSQL, version 16, brings a suite of enhancements that bolster its capabilities with JSON data. These improvements not only align PostgreSQL more closely with the SQL/JSON standard but also offer significant optimizations that streamline and enhance working with JSON data. This evolution is a testament to PostgreSQL’s commitment to meeting the demands of modern, data-driven applications that leverage semi-structured data for a myriad of use cases.

In this post, we’ll delve into the SQL/JSON enhancements introduced in PostgreSQL 16, illustrate their application with code examples, and explore real-world scenarios where these enhancements can be particularly beneficial.

To fully leverage the SQL code examples mentioned in the blog post, let’s create a sample table schema (DDL) and insert some sample data for two tables, products and company_data. This setup will allow you to experiment with the new SQL/JSON enhancements in PostgreSQL 16 firsthand.

Table products

This table will store product details, including a unique identifier, the product name, attributes in JSON format, and additional product information in JSON format.

DDL for products:

CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
product_name TEXT NOT NULL,
product_attributes JSONB,
product_info JSONB
);

Sample Data for products:

INSERT INTO products (product_name, product_attributes, product_info) VALUES
('Laptop', '{"color": "#C0FF33", "memory": "16GB"}', '{"specifications": {"color": "#FF5733", "processor": "Intel i7", "storage": "512GB SSD"}}'),
('Smartphone', '{"color": "#FF5733", "memory": "8GB"}', '{"specifications": {"color": "#FF5733", "processor": "Snapdragon 888", "storage": "256GB UFS 3.1"}}');

Table company_data

This table will simulate employee data within a company, including an employee ID, their name, and their department.

DDL for company_data:

CREATE TABLE company_data (
employee_id SERIAL PRIMARY KEY,
employee_name TEXT NOT NULL,
department TEXT NOT NULL
);

Sample Data for company_data:

INSERT INTO company_data (employee_name, department) VALUES
('Alice Johnson', 'Engineering'),
('Bob Smith', 'Engineering'),
('Catherine Lee', 'Product'),
('David Brown', 'Sales');

Key PostgreSQL 16 SQL/JSON Improvements

Enhanced Numeric Literals in SQL/JSON Paths

PostgreSQL 16 introduces more readable path expressions by supporting hexadecimal, octal, and binary integers, along with the use of underscores in large numbers to improve readability. This feature is a boon for developers working with complex JSON data structures, allowing for clearer and more intuitive queries.

Example:

-- Using hexadecimal numeric literals for path expressions
SELECT product_info -> 'specifications' -> 'color' as color_code
FROM products
WHERE product_info -> 'specifications' ->> 'color' = '#FF5733';
color_code
------------
"#FF5733"
"#FF5733"
(2 rows)

New SQL/JSON Constructors

Creating JSON objects and arrays directly within SQL queries is now more straightforward with the introduction of JSON_OBJECT, JSON_ARRAY, JSON_OBJECT_AGG, and JSON_ARRAY_AGG. These constructors simplify the process of generating JSON data from relational data, enabling more efficient data transformation and aggregation.

Example:

SELECT department, JSON_OBJECT(employee_id::TEXT: employee_name) AS employees
FROM company_data;
department | employees
-------------+-------------------------
Engineering | {"1" : "Alice Johnson"}
Engineering | {"2" : "Bob Smith"}
Product | {"3" : "Catherine Lee"}
Sales | {"4" : "David Brown"}
(4 rows)

SELECT department, JSON_OBJECTAGG(employee_id::TEXT : employee_name) AS employees
FROM company_data
GROUP BY department;
department | employees
-------------+----------------------------------------------
Product | { "3" : "Catherine Lee" }
Engineering | { "1" : "Alice Johnson", "2" : "Bob Smith" }
Sales | { "4" : "David Brown" }
(3 rows)

SELECT department, JSON_ARRAY(employee_name) AS employees
FROM company_data;
department | employees
-------------+-------------------
Engineering | ["Alice Johnson"]
Engineering | ["Bob Smith"]
Product | ["Catherine Lee"]
Sales | ["David Brown"]
(4 rows)

SELECT department, JSON_ARRAYAGG(employee_name) AS employees
FROM company_data GROUP BY department;
department | employees
-------------+--------------------------------
Product | ["Catherine Lee"]
Engineering | ["Alice Johnson", "Bob Smith"]
Sales | ["David Brown"]
(3 rows)

SQL/JSON Object Checks

The addition of IS JSON predicates improves data integrity and validation by allowing queries to check if a column’s data adheres to expected JSON structures. This feature is particularly useful for ensuring data quality and consistency.

Example:

SELECT * 
FROM products
WHERE product_info IS JSON OBJECT;

product_id | product_name | product_attributes | product_info
------------+--------------+----------------------------------------+-----------------------------------------------------------------------------------------------------
1 | Laptop | {"color": "#C0FF33", "memory": "16GB"} | {"specifications": {"color": "#FF5733", "storage": "512GB SSD", "processor": "Intel i7"}}
2 | Smartphone | {"color": "#FF5733", "memory": "8GB"} | {"specifications": {"color": "#FF5733", "storage": "256GB UFS 3.1", "processor": "Snapdragon 888"}}
(2 rows)

SELECT employees IS JSON ARRAY AS json_object FROM (SELECT department, JSON_ARRAYAGG(employee_name) AS employees
FROM company_data GROUP BY department);
json_object
-------------
t
t
t
(3 rows)

Vectorized JSON Parsing

Vector operations for JSON parsing in PostgreSQL 16 lead to significant performance gains, especially when processing large JSON documents. This enhancement underscores PostgreSQL’s capability to efficiently handle modern workloads that involve heavy JSON data manipulation.

The below snapshot demonstrates the performance difference between version 15 and 16, which is 43% faster.

PostgreSQL 15:
--------------
[postgres@ip-20-0-10-44 data]$ /usr/pgsql-15/bin/psql -p 5434
psql (15.6)
Type "help" for help.

postgres=# CREATE TABLE long_json_as_text AS
postgres-# with long as (
postgres(# select repeat(description, 10) from pg_description pd
postgres(# )
postgres-# SELECT (select json_agg(row_to_json(long)) as t from long) from
postgres-# generate_series(1, 100);
SELECT 100
postgres=# VACUUM FREEZE long_json_as_text;
VACUUM
postgres=# \timing
Timing is on.
postgres=# SELECT 1 FROM long_json_as_text WHERE jsonb_typeof(t::jsonb) = 'not me';
?column?
----------
(0 rows)

Time: 1003.056 ms (00:01.003)

PostgreSQL 16
--------------
postgres=# SELECT version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20), 64-bit
(1 row)
postgres=# CREATE TABLE long_json_as_text AS
postgres-# with long as (
postgres(# select repeat(description, 10) from pg_description pd
postgres(# )
postgres-# SELECT (select json_agg(row_to_json(long)) as t from long) from
postgres-# generate_series(1, 100);
SELECT 100
postgres=# VACUUM FREEZE long_json_as_text;
VACUUM
postgres=# \timing
Timing is on.
postgres=# SELECT 1 FROM long_json_as_text WHERE jsonb_typeof(t::jsonb) = 'not me';
?column?
----------
(0 rows)

Time: 437.378 ms

Real-World Scenarios

Scenario 1: Data Validation and Cleanup

With the IS JSON checks, companies can ensure that data ingested from various sources into their PostgreSQL databases meets the expected JSON format. This is crucial for data quality and downstream processing, especially in ETL pipelines and data integration scenarios.

Scenario 2: Reporting and Analytics

The new JSON aggregation functions allow for dynamic creation of JSON objects and arrays directly within SQL queries, facilitating complex reporting and analytics tasks. For instance, aggregating customer feedback stored in JSON format by categories and generating reports directly from the database.

Scenario 3: Configuration and Preference Data

Applications often store user preferences or configuration settings in JSON format. The enhanced numeric literals and direct JSON construction capabilities make it easier to query and update these settings, improving application responsiveness and user experience.

Closing Thoughts

PostgreSQL 16’s SQL/JSON enhancements significantly improve the efficiency and performance of working with JSON data. These features enable developers to write more concise, readable, and efficient queries, leading to faster application development and better data management. As semi-structured data continues to play a critical role in application development, PostgreSQL reaffirms its position as a leading database choice for modern, data-intensive applications.

Whether you’re developing new applications or looking to migrate existing workloads, PostgreSQL 16 offers compelling features that can help streamline your operations and leverage JSON data more effectively. As the database continues to evolve, it’s clear that PostgreSQL is well-equipped to meet the challenges of today’s—and tomorrow’s—data landscape.

One thought on “Exploiting SQL/JSON Enhancements for Modern Workloads in PostgreSQL 16

Leave a comment