Harnessing the Power of Refdata Storage Extension in PostgreSQL with EDB

PostgreSQL, with its robust and versatile nature, has long been the go-to database for many organizations. Its continuous evolution introduces features that enhance performance and scalability. A notable advancement in this realm is the Refdata storage extension provided by EDB (EnterpriseDB), especially for PostgreSQL users. This blog post will explore the nuances of the Refdata storage extension and how it revolutionizes data handling in PostgreSQL.

What is Refdata Storage Extension?

Refdata, short for “Reference Data,” is a Table Access Method (TAM) introduced in PostgreSQL 12. It’s designed to optimize how table data is stored and accessed, particularly for tables containing mostly static data.

The Ideal Use Case: Static Data and High Concurrency

The Refdata extension shines in situations where a database table, such as a product catalog in an e-commerce system, is read-intensive and seldom modified. These tables often act as reference points for other tables in the database, for example, an ‘orders’ table referencing a ‘products’ table.

In a typical scenario, the ‘products’ table would be structured as follows:

CREATE TABLE products (
product_no integer PRIMARY KEY,
name text,
price numeric
);

The Challenge: Concurrency and Performance

In high-concurrency environments, where multiple transactions are accessing the same row, PostgreSQL uses shared row locks and a mechanism called MultiXact (a mechanism for maintaining the list of active transactions accessing a row in shared mode) to manage these concurrent accesses. While effective, this can lead to overheads like cache misses and lock contention, affecting overall performance.

The Solution: Refdata TAM

Refdata TAM addresses these challenges by eliminating the overhead associated with shared row locks and MultiXacts for tables where it is applied. For instance, in our ‘products’ table example, DML operations become synchronous, and read operations are allowed to proceed concurrently, significantly enhancing performance.

Performance Benchmarks: A Real-World Perspective

Performance testing in a controlled environment, such as an EDB BigAnimal cluster on Amazon EC2 C5.4xlarge instance, reveals the tangible benefits of Refdata. Our engineer, Sravan Velagandula, recently performed a benchmark test focused on high-concurrency inserts. This test involved inserting data into the ‘orders’ table, while the ‘products’ table contained 20 rows. Utilizing pgbench, the test simulated a workload with 1000 clients and 64 threads over a duration of 5 minutes, aiming to insert rows into the orders table. This setup was designed to measure the performance impact of simultaneous insert operations under significant client and thread load. The results showed that using the Refdata extension led to a performance enhancement of over 100%, significantly outperforming the traditional method. This demonstrates the efficiency and effectiveness of the Refdata extension in handling complex database operations.

Conclusion: Refdata TAM as a Game Changer

For PostgreSQL users, the Refdata storage extension marks a significant advancement. This extension is especially advantageous for applications that frequently access tables containing mostly static data in shared mode, such as reference data. This setup ensures efficient handling and querying of data that does not change often but is regularly referenced by other parts of the application.

By implementing Refdata TAM, organizations can achieve more efficient data processing, reduce contention and lock-related overheads, and improve overall transaction throughput. This is particularly crucial in systems where performance and scalability are pivotal, such as in financial services, e-commerce platforms, and large-scale content management systems.

Key Advantages:

  1. Enhanced Performance: By reducing the overhead associated with shared row locks and MultiXacts, the Refdata extension significantly boosts the performance of read-heavy tables.
  2. Concurrency Management: It efficiently manages high-concurrency scenarios, ensuring smoother operations without the typical performance bottlenecks.
  3. Optimized Resource Utilization: Less contention and overhead mean better utilization of hardware resources, leading to cost savings in large-scale deployments.

Who Should Consider Refdata TAM?

The Refdata extension is ideal for database architects and administrators managing PostgreSQL databases with tables that:

  • Serve primarily as reference data for other tables.
  • Experience high read access patterns with minimal updates or deletions.
  • Are crucial for the performance of high-concurrency applications.

Getting Started with Refdata in EDB BigAnimal

EDB BigAnimal offers an intuitive and straightforward way to implement the Refdata storage extension. Database administrators can easily configure their PostgreSQL databases to leverage Refdata TAM for suitable tables, ensuring they get the most out of their database’s performance capabilities.

Final Thoughts

The Refdata storage extension is a testament to PostgreSQL’s and EDB’s commitment to continuous innovation and performance optimization. By understanding and leveraging this extension, organizations can take a significant step towards more efficient, scalable, and high-performing database operations. As the data landscape continues to evolve, tools like Refdata ensure that PostgreSQL remains at the forefront, meeting the demands of modern applications and systems.

Remember: While Refdata TAM brings considerable performance improvements, it’s essential to assess your specific database usage patterns and requirements to determine its suitability for your scenario. With the right implementation, Refdata can be a transformative addition to your PostgreSQL database strategy.

Leave a comment