⟵ Posts

Fifteen PostgreSQL parameters to tune for higher performance

21 Aug 2023

PostgreSQL, with its default configuration, will run on a Raspberry Pi. Learning AWS RDS PostgreSQL ships with the same default settings surprised me. AWS customizes shared_buffers and effective_cache_size, leaving multiple performance-influencing settings at PostgreSQL’s defaults, including work_mem.

work_mem

work_mem determines memory allocation for query sorting, directly impacting query performance. If work_mem is too low, sorting operations will “spill to the disk,” and the query will run more slowly than it would in memory. Setting work_mem too high leads to a memory shortage for new connections, effectively lowering the system’s ability to respond to multiple concurrent connections.

log_temp_files

To identify queries spilling to disk, look for log messages referencing temporary files or “external merge Disk” in query analysis. Enable log_temp_files to generate these messages.

For production web apps, starting with 8MB is recommended. Values larger than 64MB suggest conflicting workload requirements. Consider using a read replica for queries that require more memory.

maintenance_work_mem

This setting specifies the maximum memory for critical operations like vacuuming and index creation. Proper configuration of maintenance_work_mem reduces disruption to queries serving customer requests.

Most databases will benefit from increasing maintenance_work_mem to 5% of the total memory.

checkpoint_completion_target, checkpoint_timeout, max_wal_size, min_wal_size, log_checkpoints

The default settings in Postgresql cause excessive checkpointing, which creates an extra write load. Regular checkpoints based on the checkpoint_timeout parameter are the goal.

These are good starting configuration settings:

checkpoint_completion_target    = 0.97
checkpoint_timeout              = 25m
max_wal_size                    = 4GB
min_wal_size                    = 512MB
log_checkpoints                 = on

Increase max_wal_size gradually if the system hits the limit often, and increase checkpoint_timeout for databases with high I/O during checkpoints. For checkpoint_timeout, values up to one hour are sensible.

wal_buffers

The WAL data generated by a typical transaction must fit in the setting. The impact of wal_buffers increases with more concurrent users. Start by allocating 16MB plus 4 additional megabytes per core.

A restart is required for the new setting to come into effect. Adjust wal_buffers along with max_connections to minimize downtime.

max_connections

Raise max_connections to 200 (from the default 100) and restart. Restarting now affects fewer customers, making it a preferable option. Consider using a connection pooler if more connections are needed.

effective_io_concurrency

This setting determines the maximum number of concurrent operations that the underlying system can support. Adjusting this parameter can lead to significant performance improvements, although identifying the optimal value might require extensive experimentation. I recommend starting with a value of 10 * # of cpus, with possibilities extending up to 1000. Databases that often perform Bitmap Heap Scans will particularly benefit from higher values.

log_statement, log_min_duration_statement

To prevent the potential exposure of Personally Identifiable Information (PII), set log_statement to none to disable it and change the slow_query_logging setting to -1 to turn slow query logging off. To identify slow queries, consider using the auto_explain extension for brief periods.

log_lock_waits

Enable log_lock_waits to identify whether extended lock waits occur. Lock wait timeouts lead to suboptimal performance.

statement_timeout

Long queries consume resources, slow down queries, block new connections, and cause conflicts with other transactions. Setting the statement_timeout parameter to a maximum of 80% of the client timeout is recommended to prevent this scenario. For instance, if using Puma with a default timeout of 60 seconds, set the statement_timeout parameter to a maximum of 48 seconds.

Conclusion

Customizing these fifteen parameters will vastly improve your database’s performance; at least six more parameters can be fine-tuned further for even more performance and operational stability.


Running Postgres on AWS RDS? Reach out and say hi 👋🏽!