If you don't monitor Postgres, you are not serious.
It's 2024 and Postgres keeps the momentum. The rise of pgvector, Supabase, and Neon fuels Postgres adoption. This post reviews the tools and best practices for monitoring your Postgres database.
Open-source Tools
pg_stat_statements
The first thing is to enable the pg_stat_statements extension. pg_stat_statements tracks planning and execution statistics of all SQL statements executed by a server. It's also a prerequisite for other, more advanced monitoring solutions.
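As a sketch, enabling it takes a config change plus CREATE EXTENSION; the column names below assume Postgres 13 or later (earlier versions call the column total_time instead of total_exec_time):

```sql
-- In postgresql.conf (requires a server restart):
--   shared_preload_libraries = 'pg_stat_statements'

-- Then, in the target database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 statements by cumulative execution time
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```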
explain.dalibo.com
explain.dalibo.com is a web UI to visualize Postgres EXPLAIN query plans. It's also open source on GitHub and can be embedded into your own web applications.
Prometheus Postgres Exporter
Prometheus Postgres Exporter extracts Postgres database metrics and stores them in Prometheus. From there, you can use Prometheus Alertmanager to configure alerting rules and use Grafana to create dashboards.
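For illustration, a minimal scrape target plus an alerting rule on the exporter's pg_up metric might look like this (the hostname and threshold are placeholders):

```yaml
# prometheus.yml -- scrape the exporter (default port 9187)
scrape_configs:
  - job_name: postgres
    static_configs:
      - targets: ["db-host:9187"]   # placeholder hostname

# alert rules file -- fire when the exporter cannot reach Postgres
groups:
  - name: postgres
    rules:
      - alert: PostgresDown
        expr: pg_up == 0
        for: 5m
        labels:
          severity: critical
```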
Proprietary Solutions
pganalyze
pganalyze is a Postgres-specific observability service. It provides in-depth SQL tuning and performance monitoring. Its Scale plan starts at $399 per month for 4 database servers; each additional server costs $100.
DataDog
DataDog provides Postgres Database Monitoring. Its Postgres monitoring is not as extensive as pganalyze's; DataDog's advantage is its integration with the entire application stack and the CI/CD pipeline. The price starts at $70 per database server per month.
Other Options
- pgDash has a similar feature set and price point to pganalyze, though it looks less polished. On the other hand, pgDash offers a self-hosted option on all plans, whereas pganalyze only offers self-hosting on the Enterprise plan.
- SolarWinds Database Performance Monitor (DPM) also provides a Postgres performance monitoring solution. Its technology comes from the VividCortex acquisition. It's cloud-only, with a steep starting price of $3,096.
Best Practices
Monitor Transaction ID Wraparound
Due to its MVCC implementation, PostgreSQL carries the infamous risk of transaction ID wraparound. When it happens, the entire database becomes unavailable; see past incidents at Sentry, Mailchimp, and Figma. Every capable Postgres monitoring service and database service provider can check for this risk:
- pganalyze checks every 30 minutes
- Implement an Early Warning System for Transaction ID Wraparound in Amazon RDS for PostgreSQL
- GCP Cloud SQL high-transaction-ID-utilization recommender
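You can also check yourself with a query like the following; the 2^31 horizon is inherent to Postgres's 32-bit transaction IDs, and autovacuum_freeze_max_age defaults to 200 million:

```sql
-- How far each database has advanced toward the wraparound horizon
SELECT datname,
       age(datfrozenxid) AS xid_age,
       round(100 * age(datfrozenxid) / 2147483647.0, 2) AS pct_toward_wraparound
FROM pg_database
ORDER BY xid_age DESC;
```

Alerting well before xid_age approaches autovacuum_freeze_max_age gives autovacuum time to catch up.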
Monitor Locks
Use the pg_locks view to find lock problems. You can find connections that have held locks for a long time, as well as connections that have waited too long to acquire a lock.
You should also consider enabling log_lock_waits.
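As an example, the following query (a sketch built on pg_blocking_pids(), available since Postgres 9.6) lists sessions that are currently blocked, together with the sessions blocking them:

```sql
-- Blocked sessions and the sessions holding them up
SELECT blocked.pid                  AS blocked_pid,
       blocked.query                AS blocked_query,
       now() - blocked.query_start  AS waiting_for,
       blocking.pid                 AS blocking_pid,
       blocking.query               AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;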
Avoid Blocking Operations
It's always a good practice to set lock_timeout at the user (role) level.
When making DDL changes, prefer non-blocking statements, e.g. adding an index with CONCURRENTLY or a constraint with NOT VALID.
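A sketch of these practices (the role, table, and column names are placeholders):

```sql
-- Fail fast instead of queueing behind long-running transactions
ALTER ROLE app_user SET lock_timeout = '5s';

-- Build the index without blocking concurrent writes
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders (user_id);

-- Add the constraint without scanning the table under an exclusive lock,
-- then validate it later with only a weaker lock
ALTER TABLE orders
  ADD CONSTRAINT orders_amount_positive CHECK (amount > 0) NOT VALID;
ALTER TABLE orders VALIDATE CONSTRAINT orders_amount_positive;
```

Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so keep it out of migration scripts that wrap everything in one transaction.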
Use a Cloud Postgres Service Provider
Every major Postgres database service provider offers basic monitoring out of the box. If you have to self-host for budget or compliance reasons, you can start with a Prometheus/Grafana setup and upgrade to pganalyze/DataDog as the business grows.
A comprehensive monitoring solution can be intimidating. We try to provide actionable suggestions for companies at different stages. Please let us know if you have other monitoring tools and best practices worth sharing to herd the elephant.