NewsLab
Jun 28 15:44 UTC

WAL-RUS: a Rust Rewrite of WAL-G for PostgreSQL Backups (clickhouse.com)

117 points | by saisrirampur | 14 comments

Comments (14)

  1. caffeinated_me
    Do you have any benchmarks with a mix of long open transactions and short ones? I've struggled a lot with WAL-E in the past there, and am curious if that changes here.
  2. __s
    No, but wal-g and wal-rus both have parallelism that wal-e lacks. Are you asking more about the buildup of WAL, or vacuum being held back, caused by long-running transactions? Those are up to Postgres: the archive command only keeps pushing WAL so that Postgres can get rid of it once it is ready to. It seems your scenario wouldn't care much which archiver is used, since WAL should be shipped long before Postgres is ready to get rid of it.
  3. caffeinated_me
    Yeah, I'm probably misremembering some details there. Thanks
  4. nasretdinov
    I must say I'm quite pleased to see how well the Go version works. It only uses 1.5x the CPU and (predictably) much more RAM/VRAM, but not a crazy amount either (the expected increase is 2x).

    Of course you can write a more optimal version in C / C++ / Zig / Rust, but at the same time Go is much easier to write, and you don't pay for the convenience with an absurd performance loss like in Python or PHP.

  5. __s
    indeed, wal-g actually started as a port of wal-e which was Python: https://www.citusdata.com/blog/2017/08/18/introducing-wal-g-...

    wal-g was a much larger improvement over wal-e. we're optimizing the margins here

  6. sgt
    I'd like to use the one getting the most community support, so it's too soon to pick between Rust and Go. Although on paper, Rust is better.
  7. __s
    tbf it took 4 years since PG15 support was added for me to fix remote BASE_BACKUP support & wal-g base backups being inconsistent on PG15+ (parameter typo had pg_backup_stop return before wal archived far enough for consistency)

    https://github.com/wal-g/wal-g/pull/2262

    but yes, this is a young project, so fair take

  8. saisrirampur
    ++ We’re big Go fans, most of PeerDB is written in Go: https://github.com/PeerDB-io/peerdb

    The importance of optimizing (resource) margins and having predictable memory usage increases significantly in the DBaaS/Postgres world, where your process coexists and competes with other critical workloads.

    Also, WAL-RUS isn’t rocket science. Postgres already exposes a bunch of native constructs for WAL archival, making development fairly straightforward even in Rust.

    TL;DR: when to choose Rust or Go really depends on the workload and what you are going after.

  9. valentynkit
    Quick one on the benchmark: was the 2.8GB peak virtual or resident? Go reserves a large virtual arena it mostly never faults in, so RSS tends to be a fraction of the virtual peak, and if Postgres headroom was getting squeezed off the virtual number you were sizing against memory the kernel never actually charges for.
  10. __s
    Correct. We tune overcommit so postgres reliably returns out of memory. It becomes complicated to accurately tune overcommit for every AWS instance type. We configure GOMEMLIMIT/cgroups but those are about RSS. Outliers come together: instances running queries out of memory on our service tend to also be pushing other resource limits, causing wal-g & prometheus exporters to start having more erratic memory usage at the worst time

    This helps on both ends of the cost spectrum. Large 64 core instances are where our heuristics fall off the most as variance increases, & tiny instances with 8GB of memory can use every 100MB of RSS we can get

  11. nasretdinov
    You probably could limit the bloating of Go programs by setting GOMAXPROCS to something like 1 or 2 on smaller machines, but then again you wouldn't get the best performance. So IMO good call here to rewrite it in a language without GC.
  12. TOMDM
    PostgreSQL WAL-RUS, no relation to PostgreSQL WALRUS https://github.com/supabase/walrus
  13. westurner
    How would you design an index layer for Postgres WAL-G backups to make a paging VFS like sqlite-http-vfs for pglite in WASM?
  14. whitepoplar
    As someone who only has a cursory knowledge of Postgres backup systems, how does this compare to something like pgBackRest? When would someone reach for one over the other?
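
On the virtual vs. resident question raised in comment 9: a minimal sketch of how the two numbers can be told apart on Linux, assuming the process exposes the usual `VmSize`/`VmRSS` fields in `/proc/<pid>/status`. The sample text and its numbers are invented for illustration and are not taken from the benchmark; the helper names `memory_fields_kb` and `resident_fraction` are hypothetical.

```python
import re
from pathlib import Path

# Hypothetical excerpt in the format of Linux /proc/<pid>/status.
# The numbers are made up for illustration only.
SAMPLE_STATUS = """\
Name:\texample
VmPeak:\t 2936012 kB
VmSize:\t 2883584 kB
VmHWM:\t  215040 kB
VmRSS:\t  204800 kB
"""


def memory_fields_kb(status_text):
    """Return the Vm* fields of a /proc/<pid>/status dump as {name: kB}."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"^(Vm\w+):\s+(\d+)\s+kB\s*$", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def resident_fraction(fields):
    """Share of the virtual size that is actually resident (VmRSS / VmSize)."""
    return fields["VmRSS"] / fields["VmSize"]


if __name__ == "__main__":
    # Use the live file on Linux, fall back to the invented sample elsewhere.
    status_path = Path("/proc/self/status")
    text = status_path.read_text() if status_path.exists() else SAMPLE_STATUS
    fields = memory_fields_kb(text)
    print("virtual (VmSize) kB:", fields["VmSize"])
    print("resident (VmRSS) kB:", fields["VmRSS"])
    print("resident fraction:", round(resident_fraction(fields), 3))
```

`VmPeak` and `VmHWM` are the peak counterparts of `VmSize` and `VmRSS`, so comparing those two is one way to check whether a reported peak was virtual or resident.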