In my relentless pursuit of coaxing more performance out of my Lemmy instance I read that PostgreSQL relies heavily on the OS’s disk cache for read performance. I’ve got 16 GB of RAM and two HDDs in RAID 1. I’ve configured PostgreSQL to use 12 GB of RAM, and I’ve set up zram swap with 8 GB.

But according to htop, PostgreSQL is using only about 4 GB. My swap is hardly touched. And read performance is awful: opening my profile regularly times out. Only once it has loaded does it load quickly, until I leave it alone for half an hour or so.

Now, my theory is that the zram actually takes available RAM away from the disk cache, slowing the whole system down. My googling couldn’t find an answer; it only turned up guides on how to set up zram in the first place.

Does anyone know if my theory is correct?

  • moonpiedumplings@programming.dev
    link
    fedilink
    arrow-up
    2
    ·
    12 hours ago

    Databases are special. They often implement their own optimizations, which are faster than the more general ones the OS provides.

    For example: https://www.postgresql.org/docs/current/wal-intro.html

    Because WAL restores database file contents after a crash, journaled file systems are not necessary for reliable storage of the data files or WAL files. In fact, journaling overhead can reduce performance, especially if journaling causes file system data to be flushed to disk. Fortunately, data flushing during journaling can often be disabled with a file system mount option, e.g., data=writeback on a Linux ext3 file system. Journaled file systems do improve boot speed after a crash.
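
    As a concrete illustration of that mount option (a sketch only — the device, mount point and filesystem type below are examples, not OP’s actual setup):

    ```shell
    # /etc/fstab sketch: disable data journaling on the volume holding
    # the PostgreSQL data directory (ext3/ext4 support data=writeback)
    /dev/sdb1  /var/lib/postgresql  ext4  defaults,data=writeback,noatime  0 2
    ```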

    I didn’t see much in the docs about swap, but I wouldn’t be surprised if Postgres also had memory optimizations, such as its own form of in-memory compression.

    Your best bet is probably to ask someone who is familiar with the internals of postgres.

  • aubeynarf@lemmynsfw.com
    link
    fedilink
    arrow-up
    8
    ·
    edit-2
    2 days ago

    Why would you reserve ram for swap???

    You’re hindering the OS’s ability to manage memory.

    Put swap on disk. Aim for it to rarely be touched - but it needs to be there so the OS can move idle memory data out if it wants to.

    Don’t hard-allocate a memory partition for postgres. Let it allocate and free as it sees fit.

    Then the OS will naturally use all possible RAM for cache, with the freedom to use more or less for the server process as demand requires.

    Monitor queries to ensure you’re not seeing table scans due to missing indexes. Make sure VACUUM is happening either automatically or manually.
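
    A quick way to check both points (a sketch — the database name lemmy is a guess; the statistics views themselves are standard PostgreSQL):

    ```shell
    # tables ranked by sequential scans; a high seq_scan with a low
    # idx_scan often points at a missing index
    psql -d lemmy -c "SELECT relname, seq_scan, idx_scan
                      FROM pg_stat_user_tables
                      ORDER BY seq_scan DESC LIMIT 10;"

    # when (auto)vacuum last ran per table; NULL means never
    psql -d lemmy -c "SELECT relname, last_autovacuum, last_vacuum
                      FROM pg_stat_user_tables
                      ORDER BY relname LIMIT 10;"
    ```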

    • Björn Tantau@swg-empire.deOP
      link
      fedilink
      arrow-up
      3
      ·
      2 days ago

      Why would you reserve ram for swap???

      It’s a useful way of squeezing out a few GB more. Worked wonders on my starved Steam Deck and allowed me to play Cities Skylines smoothly and without crashes.

      But on a DB-heavy server that is apparently not a good idea. I’ve switched to a swap file.
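
      For anyone following along, the switch looks roughly like this (the 8 GiB size is just an example):

      ```shell
      # create and enable an 8 GiB swap file
      fallocate -l 8G /swapfile
      chmod 600 /swapfile
      mkswap /swapfile
      swapon /swapfile
      # make it permanent by adding a line to /etc/fstab:
      #   /swapfile none swap defaults 0 0
      ```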

      Monitor queries to ensure you’re not seeing table scans due to missing indexes.

      There are definitely some unoptimised queries and missing indexes. Lemmy 1.0 will supposedly fix a lot of them.

      • non_burglar@lemmy.world
        link
        fedilink
        arrow-up
        1
        ·
        edit-2
        2 days ago

        If you put swap in zram, you are paging from RAM to RAM. May as well just not use swap and save the cycles.

        • BB_C@programming.dev
          link
          fedilink
          arrow-up
          2
          ·
          2 days ago

          The point is compression.

          % swapon
          NAME           TYPE      SIZE USED  PRIO
          /dev/nvme0n1p2 partition   8G   0B     5
          /dev/sda2      partition  32G   0B    -2
          /dev/zram1     partition 3.5G 1.8G 32767
          /dev/zram2     partition 3.5G 1.8G 32767
          /dev/zram3     partition 3.5G 1.8G 32767
          /dev/zram4     partition 3.5G 1.8G 32767
          /dev/zram5     partition 3.5G 1.8G 32767
          /dev/zram6     partition 3.5G 1.8G 32767
          /dev/zram7     partition 3.5G 1.8G 32767
          /dev/zram8     partition 3.5G 1.8G 32767
          
          % zramctl
          NAME       ALGORITHM DISKSIZE   DATA  COMPR  TOTAL STREAMS MOUNTPOINT
          /dev/zram8 zstd          3.5G 293.4M 189.2M 192.5M         [SWAP]
          /dev/zram7 zstd          3.5G 282.1M 187.5M   192M         [SWAP]
          /dev/zram6 zstd          3.5G 284.6M 189.4M 192.9M         [SWAP]
          /dev/zram5 zstd          3.5G 297.8M 197.3M 200.1M         [SWAP]
          /dev/zram4 zstd          3.5G 304.9M 202.9M 206.7M         [SWAP]
          /dev/zram3 zstd          3.5G 300.7M 201.9M 204.6M         [SWAP]
          /dev/zram2 zstd          3.5G 311.3M 207.2M 210.6M         [SWAP]
          /dev/zram1 zstd          3.5G 307.9M 210.5M 213.3M         [SWAP]
          /dev/zram0 zstd          <not used for swap>
          
          • non_burglar@lemmy.world
            link
            fedilink
            arrow-up
            1
            ·
            2 days ago

            zswap is specifically built to this end and far better suited to it.

            zram is great, but it is simply a ramdisk and inappropriate to OP’s task. It cannot dynamically grow/shrink or deal with hot/cold pages.
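
            For reference, zswap can typically be enabled at runtime through its standard module parameters in sysfs (the pool size below is just an example):

            ```shell
            # enable zswap, pick a compressor, cap the compressed pool
            echo 1    > /sys/module/zswap/parameters/enabled
            echo zstd > /sys/module/zswap/parameters/compressor
            echo 20   > /sys/module/zswap/parameters/max_pool_percent  # % of RAM
            ```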

            • BB_C@programming.dev
              link
              fedilink
              arrow-up
              2
              ·
              2 days ago

              zswap is not better than modern zram in any way. And you can set up the latter with writeback anyway.

              But that’s not OP’s problem since “swap gets hardly touched” in OP’s case.

  • non_burglar@lemmy.world
    link
    fedilink
    arrow-up
    6
    ·
    2 days ago

    Zram does not impede the disk cache; it’s a compressed block device, and its memory is unavailable to the kernel for anything else.

    I do wonder what you’re trying to achieve by moving swap to zram. You’re potentially moving pages in and out of swap, with compression, for no real reason; that swapping wouldn’t have occurred at all if zram weren’t in place.

  • taaz@biglemmowski.win
    link
    fedilink
    arrow-up
    4
    ·
    edit-2
    2 days ago

    Linux has, roughly, two kinds of memory pages (entries in RAM): one is the file cache (page cache) and the other is “memory allocated by programs for work” (anonymous pages).

    When you look at the memory consumed by a process you are looking at its RSS; the page/file cache belongs to the kernel and, in btop for example, corresponds to Cached.

    Page cache can never be moved into swap - that would be the same as duplicating the file from one place on a disk to another place on a (possibly different) disk.
    If more memory is needed, page cache is evicted (written back into the respective file, if changed). Only anonymous pages (not backed by anything permanent) can be moved into swap.
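
    This split is visible in /proc/meminfo, where the page cache (Cached) and program memory (AnonPages) are accounted separately:

    ```shell
    # page cache vs anonymous memory vs swap, straight from the kernel
    grep -E '^(MemTotal|Cached|AnonPages|SwapTotal|SwapFree):' /proc/meminfo
    ```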

    So what does “PostgreSQL heavily relies on the OSs disk cache” mean? The more free memory there is, the more files can be kept cached in RAM and the faster postgres can then retrieve these files.

    When you add zram, you dedicate part of actual RAM to a compressed swap device which, as I said above, will never contain page cache.
    In theory this still increases the total available memory, but in practice that is only true if you configure the kernel to aggressively swap anonymous pages into the zram-backed swap.

    Note: I’ve simplified this a bit, so it might not be exact. Also, a process’s RSS contains multiple different things, not just the memory directly allocated by the program’s code.

  • CondorWonder@lemmy.ca
    link
    fedilink
    arrow-up
    3
    ·
    2 days ago

    Based on what I’ve seen with my use of zram, I don’t think it reserves the total space up front; it consumes whatever is shown in the output of zramctl --output-all. If you’re swapping then yes, it takes memory from the system (up to the 8 GB disk size), depending on how compressible the swapped content is (e.g. at a 3x ratio, 8 GB/3 ≈ 2.7 GB). That said, it will take memory from the disk cache if you’re swapping.
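
    The back-of-the-envelope figure checks out (the 8 GiB of swapped data and 3x ratio are the hypothetical numbers from above):

    ```shell
    # resident cost of 8 GiB of swapped data at a 3x compression ratio
    awk 'BEGIN { printf "%.2f GiB\n", 8 / 3 }'
    # → 2.67 GiB
    ```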

    Realistically I think your issue is IO, and there’s not much you can do if your disk cache is being flushed. Switching to zswap might help, as it should spill more into disk when you’re under memory pressure.

  • BB_C@programming.dev
    link
    fedilink
    arrow-up
    2
    ·
    2 days ago

    • Use zram devices equal to the number of threads in your system.
    • Use zstd compression.
    • Mount zram devices as swap with high priority.
    • Mount disk swap partition(s) with low priority.
    • Increase swappiness:
         sysctl vm.swappiness=<larger number than default>
      
    • Use zramctl to see detailed info about your zram disks.
    • Check with iotop to see if something unexpected is using a lot of IO traffic.

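
    Put together, those steps look roughly like this (device count, sizes, priorities and the disk partition are examples; tools like systemd-zram-generator automate the same thing):

    ```shell
    # one zram swap device per CPU thread, zstd-compressed, high priority
    modprobe zram num_devices=4
    for i in 0 1 2 3; do
        zramctl --algorithm zstd --size 2G /dev/zram$i
        mkswap /dev/zram$i
        swapon --priority 100 /dev/zram$i   # high priority: used first
    done
    swapon --priority 10 /dev/sda2          # disk swap as low-priority fallback
    sysctl vm.swappiness=180                # swap to zram more aggressively
    ```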
  • Shadow@lemmy.ca
    link
    fedilink
    arrow-up
    2
    ·
    2 days ago

    Yes, configuring memory to be used for zram would mark it as unavailable for kernel fs caching.

    Does iostat show your disks being pegged when it’s slow? It’s odd that performance would be so bad on those specs; it makes me think you might have disk I/O issues.
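
    One way to check (iostat comes from the sysstat package):

    ```shell
    # extended device stats every 2 seconds; %util near 100 on an HDD
    # means the disk is saturated
    iostat -x 2
    ```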

      • chirping@infosec.pub
        link
        fedilink
        arrow-up
        2
        ·
        1 day ago

        If you are on an HDD, then looking at what else is using the same disk, and reducing that usage, may yield some results. For example, if /var/log is on the same disk and can’t be avoided, then reducing log volume or batching writes may reduce the “context switches” your HDD has to do. There should be options for I/O limits/throttling/priority in systemd.

        If you have only Postgres on the HDD, I’d consider giving it 90% of the max bandwidth; maybe that’d be more effective than going full throttle and hitting the wall. If you have Postgres and some other service fighting for the HDD’s time, these limits could help. Make sure access time tracking is off (or set to relatime).
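
        A hypothetical systemd drop-in along those lines (the service name, device and bandwidth figure are examples; the IO*BandwidthMax settings require cgroup v2):

        ```shell
        # throttle a noisy neighbour service competing for the same HDD
        systemctl edit noisy.service
        # then, in the drop-in file:
        #   [Service]
        #   IOWriteBandwidthMax=/dev/sda 10M
        #   IOSchedulingClass=idle
        ```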