• DynamicBits
    3 months ago

    Technically, XZ is just a container that allows for different compression methods inside, much like the Matroska MKV video container. In practice, the data inside is almost always compressed with LZMA2, a modified version of LZMA.
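
To make the container idea concrete, here is a minimal sketch using Python's standard `lzma` module. It writes the same data into two `.xz` streams, one with the default settings and one with an explicit filter chain (a delta filter in front of LZMA2). The sample data and the choice of filters are assumptions made for illustration only.

```python
import lzma

# Sample data (an assumption for this sketch): a repeating byte ramp.
data = bytes(range(256)) * 64

# Default .xz stream: the container wraps a single LZMA2 filter.
plain = lzma.compress(data, format=lzma.FORMAT_XZ)

# Same container, different contents: a delta filter feeding LZMA2.
chain = [
    {"id": lzma.FILTER_DELTA, "dist": 1},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]
filtered = lzma.compress(data, format=lzma.FORMAT_XZ, filters=chain)

# Both outputs are valid .xz streams; the decoder reads the filter
# chain from the stream itself, so no extra arguments are needed.
assert lzma.decompress(plain) == data
assert lzma.decompress(filtered) == data

print("default chain:", len(plain), "bytes")
print("delta + LZMA2:", len(filtered), "bytes")
```

Both results start with the same XZ magic bytes and decode with the same call, which is the "container" part. What differs is the filter chain recorded inside.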

    There is no perfect algorithm for every situation, so I’ll attempt to summarize.

    • Gzip/zlib is best when speed and support are the primary concerns
    • Bzip2 was largely phased out and replaced by XZ (LZMA) a decade ago
    • XZ (LZMA) will likely give you the best compression, at the cost of high CPU and RAM usage
    • Zstd is… really good, and the numerous compression levels offer great flexibility
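
The trade-offs in the list above are easy to check on your own data. The sketch below is a rough comparison using only Python's standard library, so it covers zlib, bz2 and xz but not Zstd. The sample text and the helper name `compare` are assumptions for illustration, and timings will vary by machine and input.

```python
import bz2
import lzma
import time
import zlib


def compare(data: bytes) -> dict:
    """Compress `data` with each stdlib codec; record size and time."""
    codecs = {
        "zlib": (zlib.compress, zlib.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "xz": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Sanity check: every codec must round-trip the input.
        assert decompress(packed) == data
        results[name] = {"size": len(packed), "seconds": elapsed}
    return results


# Highly repetitive sample text; real files will behave differently.
sample = b"The quick brown fox jumps over the lazy dog. " * 2000
results = compare(sample)
for name, r in results.items():
    print(f"{name:5s} {r['size']:8d} bytes  {r['seconds'] * 1000:8.2f} ms")
```

Swap `sample` for the contents of a file you actually care about before drawing conclusions, since the ranking depends heavily on the input.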

    The chart below, which was sourced from this blog post, offers a nice visual comparison.

    • subtext@lemmy.world
      3 months ago

      Thanks for this! Good to know that Zstd seems to be pretty much a drop-in replacement.

      • sugar_in_your_tea@sh.itjust.works
        3 months ago

        It looks like someone made a Rust implementation, which is a lot slower and only does decompression, but it’s at least a rival implementation should zstd get some kind of vulnerability.

        • Killing_Spark@feddit.de
          3 months ago

          Yep that would be me :)

          There is also an independent implementation for Go, which even does compression iirc. (There is also a Go implementation by me, but don’t use that. It’s way, way slower than the other one and has been unmaintained since I switched to Rust development.)

          • sugar_in_your_tea@sh.itjust.works
            3 months ago

            Awesome! It’s impressive that it’s decently close in performance with no unsafe code. Thanks for your hard work!

            And that Go implementation is pretty fast too! That’s quite impressive.

            • Killing_Spark@feddit.de
              3 months ago

              Sadly it does have one place with unsafe code. I needed a ringbuffer with an efficient “extend from within” implementation. I always wanted to contribute that to the standard library to actually get to no unsafe.
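
For readers unfamiliar with the operation, here is a minimal Python sketch of what "extend from within" means on a ring buffer: appending a copy of bytes that are already in the buffer, which is how an LZ-style decoder replays an earlier match. This is not the code from the Rust decoder discussed here. The class name, method names and the byte-by-byte copy are assumptions chosen for clarity, whereas an efficient version would copy whole slices at once.

```python
class RingBuffer:
    """Fixed-capacity byte ring buffer (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self._buf = bytearray(capacity)
        self._cap = capacity
        self._head = 0  # physical index of the oldest byte
        self._len = 0   # number of bytes currently stored

    def __len__(self) -> int:
        return self._len

    def extend(self, data: bytes) -> None:
        """Append new bytes at the logical end of the buffer."""
        if self._len + len(data) > self._cap:
            raise OverflowError("ring buffer is full")
        for byte in data:
            self._buf[(self._head + self._len) % self._cap] = byte
            self._len += 1

    def extend_from_within(self, start: int, length: int) -> None:
        """Append a copy of logical positions [start, start + length)."""
        if start < 0 or length < 0 or start + length > self._len:
            raise IndexError("source range is outside the buffered data")
        if self._len + length > self._cap:
            raise OverflowError("ring buffer is full")
        for i in range(length):
            src = (self._head + start + i) % self._cap
            dst = (self._head + self._len) % self._cap
            self._buf[dst] = self._buf[src]
            self._len += 1

    def drain(self, n: int) -> bytes:
        """Remove and return up to n of the oldest bytes."""
        n = min(n, self._len)
        out = bytes(self._buf[(self._head + i) % self._cap] for i in range(n))
        self._head = (self._head + n) % self._cap
        self._len -= n
        return out


rb = RingBuffer(8)
rb.extend(b"abcd")
first = rb.drain(2)           # removes b"ab", so later writes wrap around
rb.extend(b"ef")              # buffer now holds b"cdef"
rb.extend_from_within(1, 3)   # re-append b"def", crossing the physical end
rest = rb.drain(len(rb))
```

The fiddly part in a real implementation is that both the source range and the destination may straddle the physical end of the storage, so a bulk copy has to be split into several slice copies.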

              • sugar_in_your_tea@sh.itjust.works
                3 months ago

                Ah, I saw a PR from like 3 years ago that removed it, so it looks like you added it back in for performance.

                Have you tried contributing it upstream? I’m not a “no unsafe” zealot, but in light of the xz issue, it would be nice.

                • Killing_Spark@feddit.de
                  3 months ago

                  > Have you tried contributing it upstream?

                  I haven’t yet, just because I haven’t gotten around to it (and because I am not sure the std lib even wants this feature).

                  > I’m not a “no unsafe” zealot, but in light of the xz issue, it would be nice.

                  I don’t think the two are related. I wouldn’t be dropping any dependency, since the ring buffer is implemented in the same repo.

                  • sugar_in_your_tea@sh.itjust.works
                    3 months ago

                    Yeah, they’re not really related. I’m just thinking there might be more scrutiny on compression due to the exploit.

                    That said, yours doesn’t support encoding anyway, so it’s kind of moot.