I found this funny.

The context is as explained by @laund@hachyderm.io

the issue is that you can’t return from inside a closure, since the closure might be called later/elsewhere

and this post was the asnwer to the question by @antonok@fosstodon.org

you got me curious what the record for the longest ? operator chain on crates.io is

Original post: https://fosstodon.org/users/antonok/statuses/111134824451525448

  • realharo
    link
    fedilink
    arrow-up
    1
    ·
    edit-2
    1 year ago

    I think the issue with this is that the code (https://docs.rs/nix/0.27.1/src/nix/lib.rs.html#297) allocates a fixed-size buffer on the stack in order to add a terminating zero to the end of the path copied into it. So it just gives you a reference into that buffer, which can’t outlive the function call.

    They do also have a with_nix_path_allocating function (https://docs.rs/nix/0.27.1/src/nix/lib.rs.html#332) that just gives you a CString that owns its buffer on the heap, so there must be some reason why they went this design. Maybe premature optimization? Maybe it actually makes a difference? 🤔

    They could have just returned the buffer via some wrapper that owns it and has the as_cstr function on it, but that would have resulted in a copy, so I’m not sure if it would have still achieved what they are trying to achieve here. I wonder if they ran some benchmarks on all this stuff, or they’re just writing what they think will be fast.

    • orangeboats@lemmy.world
      link
      fedilink
      arrow-up
      1
      ·
      1 year ago

      so there must be some reason why they went this design.

      Some applications have a hard zero-alloc requirement.

      • realharo
        link
        fedilink
        arrow-up
        1
        ·
        edit-2
        1 year ago

        But that’s not the case here, seeing as they have

        if self.len() >= MAX_STACK_ALLOCATION {
            return with_nix_path_allocating(self, f);
        }
        

        in the code of with_nix_path. And I think they still could’ve made it return the value instead of calling the passed in function, by using something like

        enum NixPathValue {
            Short(MaybeUninitᐸ[u8; 1024]>, usize),
            Long(CString)
        }
        
        impl NixPathValue {
            fn as_c_str(&self) -> &CStr {
                // ...
        
        impl NixPath for [u8] {
            fn to_nix_path(&self) -> ResultᐸNixPathValue> {
                // return Short(buf, self.len()) for short paths, and perform all checks here,
                // so that NixPathValue.as_c_str can then use CStr::from_bytes_with_nul_unchecked
        

        But I don’t know what performance implications that would have, and whether the difference would matter at all. Would there be an unnecessary copy? Would the compiler optimize it out? etc.

        Also, from a maintainability standpoint, the context through which the library authors need to manually ensure all the unsafe code is used correctly would be slightly larger.

        As a user of a library, I would still prefer all that over the nesting.