7 comments

  • samwillis 8 hours ago
    It's great that the Rust community is finding ways to improve the performance of decoding strings from WASM to JS; it's one of the major performance holes you hit when using WASM.

    The issue comes down to the fact that even if your WASM code can return a UTF-16 buffer, the engine needs to make a copy at some point before JS code can use it as a string. The TextDecoder API does a good job of making this efficient, ensuring there is just a single copy, but it's still overhead.

    Ideally there should be a way to wrap an array buffer with a "String View", offloading the responsibility of ensuring it's UTF-16 to the WASM code, with no copy made. But that brings a ton of complexities, as strings need to be immutable in JS but the underlying buffer could still be changed.
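
    A minimal sketch of the copy described above, in TypeScript. The names `wasmMemory`, `writeUtf16` and `readUtf16` are made up for illustration, and a plain ArrayBuffer stands in for a real module's linear memory. It shows that the decoded string is a snapshot: changing the buffer afterwards does not change the string, which is the guarantee a zero-copy "String View" would have to give up or enforce some other way.

    ```typescript
    // `wasmMemory` is a hypothetical stand-in for a module's linear memory
    // (normally something like `instance.exports.memory.buffer`).
    const wasmMemory = new ArrayBuffer(64);

    // Pretend the WASM side wrote `text` as UTF-16LE code units at `byteOffset`.
    function writeUtf16(buffer: ArrayBuffer, byteOffset: number, text: string): number {
      const view = new DataView(buffer);
      for (let i = 0; i < text.length; i++) {
        view.setUint16(byteOffset + i * 2, text.charCodeAt(i), true); // little-endian
      }
      return text.length * 2; // number of bytes written
    }

    // Decoding copies the code units into a fresh, immutable JS string.
    function readUtf16(buffer: ArrayBuffer, byteOffset: number, byteLength: number): string {
      return new TextDecoder("utf-16le").decode(new Uint8Array(buffer, byteOffset, byteLength));
    }

    const writtenBytes = writeUtf16(wasmMemory, 8, "h\u00E9llo");
    const before = readUtf16(wasmMemory, 8, writtenBytes);

    // Overwrite the first code unit in the buffer; `before` keeps its old value.
    new DataView(wasmMemory).setUint16(8, "j".charCodeAt(0), true);
    const after = readUtf16(wasmMemory, 8, writtenBytes);
    ```

    Here `before` still holds the original text while `after` reflects the mutated buffer, so the two only agree because each decode made its own copy.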

    • breve 6 hours ago
      The JS string built-ins proposal for WebAssembly:

      https://github.com/WebAssembly/js-string-builtins/blob/main/...

      • samwillis 4 hours ago
        Personally I feel this is backwards - I don't want access to js literals and objects from WASM, I just want a way to wrap an arbitrary array buffer that contains a utf16 string as a js string.

        It keeps WASM simple and provides a thin layer as an optimisation.

        • vanderZwan 4 hours ago
          > It keeps WASM simple

          At the cost of complicating JS string implementations, probably to the point of undoing the benefits.

          Currently JS strings are immutable objects, allowing for all kinds of optimization tricks (interning, ropes, etc.). Having one string represented by a mutable arraybuffer messes with that.

          There are probably also security concerns with allowing mutable access to string internals inside the JS engine side.

          So the simple-appearing solution you suggested would be rejected by all major browser vendors who back the various WASM and JS engines.

          Access to constant JS strings without any form of mutability is the only realistic option for accessing JS strings. And creating constant strings is the only one for sending them back.

  • andyferris 9 hours ago
    The whole UTF-8 vs UTF-16 thing makes this way more messy than it should be.

    I'd love for some native way of handling UTF-8 in JavaScript and the DOM (no, TextEncoder/TextDecoder do not count). Even a kind of "mode" you could choose for the whole page would be a huge step forward for the "compile native language to WASM + web" thing.
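
    To make the mismatch concrete, here is a small TypeScript sketch (the helper name `utf8RoundTrip` is made up for illustration). The same text has a different length in UTF-16 code units, which is what JS strings count, than in UTF-8 bytes, which is what a Rust `String` stores, so offsets and lengths cannot be passed across the boundary unchanged.

    ```typescript
    // Compare the two views of one piece of text and round-trip it through UTF-8.
    function utf8RoundTrip(s: string): { utf16Units: number; utf8Bytes: number; roundTripped: string } {
      const bytes = new TextEncoder().encode(s); // TextEncoder always produces UTF-8
      const roundTripped = new TextDecoder("utf-8").decode(bytes);
      return { utf16Units: s.length, utf8Bytes: bytes.length, roundTripped };
    }

    // "naive" with a diaeresis on the i, a space, and one emoji outside the BMP.
    const sampleText = "na\u00EFve \u{1F600}";
    const sample = utf8RoundTrip(sampleText);
    // sample.utf16Units is 8 (the emoji is a surrogate pair),
    // sample.utf8Bytes is 11 (2 bytes for the accented letter, 4 for the emoji).
    ```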

    • ethan_smith 7 hours ago
      The TC39 proposals for "Resizable ArrayBuffer" and "String.prototype.isWellFormed" are steps in this direction, though we still need proper zero-copy UTF-8 string views.
    • theSherwood 8 hours ago
      100%. If we could get a DomString8 (8-bit encoded) interface in addition to the existing DomString (16-bit encoded) and a way to wrap a buffer in a DomString8, we could have convenient and reasonably performant interfaces between WASM and the DOM.
      • continuational 7 hours ago
        The extra DOM complexity that would entail seems like a loss for the existing web.
  • vanderZwan 3 hours ago
    > Wasm-bindgen calls TextDecoder.decode for every string. Sledgehammer only calls TextDecoder.decode once per batch.

    So they decode one long concatenated string and then on the JS side split it into substrings? I wonder if that messes with the GC on the JS side of things.
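
    A rough sketch of what "decode once, then split" could look like, in TypeScript. This is not Sledgehammer's actual wire format: the layout (one concatenated UTF-8 buffer plus a list of per-string lengths in UTF-16 code units) and the names `encodeBatch` and `decodeBatch` are assumptions for illustration, and the input strings are assumed to be well-formed (no lone surrogates).

    ```typescript
    // Hypothetical batch layout: every string concatenated and encoded as UTF-8,
    // plus each string's length in UTF-16 code units so JS can slice it back out.
    function encodeBatch(strings: string[]): { bytes: Uint8Array; lengths: number[] } {
      return {
        bytes: new TextEncoder().encode(strings.join("")),
        lengths: strings.map((s) => s.length),
      };
    }

    function decodeBatch(bytes: Uint8Array, lengths: number[]): string[] {
      // One decode call for the whole batch. `ignoreBOM: true` keeps a leading
      // U+FEFF in the output so the recorded lengths stay aligned.
      const all = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);
      const out: string[] = [];
      let pos = 0;
      for (const len of lengths) {
        out.push(all.slice(pos, pos + len)); // the substrings are carved out on the JS side
        pos += len;
      }
      return out;
    }

    const batchInput = ["div", "h\u00E9llo", "\u{1F600}!", ""];
    const batch = encodeBatch(batchInput);
    const batchOutput = decodeBatch(batch.bytes, batch.lengths);
    ```

    Whether each `slice` result gets its own storage or points into the big decoded string is up to the engine, which is where the GC question comes from.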

    • boomskats 2 hours ago
      How would splitting it into substrings be different from decoding individual strings from an allocation/GC perspective? If anything I'd assume splitting into substrings is more efficient - I expect there's a ton of optimisations in JS for sliced strings or whatever, as it's been around for ages.
      • vanderZwan 1 hour ago
        I imagine it's faster during creation because there are fewer allocations for a backing array for the string content (one, basically, unless they move stuff around). But then that can also mean holding on to the entire backing array even if only one of the strings is still "alive", unless there are optimizations for reclaiming memory in those situations too.
  • nhatcher 8 hours ago
    I wrote a while back about a somewhat related issue:

    https://www.nhatcher.com/post/should_i_import_or_should_i_ro...

    The code is a bit outdated, but the principle of linking against the browser implementation stands.

  • CyanLite2 8 hours ago
    Sad that this isn’t natively in browsers…
  • MuffinFlavored 3 hours ago
    I think there is a ton of room left on the table here for innovation.

    Context: as far as I know Electron is still the king if you want to do (unsafe but performant) "IPC/RPC" between native and a webview.

    All of the other options that exist in other languages (Deno, Rust, you name it) do the same "stringified JSON back and forth" which really isn't great for performance in my opinion.

    It'd be cool if (obviously in a sandboxed or secure way) you could opt in to something, albeit a bit reckless, that lets you provide native methods for the WASM part of V8 and its WebView (thinking Electron-esque here) to call.

    • boomskats 2 hours ago
      I'm not sure if I'm understanding you correctly, but vanilla wasm ipc works by sharing linear memory, where it's up to the implementation to choose the data encoding (arrow/proto/whatever). In the case of wasm-bindgen's dom manipulation api, the implementation serialises individual commands and sends them over the boundary, with any string params for each command being deserialised individually, and this project improves on that by batching them all into one big string thus reducing the deserialisation overhead. However, the string encoding is specific to that use case - it's not a general wasm ipc mechanism.

      VSCode IPC is kinda similar as it's designed to facilitate comms over an enforced process isolation barrier to protect the main thread from slow extensions etc. but it's actually IPC there (as in, there are multiple processes at the os level). The wasm/js stuff is handled within the same v8 context - it's not actually ipc.

      (Happy to be corrected here, but this is my understanding)
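
      As a sketch of the "shared linear memory, pick your own encoding" part, in TypeScript. The wire format here (a 4-byte little-endian length followed by a UTF-8 payload) and the names `writeMessage` and `readMessage` are made up for illustration, and both ends are written in JS only to keep the example self-contained; in practice the writer would be code inside the WASM module.

      ```typescript
      // One 64 KiB page of linear memory; a real module would export or import this.
      const memory = new WebAssembly.Memory({ initial: 1 });

      // Hypothetical format: [u32 payload length, little-endian][UTF-8 payload].
      function writeMessage(mem: WebAssembly.Memory, offset: number, msg: string): number {
        const payload = new TextEncoder().encode(msg);
        new DataView(mem.buffer).setUint32(offset, payload.length, true);
        new Uint8Array(mem.buffer, offset + 4, payload.length).set(payload);
        return 4 + payload.length; // total bytes written
      }

      // The reader only needs an offset; everything else is read from memory.
      function readMessage(mem: WebAssembly.Memory, offset: number): string {
        const len = new DataView(mem.buffer).getUint32(offset, true);
        return new TextDecoder().decode(new Uint8Array(mem.buffer, offset + 4, len));
      }

      const bytesWritten = writeMessage(memory, 16, "setAttribute");
      const received = readMessage(memory, 16);
      ```

      Nothing crosses a process boundary here: both sides read and write the same buffer, and the string still gets copied once when it is decoded.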

      • MuffinFlavored 1 hour ago
        https://github.com/webview/webview_deno

        Tell me how you'd do "native C/C++ FFI (to like a .so or .dylib or .dll)" between the webview using WASM or anything other than "WebKit's built in JSON-string based IPC"

        Like a <button> that triggers a DLL call. How would you achieve it with WASM? How does WASM act as the bridge to the DOM and/or native? It doesn't, right?

  • bcardarella 8 hours ago
    How does the performance compare to projects like Wasmtime?
    • Evan-Almloff 8 hours ago
      The two projects have different use cases, so they can't be directly compared. Sledgehammer bindgen makes calling JavaScript from Rust faster in the browser. Wasmtime is a native runtime for WASM outside of the browser.