JSON River – Parse JSON incrementally as it streams in

(github.com)

232 points | by rickcarlino 6 days ago

30 comments

rictic 23 hours ago
Hi HN! Didn't expect this to be on the front page today! I should really release all the optimizations that've been landing lately, the version on github is about twice as fast as what's released on npm.
I wrote it when I was doing prototyping on doing streaming rendering of UIs defined by JSON generated by LLMs. Using constrained generation you can essentially hand the model a JSON serializable type, and it will always give you back a value that obeys that type, but the big models are slow enough that incremental rendering makes a big difference in the UX.
I'm pretty proud of the testing that's gone into this project. It's fairly exhaustively tested. If you can find a value that it parses differently than JSON.parse, or a place where it disobeys the 5+1 invariants documented in the README I'd be impressed (and thankful!).
This API, where you get a series of partial values, is designed to be easy to render with any of the `UI = f(state)` libraries like React or Lit, though you may need to short circuit some memoization or early exiting since whenever possible jsonriver will mutate existing values rather than creating new ones.
[-]
- rictic 18 hours ago
  I've just published v1.0.1. It's about 2x faster, and should have no other observable changes. The speedup is mainly from avoiding allocation and string slicing as much as possible, plus an internal refactor to bind the parser and tokenizer more tightly together.
  Previously the parser would get an array of tokens each time it pushed data into the tokenizer. This was easy to write, but it meant we needed to allocate token objects. Now the tokenizer has a reference to the parser and calls token-specific methods directly on it. Since most of the tokens carry no data, this keeps us from jumping all over the heap so much. If we were parsing a more complicated language this might become a huge pain in the butt, but JSON is simple enough, and the test suite is exhaustive enough, that we can afford a little nightmare spaghetti if it improves on speed.
  [-]
  - Inviz 17 hours ago
    I want to ditch stream-json so hard (needs polyfills in browser, cumbersome to use), but I need only one feature: invoke callback by path (e.g. `user.posts` need to invoke for each post in array) only for complete objects. Is this something that json river can support?
    [-]
    - rictic 16 hours ago
      jsonriver's invariants do give you enough info to notice which values are and aren't complete. They also mean that you can mutate the objects and arrays it returns to drop data that you don't care about.
      There might be room for some helper functions in something like a 'jsonriver/helpers.js' module. I'll poke around at it.
- stevage 20 hours ago
  Suggestion: make it clearer in the readme what happens with malformed input.
  I can imagine it being useful to have a made where you never emit strings until they are final, also. I don't entirely understand why strings are emitted incrementally but numbers aren't.
  [-]
  - xp84 19 hours ago
    Seems useful to me in the context of something like a progressively rendered UI. A large block of text appearing a few characters at a time would be fine, but a number that represents something like a display metric (say, a position, or font-size) going from 0 to 0.5 or from 1 to 1000, would result in goofy gyrations on-screen that don't make any sense. Or imagine if it was just fields in the app's data.
    Name: John Smith. Birth Year: A.D. 1 [Customer is a Senior: 2,024 years old]
    Name: John Smith. Birth year: A.D. 19 [Customer is a Senior: 2,006 years old]
    Name: John Smith. Birth year: A.D. 199 [Customer is a Senior: 1,826 years old]
    Name: John Smith. Birth year: 1997
    [-]
    - tags2k 10 hours ago
      If you're updating the UI every time you receive a single character from this library, you've got bigger problems than font size.
      [-]
      - spankalee 20 minutes ago
        If your UI layer can't efficiently update when you get new characters, you've got bigger problems than JSON parsing.
        Seriously, you should be able to update the UI with a new character, and much more, at 60fps easily.
      - sysguest 9 hours ago
        hmm this makes sense for LLM usage
        (but for other uses - nope)
  - rictic 18 hours ago
    Good feedback! Just updated the README with the following:
    > The parse function also matches JSON.parse's behavior for invalid input. If the input stream cannot be parsed as the start of a valid JSON document, then parsing halts and an error is thrown. More precisely, the promise returned by the next method on the AsyncIterable rejects with an Error. Likewise if the input stream closes prematurely.
    As for why strings are emitted incrementally, it's just that I was often dealing with long strings produced slowly by LLMs. JSON encoded numbers can be big in theory, but there's no practical reason to do so as almost everyone decodes them as 64bit floats.
syx 1 day ago
For those wondering about the use case, this is very useful when enabling streaming for structured output in LLM responses, such as JSON responses. For my local Raspberry Pi agent I needed something performant, I've been using streaming-json-js [1], but development appears to have been a bit dormant over the past year. I'll definitely take a look at your jsonriver and see how it compares!
[1] https://github.com/karminski/streaming-json-js
[-]
- rokkamokka 23 hours ago
  For LLMs I recommend just doing NDJSON, that is, newline delimited json. It's much simpler to implement
  [-]
  - rictic 22 hours ago
    Do any LLMs support constrained generation of newline delimited json? Or have you found that they're generally reliable enough that you don't need to do constrained sampling?
    [-]
    - sprobertson 20 hours ago
      not for the standard hosted APIs using structured output or function calling, best you can get is an array
  - stevage 20 hours ago
    I love NDJSON in general. I use it a lot for spatial data processing (GDAL calls it GeoJsonSeq).
- cjonas 23 hours ago
  Particularly for REACT style agents that use a "final" tool call to end the run.
saidinesh5 14 hours ago
I wrote something similar in my last job where we had to parse and query data from huge (50+ GB? I remember they weren't even fitting in my laptop) json files that were stored in an S3 Bucket..
We used the streaming parser to create an index of the file locally {json key: (byte offset, byte size)} and then simply used http range queries to access the data we needed.
Here is the full write up about it:
https://dinesh.cloud/2022/streaming-json-for-fun-and-profit/
And here is the open sourced code:
https://github.com/multiversal-ventures/json-buffet
simonw 1 day ago
If anyone needs to do this in Python I've had success with both ijson and jiter - notes here: https://til.simonwillison.net/json/ijson-stream and https://simonwillison.net/2024/Sep/22/jiter/
[-]
- Tmpod 3 hours ago
  +1 for ijson. I wrote some pretty fast and lightweight parsers a while back, using ijson's basic stream. Never heard of jiter, thanks for the posts!
carterschonwald 23 hours ago
Oh fun, I wrote a similar library in 2015 for Haskell. There is an annoying gotcha to deal with: there are sequences of valid characters that can be parsed incorrectly if you’re doing incremental chunks, namely if “0.0” is split across two input chunks you can get a token stream with two valid float literals rather than 1! Namely “0” and “.0”, which is just a really annoying wart of json float syntax.
[-]
- rictic 23 hours ago
  Yeah, getting numbers correct was one of the trickier wrinkles in the project. https://github.com/rictic/jsonriver/blob/5515be978bb564e9bdc...
- tracnar 23 hours ago
  Don't you need to wait for some kind of delimiter (like ",", "]", "}", newline, EOF) before parsing something else than a string?
  [-]
  - rictic 17 hours ago
    Only for numbers! Strings, objects, arrays, true, false, and null all have an unambiguous ending.
    [-]
    - stefs 6 hours ago
      but you don't do this for strings either, as shown in the examples - partial strings are pushed even though they're not yet ended:
      {"name": "Ale"}
- yonatan8070 23 hours ago
  An "off the top of my head" solution to this would be not to yield tokens until a terminating character (comma, \n, }).
jlundberg 20 hours ago
I really like just encoding each object as JSON and then concatinating them with a new line between.
Allows parsing and streaming without any special libraries and allow for an unlimited amount of data (with objects being reasonably sized).
Usually gives these files the .jsonlines suffix when stored on disk.
Allows for batch process without requiring huge amounts of memory.
[-]
- kondro 18 hours ago
  Me too, and it's quite a common technique.
  https://en.wikipedia.org/wiki/JSON_streaming
- 0x6c6f6c 17 hours ago
  Based on this thread that's called NDJSON
  Newline Delimited JSON
  TIL
  [-]
  - keitmo 2 hours ago
    It's also known as JSONL (JSON Lines).
    [-]
    - tracker1 21 minutes ago
      I'm pretty sure jsonl was a bit earlier as a term, but ndjson is now the more prominent term used for this... been using this approach for years though, when I first started using Mongo/Elastic for denormalized data, I'd also backup that same data to S3 as .jsonl.gz Leaps and bounds better than XMl at least.
hamburglar 12 hours ago
If you get partial string values that are replaced with longer string values as they are streamed in, how do you know when the value is finished being read and is safe to use?
magicalhippo 5 days ago
I wrote a more traditional JSON parser for my microcontooller project. You could iterate over elements and it would return "needs more data" if it was unable to proceed. You could then call it again after fetching more. Then just simple state machines to consume the objects.
The benefit with that was that you didn't need the memory to store the deserialized JSON object in memory.
This seems to be more oriented towards interactivity, which is an interesting use-case I hadn't thought about.
[-]
- rickcarlino 5 days ago
  I found this because I am interested in streaming responses that populate a user interface quickly, or use spinners if it is loading still
paulddraper 33 minutes ago
Interesting that it streams a string object value prior to being complete, but not the object key.
ww520 18 hours ago
This is nice to parse incomplete JSON as they come in.
I did something similar for streaming but built it with a streaming protocol at the frame level wrapping the JSON messages [1]. The streaming protocol has support for both the LF based scheme and the HTTP Content-Length header based scheme. It's for supporting MCP and LSP.
[1] https://github.com/williamw520/zigjr/?tab=readme-ov-file#str...
holdenc137 1 day ago
I don't get it (and I'd call this cumulative not incremental)
Why not at least wait until the key is complete - what's the use in a partial key?
[-]
- xg15 22 hours ago
  Doesn't it do exactly that?
  > As a consequence of 1 and 5, we only add a property to an object once we have the entire key and enough of the value to know that value's type.
  [-]
  - 0x6c6f6c 17 hours ago
    Their example in the README is extremely misleading then. It indicates your stream output is
    name: A name: Al name: Ale name: Alex
    Which would suggest you are getting unfinished strings out in the stream.
    [-]
    - __jonas 16 hours ago
      How is it misleading? It shows that it gives back unfinished values but finished keys.
- rictic 23 hours ago
  Cumulative is a good term too. I come from the browser world where it's typically called incremental parsing, e.g. when web browsers parse and render HTML as it streams in over the wire. I was doing the same thing with JSON from LLMs.
- simonw 1 day ago
  If you're building a UI that renders output from a streaming LLM you might get back something which looks like this:
```
  {"role": "assistant", "text": "Here's that Python code you aske
```
  Incomplete parsing with incomplete strings is still useful in order to render that to your end user while it's still streaming in.
  [-]
  - trevor-e 22 hours ago
    In this example the value is incomplete, not the key.
  - cozzyd 1 day ago
    incomplete strings could be fun in certain cases
    {"cleanup_cmd":"rm -rf /home/foo/.tmp" }
    [-]
    - stronglikedan 22 hours ago
      If any part of that value actually made it, unchecked, to execution, then you have bigger problems than partial JSON keys/values.
    - sublee 16 hours ago
      Incremental JSON parsing is key for LLM apps, but safe progressive UIs also need to track incompleteness and per-chunk diffs. LangDiff [1] would help with that.
      [1]: https://github.com/globalaiplatform/langdiff/tree/main/ts
      [-]
      - cozzyd 4 hours ago
        Why not just chunk the json packets instead?
    - xg15 11 hours ago
      Only because you have access to the incomplete value doesn't mean you should treat it like the complete one...
    - rictic 1 day ago
      Yeah, another fun one is string enums. Could tread "DeleteIfEmpty" as "Delete".
      [-]
      - Waterluvian 23 hours ago
        I imagine if you reason about incomplete strings as a sort of “unparsed data” where you might store or transport or render it raw (like a string version of printing response.data instead of response.json()), but not act on it (compare, concat, etc), it’s a reasonably safe model?
        I’m imagining it in my mental model as being typed “unknown”. Anything that prevents accidental use as if it were a whole string… I imagine a more complex type with an “isComplete” flag of sorts would be more powerful but a bit of a blunderbuss.
Xmd5a 21 hours ago
I wrote something similar that can also produce JSON incrementally from other streaming data sources. It combines a streaming JSON parser with streaming strings and a streaming regex engine.
Concretely, it means I can call an LLM, wrap its output stream in a streaming string, and treat it like a regular string. No need for print loops, it’s all handled behind the scenes. I can chain transformations (joining strings, splitting them with regexes, capturing substrings, etc.) and serialize the results into JSON progressively, building lazy sequences or maps on the fly.
The benefit is that I can start processing and emitting structured data immediately, without waiting for the LLM’s full response. Filtered output can be shown to users as it arrives, with near-zero added latency (aside from regex lookaheads).
chrchr 22 hours ago
I did something like this for Python [1]. The application I worked on at the time had a feature allowing users to import and export their data as a JSON document, and users often had enough data to make this cumbersome, especially with serialization and deserialization overhead. My implementation can also generate JSON documents as they stream out, from Python generators. The incremental JSON parsing was a little difficult to use, but incremental generation was an immediate win. We generated JSON documents from database results row-by-row and streamed the output to the web server, never producing the entire document in memory.
[1] https://github.com/chrchr/flojay
AaronFriel 1 day ago
Oh, this is quite similar to an online parser I'd written a few years ago[1]. I have some worked examples on how to use it with the now-standard Chat Completions API for LLMs to stream and filter structured outputs (aka JSON). This is the underlying technology for a "Copilot" or "AI" application I worked on in my last role.
Like yours, I'm sure, these incremental or online parser libraries are orders of magnitude faster[2] than alternatives for parsing LLM tool calls for the very simple reason that alternative approaches repeatedly parse the entire concatenated response, which requires buffering the entire payload, repeatedly allocating new objects, and for an N token response, you parse the first token N times! All of the "industry standard" approaches here are quadratic, which is going to scale quite poorly as LLMs generate larger and larger responses to meet application needs, and users want low latency outputs.
One of the most useful features of this approach is filtering LLM tool calls on the server and passing through a subset of the parse events to the client. This makes it relatively easy to put moderation, metadata capture, and other requirements in a single tool call, while still providing low latency streaming UI. It also avoids the problem with many moderation APIs where for cost or speed reasons, one might delegate to a smaller, cheaper model to generate output in a side-channel of the normal output stream. This not only doesn't scale, but it also means the more powerful model is unaware of these requirements, or you end up with a "flash of unapproved content" due to moderation delays, etc.
I found that it was extremely helpful to work at the level of parse events, but recognize that building partial values is also important, so I'm working on something similar in Rust[3], but taking a more holistic view and building more of an "AI SDK" akin to Vercel's, but written in Rust.
[1] https://github.com/aaronfriel/fn-stream
[2] https://github.com/vercel/ai/pull/1883
[3] https://github.com/aaronfriel/jsonmodem
(These are my own opinions, not those of my employer, etc. etc.)
marenVoyant88 3 hours ago
Finally, a way to make JSON as fluid as our UI needs to be.
eric-p7 21 hours ago
"has no dependencies, and uses only standard features of JavaScript so it works in any JS environment."
Then I see a Node style import and npm. When did Node/NPM stop being dependencies and become standardized by JavaScript? Where's my raw es6 module?
[-]
- jcla1 21 hours ago
  FWIW the import syntax is now part of standard JS, according to the ECMAScript 2026 specification:
  https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
  And node seems to be used only as a dev dependency, to test, benchmark and build/package the project. If you'd be inclined you can use the project's code as-is elsewhere, i.e. in the browser.
- rictic 21 hours ago
  Bare module specifiers aren't just for Node! Deno and browsers support import maps e.g.
  The library doesn't use any APIs beyond those in the JS standard, so I'm pretty confident it will work everywhere, but happy to publish in more places and run more tests. Any in particular that you'd like to see?
  [-]
  - o11c 20 hours ago
    Mostly unrelated, but does anyone know how are you supposed to make path-less module specifiers work for Node if you are not using npm but rather system-installed JS packages (Debian etc. install node-* packages into /usr/share/nodejs/)? With `require` it just works, but with `import` it errors and suggests passing the absolute path (even though it clearly knows what path ...).
    For some reason everybody in the JS world takes "download and execute random software from the Internet" as the only way to do things.
    [-]
    - rictic 20 hours ago
      Try import maps, something like:
      { "imports": { "express": "/usr/share/nodejs/express/index.js", "another-module": "/usr/share/nodejs/another-module/index.js" } }
      Then run node like: `node --import-map=./import-map.json app.js`
      The Debian approach of having global versions of libraries seems like it's solving a different problem than the ones I have. I want each application to track and version its own dependencies, so that upgrading a dependency for one doesn't break another, and so that I can go back to an old project and be reasonably confident it'll still work. That ultimately led me to nix.
      [-]
      - o11c 19 hours ago
        I have a simpler solution to the latter problem: if upgrading a dependency package breaks anything (barring multi-year deprecation, limited-time experimental previews, etc.), I blacklist it and never install that package ever again. After all, they are clearly lacking on either their testing infrastructure or their development guidelines.
        It's amazing how much the quality of installed software improves when you do this. Something our industry desperately needs.
zeroimpl 14 hours ago
I couldn’t find a library like this in PHP, but realized for my use case I could easily hack something together. Algorithm is simply:
- trim off all trailing delimiters: },"
- then add on a fixed suffix: "]}
- then try parsing as a standard json. Ignore results if fails to parse.
This works since the schema I’m parsing had a fairly simple structure where everything of interest was at a specific depth in the hierarchy and values were all strings.
mattvr 23 hours ago
You could also use JSON Merge Patch (RFC 7396) for a similar use case.
(The downside of JSON Merge Patch is it doesn't support concatenating string values, so you must send a value like `{"msg": "Hello World"}` as one message, you can't join `{"msg": "Hello"}` with `{"msg": " World")`.)
[1] https://github.com/pierreinglebert/json-merge-patch
[-]
klntsky 12 hours ago
Nice. You should pair it with Immer + React for a nice UI demo. This will hype hard if the value is properly demonstrated.
seanalltogether 1 day ago
Maybe I'm wrong but it seems like you would only want to parse partial values for objects and arrays, but not strings or numbers. Objects and arrays can be unbounded so it makes sense to process what you can, when you can, whereas a string or number usually is not.
[-]
- rictic 1 day ago
  Numbers, booleans, and nulls are atomic with jsonriver, you get them all at once only when they're complete.
  For my use case I wanted streaming parse of strings, I was rendering JSON produced by an LLM, for incrementally rendering a UI, and some of the strings were long enough (descriptions) that it was nice to see them render incrementally.
- everforward 1 day ago
  It could be useful if you're doing something with the string that operates sequentially anyways (i.e. block-by-block AES, or SHA sums).
  I _think_ the intended use of this is for people with bad internet connections so your UI can show data that's already been received without waiting for a full response. I.e. if their connection is 1KB/s and you send an 8KB JSON blob that's mostly a single text field, you can show them the first kilobyte after a second rather than waiting 8 seconds to get the whole blob.
  At first I thought maybe it was for handling gigantic JSON blobs that you don't want to entirely load into memory, but the API looks like it still loads the whole thing into memory.
- xg15 22 hours ago
  There is json that has very long string literals. Usually, it's either long-ish text or HTML content, or base64-encoded binary data.
  So I'd definitely count strings as "unbounded" as well.
- AaronFriel 1 day ago
  If you're generating long reports, code, etc. with an LLM, partial strings matter quite a lot for user experience.
hk1337 16 hours ago
This looks really nice. I think I could find a use for this.
The title made me think of Star Trek DS9 and Nog talking about The Great Material Continuum.
“Nog: The river will provide”
[-]
- sholladay 16 hours ago
  “Oh, that river. It can be very treacherous.” - Rom
keleftheriou 22 hours ago
Thanks for sharing!
Roughly how does it compare with https://github.com/promplate/partial-json-parser-js ?
alganet 1 day ago
Interesting approach.
I would expect an object JSON stream to be more like a SAX parser though. It's familiar, fast and simple.
Any thougts on not chosing the SAX approach?
[-]
- philbo 6 hours ago
  A nice thing about the SAX approach is it lets you layer other APIs on top too. I did something like that in BFJ:
  https://www.npmjs.com/package/bfj
- rictic 23 hours ago
  SAX is often better if you don't need the full final result, especially if you can throw away most of the data after it's been processed. The nice part about this API is that you just get a DeepPartial<FinalResult> so the code to handle a partial result is basically the same as the code to handle the final result.
- benatkin 1 day ago
  I think this is a lot like etree in python's streaming approach for XML, but with a simpler API, and incremental text parsing. With etree in python, you can access the incomplete tree data and not have to worry about events. So it's missing the SAX API part of a SAX approach, but is built like some real world libraries that use the SAX approach, which end up having a hybrid of events and trees.
  [-]
  - alganet 1 day ago
    It seems to be convenient for some cases. A large object with many keys, for example.
    I don't see it as particularly convenient if I want to stream a large array of small independent objects and read each one of them once, then discard it. The incremental parsed array would get bigger and bigger, eventually containing all the objects I wanted to discard. I would also need to move my array pointer to the last element at each increment.
    jq and JSON.sh have similar incremental "mini-object-before-complete" approaches to parsing JSON. However, they do include some tools to shape those mini-objects (pruning, selecting, and so on). Also, they're tuned for pipes (new line is the event), which caters to shell and text-processing tools. I wonder what would be the analogue for that in a higher language.
    [-]
    - benatkin 23 hours ago
      This is more versatile than it seems at first glance. Under invariants, it shows that you have arrays/objects only being mutated, so you have stable references. You could use a WeakSet to observe new children of an item coming in. You also may not even need manage this directly - you could debounce and just re-render a UI component by returning a modified virtual DOM. Or if you had a visualization in d3, it would automatically notice which ones are new.
      [-]
      - alganet 22 hours ago
        It does sound very practical indeed.
zahlman 23 hours ago
> If you gave this to jsonriver one byte at a time it would yield this sequence of values:
Does it create a new value each time, or just mutate the existing one and keep yielding it?
[-]
- rictic 23 hours ago
  It mutates the existing value and yields it again (unless the toplevel value is a string, because strings are immutable in JS).
rixed 19 hours ago
So SAX, but for json?
Plus ça change, et plus c'est la même chose.
jauntywundrkind 23 hours ago
It's no longer active, but Oboe.js did great stuff for a decade+ in this field! It has some very nice APIs for consuming. https://github.com/jimhigson/oboe.js/
It's less about incrementally parsing objects, and more about picking paths and shapes out from a feed. If you're doing something like array/newline delimited json, it's a great tool for reading things out as they arrive. Also great for example for feed parsing.
EGreg 21 hours ago
I recently also wrote a streaming JSON parser in PHP. In case anyone is interested, I would love to get your feedback. It’s designed to work independently or with the rest of our system.
https://github.com/Qbix/Platform/blob/main/platform/classes/...
quotemstr 22 hours ago
Awesome. You know what would be EVEN COOLER?
Given a schema and a JSON message prefix, parse the complete message but substitute missing field values with Promise objects. Likewise, represent lists as lazy sequences. Add a pubsub system.
florians 1 day ago
Noteworthy: Contributions by Claude
[-]
- rictic 23 hours ago
  Is true. I wrote a ton of tests, testing just about everything I can think of, including using a reverse parser I wrote to exhaustively generate the simplest 65k json values, ensuring that it succeeds with the same values and fails on the same cases as JSON.parse.
  Then added benchmarks and started doing optimization, getting it ~10x faster than my initial naive implementation. Then I threw agents at it, and between Claude, Gemini, and Codex we were able to make it an additional 2x faster.
codesnik 1 day ago
I can't imagine a usecase. Ok, you receive incremental updates, which could be useful, but how to find out that json object is actually received in full already?
[-]
- Supermancho 1 day ago
  When you want to pull multi-gig JSON files and not wait for the full file before processing is where I first used this.
  [-]
  - rictic 23 hours ago
    Funnily enough, this was one of the first users of jsonriver at google. A team needed to parse more JSON than most JS VMs will allow you to fit into a single string, so they had no choice but to use a streaming parser.
- philipallstar 1 day ago
  When its closing brace or square bracket appears.
  EDIT: this is totally wrong and the question is right.
  [-]
  - rising-sky 1 day ago
    Actually, not quite how this works. You always get valid JSON, as in this sequence from the readme:
```json {"name": "Al"} {"name": "Ale"} ```
So the braces are always closed