Show HN: Zero-codegen, no-compile TypeScript type inference from Protobufs

(github.com)

134 points | by 18nleung 1 day ago

17 comments

mubou 1 day ago
The fact that the source is so small is wild. I would have expected a huge convoluted parsing library implemented in types.
On the other hand, the fact that this is even possible is more wild. Instead of replacing JS with a proper statically-typed language, we're spending all this effort turning a preprocessor's type system into a turing-complete metalanguage. Pretty soon we'll be able to compile TypeScript entirely using types.
[-]
- spankalee 1 day ago
  TypeScript does an amazing job at describing the types of real-world JavaScript. It's incredibly good, and very useful, even in the face of extremely dynamic programs. The fact that it can describe transforms of types, like "this is a utility that adds an `xxx` prefix to every property name" is frankly unparalleled in mainstream languages, but more importantly lets us describe patterns that come up in real-world JS programs - it's not fluff!
  And luckily, the most complex of types are usually limited to and contained within library type definitions. They add a lot of value for the library users, who usually don't have to deal with that level of complexity.
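  As a rough illustration of the "adds a prefix to every property name" kind of transform, here is a minimal sketch using key remapping in a mapped type. The names `Prefixed` and `addPrefix` are made up for this example and do not come from any library.

```typescript
// Hypothetical helper type: prepend a prefix to every string key of T,
// capitalizing the original key so `name` becomes `userName`.
type Prefixed<T, P extends string> = {
  [K in keyof T as K extends string ? `${P}${Capitalize<K>}` : never]: T[K];
};

// Runtime counterpart of the type above (also hypothetical).
function addPrefix<T extends Record<string, unknown>, P extends string>(
  obj: T,
  prefix: P,
): Prefixed<T, P> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    out[prefix + key.charAt(0).toUpperCase() + key.slice(1)] = value;
  }
  // The compiler cannot follow the loop, so the result is asserted here.
  return out as unknown as Prefixed<T, P>;
}

const prefixed = addPrefix({ name: "Ada", age: 36 }, "user");
// The static type of `prefixed` is { userName: string; userAge: number }.
const prefixedName: string = prefixed.userName;
const prefixedAge: number = prefixed.userAge;
```

  The assertion inside `addPrefix` is the sort of complexity that stays inside the library; callers only see the derived property names.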
  [-]
  - rtpg 1 day ago
    Typescript is so much better than almost every other dependently typed language in terms of expressing these things[0], and it's still kind of miserable.
    We still have a long way to go in figuring out how to get our type systems to be easy enough to use to where this stuff doesn't surprise people anymore (because it shouldn't! identifier manipulation should be table stakes and yet)
    [0]: modulo soundness of course! Though I don't think that's intrinsic to the expressiveness
  - mubou 1 day ago
    I don't disagree! It's just the fact that it has to be transpiled to JS that's the problem, because it means none of the types are "real"; there's no runtime assurance that a string is actually a string. TS is great and I'd never go back to JS, but it's ultimately a bandaid. Native TS support in browsers is probably never going to happen, though, sadly.
    Imagine if WASM were supported natively instead, with browsers exposing the same DOM interfaces that they do to JS. You could link a wasm binary in a <script> and do everything you can with JS/TS, but with any language of your choosing. No doubt a compiled form of TS would appear immediately. We'd no longer need separate runtime type checking.
    Just feels like priorities are in the wrong place.
    [-]
    - spankalee 1 day ago
      I think you're conflating cause and effect in several cases. TypeScript can't be thought of, and would never exist, independently from JavaScript like you're trying to do.
      TypeScript wasn't created separate from JavaScript and then chose JavaScript as a backend. TypeScript only exists to perform build-time type checking of JavaScript. There wouldn't be a TypeScript that compiled to something else, because other languages already have their own type systems.
      Runtime type-checking isn't part of TypeScript because 1) it isn't part of JavaScript, and TypeScript doesn't add runtime features anymore; 2) it'd be very expensive for simple types; and 3) complex types would be prohibitively expensive, as you have to both reify the types and perform deep structural checking.
      WASM also is natively supported, and with newer extensions like reference types and GC, we're getting closer to the point where a DOM API could be defined. It'll still be a long while, but that's the long-term direction it's heading in. But even then, you would only see a TypeScript-to-WASM compiler[1] because there's already so much TypeScript out there, not because TypeScript is a particularly good language for that environment. A more static language would be a lot better for a WASM target.
      [1]: Porffor is already such a compiler for JS and TS, but it does not do runtime type-checking: https://porffor.dev/
      [-]
      - mubou 1 day ago
        I was thinking more along the lines of a TypeScript-like compiled language. For example, AssemblyScript[0] but with the web APIs added back in. (Personally I'd prefer C# or Rust, but you know most devs will want to keep using JS/TS.) WASM isn't natively supported in the way that I'm wishing it were, though; you still have to use JS to bootstrap it, and JS to call back into web apis. In my ideal world, I'd want to be able to compile
        public static void Main() { Document.Body.Append(new Div("hello world")); }
        and be able to use it in a page like
        <script src="hello.wasm"></script>
        and have that just work without any JS "glue code". Maybe someday. I know they're working on the DOM APIs, but as you said, it's been slow going. Feels like priorities are elsewhere. Even CSS is moving forward with new features faster than WASM is (nesting and view transitions are awesome though).
        (Btw when I said "separate runtime type checking" I didn't mean language-level; I was referring to the validation libraries and `typeof`'s that are required today since TS types obviously no longer exist after build. If it were a real static language, then of course you can't store a bool in a string in the first place.)
        [0]: https://www.assemblyscript.org/ (Porffor looks neat too. Wonder if it could be useful in plugin architectures? E.g. plugins can be written in JS, and the program only needs a WASM interpreter. I'll bookmark it. Thanks.)
    - p1necone 1 day ago
      > there's no runtime assurance that a string is actually a string.
      As someone who's written a lot of Typescript in fairly large projects: in practice this isn't really an issue if you
      1. ban casting and 'any' via eslint,
      2. use something like io-ts at http api/storage boundaries to validate data coming in/out of your system without a risk of validator/type mismatch.
      But you have to have total buy in from everyone, and be willing to sit down with new devs and explain why casting is bad, and how they can avoid needing that eslint suppression they just added to the codebase. It certainly would be easier if it just wasn't possible to bypass the type system like this.
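      A minimal sketch of the boundary-validation idea, written without io-ts so it stays self-contained. The `User` shape and the function names are hypothetical, and the `in` narrowing assumes TypeScript 4.9 or later. Unlike io-ts, this hand-written guard is kept in sync with the interface manually, so it does not remove the validator/type mismatch risk on its own.

```typescript
// Hypothetical shape expected from an HTTP API or storage layer.
interface User {
  id: number;
  name: string;
}

// Type guard: inspects the untrusted value at runtime and, if it passes,
// narrows it to User for the compiler without any cast.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  if (!("id" in value) || !("name" in value)) return false;
  return typeof value.id === "number" && typeof value.name === "string";
}

// Boundary function: everything past this point can trust the static type.
function parseUser(json: string): User {
  const data: unknown = JSON.parse(json);
  if (!isUser(data)) throw new TypeError("payload is not a User");
  return data;
}

const parsedUser = parseUser('{"id": 1, "name": "Ada"}');
```

      Typing the `JSON.parse` result as `unknown` rather than leaving it as `any` is what forces the check to happen before the value is used.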
      [-]
      - mubou 1 day ago
        I know, but it's that last bit: it shouldn't be possible to bypass it. C# actually got itself into a similar issue despite being a proper static language, because when it added "nullable reference types" (where you can't assign null to a variable of type `Foo` unless it's explicitly typed as `Foo?`) they did it like TypeScript using purely static analysis to avoid having to change the language at a lower level (for compatibility).
        Even though it works 99% of the time, just like in TS you can occasionally run into a bug because some misbehaving library handed you a null that it said can't be a null...
        [-]
        Timon3 1 day ago
        On the other hand, disallowing bypassing it limits what you can do. There's always a ceiling to what the compiler can figure out, and some very complex types can't be analysed statically right now. By allowing bypassing the system, I can still accurately type those functions and reap all the rewards, and I can make sure everything works by combining unit tests with type unit tests. If bypassing was disallowed, I'd be more limited in what I can express.
        [-]
        yencabulator 16 hours ago
        Safety bypasses should be opt-in, case by case, and very explicit. For example, Rust's `unsafe` allows bypassing any limitation the language safety imposes on you normally, but all code not explicitly labeled unsafe is always in the very very safe mode.
        Even inside the Typescript rules, `as` is a ridiculously dangerous timebomb.
        Typescript is 100% about "convenience" and write-lots-of-code-now style of productivity, ~0% about safety or long-term maintainability.
        [-]
        Timon3 15 hours ago
        What's the big difference between `unsafe` and `as` regarding explicit labelling? Both are opt-in and explicit. As the user of a function, you don't see either from the outside. If you don't like `as`, it's fine to use a linter to disallow it.
        [-]
        yencabulator 15 hours ago
        The difference is that in everyday Typescript you end up using `as`, so its presence is not a blaring alarm.
        Grepping a real world codebase that would not be `unsafe` in Rust:
        event as CustomEvent<T>
        const errorEvent = event as ErrorEvent;
        const element = getByRole("textbox");
        expect(element).toBeInstanceOf(HTMLInputElement);
        const input = element as HTMLInputElement;
        const element = parent.firstElementChild as HTMLElement;
        type ItemMap = Map<Item["id"], Item>; ... new Map() as ItemMap
        const clusterSource = this.map.getSource(sourceName) as GeoJSONSource;
        [K in keyof T as T[K] extends Fn ? K : never]: T[K];
        target[type] as unknown as Fn<...
        export const Foo = [1,2,3] as const;
        and on it goes. Typescript normalizes unsafe behavior.
        [-]
        Timon3 14 hours ago
        Many, if not most, of these occurrences can be made safe. It's very rare that I need `as`, and even more rare that I can't actually check the relevant properties at runtime to ensure the code path is valid.
        It's on you to ensure that you don't misuse `as`. If I could choose between current TS, and a "safer" one that's less expressive in complex cases, I'd choose the current one any day of the week.
        [-]
        yencabulator 14 hours ago
        "Typescript can be made safe" is the "C++ has a subset that is good" argument. Meh.
        [-]
        Timon3 14 hours ago
        Almost every language has some way to do stupid things. Say you're working in C# - you can forcefully cast almost anything to almost anything else, just like in TS. So according to you, C# is just as bad as TS in this respect, right?
        [-]
        yencabulator 14 hours ago
        If that's a thing commonly needed for basic operations like letting your event handler actually access the event details, then very much yes.
        Sane languages have a downcast mechanism that doesn't pretend it succeeds every time.
        [-]
        Timon3 14 hours ago
        Weird, I don't need to do that.
        Also weird that Typescript has exactly the mechanism you're talking about. Why are you acting like it doesn't?
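        The comment does not say which mechanism is meant. One likely candidate is `instanceof` narrowing, which checks at runtime and only narrows the static type when the check succeeds. A minimal sketch, with hypothetical class names standing in for DOM event types:

```typescript
// Hypothetical base and derived classes, standing in for Event / CustomEvent.
class BaseEvent {
  type: string;
  constructor(type: string) {
    this.type = type;
  }
}

class DetailEvent extends BaseEvent {
  detail: string;
  constructor(type: string, detail: string) {
    super(type);
    this.detail = detail;
  }
}

function summarize(event: BaseEvent): string {
  // Checked downcast: inside this branch `event` has type DetailEvent.
  if (event instanceof DetailEvent) {
    return `${event.type}: ${event.detail}`;
  }
  // The failure case is handled explicitly instead of being assumed away.
  return `${event.type}: (no detail)`;
}

const withDetail = summarize(new DetailEvent("ping", "hello"));
const withoutDetail = summarize(new BaseEvent("ping"));
```

        By contrast, `event as DetailEvent` would compile for both calls and leave `detail` undefined at runtime for the second one.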
        neonsunset 8 hours ago
        You can only do this with `unsafe { }` or `Unsafe.As/.BitCast`. Casts from/to `object` are type-safe even though they may not be very user-friendly or a good use of the type system in general.
    - merb 1 day ago
      Wasm gc was needed for that. Wasm evolves slowly so that it can be done right. Even if the dom api comes, not a lot of it will change since only c-like languages will be as small as possible to fit into the space of JavaScript.
- sgrove 1 day ago
  Or even run doom in TypeScript's type system!
  [-]
  - mubou 1 day ago
    Prepare to have your mind blown:
    https://www.youtube.com/watch?v=0mCsluv5FXA
    [-]
    - IshKebab 1 day ago
      Probably not though because he was clearly referring to that.
- sandreas 18 hours ago
  Here is Doom in TypeScript types: https://www.tomshardware.com/video-games/porting-doom-to-typ...
  A fun read / Video...
- 18nleung 1 day ago
  I would have written a shorter source, but I did not have the time.
- throwanem 1 day ago
  People have fussed the same of the C preprocessor, around the same time I and maybe you were born. (There's a pretty good chance I'm your parents' age, and nearly no chance you're the age of mine.)
  [-]
  - NoTeslaThrow 1 day ago
    The criticisms were valid then, too. C (including the preprocessor of course) is still not fully parseable if you include things like token concatenation.
    [-]
    - throwanem 1 day ago
      I make no representation as to soundness, then or now. Not till I figure out where my copy of the UNIX-HATERS Handbook has got to, at any rate. I've had cause reasonably recently to reread the X and sendmail chapters, not so much this one.
      [-]
      - antonvs 1 day ago
        X and sendmail are not really very relevant today.
        [-]
        throwanem 1 day ago
        The mistakes embodied in both thus far look not just still relevant but positively timeless. Certainly, to judge by how often young people with no sense of their field's history recapitulate those mistakes.
- plopz 1 day ago
  I wish javascript had gone in the same direction as php with types.
  [-]
  - rad_gruchalski 1 day ago
    Which is?
    [-]
    - 1oooqooq 1 day ago
      [flagged]
      [-]
      - rad_gruchalski 1 day ago
        I was hoping to hear how php types are better than TS instead of another rant about how Rust is the greatest. Anyone?
        By the way, having lived in Scala 2 for a few years, Rust is half-assed. Type system leaks through the fingers when working with async collections. Future.sequence from Scala makes great cli apps. Scala collections are a work horse. The only thing from Rust I like is the shorthand question mark return (compiler magic for Result type).
        [-]
        1oooqooq 23 hours ago
        lol. i mentioned rust as being the worst typed solution, to highlight ts is even beneath that.
        [-]
        rad_gruchalski 20 hours ago
        You haven’t mentioned anything. It’s not clear what you mean. All you said was “coming from rust”. That could mean a ton of things.
spankalee 1 day ago
This would be even nicer if TypeScript added type inference for tagged template literals, like in this issue [1]. Then you could write:
```
    const schema = proto`
      syntax = "proto3";

      message Person { ... }
    `;

    type Person = typeof schema['Person'];
```
And you could get built-in schema validation with a sophisticated enough type definition for `proto`, and nice syntax highlighting in many tools with a nested grammar.
We would love to see this feature in TypeScript to be able to have type-safe templates in lit-html without an external tool.
The issue hasn't seen much activity lately, but it would be good to highlight this library as another use case.
[1]: https://github.com/microsoft/TypeScript/issues/33304
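For context, here is a small sketch of the gap the issue describes, as I understand it: a tag function receives a `TemplateStringsArray`, so the literal text is not available to the type system, while a plain generic call does preserve it. The function names `tag` and `literal` are hypothetical.

```typescript
// Tag function: `strings` is a TemplateStringsArray, so the result is typed
// as plain `string` and the exact text is lost at the type level.
function tag(strings: TemplateStringsArray): string {
  return strings.join("");
}

// Plain generic call: S is inferred as the exact string literal type.
function literal<S extends string>(source: S): S {
  return source;
}

const tagged = tag`message Person {}`;
const called = literal(`message Person {}`);

// This annotation compiles only because `called` kept its literal type.
const exact: "message Person {}" = called;
```

Both calls return the same value at runtime; only the second keeps enough type information for type-level parsing.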
[-]
jitl 1 day ago
It’s pretty rad how flexible template literal types are, but I can’t imagine wanting this kind of shenanigans hanging out in a production app slowing down compile times. I prefer to define types in TypeScript and generate proto from that, since the TypeScript type system is so much more powerful than the Protobuf system. Types are much more composable in TS.
[-]
- h1fra 1 day ago
  Can you run Doom in a Typescript string template?
- tantalor 1 day ago
  What do you use to go from ts->pb?
  [-]
  - jitl 1 day ago
    I have an old public version here: https://github.com/justjake/ts-simple-type/blob/main/src/com...
    Ultimately i decided ts-simple-type is too difficult to maintain, so now I just use the TypeScript compiler API directly to introspect types and emit stuff, but most of that code is private to Notion Labs Inc
mherkender 1 day ago
This is kinda why I hate advanced type systems, they slowly become their own language.
"No compile/no codegen" sounds nice until you get slow compile times because a type system is a slow VM, the error messages are so confusing it's hard to tell what's going on, and there are no debugging tools.
throwanem 1 day ago
I love this, and I bet the compile errors it produces on malformed protobuf are wild.
pragma_x 1 day ago
What's kind of amazing is that Typescript's matching of strings through the type system converges on a first-class PEG in a few places (see string.ts). The rest of the library is really damn succinct for how much lifting it's doing.
My hat's off to the author - I attempted something like this for a toy regex engine and got nowhere fast. This is a much better illustration of what I thought _should_ be possible, but I couldn't quite wrap my head around the use of the ternary operator to resolve types.
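To make the "ternary operator to resolve types" part concrete, here is a toy sketch of the general technique: conditional types matching a template literal pattern, with `infer` pulling out the pieces and recursion consuming the rest. This is not taken from the library. The `Scalar` and `Fields` names are invented, and the input is assumed to have no whitespace between fields.

```typescript
// Map a few protobuf scalar names to TypeScript types (illustrative subset).
type Scalar<T extends string> =
  T extends "string" ? string :
  T extends "int32" ? number :
  T extends "bool" ? boolean :
  unknown;

// Each `extends ... ? ... : ...` is a type-level if/else. The pattern peels
// one "type name = N;" entry off the front and recurses on the remainder.
type Fields<S extends string> =
  S extends `${infer T} ${infer N} = ${number};${infer Rest}`
    ? { [K in N]: Scalar<T> } & Fields<Rest>
    : {};

type Person = Fields<"string name = 1;int32 age = 2;">;

// Type-checks only because Person resolved to { name: string } & { age: number }.
const person: Person = { name: "Ada", age: 36 };
```

A real parser also has to handle whitespace, comments, nesting and errors, which is presumably where most of the library's effort goes.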
anjandutta 23 hours ago
This is super cool — love the zero-codegen approach. I’ve had to deal with codegen hell in monorepos where a tiny .proto change breaks half the pipeline. Curious how this handles more complex types like nested messages or oneof fields?
Also, been building something in a different space (LeetCode prep tool), but the idea of removing build steps for dev speed really resonates. Would love to see how this could plug into a lightweight frontend setup.
aappleby 1 day ago
This is both hilarious and awesome. I think the Typescript devs are just showing off at this point. :D
mifydev 1 day ago
This makes me wonder if this is the way to do schema generation in Typescript. I’m working on Typeconf, and we have a separate step for translating Typespec schema to Typescript, it’ll be cool if we could just load typespec directly.
recursive 1 day ago
This requires the whole `.proto` declaration inline in source as a string constant. I'm not holding my breath on "Import non-js content"[1] getting approved, so that means you still have to use another build dependency, or manually keep the .proto files synchronized across multiple sources of truth. In that light, it's not clear when this would be a benefit over straight-forward code gen. Cool POC hack though.
[1]: https://github.com/microsoft/TypeScript/issues/42219
[-]
- catapart 1 day ago
  It's true that it's another dependency, but this is the entire contents of a file I drop into my project root called `raw-loader.d.ts`:
```
declare module '*?raw' {
  const rawFileContent: string
  export default rawFileContent
}
```
  Then, when I add the file to my types property array of my tsconfig's compilerOptions, I can import anything I want into a typescript file as a string, so long as I add "?raw" to the end of it. I use it to inject HTML and CSS into templates. No reason it couldn't be used to inject a .proto file's contents into the inline template.
  Again, you're technically correct! But an "import non js content" feature is a pretty solvable problem in TS. Maybe not at the language level, but at the implementation level, at least.
  [-]
  - phpnode 1 day ago
    right, but typescript sees that as a `string`, and not a string literal and thus cannot be parsed by this project or others like it.
    [-]
    - bastawhiz 1 day ago
      That's simply not true. A loader can do whatever it wants. It translates the raw file contents into anything. Granted, at that point you might as well have the loader just be a traditional protobuf compiler, but the point still stands that this isn't an invalid solution.
      [-]
      - phpnode 1 day ago
        You’re talking about runtime, I’m talking about at compile time
        [-]
        bastawhiz 21 hours ago
        No, I'm talking about compile time. A loader at compile time (e.g., for webpack) takes whatever your import path is and translates it into something that can be used by the JavaScript application.
        It would be awfully silly to do this at runtime because typescript doesn't exist at runtime, which is sort of the whole point of the library.
        [-]
        phpnode 15 hours ago
        I think you’re misunderstanding how this project works. The contents of that ?raw file are opaque to the typescript type system, it will see it as a `string`, not as the literal content of the file, therefore it cannot be parsed using template literal types as this project does and cannot be used to derive typescript types from protobuf files.
        [-]
        bastawhiz 5 hours ago
        The loader proposal linked by the top level comment does not create strings, it imports them as the string literal type of their contents. That would absolutely 100% work with this project since the content of the imported file is available to the type system.
        The reply with the typescript definition for ?raw is unrelated to this project and would neither solve the issue presently nor address it in the future. But if you implemented it in your bundler, it absolutely solves this problem exactly as described, because the imported file can have whatever boilerplate you want around it (like `as const`). This is something that exists and is usable today.
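        The `string` versus string-literal distinction this sub-thread turns on can be shown in a few lines. This is a sketch with a made-up `MessageName` type, not the project's own parser. The `widened` constant stands in for what a `declare module '*?raw'` import looks like to the checker.

```typescript
// Toy type-level extractor: pulls the name out of "message <Name> {...".
type MessageName<S extends string> =
  S extends `message ${infer N} {${string}` ? N : never;

// Helper to observe `never` as a boolean literal type.
type IsNever<T> = [T] extends [never] ? true : false;

// Typed as plain `string`: the checker knows nothing about the contents.
const widened: string = "message Person {}";

// Typed as the literal "message Person {}": the contents are visible.
const inlined = "message Person {}" as const;

// Compiles only because MessageName<string> collapses to `never`.
const nameLost: IsNever<MessageName<typeof widened>> = true;

// Compiles only because the literal version resolves to "Person".
const nameKept: MessageName<typeof inlined> = "Person";
```

        So whether a loader helps depends on whether the type checker, not just the bundler, ends up seeing the file contents as a literal type.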
- yencabulator 16 hours ago
  Even then, no import support -> must preprocess the .proto anyway.
  Might as well do code generation at that point, it'd even be debuggable.
- ZitchDog 1 day ago
  The problem is that TypeScript is terrible at codegen, there are no standard extension points like we have with javac and others. So we are forced to do these crazy hacks at the type level rather than just generating types as you would in other languages.
  [-]
  - recursive 1 day ago
    Not familiar with the capabilities of javac, but in my imagination, I'm referring to a tool that runs prior to the typescript compiler, that just writes the intended source as text. Typescript never knew it wasn't in the repository or anything.
- cadamsdotcom 1 day ago
  That can be done with a `sed` call so it’s not a new dependency.
meindnoch 1 day ago
Looks like TypeScript envies Swift's compile times.
catapart 1 day ago
Very cool work!
Also, I hope you expected me to read that output in the same cadence as the Hooli focus groups, because that's exactly what I did.
jillyboel 1 day ago
Cool, but I assume not great for performance?
Probably better to just stick with codegen
[-]
- 18nleung 1 day ago
  You're right that IDE/dev-time performance might be slower than using generated types since this relies on "dynamic" TypeScript inference rather than static codegen'd types.
  That said, depending on how your codegen works and how you're using protos at runtime, this approach might actually be faster at runtime. Types are stripped at compile-time and there’s no generated class or constructor logic — in the compiled output, you're left with plain JS objects which potentially avoids the serialization or class overhead that some proto codegen tools introduce.
  (FWIW, type inference in VSCode seemed reasonably fast with the toy examples I was playing with)
  [-]
  - recursive 1 day ago
    Typescript never generates classes or constructors that aren't present in source code. Whether or not constructors are present is completely independent from whether you're using code gen.
  - jillyboel 1 day ago
    > depending on how your codegen works and how you're using protos at runtime, this approach might actually be faster at runtime
    If your codegen is introducing runtime overhead you should use a different codegen.
    > type inference in VSCode seemed reasonably fast with the toy examples I was playing with
    It usually is. It can become a problem in a real project that has a lot of stuff going on, though.
- dtech 1 day ago
  Assuming you mean compiler/editor performance then yes I assume this wrecks it. Shouldn't matter for runtime though.
  [-]
  - jillyboel 1 day ago
    Right, I meant editor performance. None of the options should impact runtime performance anyway.
nikolayasdf123 1 day ago
I wish there was the other way around: infer binary encoding from native types.
tim1994 1 day ago
It's always impressive how far people can take TypeScript and even build parsers with it. But this is limited to inlined string literals and cannot read files (a TS limitation).
I wonder if the author has a use case in mind for this that I don't see. Like if you are only using TS, what's the point of protobuf? If you are exchanging data with programs written in other languages why avoid the protobuf tooling that you need anyway?
Maybe this is just a fun toy project to write a parser in TS?
porridgeraisin 1 day ago
Cool. Once the linked TS issue is resolved it will be able to resolve from files too which is great