ASCII-Driven Development

(medium.com)

116 points | by _hfqa 2 days ago

30 comments

  • tptacek 12 hours ago
    This is a tangential point (this post is not really about TUIs; sort of the opposite), and I think lots of people know it already, but I only figured it out last week and so can't resist sharing it: agents are good at driving tmux, and with tmux as a "browser", they can verify TUI layouts.

    So you can draw layouts like this and prompt Claude or Gemini with them, and get back working versions, which to me is space alien technology.

    • agavra 8 hours ago
      This is spot on. I understand very little about how terminal rendering works, yet I was able to build github.com/agavra/tuicr (Terminal UI for Code Review) in an evening. The initial TUI design was done via Claude.
    • heliumtera 11 hours ago
      Yeah, text was king yesterday, will be tomorrow
    • eterps 9 hours ago
      Would love to hear more about this approach.
      • tptacek 9 hours ago
        It's actually really easy in Claude Code. Get a TUI to the point where it renders something, and get Claude to the point where it knows what you want to render (draw it in ASCII like this post proposes, for instance).

        Then just prompt Claude to "use tmux to interact with and test the TUI rendering", prompt it through anything it gets hung up on (for instance, you might remind Claude that it can create a tmux pane with fixed size, or that tmux has a capture-pane feature to dump the contents of a view). Claude already knows a bunch about tmux.

        Once it gets anything useful done, ask it to "write a subagent definition for a TUI tester that uses tmux to exercise a TUI and test its rendering, layout, and interaction behavior".

        Save that subagent definition, and now Claude can do closed-loop visual and interactive testing of its own TUI development.
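
        For concreteness, here's a minimal sketch of the loop Claude ends up running, assuming the agent scripts tmux from Python and "./mytui" stands in for whatever binary is under test:

          import subprocess, time

          def tmux(*args):
              # Run a tmux command and return its stdout.
              return subprocess.run(["tmux", *args], capture_output=True, text=True).stdout

          # Detached session with a fixed 80x24 pane, so the layout is deterministic.
          tmux("new-session", "-d", "-s", "tui-test", "-x", "80", "-y", "24")
          tmux("send-keys", "-t", "tui-test", "./mytui", "Enter")  # launch the TUI
          time.sleep(0.5)                                          # give it time to render
          tmux("send-keys", "-t", "tui-test", "Down")              # simulate a keystroke
          screen = tmux("capture-pane", "-t", "tui-test", "-p")    # "screenshot" the pane as text
          # (pass "-e" to capture-pane as well to keep color escape sequences)
          print(screen)
          tmux("kill-session", "-t", "tui-test")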

        • electroly 8 hours ago
          Can you explain tmux's contribution here? I'm confused why this process wouldn't work just the same if CC directly executed the program rather than involving tmux. Are you just using tmux to trick the program under test into running its TUI instead of operating in a dumb-stdout mode?
          • tptacek 8 hours ago
            It allows Claude to take screenshots and generate keyboard inputs. It's like TUI Playwright.
            • mrstackdump 7 hours ago
              Maybe I'm not understanding it (totally possible!) but could Claude just do that by reading standard out and writing to standard in?
              • tptacek 7 hours ago
                I had a really hard time getting anything like that to work (you can't just read stdout and write stdin, because you're driving a terminal in raw mode), but it took like 3 sentences worth of Claude prompt to get Claude to use tmux to do this reliably.
                • alehlopeh 6 hours ago
                  I tell Claude code to use an existing tmux session to interact with eg a rails console, and it uses tmux send-keys and capture-pane for IO. It gets tripped up if a pager is invoked, but otherwise it works pretty well. Didn’t occur to me to tell it to take screenshots.
                  • tptacek 6 hours ago
                    `tmux capture-pane`.
                • mrstackdump 6 hours ago
                  I would love to see your prompt if you ever post it anywhere.
                  • _sinelaw_ 21 minutes ago
                    For Claude, it's enough to prompt "use tmux to test"; that usually works out of the box. If colors are important I also add "use -e option with capture-pane to see colors". It just works. I used it regularly with Claude and my TUI. For agents other than Claude I need to give a more specific set of instructions ("use send-keys, capture-pane and mouse control via tmux" etc.)

                    Since I have e2e tests, I only use the agent for two things: guiding it on how to write the e2e test ("use tmux to try the new UI and then write a test") and evaluating overall usability (fake user testing, before actual user testing): "use tmux to evaluate the feature X and compile a list of usability issues"

              • rsanheim 6 hours ago
                Also many CLIs act differently when invoked connected to a terminal (TUI/interactive) vs not. So you’d run into issues there where Claude could only test the non-interactive things.
            • alehlopeh 6 hours ago
              So by screenshots you mean tmux capture-pane, not actual screenshots. So in essence it is using stdout, just not Claude’s own.
  • mixmastamyk 11 hours ago
    Like the idea, but this is definitely Unicode and not ASCII. It's hard to believe someone finished a piece of this length but still misunderstood the term, especially when some examples have emoji in them. Alternatively, they chose a misleading name on purpose. Why? Someone mentioned TUI, which sidesteps the issue entirely.
    • ehsanu1 11 hours ago
      It kind of makes sense if you relate it to ASCII art, which is very often not ASCII for similar reasons. The naming evokes that concept for me at least. Naming is hard in general, I'm sure they tried to find a name that they thought worked best.

      I agree that "TUI" is a better fit though. But not TUI-driven-development, more like TUI-driven-design, followed by using the textual design as a spec (i.e. spec-driven development) to drive GUI implementation via coding agents.

    • kmoser 11 hours ago
      You're technically right, but I think that's a minor quibble. They could have indeed limited it to ASCII (7-bit if you really want to be a purist) and everything would still work just as well.
  • stormy 4 hours ago
    Most ASCII/Unicode diagrams spat out by AI have misaligned boxes, similar to the ones generated in the article.

    I’m not affiliated, but to clean them up you can use something like ascii-guard (https://github.com/fxstein/ascii-guard), a linter that fixes the alignment. Beats doing it by hand after multiple attempts at telling the AI to do it and watching it repeatedly fail.

  • jimbo808 6 hours ago
    Sorry to be pedantic, but there are a bunch of non-ASCII characters (,↑,) in the mockups and the article contains a lot of AI tropes.
  • theturtle32 10 hours ago
    I love it conceptually, but I can't get past the abject failure of the right edges of boxes to be properly aligned. Because of a mishmash of non-fixed-width characters (emoji, etc.), each line has a slightly different length and the right edges of boxes are a jagged mess and I can't see anything else until that's cleaned up.
    • rmunn 10 hours ago
      Emojis mixed with ASCII-era characters are hard to get right. Some terminal emulators get it right nearly all the time (e.g. Ghostty, which has had a lot of thought and effort put into getting it right) and yet there are still open issues in the Ghostty repo about inconsistent character width. There are just so many corner cases that it's hard.

      That said, the edge alignment is, I believe, caused by the fact that LLMs are involved in the process, because the LLMs never "see" the final visual representation that humans see. Their "view" of the world is text-based, and in the text file, those columns line up because they have the same number of Unicode codepoints per row. So the LLMs do not realize that the right edges are misaligned visually. (And since the workflow described is for an LLM to take that text file as input and produce output in React/Vue/Svelte/whatever, the alignment of the text file needs to stay LLM-oriented for it to work properly. I assume, of course, since I haven't tried this myself.)

      • kevin_thibedeau 8 hours ago
        They are treated like double-width characters. All it takes is a Unicode-aware layout algorithm that tracks double-width codepoints. The tricky part is older single-width symbols that were originally not emoji and now have ambiguous width depending on the terminal environment's default presentation mode.
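
        A minimal sketch of that width tracking in Python (simplified on purpose: it ignores ZWJ sequences, combining marks, and variation selectors, which is exactly where terminals still disagree):

          import unicodedata

          def cell_width(text):
              # East Asian Wide ("W") and Fullwidth ("F") codepoints take two
              # terminal cells; everything else is counted as one here.
              return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
                         for ch in text)

          print(cell_width("hello"))   # 5
          print(cell_width("日本語"))   # 6: each CJK character is two cells
          # The older symbols mentioned above land in the ambiguous class:
          print(unicodedata.east_asian_width("\u2192"))  # 'A': arrow width depends on the terminal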
        • rmunn 8 hours ago
          That's how it should work, and does in terminals that are doing it right. Browsers, however, are looking at the monospaced font and saying "Okay, Source Code Pro doesn't have the U+2192 codepoint," (the → arrow) "so let me find a font that does." On my Linux+Firefox setup, the browser chose Menlo to render the → in the "The fastest way to go from 0 → 1" banner. Menlo's width isn't quite identical to Source Code Pro, so the ┃ character on the right of the box was ever so slightly misaligned. Because Firefox isn't following strict fixed-width layout rules, and is allowing itself to use other fonts with different horizontal widths even inside a <pre> block. (I haven't looked at this article in other browsers, but I bet they're the same, since everyone's mentioning misalignment.)
        • rmunn 8 hours ago
          The other tricky part is emojis made up of multiple codepoints with zero-width joiner characters and variation selectors, or other symbols. E.g. is made up of U+1F1FA REGIONAL INDICATOR SYMBOL LETTER U followed by U+1F1F8 REGIONAL INDICATOR SYMBOL LETTER S, or (which should render as a single symbol, a burning heart / heart on fire), which is made up of the four-codepoint sequence U+2764 HEAVY BLACK HEART, U+FE0F VARIATION SELECTOR-16, U+200D ZERO WIDTH JOINER, and U+1F525 FIRE but should only render in one double-width block. Then there are even more complicated sequences like , which again should render in a single block but are made up of six(!) codepoints: U+1F469 WOMAN, U+200D ZERO WIDTH JOINER, U+2764 HEAVY BLACK HEART, U+FE0F VARIATION SELECTOR-16, U+200D ZERO WIDTH JOINER, and U+1F468 MAN.

          The number of codepoints never did correspond exactly to the number of fixed-width blocks a character should take up (U+00E9 é is the same as U+0065 e plus U+0301 COMBINING ACUTE ACCENT, so it should be rendered in a single block but it might be one or two codepoints depending on whether the text was composed or decomposed before reaching the rendering engine). But with emojis in play, the number of possibilities jumps dramatically, and it's no longer sufficient to just count base characters and ignore diacritics: you have to actually compute the renderings (or pre-calculate them in a good lookup table, which IIRC is what Ghostty does) of all those valid emoji combinations.

          P.S. The Hacker News comments stripped out those emojis; fair enough. They were, in order:

          - a US flag emoji (made up of two codepoints)
          - a heart-on-fire symbol (two distinct symbols combined into a single image, made up of four codepoints total)
          - a woman and a man with a heart between them (three distinct symbols combined into a single image, made up of six codepoints total)
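
          A quick Python check of those counts (len() counts codepoints, not rendered blocks):

            print(len("\u00e9"))   # 1: precomposed é
            print(len("e\u0301"))  # 2: e + COMBINING ACUTE ACCENT, same glyph
            print(len("\u2764\ufe0f\u200d\U0001f525"))  # 4: heart on fire
            print(len("\U0001f469\u200d\u2764\ufe0f\u200d\U0001f468"))  # 6: woman-heart-man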

    • megacrunch 10 hours ago
      [dead]
  • jbmsf 4 hours ago
    I like the idea but I think it's going to be hard to put this particular genie back in the bottle. As an engineering leader, I prefer low fidelity designs early on, but practically no one else in my company wants that.

    Designers have learned Figma and it's their de facto tool; doing something else is risky for them.

    Product leaders want high fidelity. They love the AI tools that let them produce high fidelity prototypes.

    Some (but not all) engineers prefer it because it means less decision making for them.

  • arglebarnacle 11 hours ago
    A really interesting article, and I'm likely to give it a shot at work. I'm grateful for it, and yet I found it difficult to get through because of a sense of "LLM style" in the prose.

    I won't speculate on whether the post is AI-written or whether the author has adopted quirks from LLM outputs into their own way of writing because it doesn't really matter. Something about this "feeling" in the writing causes me discomfort, and I don't even really know why. It's almost like a tightness in my jaw or a slight ache in my molars.

    Every time I read something like "Not as an aesthetic choice. Not as nostalgia. *But as a thinking tool*" in an article I had until then taken on faith as the voice of a human being, it feels like a letdown. Maybe it's just the sense that I believed I was connecting with another person, albeit indirectly, and then I feel the loss of that. But that's not entirely convincing, because I genuinely found the points this article was making interesting, and no doubt they came originally from the author's mind.

    Since this is happening more and more, I'd be interested to hear what others' experiences with encountering LLM-seeming blog posts (especially ones with inherently interesting underlying content) have been like.

    • rmunn 10 hours ago
      I've had too many LLMs tell me that software product ABC can do XYZ, only to read the ABC documentation and discover that the hallucination was the opposite of reality: the docs say "we cannot do XYZ yet but we're working on it." So for me, the question at the back of my mind when I encounter an obviously LLM-generated article is always, "Which parts of this article are factually correct, and which parts are hallucinations?" I care less about the "human voice" aspect than about the correctness of the technical facts presented in the article.

      In this particular case, if the facts about how many years ago various products came out are wrong, it doesn't matter since I'm never going to be relying on that fact anyway. The fact that what the author is proposing isn't ASCII, it's UTF-8-encoded Unicode (emojis aren't ASCII) doesn't matter (and I rather suspect that this particular factual error would have been present even if he had written the text entirely by hand with no LLM input), because again, I'm not going to be relying on that fact for anything. The idea he presents is interesting, and is obviously possible.

      So I care less about the "voice" of an article, but a LOT about its accuracy.

      • rmunn 8 hours ago
        I should add that for me, when it comes to LLMs telling me "facts" that are the opposite of reality, "too many" equals ONE or more.
      • trollbridge 9 hours ago
        This is an ongoing problem for those of us who use LLMs every day. I have to check and recheck what it claims is possible.
    • Tenobrus 18 minutes ago
      Some AI detectors work now. Pangram detects this as 57% AI-written, and the parts it thinks are human are... the ASCII diagrams / screenshots. All the actual text it detects as generated.
    • roywiggins 8 hours ago
      I also have this reaction to this type of prose, for better or worse. It's depressing to see so much of it shared. It makes me want to (in a friendly manner!) grab the author and tell them to write in their own voice, damn it.
    • tom_ 3 hours ago
      I just give up the moment I notice it. I gave up on this one once I got to "The High Fidelity Trap". My LLMdar said: brrrrrp. (Imagine the sound of a sad trombone, only out of tune.) If I feel like the author couldn't be bothered to write it, I feel like I can't be bothered to read it.

      And if I'm wrong: so be it. I'm comfortable living dangerously.

      (Reading it again, I probably should have noticed by "But here’s the thing: AI-generated UIs are high-fidelity by default", a couple of sentences previously. And in fact, there's "Deliberately sketchy. Intentionally low-fidelity. The comic-sans-looking wireframes were a feature, not a bug" in the very first paragraph - god, I'm so stupid! Still, each time I get this wrong, I'm that bit more likely to spot it in future.)

    • muzani 5 hours ago
      I stopped reading it at that point. I'm not against AI-written articles; I even think it's a little rude to accuse. But I agree.

      I think we do develop "antibodies" against this kind of thing, like listicles, clickbait, and random links that rickroll you. It's the same reason the article isn't titled, "5 examples of ASCII-Driven Development. You'll never guess #2!"

      Every article is a little mentor, and the thing with mentors and teachers is you have to trust them blindly, suspend disbelief, etc. But the AI voice also triggers the part of the brain designed to spot scams.

    • iamanllm 10 hours ago
      "Not as an aesthetic choice. Not as nostalgia. But as a thinking tool" is a perfectly normal sentence, and I think there is an equally bad trend of people assuming things are AI written and forget that AI was trained on human writing. But to your point, agreed there is a disconnect when things are in fact written by AI, but I skimmed the article anyway so to me it didn't matter lol.
      • oasisbob 9 hours ago
        Those are sentence fragments, not perfect sentences. They're useful in some contexts, but are inappropriate for more formalized writing.

        When LLMs reuse the same patterns dozens of times in a single article, the patterns stop being interesting or surprising and just become obnoxious and grating.

        • iamanllm 4 hours ago
          It's not formal writing, it's a blog post.
  • _hfqa 11 hours ago
    Author here. High-level:

    - Problem: AI UI generators are high-fidelity by default → teams bikeshed aesthetics before structure is right.

    - Idea: use ASCII as an intentionally low-fidelity “layout spec” to lock hierarchy/flow first.

    Why ASCII:

    - forces abstraction (no colors/fonts/shadows)

    - very fast to iterate (seconds)

    - pasteable anywhere (Slack/Notion/GitHub)

    - editable by anyone

    Workflow:

    - describe UI → generate ASCII → iterate on structure/states → feed into v0/Lovable/Bolt/etc → polish visuals last

    It also facilitates discussion:

    - everyone argues about structure/decisions, not pixels

    - feedback is concrete (“move this”, “add a section”), not subjective

    More advanced setups could integrate user/customer support feedback to automatically propose changes to a spec or PRD, enabling downstream tasks to later produce PRs.
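
    For illustration, a minimal layout spec of the kind this workflow produces, kept to pure ASCII (no emoji or wide characters) so it stays on a strict character grid:

      +------------------------------------------+
      | LOGO                  [Search]   [Login] |
      +----------+-------------------------------+
      | Nav      | Page Title                    |
      |  - Home  |                               |
      |  - Docs  | [Primary Action]   [Cancel]   |
      +----------+-------------------------------+

    Nothing in it can start a debate about colors, fonts, or shadows; only the structure is up for discussion.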

    • NetOpWibby 10 hours ago
      The problem with ASCII-driven development for me is that emoji ruin the alignment. It’d be nice if they could be forced into monospaced. Emoji aren’t ASCII so maybe that’s the problem too.
      • Izkata 7 hours ago
        Seems like there's more going on with it than that; it's also affecting the lines that don't have emoji. It kind of looks like it assumes every vertical bar takes up two characters, so a space before the bar is missing. Except not always.

        Example 2 has five boxes in a row each with a number 1 to 5 in them, and each box is missing a single space before the second vertical bar... I think the problem might be centering, where it needs to distribute 3 spaces on either side of the text, divides by 2 to get 1.5, then truncates both sides to 1, instead of doing 1 on one side and 2 on the other. Doesn't quite fit with how many are missing in [PRODUCT IMAGE] right above that, though.

        (Also I'm just eyeballing it from mobile so I may be wrong about exact counts of characters)
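
        If that centering hypothesis is right, the bug would look something like this hypothetical sketch (not the article's actual code):

          def center_buggy(text, width):
              # Hypothesized bug: pad both sides with pad // 2, silently
              # dropping a column whenever the leftover space is odd.
              pad = (width - len(text)) // 2
              return " " * pad + text + " " * pad

          def center_fixed(text, width):
              # Correct: give the odd leftover column to one side.
              left = (width - len(text)) // 2
              right = width - len(text) - left
              return " " * left + text + " " * right

          print("|" + center_buggy("3", 4) + "|")  # "| 3 |": one cell short
          print("|" + center_fixed("3", 4) + "|")  # "| 3  |": edge stays put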

      • manquer 7 hours ago
        Unicode emoji aren’t ASCII.

        Long before Unicode code points were assigned, we were using emoji in text communication, in email and SMS.

        You can always be quite expressive with ones like :) :D :-( or even ¯\_(ツ)_/¯, although that last one isn't strictly ASCII.

        • Izkata 7 hours ago
          Those are called emoticons, not emoji. "Emoji" came about specifically to distinguish the single-character ones (Unicode or proprietary) from what we did before.
    • 4b11b4 11 hours ago
      While I agree a text representation is good for working with LLMs... most of the examples are mis-aligned?

      Even the very first one (ASCII-Driven Development) which is just a list.

      I guess this is a nitpick that could be disregarded as irrelevant since the basic structure is still communicated.

  • roskelld 7 hours ago
    This type of issue comes up in the video game development world, perhaps in part because modern engines are ready off the shelf to render high-quality assets, and because assets are so available, either internally or from an asset store. That helped push developers into putting high-quality assets into games from the start, skipping the "grey box" steps.

    I've had it on a number of projects now where high-quality assets were pushed into early builds, causing execs' eyes to light up as they feel like they're seeing a near-final product, blind to the issues and underdeveloped systems below. This can start projects off on a bad footing, because expectations can quickly become skewed and focus can go to the wrong places.

    At one studio there was a running joke about trees swaying because someone had decorated an outdoor level with simulated trees. During an early test the execs got so distracted by how much they swayed and if it was too much or too little that they completely ignored the gameplay and content that was supposed to be under review. This issue repeated itself a number of times to the point where meetings would begin with someone declaring "We are not here to review the trees, ignore the trees!"

    I've brought this issue up more recently with the advent of AI: with things like Sora, video clips can be created and stitched together into subjectively exciting movie trailers. This now has people declaring that AI movies are around the corner. To me this looks like the same level of excitement as seeing the trees sway. An AI trailer looks much closer to a shipping product than it should, because the underlying challenges are far from solved; nothing is said about the script, pacing, character development, story, etc.

  • nihiven 10 hours ago
    I like to make these kinds of mock ups using https://asciiflow.com. Some of the components from the article paste nicely there.
    • ynac 0 minutes ago
      Monodraw is a champ as well. Standalone, and it outputs to a graphic or text file. Easy to feed to other people or artificial people.
    • ftr1200 10 hours ago
      https://cascii.app has many more features :)
  • killerstorm 11 hours ago
    OK but why not just go back to Balsamiq and make it 'executable'?

    You might believe that a TUI is neutral, but it really isn't - there are a bajillion different ways to make a TUI / CLI.

  • CompoundEyes 4 hours ago
    Recently I was using docling to transform some support-site HTML into Markdown, replacing UI images with inline descriptive text. An LLM created all the descriptions. My hope was that descriptions like “a two pane..below the hamburger…input field with the value $1.42…” would allow an LLM to understand the UI when given as context in a prompt. Maybe I could just put ASCII renderings inline instead.
  • bccdee 10 hours ago
    Trouble with this is, it's pretty much LLM-only. I don't want to type out a request for Claude to draw a box for me and describe where, and I don't want to be pasting box-drawing characters. I want to click & drag. This is just boxes, arrows, and labels, which are all WAY faster to make by hand.
  • facorreia 5 hours ago
    Good idea to build low-fidelity mockups. SVG in my opinion is a better format for this job than text. For instance, in the screenshots from the article, not a single example is properly aligned. That is distracting and makes these assets hard to share.
  • bulletsvshumans 10 hours ago
    I think this is a good technique to be familiar with, although in a lot of situations I've achieved similar value by simply feeding the underlying JSON data objects corresponding to the intended UI state back into the coding agent. It doesn't render quite as nicely, but it is often still human-readable, and more importantly it's interpretable both by LLMs and procedurally, meaning you can fold the results back into your agentic (and/or conventional testing) development loop. It's not quite as cool, but I think a bit more practical, especially for earlier-stage development.
  • stack_framer 8 hours ago
    I really like this idea, but I get distracted when the vertical bars don't line up!

      +--------+
      |        |
      | ASCII! |
       |       |
      +--------+
  • onaclov2000 7 hours ago
    Somewhat related, I suppose: for reasons, I am often in a situation where I use matplotlib to make a plot, save it, and then SCP it locally to view it. I got tired of that and started making command-line plots so I can see the stuff I want right there. It's not insanely detailed, but for my needs it's been fine. This def got me thinking about it lol
  • ucarion 11 hours ago
    It is news to me that manipulating ASCII art is something AI can do well! I remember this being something LLMs were all particularly horrible at. But I just checked and it seems to work at least with Opus 4.5.

    claude(1) with Opus 4.5 seems to be able to take the examples in that article, and handle things like "collapse the sidebar" or "show me what it looks like with an open modal" or "swap the order of the second and third rows". I remember not long ago you'd get back UI mojibake if you asked for this.

    Goes to show you really can't rest on your laurels for longer than 3 months with these tools.

  • ftr1200 10 hours ago
    What tools can we actually use to draw ASCII manually if desired?

    None are mentioned. E.g. I made https://cascii.app for exactly this purpose.

  • drob518 10 hours ago
    I really like this idea. I’ve seen teams get stuck quibbling about details too early while using Figma. I see how this totally sidesteps that problem, similar to working with analog drawings, but with the advantage that it’s still electronic so you can stuff it into your repo and version it and also feed it into an LLM easily. I’m really curious how the LLM “sees” it. Sure, it’s characters, but it’s not language. Regardless, very cool idea. Can’t wait to give it a try.
  • sbondaryev 11 hours ago
    Great idea, thanks for sharing! Tried your prompts with ChatGPT and Claude, then iterated on them. The ASCII doesn't render perfectly in the web interface but looks good when copy/pasted into a text editor. Key benefit: I used to iterate on layout by generating HTML+Tailwind directly, which burns tokens fast. This ASCII approach lets you nail the structure first without the token cost. Much better for free-tier usage. Appreciate the writeup!
  • damnitbuilds 2 days ago
    Nice! Very true that if you, for example, show a group a button with a given color, they will waste the meeting discussing the color and not what the button should do. ASCII is a nice way to avoid that.
  • layer8 10 hours ago
    > Examples: UIs in ASCII

    The examples are using non-ASCII characters. They also don’t render with a consistent grid on all (any?) browsers.

    Maybe they meant plain-text-driven development?

  • sedatk 10 hours ago
    Off-topic but I’m so happy to have moved out of Medium.
    • nmstoker 6 hours ago
      Curious to understand what drove that the most? Certainly for this article, the way Medium blocks zooming in/out on images on mobile is limiting and frustrating! But I sense you'll have broader concerns...
      • joshribakoff 2 hours ago
        For me, I’m simply trying to read the article and there are random full screen pop-ups nagging me to sign up for newsletters and stuff
  • jlundberg 11 hours ago
    Neat concept and very inspirational.

    Is an ASCII/Unicode text UI the way to go here, or are there other UI formats even more suited to LLMs?

    • ehsanu1 10 hours ago
      It has to be suited for human consumption too though.

      I wonder if this has any real benefits over just doing very simple html wireframing with highly constrained css, which is readily renderable for human consumption. I guess pure text makes it easier to ignore many stylistic factors as they are harder to represent if not impossible. But I'm sure that LLMs have a lot more training data on html/css, and I'd expect them to easily follow instructions to produce html/css for a mockup/wireframe.

    • kmoser 11 hours ago
      LLMs are surprisingly good at extracting data from other data, so at the end of the day there is no right or wrong; it's whatever works best for your use case.
  • stephc_int13 6 hours ago
    Argh.

    I may suffer from some kind of PTSD here, but after reading a few lines I can't help but see the patterns of LLM style of writing everywhere in this article.

    • alt187 6 hours ago
      Holy cows, same. But it's not PTSD. People think you can tell because of ternary rhythm, —, big words, and other shenanigans. Those are tactics over strategy, obsessing over the minutiae. But I think what you're feeling (and what I sure fucking am) is a lack of voice.

      This writing says something, has a point, and you could even say it has a correct way to get to the point, but it lacks any voice, any personality. I may be wrong (I stopped reading midway) but I really don't think so.

    • metalliqaz 6 hours ago
      same. I just seem to automatically tune out
  • ivanjermakov 10 hours ago
    Wide character width is yet another billion-dollar mistake. It's impossible to make emoji look right together with a monospaced font across devices/programs.
  • trollbridge 9 hours ago
    Welcome to the 1970s, when IBM published design guidelines to go along with the then-new 3270 terminal, including trying to keep response times very prompt (under a second) to prevent minds from wandering. This was supposed to allow non-technical users to use the full power of computers without having to master a command-line, teletype-style interface.

    GUIs were supposed to be the big huge thing that would let non-technical staff use computers without needing to grasp TUIs.

  • cjlm 8 hours ago
    Spending a lot of time building tools inspired by and using ASCII nowadays...

    graph-easy.online and printscii.com