Positron – A next-generation data science IDE

(positron.posit.co)

214 points | by amai 4 days ago

32 comments

  • ZeroCool2u 1 day ago
    Kind of unfortunate that it uses pyright and jedi instead of just basedpyright for the more advanced features. Python language support just isn't great with jedi compared to pylance or basedpyright.

    And not to beat a dead horse, but I'm also not a huge fan of the broad claims around it being OSS when it very clearly has some strict limitations.

    I've already had to migrate from R Connect Server / Posit Server at work, because of the extreme pricing for doing simple things like having auth enabled on internal apps.

    We found a great alternative that's much better anyways, plus made our security folks a lot happier, but it was still a massive pain and frustrated users. I've avoided any commercial products from Posit since then and this one makes me hesitant especially with these blurry lines.

    • i000 1 day ago
      What is the alternative? Posit pricing is absurd. Even academia is charged an arm and a leg - and the value is very questionable.
      • ZeroCool2u 1 day ago
        Agreed, the value is nonsense.

        This is what we use: https://domino.ai/ The marketing is a bit intense on the website, but the docs are pretty good: https://docs.dominodatalab.com/en/cloud/user_guide/71a047/wh...

        They definitely target large scale companies, but you can use their SaaS offering and it can be relatively affordable. The best part is the flexibility and scaling, but the license model is awesome too. There's no usage based billing, you just pay a flat license fee per user that writes code and for the underlying cloud costs and they'll deploy it on GCP, AWS, or Azure.

        They're used by a lot of large companies, but academia as well to replace or augment on-prem HPC clusters. That's what we used them for as well.

        • benreesman 1 day ago
          It's a shame that they don't have you writing marketing copy! The docs are indeed a lot more reasonable looking (to me at least). I work for a small proprietary fund and not some Godzilla company these days so maybe I'm just not the audience, but whew, for purchasing decision makers with subject matter background, that home page would have been a back button real fast if it wasn't linked from your thoughtful comment.

          I'm interested in your opinion as a user on a bit of a new conundrum for me: for as many jobs / contracts as I can remember, the data science was central enough that we were building it ourselves from like, the object store up.

          But in my current role, I'm managing a whole different kind of infrastructure that pulls in very different directions and the people who need to interact with data range from full-time quants to people with very little programming experience and so I'm kinda peeking around for an all-in-one solution. Log the rows here, connect the notebook here, right this way to your comprehensive dashboards and graphs with great defaults.

          Is this what I should be looking at? The code that needs to run on the data is your standard statistical and numerics Python type stuff (and if R was available it would probably get used but I don't need it): I need a dataframe of all the foo from date to date and I want to run a regression and maybe set up a little Monte Carlo thing. Hey that one is really useful, let's make it compute that every night and put it on the wall.

          I think we'd pay a lot for an answer here and I really don't want to like, break out pyarrow and start setting up tables.

          • ZeroCool2u 17 hours ago
            I'll just say Domino presents very much as a code first solution. So, if you want staff to be able to make dashboards _without_ code like using Looker Studio, then this isn't it.

            The one other big thing Domino isn't is a database or data warehouse. You pair it with something like BigQuery or Snowflake or just S3 and it takes a huge amount of the headache of using those things away for the staff you're describing. The best way to understand it is to just look at this page: https://docs.dominodatalab.com/en/cloud/user_guide/fa5f3a/us...

            People at my work, myself included, absolutely love this feature. We have an incredibly strict and complex cloud environment and this makes it so people can skip the setup nonsense and it will just work.

            This isn't to say that you can't store data in Domino, it's just not a SQL engine. Another loved feature is their datasets. It's just EFS masquerading as an NFS, but Domino handles permissions and mounting. It's great for non-SQL file storage. https://docs.dominodatalab.com/en/cloud/user_guide/6942ab/us...

            So, with those constraints in mind, I'd say it's great for what you're describing. You can deploy apps or API endpoints. You can create on-demand large scale clusters. We have people using Spark, Ray, Dask, and MPI. You can schedule jobs and you can interact with the whole platform programmatically.

          • dm3 1 day ago
            Looks like we're in a similar situation. What is your current go-to for setting up lean incremental data pipelines?

            For me the core of the solution - parquet in object store at rest and arrow for IPC - hasn't changed in years, but I'm tired of re-building the whole metadata layer and job dependency graphs at every new place. Of course the building blocks get smarter with time (SlateDB, DuckDB, etc.) but it's all so tiresome.
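
            For readers unfamiliar with the "job dependency graph" part: a minimal sketch using only Python's standard library. The job names are hypothetical, and a real pipeline would attach metadata, retries, and incremental state that this leaves out.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each job maps to the set of jobs it depends on.
jobs = {
    "load_raw": set(),
    "clean": {"load_raw"},
    "features": {"clean"},
    "report": {"clean", "features"},
}

# static_order() yields every job only after all of its dependencies.
run_order = list(TopologicalSorter(jobs).static_order())
print(run_order)
```

            The ordering is the easy part; the tiresome part described above is everything around it.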

            • benreesman 22 hours ago
              Yeah, last time I had to do this was about a year ago and I used parquet and arrow on S3-compatible object stores and put a bunch of metadata in postgres and the whole thing. At that time we used Prefect for orchestration which was fine but IMHO not worth what it cost, I've also used flyte seriously and dabbled with other things, nothing that I can get really excited about recommending, it's all sort of fine but kinda meh. I used to work for a megacorp with extremely serious tooling around this and everything I've tried in open source makes me miss that.

              On the front end I've always had reasonable outcomes with `wandb` for tracking runs once you kind of get it all set up nicely, but it's a long tail of configuration and writing a bunch of glue code.

              In this situation I'm dealing with a pretty medium amount of data and very modest model training needs (closer to `sklearn` than some mega-CUDA thing) and it feels like I should be able to give someone the company card and just get one of those things with 7 programming languages at the top of the monospace text box for "here's how to log a row", we do Smart Things and now you have this awesome web dashboard and you can give your quants this `curl foo | sh` snippet and their VSCode Jupyter will be awesome.

              • ZeroCool2u 17 hours ago
                Just reading this as well and I neglected to mention that the Domino thing we use has Flyte (They call it Flows, but it's the same thing) and MLFlow built-in as well.
      • hadley 20 hours ago
        We do discount heavily for academia: get 50% off for research and 100% off (i.e. free) for teaching. But I do get that our pro products largely solve problems that folks encounter in larger enterprises, and you may not see the value inside an academic department. I'm also always happy to learn how we could do better, please feel free to reach out to hadley@posit.co.
      • sieste 1 day ago
        Positron looks like the next version of Rstudio, which is currently free. Do you think the plan is to phase out support for the free product and push users into the paid one?
        • jmcphers 1 day ago
          Positron inherits many ideas from RStudio, but is a separate project with an intentionally different set of tradeoffs; it gains multi-language/multi-session support, better configuration/extensibility, etc. but at the expense of RStudio's simplicity and support for many R-only workflows.

          We're still investing in RStudio and while the products have some overlap there's no attempt to convert people from one to the other.

          (I work at Posit on both of these products)

        • i000 1 day ago
          I am talking about the RStudio Server and Connect - these are really expensive. One of the sales reps claimed that it is so expensive because they are a PBC and support open-source development. As in if they were just for profit it would be cheaper, but we should feel good about paying more. I could not take it.
          • Onawa 22 hours ago
            As an admin and advocate for Posit Teams, Connect and Server filled a niche where a single admin could spin up infra and allow for anything deployable by end users without having to worry about scaling.

            It paid for itself in terms of scientists spinning up their own projects without having to provision server hardware, VMs, or anything else.

  • qsort 1 day ago
    I don't want to dunk too hard on this as it seems to be reasonably well made, but how are we making a data science IDE without a good SQL client? I might be biased but that's a major part of the workflow. You're already losing against PyCharm or Visual Studio (not code, the real one) simply because of that.

    I appreciate that full IDEs are heavy tools, but when I just need an editor I go with vim, if I have to do real work why not take out the power tools?

    • juliasilge 1 day ago
      I work on Positron, and I do not entirely disagree with you! We do have support for managing connections from Python and R: https://positron.posit.co/connections-pane.html

      But we have some pretty big aspirations around expanding our SQL support, based on the features we have already built like that Connections pane, our Data Explorer, our Observable support via Quarto, etc. We plan to invest in this area over the coming months, starting in Q4 this year.

      • qsort 1 day ago
        Appreciate the response, sorry if I'm being a bit direct but as you probably know that's the style that works the best on a forum like this one.

        I'll keep tabs on you guys, my DS colleagues might be interested in the project.

    • notnmeyer 1 day ago
      it looks like it is based on vsc. there’s got to be a decent sql client extension, right?
      • qsort 1 day ago
        You're welcome to suggest one that has 10% of the functionality that you take for granted with PyCharm.

        Graphical table creator? View and export ER schemas? SQL Syntax and autocomplete in .sql files AND within literal strings in your code? Query explainer?

        Yeah, I don't think so.

      • NeutralForest 1 day ago
        Then why use this instead of VSCode with a couple extensions?
        • jmcphers 1 day ago
          You can in fact get something "pretty close" to Positron by adding and configuring a whole bunch of VS Code extensions, adjusting the layout, etc. However it's fiddly and time-consuming work (and quite challenging for novice users); the resulting UX can be pretty disjointed, too.

          Positron provides a batteries-included experience that lets you work with Python and R out of the box; it's easier to get started, everything's already set up for data work, and the tools all work together smoothly. At least, that's the goal. :-)

          (disclaimer - I work on Positron)

          • NeutralForest 1 day ago
            I honestly don't think it's enough of a value proposition but I wish you the best, more competition is good!
            • samuell 23 hours ago
              I'd go out on a limb and say the rock-solid out-of-the-box experience is what is keeping many people using RStudio (and even R itself), rather than the messy ecosystem of Python. I'm seeing this tendency in myself for some tasks.

              Also, my impression is that this is also a big part of why MATLAB still exists, despite outrageous prices.

              I think the common theme among these tools' main user groups is that they are not developers. They are not comfortable fiddling a lot with a dev environment, but can be productive in an environment where everything just works.

              Thus, if Positron can get the same smooth and rock-solid out-of-the-box experience, it will be able to reach a lot of these non-developer user groups.

              At least that's my 5c.

              • RamblingCTO 22 hours ago
                Yup, that's 100% it. Even tho I build production systems in anything but R, whipping up RStudio and having it all included or easily installable without any package manager fuckery or anything makes it a no-brainer.
  • williamstein 1 day ago
    Not open source: “You may not provide the software to third parties as a hosted or managed service, where the service provides users with access to any substantial set of the features or functionality of the software.”
    • sunnybeetroot 8 hours ago
      Software can’t be considered open source if it has a restrictive license? Which license is mandatory?
  • alterom 1 day ago
    While this is "based on" the open source codebase of VSCode, it's very much unclear from the project page which features and dependencies aren't open source or even free-as-in-beer (..and may require a paid subscription, enterprise plan purchase, premium account, etc).

    I.e., where they are making money off of this.

    One clear indication that there are strings attached is that they're bundling a specialized GenAI assistant with the IDE.

    Wish it was made clear in the FAQ. It doesn't cover this at all.

    • juliasilge 1 day ago
      Hello! I work on Positron. We outline some answers here: https://positron.posit.co/faqs.html#how-can-i-use-positron-w...

      tl;dr is that the desktop app (including remote SSH sessions) is free to use with a permissive license (no account needed, no subscription, commercial use is OK, etc), but using Positron in a server mode does require a paid subscription.

      • teruakohatu 1 day ago
        Can it be run through a web browser like VS Code and RStudio?

        Why did you relicense it under Elastic License 2.0 from VSCode’s MIT?

        A better alternative would be proprietary extensions under a different license like Microsoft does.

        • juliasilge 1 day ago
          Yes, but that is the server mode that is not free to use. If you want a free mode that lets you connect to a server, you might want to check out the remote SSH support: https://positron.posit.co/remote-ssh.html

          We talk a bit about why we chose the Elastic license here: https://positron.posit.co/licensing.html

          We have thought pretty carefully about what kind of functionality works well in extensions (in fact, we build and maintain a number of extensions!) and came to the decision that the more integrated data science experience we wanted to make required forking.

          • notpushkin 1 day ago
            Please consider adding a delayed open-source license/clause :-)

            Also, are you using Open VSX, and what’s your take on the recent malware extension story?

            • juliasilge 16 hours ago
              Thanks for your thoughts on the license; we know that it's FRAUGHT, for sure. Our company makes quite a lot of software available under OSI-approved licenses (MIT, etc) and we did think pretty carefully about what to try here, given our goals around both OSS and building a sustainable business.

              We do use OpenVSX, yes, like the other forks, and our company is a major sponsor of OpenVSX. Security around the extension ecosystem is a pretty messy, complicated issue both for the proprietary Microsoft marketplace and OpenVSX. For example, the recent Amazon Q story! I currently think about it as conceptually fairly similar to the risks of using packages from PyPI or npm.

  • throwaway328 1 day ago
    Emacs is the only truly next-generation data science IDE, and the last-generation one too.

    (Hiding behind my couch after writing that)

    • kleinishere 18 hours ago
      What packages and workflow specifically do you use? I haven’t come across many gentle introductions so looking for clues on what’s a reasonable first step that’s well maintained with good docs.
      • throwaway328 17 hours ago
        I am not a scientist, and was primarily having a laugh with my comment.

        That said, I do know that the type of person who likes configuring things very in-depth can set up intricate and powerful workflows in Emacs. I don't know what kind of data science IDE specifically you're interested in putting together, but here's a general article:

        https://michaelneuper.com/posts/replace-jupyter-notebook-wit...

        There's also this MOOC on reproducible research in French and English from Inria, where you're encouraged to follow the course in one of three ways: Jupyter, RStudio, or in Emacs' Org-Mode. I'd love to do it, but can't really justify spending the time at the minute.

        https://www.fun-mooc.fr/en/courses/reproducible-research-met...

        Creator of org-mode is Carsten Dominik, who is an astronomer by trade, so, it's a scientist's tool. A few of his talks are listed on this page, if you're interested in going straight to the source:

        https://orgmode.org/worg/org-tutorials/

        • kleinishere 8 hours ago
          This is great - thank you! Hadn't seen the blog post or the MOOC. Appreciate the resources.
    • Koshcheiushko 17 hours ago
      Can you elaborate, how?
      • throwaway328 17 hours ago
        See my reply in the other thread, where I dutifully elaborate.
  • mirkodrummer 1 day ago
    Wow, we can't escape the VS Code fork black hole anymore, can we?
    • n42 1 day ago
      "Days Since Last VSCode Fork", when?

      this seems unmaintained https://dayssincelastvscodefork.com/

    • 90s_dev 1 day ago
      To be fair, the Monaco team did an amazing thing. It's not clear to me just how much of VS Code's complexity is essential to its genius, but if anyone ever creates a slim version that does 90% of the work in 10% of the code, it would forever change editing like VS Code did. It would be great if it was portable too, but it's hard to get that without pulling in HTML + JS + CSS as dependencies. Maybe as a Dear ImGui extension?
      • jmcphers 1 day ago
        VS Code's complexity is due in large part to its extensibility. It has the biggest, most robust extension API of any modern editor. Extensions don't get to run on the main/UI thread but run in a separate process that communicates with the main window over RPCs. This necessitates a lot of plumbing and layered generics but makes the main UI fast/stable and was a key innovation over other editors at the time (cough Atom cough).
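
        The process split described above can be illustrated with a toy sketch. This is not VS Code's actual implementation or protocol: the message shape, the `upper` handler, and the `call` helper are all hypothetical. The only point is that the "extension" code runs in its own process and talks to the host over a pipe, so a slow or crashing handler cannot block the host's own thread.

```python
import json
import subprocess
import sys

# Source for the hypothetical "extension host": a separate process that reads
# one JSON request per line on stdin and writes one JSON response per line.
EXTENSION_HOST = r'''
import json, sys
HANDLERS = {"upper": lambda text: text.upper()}
for line in sys.stdin:
    req = json.loads(line)
    result = HANDLERS[req["method"]](*req["params"])
    print(json.dumps({"id": req["id"], "result": result}), flush=True)
'''


def call(proc, request_id, method, *params):
    """Send one request to the extension process and wait for its reply."""
    message = {"id": request_id, "method": method, "params": params}
    proc.stdin.write(json.dumps(message) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())


# The "main window" side: start the extension process and call into it.
host = subprocess.Popen(
    [sys.executable, "-c", EXTENSION_HOST],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
reply = call(host, 1, "upper", "hello")
print(reply["result"])

# Closing stdin ends the child's read loop so it exits cleanly.
host.stdin.close()
host.wait()
```

        Even at this scale the cost is visible: every call needs serialization, an id to match replies to requests, and lifecycle handling for the child process.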

        The API is so good that a lot of core VS Code behavior (e.g. Github integration, support for lots of languages) is implemented in the form of built-in extensions.

        It is possible to get 80% of VS Code's functionality with 10-20% of the code if you just bake everything into one monolith, but this has been tried repeatedly and it keeps failing in part because the extension ecosystem and attendant network effects form a wide moat.

        (disclaimer - I work on Positron)

      • notpushkin 1 day ago
        CodeMirror is amazing these days, super lightweight compared to Monaco, and pretty extensible: https://codemirror.net/

        (But that’s just the editor component, if you need all the other IDE stuff you’ll have to build it :-)

        For something non-browser, I’m currently using Zed and it’s pretty good: https://zed.dev/

  • pks016 1 day ago
    I'm a daily user of R with RStudio (academia) and also use Python in VS Code. Tried Positron a couple of months ago, but it wasn't stable as a daily driver. Not all packages were there. Might have to try again soon.

    I have different specific habits for R and Python. Think it'll take a bit of time for people like me to switch. For a week, I also tried R in VS Code, but something wasn't feeling right. I'm excited for the Connections pane if it works smoothly.

  • aleph_minus_one 1 day ago
    Positron is also the name of an ultra-portable 3D printer

    > https://www.positron3d.com/

    that actually defined its whole new class of ultra-portable 3D printers (Positron-style 3D printers; their drive system is named "Positron drive").

    A sibling of the Positron is the JourneyMaker:

    > https://github.com/mcfazio2001/JourneyMaker-Positron

    A cost-reduced (no CNC-machined parts) variant of the JourneyMaker with a unibody chassis that you can 3D-print by yourself is the Lemontron:

    > https://lemontron.com/

  • georgeg 1 day ago
    This tool and ecosystem does not support Julia. I would expect that at this stage a data science polyglot tool is not just R and Python. Not sure why they would not support Julia.
  • 7thaccount 1 day ago
    This looks exactly like the Spyder IDE that comes with Anaconda and WinPython. You get a code editor, REPL, variable inspector, and inline charts. Everything you need.
    • juliasilge 1 day ago
      I work on Positron, and I would say that Spyder can be a great choice for someone doing data science who uses Python only. I would argue that Positron is a better choice for people who use more than one language during their regular work (Python + C, Python + Rust, Python + JavaScript, etc) or who want a more customizable, extensible IDE.
      • 7thaccount 1 day ago
        Thank you for the extra context. Best of luck on your project. I might check it out if I ever need something like Spyder with a different language.
  • angelgonzales 19 hours ago
    I use Spyder daily for data analysis, plotting and preparing presentations. Does Positron have any improvements over Spyder? I’m not a “programmer” at all and love to use basic IDEs to analyze telemetry generated from testing.
    • juliasilge 16 hours ago
      I work on Positron and I'll chime in! If you only use Python (not any other languages) and you are happy with Spyder, then Positron might not be compelling as something for you to switch to. If you ever find yourself using additional programming languages for your work or you want a more customizable, extensible IDE, then you might consider taking a look at it; I don't expect it would be an onerous switch coming from Spyder.
  • greazy 1 day ago
    I have been testing this IDE as I'm a heavy user of Rstudio.

    On Linux it's slow and buggy unfortunately. It's improving though.

  • clatan 1 day ago
    Not a fan of abandoning RStudio notebooks in favor of Jupyter; I always found them inferior.
    • jsilence 23 hours ago
      Maybe consider evaluating Marimo?
  • vovavili 19 hours ago
    Weird that they've switched from Qt for RStudio to being a VSCode fork with Positron. RStudio was so loved precisely because it was lightweight and performant, which just isn't a possibility with something like VSCode.
    • bb86754 16 hours ago
      RStudio is still a web app. It's GWT with QtWebEngine whereas this is Electron. Not too different.
  • positron26 1 day ago
    Gave me a heart attack that someone was posting my site overnight while I wasn't ready for launch :D...

    But thanks for the adrenaline spike. Now I can start my morning.

  • gclawes 1 day ago
    Is VS Code/VSCodium becoming the Chromium of IDEs?
  • flusteredBias 1 day ago
    I use it and like it.
  • incomplete 1 day ago
    hard no for me WRT positron... i managed a university's jupyterhub deployments for a while and we had faculty CLAMORING for this vs. Rstudio.

    the problem? the fact that you need a license to use. it's not OSS. you are not allowed to deploy this on a hosted/managed system:

    ``` Limitations

    You may not provide the software to third parties as a hosted or managed service, where the service provides users with access to any substantial set of the features or functionality of the software.

    You may not move, change, disable, or circumvent the license key functionality in the software, and you may not remove or obscure any functionality in the software that is protected by the license key.

    You may not alter, remove, or obscure any licensing, copyright, or other notices of the licensor in the software. Any use of the licensor's trademarks is subject to applicable law. ```

    • hadley 20 hours ago
      We're working on this! Education is really important to us so this is 100% a problem we want to solve.
    • sharifhsn 1 day ago
      Hey, my university is setting up a JupyterHub deployment right now, and my coworker is in charge of it. Do you mind elaborating more on how you went about it, best practices, etc.?
    • insane_dreamer 1 day ago
      They probably should have said “paying users”; then univ labs could use it but some commercial company couldn’t just host it and charge for it
      • hadley 20 hours ago
        Students also pay :)
        • insane_dreamer 16 hours ago
          they don't pay the lab for use of the software
  • kookamamie 1 day ago
    Is there a reason this couldn't have been an extension to VS Code?
  • luke-stanley 19 hours ago
    Positron: Consider starting with Astral UV! Fast and reproducible, with early fixes to evade CUDA dependency hell.
  • malcolmgreaves 1 day ago
    This is just a clone of Spyder, which has been around for well over a decade: https://www.spyder-ide.org/
    • juliasilge 1 day ago
      I work on Positron, and I would say that it isn't a clone of Spyder, but rather a fork of VS Code that is very inspired by RStudio (also built by our company). Spyder takes a lot of inspiration from RStudio, and I would say that Spyder can be a great choice for someone doing data science who uses Python only. I would argue that Positron is a better choice for people who use more than one language during their regular work (Python + C, Python + Rust, Python + JavaScript, etc) or who want a more customizable, extensible IDE.
  • ishita159 1 day ago
    I don't get the hype.
  • HSO 1 day ago
    Has anyone compared this with DataSpell? Relative pros and cons?
  • ezst 1 day ago
    It doesn't look like it supports Scala, unfortunately.
  • joshmarinacci 1 day ago
    Is this a rebrand of R-Studio?
    • juliasilge 1 day ago
      I work on Positron, and I would say no, it's a different product that has a different set of tradeoffs than RStudio. RStudio is a different IDE, and it is not going anywhere; our company is committed to long term support for it.

      I would say that Positron is better for folks who use more than one language (not only R) or want to customize/extend their IDE in a way that is not possible in RStudio.

      • joshmarinacci 1 day ago
        Fascinating. Thank you for the quick reply!
  • yannis7 20 hours ago
    briefly reminded me of my MATLAB days... back in the 2000s/early-2010s it was all the rage. Great IDE too
  • binarymax 1 day ago
    The interesting thing to me is that another company is adopting the Elastic license.
  • hatmatrix 1 day ago
    Can this connect to WSL like VSCode?
  • iJohnDoe 1 day ago
    I have a project that I want to do something with - a JSON file with about 500k lines. I want to present the data in a clear way and am struggling to find the right approach. Maybe there's an obvious answer to this, but I'm not a data scientist.
    • pplonski86 23 hours ago
      I think you can easily load JSON data into a Pandas DataFrame. Then you can create visualizations and compute statistics. Python might be useful for that.
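
      A minimal sketch of that approach, assuming the file is a JSON array of flat records. The inline sample, the column names, and the `data.json` file name mentioned in the comment are hypothetical stand-ins for the real data.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real file; with a file on disk,
# pd.read_json("data.json") would take the place of the StringIO buffer.
raw = io.StringIO("""[
  {"city": "Oslo", "temp_c": 4.0},
  {"city": "Oslo", "temp_c": 6.0},
  {"city": "Lima", "temp_c": 19.0}
]""")

df = pd.read_json(raw)

# Summary statistics for the numeric columns.
print(df.describe())

# Mean per group, a typical first aggregate to tabulate or plot.
mean_by_city = df.groupby("city")["temp_c"].mean()
print(mean_by_city)
```

      If the records are nested rather than flat, `pd.json_normalize` is usually the next thing to look at.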
      • samuell 23 hours ago
        ... and if you (the parent comment author) want a really easy tool for creating a UI for the Python code, I'd recommend looking at Streamlit: https://streamlit.io/
  • splittydev 1 day ago
    Great, another VS Code fork that could realistically just be an extension.