5 comments

  • elaus 1 hour ago
    I don't really see how the vacuum can effectively clean a whole room or flat using only a CNN on the current image in front of the robot. That would help detect obstacles, but a bumper sensor would do that as well.

    All but the most basic vacuum robots map their work area and devise plans for cleaning it systematically. The rest just bump into obstacles, rotate by a random amount, and continue forward.

    Don't get me wrong, I love this project and the idea of building it yourself. I just feel like that (huge) part is missing from the article?

    • thebruce87m 1 hour ago
      https://opencv.org/structure-from-motion-in-opencv/

      Not saying it’s viable here to build a world map, since things like furniture can move, but some systems (e.g. warehouse robots) do use landmarks like lights to triangulate, on the assumption that the lights on the tall ceiling are fixed and consistent.
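      The landmark idea above can be sketched with basic geometry. This is a toy 2D version under strong assumptions I'm adding for illustration (two ceiling lights at known positions, bearings measured with a known heading); real systems solve a noisier, overdetermined version of the same problem.

```python
import math

def triangulate(lights, bearings):
    """Recover a 2D robot position from absolute bearings to two
    fixed ceiling lights at known positions.

    If the robot p sees light L at bearing t, then L = p + r*(cos t, sin t)
    for some r > 0, so (Lx - px)*sin(t) - (Ly - py)*cos(t) = 0, i.e.
        sin(t)*px - cos(t)*py = Lx*sin(t) - Ly*cos(t).
    Two lights give a 2x2 linear system in (px, py).
    """
    (l1, l2), (t1, t2) = lights, bearings
    a11, a12 = math.sin(t1), -math.cos(t1)
    a21, a22 = math.sin(t2), -math.cos(t2)
    b1 = l1[0] * math.sin(t1) - l1[1] * math.cos(t1)
    b2 = l2[0] * math.sin(t2) - l2[1] * math.cos(t2)
    # Determinant is near zero when both lights line up with the robot,
    # which is exactly when triangulation degenerates.
    det = a11 * a22 - a12 * a21
    px = (b1 * a22 - b2 * a12) / det
    py = (a11 * b2 - a21 * b1) / det
    return px, py
```

      With noisy bearings you would use more than two lights and solve the resulting system by least squares instead.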

    • jhbadger 1 hour ago
      The classic Roombas from a decade or so ago worked without any sort of mapping or camera at all -- they basically did a version of the "run and tumble" algorithm used by many bacteria: go in one direction until you can't anymore, then head off in a random new one. It may not be efficient, but it does work for covering territory.
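      The run-and-tumble behaviour is simple enough to sketch on a toy occupancy grid (my own minimal simulation, not anything from the actual Roomba firmware):

```python
import random

def run_and_tumble(grid, start, steps, rng):
    """Toy run-and-tumble coverage on an occupancy grid.

    grid[y][x] == 1 means obstacle. The robot drives in a straight
    line until the next cell is blocked, then "tumbles" to a random
    new heading -- the classic bumper-robot behaviour.
    Returns the set of cells visited.
    """
    headings = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    x, y = start
    dx, dy = rng.choice(headings)
    covered = {(x, y)}
    for _ in range(steps):
        nx, ny = x + dx, y + dy
        if (0 <= ny < len(grid) and 0 <= nx < len(grid[0])
                and grid[ny][nx] == 0):
            x, y = nx, ny                   # run: keep going straight
            covered.add((x, y))
        else:
            dx, dy = rng.choice(headings)   # tumble: random new heading
    return covered
```

      Run it long enough on an empty room and coverage approaches 100%, just very inefficiently -- which matches the "works but isn't efficient" point above.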
  • isoprophlex 2 hours ago
    Cool project! That validation loss curve screams train-set memorization without generalization ability.

    Too little training data, and/or data of insufficient quality. Maybe let the robot run autonomously with an (expensive) VLM operating it, to bootstrap a larger training dataset without needing to annotate it yourself.

    Or maybe the problem itself is poorly specified, or intractable with your chosen network architecture. But if you see that a vision LLM can pilot the bot, at least you know you have a fighting chance.

  • vachanmn123 2 hours ago
    Maybe check out some kind of monocular depth estimation model, like Apple's Depth Pro (https://github.com/apple/ml-depth-pro), and use the depth map to predict a path?

    Very cool project though!
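    Once you have a depth map (from Depth Pro or any other monocular depth model), a deliberately crude steering policy could look something like this -- a sketch of my own, not the article's method, assuming depth arrives as an (H, W) array of metric distances:

```python
import numpy as np

def steer_from_depth(depth, n_strips=5):
    """Pick a heading from a depth map by choosing the vertical strip
    with the most clearance.

    depth: (H, W) array of metric distances, e.g. from a monocular
    depth model. Returns a strip index in [0, n_strips):
    0 = hard left, n_strips - 1 = hard right.
    """
    h, _ = depth.shape
    # Only look at the lower half of the frame, roughly the floor ahead.
    strips = np.array_split(depth[h // 2:], n_strips, axis=1)
    # Nearest obstacle per strip = that direction's clearance.
    clearance = [float(s.min()) for s in strips]
    return int(np.argmax(clearance))
```

    A real planner would do much more (smoothing over time, minimum-width checks for the robot body), but even this gives obstacle avoidance beyond a bumper sensor.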

  • amelius 1 hour ago
    The trick is to make a robot that has a Lidar and a camera, then train a model that can replace the Lidar.

    (The lidar could of course also be replaced by echolocation.)
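    The appeal of this setup is that the lidar acts as a free labeler: every camera frame comes with matching range measurements, so no hand annotation is needed. A toy numpy sketch of that distillation idea (linear model standing in for the CNN, synthetic data standing in for real image/lidar pairs):

```python
import numpy as np

def distill_step(weights, image_feats, lidar_ranges, lr=0.1):
    """One SGD step teaching a (toy, linear) camera-only model to
    mimic lidar. The lidar ranges are the training labels; the
    supervision signal is the same idea with a real CNN."""
    pred = image_feats @ weights            # camera-only range prediction
    err = pred - lidar_ranges               # lidar supplies ground truth
    grad = image_feats.T @ err / len(err)   # MSE gradient
    return weights - lr * grad

# Synthetic "paired" data: image features alongside lidar scans,
# related by some pretend fixed geometry.
rng = np.random.default_rng(0)
feats = rng.normal(size=(64, 8))
true_w = rng.normal(size=(8, 4))
ranges = feats @ true_w

w = np.zeros((8, 4))
for _ in range(200):
    w = distill_step(w, feats, ranges)
loss = float(np.mean((feats @ w - ranges) ** 2))
```

    After training, the camera model alone approximates the lidar readings, which is exactly the "replace the Lidar" trick.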

    • ThatMedicIsASpy 20 minutes ago
      I thought the trick was just to use an Xbox Kinect. But lidar has gotten a lot cheaper in recent years.
  • villgax 1 hour ago
    There are things like SLAM, optical flow, etc.; read up on them instead of being so defeatist. IMO this seems forced, even for a hobby project.