Tell HN: YC companies scrape GitHub activity, send spam emails to users

Hi HN,

I recently noticed that a YC company (Run Anywhere, W26) sent me the following email:

From: Aditya <aditya@buildrunanywhere.org>

Subject: Mikołaj, think you'd like this

[snip]

Hi Mikołaj,

I found your GitHub and thought you might like what we're building.

[snip]

I have also received a deluge of similar emails from another AI company, Voice.AI (which doesn't seem to be YC-affiliated). These emails indicate that those companies scrape people's GitHub activity and, if they notice users contributing to repos in their field of business, send marketing emails to those users without their consent. My guess is that they use commit metadata for this purpose. The recipients include people covered by the GDPR (AKA me).

I've sent complaints to both organizations; no response so far.

I have just contacted both GitHub and YC Ethics about this issue. I'll update here if I get a response.

140 points | by miki123211 3 hours ago

24 comments

  • martinwoodward 48 minutes ago
    Martin from GitHub here. This type of behaviour is explicitly against the GitHub terms of service. When we catch the accounts doing this we can (and do) take action against them, including banning them. It's a game of whack-a-mole for sure, and it's not just start-ups that take part in this sketchy behaviour to be honest. I've seen plenty of examples in my time across the board.

    The fundamental nature of Git makes this pretty easy for folks to scrape data from open source repositories. It's against our terms of service and those folks might want to talk with some lawyers about doing it - but as every Git commit contains your name and email address in the commit data it's not technically difficult even if it is unethical.

    From the early days we've added features to help users anonymise their email addresses for commits posted to GitHub. Basically, you configure your local Git client to use your 'no-reply' email address in commits and that still links back to your GitHub account when you push: https://docs.github.com/en/account-and-profile/reference/ema...

    I think that's still probably the best route. We want to keep open source data as open as possible, so I don't think locking down APIs etc. is the right route. We do throttle API requests and scraping traffic, but then again there have been plenty of posts here over the years from people annoyed at hitting those limits, so it's definitely a balancing act. Love to know what folks here think though.

    • ayhanfuat 42 minutes ago
      I am also getting constant spam because apparently they can see who starred a repo (i.e. I see you starred repo x and we are doing something similar). I am not starring anything anymore.
    • AznHisoka 33 minutes ago
      Maybe I am missing something, but can’t you simply not show the email address in a git commit? (Sincere question, not saying this is trivial. I am dumb and like to ask dumb questions even if it might be embarrassing.)

      If someone wants to message someone, it goes through github notifications or github emails them

      Also, banning an account doesn't seem like a heavy punishment, given they can simply move to GitLab, Bitbucket, etc.

      • easton 26 minutes ago
        Git commits have an email address as a required field[0], although some people put something bogus in there. And then it's in the data provided when you clone the repo onto your machine, even if you aren't using the GitHub APIs.

        To his point, you can set that to the no-reply email address GitHub gives you if you don't want mail but do want the commit to be linked to your GitHub account.

        [0]: https://git-scm.com/docs/git-commit#_commit_information

      • EdNutting 29 minutes ago
        That would be a fundamental change to how Git works, not just GitHub. Even if the web UI didn't show it, a simple `git log` would reveal it.

        You can mask your email address in git commits but a lot of open source projects won't accept that. And some pseudo-open-source ones insist on sending you an email to authenticate before they'll give you access to the GitHub repo (looking at you Unreal Engine!)

        So, no, I don't think they could simply "not show the email address".

        • AznHisoka 14 minutes ago
          Makes sense! Appreciate the explanation!
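
To illustrate the point made above that the address lives in the commit itself rather than in any GitHub API, here is a minimal Python sketch that reads the author and committer identities out of a raw commit object. The sample commit, the name, and the address are made up for illustration; the layout is assumed to be the one printed by `git cat-file -p <commit>`.

```python
import re

# Illustrative raw commit object, in the layout printed by
# `git cat-file -p <commit>`. The name and address are made up.
RAW_COMMIT = """\
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author Jane Doe <jane@example.com> 1700000000 +0000
committer Jane Doe <jane@example.com> 1700000000 +0000

Fix typo in README
"""

# "<role> <name> <email> <unix timestamp> <timezone offset>"
IDENT = re.compile(r"^(author|committer) (.*) <([^>]*)> (\d+) ([+-]\d{4})$")


def identities(raw_commit: str) -> dict:
    """Return {role: (name, email)} from the header of a raw commit object."""
    found = {}
    for line in raw_commit.splitlines():
        if line == "":  # a blank line ends the header; the message follows
            break
        match = IDENT.match(line)
        if match:
            role, name, email = match.group(1, 2, 3)
            found[role] = (name, email)
    return found


print(identities(RAW_COMMIT))
```

Anyone who clones a repository gets these header lines for every commit, which is why hiding the address in a web UI would not help.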
  • lordgrenville 3 minutes ago
    Maybe a dumb question, but isn't this trivially solved with this .gitconfig?

        [user]
             name = lordgrenville
             email = <some_kind_of_id>+lordgrenville@users.noreply.github.com
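
A config like the one above only protects commits made after the change; older commits keep whatever address they were created with. As a rough sketch, the following Python checks a list of author emails for anything that is not a GitHub no-reply address. The sample list and the numeric ID are hypothetical; in a real repository the list could come from `git log --format=%ae`.

```python
import re

# Both forms GitHub uses for private commit addresses:
#   ID+USERNAME@users.noreply.github.com  (current)
#   USERNAME@users.noreply.github.com     (older accounts)
# The optional "[bot]" suffix covers app/bot accounts.
NOREPLY = re.compile(
    r"^(?:\d+\+)?[A-Za-z0-9-]+(?:\[bot\])?@users\.noreply\.github\.com$",
    re.IGNORECASE,
)


def exposed_addresses(author_emails):
    """Return the distinct addresses that are not GitHub no-reply addresses, sorted."""
    return sorted({e for e in author_emails if not NOREPLY.match(e)})


# Hypothetical sample input.
sample = [
    "12345+lordgrenville@users.noreply.github.com",
    "someone@example.com",
    "someone@example.com",
    "olduser@users.noreply.github.com",
]
print(exposed_addresses(sample))
```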
  • scottydelta 28 minutes ago
    YC is a proud investor in Flock, what YC Ethics thing are you talking about?
    • ls-a 4 minutes ago
      There is a YC startup that blacklists employees so that other startups don't hire them. Female founders. Forgot its name.
    • cassonmars 25 minutes ago
      And Cluely
      • tasn 14 minutes ago
        Cluely is not YC.
  • dewey 1 hour ago
    This happens all the time, not really surprised as the GitHub API makes it pretty easy to extract valuable leads with real and confirmed email addresses.
    • tommoor 1 hour ago
      Yea, been going on at least a decade
  • neya 2 hours ago
    This is at least fine as it's just spam. I got pulled into an actual scam and it never made it to the front page.

    https://news.ycombinator.com/item?id=45357205

    • medi8r 30 minutes ago
      But that is someone pretending to be YC, which is sort of less interesting than a YC company doing something bad, because phishers imitate legit companies all the time. It's easy to get roped in and I sympathise; anyone is susceptible (today I almost clicked the phishing training email as it looked urgent and pushed the right buttons).
    • ChrisMarshallNY 2 hours ago
      Looks like GH nuked it, though.

      Hope they didn’t get too many folks.

    • nubinetwork 1 hour ago
      That's a little creepier than the time I got an email from someone trying to push a new crypto coin to me because I contributed to OSS.
  • keiferski 25 minutes ago
    I've spent a lot of my career marketing to developers, and spamming their GitHub account might be one of the top 1 or 2 worst marketing tactics you can use.

    Cold emailing rarely works by itself. Cold emailing developers via emails you pulled from their GitHub accounts? At that point, you're actively harming your brand, and may as well just send them spam diet pill ads.

  • EdNutting 24 minutes ago
    My solution to this is to use a GitHub-specific email address. All emails sent to that address which do not originate from GitHub are immediately reported as spam, marked read and deleted.

    I sometimes use different git/GitHub addresses depending on who I'm working for or specific projects so I can more accurately detect where data is being scraped from.

    • EdNutting 17 minutes ago
      N.B. Using service-specific emails is trivial - you don't need separate email accounts. Just use email aliases, e.g. "john.smith+github@gmail.com" -- which is an alias called "github" for "john.smith@gmail.com"
      • gus_massa 7 minutes ago
        Don't spammers have an automatic filter to clean that up?
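
On the question of whether the `+github` tag can be filtered out: stripping it is trivial, as the Python sketch below shows, so a plus-alias does not hide the base address from a scraper who bothers to normalize. The tag is still useful on the receiving side for spotting where an address leaked when the sender did not strip it. The function name is made up for illustration, and the sketch assumes the common `local+tag@domain` convention, which not every mail provider supports.

```python
from typing import Optional, Tuple


def split_plus_alias(address: str) -> Tuple[str, Optional[str]]:
    """Split 'local+tag@domain' into ('local@domain', 'tag').

    The tag is None when the address has no '+' in its local part.
    """
    local, _, domain = address.rpartition("@")
    base, plus, tag = local.partition("+")
    return f"{base}@{domain}", (tag if plus else None)


print(split_plus_alias("john.smith+github@gmail.com"))
print(split_plus_alias("john.smith@gmail.com"))
```

A separate address on your own domain (or a catch-all) cannot be normalized away like this, which is one reason some people prefer it over plus-aliases.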
  • c16 1 hour ago
    Email address privacy is a feature offered by GitHub that replaces your day-to-day email: https://docs.github.com/en/account-and-profile/how-tos/email...
  • armchairhacker 2 hours ago
    I remember this being discussed a while ago

    https://news.ycombinator.com/item?id=9332418 (11 years ago)

    https://news.ycombinator.com/item?id=20660624 (7 years ago)

    https://news.ycombinator.com/item?id=27855152 (5 years ago)

    https://news.ycombinator.com/item?id=30900237 (4 years ago)

    Seems it’s a reoccurring issue

  • WhatsName 1 hour ago
    Doesn't YC have some code of conduct or legal/ethical guidelines? I would assume a legal and compliance department would have some major headaches if documented cases of misconduct jeopardize later due diligence. I would not fund or acquire a company on the radar of national regulatory bodies for something as stupid as this.
  • j16sdiz 12 minutes ago
    Over the years, I have received emails from universities for surveys / research.

    This is not limited to GitHub; I once got a survey about my experience interacting with folks on lkml.

  • kristoff_it 1 hour ago
    I have received over the years so much spam of this kind by multiple YC-funded companies that I now reflexively send to spam any email that mentions being YC-funded, regardless of how legitimate the email is.
    • neya 1 hour ago
      I don't blame you, the FOMO is real to the point even basic ChatGPT wrappers are getting funded these days, I guess.
    • AznHisoka 31 minutes ago
      Same here. Having YC attached to your name is not the flex you think it is; it's even the opposite for me.
  • theturtletalks 47 minutes ago
    General advice would be to mark the email as spam or junk and hopefully their email platform penalizes them, but this has been working less and less. Email has truly become pay to play now.
    • suyash 1 minute ago
      That's exactly what I've been doing with solicitation emails, reporting as SPAM on gmail.
  • axegon_ 13 minutes ago
    I've received several similar ones over the years. At this point, if I get an email from someone I don't know and it contains a link, chances are it's spam. I genuinely doubt GitHub (or any other company, for that matter) would do something about it. While I fully support GDPR, the truth is, few people are willing to take action knowing how much bureaucracy would be involved...
  • pscanf 2 hours ago
    I was also spammed (twice) by voice.ai.

    You mention GDPR, which also "applies" to me, though I wonder if what they're doing is actually illegal. I mean, after all, I'm putting my email on GitHub precisely to give people a way to contact me.

    Of course, I do that naïvely, assuming good faith, not expecting _companies_ to use it to spam me. So definitely what they're doing is, at the very least, in poor taste.

    • notpushkin 36 minutes ago
      > I'm putting my email on GitHub precisely to give people a way to contact me.

      They’re not only looking at the public email in your profile, they’re also looking at your committer email (git config user.email). You could argue that you’re not putting that out for people to contact you.

      (I’ve used that trick a couple times to reach out to people, too, but never mass emailing.)

    • zvqcMMV6Zcr 1 hour ago
      Is there any company that will take my money to solve GDPR issues? And by solve I mean sue the spammers? For the last few years I've seen them "try" to look legit by claiming the addresses are managed by some Hungarian/Spanish shell company, hoping no one will be able to afford pursuing infractions across borders.
      • RobotToaster 1 hour ago
        There's probably a law against it, but I've always thought a legal company could make decent money taking cases like this in bulk for free, on the condition that they get to keep all the compensation, while the "client" still gets the satisfaction of punishing the offending party.
        • notpushkin 34 minutes ago
          That’s pretty much class action lawsuits!
      • KomoD 1 hour ago
        > Is there any company that will take my money to solve GDPR issues? And by solve I mean sue the spammers?

        A lawyer

    • victorbjorklund 59 minutes ago
      They spammed me as well.
  • rodrigodlu 13 minutes ago
    I did receive these kinds of emails as well.

    And I have used a different email from my priority email for GitHub commits for 4 years now.

    So just stop with marketing slop please.

    Yes, I work with AI, and I'm becoming pretty good at it.

    But this doesn't mean I'm comfortable pushing AI slop into potential users and customers.

    I (and they) want to use AI to facilitate their processes, not to ingest slop content.

  • rlaabs 1 hour ago
    I've received the exact same email from the same company.
  • ChrisMarshallNY 2 hours ago
    I’m not especially bothered by this [yet; AI is likely to make this worse]. It’s a fairly insignificant component of what my spam catcher handles. At least it’s a bit focused.

    Every day, I get deluged with hundreds of spam and scam emails, often because some knucklehead entered my email in a form (either accidentally, or as a throwaway red herring).

    • Maxious 1 hour ago
      Sure but these YC spammers are identifiable and have much more to lose https://www.ycombinator.com/ethics/

      > Some examples of ethical behavior we expect from founders are:

      > - Not spamming members of the community

      > To maintain our community, if we determine (in our sole discretion) that a founder has behaved unethically during or after YC, we will revoke their YC founder status. This includes access to all Y Combinator spaces, software, lists and events. All founders in a company may be held responsible for the unethical actions of a single co-founder or a company employee, depending on the circumstances.

      • RobotToaster 1 hour ago
        Has this ever actually been enforced?
      • ChrisMarshallNY 45 minutes ago
        > > - Not spamming members of the community

        Ah... but there's the rub.

        Define "the community."

        Do random GH accounts count as "members of the YC community"?

        Sorry, but unsolicited contact, much as I hates, HATESSSS it, is a classic component of any business, and has been, for many decades. I don't think it would be appropriate for a business organization to prohibit its members from engaging in "cold calling," of which, UCE is really an example.

        Using the YC branding/name, however, is a different matter.

  • nprateem 29 minutes ago
    There's no reason to put your real email in git config unless you're signing, in which case repos should be private. I would have thought that was obvious.
  • outloudvi 1 hour ago
    I usually check the "Received" header and report to the email service provider. Once in a while I receive a response saying the case is properly handled.

    These providers are the only ones that care about their reputation and thus may take some action. Investors? Nope.
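
For anyone who wants to try the "Received" header approach described above, here is a minimal Python sketch using the standard library `email` package. The message, hostnames, IDs, and the documentation-range IP address are all made up. Hops are listed newest first, so the topmost entry is the one written by your own mail server; entries further down were added by earlier servers and may be forged.

```python
from email import message_from_string

# Made-up message for illustration. "\t" starts a folded continuation line.
RAW = """\
Received: from mail.sender.example (mail.sender.example [203.0.113.7])
\tby mx.recipient.example with ESMTPS id abc123;
\tMon, 01 Jan 2024 12:00:00 +0000
Received: from localhost (localhost [127.0.0.1])
\tby mail.sender.example with ESMTP id def456;
\tMon, 01 Jan 2024 11:59:58 +0000
From: Someone <someone@sender.example>
Subject: think you'd like this

Hi,
"""


def received_chain(raw_message: str) -> list:
    """Return the Received headers, newest hop first, with folding collapsed."""
    msg = message_from_string(raw_message)
    return [" ".join(h.split()) for h in msg.get_all("Received", [])]


for hop in received_chain(RAW):
    print(hop)
```

The host and IP in the first hop identify the sending infrastructure, which is what you would look up to find the provider's abuse contact.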

  • bakugo 1 hour ago
    This sounded familiar, so I checked my inbox and I did indeed receive a similar email from sanchitmonga@runanywheresdk.com earlier this month:

    > I came across your GitHub profile and thought you might be interested in what my team and I are building. We're developing an open source SDK that runs LLMs directly on-device.

    What's even more interesting is that both buildrunanywhere.org and runanywheresdk.com show a stock hostinger parking page when accessed in a browser. Something tells me they're intentionally registering these "alternate" domains specifically for spam, to avoid tanking the email reputation of their main runanywhere.ai domain.

    I guess I shouldn't be surprised given YC is going all in on AI and most AI companies are no better than the crypto scammers of yesteryear, but still.

  • koakuma-chan 1 hour ago
    I have been having the same experience. If you starred a GitHub repo, and they think that their product is similar, they will send you their spam. I condemn this! They should be ashamed!
  • atfzl 2 hours ago
    [flagged]
    • speedgoose 1 hour ago
      Why would you promote spam?
    • RobotToaster 1 hour ago
      I feel like spam is somewhat less offensive when it's for FOSS, assuming it isn't some faux FOSS freemium scam. It's about the only spam I wouldn't mind getting.
    • bilekas 1 hour ago
      This is some next level spam posting. Not sure to be annoyed or impressed.
  • ValentineC 2 hours ago
    > These emails indicate that those companies scrape people's Github activity, and if they notice users contributing to repos in their field of business, send marketing emails to those users without receiving their consent. My guess is that they use commit metadata for this purpose.

    There are likely marketing email datasets floating around the internet that contain email addresses scraped from commit metadata.

    I use a catchall with a specific Git client (not GitHub) email address, and found spam and phishing emails being sent there quite a few times.

    • input_sh 1 hour ago
      It may not necessarily come from commit metadata. There's at least one much simpler way: adding .gpg to the end of any user URL will return that user's public GPG key.