Show HN: Scriber Pro – Offline AI transcription for macOS

(scriberpro.cc)

132 points | by rezivor 1 day ago

31 comments

geerlingguy 1 day ago
I've been using MacWhisper for this, with a huge variety of transcription options and things like speaker detection. It works great for all the 1 hour and shorter videos I've fed it, but does this have more to offer?
I haven't tried a 4+ hour video with MacWhisper but I presume that would work the same.
[-]
- rezivor 1 day ago
  Please be my guest to test my claims. No tall tales here!
  [-]
  - gcr 1 day ago
    MacWhisper handles multiple-hour-long recordings just fine for me. I regularly process 4hrs on MacWhisper. Even whisper-cpp works fine these days for long recordings too.
    Cool product, but it would be better if you stopped spreading misinformation to support it.
    [-]
    - lostlogin 1 day ago
      > Cool product, but it would be better if you stopped spreading misinformation to support it.
      I don’t see this sort of thing, has the page changed? Edit: the comments here…
      The drop shadow on the pages does make it deeply unpleasant to read.
busymichael 1 day ago
As a side project, I just launched a privacy-first web-based meeting transcriber (https://basilai.app/app). Everything runs entirely in your browser — both the transcription and AI summarization — so no audio or text ever leaves your device.
I'm using the browser's built-in transcription service, plus downloading a model and running it via WebGPU. No login. At the end of your meeting, you get a zip file with the audio, transcript, and summary.
xnx 1 day ago
You can also run Whisper locally in your browser for free: https://ggml.ai/whisper.cpp/
[-]
- rezivor 1 day ago
  Great when you have time to kill and not a lot to process I suppose
Telemakhos 1 day ago
What languages does this support? Does it support switching between multiple languages in one video?
For example, could it support a video that included spoken Latin, ancient Greek, German, and Italian?
[-]
- runxel 1 day ago
  So weird that this is nowhere stated on the website at all. Was literally the first and only thing I was interested in. So bad.
- rezivor 1 day ago
  eng der fr de es it pt ru zh ko ar and ja
  [-]
  - Telemakhos 1 day ago
    So, can it handle multiple languages in one video, or do you need to segment the different languages using LID first? This has been a thorny issue for people working in multilingual audio (there are at least two or three of us).
    [-]
    - rezivor 1 day ago
      I haven't tested that specific edge case, I'm sorry. I tested two languages in a normal conversation and that worked fine. "Auto or English" handles multiple languages the best.
yewenjie 1 day ago
Does it support speaker diarization?
torstenvl 1 day ago
You use the word "transcribe" but the page doesn't appear to support that claim? This looks like straightforward STT? Or does it actually support transcription (diarization, etc.)?
(Also, the text is completely illegible on your site.)
[-]
- rezivor 1 day ago
  r/#FF0000_rage
der_philipp 8 hours ago
Also look at Vibe:
It even supports speaker differentiation/recognition and is open source on macOS/Windows/Linux:
https://github.com/thewh1teagle/vibe
[-]
- der_philipp 7 hours ago
  It uses Whisper, but also directly calls other tools and puts everything under one nice GUI.
masonkim25 6 hours ago
I’ve also been pretty careful with sensitive recordings, so the offline part really stands out to me. This looks great.
mattstudio 1 day ago
One thing that Rev and other online services have as well as MacWhisper is a good interface for editing the text to correct inevitable errors. Being able to click on the text and have it sync to the correct place in the audio is a must for my use case of transcribing interviews. Also speaker diarization.
[-]
- rezivor 1 day ago
  Scriber's iCloud system automatically backs up each transcription and organizes them in a three-pane folder view, somewhat inspired by Bars’ layout. This structure allows a surprising degree of customization for all your data needs, especially when transcribing interviews. It would probably make for a very comfortable workflow here.
CrazyCatDog 1 day ago
Question: can it discern (and label) different speakers? If so, could you kindly share the limit on speakers per video?
[-]
- CharlesW 1 day ago
  MacWhisper Pro supports this, if your need for this is time-sensitive. https://macwhisper.helpscoutdocs.com/article/32-automatic-sp...
- oidar 1 day ago
  You are looking for speaker diarization. No one is doing this well currently on device (in macOS land at least).
- rezivor 1 day ago
  No, not yet! That will definitely be included in the next update next month. Thank you for reminding me of people's unique need for this use case.
scilro 1 day ago
Seconding/thirding the request for diarization! I would use this as my main transcription app if it had that.
[-]
- rezivor 1 day ago
  I use it as my own transcription app. I really do love it (biased, I know, but genuinely).
cassettelabs 12 hours ago
Not gonna lie, the pricing is super attractive! :) I'd love to see an API, so I can also run automations using this tool on my Mac.
pmarreck 1 day ago
Does it do separate speaker identification (diarization)?
What's the stack, if I may ask? (I believe Whisper-X does the diarization thing)
nvdnadj92 1 day ago
I vibecoded a similar app. Here’s the open source link, if folks want to build their own:
https://github.com/naveedn/audio-transcriber
[-]
- rezivor 1 day ago
  Slower
  [-]
  - nvdnadj92 1 day ago
    Yes, but by a negligible margin. My program is designed for multi-track audio, which means I run this in parallel on multiple 3 hour recordings, and get results in 12 minutes.
    You haven’t shared any architectural details. What model? What size? How can anyone be sure that what you’re building is truly offline?
  - ramon156 1 day ago
    Yours isn't OSS, meaning I have no idea what I'm running
    [-]
    - rezivor 1 day ago
      OSS would be incredibly slow, also seems like overkill for this use case
      [-]
      - mpeg 1 day ago
        I was going to buy the app, but these responses are putting me off massively. How would making it OSS slow it down?
      - kamranjon 1 day ago
        I suspect, from the responses of the creator here, that this app they are selling is likely violating a number of open source licenses…
      - user- 1 day ago
        the obnoxious site design and comments like this stopped me from clicking buy in the App Store
      - fl_rn_st 1 day ago
        What does that even mean? Why would OSS make it slower? Why would it be overkill? This is not Product Hunt, you have to give at least some kind of explanation for your claims.
      - konart 1 day ago
        OSS as in open source software. Not Open Sound System. Just in case.
      - ideashower 1 day ago
        Can you back up your claim that it's slow?
      - sings_lullabies 1 day ago
        [dead]
oasisbob 1 day ago
Timecode drift is an interesting issue, think I faced this recently while translating a Google Meet transcript into an incident report timeline.
The elapsed-time timestamps didn't correlate well with other data sources. I figured it was a mistake on my end, and just brushed it off.
nubg 1 day ago
How does it compare to MacWhisper?
[-]
- rezivor 1 day ago
  MacWhisper crashes at about an hour of context. This uses smart, invisible regex in the text generation pipe, which makes it fast. As a bonus, there is no context limit.
  [-]
  - barapa 1 day ago
    Smart invisible regex makes it fast and prevents it from crashing? What does that mean?
  - grosswait 1 day ago
    I've done 3+hours with MacWhisper without issue? One downside is the transcription is not real time - can Scriber Pro do realtime?
    [-]
    - KPGv2 1 day ago
      I haven't worked in a while with transcription, but whisper.cpp itself (which I assume is the underlying tech behind MacWhisper) does realtime transcription on my MBP with an M1 Pro chip. When I first started writing my last completed novel, I fired it up and just started telling the story to test it out. Realtime.
      That was back in 2023. I assume things work better now.
  - fady0 1 day ago
    I am a MacWhisper Pro user, and I successfully transcribed and translated a 15-hour course inside the app without any issues
  - CharlesW 1 day ago
    > MacWhisper crashes at about an hour of context.
    This is not true. (I've been a MacWhisper user since 2023. I've had two bugs during that time, which the author addressed quickly.)
  - fl_rn_st 1 day ago
    "Smart, invisible regex" sounds like a lot of bs... could you give a more technical explanation?
    Also the Whisper model doesn't really have a context window, it already segments the audio with a certain amount of overlap between the chunks, I really have a hard time understanding what you are trying to say here.
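For readers unfamiliar with the idea, here is a minimal sketch of the overlapping-window segmentation described above. It is a generic illustration, not Whisper's or MacWhisper's actual code: the function name `overlapping_windows` is hypothetical, the 30-second window matches Whisper's fixed input length, and the 5-second overlap is an assumed value chosen only for the example.

```python
from typing import Iterator, Tuple


def overlapping_windows(
    total_seconds: float,
    window_seconds: float = 30.0,   # Whisper consumes 30-second inputs
    overlap_seconds: float = 5.0,   # assumed value, for illustration only
) -> Iterator[Tuple[float, float]]:
    """Yield (start, end) times that cover a recording with overlapping windows."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if not 0 <= overlap_seconds < window_seconds:
        raise ValueError("overlap_seconds must be in [0, window_seconds)")

    step = window_seconds - overlap_seconds
    start = 0.0
    while start < total_seconds:
        end = min(start + window_seconds, total_seconds)
        yield (start, end)
        if end >= total_seconds:
            break  # the last window reached the end of the recording
        start += step


if __name__ == "__main__":
    # A 4-hour recording is just a longer sequence of fixed-size windows.
    windows = list(overlapping_windows(4 * 60 * 60))
    print(len(windows), windows[:3])
```

Because each window is processed on its own, the work per step stays bounded by the window size, so a longer file means more windows rather than a larger single input.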
    [-]
    - rezivor 1 day ago
      Whisper will fail >99%* of the time at lengths over 90 minutes, and fairly often over one hour. (*Edit: most of the time.)
      [-]
      - saaaaaam 1 day ago
        This is absolutely not my experience. I regularly (weekly at least) use whisper for 90-120 minute pieces of content and only rarely have problems.
      - pmarreck 1 day ago
        Can't really declare that without declaring which whisper model in particular you are referring to, as there are a number of them
      - fl_rn_st 1 day ago
        This is just plain wrong. I have my own Whisper app in the App Store (on iOS, with very limited memory capacity) and there are no problems at all with longer audio/video files.
        [-]
        - rezivor 1 day ago
          I've never had whisper complete a single attempt at anything over 75 min.
      - gcr 1 day ago
        I’ve used whisper-cpp on 5-hour podcasts without problems.
        Would also love to hear what you mean by “smart invisible regex,” sounds like AI slop to me.
  - pmarreck 1 day ago
    > Smart invisible regex
    I've never heard a regex person speak this way of a regex.
    Please tell me you didn't vibecode the regex... one of the areas it's still not good at
  - gcr 1 day ago
    What do you mean context limit?
    Neither whisper nor MacWhisper has any context limit.
xjlin0 22 hours ago
Is it only for English? Is a CLI available? There are thousands of files on my local machine and I'd like to save the results to a local DB. Thanks!
jiriro 1 day ago
Will it transcribe audio in Czech (in future versions)?
Actually I would be happy if it could just identify occurrences (timestamps) of a specific word or a small set of words.
vladsanchez 1 day ago
App Store link no longer works. Willing to try/purchase but it's not available anywhere. App Store search doesn't return "Scriber Pro" either.
Thanks.
[-]
- KPGv2 1 day ago
  FYI it works now because I just brought it up. The website mentions there were HN promo/discount codes, so I honestly expected the app to be like $20+, so color me shocked when it's $3.99.
  [-]
  - vladsanchez 8 hours ago
    FYI: Still doesn't redirect... Had to remove the url query params. Only https://apps.apple.com/us/app/scriber-pro/id6751968220 worked as expected.
    Thanks for sharing.
    Looking forward to the "Speaker Detection" feature release. ;)
mattfrommars 1 day ago
What is your tech stack to make this? Is it end-to-end Swift?
[-]
- rezivor 1 day ago
  Swift 37.0%, C++ 26.5%, C 19.8%, Rust 4.7%, Shell 4.4%, Objective-C 1.7%, Other 5.9%
  [-]
  - CaptainOfCoit 1 day ago
    What language do you have the model architecture and implementation in? Feel like it would be the biggest proportion of the codebase, curious if you did it in Swift?
FitchApps 1 day ago
Nice work. What model did you use and do you ship the model with a base distribution or is it downloaded with the app?
[-]
- woodson 23 hours ago
  Probably NVIDIA’s Parakeet-TDT-0.6b-v3.
aquir 1 day ago
My eyes, my eyes! What is this red colour?
Cool project, I am using ChatGPT for recording/summarising meetings but the limit there is 2 hours
[-]
- rezivor 1 day ago
  That, my friend, is just one of the annoying bottlenecks that inspired me to do this.
  Also, that color is color(display-p3 .768627 .031373 .031373 / 1). It is technically redder than red actually is.
trvr 1 day ago
Is there a reason it requires macOS 26?
[-]
- rezivor 1 day ago
  It uses Liquid Glass fairly heavily, so just design reasons. I'm happy to expand compatibility.
  [-]
  - trvr 1 day ago
    At $3.99 this was an instant buy for me until the App Store told me I couldn't. I think the Venn diagram between HN users and those holding off on Tahoe is probably a pretty big overlap. ;-)
    [-]
    - nkotov 1 day ago
      Same! I was trying to buy but wasn't able to.
    - rezivor 1 day ago
      If ever there were a reason to upgrade!
      But that's a fair point. I drank the Liquid Glass Kool-Aid. I'll aim for more compatibility in the next upgrade.
      [-]
      - busymichael 1 day ago
        Sharing my experience with only supporting the latest OS:
        I have launched apps focused on a new feature in the latest OS and regretted it. The # of people who have the latest OS is much smaller than the full install base for much longer than I thought. As a result, my marketing conversion was unnaturally low - people who liked the app idea but couldn't install because they had the wrong OS. This causes two problems: potential users I activated but couldn't convert and this signal gets internalized by the App Store, pushing down future impressions.
        Now I always have a fallback implementation of the feature so I can target the prior OS. Both Mac and iOS.
      - polarix 1 day ago
        Do you have a mailing list?
- somberi 1 day ago
  Wanted to buy. Not on macOS 26.
- tomalbrc 1 day ago
  Tahoe really is just unusable
  [-]
  - rezivor 1 day ago
    Thanks. An upcoming update will let a user give it a link that contains web video. It will do dynamic link discovery (with a Safari extension) and pull in the video automatically (M3U8 discovery and retrieval). There are lots of online lecture videos that need transcription.
    I will include better version support (probably back to macOS 13).
  - trvr 1 day ago
    Hence why I'm asking why it requires it. Trying to hold off as long as I can!
thedangler 1 day ago
Any way to access this with python so I can use it programmatically?
[-]
- rezivor 1 day ago
  This isn't run using Python, but also no
re 1 day ago
What libraries/models is this built on?
tempodox 1 day ago
Too bad it requires that unspeakable abomination macOS 26. No can do.
brokensegue 1 day ago
Word level timestamps?
[-]
- rezivor 1 day ago
  Ah, I think you're asking whether a user who wants timestamps can further set the output to be per word? Timestamps are currently set around each sentence (2-5s). But that is absolutely doable and that’s a great idea. In the next update (~3-4 weeks) I’ll definitely include the ability to control that.
- rezivor 1 day ago
  Can you clarify your question?
  [-]
  - qwertytyyuu 1 day ago
    I think they are trying to do something like select a word in the transcript and be taken straight to the point in the video where said word was spoken.
constantinum 1 day ago
I sort of use SuperWhisper, it is sort of good. https://superwhisper.com/
dang 1 day ago
[stub for offtopicness]
[-]
- rezivor 1 day ago
  Due to overwhelming response,
  https://iili.io/KkoKBCx.png
  I have changed the bg color
- heystefan 1 day ago
  Seems like a great app, but I have to ask:
  https://scriberpro.cc/about/ Are you trolling people with this page's design? Unreadable colors AND a wobble effect? :D
  [-]
  - bazzargh 1 day ago
    Also, disabling scrolling and using swipe for sections instead _at a font size that causes text to overflow_, depending on phone screen size, meaning a bunch of the site is _literally_ unreadable, since it's off the screen with no way to get there.
  - qwertytyyuu 1 day ago
    reading on drugs simulator?
  - rezivor 1 day ago
    I honestly thought it would be nothing more than a fun little easter egg. Red was a bold choice, I admit! I will be making a CSS update soon.
    [-]
    - coldtea 1 day ago
      They also mean the 3D effect (and there's also a blur one)
- Oras 1 day ago
  The landing page is difficult to read due to the strong red background.
  [-]
  - rezivor 1 day ago
    Thank you. I have heard this :P It will change soon.
    [-]
    - ta1243 1 day ago
      I learned this as a young developer writing status pages. I had a colour-blind colleague, so text on green/yellow/red.
      Typically white text works better on red.
      [-]
      - seanwilson 1 day ago
        Reds people choose are usually between dark and light, which doesn't contrast particularly well on anything because for good contrast you need a dark color vs a light color.
        Green, yellow, red or whatever hue is fine, as long as it's dark or light enough. Colorblind and non-colorblind people can see how dark or light a color is (luminance), but they might not agree on the hue. That's why WCAG contrast checks require luminance contrast and not hue contrast.
        It's best to use a contrast checker because it's not always intuitive how dark or light a color is e.g. yellow and lime are almost as light as white.
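As a rough illustration of what such a contrast checker computes, here is a minimal sketch of the WCAG 2.x relative-luminance and contrast-ratio formulas. It assumes plain sRGB colors written as `#RRGGBB`; the function names are made up for this example, and it does not handle wide-gamut values such as the `display-p3` red quoted elsewhere in this thread.

```python
def _linearize(channel: float) -> float:
    """Convert one sRGB channel in [0, 1] to linear light (WCAG 2.x formula)."""
    if channel <= 0.03928:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance (0 = black, 1 = white) of an sRGB '#RRGGBB' color."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a color of the form #RRGGBB")
    r, g, b = (int(value[i:i + 2], 16) / 255 for i in (0, 2, 4))
    # Green contributes most to perceived lightness, blue the least.
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio between two colors, from 1 (identical) up to 21."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    for name, color in [("yellow", "#FFFF00"), ("lime", "#00FF00"), ("red", "#FF0000")]:
        print(name, round(relative_luminance(color), 4))
    print("white on black:", round(contrast_ratio("#FFFFFF", "#000000"), 2))
```

Only luminance enters the ratio, which matches the point above: hue does not matter to the check, only how dark or light each color is. Pure yellow comes out at a relative luminance of about 0.93, which is why it behaves almost like white against dark text.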
- reactordev 1 day ago
  I can’t read this site. Red background with dark grey/black text is a terrible, terrible choice of colors for readability. There were some words there but all I could make out was the header on my mobile phone.
- SeanAnderson 1 day ago
  https://i.imgur.com/EBaMNDS.png This is how your landing page looks on my desktop.
  The first thought I had when it loaded was, "Did we forget how to make webpages?"
  Sorry. I'm sure the software is great, but yeah.
  [-]
  - rezivor 1 day ago
    No need to apologize!
    Oh, I see the title ate your h2 text. That's no good. Thanks for showing me.
- thunderson 1 day ago
  [dead]
- jorgenbuilder 1 day ago
  The red looks great. Takes me back to the old days.
- dangoodmanUT 1 day ago
  MY EYES
  [-]
  - rezivor 1 day ago
    Red was a bold choice, I'll admit.
- iLoveOncall 1 day ago
  That page has a very aggressive background color and really low contrast. Extremely annoying.
  [-]
  - rezivor 1 day ago
    Thank you! I think I was actually going for that.
    [-]
    - ankit_mishra 1 day ago
      You were going for *really low contrast* ? That's an interesting goal to have.