Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration

(github.com)

47 points | by noahkay13 5 hours ago

5 comments

noahkay13 5 hours ago
I built a C++ inference engine for NVIDIA's Parakeet speech recognition models using Axiom (https://github.com/Frikallo/axiom), my tensor library.
What it does:
- Runs 7 model families: offline transcription (CTC, RNNT, TDT, TDT-CTC), streaming (EOU, Nemotron), and speaker diarization (Sortformer)
- Word-level timestamps
- Streaming transcription from microphone input
- Speaker diarization detecting up to 4 speakers
- aaronbrethorst 1 hour ago
  I see a number of references to macOS support in your docs for Axiom. Can this run on iOS?
  - noahkay13 36 minutes ago
    Theoretically, yes? This hasn't been tested, but Xcode has great C++ interop, and the goal is for Axiom and now parakeet.cpp to be used for portable deployments, so making that process easier is definitely on the roadmap.
antirez 48 minutes ago
Related:
https://github.com/antirez/qwen-asr
https://github.com/antirez/voxtral.c
Qwen-asr can easily transcribe live radio (see README) on any random laptop. It looks like we are going to see really cool things in local inference, now that automatic programming makes it a lot simpler to create solid pipelines for new models in C, C++, Rust, ..., in a matter of hours.
ghostpepper 4 hours ago
Off topic but if anyone is looking for a nice web-GUI frontend for a locally-hosted transcription engine, Scriberr is nice
https://github.com/rishikanthc/Scriberr
nullandvoid 1 hour ago
I've been using Handy with Parakeet on both Windows and Mac, and have been very impressed.
How does this compare?