Announcing transcribe.cpp

Meet transcribe.cpp, a new open-source C/C++ speech-to-text inference library with portable, GPU-accelerated support for multiple STT models. Developed through Mozilla.ai's Builders in Residence program, it makes adding fast, local transcription to applications easier than ever.

Jean Le Tavernier (1450-1460) / The scribe Jean Mielot in the writing room

We are excited to announce that CJ Pais' transcribe.cpp has been officially released!

What is transcribe.cpp?

transcribe.cpp is a C/C++ speech-to-text (STT) inference library. Think of it as "llama.cpp for STT models": it builds on the ggml runtime to support a variety of STT model families via GGUF, and runs them on Metal, Vulkan, and CUDA backends for fast GPU inference.

CJ has previously collaborated with our Mozilla siblings through their Builders program, contributing to the llamafile project in several ways: he created the LocalScore benchmarking tool, added support for new models, and integrated whisper.cpp functionality in the form of whisperfile. His work on STT has since grown into his own desktop application, Handy, which was featured in WIRED at the beginning of this year.

Development of transcribe.cpp started from an observation: many very good STT models could have been included in Handy, but they are often developed in isolation. This leaves them with two recurring weaknesses: poor portability (e.g. MLX models only run on Macs) and sub-optimal performance (as acceleration rarely works everywhere out of the box). transcribe.cpp provides a uniform interface that makes it easy to bring GPU acceleration to all these models. The result is an open-source library available not just to Handy, but to everyone who wants to add STT functionality to their applications.

But that's not the only reason to celebrate this release: transcribe.cpp is also the first independent open-source project developed with the support of Mozilla.ai's Builders in Residence (BiR) program! Our goal with BiR is to advance applied, cutting-edge research in the open while connecting it with our own roadmap. In the case of transcribe.cpp, this translates into using the library to build transcribefiles: portable, multi-platform, self-contained executables that you can run (almost) anywhere to perform audio transcription.

What does this mean for you?

If you are a builder looking to add STT functionality to your application, the library’s GitHub repo is your next stop. You can also try transcription without writing a line of code using Handy, or use our llamafile to bundle your favorite model and configuration into a self-contained executable for an ad-hoc transcription task. And this is just the start: we look forward to seeing people create new tools with transcribe.cpp!