Real-time Automatic Speech Recognition

Streaming transcription at a fraction of the cost

CAIMAN-ASR enables ASR at scale, supporting over 1000 real-time streams within stringent latency budgets and reducing CapEx by as much as 90%. A single 1U server with one accelerator card running CAIMAN-ASR has the same throughput as twenty unaccelerated servers.
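To make the consolidation arithmetic concrete, here is a minimal capacity-planning sketch. It takes 1000 streams per accelerated server as a conservative reading of "over 1000", and derives 50 streams per unaccelerated server from the twenty-to-one figure; that derived number is an assumption for illustration, not a published benchmark. The function and constant names are hypothetical, and the sketch counts servers only, not cost.

```python
import math

# Figures taken from the text above.
ACCELERATED_STREAMS_PER_SERVER = 1000   # "over 1000 real-time streams"
SERVER_EQUIVALENCE = 20                 # one accelerated server ~ twenty unaccelerated

# Assumption: derived by dividing the two figures above, not a measured value.
UNACCELERATED_STREAMS_PER_SERVER = ACCELERATED_STREAMS_PER_SERVER // SERVER_EQUIVALENCE


def servers_needed(streams: int, streams_per_server: int) -> int:
    """Smallest whole number of servers able to carry `streams` concurrent streams."""
    if streams < 0 or streams_per_server <= 0:
        raise ValueError("streams must be >= 0 and streams_per_server must be > 0")
    return math.ceil(streams / streams_per_server)


def compare(streams: int) -> tuple[int, int]:
    """Return (accelerated_servers, unaccelerated_servers) for a target stream count."""
    return (
        servers_needed(streams, ACCELERATED_STREAMS_PER_SERVER),
        servers_needed(streams, UNACCELERATED_STREAMS_PER_SERVER),
    )


if __name__ == "__main__":
    accelerated, unaccelerated = compare(5000)
    print(f"5000 streams: {accelerated} accelerated vs {unaccelerated} unaccelerated servers")
```

Under these assumptions, 5000 concurrent streams need 5 accelerated servers or 100 unaccelerated ones. Rounding up matters at the margins: 1001 streams already require a second accelerated server.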

Lowest end-to-end latency

CAIMAN-ASR exploits the parallel processing of Achronix’s Speedster7t® FPGA, which powers the accelerator cards, to achieve extremely low-latency inference. This allows NLP workloads to run with human-like response times in end-to-end conversational AI.

Simple to integrate into existing systems

The accelerator stack is provided complete, including the WebSocket server software and a WebSocket API that simplifies integration with existing services.
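As a rough illustration of what a streaming client has to do before it talks to a WebSocket endpoint, the sketch below splits raw audio into fixed-duration frames and hands each one to a `send` callable. Everything specific here is an assumption: the 16 kHz sample rate, 16-bit PCM encoding, 60 ms frame length, and the names `pcm_frames` and `stream_audio` are hypothetical. The actual CAIMAN-ASR message format is not described on this page, so the transport is deliberately left abstract.

```python
from typing import Callable, Iterator


def pcm_frames(
    audio: bytes,
    sample_rate_hz: int = 16000,   # assumed sample rate
    frame_ms: int = 60,            # assumed frame duration
    bytes_per_sample: int = 2,     # assumed 16-bit mono PCM
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration frames; the final frame may be shorter."""
    frame_bytes = sample_rate_hz * frame_ms // 1000 * bytes_per_sample
    if frame_bytes <= 0:
        raise ValueError("frame size must be positive")
    for start in range(0, len(audio), frame_bytes):
        yield audio[start:start + frame_bytes]


def stream_audio(audio: bytes, send: Callable[[bytes], None]) -> int:
    """Pass each frame to `send` (e.g. a WebSocket client's binary send) and count them."""
    sent = 0
    for frame in pcm_frames(audio):
        send(frame)
        sent += 1
    return sent


if __name__ == "__main__":
    one_second_of_silence = bytes(16000 * 2)
    outbox: list[bytes] = []
    count = stream_audio(one_second_of_silence, outbox.append)
    print(f"sent {count} frames, first frame is {len(outbox[0])} bytes")
```

In a real client, `send` would be the binary-send method of whichever WebSocket library is in use, and the frame parameters would come from the server's documentation rather than the defaults assumed here.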

Scale up rapidly & easily

CAIMAN-ASR runs on industry-standard PCIe accelerator cards, enabling existing racks to be upgraded quickly for up to 20x greater call capacity. The VectorPath® S7t-VG6 Accelerator Card from BittWare is available off-the-shelf today.

Efficient inference, at scale

CAIMAN-ASR uses as much as 90% less energy to process the same number of real-time streams as an unaccelerated solution, significantly reducing energy costs and enhancing ESG credentials.

Streaming transcription

CAIMAN-ASR is provided pre-trained for high-quality English-language transcription. For applications requiring specialist vocabularies or other languages, the neural model can easily be retrained on customers’ own large, bespoke datasets using the popular ML framework PyTorch.