Solutions

Your models running with the lowest latency and highest efficiency

We perform bespoke acceleration of clients’ inference models on FPGAs and other programmable platforms to give them a latency advantage over their competitors or a cost benefit compared with alternative solutions. These solutions may remain programmable to benefit from the latest developments in ML, or be converted to an ASIC for greater performance and lower unit cost. For a confidential discussion to determine whether this could benefit you, please contact us here.

Examples of our work

Our work has benefited clients in a wide range of markets, including finance, aerospace & defense, speech, social media and automotive.

Faster, more intelligent decision-making in finance
More natural-sounding speech through latency reductions of up to 20%
CapEx & OpEx savings of up to 90%
Superior image classification using poor quality images

How we do it

We optimize ML inference for efficient hyperscale deployment of a wide range of workloads using our patented MAU Accelerator™ technologies and proven design techniques such as:

Heterogeneous compute employing algorithm, hardware & software co-design
Quantization to suit the targeted hardware platform
Exploitation of sparsity in the model

Together, these techniques can:

Reduce latency by more than 20x
Reduce the number of compute and memory operations by up to 95%
Reduce memory storage and bandwidth requirements by more than 10x
Reduce memory access energy consumption by more than 100x

These gains come with little to no impact on the accuracy of the final model.

These techniques and the resulting benefits are described in this white paper.
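
For readers who want a concrete feel for two of the techniques listed above, the toy Python sketch below shows in general terms how quantization and sparsity cut storage and arithmetic. It is a generic illustration only, not the MAU Accelerator™ implementation: the function names, the pruning threshold and the sample values are all assumptions chosen for this example.

```python
def prune_small(weights, threshold):
    """Magnitude pruning: zero out weights whose absolute value is below threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def sparse_dot(q_weights, scale, inputs):
    """Dot product that skips zero weights; returns (result, multiplies performed)."""
    acc = 0
    macs = 0
    for q, x in zip(q_weights, inputs):
        if q != 0:  # exploit sparsity: no work for pruned weights
            acc += q * x
            macs += 1
    return acc * scale, macs


# Hypothetical sample data, chosen only for illustration.
weights = [0.50, -0.02, 0.01, -1.27, 0.03, 0.00, 0.76, -0.01]
inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

pruned = prune_small(weights, threshold=0.05)
q_weights, scale = quantize_int8(pruned)
result, macs = sparse_dot(q_weights, scale, inputs)

# Dense floating-point reference for comparison.
dense = sum(w * x for w, x in zip(weights, inputs))

print(f"multiplies: {macs} of {len(weights)}")
print(f"dense result: {dense:.2f}, pruned + quantized result: {result:.2f}")
```

In this toy case each weight is stored as a small integer rather than a floating-point value, and only the non-zero weights cost a multiplication. The approximation error here is larger than a real, carefully tuned model would show, because the example is tiny and no retraining or calibration is applied.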

For more information regarding some of the solutions we have developed with clients, please see the following:
Speech synthesis: See our blog, our demo video and our white paper
Speech transcription: See our Achronix White Paper, Intel White Paper and Intel Solution Brief