
VOLLO for capital markets

Unlock new sources of alpha with low-latency AI

Get in touch Evaluate now

Generating alpha
with AI

Firms that can run AI models on market data in real time can unlock rich new seams of alpha.

VOLLO is the low-latency AI inference accelerator trusted today by leading trading teams. It can sit right next to market feeds, ingesting the full depth of the order book, and it can run the best AI networks of today and tomorrow.

The result? More intelligent recognition of shifts in supply and demand, and the ability to capture alpha opportunities with AI in microseconds.

With hundreds of thousands of hours of proven production trading to its name, VOLLO is the future of generating alpha with AI. If you’re not using VOLLO, your competitors are. Evaluate now by testing VOLLO on your models.

Test VOLLO on your models

We are very happy with VOLLO. We prefer that our competition doesn't know we use it.
Anonymous trading firm

Discover what VOLLO can do for you 

VOLLO is relied upon for production trading and market making today. Watch this video to see how it can help you generate new sources of alpha.

PyTorch and TensorFlow AI models are compiled by VOLLO directly to supported FPGAs. With the ability to switch inference between multiple models instantaneously, VOLLO is designed to support your evolving AI needs now and in the future.
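To make the multi-model claim concrete, here is a hypothetical illustration in plain Python of the pattern described above: several compiled models resident at once, with inference switched between them by changing a selector rather than reloading anything. The names (`ModelSlot`, `select`, `infer`) are invented for this sketch and are not the real VOLLO API.

```python
# Hypothetical sketch of instantaneous model switching (not the VOLLO API).
from typing import Callable, List

Vector = List[float]

class ModelSlot:
    """Holds several resident models; switching is an index change,
    so nothing is recompiled or reloaded on the hot path."""

    def __init__(self, models: List[Callable[[Vector], Vector]]):
        self._models = models
        self._active = 0

    def select(self, index: int) -> None:
        # O(1): analogous to flipping which resident model the
        # accelerator streams market data into.
        self._active = index

    def infer(self, features: Vector) -> Vector:
        return self._models[self._active](features)

# Two toy "models", e.g. one tuned for calm markets, one for volatile ones.
def calm(x: Vector) -> Vector:
    return [2.0 * v for v in x]

def volatile(x: Vector) -> Vector:
    return [v + 1.0 for v in x]

slot = ModelSlot([calm, volatile])
print(slot.infer([1.0, 2.0]))   # [2.0, 4.0]
slot.select(1)                  # switch models without any reload
print(slot.infer([1.0, 2.0]))   # [2.0, 3.0]
```

On real hardware the switch selects among models already programmed onto the FPGA; the point of the sketch is only that selection, not reprogramming, sits on the latency-critical path.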

VOLLO was independently audited by STAC and has held the latency records in the STAC-ML™ benchmarks for almost three years, against all GPU and custom ASIC audits.

Get in touch to see if you qualify for a free VOLLO license

Get a free license

VOLLO for capital markets

Trade first with microsecond AI inference

Deploy your machine learning models faster than your competitors with VOLLO, the lowest latency inference accelerator. Increase performance and maximize throughput, ensuring you’re never late to trade.

  • 50% of the latency of any competitor
  • 1 second to upload a new model
  • Up to 50M parameter models on a single FPGA
  • Down to 1μs compute latency

Why VOLLO?

VOLLO is relied upon by the world’s most successful trading companies.

Low latency quant trading

The lowest latency inference for your AI models, independently proven by STAC-ML™ benchmarks

Wide range of model support

Can be software-configured to run multiple models simultaneously. Supports a wide variety of operations and layer types — from decision trees, LSTMs, MLPs, and CNNs to streaming architectures like structured state-space models such as S4 and Mamba

Future-proof AI

New models can be programmed in under a second, and the flexibility of FPGA-based low-latency trading technology ensures that even future models will be easy to program when they’re conceived

Easy to adopt

VOLLO compiles PyTorch and TensorFlow models to FPGA and requires no FPGA expertise or tools to use

Quick to deploy

VOLLO can be programmed onto PCIe cards for rapid adoption into existing infrastructure or imported into your proprietary FPGA design as a netlist


VOLLO Evaluation

Any ML developer can easily evaluate the benefits of using VOLLO through two flexible options designed to suit different stages of exploration.

With the VOLLO Sandbox, you can test your models instantly online with no downloads or installation, and no hardware expertise or tools are needed. It’s the fastest way to experience VOLLO’s capabilities and see how your models perform.

For deeper analysis, the VOLLO Simulator remains available. Using the freely accessible VOLLO compiler and virtual machine, you can run cycle-level, bit-accurate simulations of your AI models performing inference on the supported FPGA boards, enabling a more detailed and comprehensive evaluation. You won’t need any new hardware, hardware-specific expertise, or licenses for this either. And because you never share your models or data with us, you keep complete freedom to explore, with full security and control.

Test VOLLO on your models

Who is VOLLO for?

Proprietary trading firms who want to be first to trade:

  • Need to keep pace with AI developments? Build future-proof AI platforms.
  • Want a smarter model within the same latency bound? Design and test new ML models rapidly in your favorite framework.
  • Need to run your models faster? Make smarter, faster decisions and lead the market.
  • Want to outpace your competitors? Adopt the industry’s proven latency-leading technology.

Run your models on VOLLO today

Evaluate your models on VOLLO in minutes. No hardware required.

Use the VOLLO Sandbox to test models instantly online, or go deeper with the VOLLO SDK Simulator to backtest performance and measure latency with bit-accurate precision.

Try VOLLO for free today


Built for performance

VOLLO runs on FPGA-based platforms, overcoming the limitations of traditional CPU-based architectures which struggle to respond quickly and efficiently to the volume and velocity of market data.

VOLLO is supported on an ever-expanding range of FPGAs including the following:

  • AMD Versal, Versal Premium & Virtex UltraScale+; Altera Agilex
  • Up to 50M parameter models on a single FPGA
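As a rough sanity check on the 50M-parameter budget, the snippet below counts the weights and biases of a fully connected network. It is plain Python with no VOLLO tooling, and the layer sizes are made up purely for illustration:

```python
# Count parameters of a dense MLP: each layer mapping n_in -> n_out
# contributes n_in * n_out weights plus n_out biases.
def mlp_param_count(layer_sizes):
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# Hypothetical order-book model: 512 input features, three hidden
# layers, and a single output signal.
sizes = [512, 4096, 4096, 1024, 1]
params = mlp_param_count(sizes)
print(f"{params:,} parameters")  # 23,078,913 parameters
assert params <= 50_000_000      # comfortably inside the single-FPGA budget
```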

For a full list of supported PCIe and SmartNIC cards, visit vollo.myrtle.ai

FAQs

Here are the answers to some of the most common questions we get asked about VOLLO

Still have questions?

Get in touch

Request a free VOLLO license

Schedule a consultation to see if you’re eligible for a free VOLLO license