VOLLO for Capital Markets

Generating alpha
with AI

Firms that can run AI models on market data in real time can unlock rich new seams of alpha.

VOLLO is the low-latency AI inference accelerator trusted today by leading trading teams. VOLLO can work right next to market feeds, taking in the full depth of the order book and it’s able to run the best AI networks of today and tomorrow.

The result? More intelligent recognition of shifts in supply and demand, and the ability to capture alpha opportunities with AI in microseconds.

With hundreds of thousands of hours of proven production trading to its name, VOLLO is the future of generating alpha with AI. If you’re not using VOLLO, your competitors are. Evaluate now by testing VOLLO on your models.

Test VOLLO on your models

We are very happy with VOLLO. We prefer that our competition doesn't know we use it.

Anonymous trading firm

Discover what VOLLO can do for you

VOLLO is relied upon for production trading and market making today. Watch this video to see how it can help you generate new sources of alpha.

PyTorch and TensorFlow AI models are compiled by VOLLO directly to supported FPGAs. With the ability to switch inference between multiple models instantaneously, VOLLO is designed to support your evolving AI needs now and in the future.

VOLLO was independently audited by STAC and has held the latency records in their ML benchmarks for almost three years against all GPU and custom ASIC audits.

Get in touch to see if you qualify  for a free VOLLO license

Get a free license

VOLLO for capital markets

Trade first with microsecond AI inference

Deploy your machine learning models faster than your competitors with VOLLO, the lowest latency inference accelerator. Increase performance and maximize throughput, ensuring you’re never late to trade.

Get in touch Try for free today

50%

of the latency of any competitor

second

to upload a new model

Up to

50M

parameter models on a single FPGA

Down to

1μs

compute latency

50%

of the latency of any competitor

second

to upload a new model

Up to

50M

parameter models on a single FPGA

Down to

1μs

compute latency

Get in touch Try for free today

Why VOLLO?

VOLLO is relied upon by the world’s most successful trading companies.

Try for free today

Low latency quant trading

The lowest latency inference for your AI models, independently proven by STAC-ML™ benchmarks

Wide range of model support

Can be software-configured to run multiple models simultaneously. Supports a wide variety of operations and layer types — from decision trees, LSTMs, MLPs, and CNNs to streaming architectures like structured state-space models such as S4 and Mamba

Future-proof AI

New models can be programmed in under a second and the flexibility of FPGA low latency trading technology ensures that even future models will be easy to program when they’re conceived

Easy to adopt

VOLLO compiles PyTorch and TensorFlow models to FPGA and requires no FPGA expertise or tools to use

Quick to deploy

VOLLO can be programmed onto PCIe cards for rapid adoption into existing infrastructure or imported into your proprietary FPGA design as a netlist

Low latency quant trading

The lowest latency inference for your AI models, independently proven by STAC-ML™ benchmarks

Wide range of model support

Future-proof AI

New models can be programmed in under a second and the flexibility of FPGA low latency trading technology ensures that even future models will be easy to program when they’re conceived

Easy to adopt

VOLLO compiles PyTorch and TensorFlow models to FPGA and requires no FPGA expertise or tools to use

Quick to deploy

VOLLO can be programmed onto PCIe cards for rapid adoption into existing infrastructure or imported into your proprietary FPGA design as a netlist

Try for free today Find out more

VOLLO Evaluation

Any ML developer can easily evaluate the benefits of using VOLLO through two flexible options designed to suit different stages of exploration.

With the VOLLO Sandbox, you can test your models instantly online with no downloads, or installation, and you don’t need any hardware expertise or tools. It’s the fastest way to experience VOLLO’s capabilities and see how your models perform.

For deeper analysis, the VOLLO Simulator remains available. Using the freely accessible VOLLO compiler and virtual machine, you can perform cycle-level, bit-accurate simulations of your AI models inferring on the supported FPGA boards, enabling a more detailed and comprehensive evaluation. You won’t need any new hardware, hardware-specific expertise, or licences for this either. Also, you don’t need to share your models or data with us, giving you complete freedom to explore, with full security and control.

Test VOLLO on your models

Who is VOLLO for?

Proprietary trading firms who want to be first to trade.

Need to keep pace with AI developments

Build future-proof AI platforms

Want a smarter model within the same latency bound

Design and test new ML models rapidly in your favorite framework

Need to run their models faster

Make smarter, faster decisions and lead the market

Want to outpace their competitors

Adopt the industry’s proven latency-leading technology

Run your  models on  VOLLO  today

Evaluate your models on VOLLO in minutes. No hardware required.

Use the VOLLO Sandbox to test models instantly online, or go deeper with the VOLLO SDK Simulator to backtest performance and measure latency with bit-accurate precision.

Try VOLLO for free today

Built for performance

VOLLO runs on FPGA-based platforms, overcoming the limitations of traditional CPU-based architectures which struggle to respond quickly and efficiently to the volume and velocity of market data.

VOLLO is supported on an ever-expanding range of FPGAs including the following:

AMD Versal, Versal Premium & Virtex UltraScale+; Altera Agilex
Up to 50M parameter models on a single FPGA

For a full list of supported PCIe and SmartNIC cards, visit vollo.myrtle.ai

Audited STAC‑ML latency:

5.07 µs (99th percentile) for LSTM_A (Tacana test suite)
6.77 µs for LSTM_B, 31.0 µs for LSTM_C (Tacana test suite)

Compute‑only latency (excluding IO):

~1 µs

Throughput:

650K inf/sec (48 instances of LSTM_A) (Sumaco test suite)
14M inf/sec (24 instances of LSTM_A) (Tacana test suite)

Efficiency metrics:

Space: ≥1.4M inf/sec/cubic foot (Tacana test suite)
Energy: ≥2.32M inf/sec/kW (Tacana test suite)

Tacana test suite: https://docs.stacresearch.com/MRTL230426

Sumaco test suite: https://docs.stacresearch.com/MRTL221125

VOLLO compiles PyTorch and TensorFlow AI models directly to supported FPGAs. With the ability to switch inference between multiple models instantaneously, VOLLO is designed to support your evolving AI needs now and in the future.

VOLLO comprises a compiler, virtual machine (VM), C API runtime, FPGA bitstreams, and drivers
VOLLO virtual machine provides cycle-level, bit-accurate simulation – no hardware required

Get in touch Try for free today

FAQs

Here are the answers to some of the most common  questions we get asked about VOLLO

Still have questions?

Get in touch

Request a free VOLLO license

Schedule a consultation to see if you’re eligible for a free VOLLO license

for capital markets

Generating alpha with AI

Discover what VOLLO can do for you

Trade first with microsecond AI inference

Why VOLLO?

VOLLO is relied upon by the world’s most successful trading companies.

Low latency quant trading

Wide range of model support

Future-proof AI

Easy to adopt

Quick to deploy

Low latency quant trading

Wide range of model support

Future-proof AI

Easy to adopt

Quick to deploy

VOLLO Evaluation

Who is VOLLO for?

Run your models on VOLLO today

Built for performance

FAQs

How do I test my own model’s performance?

Can I load and run multiple models simultaneously?

Is TensorFlow supported?

Can I fine‑tune models from the included Model Zoo?

How does running multiple instances affect latency?

How do I maximize throughput?

Do I need to write RTL or custom bitstreams?

Can I run VOLLO on SmartNIC FPGAs?

What benefits does bfloat16 offer?

Can I evaluate the performance of VOLLO for my models without sharing those models with you?

Still have questions?

Request a free VOLLO license

Generating alpha
with AI

Run your  models on  VOLLO  today