Generating alpha with AI
Firms that can run AI models on market data in real time can unlock rich new seams of alpha.
VOLLO is the low-latency AI inference accelerator trusted today by leading trading teams. VOLLO sits right next to market feeds, ingesting the full depth of the order book, and runs the best AI networks of today and tomorrow.
The result? More intelligent recognition of shifts in supply and demand, and the ability to capture alpha opportunities with AI in microseconds.
With hundreds of thousands of hours of proven production trading to its name, VOLLO is the future of generating alpha with AI. If you’re not using VOLLO, your competitors are. Evaluate now by testing VOLLO on your models.
We are very happy with VOLLO. We prefer that our competition doesn't know we use it.
Discover what VOLLO can do for you
VOLLO is relied upon for production trading and market making today. Watch this video to see how it can help you generate new sources of alpha.
PyTorch and TensorFlow AI models are compiled by VOLLO directly to supported FPGAs. With the ability to switch inference between multiple models instantaneously, VOLLO is designed to support your evolving AI needs now and in the future.
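To illustrate the kind of model this workflow starts from, here is a minimal PyTorch sketch: a small order-book MLP of the sort that would be handed to the compiler. The model name, feature layout, and sizes are illustrative assumptions, not part of the actual VOLLO API.

```python
import torch
import torch.nn as nn

# Illustrative example only: a small order-book model of the kind VOLLO
# compiles. Names and sizes are assumptions, not the VOLLO SDK API.
class BookPressureMLP(nn.Module):
    def __init__(self, depth_levels: int = 10, hidden: int = 64):
        super().__init__()
        # 4 features per book level: bid/ask price and size
        self.net = nn.Sequential(
            nn.Linear(depth_levels * 4, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # predicted short-horizon price move
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = BookPressureMLP().eval()
snapshot = torch.randn(1, 40)      # one order-book snapshot (10 levels x 4)
with torch.no_grad():
    signal = model(snapshot)
print(signal.shape)                # torch.Size([1, 1])
# The trained model would then be passed to the VOLLO compiler;
# see vollo.myrtle.ai for the actual SDK entry points.
```

A standard `nn.Module` like this needs no FPGA-specific annotations, which is the point: the compiler, not the model author, handles the hardware mapping.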
VOLLO was independently audited by STAC and has held the latency records in their ML benchmarks for almost three years, ahead of every audited GPU and custom ASIC system.
Get in touch to see if you qualify for a free VOLLO license
VOLLO for capital markets
Trade first with microsecond AI inference
Deploy your machine learning models faster than your competitors with VOLLO, the lowest latency inference accelerator. Increase performance and maximize throughput, ensuring you’re never late to trade.
of the latency of any competitor
to upload a new model
parameter models on a single FPGA
compute latency
Why VOLLO?
VOLLO is relied upon by the world’s most successful trading companies.
Low latency quant trading
The lowest latency inference for your AI models, independently proven by STAC-ML™ benchmarks
Wide range of model support
Can be software-configured to run multiple models simultaneously. Supports a wide variety of operations and layer types — from decision trees, LSTMs, MLPs, and CNNs to streaming architectures like structured state-space models such as S4 and Mamba
Future-proof AI
New models can be programmed in under a second, and the flexibility of FPGA technology means that even future model architectures will be easy to program as they emerge
Easy to adopt
VOLLO compiles PyTorch and TensorFlow models to FPGA and requires no FPGA expertise or tools to use
Quick to deploy
VOLLO can be programmed onto PCIe cards for rapid adoption into existing infrastructure or imported into your proprietary FPGA design as a netlist
VOLLO Evaluation
Any ML developer can easily evaluate the benefits of using VOLLO through two flexible options designed to suit different stages of exploration.
With the VOLLO Sandbox, you can test your models instantly online, with no downloads, installation, hardware expertise, or tools required. It’s the fastest way to experience VOLLO’s capabilities and see how your models perform.
For deeper analysis, the VOLLO Simulator remains available. Using the freely accessible VOLLO compiler and virtual machine, you can perform cycle-level, bit-accurate simulations of your AI models running inference on the supported FPGA boards, enabling a more detailed and comprehensive evaluation. You won’t need any new hardware, hardware-specific expertise, or licenses for this either. Nor do you need to share your models or data with us, giving you complete freedom to explore with full security and control.
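To make the value of bit-accurate simulation concrete, the sketch below shows the kind of check it enables: comparing a reference floating-point computation against the same computation in fixed point, as hardware would execute it. This is a self-contained illustration of the concept, not VOLLO's actual simulator API.

```python
# Illustration only: what a bit-accurate simulation lets you verify.
# Quantise weights and inputs to fixed point, run the same dot product
# both ways, and measure the divergence from the float reference.

def to_fixed(x: float, frac_bits: int = 8) -> int:
    """Round a float to a signed fixed-point integer."""
    return round(x * (1 << frac_bits))

def fixed_dot(weights_q, inputs_q, frac_bits: int = 8) -> int:
    """Dot product in fixed point, rescaled back to the input format."""
    acc = sum(w * x for w, x in zip(weights_q, inputs_q))
    return acc >> frac_bits

weights = [0.25, -0.5, 0.125]
inputs = [1.0, 2.0, -4.0]

float_out = sum(w * x for w, x in zip(weights, inputs))
wq = [to_fixed(w) for w in weights]
xq = [to_fixed(x) for x in inputs]
fixed_out = fixed_dot(wq, xq) / (1 << 8)

print(float_out, fixed_out)  # -1.25 -1.25 (exact: all values representable)
```

Because the simulation is bit-accurate, any divergence you measure offline is exactly the divergence you would see on the hardware, so quantization effects can be backtested before deployment.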
Who is VOLLO for?
Proprietary trading firms that want to be first to trade.
Need to keep pace with AI developments
Build future-proof AI platforms
Want a smarter model within the same latency bound
Design and test new ML models rapidly in your favorite framework
Need to run your models faster
Make smarter, faster decisions and lead the market
Want to outpace your competitors
Adopt the industry’s proven latency-leading technology
Run your models on VOLLO today
Evaluate your models on VOLLO in minutes. No hardware required.
Use the VOLLO Sandbox to test models instantly online, or go deeper with the VOLLO SDK Simulator to backtest performance and measure latency with bit-accurate precision.
Built for performance
VOLLO runs on FPGA-based platforms, overcoming the limitations of traditional CPU-based architectures, which struggle to respond quickly and efficiently to the volume and velocity of market data.
VOLLO is supported on an ever-expanding range of FPGAs including the following:
- AMD Versal, Versal Premium & Virtex UltraScale+; Altera Agilex
- Up to 50M parameter models on a single FPGA
For a full list of supported PCIe and SmartNIC cards, visit vollo.myrtle.ai
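A quick way to check whether a candidate model fits the 50M-parameter single-FPGA budget quoted above is to count its parameters before compiling. The layer sizes below are illustrative, and the helper function is a sketch, not part of the VOLLO SDK.

```python
# Sanity check against VOLLO's stated 50M-parameter single-FPGA budget.
# Layer sizes are illustrative assumptions.

def mlp_param_count(layer_sizes):
    """Parameters of a dense MLP: a weight matrix plus one bias per output unit."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

layers = [512, 2048, 2048, 1024, 1]
params = mlp_param_count(layers)
print(f"{params:,} parameters")    # 7,346,177 parameters
assert params <= 50_000_000        # comfortably within the budget
```

Counts like this are an upper-bound check only; the compiler's own report is authoritative for whether a given model maps onto a given card.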
FAQs
Here are the answers to some of the most common questions we get asked about VOLLO.
Still have questions?
Request a free VOLLO license
Schedule a consultation to see if you’re eligible for a free VOLLO license