The wireless telecoms challenge
As mobile networks evolve from 5G to 6G, operators face increasing demands for real-time decision-making at massive scale. Traditional compute architectures struggle to deliver the combination of high throughput, tight latency bounds and deterministic performance required by AI-driven radio access network (RAN) functions. VOLLO solves all of these challenges.
The VOLLO solution
VOLLO delivers the high throughput and deterministic low latency demanded by real-time 5G and 6G systems, even under peak network load. It’s capable of sustaining millions of inferences per second while guaranteeing execution within tight, predictable latency bounds. This makes VOLLO the AI inference engine of choice for operators deploying 5G/6G networks at scale.
Why VOLLO?
VOLLO is the preferred AI inference engine for AI-RAN. Here’s why:
Performance at scale
Sustain millions of AI-driven decisions per second
Deterministic low latency
Independently verified to deliver the lowest-latency inference for your AI models, with every inference completed within strict timing guarantees
Concurrent workloads
Run multiple RAN functions simultaneously without compromise
Future-proof acceleration
AI models can be upgraded on the fly through software downloads, ensuring they are future-proofed for the evolving demands of 5G and 6G
VOLLO features
Evaluate the potential
Our cycle-level and bit-accurate simulator allows you to quickly produce concrete performance evaluations of your AI models
Program using your existing ML framework
Models can be developed in PyTorch or TensorFlow, then exported in ONNX format for import into the VOLLO tool suite
Deploy with ease
Rapidly deploy on a PCIe card for development; achieve tight integration with an FPGA netlist for production
Maintain privacy
Your models and data remain secure on your own premises
Use Cases
VOLLO is relied upon to deliver key applications
VOLLO can execute these workloads concurrently with full real-time guarantees, ensuring reliable performance under peak network loads.
- Traffic Control: Dynamic band selection and per-UE handover optimization. Workload: ~1K jobs/sec, 100–400µs latency budget, models up to 1M parameters.
- MAC Scheduling: Real-time resource allocation at massive scale. Workload: 1–4M jobs/sec, strict <20µs latency bound, models ~100K parameters.
- Self-Organizing Networks: Intelligent activation/deactivation of cells and adaptive cell-shape tuning for efficiency and coverage.
- Session & Flow Prediction: Real-time forecasting of traffic patterns for proactive capacity management, plus L2 synchronous/asynchronous session prediction that anticipates session demand and synchronizes resources with microsecond precision.
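To make the MAC-scheduling budget above concrete, the sketch below sizes a compact MLP to roughly the quoted ~100K parameters. The layer widths and the 64-feature input are illustrative assumptions, not a VOLLO reference design.

```python
import torch.nn as nn

# Illustrative only: a compact MLP sized to the ~100K-parameter budget
# quoted for MAC scheduling. Input/hidden widths are assumptions.
scheduler_net = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 64),
)

# Count trainable parameters: (64*256+256) + (256*256+256) + (256*64+64)
n_params = sum(p.numel() for p in scheduler_net.parameters())
print(n_params)  # 98880, just under the ~100K budget
```

A model of this size is small enough that, at the quoted 1–4M jobs/sec, per-inference work stays within a tight microsecond-scale latency bound.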
Built for performance
VOLLO runs on FPGA-based platforms, overcoming the limitations of other architectures which struggle to respond quickly and efficiently to the volume and velocity of telecoms data.
- PCIe accelerator cards (FHFL/HHHL) for rapid deployment
- SmartNIC-integrated deployments for an even faster response
- FPGA netlists for tighter integration
Why myrtle.ai?
We enable organizations to meet their inference performance goals, no matter the scale, complexity or industry
Expertise you can rely on
We are a team of hardware/software co-design specialists, infrastructure experts and machine learning scientists. We understand your challenges and can deliver the solutions you need
Trusted partner to leading companies
We are relied upon by companies at the top of their game because we make it possible for them to deploy complex machine learning models that run in microseconds
Frictionless deployment
We enable effortless iteration and deployment of machine learning models, freeing engineers to advance development
Increase the performance of your machine learning models
Discover how myrtle.ai can help you access low latency inference and deploy complex machine learning models that run in microseconds