The wireless telecoms challenge
As mobile networks evolve from 5G to 6G, operators face increasing demands for real-time decision-making at massive scale. Traditional compute architectures struggle to deliver the combination of high throughput, tight latency bounds and deterministic performance required by AI-driven radio access network (RAN) functions. VOLLO solves all of these challenges.
The VOLLO solution
VOLLO delivers the high throughput and deterministic low latency demanded by real-time 5G and 6G systems, even under peak network load. It’s capable of sustaining millions of inferences per second while guaranteeing execution within tight, predictable latency bounds. This makes VOLLO the AI inference engine of choice for operators deploying 5G/6G networks at scale.
Why VOLLO?
VOLLO is the preferred AI inference engine for AI-RAN. Here’s why:
Performance at scale
Sustain millions of AI-driven decisions per second
Deterministic low latency
Independently verified to deliver the lowest-latency inference for your AI models, with every inference completed within strict timing guarantees
Concurrent workloads
Run multiple RAN functions simultaneously without compromise
Future-proof acceleration
AI models can be upgraded on the fly through software downloads, ensuring they are future-proofed for the evolving demands of 5G and 6G
VOLLO features
Evaluate the potential
Our cycle-level and bit-accurate simulator allows you to quickly produce concrete performance evaluations of your AI models
Program using your existing ML framework
Models can be developed in PyTorch or TensorFlow, then exported in ONNX format for import into the VOLLO tool suite
Deploy with ease
Rapidly deploy on a PCIe card for development; achieve tight integration with an FPGA netlist for production
Maintain privacy
Your models and data remain secure on your own premises
Use Cases
VOLLO is relied upon to deliver key applications
VOLLO can execute these workloads concurrently with full real-time guarantees, ensuring reliable performance under peak network loads.
- Traffic Control: Dynamic band selection and per-UE handover optimization. Workload: ~1K jobs/sec, 100–400µs latency budget, models up to 1M parameters.
- MAC Scheduling: Real-time resource allocation at massive scale. Workload: 1–4M jobs/sec, strict <20µs latency bound, models ~100K parameters.
- Self-Organizing Networks: Intelligent activation/deactivation of cells and adaptive cell-shape tuning for efficiency and coverage.
- Session & Flow Prediction: Real-time forecasting of traffic patterns for proactive capacity management, plus L2 synchronous/asynchronous session prediction that anticipates session demand and synchronizes resources with microsecond precision.
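To make the MAC-scheduling budget above concrete, the sketch below sizes a compact MLP to roughly the quoted ~100K parameters. The layer widths and the 64-feature input are illustrative assumptions, not a VOLLO reference design.

```python
import torch.nn as nn

# Illustrative only: a compact MLP sized to the ~100K-parameter budget
# quoted for MAC scheduling. Input/hidden widths are assumptions.
scheduler_net = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 64),
)

# Count trainable parameters: (64*256+256) + (256*256+256) + (256*64+64)
n_params = sum(p.numel() for p in scheduler_net.parameters())
print(n_params)  # 98880, just under the ~100K budget
```

A model of this size is small enough that, at the quoted 1–4M jobs/sec, per-inference work stays within a tight microsecond-scale latency bound.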
Built for performance
VOLLO runs on FPGA-based platforms, overcoming the limitations of other architectures which struggle to respond quickly and efficiently to the volume and velocity of telecoms data.
- PCIe accelerator cards (FHFL/HHHL) for rapid deployment
- SmartNIC-integrated deployments for an even faster response
- FPGA netlists for tighter integration
Why myrtle.ai?
We enable organizations to meet their inference performance goals, no matter the scale, complexity or industry
Expertise you can rely on
We are a team of hardware/software co-design specialists, infrastructure experts and machine learning scientists. We understand your challenges and can deliver the solutions you need
Trusted partner to leading companies
We are relied upon by companies at the top of their game because we make it possible for them to deploy complex machine learning models that run in microseconds
Frictionless deployment
We enable effortless iteration and deployment of machine learning models, freeing engineers to advance development
Increase the performance of your machine learning models
Discover how myrtle.ai can help you access low latency inference and deploy complex machine learning models that run in microseconds