WEKA and Oracle Cloud boost AI inference throughput tenfold
Joint benchmarks on OCI H100 infrastructure show WEKA's NeuralMesh platform delivers 10x higher token throughput and serves 10x more concurrent users without adding GPUs

On June 10, 2026, WEKA, an AI data and memory infrastructure company based in Campbell, California, announced new production-scale benchmarks run in collaboration with Oracle Cloud Infrastructure (OCI). The joint benchmarks show how organizations can significantly improve the economics of long-context AI inference: by using existing hardware more efficiently, they can serve more users and generate more tokens on the same GPU footprint, without acquiring additional infrastructure.[1][2]
The benchmarks showed that WEKA's NeuralMesh platform, featuring its Augmented Memory Grid on Oracle Cloud Infrastructure, delivered substantial gains over standard DRAM-only configurations: it supported 10x more concurrent users, achieved 10x higher token throughput, and produced 7x more tokens per GPU. The results were validated on a nine-node OCI bare-metal H100 cluster.[1][2]



