Tether AI Releases Open-Source TurboQuant and Integrates It into QVAC SDK 0.12.0
Tether AI has announced the open-source release of TurboQuant and its integration into QVAC SDK 0.12.0. Built on Google Research’s memory compression algorithm, TurboQuant compresses the KV cache used during large language model inference by up to about 5×, substantially reducing the memory footprint on local and edge devices while preserving output quality.
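To give a sense of scale for the figure of up to about 5×, the following back-of-the-envelope sketch estimates the size of a KV cache and what remains after a 5× reduction. The model shape (32 layers, 8 KV heads, head dimension 128, 8,192-token context, 16-bit values) is a hypothetical example chosen for illustration and does not come from the announcement; the function name is likewise made up, and nothing here calls the QVAC SDK or TurboQuant itself.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value):
    """Approximate bytes needed to cache keys and values for one sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape (an assumption, not taken from the announcement).
baseline = kv_cache_bytes(
    num_layers=32,
    num_kv_heads=8,
    head_dim=128,
    seq_len=8192,
    bytes_per_value=2,  # 16-bit floats
)

# Upper end of the compression range quoted in the announcement.
COMPRESSION_RATIO = 5.0
compressed = baseline / COMPRESSION_RATIO

MIB = 2**20
print(f"uncompressed KV cache: {baseline / MIB:.1f} MiB")
print(f"at 5x compression:     {compressed / MIB:.1f} MiB")
```

Under these assumptions the cache shrinks from 1 GiB to roughly 205 MiB for a single 8,192-token sequence. Because the cache grows linearly with context length, the absolute saving grows with longer prompts, which is where memory-constrained local and edge devices would benefit most.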