GetChain News
Inference

Active

Distributed GPU cluster for LLM inference

News Heat Trend

Project Overview

Inference is a distributed GPU cluster for LLM inference built on Solana. Inference.net operates a global network of data centers that serve fast, scalable, pay-per-token APIs for models such as DeepSeek V3 and Llama 3.3.
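Under a pay-per-token model, the bill scales linearly with prompt and completion token counts at separate input and output rates. A minimal sketch of that billing arithmetic (the per-million-token prices here are illustrative assumptions, not Inference.net's actual rates):

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_price_per_m: float = 0.30,
                  output_price_per_m: float = 0.85) -> float:
    """Estimate a pay-per-token bill in USD.

    Prices are hypothetical USD-per-million-token rates chosen for
    illustration; real providers publish their own rate cards.
    """
    return (prompt_tokens * input_price_per_m
            + completion_tokens * output_price_per_m) / 1_000_000

# e.g. a 2,000-token prompt with an 800-token completion
cost = estimate_cost(2000, 800)  # approximately $0.00128
```

Input and output tokens are priced separately because generating a completion token is more compute-intensive than ingesting a prompt token, so output rates are typically higher.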

Event-related news

NVIDIA Launches Nemotron 3 Nano Omni Model, Boosting Multimodal Inference Efficiency by 9x

NVIDIA announced on X that it has released the open-source multimodal model Nemotron 3 Nano Omni. The model uses a 30B-A3B mixture-of-experts (MoE) architecture, supports a 256K context window, and processes video, audio, image, and text inputs in a unified way. Compared with open-source omni-modal models of a similar scale, it achieves up to a 9x increase in throughput, significantly reducing inference cost and improving scalability. Nemotron 3 Nano Omni is available on Hugging Face, OpenRouter, and NVIDIA NIM, and has been adopted by enterprises including Aible, Applied Scientific Intelligence, and H Company.

SK Telecom Collaborates with Arm and Rebellions to Jointly Develop AI Data Center Inference Solutions

According to Yonhap News Agency, SK Telecom has signed a trilateral memorandum of understanding (MOU) with UK-based chip designer Arm and Korean AI chip startup Rebellions to jointly develop inference server solutions for AI data centers. Under the agreement, the three parties will pair Arm's newly launched AGI CPU with Rebellions' AI accelerator chip, the RebelCard (scheduled for launch in Q3 this year), to build AI inference servers that will be tested and validated at SK Telecom's AI data centers. The Arm AGI CPU is optimized for high-density inference environments and large-scale AI deployments, while the RebelCard is purpose-built for large-scale AI inference.
