CryptoABC.net

NVIDIA Run:ai GPU Fractioning Delivers 77% Throughput at Half Allocation

February 18, 2026
in Blockchain
Darius Baruo
Feb 18, 2026 18:31

NVIDIA and Nebius benchmarks show GPU fractioning achieves 86% user capacity on 0.5 GPU allocation, enabling 3x more concurrent users for mixed AI workloads.





NVIDIA’s Run:ai platform can deliver 77% of full GPU throughput using just half the hardware allocation, according to joint benchmarking with cloud provider Nebius released February 18. The results suggest that enterprises running large language model inference can serve substantially more users without a proportional increase in GPU count.

The tests, conducted on clusters with 64 NVIDIA H100 NVL GPUs and 32 NVIDIA HGX B200 GPUs, showed fractional GPU scheduling achieving near-linear performance scaling across 0.5, 0.25, and 0.125 allocations.

Hard Numbers from Production Testing

At 0.5 GPU allocation, the system supported 8,768 concurrent users while keeping time-to-first-token under one second. That is 86% of the 10,200 users supported at full allocation. Token generation reached 152,694 tokens per second, about 77% of the 198,680 recorded at full allocation.
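
The percentages follow directly from the quoted figures. A minimal Python sketch of the arithmetic is below; the variable and function names are illustrative, and the "throughput per allocated GPU" ratio is a derived quantity, not one reported in the benchmark.

```python
# Figures quoted in the article for the 0.5 GPU allocation vs. full allocation.
FULL = {"users": 10_200, "tokens_per_s": 198_680}
HALF = {"users": 8_768, "tokens_per_s": 152_694}
FRACTION = 0.5  # share of one GPU allocated in the fractional run


def retention(fractional: float, full: float) -> float:
    """Share of the full-GPU figure that the fractional allocation retains."""
    return fractional / full


user_retention = retention(HALF["users"], FULL["users"])
throughput_retention = retention(HALF["tokens_per_s"], FULL["tokens_per_s"])

# Derived quantity (not reported in the article): output per unit of
# allocated GPU, relative to a whole-GPU deployment.
throughput_per_gpu_gain = throughput_retention / FRACTION

print(f"user capacity retained:       {user_retention:.0%}")
print(f"throughput retained:          {throughput_retention:.0%}")
print(f"throughput per allocated GPU: {throughput_per_gpu_gain:.2f}x")
```

Read this way, halving the allocation costs roughly a quarter of the throughput, so each unit of allocated GPU does about one and a half times as much work.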

Smaller models pushed these gains further. Phi-4-Mini running on 0.25 GPU fractions handled 72% more concurrent users than full-GPU deployment, achieving approximately 450,000 tokens per second with P95 latency under 300 milliseconds on 32 GPUs.

The mixed-workload scenario proved most striking. Running Llama 3.1 8B, Phi-4-Mini, and Qwen-Embeddings simultaneously on fractional allocations tripled total concurrent system users compared to single-model deployment. Combined throughput exceeded 350,000 tokens per second at full scale with no cross-model interference.

Why This Matters for GPU Economics

Traditional Kubernetes schedulers allocate whole GPUs to individual models, leaving substantial capacity stranded. The benchmarks noted that even Qwen3-14B, the largest model tested at 14 billion parameters, occupies only 35% of an H100 NVL’s 80GB capacity.
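
The 35% figure is consistent with a back-of-the-envelope weight-memory estimate. The sketch below assumes 16-bit weights (2 bytes per parameter). The article does not state the precision used, so treat that as an assumption. It also ignores KV cache and activation memory, which would raise real usage.

```python
def weight_footprint_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes FP16/BF16 weights; this precision is an
    assumption, not something the benchmark write-up specifies.
    """
    return params_billion * 1e9 * bytes_per_param / 1e9


GPU_MEMORY_GB = 80  # capacity figure quoted in the article

qwen3_14b_gb = weight_footprint_gb(14)
share_of_gpu = qwen3_14b_gb / GPU_MEMORY_GB

print(f"Qwen3-14B weights: ~{qwen3_14b_gb:.0f} GB, {share_of_gpu:.0%} of the GPU")
```

Under whole-GPU scheduling, the remaining memory on that device stays reserved but unused by the model's weights.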

Run:ai’s scheduler targets this waste through dynamic memory allocation. Users specify their requirements directly, and the system distributes resources without preconfiguration. Memory is isolated at runtime, while compute cycles are shared fairly among active processes.
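
To make the idea of packing fractional requests concrete, here is a toy first-fit placement in Python. It is not Run:ai's scheduling algorithm, which the article does not describe. The `Gpu` class, the `first_fit` function, the workload names, and the fraction assigned to each workload are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    index: int
    free: float = 1.0  # unallocated share of this GPU
    tenants: list[str] = field(default_factory=list)


def first_fit(requests: list[tuple[str, float]]) -> list[Gpu]:
    """Place each (name, fraction) on the first GPU with enough free share,
    opening a new GPU only when no existing one fits."""
    gpus: list[Gpu] = []
    for name, fraction in requests:
        if not 0 < fraction <= 1:
            raise ValueError(f"{name}: fraction must be in (0, 1]")
        target = next((g for g in gpus if g.free >= fraction), None)
        if target is None:
            target = Gpu(index=len(gpus))
            gpus.append(target)
        target.free -= fraction
        target.tenants.append(name)
    return gpus


# Hypothetical mix using the fraction sizes named in the article.
requests = [
    ("llama-a", 0.5), ("phi-a", 0.25), ("embed-a", 0.125),
    ("llama-b", 0.5), ("phi-b", 0.25), ("embed-b", 0.125),
]
placement = first_fit(requests)
for gpu in placement:
    print(gpu.index, gpu.tenants, f"free={gpu.free}")
```

In this toy case six workloads fit on two GPUs, where one-model-per-GPU allocation would need six.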

The release coincides with broader industry moves toward GPU partitioning. SoftBank and AMD announced validation testing on February 16 for similar fractioning capabilities on AMD Instinct GPUs, where a single GPU can be split into up to eight logical devices.

Autoscaling Without Latency Spikes

Nebius tested automatic scaling with Llama 3.1 8B configured to add GPUs when concurrent users exceeded 50. Replicas scaled from 1 to 16 with clean ramp-up, stable utilization during pod warm-up, and negligible HTTP errors.
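
One way to read that scaling rule is as a target of 50 concurrent users per replica, bounded by the 1 to 16 range seen in the test. The Python sketch below encodes that reading. The per-replica interpretation and the function name are assumptions, and the actual autoscaler configuration is not given in the article.

```python
import math

USERS_PER_REPLICA = 50  # threshold quoted in the article
MIN_REPLICAS = 1        # lower bound observed in the test
MAX_REPLICAS = 16       # upper bound observed in the test


def desired_replicas(concurrent_users: int) -> int:
    """Replica count that keeps each replica at or below the user threshold,
    clamped to the observed range (assumed per-replica interpretation)."""
    wanted = math.ceil(concurrent_users / USERS_PER_REPLICA)
    return max(MIN_REPLICAS, min(MAX_REPLICAS, wanted))


for users in (0, 50, 51, 400, 800, 5000):
    print(users, "->", desired_replicas(users))
```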

The practical implication: enterprises can run multiple inference models on existing GPU inventory, scale dynamically during peak demand, and reclaim idle capacity during off-hours for other workloads. For organizations facing fixed GPU budgets, fractioning transforms capacity planning from hardware procurement into software configuration.

Run:ai v2.24 is available now. NVIDIA plans to discuss the Nebius implementation at GTC 2026.


© 2021 cryptoabc.net - All rights reserved!
