CryptoABC.net

NVIDIA Run:ai GPU Fractioning Delivers 77% Throughput at Half Allocation

February 18, 2026
in Blockchain


Darius Baruo
Feb 18, 2026 18:31

NVIDIA and Nebius benchmarks show GPU fractioning achieves 86% user capacity on 0.5 GPU allocation, enabling 3x more concurrent users for mixed AI workloads.





NVIDIA’s Run:ai platform can deliver 77% of full GPU throughput using just half the hardware allocation, according to joint benchmarking with cloud provider Nebius released February 18. The results demonstrate that enterprises running large language model inference can dramatically expand capacity without proportional GPU investment.

The tests, conducted on clusters with 64 NVIDIA H100 NVL GPUs and 32 NVIDIA HGX B200 GPUs, showed fractional GPU scheduling achieving near-linear performance scaling across 0.5, 0.25, and 0.125 allocations.

Hard Numbers from Production Testing

At 0.5 GPU allocation, the system supported 8,768 concurrent users while keeping time-to-first-token under one second, about 86% of the 10,200 users supported at full allocation. Token generation reached 152,694 tokens per second, compared with 198,680 at full allocation, or roughly 77%.
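The headline percentages follow directly from the quoted figures. A minimal sketch of the arithmetic, using only the numbers reported above (the variable and function names are illustrative, not from the benchmark):

```python
# Figures quoted in the article for the 0.5 GPU allocation test.
FULL_USERS = 10_200   # concurrent users at full GPU allocation
HALF_USERS = 8_768    # concurrent users at 0.5 GPU allocation
FULL_TPS = 198_680    # tokens per second at full GPU allocation
HALF_TPS = 152_694    # tokens per second at 0.5 GPU allocation


def ratio_percent(part: float, whole: float) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


user_capacity = ratio_percent(HALF_USERS, FULL_USERS)
throughput = ratio_percent(HALF_TPS, FULL_TPS)

print(f"user capacity at 0.5 GPU: {user_capacity:.1f}%")  # 86.0%
print(f"throughput at 0.5 GPU:    {throughput:.1f}%")     # 76.9%
```

Rounded to whole percentages, these are the 86% and 77% figures cited in the article.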

Smaller models pushed these gains further. Phi-4-Mini running on 0.25 GPU fractions handled 72% more concurrent users than full-GPU deployment, achieving approximately 450,000 tokens per second with P95 latency under 300 milliseconds on 32 GPUs.

The mixed workload scenario proved most striking. Running Llama 3.1 8B, Phi-4-Mini, and Qwen-Embeddings simultaneously on fractional allocations tripled total concurrent system users compared to single-model deployment. Combined throughput exceeded 350,000 tokens per second at full scale with no cross-model interference.

Why This Matters for GPU Economics

Traditional Kubernetes schedulers allocate whole GPUs to individual models, leaving substantial capacity stranded. The benchmarks noted that even Qwen3-14B, the largest model tested at 14 billion parameters, occupies only 35% of an H100 NVL’s 80GB capacity.
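One way to see where a figure like 35% can come from is a back-of-the-envelope estimate of weight memory. The sketch below assumes 16-bit weights (2 bytes per parameter) and decimal gigabytes, and ignores KV cache, activations, and runtime overhead. These are assumptions for illustration; the article does not state how the benchmark derived its number.

```python
GB = 1e9  # decimal gigabytes (assumption)


def weight_footprint_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate model weight memory in GB, assuming a fixed width per parameter."""
    return num_params * bytes_per_param / GB


params = 14e9          # Qwen3-14B parameter count, as quoted in the article
capacity_gb = 80       # GPU memory capacity, as quoted in the article

footprint_gb = weight_footprint_gb(params)  # 28.0 GB under the 16-bit assumption
share = footprint_gb / capacity_gb          # 0.35

print(f"weights: {footprint_gb:.0f} GB, {share:.0%} of {capacity_gb} GB")
```

Under these assumptions the weights alone take about 28 GB, which matches the quoted 35% and leaves the remaining memory stranded when a whole GPU is dedicated to one model.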

Run:ai’s scheduler eliminates this waste through dynamic memory allocation. Users specify requirements directly; the system handles resource distribution without preconfiguration. Memory isolation happens at runtime while compute cycles distribute fairly among active processes.

The release coincides with broader industry moves toward GPU partitioning. SoftBank and AMD announced validation testing on February 16 for similar fractioning capabilities on AMD Instinct GPUs, where a single GPU can be split into up to eight logical devices.

Autoscaling Without Latency Spikes

Nebius tested automatic scaling with Llama 3.1 8B configured to add GPUs when concurrent users exceeded 50. Replicas scaled from 1 to 16 with clean ramp-up, stable utilization during pod warm-up, and negligible HTTP errors.
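The scaling rule described here can be sketched as a simple threshold calculation. This is a hypothetical illustration, not Run:ai's or Nebius's actual autoscaler logic: it assumes the 50-user threshold applies per replica and that the replica count is clamped to the 1 to 16 range mentioned in the test.

```python
import math


def desired_replicas(
    concurrent_users: int,
    users_per_replica: int = 50,  # threshold from the article, read as per-replica (assumption)
    min_replicas: int = 1,
    max_replicas: int = 16,
) -> int:
    """Return how many replicas a threshold-based rule would request."""
    needed = math.ceil(concurrent_users / users_per_replica)
    # Clamp to the configured bounds so the rule never scales to zero or past the cap.
    return max(min_replicas, min(max_replicas, needed))


for users in (10, 50, 51, 400, 2000):
    print(users, "users ->", desired_replicas(users), "replica(s)")
```

With these assumed settings, a second replica is requested as soon as load passes 50 users, and the count stops growing at 16 no matter how high demand climbs.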

The practical implication: enterprises can run multiple inference models on existing GPU inventory, scale dynamically during peak demand, and reclaim idle capacity during off-hours for other workloads. For organizations facing fixed GPU budgets, fractioning transforms capacity planning from hardware procurement into software configuration.

Run:ai v2.24 is available now. NVIDIA plans to discuss the Nebius implementation at GTC 2026.


