CryptoABC.net

NVIDIA Surpasses 1,000 TPS/User with Llama 4 Maverick and Blackwell GPUs

May 23, 2025
in Blockchain

Lawrence Jengar
May 23, 2025 02:10

NVIDIA achieves a world-record inference speed of over 1,000 TPS/user using Blackwell GPUs and Llama 4 Maverick, setting a new standard for AI model performance.





NVIDIA has broken the 1,000 tokens per second (TPS) per user barrier on the Llama 4 Maverick model using Blackwell GPUs. The result was independently verified by the AI benchmarking service Artificial Analysis and marks a significant milestone in large language model (LLM) inference speed.

Technological Advancements

The record was set on a single NVIDIA DGX B200 node with eight NVIDIA Blackwell GPUs, which delivered over 1,000 TPS per user on Llama 4 Maverick, a 400-billion-parameter model. NVIDIA presents Blackwell as the optimal hardware for deploying Llama 4 under either goal: minimizing latency, as in this record, or maximizing throughput, where it reaches up to 72,000 TPS/server in high-throughput configurations.
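To put the two headline figures in perspective, the short sketch below converts them into per-token latency and per-GPU throughput. It is plain arithmetic on the numbers quoted above. The function names and the 500-token response length are illustrative, and the per-GPU figure assumes that "server" refers to the same eight-GPU node.

```python
def per_token_latency_ms(tps_per_user: float) -> float:
    """Average time between tokens seen by one user, in milliseconds."""
    return 1000.0 / tps_per_user


def response_time_s(num_tokens: int, tps_per_user: float) -> float:
    """Time to stream num_tokens to one user, ignoring time to first token."""
    return num_tokens / tps_per_user


def per_gpu_throughput(tps_per_server: float, gpus_per_server: int) -> float:
    """Aggregate throughput split evenly across the GPUs in one server."""
    return tps_per_server / gpus_per_server


if __name__ == "__main__":
    # 1,000 TPS/user is the low-latency record quoted in the article.
    print(per_token_latency_ms(1000.0))    # 1.0 (ms per token)
    # 500 tokens is a hypothetical response length chosen for illustration.
    print(response_time_s(500, 1000.0))    # 0.5 (seconds)
    # 72,000 TPS/server is the high-throughput figure; 8 GPUs is an assumption.
    print(per_gpu_throughput(72000.0, 8))  # 9000.0 (TPS per GPU)
```

The two figures come from different configurations: the per-user number describes how fast a single response streams, while the per-server number describes total output across many concurrent users.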

Optimization Techniques

NVIDIA applied extensive software optimizations in TensorRT-LLM to make full use of the Blackwell GPUs. The company also trained a speculative decoding draft model using EAGLE-3 techniques, resulting in a fourfold speed increase compared to previous baselines. These enhancements preserve response accuracy while boosting performance: FP8 data types are used for operations such as general matrix multiplications (GEMMs) and Mixture of Experts (MoE), with accuracy comparable to BF16.

Importance of Low Latency

In generative AI applications, balancing throughput and latency is crucial. For critical applications requiring rapid decision-making, NVIDIA’s Blackwell GPUs excel by minimizing latency, as demonstrated by the TPS/user record. Because the same hardware can be configured for either high throughput or low latency, it suits a wide range of AI tasks.

CUDA Kernels and Speculative Decoding

NVIDIA optimized CUDA kernels for GEMMs, MoE, and Attention operations, using spatial partitioning and efficient memory data loading to maximize performance. Speculative decoding accelerates LLM inference by having a smaller, faster draft model propose several tokens ahead, which the larger target LLM then verifies. This approach yields significant speed-ups, particularly when the draft model’s predictions are accurate.
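The draft-then-verify loop can be illustrated with a minimal greedy sketch. This is not NVIDIA's EAGLE-3 or TensorRT-LLM implementation: `target_model` and `draft_model` are hypothetical toy functions standing in for real networks, and the verification loop calls the target once per position where a real system would score all drafted positions in a single batched forward pass.

```python
from typing import Callable, List, Tuple

Model = Callable[[List[int]], int]  # maps a token prefix to the next token


def target_model(prefix: List[int]) -> int:
    """Hypothetical 'large' model: a fixed deterministic function of the prefix."""
    return (sum(prefix) * 31 + len(prefix)) % 50


def draft_model(prefix: List[int]) -> int:
    """Hypothetical 'small' model: matches the target except at every 4th position."""
    guess = target_model(prefix)
    return (guess + 1) % 50 if len(prefix) % 4 == 3 else guess


def greedy_decode(prompt: List[int], num_new: int, target: Model) -> List[int]:
    """Baseline: one target call per generated token."""
    tokens = list(prompt)
    for _ in range(num_new):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(prompt: List[int], num_new: int, k: int,
                       draft: Model, target: Model) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns the tokens and the number of verify steps."""
    tokens = list(prompt)
    verify_steps = 0
    while len(tokens) - len(prompt) < num_new:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2. The target checks the proposal (counted here as one verify step).
        verify_steps += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if tok == expected:
                accepted.append(tok)       # draft token confirmed
            else:
                accepted.append(expected)  # replace first mismatch, drop the rest
                break
        else:
            # Every draft token matched, so the target contributes one extra token.
            accepted.append(target(tokens + accepted))
        tokens.extend(accepted)
    return tokens[:len(prompt) + num_new], verify_steps


if __name__ == "__main__":
    out, steps = speculative_decode([1, 2, 3], 20, 4, draft_model, target_model)
    print(out == greedy_decode([1, 2, 3], 20, target_model))  # True
    print(steps)  # fewer verify steps than the 20 tokens generated
```

Because every emitted token is either confirmed or supplied by the target, the output is identical to decoding with the target alone. The gain comes from needing fewer target steps, which is why the speed-up grows with the draft model's accuracy.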

Programmatic Dependent Launch

To further enhance performance, NVIDIA used Programmatic Dependent Launch (PDL) to reduce GPU idle time between consecutive CUDA kernels. This technique lets a dependent kernel begin executing before the preceding kernel has fully finished, so the two overlap, which improves GPU utilization and shrinks the idle gaps between kernels.
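The effect of overlapping consecutive kernels can be approximated with a toy timing model. The sketch below is not CUDA code and does not use any NVIDIA API. The kernel durations, the per-launch gap, and the `hidden_us` parameter (how much of each gap is hidden by overlap) are all hypothetical values chosen for illustration.

```python
from typing import List


def serial_time_us(kernel_us: List[float], launch_gap_us: float) -> float:
    """Total time when each kernel starts only after the previous one fully ends."""
    gaps = max(len(kernel_us) - 1, 0)
    return sum(kernel_us) + gaps * launch_gap_us


def overlapped_time_us(kernel_us: List[float], launch_gap_us: float,
                       hidden_us: float) -> float:
    """Total time when up to hidden_us of each gap overlaps the previous kernel."""
    gaps = max(len(kernel_us) - 1, 0)
    remaining_gap = max(launch_gap_us - hidden_us, 0.0)
    return sum(kernel_us) + gaps * remaining_gap


if __name__ == "__main__":
    kernels = [10.0, 10.0, 10.0]  # hypothetical kernel durations in microseconds
    print(serial_time_us(kernels, 2.0))           # 34.0
    print(overlapped_time_us(kernels, 2.0, 2.0))  # 30.0
```

In this model the saving scales with the number of kernel boundaries, which suggests why the technique matters most for low-latency inference, where each token requires many short kernels in sequence.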

NVIDIA’s achievements underscore its leadership in AI infrastructure and data center technology, setting new standards for speed and efficiency in AI model deployment. The innovations in Blackwell architecture and software optimization continue to push the boundaries of what’s possible in AI performance, ensuring responsive, real-time user experiences and robust AI applications.

For more detailed information, visit the NVIDIA official blog.
