CryptoABC.net

IBM Research Unveils Innovations to Accelerate Enterprise AI Training

September 23, 2024
in Blockchain


Zach Anderson
Sep 23, 2024 03:32

IBM Research introduces new data processing techniques to expedite AI model training using CPU resources, significantly enhancing efficiency.





IBM Research has unveiled innovations aimed at scaling the data processing pipeline for enterprise AI training. According to IBM Research, these advancements are designed to expedite the creation of powerful AI models, such as IBM’s Granite models, by leveraging the abundant capacity of CPUs.

Optimizing Data Preparation

Before AI models can be trained, vast amounts of data must be prepared. This data often comes from diverse sources such as websites, PDFs, and news articles, and must undergo several preprocessing steps, including filtering out irrelevant HTML code, removing duplicates, and screening for abusive content. These tasks are critical, but they are not constrained by the availability of GPUs.
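
As a rough illustration of what per-document preprocessing can look like, the following Python sketch strips HTML markup and screens text against a word list. It is not IBM's pipeline: the function names, the `BLOCKLIST` contents, and the simple word-matching rule are all assumptions made for this example, and real abusive-content screening is considerably more sophisticated.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def strip_html(raw):
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


# Hypothetical placeholder list; a real filter would not be a plain word set.
BLOCKLIST = {"badword"}


def is_acceptable(text):
    """Reject text that contains any blocklisted word (case-insensitive)."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return not (words & BLOCKLIST)


def clean_document(raw):
    """Clean one document; return None if it is empty or fails screening."""
    text = strip_html(raw)
    if not text or not is_acceptable(text):
        return None
    return text
```

Each call to `clean_document` looks at a single document only, so a step like this needs no GPU and no knowledge of the rest of the corpus.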

Petros Zerfos, IBM Research’s principal research scientist for watsonx data engineering, emphasized the importance of efficient data processing. “A large part of the time and effort that goes into training these models is preparing the data for these models,” Zerfos said. His team has been developing methods to enhance the efficiency of data processing pipelines, drawing expertise from various domains including natural language processing, distributed computing, and storage systems.

Leveraging CPU Capacity

Many steps in the data processing pipeline involve “embarrassingly parallel” computations, in which each document can be processed independently of the others. Distributing these tasks across numerous CPUs can significantly speed up data preparation. However, some steps, such as removing duplicate documents, require access to the entire dataset and therefore cannot be parallelized in the same per-document fashion.
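
The contrast between the two kinds of step can be sketched in Python. In this minimal example, computing a fingerprint for each document is the independent, parallelizable part, while deduplication needs to see every fingerprint at once. Exact-match hashing is an assumption chosen for simplicity; the article does not say which deduplication method IBM uses, and all function names here are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def normalize(doc):
    """Collapse whitespace and lowercase so trivial variants match."""
    return " ".join(doc.split()).lower()


def fingerprint(doc):
    """Per-document work: depends only on the document itself."""
    return hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()


def parallel_fingerprints(docs, executor):
    """Embarrassingly parallel step: map over documents independently.

    executor.map returns results in input order.
    """
    return list(executor.map(fingerprint, docs))


def deduplicate(docs, fingerprints):
    """Global step: needs all fingerprints, keeps the first copy of each."""
    seen = set()
    unique = []
    for doc, fp in zip(docs, fingerprints):
        if fp not in seen:
            seen.add(fp)
            unique.append(doc)
    return unique


def run(docs, max_workers=4):
    # A thread pool keeps the sketch portable; for CPU-bound work at scale,
    # a process pool or a distributed framework would be the usual choice.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        fps = parallel_fingerprints(docs, pool)
    return deduplicate(docs, fps)
```

For example, `run(["Hello  World", "hello world", "other"])` keeps the first and third documents, because the first two normalize to the same text.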

To accelerate IBM’s Granite model development, the team has developed processes to rapidly provision and utilize tens of thousands of CPUs. This approach involves marshalling idle CPU capacity across IBM’s Cloud datacenter network while ensuring high communication bandwidth between CPUs and data storage. Traditional object storage systems often leave CPUs idle because of their low performance, so the team employed IBM’s high-performance Storage Scale file system to cache active data efficiently.

Scaling Up AI Training

Over the past year, IBM has scaled up to 100,000 vCPUs in the IBM Cloud, processing 14 petabytes of raw data to produce 40 trillion tokens for AI model training. The team has automated these data pipelines using Kubeflow on IBM Cloud. Their methods have proven to be 24 times faster in processing data from Common Crawl compared to previous techniques.
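
To put the figures above in proportion, the short Python calculation below derives two back-of-the-envelope ratios. It assumes decimal units (1 petabyte = 10^15 bytes) and, purely for illustration, treats the peak of 100,000 vCPUs as if it applied to the whole 14 petabytes; the article does not state either, so the results are rough orders of magnitude only.

```python
# Figures quoted in the article.
RAW_BYTES = 14 * 10**15   # 14 petabytes, assuming decimal units
TOKENS = 40 * 10**12      # 40 trillion tokens
VCPUS = 100_000           # peak vCPU count


def raw_bytes_per_token():
    """Average raw input bytes consumed per output token."""
    return RAW_BYTES // TOKENS


def raw_gigabytes_per_vcpu():
    """Raw data per vCPU if the load were spread evenly (an assumption)."""
    return RAW_BYTES // VCPUS // 10**9


print(raw_bytes_per_token())      # 350
print(raw_gigabytes_per_vcpu())   # 140
```

Under these assumptions, roughly 350 bytes of raw data stand behind each training token, which is consistent with a pipeline that filters and deduplicates its input heavily before tokenization.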

All of IBM’s open-sourced Granite code and language models have been trained using data prepared through these optimized pipelines. Additionally, IBM has made significant contributions to the AI community by developing the Data Prep Kit, a toolkit hosted on GitHub. This kit streamlines data preparation for large language model applications, supporting pre-training, fine-tuning, and retrieval-augmented generation (RAG) use cases. Built on distributed processing frameworks like Spark and Ray, the kit allows developers to build scalable custom modules.

For more information, visit the official IBM Research blog.

