CryptoABC.net

IBM Research Unveils Innovations to Accelerate Enterprise AI Training

September 23, 2024
in Blockchain

Zach Anderson
Sep 23, 2024 03:32

IBM Research introduces new data processing techniques to expedite AI model training using CPU resources, significantly enhancing efficiency.





IBM Research says it has developed new techniques for scaling the data processing pipeline used in enterprise AI training. The techniques are designed to speed up the creation of AI models, such as IBM’s Granite models, by drawing on the abundant capacity of CPUs.

Optimizing Data Preparation

Before AI models can be trained, vast amounts of data must be prepared. This data often comes from diverse sources such as websites, PDFs, and news articles, and must pass through several preprocessing steps, including filtering out irrelevant HTML code, removing duplicates, and screening for abusive content. These tasks are critical, but they do not need GPUs and so are not constrained by GPU availability.
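To make the preprocessing steps concrete, here is a minimal sketch of two of them, HTML filtering and content screening, using only the Python standard library. It is an illustration, not IBM's pipeline: the `BLOCKLIST` words and the function names `strip_html`, `is_acceptable`, and `clean_document` are hypothetical, and production systems would likely use trained classifiers rather than a word list.

```python
from html.parser import HTMLParser
from typing import Optional

# Hypothetical blocklist for illustration only; real screening is far richer.
BLOCKLIST = {"badword1", "badword2"}


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def strip_html(raw: str) -> str:
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def is_acceptable(text: str) -> bool:
    """Reject text containing any blocklisted word (case-insensitive)."""
    words = {w.strip(".,!?;:").lower() for w in text.split()}
    return not (words & BLOCKLIST)


def clean_document(raw: str) -> Optional[str]:
    """Strip markup, then drop documents that are empty or fail screening."""
    text = strip_html(raw)
    if not text or not is_acceptable(text):
        return None
    return text
```

Each call to `clean_document` looks at a single document only, which is why steps of this kind can run on ordinary CPUs and be spread across many machines.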

Petros Zerfos, IBM Research’s principal research scientist for watsonx data engineering, emphasized the importance of efficient data processing. “A large part of the time and effort that goes into training these models is preparing the data for these models,” Zerfos said. His team has been developing methods to enhance the efficiency of data processing pipelines, drawing expertise from various domains including natural language processing, distributed computing, and storage systems.

Leveraging CPU Capacity

Many steps in the data processing pipeline involve “embarrassingly parallel” computations, in which each document can be processed independently. Distributing these tasks across many CPUs can significantly speed up data preparation. However, some steps, such as removing duplicate documents, require access to the entire dataset and therefore cannot be run as independent per-document tasks.
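The contrast between the two kinds of step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method IBM uses: the function names are hypothetical, a thread pool stands in for a fleet of CPUs, and exact hash-based matching stands in for whatever deduplication technique the real pipeline applies.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def normalize(doc: str) -> str:
    """Per-document step: the result depends only on this one document."""
    return " ".join(doc.lower().split())


def fingerprint(doc: str) -> str:
    """Content hash used to detect exact duplicates."""
    return hashlib.sha256(doc.encode("utf-8")).hexdigest()


def process_corpus(docs, max_workers=4):
    # Stage 1: embarrassingly parallel. Each document is handled on its own,
    # so the work can be handed to any number of workers in any order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        normalized = list(pool.map(normalize, docs))

    # Stage 2: deduplication. Deciding whether to keep a document requires
    # knowing every fingerprint seen so far, so this stage shares state
    # across the whole dataset instead of treating documents independently.
    seen = set()
    unique = []
    for doc in normalized:
        fp = fingerprint(doc)
        if fp not in seen:
            seen.add(fp)
            unique.append(doc)
    return unique
```

For example, `process_corpus(["Hello  World", "hello world", "Other doc"])` keeps one copy of the first two inputs, because they normalize to the same text. In CPython, CPU-bound work would normally use processes or a cluster framework rather than threads; threads are used here only to keep the sketch short.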

To accelerate IBM’s Granite model development, the team has developed processes to rapidly provision and use tens of thousands of CPUs. The approach marshals idle CPU capacity across IBM’s Cloud datacenter network while ensuring high communication bandwidth between the CPUs and data storage. Traditional object storage is often too slow to keep CPUs busy, leaving them idle, so the team used IBM’s high-performance Storage Scale file system to cache active data.

Scaling Up AI Training

Over the past year, IBM has scaled up to 100,000 vCPUs in the IBM Cloud, processing 14 petabytes of raw data to produce 40 trillion tokens for AI model training. The team has automated these data pipelines using Kubeflow on IBM Cloud. According to IBM, its methods process Common Crawl data 24 times faster than previous techniques.
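A quick back-of-the-envelope calculation puts these figures in proportion. The sketch below assumes decimal petabytes (10^15 bytes) and, for the second number, that the data were split evenly across the peak vCPU count; the article states neither, so both results are rough illustrations rather than reported measurements.

```python
RAW_BYTES = 14 * 10**15   # 14 petabytes, assuming decimal units
TOKENS = 40 * 10**12      # 40 trillion tokens
PEAK_VCPUS = 100_000      # peak vCPU count from the article

# Raw input bytes consumed per output token.
bytes_per_token = RAW_BYTES // TOKENS

# Raw data per vCPU, assuming an even split at peak scale (an assumption).
gigabytes_per_vcpu = RAW_BYTES // PEAK_VCPUS // 10**9

print(bytes_per_token)     # 350
print(gigabytes_per_vcpu)  # 140
```

Under these assumptions, roughly 350 bytes of raw data go in for every token that comes out, which is consistent with much of the raw input being markup, duplicates, or content that gets filtered away.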

All of IBM’s open-sourced Granite code and language models have been trained on data prepared through these optimized pipelines. IBM has also released the Data Prep Kit, a toolkit hosted on GitHub that streamlines data preparation for large language model applications, supporting pre-training, fine-tuning, and retrieval-augmented generation (RAG) use cases. Built on distributed processing frameworks such as Spark and Ray, the kit lets developers build scalable custom modules.

For more information, visit the official IBM Research blog.
