CryptoABC.net

IBM Research Unveils Innovations to Accelerate Enterprise AI Training

September 23, 2024


Zach Anderson
Sep 23, 2024 03:32

IBM Research introduces new data processing techniques to expedite AI model training using CPU resources, significantly enhancing efficiency.

IBM Research has unveiled new techniques for scaling the data processing pipeline used in enterprise AI training. The advancements are designed to speed up the creation of AI models, such as IBM’s Granite models, by making use of the abundant capacity of CPUs.

Optimizing Data Preparation

Before AI models can be trained, vast amounts of data must be prepared. This data often comes from diverse sources such as websites, PDFs, and news articles, and it must pass through several preprocessing steps, including filtering out irrelevant HTML code, removing duplicates, and screening for abusive content. These tasks are critical, but they run on CPUs and so are not constrained by the availability of GPUs.
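
The three preprocessing steps named above can be illustrated with a minimal Python sketch. It is not IBM's pipeline: the function names, the exact-match SHA-256 deduplication, and the `BLOCKLIST` word list are all assumptions made for illustration, and a production system would likely use fuzzy deduplication and trained classifiers instead.

```python
import hashlib
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring tags and script/style contents."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse all runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def strip_html(raw):
    """Step 1: drop HTML markup and keep only the visible text."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return parser.text()


# Hypothetical placeholder; stands in for a real abuse filter.
BLOCKLIST = {"badword"}


def is_acceptable(text):
    """Step 3: reject documents containing any blocklisted word."""
    return not (set(text.lower().split()) & BLOCKLIST)


def prepare(raw_docs):
    """Run all three steps and return the surviving documents in order."""
    seen = set()
    cleaned = []
    for raw in raw_docs:
        text = strip_html(raw)
        if not text or not is_acceptable(text):
            continue
        # Step 2: exact-duplicate removal via a content hash (an assumption).
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        cleaned.append(text)
    return cleaned
```

Everything here uses only the Python standard library and ordinary CPU work, which is the point the article makes about these steps not needing GPUs.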

Petros Zerfos, IBM Research’s principal research scientist for watsonx data engineering, emphasized the importance of efficient data processing. “A large part of the time and effort that goes into training these models is preparing the data for these models,” Zerfos said. His team has been developing methods to enhance the efficiency of data processing pipelines, drawing expertise from various domains including natural language processing, distributed computing, and storage systems.

Leveraging CPU Capacity

Many steps in the data processing pipeline involve “embarrassingly parallel” computations, in which each document can be processed independently of the others. Distributing these tasks across numerous CPUs can significantly speed up data preparation. However, some steps, such as removing duplicate documents, require access to the entire dataset and so cannot be split into fully independent per-document tasks.
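
The contrast between per-document steps and dataset-wide steps can be sketched as follows. This is an illustrative example only, not IBM's implementation: `normalize_and_fingerprint`, `map_stage`, `dedup_stage`, and `run` are hypothetical names. The sketch uses a thread pool so it runs anywhere; for CPU-bound work in CPython one would typically pass a `ProcessPoolExecutor` to `map_stage` instead.

```python
import hashlib
from concurrent.futures import Executor, ThreadPoolExecutor


def normalize_and_fingerprint(doc):
    """Per-document step: the result depends only on this one document."""
    normalized = " ".join(doc.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return digest, normalized


def map_stage(docs, executor: Executor):
    """Embarrassingly parallel: documents are handed to workers independently.

    Executor.map returns results in input order.
    """
    return list(executor.map(normalize_and_fingerprint, docs))


def dedup_stage(fingerprinted):
    """Dataset-wide step: needs every fingerprint to spot duplicates."""
    seen = set()
    unique = []
    for digest, doc in fingerprinted:
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def run(docs, workers=4):
    """Parallel map over documents, followed by one global dedup pass."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        fingerprinted = map_stage(docs, pool)
    return dedup_stage(fingerprinted)
```

Computing the fingerprints in the parallel stage keeps the global pass cheap: it only compares short hashes rather than re-reading every document.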

To accelerate IBM’s Granite model development, the team has developed processes to rapidly provision and use tens of thousands of CPUs. This approach involves marshalling idle CPU capacity across IBM’s Cloud datacenter network and ensuring high communication bandwidth between the CPUs and data storage. Traditional object storage is often too slow to keep CPUs busy, leaving them idle, so the team used IBM’s high-performance Storage Scale file system to cache active data efficiently.

Scaling Up AI Training

Over the past year, IBM has scaled up to 100,000 vCPUs in the IBM Cloud, processing 14 petabytes of raw data to produce 40 trillion tokens for AI model training. The team has automated these data pipelines using Kubeflow on IBM Cloud. Their methods process data from Common Crawl 24 times faster than previous techniques.
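
To put the quoted figures in proportion, the short calculation below derives a few ratios from them. These ratios are not stated by IBM: they assume decimal (SI) units and treat the 100,000 vCPUs as if the work were spread evenly across them, which the article does not claim.

```python
# Figures quoted in the article; decimal (SI) units are an assumption.
RAW_BYTES = 14 * 10**15   # 14 petabytes of raw data
TOKENS = 40 * 10**12      # 40 trillion tokens produced
VCPUS = 100_000           # peak vCPU count


def ratios():
    """Back-of-the-envelope ratios derived from the quoted figures."""
    return {
        "raw_bytes_per_token": RAW_BYTES // TOKENS,
        "tokens_per_vcpu": TOKENS // VCPUS,
        "raw_gb_per_vcpu": RAW_BYTES // VCPUS // 10**9,
    }


if __name__ == "__main__":
    for name, value in ratios().items():
        print(f"{name}: {value:,}")
```

The bytes-per-token ratio is measured against raw input, so it reflects how much material is discarded by filtering and deduplication as well as the tokenizer's own compression.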

All of IBM’s open-sourced Granite code and language models have been trained using data prepared through these optimized pipelines. IBM has also developed the Data Prep Kit, a toolkit hosted on GitHub that streamlines data preparation for large language model applications and supports pre-training, fine-tuning, and retrieval-augmented generation (RAG) use cases. Built on distributed processing frameworks such as Spark and Ray, the kit allows developers to build scalable custom modules.

For more information, visit the official IBM Research blog.

