Overclock Labs, the team behind Akash Network, partnered with ThumperAI, a generative AI startup, to train a foundation model on GPUs leased through Akash Network.
The collaboration, which took place toward the end of 2023, aimed to push the boundaries of decentralized cloud computing by training a foundation model, marking a significant step in the integration of AI and cloud computing.
Using Akash Network to Train AI
The initiative sought to address the main challenges of generative AI foundation model training: high costs, stringent hardware requirements, and complex software dependencies. In doing so, it set an ambitious benchmark for decentralized computing platforms to meet and exceed the demands of AI startups.
ThumperAI aimed to bolster the credibility of its Lora Trainer service, which lets AI developers fine-tune foundation models using LoRA (low-rank adaptation) techniques. Akash Network, for its part, wanted to demonstrate the feasibility and efficiency of AI model training on a decentralized cloud platform.
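To illustrate the idea behind LoRA in general terms (this is not ThumperAI's implementation): instead of updating a large frozen weight matrix W during fine-tuning, LoRA trains two small low-rank matrices A and B whose product forms an additive update, W + (alpha/r)·BA. A minimal sketch in plain Python, with illustrative matrix sizes chosen here for brevity:

```python
def matmul(X, Y):
    """Multiply two matrices represented as lists of rows."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_delta(A, B, alpha, r):
    """Low-rank update (alpha/r) * (B @ A): B is d_out x r, A is r x d_in."""
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]

# Frozen base weight W (2x2) and a hypothetical rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
r, alpha = 1, 1
A = [[0.5, 0.5]]      # r x d_in; randomly initialized in practice
B = [[0.0], [0.0]]    # d_out x r; initialized to zero so training starts from W

delta = lora_delta(A, B, alpha, r)
W_adapted = [[W[i][j] + delta[i][j] for j in range(2)] for i in range(2)]
# Because B starts at zero, the adapted weight initially equals the base weight.
```

Only A and B are trained, so the number of trainable parameters scales with the rank r rather than the full dimensions of W, which is what makes fine-tuning large foundation models affordable on rented GPU capacity.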
This initiative was not just about proving a concept but also about attracting a broader developer community, generating demand for GPU providers, and strengthening Akash Network’s position as a leader in open-source contributions and AI use case expansion.
The journey involved critical decisions and trade-offs, particularly in selecting the model category and base model for training. The team chose a Creative Commons-licensed dataset, emphasizing copyright compliance and data diversity, and pivoted to a Pixart-Alpha-inspired architecture after initial training challenges.
Despite challenges with the training dataset's quality and diversity, the outcomes were promising. The collaboration successfully demonstrated the feasibility of training foundation models on a decentralized network like Akash, marking a milestone at the intersection of AI and blockchain technology.
“The Thumper team will look into migrating Lora Trainer to run on Akash Network as they look to scale that service, so as to take advantage of the lower costs and variety of GPUs available on the network. One of the features needed before they can do this is the ability to request a max SHM size through the Akash SDL. That feature is about to be made available on the network through an upgrade,” wrote Anil Murty, VP of Product and Engineering at Overclock Labs.
This initiative sets the stage for future advancements, with both teams keen on exploring further possibilities on Akash Network.