Trustless Verification
In this section, we explore the main challenges confronting trustless, blockchain-based protocols that seek to provide compute for machine learning.
As we learned in the overview of existing compute solutions for deep learning, blockchain technology merges financial incentivization, trustlessness, and grid computing, which makes it a suitable foundation for providing trustless compute for machine learning.
Existing blockchains, however, are not ideally suited to providing ML compute. Below are the main challenges facing protocols that seek to trustlessly verify off-chain machine learning work.
Artificial intelligence (AI) models are typically trained across large hardware clusters to achieve parallelisation, which gives them access to compute at a scale no single device could provide.
For this reason, a blockchain protocol providing ML compute needs to ensure a similarly high degree of parallelisation, especially given the untrusted and often unreliable nature of its compute sources.
Existing blockchain solutions are not able to provide the required level of parallelisation.
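To make the parallelisation requirement concrete, here is a minimal sketch of data parallelism, the scheme most large training runs rely on: each worker computes gradients on its own shard of a batch, and the results are averaged before the shared model is updated. The function names and the linear model are illustrative assumptions, not part of any particular protocol.

```python
import numpy as np

def worker_gradient(weights, x_shard, y_shard):
    """Gradient of mean-squared error for a linear model on one worker's shard."""
    preds = x_shard @ weights
    return 2 * x_shard.T @ (preds - y_shard) / len(y_shard)

def parallel_step(weights, x, y, n_workers=4, lr=0.01):
    """One data-parallel SGD step: shard the batch, compute per-worker
    gradients (in practice on separate devices), then average them."""
    x_shards = np.array_split(x, n_workers)
    y_shards = np.array_split(y, n_workers)
    grads = [worker_gradient(weights, xs, ys)
             for xs, ys in zip(x_shards, y_shards)]
    return weights - lr * np.mean(grads, axis=0)
```

In a trustless setting, each of those per-shard gradient computations would come from an untrusted, possibly unreliable node, so the protocol must both distribute the shards at scale and verify the gradients that come back.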
A central challenge for trustless ML compute networks is verifying that compute providers have actually performed the deep learning work. This is complicated by the fact that deep learning models are state-dependent: each layer depends on the output of the layer before it. It is therefore only possible to validate the completion of work at a given point if all work up to and including that point has been performed.
This remains an open problem for decentralized protocols.
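To see why, consider a minimal sketch in which each checkpoint's hash commits to the previous one. A verifier who wants to check the claimed state after step k has no shortcut: it must replay every step from the beginning. The toy training step and hash-chain scheme here are illustrative assumptions, not any specific protocol's design.

```python
import hashlib
import numpy as np

def train_step(state, step):
    """Stand-in for one layer/optimisation step; output depends on prior state."""
    return np.tanh(state + 0.01 * step)

def chained_checkpoint(prev_digest, state):
    """Commit to the new state *and* the entire history before it."""
    h = hashlib.sha256(prev_digest)
    h.update(state.tobytes())
    return h.digest()

def verify_up_to(initial_state, claimed_digest, k):
    """Verifying step k requires re-executing every step 0..k."""
    state, digest = initial_state, b"\x00" * 32
    for step in range(k + 1):
        state = train_step(state, step)
        digest = chained_checkpoint(digest, state)
    return digest == claimed_digest
```

A prover can publish the digest after step k, but anyone checking it must redo steps 0 through k, which is exactly the verification overhead a trustless protocol has to engineer around.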
Privacy regulations (such as the CCPA, GDPR, and LGPD) require privacy-conscious solutions. While many deep learning models are trained on open datasets, ML engineers may wish to fine-tune their models on proprietary user data.
For this reason, a decentralized protocol used for deep learning needs to ensure the highest levels of data privacy.
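One common privacy-preserving technique, shown here purely as an illustration rather than as what any given protocol uses, is to clip and noise gradients before they leave the data owner's machine, in the style of differentially private SGD. The clipping bound and noise scale below are arbitrary placeholder values.

```python
import numpy as np

def privatise_gradient(grad, clip_norm=1.0, noise_scale=0.5, rng=None):
    """Clip the gradient's norm, then add Gaussian noise calibrated to the
    clipping bound, so no single example's contribution stands out."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, noise_scale * clip_norm, size=grad.shape)
```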
A decentralized marketplace for machine learning compute also faces the cold start problem: early on, thin supply deters buyers and thin demand deters providers, and such imbalances in supply and demand liquidity threaten the network's ability to scale.
To overcome this, the network must offer rewards that encourage participants to pledge their compute time, thereby capturing latent compute supply. It must also pay compute providers promptly, based on precise tracking of the computational work they perform.
Precisely tracking ML computational work, however, is complicated by the halting problem: it is difficult, and in general impossible, to determine in advance how much computation a task will require or whether it will ever finish (halt).
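Because total work cannot be known upfront, one practical pattern, borrowed from blockchain-style gas metering and sketched here under assumed interfaces, is to meter work as it executes: run the task step by step against a prepaid budget, charge per completed unit, and cut execution off when the budget runs out even if the task has not halted.

```python
def metered_run(task_steps, budget, cost_per_step=1):
    """Execute an iterable of work units against a prepaid budget.

    We never need to know in advance whether the task halts: we simply
    charge per completed step and stop when funds run out.
    Returns the number of billable steps actually performed.
    """
    spent = 0
    for step in task_steps:
        if spent + cost_per_step > budget:
            break  # budget exhausted, possibly before the task halts
        step()  # perform one unit of (training) work
        spent += cost_per_step
    return spent
```

The provider is then paid for the units actually executed, regardless of whether the task completed, which sidesteps the need to estimate total work in advance.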