The legitimacy of cryptocurrencies is under constant threat from bad actors. Wash trading is a huge issue, for example, and is widespread in NFT sales: one high-profile case was exposed on a popular marketplace where 94% of $2 billion transacted was proved to be wash traded.
How did we find out about it? An NFT analytics site examined blockchain data over a period of eight days. No small undertaking, but a highly valuable service that should become commonplace if the industry is to foster trust.
Analytics and data aggregation firms are thus primed to become mainstays of the space by providing vital information on what is really happening on blockchains. In their absence, critics and regulators have been well justified in expressing doubts over the burgeoning technology.
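To make that kind of analysis concrete, here is a minimal sketch of one simple wash-trading heuristic: flag any sale in which a token returns to a wallet that previously sold it. The `Sale` record, its field names and the sample wallets are hypothetical, and this is not the method the analytics site used; real detection relies on far richer signals such as funding sources and timing.

```python
from collections import defaultdict
from typing import NamedTuple


class Sale(NamedTuple):
    """Hypothetical, simplified NFT sale record."""
    token_id: str
    seller: str
    buyer: str
    price: float  # sale price in USD (assumed field)


def flag_round_trips(sales: list[Sale]) -> list[Sale]:
    """Return sales where a token goes back to a wallet that already sold it.

    Assumes `sales` is in chronological order.
    """
    past_sellers: dict[str, set[str]] = defaultdict(set)
    flagged = []
    for sale in sales:
        if sale.buyer in past_sellers[sale.token_id]:
            flagged.append(sale)
        past_sellers[sale.token_id].add(sale.seller)
    return flagged


def suspicious_share(sales: list[Sale]) -> float:
    """Share of total traded volume that comes from flagged sales."""
    total = sum(s.price for s in sales)
    if total == 0:
        return 0.0
    return sum(s.price for s in flag_round_trips(sales)) / total


sales = [
    Sale("nft-1", "alice", "bob", 100.0),
    Sale("nft-1", "bob", "alice", 100.0),  # token returns to alice: flagged
    Sale("nft-2", "carol", "dave", 50.0),
]
print(flag_round_trips(sales))
print(suspicious_share(sales))
```

A heuristic this crude will produce false positives, since people do legitimately buy back assets, which is part of why analysis at scale needs clean, well-structured data.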
Business applications will proliferate, too, as evidenced by major moves coming out of Chainlink (LINK). Last year, the company announced a partnership with news organization Associated Press to make its datasets available to leading blockchains, where data can be used to automate key processes that happen on-chain.
Whether informing markets of election race calls, triggering an on-chain trade when a company’s quarterly financials are released or even augmenting the appearance of NFTs based on real-world events, there is significant scope in this one partnership. Applied to the entirety of the business world across multiple industries, there could be a gigantic shift in the use of data.
Good Information
Properly collated and well-analyzed data holds the potential to weed out dodgy companies and individuals and stop them from fulfilling nefarious goals. In theory, blockchain data is available to the public, so anyone can do the work themselves. Practically speaking, this isn’t feasible because the average vigilante, or even a nascent analytics company, lacks the technology to build vast datasets at pace and at scale.
Knowing exactly what is needed in data terms is a significant hurdle. So a bespoke platform would need to work with industry players—and more specifically, developers—to draw out useful data on a scale not yet seen in the blockchain industry. In its early stages, aggregation and analytics will face steep learning curves.
Applying Data Holistically
For business applications, private blockchains predominate. Customized, structured data can be processed into a private dataset, which will be useful commercially. When a company has paid good money to draw out data based on highly specific requests, it is likely to want to protect that data, especially since these datasets are ever-expanding due to the nature of blockchain and thus remain highly relevant. Access can moreover be sold to other firms under a licensing agreement.
When it comes to entities looking to siphon data for the public good, there is scope to construct datasets that allow crowdsourced analysis. The crypto industry sorely needs this. There is not enough money in exposing wash trading and other malicious activities: we currently rely on the actions of a dedicated minority. Proper, universal access to clean data can stimulate the emergence of public bodies that help cryptocurrency to become a self-regulated field.
We’ve barely scratched the surface. Insurance is a behemoth consumer of data, which informs the entire business model: brokers need to know how to charge competitive yet profitable premiums. And Chainlink is leading the charge again here: last year, it penned a deal with insurance startup Arbol, which offers crop insurance for farmers and enterprises, to supply decentralized weather data. In this instance, smart contracts can trigger payouts depending on weather data.
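The payout logic in such an arrangement can be sketched in a few lines. The example below is a plain Python model of a rainfall-triggered policy, not Arbol's or Chainlink's actual contract code; the `RainfallPolicy` fields, the linear payout scale and the numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RainfallPolicy:
    """Hypothetical parametric crop policy keyed to seasonal rainfall."""
    insured_amount: float  # maximum payout
    trigger_mm: float      # payouts begin once rainfall falls below this
    exhaustion_mm: float   # full payout at or below this rainfall

    def __post_init__(self) -> None:
        if self.exhaustion_mm >= self.trigger_mm:
            raise ValueError("exhaustion_mm must be below trigger_mm")


def payout(policy: RainfallPolicy, observed_mm: float) -> float:
    """Scale the payout linearly between the trigger and exhaustion levels."""
    if observed_mm >= policy.trigger_mm:
        return 0.0
    if observed_mm <= policy.exhaustion_mm:
        return policy.insured_amount
    shortfall = policy.trigger_mm - observed_mm
    span = policy.trigger_mm - policy.exhaustion_mm
    return policy.insured_amount * shortfall / span


policy = RainfallPolicy(insured_amount=10_000.0, trigger_mm=200.0, exhaustion_mm=100.0)
for rainfall in (250.0, 150.0, 80.0):
    print(rainfall, payout(policy, rainfall))
```

Because the payout depends only on the reported rainfall figure, no claims adjuster is needed, which is why the quality of the weather data feeding the contract matters so much.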
Reconciling Data
Traditional businesses face a plethora of issues when selling data to third parties, but in crypto this is less of a concern because everything is transparent. However, most projects in the web3 space are not completely decentralized, so they must decide whether to take certain data off-chain.
The beauty of an all-encompassing data aggregation protocol is reconciling on-chain data with off-chain data: companies will be able to customize the data links in order to make it work. Seeing only half the data is fine for most projects, because the on-chain movement of data is all they need to make their decisions.
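As a rough illustration of what reconciliation involves, the sketch below compares on-chain transfer amounts with a company's off-chain ledger, keyed by transaction hash. The `reconcile` function, the dictionary layout and the sample hashes are hypothetical; a real pipeline would also need to handle units, timestamps and chain-specific identifiers.

```python
def reconcile(
    on_chain: dict[str, float],
    off_chain: dict[str, float],
    tolerance: float = 1e-9,
) -> dict[str, list[str]]:
    """Compare two {tx_hash: amount} mappings and bucket every hash."""
    matched, mismatched = [], []
    for tx_hash in sorted(on_chain.keys() & off_chain.keys()):
        if abs(on_chain[tx_hash] - off_chain[tx_hash]) <= tolerance:
            matched.append(tx_hash)
        else:
            mismatched.append(tx_hash)
    return {
        "matched": matched,
        "mismatched": mismatched,  # present in both, amounts differ
        "on_chain_only": sorted(on_chain.keys() - off_chain.keys()),
        "off_chain_only": sorted(off_chain.keys() - on_chain.keys()),
    }


# Hypothetical sample records
on_chain = {"0xa1": 5.0, "0xb2": 2.5, "0xc3": 1.0}
off_chain = {"0xa1": 5.0, "0xb2": 3.0, "0xd4": 7.0}
report = reconcile(on_chain, off_chain)
print(report)
```

The "only" buckets are the interesting ones: they show activity that one side of the business can see and the other cannot.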
The core technology for a successful data aggregating and cleaning process must be cross-chain compatible because while Ethereum Virtual Machine (EVM) chains dominate the space, you have chains such as Solana creating cutting-edge solutions as well.
The text within the blockchain data has to be structured in a very specific way for chains such as Solana, as the entire technology underpinning them is different. Furthermore, the high transactions-per-second rate offered on Solana means that, from the genesis block up to real time, its database is far larger than that of most other chains. Solana routinely processes thousands of transactions per second, and its theoretical ceiling is claimed to be far higher.
When a database is chock full of data, it is not necessarily useful to other people. For a data cleaning service provider, the huge volume of transactions makes it very difficult to structure the data and separate the noise from the clean parts, since many transactions are meaningless and have no value for analytics.
For centralized chains, data aggregation and subsequent analysis can help build trust in an environment where the entity itself controls the validators and, through them, can exert political control over the key players in the entire ecosystem. Once trust is lost, you can’t readily get it back, so cutting through the noise and seeing what is happening with on-chain transactions can be invaluable. This is one of the reasons blockchain data is so important and can spark drastic changes in how we interact with cryptocurrencies.