What Does Artificial Intelligence Have To Do With Blockchains?

What does artificial intelligence have to do with blockchains? Well, it's actually helpful to set the buzzwords aside and talk about a new decentralised data value ecosystem in which data is produced, distributed and consumed. Using that framing, the Internet of Things produces data, blockchains and other authentication technologies distribute it, and the data then needs to be processed, analysed and acted upon automatically. This is where so-called smart contracts, decentralised compute and decentralised machine learning can be applied to data held in decentralised databases, document stores and blockchains.

It is here that innovations from the blockchain and artificial intelligence communities blur together, and it becomes clear the two are intertwined and interconnected. Both smart contracts and machine learning offer differing levels of automation and decentralisation depending on the type of input data and the level of trust the use case demands.

Distributed Compute

Distributed compute refers to computing in which a complex problem is broken down into simpler tasks. These tasks are farmed out to a network of trusted computers to be solved in parallel, and the partial solutions are then combined in such a way as to solve the original problem. This is similar to how processors (CPUs and GPUs) developed from single-core to multi-core designs on the same circuit, with multiple cores solving a problem more quickly than one core by itself. The premise is simple, but the other computers need to be trusted for the system to work. Blockchains and ledgers can be used to create networks of computers through a 'trust framework' and to incentivise these nodes to work together, rewarding those who solve the simple tasks with tokens that have a financial value, no matter how small.

Blockchain projects including Golem and iExec are actively solving this problem. Other projects like Truebit are working towards trustless off-chain computation using a prover-verifier game. Verifiable and non-verifiable distributed processing will both be needed, depending on the level of trust between participants in the network. Interestingly, we could finally see the realisation of the National Science Foundation Network (NSFNET) vision from the 1980s: a supercomputer on demand for any computing task. Other distributed computing projects like Nyriad are looking to achieve hyper-scale storage processing, but without tokens, using a concept called 'liquid data'.
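To make the split-solve-combine premise concrete, here is a minimal sketch in Python. The local worker pool stands in for a network of nodes, and the sum-of-squares task is purely illustrative; a real decentralised network would also need to verify each partial result (for example with a Truebit-style prover-verifier game) and reward the node that produced it with tokens.

```python
from concurrent.futures import ProcessPoolExecutor

def solve_subproblem(chunk):
    # A 'simple task': each worker sums the squares of its own slice of the data.
    return sum(x * x for x in chunk)

def solve_distributed(data, n_workers=4):
    # Split the complex problem into chunks, farm them out in parallel,
    # then combine the partial results into the final answer.
    chunk_size = max(1, len(data) // n_workers)
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        partial_results = pool.map(solve_subproblem, chunks)
    return sum(partial_results)

if __name__ == "__main__":
    print(solve_distributed(list(range(1_000_000))))
```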

Quantum computing is different to distributed computing in that it looks to solve problems that cannot be solved by existing computers (read: Turing machines). By using quantum particles, the nascent technology has the potential to explore many candidate solutions to a problem at once within a single machine, rather than across a network of machines. These machines pose a potential threat to blockchain technology because blockchains rely on public key cryptography (also commonly used in banking for credit card security), which derives its security from the difficulty of finding the prime factors of huge numbers. These problems would typically take many hundreds or even several thousands of years to solve, but with quantum computers this timeframe could be reduced to hours or minutes. Companies like IBM, Rigetti and D-Wave are driving progress in the field.

Parallelisation is the thread that ties together distributed computing and quantum computing. Distributed computing involves networks of computers that solve a problem by tackling smaller sub-problems in parallel, while a quantum computer explores many candidate solutions simultaneously within a single machine. In both cases, we can start to rely on networks of incentivised machines to solve computational challenges, rather than servers owned by centralised entities. From an incentivisation perspective, blockchains enable these networks to work efficiently and 'trustlessly', with a token powering a marketplace of nodes offering computing power. Quantum computers could also form part of these networks, solving the specific problems that classical computers cannot.

Smart Contracts

There are currently a handful of smart contract blockchain platforms that have successfully captured the market. According to Etherscan there are 93,039 ERC-20 token contracts. Waves, NEO and Stellar are all developing their own standards in an attempt to challenge Ethereum's dominance. In a nutshell, smart contracts are programmable "if this, then that" conditions attached to transactions on the blockchain. If situation 'A' occurs, the contract is coded to produce an automated response 'B'. This idea isn't new, and we can find examples all around us, such as in vending machines: if button 'A' is pressed, then 'X' amount is required; if 'X' amount is paid, then snack 'B' is dispensed. By adding this simple concept to blockchains, contracts cannot be forged, changed or destroyed without an audit trail, because the ledger distributes identical copies of the contract across a network of nodes for verification by anyone at any time. When transparency can be guaranteed, these contracts become possible in industries which would previously have deemed them too risky.
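As a purely illustrative sketch of the "if this, then that" idea (written in Python rather than an on-chain language such as Solidity), the vending-machine example can be expressed as a handful of conditional rules. What a blockchain adds, and what this toy code does not capture, is that the contract's code and state are replicated across a network of nodes, so they cannot be altered without an audit trail.

```python
class VendingMachineContract:
    # Toy 'if this, then that' contract: if the listed price is paid, the snack is dispensed.
    # On a blockchain, this code and its state would be replicated and verified by every node.

    def __init__(self, prices):
        self.prices = prices   # e.g. {"B": 150}, prices in pence
        self.balance = 0       # funds held by the contract

    def purchase(self, snack, amount_paid):
        # Condition: the snack exists and enough has been paid...
        if snack in self.prices and amount_paid >= self.prices[snack]:
            self.balance += self.prices[snack]
            # ...automated response: dispense the snack and return any change.
            return {"dispensed": snack, "change": amount_paid - self.prices[snack]}
        # Otherwise the 'transaction' fails and the payment is returned.
        return {"dispensed": None, "change": amount_paid}

contract = VendingMachineContract({"B": 150})
print(contract.purchase("B", 200))  # {'dispensed': 'B', 'change': 50}
```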

With embedded legal frameworks, smart contracts have the potential to replace and automate many existing paper contracts; Mattereum is working on such legally-enforceable smart contracts. The process of buying a house could become more efficient with no banks, lawyers or estate agents: countless hours, expenses and middlemen condensed into a few dozen lines of code and an automated product. This automation principle applies to any industry which requires trusted third parties to oversee agreements. Contracts are only as good as their enforcement, so decentralised dispute resolution services are necessary to make smart contracts useful. Early efforts in this direction, such as Kleros, are utilising prediction markets and reputation-staking tools.

With the rapid development and convergence of AI and decentralised networks, we will begin to see more complex smart contracts, such as contracts connected to expansive neural networks. The development of these systems could see inconsistencies being found in legal frameworks, resulting in a more robust legal system. Smart contracts would be built upon those legal models, within which AI must comply. It is still early in the development cycle of smart contracts, and progress will require collaboration from the legal industry as well as lawmakers in government; smart contracts should be seen as the legal rails for the digital world. If tokens are the beginnings of digitally-native money and financial assets, smart contracts are the beginnings of a digitally-native legal system. Smart contracts, along with distributed computation and decentralised machine learning, will automate data in the Convergence Ecosystem, creating unprecedented levels of automation within auditable parameters.

Decentralised Machine Learning

Machine learning is a field within artificial intelligence that focuses on enabling computers to learn rather than be explicitly programmed. More traditional AI approaches based on rules and symbols are not capable of capturing the complex statistical patterns present in natural environments such as visual and auditory scenes, and in our everyday modes of interaction such as movement and language. A relatively recent breakthrough in machine learning called deep learning is currently driving progress in the field (though for how much longer is up for debate). Deep learning techniques are 'deep' because they use multiple layers of information-processing stages to identify patterns in data; the different layers train the system to understand structure within the data. In fact, deep learning as a technique is not new, but combined with big data, more computing power and parallel computing it has become increasingly accurate at previously challenging tasks such as computer vision and natural language processing. The most recent breakthroughs in transfer learning and strategic play come from the combination of deep learning and reinforcement learning, as with DeepMind's AlphaGo.
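To illustrate what 'multiple layers of information-processing stages' looks like in code, below is a minimal two-layer forward pass using NumPy. The weights are random placeholders rather than trained values, so this sketches the layered structure only, not a working model.

```python
import numpy as np

def relu(x):
    # Simple non-linearity applied between layers.
    return np.maximum(0, x)

rng = np.random.default_rng(0)

# Placeholder (untrained) weights: 4 input features -> 8 hidden units -> 3 outputs.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x):
    # Each layer is one information-processing stage; stacking them lets
    # later layers pick up progressively more abstract structure in the data.
    hidden = relu(x @ W1 + b1)   # first stage
    return hidden @ W2 + b2      # second stage produces the output scores

print(forward(rng.normal(size=(1, 4))))
```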

Machine and deep learning techniques can transform raw data into actionable knowledge: converting voice input into text output in voice-to-text programs, or turning LIDAR input into a driving decision. In diverse fields including image and speech recognition, medical diagnosis and fraud detection, machine learning is equipping us with the ability to learn from large amounts of data. In the current machine learning paradigm, solutions are delivered as cloud-based APIs by a few leading companies. But it is becoming increasingly apparent that this paradigm is not sustainable.

“Data and services are costly to use and can’t sell themselves. It’s staggering to consider all that gets lost without its value ever being realised — especially when it comes to intelligence constructed about markets and data. We simply can’t let all that value be captured by a select few. Fetch has a mission to build an open, decentralised, tokenised network that self-organises and learns how to connect those with value to those who need it, or indeed may need it; creating a more equitable future for all.” Toby Simpson, Co-founder, Fetch

As per the theme of the Convergence paper in general, centralised systems suffer from a few fundamental problems: an inability to coordinate globally, limits on collaboration and interoperability, and a tendency toward market monopoly and censorship. With machine learning becoming integral to our lives, centralised machine learning is a threat to both economic competition and freedom of speech.

The Convergence Ecosystem, if realised, provides global data-sharing and marketplace infrastructure, enabling AIs to collaborate and coordinate processing in a decentralised way. Removing centralised bottlenecks for heavy computational workloads also helps address latency issues, reducing the time needed to train models. On-device training, like Google's Federated Learning model, is a technical improvement, but it lacks the ability for mass coordination using marketplaces and tokens.
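As a hedged sketch of the on-device idea mentioned above: in federated averaging, each device trains on its own data and only the model parameters are shared and averaged, so raw data never leaves the device. The decentralised variant described in this section would replace the coordinating server with a marketplace of token-incentivised nodes; the linear model and synthetic data below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])

# Three 'devices', each holding its own private dataset.
devices = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.1 * rng.normal(size=50)
    devices.append((X, y))

def local_update(w, X, y, lr=0.1, steps=20):
    # One device: a few gradient steps on its private data; the raw data never leaves.
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

w_global = np.zeros(2)
for _ in range(10):
    # Each device trains locally; only the resulting weights are averaged.
    local_models = [local_update(w_global.copy(), X, y) for X, y in devices]
    w_global = np.mean(local_models, axis=0)

print(w_global)  # approaches [2.0, -1.0] without pooling the raw data
```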

Decentralised machine learning not only provides a coordination mechanism for the more efficient allocation of resources, it also increases access to machine learning capabilities by allowing anyone to submit models and algorithms and get paid based on their quality and utility. SingularityNET, doc.ai and Fetch (a portfolio company) are examples of companies already building the type of decentralised artificial intelligence described. Decentralised machine learning will be the result, but it would not be possible without the development of distributed ledgers, consensus, identity, reputation, interoperability protocols and data marketplaces.

We must avoid the “disconnected and dysfunctional ‘villages’ of specialization”, as Alexander von Humboldt put it, and instead aim for a holistic view that sees the connectedness of seemingly disparate technological innovations.

Read the full Convergence paper here, or go back and read the rest of the abridged Convergence articles:

Building a New Data Infrastructure with Blockchains and the Internet of Things

The Convergence Ecosystem is open-source, distributed, decentralised, automated and tokenised, and we believe it is nothing less than an economic paradigm shift.

We are excited to introduce the Outlier Ventures vision of the future and our investment thesis: The Convergence Ecosystem. The new data value ecosystem sees data captured by the Internet of Things, managed by blockchains, automated by artificial intelligence, and all incentivised using crypto-tokens. For a summary of the thesis, take a look at the introductory blog, and for a deeper look into Blockchains, Community, & Crypto Governance have a read of my last post here. Today, though, I want to talk specifically about the convergence of blockchains and the Internet of Things.

Complexity versus simplicity

As the graphic above shows, data enters the ecosystem at the 'data collection' layer through either hardware (the Internet of Things) or software (web, VR, AR, etc.). In fact, early feedback on the paper has suggested that what we are really talking about here is a new data value chain, and I agree with that to some extent. But of course, this is just a snapshot, a simplification of the emerging data value chain.

If your first thought upon reading the paper or looking at the graphic was "buzzword salad" or "this is too abstract, what are the actual products and protocols that need to be built?", well, you are not alone. Indeed, thinking through the Convergence Ecosystem was a constant tension between complexity and simplification.

I felt it was more important that non-technical people understood that all these seemingly disparate technologies were connected than that I went into detail about the technical differences between, say, Cosmos and Polkadot in addressing blockchain interoperability. This simplification can be seen at the data collection layer, where I note the Internet of Things and software as the two entry points for data. This is purposefully broad: I had another attempt which separated hardware into types of devices — mobile, wearables, IoT devices, learning robots — but ultimately the ecosystem became too complex and overwhelming for the layperson to understand. With that in mind, I decided that any sensor measuring the external environment should be bundled together under the umbrella term the 'Internet of Things'; this includes all the sensors in smartphones and wearables, such as gyroscopes, accelerometers and proximity sensors, as well as hundreds of other sensors measuring our world. As for software, this is broad enough to include any data created from the digital environment regardless of application: augmented reality and virtual reality worlds, our digital exhaust from online activity, and gaming are just a few examples.

The key exercise isn't to define exactly where data will come from. The key message is that the amount of data created annually will reach 180 zettabytes (one zettabyte is equal to one trillion gigabytes) by 2025, up from 4.4 zettabytes in 2013, and that the average person will interact with connected devices every 18 seconds (nearly 4,800 times a day).
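(For the interaction figure, the arithmetic is simply 86,400 seconds in a day divided by one interaction every 18 seconds, which gives 4,800; and the jump from 4.4 to 180 zettabytes is roughly a fortyfold increase in twelve years.)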

The so-called Internet of 'Things'

If you thought that the blockchain industry lacked a clear definition, the internet of so-called 'things' is even worse. The industry lacks a standard definition of the IoT, and in its broadest sense it will come to include every physical object that has a sensor, microcontroller and Internet connection. Today that mainly means connected home devices like the Amazon Echo, wearables like the Apple Watch, industrial and agricultural connected sensors, and smart meters measuring home energy usage. But the range of applications is growing, and it has been estimated that by 2024 the automotive industry will account for almost a third of all IoT connections, followed by consumer electronics, FMCG (fast-moving consumer goods) and the utility sector. Other sectors, including smart cities, supply chains, manufacturing and healthcare, will make up a relatively small proportion of the connections. The IoT market intersects with the robotics market in the sense that a robot has the same features as an IoT device but with the addition of actuators and the means to move and respond to the environment; we would consider connected vehicles, service robots and other types of robotics to be data-collecting machines.

The IoT market is often measured in the number of connections — roughly 30 billion by the end of the decade — or in economic impact — 11 trillion dollars over the next decade, according to McKinsey. A less-asked question is: what happens to all the data? The same McKinsey study found we may be using as little as 1% of the data being generated. As well as under-utilising data, how data is being used is unclear: in a survey by the Ponemon Institute, 82% of respondents said IoT manufacturers had not provided any details about how their personal information is handled.

The emergence of distributed systems like IPFS, Filecoin and other blockchains offers a potential new model for data storage and utilisation. It has been expected that data would be fought over by device makers, software providers, cloud providers and data analytics companies; indeed, the reluctance of car makers to put Android Auto or Apple CarPlay into their cars reflects an awareness that they would lose control of valuable data.

So the key value proposition for distributed and decentralised systems in many cases isn't actually 'censorship resistance' or 'unstoppable payments'; it is a shared (but private) dataset of industry data, both transactional and otherwise. We are still early in the development of the blockchain industry: we still need to prove and scale privacy technologies like zero-knowledge proofs, secure multi-party computation and differential privacy, as well as increase the throughput of blockchains and link them robustly with off-chain databases to handle the volumes of data we expect the IoT to generate.

Very broadly speaking, decentralised technologies can provide shared data infrastructure whereby data use isn't a zero-sum game. It is no longer a case of generating data under a use-it-or-lose-it model. The stack of technologies, including blockchain-based marketplaces, enables IoT data creators — machine-owned or human-owned — to buy and sell data.

Software is eating the world, and throwing off valuable data

Adding to the tens of billions of IoT connections, we also need to add digital media and human-generated digital data. We are on our way to quantifying and digitising our external world, and we are even further along in gathering data on our digital lives. We use the term 'software' as a producer of data broadly, to capture all personal and business data produced through interaction with databases, operating systems, applications and APIs. These interactions build up digital dossiers, including cookie and web-browsing data as well as active traces like social media and messaging.

On the business side, as we continue to digitise and bring online our offline interactions and documents, such as electronic health records and learning records, key sectors will have an overwhelming amount of data to handle which they do not have the capabilities to utilise. On the consumer side, digitally-created and digitally-augmented environments with augmented reality (AR) or virtual reality (VR) will lead the growth in available personal information.

Half the world’s population — 3.82 billion people — will have an Internet connection by the end of 2018, rising to 4.17 billion by 2020. Mobile data traffic will grow to 49 exabytes per month by 2021, a sevenfold increase over 2016, according to Cisco. We are creating unfathomable amounts of data, and the growth shows no sign of abating. Adoption of AR and VR will further increase the amount and granularity of data that we can collect, enabling deeper insights into individual and collective behaviours. Whether it's from the IoT or from software, we have a massive data problem.

IoT needs blockchains

We are creating and collecting more data than ever, but we are storing it in insecure private databases with no incentives to share it. Data breaches and hacks are commonplace, and data can be censored or tampered with. Software-generated data is lost, hoarded or left latent. There is no reason for consumers to do anything other than give their data away for free, and for corporations to hoard it.

Open-source, distributed, decentralised, automated and tokenised infrastructure offers a solution.


For more on how communities and tokens will integrate with the Internet of Things and Artificial Intelligence, read the full paper here.