What Are ABI and Bytecode in Solidity?

When you start your journey as a Solidity developer and begin writing Ethereum smart contracts, you’ll quickly come across references to the EVM (Ethereum Virtual Machine), bytecode, and the ABI (application binary interface). If you are a JavaScript developer (as I was when I first learned to code), these terms may not be familiar to you, or maybe you’ve heard them in other contexts and are wondering if they mean the same thing in the Solidity and Ethereum world.

This blog offers a technical overview of each of these three concepts. As with all things in code, it’s easy to get lost in the rabbit holes for each of these concepts, which can be counterproductive. Instead, this blog will give you solid mental models for each of these concepts so that you have all you need to get productive immediately.

By the end of this blog, you will not only understand the what and the why of the EVM, bytecode, and ABIs, but also how to quickly generate and use bytecode and ABIs in real projects.

If you prefer, you can also check out the video version of this blog on YouTube.

VMs and EVMs

Let’s begin with the Ethereum Virtual Machine (EVM). Just for a moment, let’s drop the “Ethereum” and understand what a Virtual Machine (VM) is. In layman’s terms, a VM is a piece of software that (like all software) runs on hardware (a computer with memory, storage, processors, and an operating system that’s connected to power). But unlike other software, a VM is designed to mimic hardware—the software pretends to be an actual machine, just like music apps that are virtual stereo systems. That’s why it’s a “virtual” machine—it’s not physical, but it mimics a physical machine.

Why do we need virtual machines? They’re an efficient way to scale, manage, and update the infrastructure on which software applications run. Instead of using 1000 physical servers, maybe you get just 20 and run 50 VMs on each. You could even have each VM run a different operating system, so one VM could run Windows Server, another could run Linux Debian, a third could run Gentoo Linux, and a fourth could run ChromeOS!

Virtual machines vs containers diagram
Virtual machines that run multiple operating systems on the same underlying hardware.

The benefit of this is that you can have multiple applications running on these VMs, all of which run on a single hardware machine so that the machine is more thoroughly utilized and its processing power and system resources are used more efficiently—which is better for infrastructure costs.

The Ethereum Virtual Machine is also a virtual machine. But the intention of the EVM is to create a decentralized “world computer”, not to utilize hardware resources optimally. It runs on a collection of individual, networked machines called “nodes” that together act as a single machine. Each node runs client software that implements the Ethereum specification, and since the nodes are all connected to each other, they form a network. This network of nodes synchronizes its state (the data), so that together the nodes form a giant database that is always kept in sync. Agreement on the state of the data must be reached across the network of nodes, and this is done through a consensus algorithm.

The EVM, being a distributed virtual computer, runs programs called smart contracts. As with all applications, we first write the smart contract and then compile it so that it can be deployed onto the Ethereum blockchain network. Once it is there, the code cannot be changed, because blockchains are immutable by design. From that point on, the EVM is the environment in which the deployed code gets executed.

We typically write programs in human-readable coding languages (even if they don’t appear very human-readable when we start learning!). This is because humans need to read, edit, maintain, and debug software. But the machine that executes the code doesn’t read human-readable languages. Machines work best with binary data, which looks like a stream of ones and zeros. So after we write code, we ask the compiler (which is also software) to “compile” it so that it can run on the machine.

In Solidity, when we compile the code, we get two “artifacts”: bytecode and ABI.

Bytecode In Solidity

Bytecode is the information that our Solidity code gets “translated” into. It contains instructions to the computer in binary. Bytecode generally consists of compact numeric codes, constants, and other pieces of information. Each instruction step is an operation referred to as an “opcode,” and opcodes are typically one byte (eight bits) long. This is why it’s called “bytecode”: it is made of one-byte opcodes.

Every line of code that is written gets broken down into opcodes so that the computer knows exactly what to do when running our code.
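To make the idea of one-byte opcodes a little more concrete, here is a minimal sketch of a disassembler in Python. It is an illustration only: the opcode table is deliberately tiny, the function name `disassemble` is my own, and the five-byte input is the prefix that Solidity-compiled bytecode commonly starts with. The one detail worth noticing is that the PUSH family of opcodes (0x60 to 0x7f) is followed by 1 to 32 bytes of data, so not every byte in the stream is an instruction.

```python
# A tiny, partial opcode table: only what this demo needs.
# The real EVM instruction set is much larger.
OPCODES = {
    0x00: "STOP",
    0x01: "ADD",
    0x52: "MSTORE",
    0xF3: "RETURN",
    0xFD: "REVERT",
}


def disassemble(bytecode_hex: str) -> list[str]:
    """Split hex bytecode into a list of readable instructions."""
    code = bytes.fromhex(bytecode_hex.removeprefix("0x"))
    instructions = []
    i = 0
    while i < len(code):
        op = code[i]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 bytes of immediate data after the opcode.
            size = op - 0x5F
            data = code[i + 1 : i + 1 + size]
            instructions.append(f"PUSH{size} 0x{data.hex()}")
            i += 1 + size
        else:
            instructions.append(OPCODES.get(op, f"UNKNOWN(0x{op:02x})"))
            i += 1
    return instructions


# A prefix commonly seen at the start of Solidity-compiled bytecode.
print(disassemble("0x6080604052"))
# ['PUSH1 0x80', 'PUSH1 0x40', 'MSTORE']
```

Five bytes of bytecode turn into three instructions here, because each PUSH1 consumes one extra byte as its argument.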

In the Ethereum world, the bytecode is actually what gets deployed to the Ethereum blockchain. When we deploy to an Ethereum network and confirm the transaction using a browser-based wallet like MetaMask, we can actually see the bytecode that gets deployed. There are ways to break down the bytecode into its opcode pieces, but that’s for another day.

Bytecode smart contracts
How bytecode looks when we deploy a smart contract using MetaMask.

Bytecode is what gets stored on the Ethereum network and executed when we interact with smart contracts. There are many tools and libraries (including the official Solidity compiler, solc) that will help you compile Solidity code into bytecode. But one quick way to do this is to just compile the smart contract on the in-browser Remix IDE and then copy the ABI and bytecode.

Here is a handy pro tip to quickly generate and copy bytecode. You can click this link and open a Chainlink Price Feeds-enabled Solidity smart contract in your Remix IDE. You can just compile and then copy the bytecode as shown below. Easy!

Remix bytecode
Using Remix to generate bytecode.

What is an ABI in Solidity?

You have likely heard of APIs (application programming interfaces). These are basically sets of methods, functions, variables, and constants that you can use to interact with a library, a network endpoint, a backend service, or other software services and applications. APIs are a way to expose the functionality of a piece of software in a controlled, stable, and intuitive way. APIs define the ways in which two pieces of software can interact with each other—an interface.

ABIs are application binary interfaces. They define the methods and variables that are available in a smart contract and that we can use to interact with it. Since smart contracts are converted into bytecode before they get deployed to the blockchain, we need a way to know what operations and interactions we can initiate with them. We also need a standardized way to express those interfaces so that any programming language can be used to interact with smart contracts. JavaScript is the most commonly used language for this, mainly because it is the language of the browser and we often use frontend web pages to interact with smart contracts. But you can interact with a smart contract from any coding language, as long as you have the ABI for that smart contract and a library that can communicate with a node, which gives you an entry point into the Ethereum network.

Structure of an ABI in Solidity

So ABIs are the definitions that help us know the method names, parameters and arguments, and data types that we can use to interact with a smart contract, and also the structure of events emitted by the smart contract. For functions, here are the properties found in ABIs (source):

  • type: specifies the nature of the function and will be one of `function`, `constructor`, `receive` or `fallback`.
  • name: what that function’s name is.
  • inputs: an array of objects with the following schema:
    • name: the name of the parameter.
    • type: the type of that parameter.
    • components: used when the type is a tuple.
  • outputs: an array of objects similar to inputs.
  • stateMutability: a string that specifies the state mutability of this function. Values are `pure`, `view`, `nonpayable`, and `payable`.
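As a rough illustration of why `stateMutability` matters to a client, the sketch below maps each value to the kind of interaction you would typically use. The function name `interaction_kind` and the sample entries are hypothetical, and real libraries handle this for you; the point is only that `pure` and `view` functions can be answered by a node as a read-only call, while the other two change state and therefore need a transaction.

```python
def interaction_kind(abi_entry: dict) -> str:
    """Describe how a client would typically invoke an ABI function entry."""
    mutability = abi_entry.get("stateMutability")
    if mutability in ("pure", "view"):
        # These promise not to modify state, so a read-only call is enough.
        return "read-only call"
    if mutability == "payable":
        # State-changing, and the function accepts ether sent with the call.
        return "transaction (may send ether)"
    if mutability == "nonpayable":
        # State-changing, but the function rejects ether sent with the call.
        return "transaction (no ether)"
    raise ValueError(f"unexpected stateMutability: {mutability!r}")


# Hypothetical ABI entries, trimmed to the fields this sketch reads.
print(interaction_kind({"name": "getLatestPrice", "stateMutability": "view"}))
print(interaction_kind({"name": "deposit", "stateMutability": "payable"}))
```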

Custom errors and events have a very similar schema, and you can study them here.

ABIs are represented as JSON and can look like this:

[
      {
            "inputs": [],
            "stateMutability": "nonpayable",
            "type": "constructor"
      },
      {
            "inputs": [],
            "name": "getLatestPrice",
            "outputs": [
                  {
                        "internalType": "int256",
                        "name": "",
                        "type": "int256"
                  }
            ],
            "stateMutability": "view",
            "type": "function"
      }
]
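Because an ABI is plain JSON, any language with a JSON parser can read it. Here is a small sketch in Python that loads the ABI above and prints a readable signature for each function. The helper names `canonical_type` and `function_signatures` are my own and not part of any library, and the `->` in the output is just a display convention for this example.

```python
import json

ABI_JSON = """
[
  {"inputs": [], "stateMutability": "nonpayable", "type": "constructor"},
  {
    "inputs": [],
    "name": "getLatestPrice",
    "outputs": [{"internalType": "int256", "name": "", "type": "int256"}],
    "stateMutability": "view",
    "type": "function"
  }
]
"""


def canonical_type(param: dict) -> str:
    """Return a parameter's ABI type, expanding tuples into (t1,t2,...)."""
    if param["type"].startswith("tuple"):
        inner = ",".join(canonical_type(c) for c in param["components"])
        # Keep any array suffix such as [] or [3] that follows "tuple".
        return f"({inner})" + param["type"][len("tuple"):]
    return param["type"]


def function_signatures(abi: list[dict]) -> list[str]:
    """List name(argument types) -> (return types) for each function entry."""
    signatures = []
    for entry in abi:
        if entry["type"] != "function":
            continue  # skip constructors, events, errors, and so on
        args = ",".join(canonical_type(p) for p in entry["inputs"])
        returns = ",".join(canonical_type(p) for p in entry.get("outputs", []))
        signatures.append(f"{entry['name']}({args}) -> ({returns})")
    return signatures


print(function_signatures(json.loads(ABI_JSON)))
# ['getLatestPrice() -> (int256)']
```

The constructor entry is skipped because it has no `name` and is never called after deployment.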

You can generate your own ABI in Solidity exactly like the one above by opening up that same Chainlink Price Feed-enabled Solidity smart contract in your Remix IDE once again. Then compile the right contract and grab the ABI as shown below.

Remix ABIs
Using Remix to quickly generate ABIs.

You’ll see that the ABI gives you details about whether something is a regular method or a constructor, what inputs it takes, what return values and data types it produces, and more. This is the schema you can use to work out how to interact with the smart contract. Of course, you will end up using a library like ethers.js to actually interact with the smart contract, but to do so you will need the ABI.
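To see why the data types in the ABI matter, consider what a library has to do with the raw bytes a node returns. The ABI tells us `getLatestPrice` returns a single `int256`, which the ABI encoding represents as 32 bytes, big-endian, in two’s complement. The sketch below decodes such a value in Python. The function name `decode_int256` is my own, and the sample return data is made up for illustration rather than taken from a real call.

```python
def decode_int256(return_data_hex: str) -> int:
    """Decode one ABI-encoded int256 (32 bytes, big-endian, two's complement)."""
    data = bytes.fromhex(return_data_hex.removeprefix("0x"))
    if len(data) != 32:
        raise ValueError(f"expected 32 bytes, got {len(data)}")
    return int.from_bytes(data, byteorder="big", signed=True)


# Hypothetical return data: 28 zero bytes followed by 0x0001e240.
sample = "0x" + "00" * 28 + "0001e240"
print(decode_int256(sample))  # 123456

# Negative values use two's complement, so 32 bytes of 0xff decode to -1.
print(decode_int256("0x" + "ff" * 32))  # -1
```

Without the ABI, those 32 bytes could just as well be an unsigned integer or a padded address, which is exactly why a library asks you for the ABI before it will talk to a contract.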

Here is another way you can grab the ABI and bytecode: You can use Remix’s “Compilation Details” tab to grab that info and more. Just be careful to select the right contract. If you pick one of the imported interfaces instead of the contract you wrote, you won’t get any bytecode, because an interface has no implementation to compile.

ABIs and bytecode in Remix
Getting ABIs and bytecode from Remix.

You now have a solid mental model of what the EVM is, what bytecode is and what it does, and why the ABI is so essential for smart contracts. Importantly, you’ve also learned some nifty techniques that you can use right away to speed up your development process and to play around with smart contracts and their compiled artifacts right in your browser.

If you’re a developer and want to integrate Chainlink into your smart contract applications, check out the blockchain education hub or the developer documentation, or reach out to an expert. You can also dive right into hooking up your smart contracts to real-world data via decentralized oracles.